I am trying to track file transfers from one location to another.
Flow: Files are copied to File copy location -> Target Location
Both File copy location and Target location logs are in the same index but each has it own sourcetype.
File copy location events has logs for each file but Target location has a logs which has multiple files names.
Log format of filecopy location:
2024-12-18 17:02:50 , file_name="XYZ.csv", file copy success
2024-12-18 17:02:58, file_name="ABC.zip", file copy success
2024-12-18 17:03:38, file_name="123.docx", file copy success
2024-12-18 18:06:19, file_name="143.docx", file copy success
Log format of Target Location:
2024-12-18 17:30:10 <FileTransfer status="success>
<FileName>XYZ.csv</FileName>
<FileName>ABC.zip</FileName>
<FileName>123.docx</FileName>
</FileTransfer>
Desired result:
File Name FileCopyLocation Target Location
XYZ.csv 2024-12-18 17:02:50 2024-12-18 17:30:10
ABC.zip 2024-12-18 17:02:58 2024-12-18 17:30:10
123.docx 2024-12-18 17:03:38 2024-12-18 17:30:10
143.docx 2024-12-18 18:06:19 Pending
I want to avoid join.
@ITWhisperer Can you please help?
Assuming you already have filenames extracted, then try something like this
| eval file_name=coalesce('FileTransfer.FileName', file_name)
| stats values(eval(if(sourcetype="filecopy",_time,null()))) as FileCopyLocation values(eval(if(sourcetype="transfer",_time,null()))) as TargetLocation by file_name
| eval FileCopyLocation=strftime(FileCopyLocation,"%F %T")
| eval TargetLocation=strftime(TargetLocation, "%F %T")
| fillnull TargetLocation value="Pending"
Still i get "Pending" for all the files even though it was success and timestamp is there.
Sounds like the file names don't completely match or perhaps the TargetLocation event doesn't have it in? Is it always the same file or at least file position e.g. always the last in the list? Or possibly files after a particular point in the XML message. Without being able to see your data, it is a bit difficult to determine what might be wrong.
Filenames match exactly. Targetlocation has the file name. Like I have it in my example the file names are different. It is not related to position.
When modify the stats to values(file_name) i get results but it is so weird results
Hi @t_splunk_d ,
you can use the transaction command:
index=your_index sourcetype IN (sourcetype1, sourcetype2)
| eval FileName=coalesce(file_name, FileName)
| stats earliest(_time) AS FileCopyLocation latest(_time) AS TargetLocation BY FileName
| eval FileCopyLocation=strftime(FileCopyLocation,"%Y-%m-%d %H:%M:$S"), TargetLocation=strftime(TargetLocation,"%Y-%m-%d %H:%M:$S")
| fillnull value="Pending" TargetLocation
| table FileName FileCopyLocation TargetLocation
Ciao.
Giuseppe
Transaction command is costly and it has limitations for wider timeframe and larger datasets.
Thank you for the help. I always get null for TargetLocation in stats and thus showing "Pending"
I notice that latest(TargetLocation) has multiple values and null is the latest. Is there a way to eliminate null so that the latest time can be displayed?