I'm new to Splunk, so apologies if this is a silly question.
I have a log file that reads:
2023-03-22 00:57:09,517 INFO TestScript - Generating reports with date of 20230321 and thread pool size of 5
...
...
2023-03-22 00:59:23,681 INFO MultiTestScript - Multi Test report generation completed successfully!
and I am trying to extract the elapsed time between these two events.
If I try this search
<search terms> | transaction startswith="Generating reports" endswith="report generation completed"
I get no results found.
If I search for the two halves of the transaction separately, i.e.
<search terms> | transaction startswith="Generating reports"
and
<search terms> | transaction endswith="report generation completed"
the search returns the appropriate part of the log file.
As soon I combine the startswith= and endswith= fields in a single search, however, I get no results.
This query works properly with another log file. The only difference I can see between the files is that the file that works contains multiple transactions (i.e. "Generating report"/"report generation completed" pairs) while the files that won't work contain only one.
The transaction command works best when there is a key field (e.g. correlation ID) shared between events that tie the events together. In this case it can only go on time order, which, depending on what is logging the output, may not be a very reliable way to pair events.
It really comes down to what you are trying to achieve. Is it just the total time between the first event (startswith) and the very last event (endswith)? If so, then something like the stats command would be better and far more efficient...
search term ... ("Generating reports") OR ("report generation completed")
| eval type=if(searchmatch("Generating reports"), "startswith", "endswith")
| stats min(eval(if(type="startswith", _time, ""))) AS start
max(eval(if(type="endswith", _time, ""))) AS end
| eval duration=end - start
,duration_hm=tostring(duration, "duration")
| foreach start end [ eval <<FIELD>>=strftime('<<FIELD>>', "%c") ]
It's important to be clear in what you are aiming to achieve.
Hope this helps
Hi @algol2
That's a bit strange, for sure.
Try adding the following parameters and see if that makes a difference...
| transaction maxevents=2 keepevicted=true keeporphans=true startswith="Generating reports" endswith="report generation completed"
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction?ref=hk
Hope it helps
Hey,
It certainly made a difference, the search now returns thousands of hits. It looks like it returns the lines I want, but also everything in between, in two-line chunks.
I removed the maxevents=2 argument, which reduces the hit count to 11, but Splunk is still gathering up all these extra entries that don't contain the startswith= or endswith= terms.
The transaction command works best when there is a key field (e.g. correlation ID) shared between events that tie the events together. In this case it can only go on time order, which, depending on what is logging the output, may not be a very reliable way to pair events.
It really comes down to what you are trying to achieve. Is it just the total time between the first event (startswith) and the very last event (endswith)? If so, then something like the stats command would be better and far more efficient...
search term ... ("Generating reports") OR ("report generation completed")
| eval type=if(searchmatch("Generating reports"), "startswith", "endswith")
| stats min(eval(if(type="startswith", _time, ""))) AS start
max(eval(if(type="endswith", _time, ""))) AS end
| eval duration=end - start
,duration_hm=tostring(duration, "duration")
| foreach start end [ eval <<FIELD>>=strftime('<<FIELD>>', "%c") ]
It's important to be clear in what you are aiming to achieve.
Hope this helps