Hello,
I've been experimenting with queries that make use of the transaction command but override the _time field. Given the following sample data:
2011-02-01T12:00:00.000-0800 SID=1 (1,2011-01-01T12:00:00.001-0800)
2011-02-01T12:00:00.001-0800 SID=1 (3,2011-01-01T12:00:00.003-0800)
2011-02-01T12:00:00.002-0800 SID=1 (2,2011-01-01T12:00:00.002-0800)
2011-02-01T12:00:00.003-0800 SID=1 (4,2011-01-01T12:00:00.004-0800)
2011-02-01T12:00:00.004-0800 SID=1 (6,2011-01-01T12:00:00.006-0800)
2011-02-01T12:00:00.005-0800 SID=1 (5,2011-01-01T12:00:00.005-0800)
The body of each event contains data in the following format:
(actionId, timestamp)
A given transaction should flow from actionId 1 to 6, ordered by the timestamp in the body rather than the timestamp of the event.
So a query would go like:
... (data extracted into fields actionId and actionTimeStamp with actionTimeStamp in epoch time format) | eval _time=actionTimeStamp | sort 0 -_time | transaction SID startswith=(actionId="1") endswith=(actionId="6")
When I compare this to the following slightly different query:
... (data extracted into fields actionId and actionTimeStamp with actionTimeStamp in epoch time format) | eval _time=actionTimeStamp | sort 0 -actionTimeStamp | transaction SID startswith=(actionId="1") endswith=(actionId="6")
I get a different number of transactions. The first query seems to lose some data compared to the second. The only thing I can think of is that transaction, for whatever reason, doesn't like it when I override the value of _time and then sort on it.
Another possibly related note: when I do the "eval _time=actionTimeStamp" but skip the sort, the query never returns for me.
So my question is whether I'm going about the query the right way when I need to build a transaction on a timestamp in the body of the event rather than the event's original timestamp.
The transaction command deals with things in a "stream" fashion. What you're trying to do can be accomplished, and I feel you're right on the edge.
First thing I want to touch on: the use of "startswith" and "endswith". These options tell Splunk when there is a hard start and a hard end to each transaction. The reason you are losing events is that when Splunk encounters events that do not fall within the transaction's start and end conditions, those events are "evicted" from the stream. I wrote a long post about how the transaction command works here: Transaction-Problems
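As a sanity check, transaction supports a keepevicted option that retains the events it would otherwise drop, marking incomplete transactions with closed_txn=0. A sketch against your original query (assuming your actionId and actionTimeStamp fields are already extracted) that surfaces only the evicted groups:

... | eval _time=actionTimeStamp | sort 0 -_time | transaction SID startswith=(actionId="1") endswith=(actionId="6") keepevicted=true | search closed_txn=0

If this returns results, you can inspect exactly which events were falling outside your start/end conditions.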
Next up, Splunk is fine with you overwriting the _time field; you can do this as a personal preference. Really, what you need is to sort on two fields so your stream is in order, then bind the events into a transaction. You can do this with or without overriding _time:
[search] | sort 0 -actionTimeStamp,-actionId | transaction SID
This sorts all your events descending by actionTimeStamp and then, for events that share the same actionTimeStamp, descending by actionId. The transaction command will then only stitch together events that have the same SID and will discard any event where SID is not populated. For a speed increase, if you know there will only ever be 6 events per transaction, you can use transaction SID maxevents=6.
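Putting that together, a sketch of the full pipeline with the event cap (note the 0 after sort, which lifts sort's default result limit so large result sets aren't silently truncated):

[search] | sort 0 -actionTimeStamp,-actionId | transaction SID maxevents=6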
Let me know if you have further questions.
Can you please open a support ticket with your steps to recreate and your Splunk version?
Also if you have any sample data, that would also be useful.
Thanks!
I'm able to recreate it, but I can also work around it by running the query over a shorter time window. I have a dashboard with the same query, and its chart receives all the events properly; it's only on the flashtimeline page that the events disappear at the end of the query's execution.
I've never experienced this type of problem with the transaction command; can you re-create the conditions? Are you accidentally clicking on a field as you're viewing the transaction stream?
If you can re-create the issue on a consistent basis, it might be worth opening a ticket to support.
One thing I noticed when I use transaction on the flashtimeline page: sometimes I see the results being output from transaction, but at the very end all the results are removed and it shows "No matching events found." Why would this happen?