I have a log file with repeating patterns that look like this. Notice there are only 3 distinct field names, and pay attention to the 4th, 5th and 6th lines:
Time1 field1: value1
Time2 field2: value2
Time3 field3: value3
Time4 field2: value4
Time5 field1: value5
Time6 field3: value6
Time7 field1: value7
Time8 field2: value8
Time9 field3: value9
So I have one interesting piece of information per line, and I would like to group these events together into a single event (this probably requires transactions). The result should look like this (pay attention to the 2nd line):
Time1 => Time3 - field1: value1, field2: value2, field3: value3
Time4 => Time6 - field1: value5, field2: value4, field3: value6
Time7 => Time9 - field1: value7, field2: value8, field3: value9
Unfortunately, as you can see, the order in which these fields arrive is more or less random. I can only rely on these rules binding the lines together:
* "field3" always closes a transaction
* "field1" and "field2" have relatively close timestamps (5 minutes at most between them)
I've tried many combinations of the "transaction", "filldown" and "sort" commands, but I'm unable to get the expected result.
Could somebody help me?
Hi yoho, see if this article helps shed some light:
http://foren6.wordpress.com/2014/11/18/why-is-my-splunk-transaction-not-working/
I found an awkward solution.
I must first note that the "filldown" command fills in values "down the page", which is actually the reverse of time order (since the most recent events come first).
So the trick is first to reverse the order of events with "|sort _time". Then you add "|filldown field1, field2" to propagate the values of these fields to all later events. In the end, the lines with "field3" contain the field1, field2 and field3 values, and you just have to filter on those events with "|search field3".
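Putting it together, a minimal sketch of that search (assuming field1, field2 and field3 are already extracted, and with "your_search" as a placeholder for the base search; "field3=*" keeps only the events where field3 is present):
your_search
| sort 0 _time
| filldown field1 field2
| search field3=*
| table _time field1 field2 field3
The "0" on sort simply lifts the default result limit; note that _time here is the timestamp of the closing field3 event.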
It doesn't entirely answer the question, since you lose the first timestamp of the transaction (Time1, Time4 and Time7), but I didn't actually need it.
"|reverse" is probably faster than "|sort _time".
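For example, the same sketch as above with "reverse" swapped in (again assuming the fields are already extracted and results come back in the default most-recent-first order):
your_search | reverse | filldown field1 field2 | search field3=* | table _time field1 field2 field3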
If the log is consistently written within a certain time frame, you can try:
your_search | transaction maxspan=4m endswith="field3" maxevents=3 | stats...
maxspan indicates that 4m is the time "bucket" within which all the events of the transaction must fall. endswith tells transaction what the final event should be. maxevents is, well, the maximum number of events that can be in the transaction.
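The trailing "stats..." is left open above; purely to illustrate the shape of the result (the names start_time and end_time are made up here), one could use the fact that after "transaction" the event's _time is the earliest member's timestamp and "duration" is the span:
your_search
| transaction maxspan=4m endswith="field3" maxevents=3
| eval start_time=_time, end_time=_time+duration
| convert ctime(start_time) ctime(end_time)
| table start_time end_time field1 field2 field3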
Is it possible to have the log output a unique id for each transaction? That would simplify your efforts a great deal.
http://docs.splunk.com/Documentation/Splunk/5.0.3/SearchReference/Transaction
Then in that case, you'll need something to tie them all together, such as a field4 that carries a unique identifier for each transaction.
That's interesting but doesn't exactly solve the problem.
Let's take the first transaction: although Time1 and Time2 are relatively close (maxspan=4m would be fine for them), Time3 can be several hours later, so the "maxspan" condition isn't met for the whole transaction.
I believe "transaction" can't be used to group the first 2 lines of the transaction because there is no predefined order, but maybe another command would do the trick.