When using the transaction command, I am getting unexpected results.
Search:
sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX
Output:
[09/10/2014,12:04:46am]Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:46am]Using Previous Settings
[09/10/2014,12:04:46am]Processing: abc123
[09/10/2014,12:04:46am]file : testfile2
[09/10/2014,12:04:46am]Key:[testexample2]:/myfile2
It seems that some events are bleeding into my results, I believe because of the timestamps.
Expected Output:
[09/10/2014,12:04:46am] Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:47am]file : testfile1
[09/10/2014,12:04:47am]Using Pre Translater
[09/10/2014,12:04:47am]Translating inbound
[09/10/2014,12:04:47am]Key: [testexample1]:/myfile1
Why is transaction grouping all of the 12:04:46am events together instead of the mix of the 46- and 47-second events? In the raw logs themselves, the lines are written out in the correct order, so the log itself is not in strictly sequential timestamp order. The "Expected Output" is how the raw logs actually look.
The first three lines of the "Output" are correct, but it is missing the last four lines of the "Expected Output".
Any help would be greatly appreciated.
After reviewing the Splunk data in more detail, I believe I found an adequate solution.
Comparing _time vs. _indextime, the events in the expected output all had an _indextime of 1410325505, whereas the events in the actual (unexpected) output had a mixture of _indextime values, 1410325505 and 1410325496.
With that said, I modified the search query. From: sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX
To: sourcetype=abc [ search sourcetype=abc XXX | return _indextime ] | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX
Essentially, the subsearch returns the _indextime to the main search, narrowing the events down to just those that were indexed at that time. The transaction command can then match on its startswith/endswith boundaries, and the final search pulls out the specific pattern.
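For anyone following along, the subsearch is roughly equivalent to adding an _indextime filter to the outer search. Using the _indextime value mentioned above purely as an illustration, the expanded search would look something like:
sourcetype=abc _indextime=1410325505 | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX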
Hey guys, I've put together a small article addressing some of those issues with transactions, happy to receive feedback!
http://foren6.wordpress.com/2014/11/18/why-is-my-splunk-transaction-not-working/
Glad you found a solution. One wrinkle, though: if the events span a second boundary, they'll have different _indextime values.
You would be correct, that is a potential problem, but because of how the information is written to our logs it is sufficient. However, if someone whose logs are not like ours needs a more robust approach, another solution is below.
XXX = search criteria
sourcetype=abc [ search sourcetype=abc XXX | eval mytime=strftime(_time - 2, "%m/%d/%Y:%H:%M:%S") | eval endtime=strftime(_time + 2, "%m/%d/%Y:%H:%M:%S") | rename mytime as earliest | rename endtime as latest | return earliest, latest ] | sort by _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX
Same idea, but instead we create a start and end time with the eval command (mytime/endtime). We then rename those fields to earliest and latest, since earliest and latest are predefined time-range modifiers that Splunk recognizes. Then we sort by _indextime, which puts the events in the correct order for the transaction command to work with the data.
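To make the mechanics concrete: the subsearch hands back earliest/latest time-range modifiers, so the outer search effectively expands to something like the following (the timestamps are illustrative only, taken from the sample log above):
sourcetype=abc earliest="09/10/2014:00:04:44" latest="09/10/2014:00:04:48" | sort by _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX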
Yes, a line breaker / linemerge scenario could work to do what you want, but it works on consecutive lines, meaning that if your process interleaves events from separate transactions, you could end up with a mangled result. If it's single-threaded, you'd be OK.
Note that linemerge will be a tad more expensive at index-time.
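For reference, the index-time merge would be configured in props.conf where the data is parsed. A rough, untested sketch, assuming the sourcetype stanza is [abc] and that every transaction begins with the "Processing Check Number" line:
[abc]
# merge consecutive lines into a single event...
SHOULD_LINEMERGE = true
# ...and only start a new event on the line that opens a transaction
BREAK_ONLY_BEFORE = Processing Check Number
# safety cap so a missing start line doesn't merge half the file into one event
MAX_EVENTS = 100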
Ideally, you'll provide some field to use as a correlation linking the events of a transaction, such as, say, the message ID of an SMTP message, a session ID from a web browser, etc. Without that, Splunk will just glue together everything it finds between startswith and endswith.
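For example, if such an identifier did exist, you would hand the field straight to transaction and the startswith/endswith guessing would be largely unnecessary. Field and sourcetype names here are purely hypothetical:
sourcetype=web_access | transaction JSESSIONID maxspan=10m startswith="GET /login" endswith="GET /logout"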
In the absence of any field to correlate on, I believe that transaction will find whatever is common between the named events; in this case, the _time field is identical in the output. Splunk doesn't really care about "file order", so the linear sequence in which the events were written to the log isn't visible to it.
Do you have any recommendations on how to solve this? The only other option I can think of is to use line breaking and merge the events together at index time. I can guarantee that I will not be able to modify the logs themselves to apply an identifier to each event (i.e., each line); our IT department is firm on that.
I understand what you are saying, but it's not functioning the way you think it is. The "Expected Output" is how the log is written out, so if Splunk were truly gluing the log together, all the events should be glued together just as the "Expected Output" is laid out. What seems to be happening is that the transaction command is randomly pulling together events whose timestamps match, creating a "random" transaction.
First question: Is there a field that unites all of these events?
Don't take this the wrong way, as I appreciate the help, but I provided examples above. To answer your question, no, there is not a field that unites all of the events together, which is why I am using startswith and endswith.
Fair enough. You were eliding (masking) some text in the events, so I wasn't sure whether you were leaving anything out.