
The Lovely Transaction Command

icyfeverr
Path Finder

When using the transaction command, I am getting unexpected results.

Search:
sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX

Output:
[09/10/2014,12:04:46am]Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:46am]Using Previous Settings
[09/10/2014,12:04:46am]Processing: abc123
[09/10/2014,12:04:46am]file : testfile2
[09/10/2014,12:04:46am]Key:[testexample2]:/myfile2

It seems that some events are bleeding into my results, I believe because of the timestamps.

Expected Output:
[09/10/2014,12:04:46am] Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:47am]file : testfile1
[09/10/2014,12:04:47am]Using Pre Translater
[09/10/2014,12:04:47am]Translating inbound
[09/10/2014,12:04:47am]Key: [testexample1]:/myfile1

Why is transaction grouping all of the 12:04:46am events together instead of the mix of seconds 46 and 47? In the raw logs themselves, the lines are written out in the correct order, even if the log is not in strictly sequential timestamp order. The "Expected Output" is how the raw logs actually look.

The first three lines are correct in the "Output", but it is missing the last 4 lines of the "Expected Output".

Any help would be greatly appreciated.

1 Solution

icyfeverr
Path Finder

After reviewing the Splunk data in more detail, I believe I found an adequate solution.

Reviewing _time vs. _indextime, the events in the expected output all had an _indextime of 1410325505, whereas the events in the actual (unexpected) output had a mixture of _indextime values: 1410325505 and 1410325496.
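
For reference, a search along these lines will show the two values side by side (idx_time is just an illustrative alias):

sourcetype=abc source="/u/spool/zlogs/abc" | eval idx_time=strftime(_indextime, "%m/%d/%Y %H:%M:%S") | table _time idx_time _raw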

With that said, I modified the search query.

From:
sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX

To:
sourcetype=abc [ search sourcetype=abc XXX | return _indextime ] | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX

Essentially, the subsearch returns the _indextime to the main search, narrowing the events down to just those indexed at that time. The transaction command can then match on startswith/endswith, and the final search pulls out the specific pattern.
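
In other words, assuming the matching events were all indexed at 1410325505, the subsearch effectively expands the outer search into something like:

sourcetype=abc _indextime=1410325505 | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX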


inode
Explorer

Hey guys, I've put together a small article addressing some of those issues with transactions, happy to receive feedback!

http://foren6.wordpress.com/2014/11/18/why-is-my-splunk-transaction-not-working/


sowings
Splunk Employee

Glad you found a solution. But now the wrinkle: Note that if the events span a second boundary, they'll have different _indextime values.


icyfeverr
Path Finder

You would be correct; that is a potential problem, but because of how the information is written to our logs, it is sufficient for us. However, if someone needs a more robust approach because their logs are not like ours, another solution is below.

XXX = search criteria

sourcetype=abc [ search sourcetype=abc XXX | eval mytime=strftime(_time - 2, "%m/%d/%Y:%H:%M:%S") | eval endtime=strftime(_time + 2, "%m/%d/%Y:%H:%M:%S") | rename mytime as earliest | rename endtime as latest | return earliest, latest ] | sort _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX

Same idea, but instead we create a start and end time with the eval command (mytime/endtime). We then rename those fields to earliest and latest, since earliest and latest are predefined time modifiers that Splunk recognizes. Then we sort by _indextime, which puts the events in the correct order for the transaction command to work with the data correctly.
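
For example, assuming a matching event at 09/10/2014 12:04:46am, the subsearch effectively expands into time bounds like:

sourcetype=abc earliest="09/10/2014:12:04:44" latest="09/10/2014:12:04:48" | sort _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX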


sowings
Splunk Employee

Yes, a line breaker / linemerge scenario could work to do what you want, but it works on consecutive lines, meaning that if your process interleaves events from separate transactions, you could end up with a mangled result. If it's single-threaded, you'd be OK.

Note that linemerge will be a tad more expensive at index-time.
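
If you do go the index-time route, a minimal props.conf sketch for this scenario might look something like the following (the stanza name abc and the break pattern are assumptions based on the samples above):

[abc]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = Processing Check Number
MAX_EVENTS = 256

As noted, this is only safe if a single writer produces the log, so transactions never interleave.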


sowings
Splunk Employee

Ideally, you'll provide some field to use as a correlation linking the events of a transaction, such as, say, the message ID of an SMTP message, a session ID from a web browser, etc. Without that, Splunk will just glue together everything it finds between startswith and endswith.
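
For example, with a hypothetical session_id field extracted from each event, the grouping becomes unambiguous:

sourcetype=abc source="/u/spool/zlogs/abc" | transaction session_id maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX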


sowings
Splunk Employee

In the absence of any field to correlate on, I believe that transaction will find whatever is common between the named events. In this case, the _time field is identical in the output. Splunk doesn't really care about "file order" as much, so the linear thread of how the events were written to the log isn't as visible.
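
One way to see how the events were grouped is to inspect the duration and eventcount fields that transaction adds to each result, for example:

sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | table _time duration eventcount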


icyfeverr
Path Finder

Do you have any recommendations on how to solve this issue? The only other option I can think of is to use line breaking and merge the events together at index time. I can guarantee that I will not be able to modify the logs themselves to apply an identifier to each event (i.e., each line); our IT department is firm on that.


icyfeverr
Path Finder

I understand what you are saying, but it's not functioning the way you think it is. The "Expected Output" is how the log is written out, so if Splunk were truly gluing the log together, all the events should be glued together exactly as the "Expected Output" is laid out. What seems to be happening is that the transaction command is randomly pulling together events whose timestamps match, creating a "random" transaction.


sowings
Splunk Employee

First question: Is there a field that unites all of these events?


icyfeverr
Path Finder

Don't take this the wrong way, as I appreciate the help, but I provided examples above. To answer your question: no, there is not a field that unites all of the events, which is why I am utilizing startswith and endswith.


sowings
Splunk Employee

Fair enough. You were eliding (masking) some text in the events, so I wasn't sure whether you were leaving anything out.
