
The Lovely Transaction Command

icyfeverr
Path Finder

When using the transaction command, I am getting unexpected results.

Search:
sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX

Output:
[09/10/2014,12:04:46am]Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:46am]Using Previous Settings
[09/10/2014,12:04:46am]Processing: abc123
[09/10/2014,12:04:46am]file : testfile2
[09/10/2014,12:04:46am]Key:[testexample2]:/myfile2

It seems that some events are bleeding into my results, I believe because of the timestamps.

Expected Output:
[09/10/2014,12:04:46am] Processing Check Number: [XXX]
[09/10/2014,12:04:46am]ID [000000000]
[09/10/2014,12:04:46am]ID2 [000000000]
[09/10/2014,12:04:47am]file : testfile1
[09/10/2014,12:04:47am]Using Pre Translater
[09/10/2014,12:04:47am]Translating inbound
[09/10/2014,12:04:47am]Key: [testexample1]:/myfile1

Why is transaction grouping all of the 12:04:46am events together instead of the mix of seconds 46 and 47? In the raw logs themselves, the lines are written out in the correct order, even if the log is not in strictly sequential timestamp order. The "Expected Output" is how the raw logs actually look.

The first three lines are correct in the "Output", but it is missing the last 4 lines of the "Expected Output".

Any help would be greatly appreciated.

1 Solution

icyfeverr
Path Finder

After reviewing the Splunk data in more detail, I believe I found an adequate solution.

Reviewing _time vs. _indextime, the events in the expected output all had an _indextime of 1410325505, whereas the events in the actual (unexpected) output had a mixture of _indextime values: 1410325505 and 1410325496.
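
For reference, a search along these lines will show the two values side by side (idx_time is just an illustrative alias):

sourcetype=abc source="/u/spool/zlogs/abc" | eval idx_time=strftime(_indextime, "%m/%d/%Y %H:%M:%S") | table _time idx_time _raw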

With that said, I modified the search query.

From:
sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX

To:
sourcetype=abc [ search sourcetype=abc XXX | return _indextime ] | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX

Essentially, the subsearch returns the _indextime to the main search, narrowing the events down to just those indexed at that time. The transaction command can then match on startswith/endswith, and the final search pulls out the specific pattern.
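
In other words, assuming the matching events were all indexed at 1410325505, the subsearch effectively expands the outer search into something like:

sourcetype=abc _indextime=1410325505 | transaction startswith="Processing Check Number" endswith="]Key [" | search XXX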


inode
Explorer

Hey guys, I've put together a small article addressing some of those issues with transactions, happy to receive feedback!

http://foren6.wordpress.com/2014/11/18/why-is-my-splunk-transaction-not-working/


sowings
Splunk Employee

Glad you found a solution. But now the wrinkle: Note that if the events span a second boundary, they'll have different _indextime values.


icyfeverr
Path Finder

You would be correct; that is a potential problem, but because of how the information is written to our logs, it is sufficient for us. However, if someone needs a more robust approach because their logs are not like ours, another solution is below.

XXX = search criteria

sourcetype=abc [ search sourcetype=abc XXX | eval mytime=strftime(_time - 2, "%m/%d/%Y:%H:%M:%S") | eval endtime=strftime(_time + 2, "%m/%d/%Y:%H:%M:%S") | rename mytime as earliest | rename endtime as latest | return earliest, latest ] | sort _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX

Same idea, but instead we create a start and end time with the eval command (mytime/endtime). We then rename those fields to earliest and latest, since earliest and latest are predefined time modifiers that Splunk recognizes. Then we sort by _indextime, which puts the events in the correct order for the transaction command to work with the data correctly.
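
For example, assuming a matching event at 09/10/2014 12:04:46am, the subsearch effectively expands into time bounds like:

sourcetype=abc earliest="09/10/2014:12:04:44" latest="09/10/2014:12:04:48" | sort _indextime | transaction maxspan=2s startswith="Processing Check Number" endswith="]Key [" | search XXX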


sowings
Splunk Employee

Yes, a line breaker / linemerge scenario could work to do what you want, but it works on consecutive lines, meaning that if your process interleaves events from separate transactions, you could end up with a mangled result. If it's single-threaded, you'd be OK.

Note that linemerge will be a tad more expensive at index-time.
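
If you do go the index-time route, a minimal props.conf sketch for this scenario might look something like the following (the stanza name abc and the break pattern are assumptions based on the samples above):

[abc]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = Processing Check Number
MAX_EVENTS = 256

As noted, this is only safe if a single writer produces the log, so transactions never interleave.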


sowings
Splunk Employee

Ideally, you'll provide some field to use as a correlation linking the events of a transaction, such as, say, the message ID of an SMTP message, a session ID from a web browser, etc. Without that, Splunk will just glue together everything it finds between startswith and endswith.
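
For example, with a hypothetical session_id field extracted from each event, the grouping becomes unambiguous:

sourcetype=abc source="/u/spool/zlogs/abc" | transaction session_id maxspan=8s startswith="Processing Check Number" endswith="]Key [" | search check_number=XXX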


sowings
Splunk Employee

In the absence of any field to correlate on, I believe that transaction will find whatever is common between the named events. In this case, the _time field is identical in the output. Splunk doesn't really care about "file order" as much, so the linear thread of how the events were written to the log isn't as visible.
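
One way to see how the events were grouped is to inspect the duration and eventcount fields that transaction adds to each result, for example:

sourcetype=abc source="/u/spool/zlogs/abc" | transaction maxspan=8s startswith="Processing Check Number" endswith="]Key [" | table _time duration eventcount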


icyfeverr
Path Finder

Do you have any recommendations on how to solve this issue? The only other option I can think of is to use line breaking and merge the events together at index time. I can guarantee that I will not be able to modify the logs themselves to apply an identifier to each event (i.e., each line); our IT department is firm on that.


icyfeverr
Path Finder

I understand what you are saying, but it's not functioning the way you think it is. The "Expected Output" is how the log is written out, so if Splunk were truly gluing the log together, all the events should be glued together exactly as the "Expected Output" is laid out. What seems to be happening is that the transaction command is randomly pulling together events whose timestamps match, creating a "random" transaction.


sowings
Splunk Employee

First question: Is there a field that unites all of these events?


icyfeverr
Path Finder

Don't take this the wrong way, as I appreciate the help, but I provided examples above. To answer your question: no, there is not a field that unites all of the events, which is why I am utilizing startswith and endswith.


sowings
Splunk Employee

Fair enough. You were eliding (masking) some text in the events, so I wasn't sure whether you were leaving anything out.
