Splunk Search

Transaction Command not returning all results

splunkuser1948
Engager

Hi, 

I have a splunk query as below:

index=platform env=sandbox  http_method="GET" 

This gave me 1 result back.

Now, when I am applying transaction command to it

index=platform env=sandbox  http_method="GET" | transaction  startswith="GET"

It is returning 0 results. 🤔

However, if I do like this:

index=platform env=sandbox  http_method="GET" | transaction  startswith="GET" keepevicted=true

It returns my 1 result back. Value of `closed_txn`  is 1.

Also, if I do like below, still it returns my 1 result back. What is going on ?

index=platform env=sandbox  http_method="GET" | transaction CorrelationId startswith="GET"

 

"GET" is present in my event `_raw` otherwise  first search command would not have returned me 1 result. 

Labels (1)
0 Karma

tscroggins
Influencer

@splunkuser1948 

You're encountering undefined (or at least undocumented) behavior.

The transaction command should include at least one field. If it doesn't, I suspect Splunk evicts every event it encounters, which is implied in your experiment with the keepevicted option.

The behavior of closed_txn in this case is documented (emphasis mine) [1]:

"The 'closed_txn' field is set to '1' if one of the following conditions is met: maxevents, maxpause, maxspan, startswith."

1. https://docs.splunk.com/Documentation/Splunk/8.1.3/SearchReference/Transaction#Memory_control_option...

0 Karma

splunkuser1948
Engager

Hi @tscroggins 

It is not mandatory for transaction command to have a field.

Also, what I am observing is that if I have 

index=platform env=sandbox  http_method="GET" 

returning `n` events (This time I increased/changed my time limit) , then

index=platform env=sandbox  http_method="GET" | transaction  startswith="GET"

returns me `n-1` events always. 

But below 2 commands still gives me `n` events. 🤔

index=platform env=sandbox  http_method="GET" | transaction  startswith="GET" keepevicted=true
index=platform env=sandbox  http_method="GET" | transaction CorrelationId startswith="GET"
Tags (1)
0 Karma

tscroggins
Influencer

@splunkuser1948 

Indeed, but the command is behaving as designed. Without a correlating field, every event is eligible to be added to an open transaction, and each match against startswith will close the current transaction and open a new one. The final transaction will remain open because no subsequent event forced Splunk to close it.

In single-threaded log streams, this is probably fine, but most applications--web servers in particular--are multi-threaded. Events from different transactions will be interleaved together. (This is heavily dependent on application architecture, of course.)

0 Karma

splunkuser1948
Engager
But even the last event satisfies the startswith. So, splunk should close it too as its the end and no further events are there to process. Don't you think that's a bug in splunk ?
0 Karma

tscroggins
Influencer

@splunkuser1948 

I see your point. Here's a lightly obfuscated example using splunkweb access logs:

192.0.2.1 - admin [29/May/2021:17:25:59.501 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762/events?output_mode=json&offset=0&count=20&segmentation=full&max_lines=5&field_list=host%2Csource%2Csourcetype%2Ctag%2Ctest2%2Cclientip%2Ctest1%2C_raw%2C_time%2C_audit%2C_decoration%2Ceventtype%2C_eventtype_color%2Clinecount%2C_fulllinecount%2C_icon%2Ctag*&truncation_mode=abstract&_=1622323362071 HTTP/1.1" 200 419 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 2ms
192.0.2.1 - admin [29/May/2021:17:25:59.538 -0400] "GET /en-US/splunkd/__raw/services/server/health/splunkd?output_mode=json&_=1622323362072 HTTP/1.1" 200 408 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 0ms
192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms

index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| transaction startswith="GET"
| table _time closed_txn eventcount _raw

returns nothing, but

index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| transaction startswith="GET" keepevicted=t
| table _time closed_txn eventcount _raw

returns

_time closed_txn eventcount _raw
2021-05-29 22:25:59.723 1 1 192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms

I expect closed_txn to equal 0, but it's equal to 1.

If, however, we extract a transaction field, no transaction is evicted:

index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| rex field=other "-\\s(?<session_id>[^\\s]+)"
| transaction session_id startswith="GET"
| table _time closed_txn eventcount _raw

_time closed_txn eventcount _raw
2021-05-29 22:25:59.723 1 1 192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms

There could be an internal difference in the behavior of transaction eviction depending on the presence of transaction fields.

Inferring the behavior of the transaction command through experimentation is useful, but if keepevicted alters the behavior of the command beyond the documented explanation, I don't know if I'd consider it a defect in the code or an oversight in the documentation. Splunk support is likely to argue the command is working as documented whether the value of closed_txn is consistent or not.

If you have the time, you could test across various earlier versions of Splunk looking for a version when keepevicted and closed_txn behaved consistently. In my experience, Splunk support doesn't provide that level of service, particularly for versions of Splunk that are no longer supported. If you have one, your account manager or sales engineer may be able to assist.

0 Karma
Get Updates on the Splunk Community!

3 Ways to Make OpenTelemetry Even Better

My role as an Observability Specialist at Splunk provides me with the opportunity to work with customers of ...

What's New in Splunk Cloud Platform 9.2.2406?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.2.2406 with many ...

Enterprise Security Content Update (ESCU) | New Releases

In August, the Splunk Threat Research Team had 3 releases of new security content via the Enterprise Security ...