Hi,
I have a splunk query as below:
index=platform env=sandbox http_method="GET"
This gave me 1 result back.
Now, when I am applying transaction command to it
index=platform env=sandbox http_method="GET" | transaction startswith="GET"
It is returning 0 results. 🤔
However, if I do like this:
index=platform env=sandbox http_method="GET" | transaction startswith="GET" keepevicted=true
It returns my 1 result back. Value of `closed_txn` is 1.
Also, if I do like below, still it returns my 1 result back. What is going on ?
index=platform env=sandbox http_method="GET" | transaction CorrelationId startswith="GET"
"GET" is present in my event `_raw` otherwise first search command would not have returned me 1 result.
You're encountering undefined (or at least undocumented) behavior.
The transaction command should include at least one field. If it doesn't, I suspect Splunk evicts every event it encounters, which is implied in your experiment with the keepevicted option.
The behavior of closed_txn in this case is documented (emphasis mine) [1]:
"The 'closed_txn' field is set to '1' if one of the following conditions is met: maxevents, maxpause, maxspan, startswith."
Hi @tscroggins
It is not mandatory for transaction command to have a field.
Also, what I am observing is that if I have
index=platform env=sandbox http_method="GET"
returning `n` events (This time I increased/changed my time limit) , then
index=platform env=sandbox http_method="GET" | transaction startswith="GET"
returns me `n-1` events always.
But below 2 commands still gives me `n` events. 🤔
index=platform env=sandbox http_method="GET" | transaction startswith="GET" keepevicted=true
index=platform env=sandbox http_method="GET" | transaction CorrelationId startswith="GET"
Indeed, but the command is behaving as designed. Without a correlating field, every event is eligible to be added to an open transaction, and each match against startswith will close the current transaction and open a new one. The final transaction will remain open because no subsequent event forced Splunk to close it.
In single-threaded log streams, this is probably fine, but most applications--web servers in particular--are multi-threaded. Events from different transactions will be interleaved together. (This is heavily dependent on application architecture, of course.)
I see your point. Here's a lightly obfuscated example using splunkweb access logs:
192.0.2.1 - admin [29/May/2021:17:25:59.501 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762/events?output_mode=json&offset=0&count=20&segmentation=full&max_lines=5&field_list=host%2Csource%2Csourcetype%2Ctag%2Ctest2%2Cclientip%2Ctest1%2C_raw%2C_time%2C_audit%2C_decoration%2Ceventtype%2C_eventtype_color%2Clinecount%2C_fulllinecount%2C_icon%2Ctag*&truncation_mode=abstract&_=1622323362071 HTTP/1.1" 200 419 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 2ms
192.0.2.1 - admin [29/May/2021:17:25:59.538 -0400] "GET /en-US/splunkd/__raw/services/server/health/splunkd?output_mode=json&_=1622323362072 HTTP/1.1" 200 408 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 0ms
192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms
index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| transaction startswith="GET"
| table _time closed_txn eventcount _raw
returns nothing, but
index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| transaction startswith="GET" keepevicted=t
| table _time closed_txn eventcount _raw
returns
_time closed_txn eventcount _raw
2021-05-29 22:25:59.723 1 1 192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms
I expect closed_txn to equal 0, but it's equal to 1.
If, however, we extract a transaction field, no transaction is evicted:
index=_internal sourcetype=splunkd_ui_access method=GET
| head 1
| rex field=other "-\\s(?<session_id>[^\\s]+)"
| transaction session_id startswith="GET"
| table _time closed_txn eventcount _raw
_time closed_txn eventcount _raw
2021-05-29 22:25:59.723 1 1 192.0.2.1 - admin [29/May/2021:17:25:59.723 -0400] "GET /en-US/splunkd/__raw/servicesNS/nobody/search/search/jobs/1622323559.762?output_mode=json&_=1622323362073 HTTP/1.1" 200 1591 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36" - 2b9d0a154f394576ba8d947bd5ac1019 1ms
There could be an internal difference in the behavior of transaction eviction depending on the presence of transaction fields.
Inferring the behavior of the transaction command through experimentation is useful, but if keepevicted alters the behavior of the command beyond the documented explanation, I don't know if I'd consider it a defect in the code or an oversight in the documentation. Splunk support is likely to argue the command is working as documented whether the value of closed_txn is consistent or not.
If you have the time, you could test across various earlier versions of Splunk looking for a version when keepevicted and closed_txn behaved consistently. In my experience, Splunk support doesn't provide that level of service, particularly for versions of Splunk that are no longer supported. If you have one, your account manager or sales engineer may be able to assist.