I have a sourcetype that represents transactions. On the sourcetype are three fields of importance to this question: an id, a message type msg (representing a request or a reversal), and an amount. A request and its reversal share the same id. The reversal amount is always 0, which means that to find the original amount I have to run sourcetype=transactions msg=req id=$id$.
I can do this with a self join, which works fine for the last 15 minutes or sometimes up to an hour, but when the number of subsearch results goes over 50,000, it bombs out.
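For reference, a minimal sketch of that self join (field names as described above; the rename just keeps the reversal's zero amount separate from the original):

sourcetype=transactions msg=reversal
| join id [search sourcetype=transactions msg=request | fields id amount | rename amount as original_amount]
| table id msg original_amount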
The search I want is:
sourcetype=transactions msg=reversal
| eval original_amount=magic_function("search sourcetype=transactions msg=request id=$id_from_main_search$ | return amount")
| table msg id original_amount
The Splunk way to do this is to collect all the events in one pass and then sort it out in later pipes with eval/stats and friends.
sourcetype=transactions | stats values(msg) as msg list(amount) as amounts max(amount) as max_amount by id | search msg="reversal"
In that first stats command, the msg and amounts fields coming out of it are both multivalue fields, and each row coming out of stats is a rolled-up record of everything for that given id value. So when I do search msg="reversal", it matches the id rows that had at least one reversal, but those rows will still carry the multivalue fields with all the values for that id.
And here I've made the perhaps naive assumption that for each id, the largest amount found is the original amount. If that's not clever enough, you can certainly get what you need with eval and other commands, as in the sketch below.
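For instance, if the largest-amount assumption doesn't hold, a sketch along these lines (field values assumed from the question) rolls up only the request amounts:

sourcetype=transactions
| eval req_amount=if(msg=="request", amount, null())
| stats values(msg) as msg max(req_amount) as original_amount by id
| search msg="reversal"
| table id original_amount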
There are lots and lots of ways to do this particular use case. join and selfjoin and append are probably the worst. =/
Here is the main docs page for all the stats and chart functions.
http://docs.splunk.com/Documentation/Splunk/6.3.0/SearchReference/CommonStatsFunctions
This page gives a good overview and even a flow chart about how and when to use different grouping functions in Splunk - http://docs.splunk.com/Documentation/Splunk/6.2.1/Search/Abouteventcorrelation
Note: here, because of subsearch limits, we went a more brute-force way. But for pretty much all cases where you know the "inner" result will stay under 10,000 rows, and where the "inner" set (here, just the reversal events) is much, much smaller than the "outer" set (here, all transaction events), you should use a subsearch like you first tried, because that way pulls fewer events off disk in total and is thus faster.
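For completeness, a subsearch version of this use case might look like the following sketch (it assumes the reversal id list stays under the subsearch result limit):

sourcetype=transactions msg=request [search sourcetype=transactions msg=reversal | dedup id | fields id]
| table id amount

The subsearch returns the distinct reversal ids, which get expanded into an (id=... OR id=...) filter on the outer search, so only the matching request events ever come off disk.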
You need the map command:
base search that generates id field | map search="search sourcetype=transactions msg=req id=$id$"
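For example, something along these lines should work (raising maxsearches from its default of 10 is an assumption about your volume; adjust to taste):

sourcetype=transactions msg=reversal
| map maxsearches=1000 search="search sourcetype=transactions msg=request id=$id$ | eval original_amount=amount | table id original_amount"

Be aware that map runs one search per row of the base search, so it can be very slow at scale; the stats approach above is usually cheaper.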