We have distinct events for each phase of an incoming API call,
2012-09-07 01:12:59.691 category=api_request api_transaction=303312599 ... 2012-09-07 01:12:59.867 category=api_response api_transaction=303312599 ...
I've been struggling to come up with a way of finding all "
api_request" events for which there is no matching "
api_response" event. The biggest issue I have is that in any specific date/time search range, things at the very start and the very end will not have a matching pair because it falls outside of those search boundaries (i.e. with
| transaction api_transaction). Any ideas would be much appreciated.
[updated with clarification below]
To simplify this, assuming each transaction takes 2 seconds,
2012-09-07 01:02:59 api_request O --- search start 2012-09-07 01:03:00 2012-09-07 01:03:01 api_response O 2012-09-07 01:03:30 api_request P 2012-09-07 01:03:32 api_response P 2012-09-07 01:03:45 api_request Q 2012-09-07 01:03:59 api_request R --- search end 2012-09-07 01:04:00 2012-09-07 01:04:01 api_response R
The search shouldn't return
api_response O and
api_request R as false positives, because their matching
api_response lay outside of the boundaries of the search. What I'm looking for is that
api_request Q for which there is no matching
api_response event within the following X seconds.
You could do the transaction route or add in a dc(category) to whatever else you need in your stats command. In your next pipe just limit your search to where that field = 2 assuming each event will have 2 categories; request/response. This should run pretty quickly as well.
Not sure if this is resolved, but posting my answer
ur basesearch | transaction api_transaction startswith="api_request" endswith="api_response" unifyends=true
this transaction will make sure, it MUST contain both startwith and endswith and ignores the events if either request or response is not present in any any transaction even it is present during the specified time span.
No matter what approach you use, you have a fundamental issue that is not easy to resolve. When you are trying to match up a "begin event" and an "end event", one or the other could be outside of your search range. And Splunk doesn't "see" what is outside of your search range.
One option is to make your search range longer than the longest interval you can possibly have. Possibly as much as double. If you don't know how long your longest thing can be, then this can be difficult.
Another option might be to maintain state via a lookup. If your
api_transaction is unique over a long term, then this is viable. See the Splunk blog post http://blogs.splunk.com/2011/01/11/maintaining-state-of-the-union/
Hm, I'm not sure that I follow you. If you just want to find them (and have sufficiently unique api_transaction IDs);
your_sourcetype category=api_request OR category=api_response | stats c values(category) AS category by api_transaction | search c=1 api_request
EDIT: there was an error before. Added
values(category) to the stats function so that it can be filtered on further down the line.
Hope this helps,
Hm, the updated query shows false positives, for example it includes one api_response (api_transaction=303287874) where the api_request was actually logged in the minute prior to the start of the search range (limited to 01:03 minute).
category=api_request OR category=api_response | stats c values(category) AS category by api_transaction | search c=1 api_request
696 matching events, 6 results (2 api_response and 4 api_request)
Yes, that is precisely why. Group the events by the ID, discard the ones that have request AND response, and then just output the requests without responses.
Feel free to mark the question a answered a/o upvote if this solved your problem
Thanks for this, I'm getting somewhere, I think.
Searching within one minute's span (2012-09-07 01:03),
category=api_request OR category=api_response | stats c by api_transaction | search c=1 category=api_request
696 matching events, 0 results
category=api_request OR category=api_response | stats c by api_transaction | search c=1
696 matching events, 5 results
I guess the trick is that category qualifer in your example weeds out api_response entries at the beginning of 01:03 for which api_request entries came in earlier at 01:02?