Correlating streamed sequential events

jsp
Engager

I have a stream of events coming in the format of the example below. The time a session takes from start to finish varies unpredictably.

I want to search for a term in a 'Processing' event and have the search return the entire session. So far I have been running a subsearch for the term to get the session ID, then searching for that session ID and running the transaction command on the result. This is very CPU intensive and slow, and it causes my subsearches to time out, so I can only search a very limited time frame.

I have 2 thoughts on how to solve this issue:
- Correlate at index time - I am not entirely sure how to do this, since the events are streamed in with no predefined start and end, and it makes me very wary of data loss.
- Correlate in a summary index - I could run the transaction command every hour to populate a summary index. However, if a session isn't complete when the search runs, I assume that session would be missing from the summary. If I overlap the time ranges, then that would lead to duplication. I am not sure if there is some way to use the overlap command to help with this?

Event 1: session_1 Start
Event 2: session_1 Processing
Event 3: session_1 Finish
Event 4: session_2 Start
Event 5: session_2 Processing
Event 6: session_2 Processing
Event 7: session_2 Finish

Any help trying to figure this out would be much appreciated.


martin_mueller
SplunkTrust

If you can live with a bit of delay and have an upper bound for the length of a transaction you could do the following, assuming the upper bound is one hour:

  • schedule a search to run every hour over the past two hours
  • compute transactions that start and end within those two hours
  • keep only those transactions that start in the first of the two hours
  • store those results in a summary index

That way you get the best of all worlds: no duplicates, no missed transactions, and a fast summary to search through, all at the cost of a bit of delay. In the worst case, a transaction needs two hours to make it into the summary index.
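
The steps above could be sketched as a scheduled search along the following lines. This is only a rough sketch under stated assumptions, not a tested configuration: the index name `main`, the sourcetype `my_sessions`, the extracted field `session_id`, and the summary index `session_summary` are all hypothetical names, and the summary index has to exist already. It assumes the search is scheduled hourly with a time range of `earliest=-2h@h latest=@h`.

```spl
index=main sourcetype=my_sessions earliest=-2h@h latest=@h
``` assumption: session_id is already extracted from each event ```
| transaction session_id startswith="Start" endswith="Finish" maxspan=1h keepevicted=false
``` addinfo adds info_min_time, the start of the two-hour search window ```
| addinfo
``` a transaction's _time is the time of its earliest event, so this keeps only sessions that start in the first hour ```
| where _time < info_min_time + 3600
| fields - info_*
| collect index=session_summary
```

With `keepevicted=false`, sessions that have no `Finish` event inside the window are dropped from this run. A session that starts in the second hour is discarded by the `where` clause and picked up by the next run, when that hour has become the first hour of the window. Searching for a term would then be a plain search against the summary, for example `index=session_summary "some term"`, without any subsearch.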

