Splunk Search

How to calculate concurrent transactions grouped with a particular field

Krishna_R
Path Finder

Hi,

We have a centralized log from an application which reports activities on multiple hosts in a single log file.

Simplified, the log looks like below:

<time-stamp> : <host-name> : Started process pid = ...
<time-stamp> : <host-name> : Process pid = ... completed with status [...]

I would like to list the concurrent processes on each host at any time.

I have the following query to group into transactions, which gives the expected results with the duration.

| transaction hostname pid startswith=...

If I add concurrency with "| concurrency duration=duration", the resulting concurrency field counts concurrent processes across all hosts, not per hostname.

From the docs, I don't see any way to specify grouping field(s) for 'concurrency'. Is there an option for this? Or can I get the expected report through some other mechanism?

Thanks,

Krishna

1 Solution

steveyz
Splunk Employee

If you just want the concurrency number, and not a list of the actual pids active at any one time, you can do the following (don't do transaction first):

... | eval counter = if(searchmatch("Started process"),1,-1) | sort 0 + _time | streamstats sum(counter) as concurrency by hostname 

At that point you will have a 'concurrency' field on each event that represents the number of active pids on that host at the time of the event (or rather, right after the event, since the sum includes the effect of the event itself).

You can then do something like | timechart max(concurrency) by hostname
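To make the running-sum logic concrete, here is a minimal Python sketch of what the eval / sort / streamstats chain computes. It is only an illustration outside of Splunk: the sample events and the function name per_host_concurrency are made up, and the plain substring test is a rough stand-in for searchmatch.

```python
from collections import defaultdict

# Hypothetical sample events (time, hostname, message); not taken from the real log.
events = [
    (3, "hostA", "Process pid = 101 completed with status [0]"),
    (1, "hostA", "Started process pid = 101"),
    (2, "hostA", "Started process pid = 102"),
    (2, "hostB", "Started process pid = 201"),
    (4, "hostB", "Process pid = 201 completed with status [0]"),
]

def per_host_concurrency(events):
    """Mimic: eval counter -> sort by time -> streamstats sum(counter) by hostname."""
    running = defaultdict(int)  # one running total per hostname
    result = []
    # Sort ascending by time (stable, so ties keep their input order).
    for t, host, msg in sorted(events, key=lambda e: e[0]):
        # +1 for a start event, -1 for anything else (i.e. a completion).
        counter = 1 if "Started process" in msg else -1
        running[host] += counter
        # The value already includes the effect of this event itself.
        result.append((t, host, running[host]))
    return result

for row in per_host_concurrency(events):
    print(row)
```

Each output row carries the count for its own host only, which is what the "by hostname" clause buys you over the plain concurrency command.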


bischofk
New Member

I have the exact same problem. Being able to add a "by" clause to concurrency would be ideal. As it stands, this is really messy.


Krishna_R
Path Finder

Hi Steve,

Sorry for the really late response. I found that the query you gave (using streamstats and the by clause) works, but as you mentioned, it is only useful when I don't need the values.

I have two use cases: 1) populating a graph report in the dashboard, and 2) inspecting the results behind that report.

Item #2 is still open, since concurrency does not have a 'by' clause. Currently, the only way is to filter by hostname up front and pipe the result to transaction (which defeats the purpose of a system-level view).

Do you agree that this could be a feature request, or is there some other way I should approach this requirement?
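For reference, the per-host list of active pids that item #2 asks for can be sketched outside of Splunk as below. This is only an illustration of the bookkeeping, not a Splunk feature: the sample events, the function name active_pids_per_host, and the assumption that pids are numeric and appear as "pid = <number>" are all hypothetical, since the real log format is abbreviated above.

```python
import re
from collections import defaultdict

# Assumption: the pid is numeric and written as "pid = <digits>".
PID_RE = re.compile(r"pid = (\d+)")

# Hypothetical sample events (time, hostname, message); not taken from the real log.
events = [
    (1, "hostA", "Started process pid = 101"),
    (2, "hostA", "Started process pid = 102"),
    (2, "hostB", "Started process pid = 201"),
    (3, "hostA", "Process pid = 101 completed with status [0]"),
    (4, "hostB", "Process pid = 201 completed with status [0]"),
]

def active_pids_per_host(events):
    """Return (time, host, sorted active pids on that host) after each event."""
    active = defaultdict(set)  # hostname -> set of currently running pids
    snapshots = []
    for t, host, msg in sorted(events, key=lambda e: e[0]):
        match = PID_RE.search(msg)
        if match is None:
            continue  # skip lines that carry no pid
        pid = match.group(1)
        if "Started process" in msg:
            active[host].add(pid)
        else:
            active[host].discard(pid)  # completion; tolerate unseen pids
        snapshots.append((t, host, sorted(active[host])))
    return snapshots

for row in active_pids_per_host(events):
    print(row)
```

The length of each pid list equals the per-host concurrency number from the streamstats approach, so the same pass covers both the graph and the inspection use case.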


