How to identify if a transaction is stalled

GersonGarcia
Path Finder

All,

I have this search:

index=main sourcetype=app | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

It returns a nice list of transactions, one for each job execution. But I need a way to identify whether a job is stalled, meaning there are no events in the transaction for more than 30 seconds.

I came up with:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete"

That gives me a multivalue field with the timestamp of each event in the transaction:

event_time
1510704240.372
1510704240.374
1510704240.380
1510704240.394
1510704240.395
1510704242.581
1510704242.582
1510704242.583
1510704242.648
1510704242.664
1510704242.681
1510704242.682
1510704242.699
1510704243.378

Or this one:

index=main sourcetype=app | delta _time as event_delta_time | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

But the values are negative, and I don't get one value per line.

event_delta_time
-0.001  
-0.002  
-0.003  
-0.005  
-0.008  
0.000

I don't know how to manipulate these fields to keep only the transactions where the time between events is greater than a threshold.

Thank you very much,

Gerson

cmerriman
Super Champion

have you tried to use the maxpause=30s argument of transaction?
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Transaction

GersonGarcia
Path Finder

@cmerriman

Yes, I tried it, but I am confused by the results:

maxpause=30s:

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=30s | search taskName=highFrequencyLogRead

Returns: 38 events (11/28/17 12:40:48.000 PM to 11/28/17 1:40:48.000 PM)

maxpause=1s

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=1s | search taskName=highFrequencyLogRead

Results: 19 events (11/28/17 12:42:24.000 PM to 11/28/17 1:42:24.000 PM)

maxpause=5s

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=5s | search taskName=highFrequencyLogRead

Results in: 36 events (11/28/17 12:44:22.000 PM to 11/28/17 1:44:22.000 PM)

I need the opposite: return only the transactions where the pause between events is greater than 30s.
What am I doing wrong?

Thank you.

cmerriman
Super Champion

oh, sorry, i was thinking less than, not greater than. try this:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" | search taskName=highFrequencyLogRead | streamstats count as txn_id | mvexpand event_time | streamstats current=f window=1 values(event_time) as prev_event_time by txn_id | eval threshold=event_time-prev_event_time | search threshold>30
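
A side note on the negative event_delta_time values in the question: Splunk returns events newest first, so delta ends up subtracting a newer timestamp from an older one, and it runs across all events rather than per job. If the transaction grouping itself is not needed, a possible alternative is to compute the gaps on the raw events with streamstats. This is only a sketch, untested against this data. It assumes each jobId/jobExecId pair is a single run, so it ignores the startswith/endswith boundaries, and prev_time, gap and max_gap are illustrative field names, not anything from the original searches.

```spl
index=main sourcetype=app
| sort 0 _time
| streamstats current=f last(_time) as prev_time by jobId jobExecId
| eval gap=_time-prev_time
| stats max(gap) as max_gap values(taskName) as taskName by jobId jobExecId
| where max_gap>30
| search taskName=highFrequencyLogRead
```

Here sort 0 _time puts the events in ascending time order so each gap comes out positive, and streamstats with current=f carries the previous event's _time forward within the same job. The result should be one row per job execution whose largest gap between consecutive events is more than 30 seconds.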
GersonGarcia
Path Finder

It worked like a charm. Thank you!!!
