Splunk Search

How to identify if a transaction is stalled

GersonGarcia
Path Finder

All,

I have this search:

index=main sourcetype=app | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

It returns a nice line of transactions for each job execution. But I need a way to identify whether a job is stalled, meaning there are no events in the transaction for more than 30 seconds.

I came up with:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete"

That gives me a multivalue field with the timestamp of each event in the transaction:

event_time
1510704240.372
1510704240.374  
1510704240.380  
1510704240.394  
1510704240.395  
1510704242.581  
1510704242.582  
1510704242.583  
1510704242.648  
1510704242.664  
1510704242.681  
1510704242.682  
1510704242.699  
1510704243.378

Or this one:

index=main sourcetype=app | delta _time as event_delta_time | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

But the results are negative, and there is not one value per line.

event_delta_time
-0.001  
-0.002  
-0.003  
-0.005  
-0.008  
0.000

I don't know how to manipulate these fields to keep only the transactions where the time between events is greater than a threshold.

Thank you very much,

Gerson

cmerriman
Super Champion

have you tried to use the maxpause=30s argument of transaction?
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Transaction


GersonGarcia
Path Finder

@cmerriman

Yes, I tried it, but I am confused by the results:

maxpause=30s:

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=30s | search taskName=highFrequencyLogRead

Returns: 38 events (11/28/17 12:40:48.000 PM to 11/28/17 1:40:48.000 PM)

maxpause=1s:

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=1s | search taskName=highFrequencyLogRead

Returns: 19 events (11/28/17 12:42:24.000 PM to 11/28/17 1:42:24.000 PM)

maxpause=5s:

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=5s | search taskName=highFrequencyLogRead

Returns: 36 events (11/28/17 12:44:22.000 PM to 11/28/17 1:44:22.000 PM)

I need the opposite: return only the transactions where the pause between entries is greater than 30s.
What am I doing wrong?

Thank you.


cmerriman
Super Champion

oh, sorry, i was thinking less than, not greater than. try this:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" | search taskName=highFrequencyLogRead | streamstats count as txn_id | mvexpand event_time | streamstats current=f window=1 values(event_time) as prev_event_time by txn_id | eval threshold=event_time-prev_event_time | search threshold>30
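
For anyone who wants to see what the mvexpand/streamstats part of that search is doing, here is a rough Python sketch of the same idea, outside of Splunk: take the event timestamps of one transaction, compare each one with the previous one, and keep the gaps above the limit. The helper name find_stalls and its parameters are made up for this illustration; the sample values are the event_time list posted in the question.

```python
from typing import List, Tuple


def find_stalls(event_times: List[float],
                threshold: float = 30.0) -> List[Tuple[float, float, float]]:
    """Return (previous_time, current_time, gap) for every gap above threshold.

    Hypothetical helper: mirrors the per-transaction "event_time minus
    prev_event_time" step of the search, not any Splunk API.
    """
    ordered = sorted(event_times)  # oldest first, so gaps come out positive
    stalls = []
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr - prev
        if gap > threshold:
            stalls.append((prev, curr, gap))
    return stalls


# event_time values of one transaction, copied from the question.
sample = [
    1510704240.372, 1510704240.374, 1510704240.380, 1510704240.394,
    1510704240.395, 1510704242.581, 1510704242.582, 1510704242.583,
    1510704242.648, 1510704242.664, 1510704242.681, 1510704242.682,
    1510704242.699, 1510704243.378,
]

# With the 30 s limit this transaction has no stall.
print(find_stalls(sample))

# With a 2 s limit, the only gap reported is the one of roughly 2.19 s
# between 1510704240.395 and 1510704242.581.
print(find_stalls(sample, threshold=2.0))
```

Note that in the search the field named threshold holds the computed gap between two consecutive events, and the actual limit is the 30 in the final search clause.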

GersonGarcia
Path Finder

It worked like a charm. Thank you!!!
