How to identify if a transaction is stalled

GersonGarcia
Path Finder

All,

I have this search:

index=main sourcetype=app | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

It returns one transaction per job execution. But I need a way to identify whether a job is stalled, meaning there are no events in the transaction for more than 30 seconds.

I came up with:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete"

That gives me a multivalue field with the timestamp of each event in the transaction:

event_time
1510704240.372  
1510704240.374  
1510704240.380  
1510704240.394  
1510704240.395  
1510704242.581  
1510704242.582  
1510704242.583  
1510704242.648  
1510704242.664  
1510704242.681  
1510704242.682  
1510704242.699  
1510704243.378

Or this one:

index=main sourcetype=app | delta _time as event_delta_time | transaction jobId jobExecId startswith="Starting IgniteUpdater thread" endswith="Stopping IgniteUpdater thread" | search taskName=*Read

But the result is negative and does not have one value per line.

event_delta_time
-0.001  
-0.002  
-0.003  
-0.005  
-0.008  
0.000

I don't know how to manipulate these fields so that I keep only the transactions where the time between consecutive events is greater than a threshold.

Thank you very much,

Gerson

cmerriman
Super Champion

have you tried to use the maxpause=30s argument of transaction?
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Transaction

GersonGarcia
Path Finder

@cmerriman

Yes, I tried it, but I am confused by the results:

maxpause=30s:

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=30s | search taskName=highFrequencyLogRead

Returns: 38 events (11/28/17 12:40:48.000 PM to 11/28/17 1:40:48.000 PM)

maxpause=1s

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=1s | search taskName=highFrequencyLogRead

Results: 19 events (11/28/17 12:42:24.000 PM to 11/28/17 1:42:24.000 PM)

maxpause=5s

index=main sourcetype=app | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" maxpause=5s | search taskName=highFrequencyLogRead

Results in: 36 events (11/28/17 12:44:22.000 PM to 11/28/17 1:44:22.000 PM)

I need the opposite: return a transaction only when the pause between its events is greater than 30s.
What am I doing wrong?

Thank you.

cmerriman
Super Champion

oh, sorry, i was thinking less than, not greater than. try this:

index=main sourcetype=app | eval event_time=_time | transaction jobId jobExecId startswith="Expanding device group" endswith="Job complete" | search taskName=highFrequencyLogRead | streamstats count as txn_id | mvexpand event_time | streamstats current=f window=1 values(event_time) as prev_event_time by txn_id | eval threshold=event_time-prev_event_time | search threshold>30
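
For anyone who wants to check the logic outside Splunk, here is a minimal Python sketch of what the search above is doing: for each transaction, take the event timestamps, compute the difference between each one and the one before it, and keep only the transactions with a difference above 30 seconds. The function names and the sample data are hypothetical. The "txn-1" values are a subset of the event_time list from the question, and "txn-2" is made up to show a stall.

```python
from typing import Dict, List


def stalled_gaps(event_times: List[float], threshold: float = 30.0) -> List[float]:
    """Return the gaps between consecutive events that exceed `threshold` seconds."""
    ordered = sorted(event_times)  # oldest first, so every gap is non-negative
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return [gap for gap in gaps if gap > threshold]


def stalled_transactions(
    transactions: Dict[str, List[float]], threshold: float = 30.0
) -> Dict[str, List[float]]:
    """Keep only transactions with at least one gap above the threshold."""
    stalled = {}
    for txn_id, times in transactions.items():
        gaps = stalled_gaps(times, threshold)
        if gaps:
            stalled[txn_id] = gaps
    return stalled


# Hypothetical sample data: txn-1 reuses timestamps from the question,
# txn-2 is invented and contains a 45.5 second pause.
transactions = {
    "txn-1": [1510704240.372, 1510704240.395, 1510704242.581, 1510704243.378],
    "txn-2": [1510704300.0, 1510704345.5, 1510704346.0],
}

result = stalled_transactions(transactions)
```

Sorting the timestamps first matters: differences taken in newest-first order come out negative, which is likely what happened with the delta attempt in the question.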

GersonGarcia
Path Finder

It worked like a charm. Thank you!!!
