Transaction two logs, alert if orphaned for 30 mins

kphillipson
Path Finder

Hello,
I've been trying to figure this one out, and I believe I have made it more complicated than it needs to be. I have to monitor two different log files. Each log contains a unique document ID that I can run a transaction on. Think of one log as the start of the event and the other as the finish. The main goal is to be alerted when a document ID (say, foo) shows up in one log but not the other and has been orphaned for more than 30 minutes.

Here is the search string I've been working with but I've hit a plateau in trying to figure it out:

index=test
| rex "(Document\sID:\s|capture\.ID\\-\>)(?<DocID>[^ ]+})(\s|$)"
| search DocID=*
| transaction DocID keeporphans=1 unifyends=t maxspan=1s maxpause=30m
| eval Closed=closed_txn
| eval Orphan=_txn_orphan
| fillnull Orphan
| eval Raw=_raw
| table _time Orphan Closed duration

Thank you,
Kyle

Ayn
Legend

If you know that all successful transactions contain 2 events, you could simply search for transactions that do not have this eventcount:

index=test
| rex "(Document\sID:\s|capture\.ID\\-\>)(?<DocID>[^ ]+})(\s|$)"
| search DocID=*
| transaction DocID maxpause=30m
| search NOT eventcount=2

You should consider moving your field extraction into conf files. That way you won't have to keep it inline, and, more importantly, the DocID field will be available right from the start of the search. The first search clause can then filter on it directly, instead of grabbing ALL events from index test for the chosen timeframe and sending them down the rest of the search pipeline, as your search does now.
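
As a rough sketch of what that could look like once the extraction lives in conf files: the search below assumes DocID is already available as a search-time field (that setup is not shown in this thread and is an assumption here), so the filter moves into the first clause and the inline rex disappears.

```spl
index=test DocID=*
| transaction DocID maxpause=30m
| search NOT eventcount=2
```

With DocID=* in the base search, only events that actually carry a document ID are retrieved before transaction runs.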

bbingham
Builder

I'd probably skip transaction altogether if you can guarantee the ID is the same in both logs and you don't need any additional fields. I'd do something like:

index=test
| rex "(Document\sID:\s|capture\.ID\\-\>)(?<DocID>[^ ]+})(\s|$)"
| search DocID=*
| stats min(_time) AS EventStartTime list(_time) AS time range(_time) AS TimeDiff count AS Count by DocID
| addinfo
| eval ShouldAlert=if((TimeDiff>=1800 AND Count>=2) OR (info_max_time-EventStartTime>1800 AND Count=1), "True", "False")

Then a simple alert would check whether ShouldAlert=True.
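
One possible way to wire that check into the search itself is to end it with a where clause, so that only the rows that should alert come back and the alert can fire on "number of results greater than 0". This is a sketch, not a tested alert; the rex is copied verbatim from the thread and the final where line is the only addition.

```spl
index=test
| rex "(Document\sID:\s|capture\.ID\\-\>)(?<DocID>[^ ]+})(\s|$)"
| search DocID=*
| stats min(_time) AS EventStartTime range(_time) AS TimeDiff count AS Count by DocID
| addinfo
| eval ShouldAlert=if((TimeDiff>=1800 AND Count>=2) OR (info_max_time-EventStartTime>1800 AND Count=1), "True", "False")
| where ShouldAlert="True"
```

Note that info_max_time comes from addinfo and reflects the end of the search time range, so this assumes the search runs over a bounded window rather than all time.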

kphillipson
Path Finder

Yes, the ID is always going to be the same.
Good point about running the command over all time. I'd like to see if there is a way to run a search every 10 minutes over the past hour or something...

I'll run this tomorrow and see how it works. Thank you bbingham!

bbingham
Builder

Editing my answer to make the case a little clearer.

bbingham
Builder

I'm actually thinking about this a bit further... Both Ayn's response and mine have a similar problem: if your window isn't large enough, or if you only catch the end of a transaction in the time window, you'll run into an issue. It basically forces you to run the command over all time, which really isn't a valid option.

kphillipson
Path Finder

Ayn, using this search seems to work best without having to do an "all time" search. See, I knew I was making it too complicated 🙂

kphillipson
Path Finder

Good point on the field extractions. I have moved them to a conf file. I'll work on the two options tomorrow and see which one works best for this issue. Thank you Ayn!

kphillipson
Path Finder

Yes, the DocID is identical, and there will be 2 events for the field if it shows up in both logs.

There isn't a "start" or "finish" in plain text; however, the text before the unique ID differs between the logs. The start has capture.ID and the finish has Document ID.
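
Since the two logs can be told apart by that prefix, one way to sidestep the "only caught the end of the transaction" problem raised in this thread might be to classify each event as a start or a finish and alert only on IDs that have a start but no finish. The sketch below is untested and rests on several assumptions: the literal strings capture.ID and Document ID appear in _raw as described, the field names is_start, is_finish, Starts, Finishes and FirstSeen are made up for illustration, and the one-hour window is just the figure mentioned in this thread. The rex is copied verbatim from the original search.

```spl
index=test earliest=-1h latest=now
| rex "(Document\sID:\s|capture\.ID\\-\>)(?<DocID>[^ ]+})(\s|$)"
| search DocID=*
| eval is_start=if(like(_raw, "%capture.ID%"), 1, 0)
| eval is_finish=if(like(_raw, "%Document ID%"), 1, 0)
| stats sum(is_start) AS Starts sum(is_finish) AS Finishes min(_time) AS FirstSeen by DocID
| where Starts>0 AND Finishes=0 AND now()-FirstSeen>1800
```

An ID whose finish falls inside the window but whose start does not is ignored, because Starts is 0. Scheduled every 10 minutes (for example with the cron expression */10 * * * *), the same orphan would likely be reported on more than one run, so some throttling on DocID may be worth considering.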

Ayn
Legend

Do all successful transactions contain the same number of events? Perhaps 2 if successful (start + end)?
