Splunk Search

How to modify query if status is offline for 10 mins?

sekhar463
Path Finder

Hi all,

I am using the search below to find any bots whose status is Offline.

I want to create an alert if a status stays Offline for more than 10 minutes.

How can I modify this search to return any bot whose status has been Offline for more than 10 minutes?

I am using DB Connect to pull the data every 5 minutes (the default interval), so the data in Splunk is updated every 5 minutes.

Below are the search and the data for the last 5 minutes:

index=Testindex sourcetype="Bueprism" source=Botstatus
| table BOT_Name lastupdated BOT_Status _time
| search BOT_Status = Offline

 

BOT_Name              lastupdated              BOT_Status
HOUVMITBPRSMX20:8001  2023-08-23 05:14:12.503  Offline
HOUVMITBPRSMX14:8001  2023-08-23 08:20:11.77   Offline
HOUVMITBPRSMX13:8001  2023-08-23 08:20:12.693  Offline
0 Karma

gcusello
SplunkTrust

Hi @sekhar463,

If a bot cannot be Online between two Offline events, you could try something like this:

index=Testindex sourcetype="Bueprism" source=Botstatus BOT_Status="Offline"
| stats 
   earliest(lastupdated) AS earliest
   latest(lastupdated) AS latest
   latest(_time) AS _time
   BY BOT_Name 
| eval latest=if(isnull(latest),_time,latest)
| where latest-earliest>300

If instead a bot can be Online between two Offline events, you could try:

index=Testindex sourcetype="Bueprism" source=Botstatus
| stats 
   earliest(eval(if(BOT_Status="Offline",lastupdated,""))) AS earliest_offline
   latest(eval(if(BOT_Status="Offline",lastupdated,""))) AS latest_offline 
   values(eval(if(BOT_Status="Online",lastupdated,""))) AS lastupdated_online
   latest(_time) AS _time
   BY BOT_Name 
| eval latest_offline =if(isnull(latest_offline),_time,latest_offline)
| mvexpand lastupdated_online
| where latest_offline-earliest_offline>300 AND NOT (lastupdated_online>earliest_offline AND lastupdated_online<latest_offline)

I'm sure about the first solution; the second one should be tested and adapted if necessary.

Ciao.

Giuseppe

0 Karma

ITWhisperer
SplunkTrust

Set your search timeframe to the previous 15 minutes.

index=Testindex sourcetype="Bueprism" source=Botstatus
| stats latest(BOT_Status) as BOT_Status latest(lastupdated) as lastupdated by BOT_Name
| where BOT_Status="Offline" AND strptime(lastupdated,"%Y-%m-%d %H:%M:%S") < relative_time(now(), "-10m")
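
To see what this search is checking, here is a minimal Python sketch of the same logic, run outside Splunk: keep the latest event per BOT_Name, then flag bots whose latest status is Offline and whose lastupdated is more than 10 minutes old. The three lastupdated values come from the sample rows in the question; the `_time` values and `now` are made up for illustration, and the sketch assumes lastupdated stops advancing while a bot stays Offline.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    # lastupdated has 2 or 3 fractional digits in the samples; %f accepts 1 to 6.
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def offline_longer_than(events, now, threshold=timedelta(minutes=10)):
    """Return bots whose latest event is Offline with lastupdated older than threshold."""
    latest = {}
    for event in events:
        bot = event["BOT_Name"]
        # Mirrors: stats latest(BOT_Status) latest(lastupdated) by BOT_Name
        if bot not in latest or event["_time"] > latest[bot]["_time"]:
            latest[bot] = event
    return sorted(
        bot
        for bot, event in latest.items()
        if event["BOT_Status"] == "Offline"
        and parse_ts(event["lastupdated"]) < now - threshold
    )


# lastupdated values are from the question; _time and now are assumed.
poll = datetime(2023, 8, 23, 8, 25, 1)
events = [
    {"_time": poll, "BOT_Name": "HOUVMITBPRSMX20:8001",
     "lastupdated": "2023-08-23 05:14:12.503", "BOT_Status": "Offline"},
    {"_time": poll, "BOT_Name": "HOUVMITBPRSMX14:8001",
     "lastupdated": "2023-08-23 08:20:11.77", "BOT_Status": "Offline"},
    {"_time": poll, "BOT_Name": "HOUVMITBPRSMX13:8001",
     "lastupdated": "2023-08-23 08:20:12.693", "BOT_Status": "Offline"},
]
now = datetime(2023, 8, 23, 8, 26, 0)
flagged = offline_longer_than(events, now)
print(flagged)
```

With these inputs only HOUVMITBPRSMX20:8001 is flagged, because the other two bots went Offline less than 10 minutes before the assumed `now`.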
0 Karma

sekhar463
Path Finder

Hi, what cron interval should I use for the alert schedule to get the expected results?

0 Karma

ITWhisperer
SplunkTrust

Given that your data changes every 5 minutes, why not schedule the cron for every 5 minutes, but offset it so that it runs just after the new data has been loaded?

0 Karma

sekhar463
Path Finder

BUT YOU SAID TO RUN THIS SEARCH EVERY 15 MINS TO FIND BOTS OFFLINE FOR MORE THAN 10 MINS.
SO FOR SCHEDULING THE ALERT, WHAT TIME RANGE SHOULD I GIVE TO CHECK THE DATA, AND WHAT CRON SCHEDULE?

index=Testindex sourcetype="Bueprism" source=Botstatus
| stats latest(BOT_Status) as BOT_Status latest(lastupdated) as lastupdated by BOT_Name
| where BOT_Status="Offline" AND strptime(lastupdated,"%Y-%m-%d %H:%M:%S") < relative_time(now(), "-10m")

 

0 Karma

ITWhisperer
SplunkTrust

No, I said the timeframe for the search is the previous 15 minutes, i.e. how far back to look for events. The schedule is how often it runs, which could be every 5 minutes since that's how often the data changes. It is up to you to decide how often it runs, as this determines how quickly you detect the offline time.

For example, if your search runs at 5 past the hour and looks back 15 minutes, you will be looking at events from 10 minutes before the hour through to 5 minutes past the hour. This should give you enough events to be able to detect if the host was down for 10 minutes (during that 15 minute period).

It is entirely up to you to choose how often your alert runs and what time period it searches over.
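
The difference between the schedule and the search time range can be checked with a little arithmetic. The Python sketch below assumes the alert runs at 13:05:00 with a 15-minute lookback and that new events are indexed about one second after each 5-minute mark (an assumption, loosely based on the event times posted later in this thread). It also treats the window as including its start and excluding its end.

```python
from datetime import datetime, timedelta


def search_window(run_time, lookback=timedelta(minutes=15)):
    """Return (earliest, latest) for a search that looks back `lookback` from run_time."""
    return run_time - lookback, run_time


def events_in_window(event_times, earliest, latest):
    # Assumed semantics: earliest is inclusive, latest is exclusive.
    return [t for t in event_times if earliest <= t < latest]


# Hypothetical run time: 5 minutes past the hour.
run_time = datetime(2023, 8, 24, 13, 5, 0)
earliest, latest = search_window(run_time)

# Hypothetical poll times: one second after every 5-minute mark from 12:40.
first_poll = datetime(2023, 8, 24, 12, 40, 1)
polls = [first_poll + i * timedelta(minutes=5) for i in range(8)]

seen = events_in_window(polls, earliest, latest)
print(earliest.time(), latest.time(), [t.strftime("%H:%M:%S") for t in seen])
```

Under these assumptions the window runs from 12:50 to 13:05 and holds three polls per bot. The schedule only decides how often that 15-minute window is re-evaluated.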

(I refrained from using caps in my response as I think it is clear enough!)

0 Karma

sekhar463
Path Finder

Hi, thanks for your reply.

I have tested the alert using this search, but the data is updated in Splunk from the source every 5 minutes.

The status was Offline when lastupdated="2023-08-24 12:51:49.62",

and the status changed to Idle at lastupdated="2023-08-24 13:00:01.637".

How can I modify the search to detect a status that was Offline for 2 polls in a row?

For example, if the bot was Offline at the data collection with event time 8/24/23 6:25:00.873 PM and was still Offline at the next interval (2023-08-24 08:00:02.202), I need to get that result.

Below are the events for one bot from my test:

8/24/23 6:45:00.920 PM
2023-08-24 08:15:00.920, BOT_Name="HOUVMITBPRSMX21:8001", lastupdated="2023-08-24 13:14:57.803", BOT_Status="Working"
host = TEST source = BP_Botstatus_Newquery sourcetype = testsoucetype

8/24/23 6:40:01.652 PM
2023-08-24 08:10:01.652, BOT_Name="HOUVMITBPRSMX21:8001", lastupdated="2023-08-24 13:10:00.85", BOT_Status="Working"
host = TEST source = testcource sourcetype = testsoucetype

8/24/23 6:35:00.968 PM
2023-08-24 08:05:00.968, BOT_Name="HOUVMITBPRSMX21:8001", lastupdated="2023-08-24 13:04:59.833", BOT_Status="Working"
host = TEST source = BP_Botstatus_Newquery sourcetype = testsoucetype

8/24/23 6:30:02.202 PM
2023-08-24 08:00:02.202, BOT_Name="HOUVMITBPRSMX21:8001", lastupdated="2023-08-24 13:00:01.637", BOT_Status="Idle"
host = TEST source = testcource sourcetype = testsoucetype

8/24/23 6:25:00.873 PM
2023-08-24 07:55:00.873, BOT_Name="HOUVMITBPRSMX21:8001", lastupdated="2023-08-24 12:51:49.62", BOT_Status="Offline"
host = TEST source = testcource sourcetype = testsoucetype

0 Karma

ITWhisperer
SplunkTrust
| eval lastoffline=if(BOT_Status="Offline",lastupdated,null())
| stats latest(BOT_Status) as BOT_Status latest(lastupdated) as lastupdated latest(lastoffline) as lastoffline count(lastoffline) as offlinecount by BOT_Name
0 Karma

sekhar463
Path Finder

Hi, it gives the offline count, but how does that help detect a BOT whose status stays Offline for 10 minutes or more?

These are the search results, showing only the count of Offline events.

BOT_Name    BOT_Status  lastupdated              lastoffline              offlinecount
BOT1:8001   Idle        2023-08-24 15:30:00.85   2023-08-24 11:00:13.787  3
BOT2:8001   Idle        2023-08-24 15:29:56.897  2023-08-24 13:20:12.517  3
BOT3:8001   Idle        2023-08-24 15:29:58.693  2023-08-24 09:02:12.413  4
BOT4:8001   Idle        2023-08-24 15:30:00.537  2023-08-24 08:20:13.363  2
BOT24:8001  Idle        2023-08-24 15:29:59.443  2023-08-24 08:20:11.957  1
BOT15:8001  Idle        2023-08-24 15:29:57.43   2023-08-24 06:22:22.057  5

 

 

 

0 Karma

ITWhisperer
SplunkTrust

If you are looking back 15 minutes and the status is updated every 5 minutes, then there should only be 3 events per BOT. So if the count is 2 or more, the BOT has been offline for at least two of those events. If the middle one is not offline, then it has been offline on two different occasions; otherwise, it has been offline for at least 2 consecutive events.
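
That reasoning can be written down as a small Python sketch. It assumes exactly three events per BOT in the window, ordered oldest first; the function name and the returned strings are illustrative only.

```python
def classify(statuses):
    """Classify the three statuses seen for one BOT in a 15-minute window (oldest first)."""
    if len(statuses) != 3:
        raise ValueError("this sketch assumes exactly 3 events per BOT")
    offline = [status == "Offline" for status in statuses]
    count = sum(offline)  # same role as offlinecount in the search
    if count < 2:
        return "offline at most once"
    if not offline[1]:
        # First and last are Offline, the middle one is not.
        return "offline on two separate occasions"
    return "offline for at least 2 consecutive events"


examples = {
    "consecutive": classify(["Working", "Offline", "Offline"]),
    "separate": classify(["Offline", "Idle", "Offline"]),
    "single": classify(["Offline", "Idle", "Working"]),
}
print(examples)
```

If the search looks back further than 15 minutes, there are more than three events per BOT and the count alone no longer says whether the Offline events were consecutive.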

0 Karma

sekhar463
Path Finder

Hi, thanks.

Below are the events for one bot, based on the search.
It went Offline at 9:00 (lastupdated). If it is still Offline at the next run interval, it would be helpful to trigger an alert.


8/25/23 2:50:01.811 PM
2023-08-25 04:20:01.811, BOT_Name="HOUVMITBPRSMX10:8001", lastupdated="2023-08-25 09:19:59.597", BOT_Status="Working"
host = TEST source = testcource sourcetype = testsoucetype linecount = 1 punct = --_::.,_=":",_="--_::.",_="" source = BP_Botstatus_Newquery sourcetype = db:blueprism_bot status splunk_server = idx-i-0ad81b0fe967c9831.invesco.splunkcloud.com

8/25/23 2:45:01.003 PM
2023-08-25 04:15:01.003, BOT_Name="HOUVMITBPRSMX10:8001", lastupdated="2023-08-25 09:14:59.183", BOT_Status="Working"
host = TEST source = testcource sourcetype = testsoucetype linecount = 1 punct = --_::.,_=":",_="--_::.",_="" source = BP_Botstatus_Newquery sourcetype = db:blueprism_bot status splunk_server = idx-i-06bd71cc08ca3fde6.invesco.splunkcloud.com

8/25/23 2:40:02.103 PM
2023-08-25 04:10:02.103, BOT_Name="HOUVMITBPRSMX10:8001", lastupdated="2023-08-25 09:09:59.897", BOT_Status="Idle"
host = TEST source = testcource sourcetype = testsoucetype linecount = 1 punct = --_::.,_=":",_="--_::.",_="" source = BP_Botstatus_Newquery sourcetype = db:blueprism_bot status splunk_server = idx-i-0b3fd3ab5272edbd5.invesco.splunkcloud.com

8/25/23 2:35:00.976 PM
2023-08-25 04:05:00.976, BOT_Name="HOUVMITBPRSMX10:8001", lastupdated="2023-08-25 09:00:13.993", BOT_Status="Offline"
host = TEST source = testcource sourcetype = testsoucetype linecount = 1 punct = --_::.,_=":",_="--_::.",_="" source = BP_Botstatus_Newquery



BOT_Name              BOT_Status  lastupdated              lastoffline              offlinecount
HOUVMITBPRSMX10:8001  Working     2023-08-25 09:14:59.183  2023-08-25 09:00:13.993  1




0 Karma

ITWhisperer
SplunkTrust

So, you are looking back more than 15 minutes and you want to ignore idle status, i.e. you want the difference between when the status is "Working" and "Offline"?

| eval lastupdated=strptime(lastupdated,"%F %T.%3N")
| eval offlinetime=if(BOT_Status="Offline",lastupdated,null())
| eval workingtime=if(BOT_Status="Working",lastupdated,null())
| streamstats last(workingtime) as nextworkingtime
| where nextworkingtime - lastupdated > 600
0 Karma

sekhar463
Path Finder

No. As per the events I have, I want any BOT that was offline for 2 consecutive polls. The data is updated every 5 minutes, so for example, WHENEVER THE BOT IS OFFLINE AT THE FIRST RUN AND THE SAME BOT IS STILL OFFLINE THE NEXT TIME THE DATA IS UPDATED, THEN I WANT TO GET THOSE.

 

0 Karma

ITWhisperer
SplunkTrust

There is no need for caps! Your sample data does not show this situation. There is only one event with an offline status. Good luck.

0 Karma

sekhar463
Path Finder

How can I get the bots whose status has been Offline for more than 10 minutes?

The status might change at the next update during data collection, so I only need results for bots that are still Offline for 2 consecutive intervals.
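
The condition being asked for here, a BOT that is Offline in two consecutive polls, can be sketched in Python as follows. The HOUVMITBPRSMX21:8001 statuses are taken from the test events posted earlier in this thread; EXAMPLEBOT:8001 and its events are hypothetical, added only to show a positive match. The check looks at the two most recent polls per BOT.

```python
from collections import defaultdict


def offline_two_polls_running(events):
    """Return BOTs whose two most recent polls are both Offline.

    events: (poll_time, bot_name, status) tuples in any order; poll_time is a
    fixed-width timestamp string, so plain string sorting is chronological.
    """
    polls_by_bot = defaultdict(list)
    for poll_time, bot, status in events:
        polls_by_bot[bot].append((poll_time, status))

    flagged = []
    for bot, polls in polls_by_bot.items():
        polls.sort()
        last_two = [status for _, status in polls[-2:]]
        if last_two == ["Offline", "Offline"]:
            flagged.append(bot)
    return sorted(flagged)


events = [
    # From the thread: Offline once, then Idle and Working.
    ("2023-08-24 07:55:00.873", "HOUVMITBPRSMX21:8001", "Offline"),
    ("2023-08-24 08:00:02.202", "HOUVMITBPRSMX21:8001", "Idle"),
    ("2023-08-24 08:05:00.968", "HOUVMITBPRSMX21:8001", "Working"),
    ("2023-08-24 08:10:01.652", "HOUVMITBPRSMX21:8001", "Working"),
    ("2023-08-24 08:15:00.920", "HOUVMITBPRSMX21:8001", "Working"),
    # Hypothetical BOT that stays Offline across two polls.
    ("2023-08-24 08:10:01.652", "EXAMPLEBOT:8001", "Offline"),
    ("2023-08-24 08:15:00.920", "EXAMPLEBOT:8001", "Offline"),
]
flagged = offline_two_polls_running(events)
print(flagged)
```

HOUVMITBPRSMX21:8001 is not flagged because it was Offline in only one poll, which matches the point made above about the sample data.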

0 Karma

sekhar463
Path Finder

This is working. What cron interval should I use to get alerts as expected?

Can I run it every 15 minutes?

Also, @gcusello provided the query below, but it is not returning any results and I am not sure why:

 

index=Testindex sourcetype="Bueprism" source=Botstatus BOT_Status="Offline"
| stats 
   earliest(lastupdated) AS earliest
   latest(lastupdated) AS latest
   latest(_time) AS _time
   BY BOT_Name 
| eval latest=if(isnull(latest),_time,latest)
| where latest-earliest>300

 

 

0 Karma

gcusello
SplunkTrust

Hi @sekhar463,

Good for you, see you next time!

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

0 Karma