Alerting

Creating an alert for device offline and recovery status

parthiban
Path Finder

Hi everyone

We have an on-premises edge device in a remote location, and it has been added to the cloud. I would like to monitor it and set up alerts for both the device offline and recovery statuses.

While I can set an alert for the offline status, I'm a bit confused about including the recovery status. Can you please assist me in configuring the alert for both scenarios?


gcusello
SplunkTrust

Hi @parthiban.

If the results of your search contain onlineStatus="online" and/or onlineStatus="offline", you could modify your search in this way:

index="XXXXX" "Genesys system is available"
| spath input=_raw output=new_field path=response_details.response_payload.entities{}
| mvexpand new_field
| fields new_field
| spath input=new_field output=serialNumber path=serialNumber
| spath input=new_field output=onlineStatus path=onlineStatus
| where serialNumber!=""
| lookup Genesys_Monitoring.csv serialNumber
| where Country="Bangladesh"
| stats 
   count(eval(lower(onlineStatus)="offline")) AS offline_count
   count(eval(lower(onlineStatus)="online")) AS online_count
   earliest(eval(if(lower(onlineStatus)="offline",_time,null()))) AS offline_time
   earliest(eval(if(lower(onlineStatus)="online",_time,null()))) AS online_time
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
   offline_count=0 AND online_count>0,"Online",
   offline_count>0 AND online_count=0,"Offline",
   offline_count>0 AND online_count>0 AND online_time>offline_time, "Offline but newly online",
   offline_count>0 AND online_count>0 AND offline_time>online_time, "Offline",
   offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 

In the log, we receive the payload model below. In the 'entities' section I have shown only one device status, but in reality there are 11 device statuses in a single log message. I want to create an alert: if a device goes offline, it should trigger one alert, and when it comes back online, it should trigger a clear-alarm alert. I want only one alert per transition because we receive logs every 2 minutes from AWS, and I want to avoid multiple alerts for the same device going offline and online. I hope my requirement is clear.

response_details: {
  response_payload: {
    entities: {
      id: "YYYYYYY",
      name: "ABC",
      onlineStatus: "ONLINE",
      serialNumber: "XXXXXXX",
    },

gcusello
SplunkTrust

Hi @parthiban,

Please confirm: you want an alert if onlineStatus="recovery" or if, for a defined period, you don't receive logs from a device. Is that correct?

In this case, you can use my second search creating a list of devices to monitor in a lookup.

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello  

Yes, I want an alert for both online status="OFFLINE" and online status="ONLINE" for the same device.

gcusello
SplunkTrust

Hi @parthiban,

OK, but how can the device send a status if it's offline?

If it continues to send logs even when it's offline, you can add this condition to the search; but if, as I suppose, it doesn't send logs when offline, you can use my search.

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 

This is an on-premises device managed by the cloud. If the device goes offline, the cloud sends the log.

Which condition do I need to add?

gcusello
SplunkTrust

Hi @parthiban,

status = "OFFLINE"

please try this:

index=your_index 
| stats count BY device status
| append [ | inputlookup perimeter.csv | eval count=0 | fields device count ]
| stats sum(count) AS total values(status) AS status BY device
| eval status=if(total=0,"down",status)
| search status="recovery" OR status="offline" OR status="down"
| table device status

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 

| rename "response_details.response_payload.entities{}.onlineStatus" as status
| stats count BY status
| append [ | makeresults | eval name=xxxx, count=0 | fields name ]
| stats sum(count) AS total BY status
| eval status=if(total=0,"OFFLINE",status)
| search status="ONLINE" OR status="OFFLINE"
| table status

The result I get is "ONLINE".

How will this work in an alert? How can I set it up in the alert? Can you please guide me?

gcusello
SplunkTrust

Hi @parthiban,

Probably there's a misunderstanding about the condition to check:

I understood that you want to check whether status="recovery" or status=down, and I check for these statuses, but what's your requirement?

With your search you check status=down and status=online; is this the requirement?

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 
Let me clarify.
We receive device status logs every 2 minutes from AWS Cloud. These logs indicate both online and offline statuses. If a device goes offline, we continuously receive offline logs until it comes back online, at which point we receive online logs for that specific device.

My requirement is to trigger a critical alert for the end user when a particular device goes offline. Subsequently, I will notify the end user when the device comes back online. I need to create an alert based on this. Is this possible? I have already shared example logs in this conversation.

Moreover, we already have this type of alert working in another observability application, and we are now migrating to Splunk.

I hope this clarifies my requirement. Please let me know if anything else is required.

gcusello
SplunkTrust

Hi @parthiban ,

Notification when the status is offline isn't a problem. But after the first offline, do you want the alert to keep firing "offline", or do you want a message when it comes back online?

If you want a message every time you have an offline followed by an online, you could try something like this:

<your_search>
| stats 
   count(eval(status="offline")) AS offline_count
   count(eval(status="online")) AS online_count
   earliest(eval(if(status="offline",_time,""))) AS offline
   earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
   offline_count=0 AND online_count>0,"Online",
   offline_count>0 AND online_count=0,"Offline",
   offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online"),   
   offline_count>0 AND online_count>0 AND online>offline, "Offline"),   
   offline_count=0 AND online_count=0, "No data")
| table condition

In this way you can choose the conditions that trigger the alert.
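Because one log message carries the status of several devices, it may help to evaluate the condition per device rather than over all events together. The following is only a sketch under assumptions: it presumes that each event has already been split into one row per device (for example with mvexpand), that a single-valued serialNumber field exists, and that status holds the value of onlineStatus. It uses the latest status seen in the time range instead of comparing timestamps.

```spl
<your_search>
| eval status=lower(status)
| stats
   count(eval(status="offline")) AS offline_count
   count(eval(status="online")) AS online_count
   latest(status) AS last_status
   BY serialNumber
| eval condition=case(
   offline_count>0 AND last_status="online", "Offline but newly online",
   last_status="offline", "Offline",
   last_status="online", "Online",
   true(), "No data")
| table serialNumber condition
```

The lower() call is there because string comparisons inside eval are case-sensitive, and the sample payload in this thread shows the value in upper case. A device that sent no events in the time range produces no row at all, so "No data" here only covers rows with an unexpected or missing status.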

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 

No, I don't want continuous alerts for offline. I want to trigger only on the first offline and the first online message. Thanks for understanding.

gcusello
SplunkTrust

Hi @parthiban ,

You only have to set up the conditions for the alert:

<your_search>
| stats 
   count(eval(status="offline")) AS offline_count
   count(eval(status="online")) AS online_count
   earliest(eval(if(status="offline",_time,""))) AS offline
   earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
   offline_count=0 AND online_count>0,"Online",
   offline_count>0 AND online_count=0,"Offline",
   offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online"),   
   offline_count>0 AND online_count>0 AND online>offline, "Offline"),   
   offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

In this way your alert will trigger on the two conditions.
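If the goal is to fire only on the first offline and the first online event, one possible alternative to the counting approach is to compare each device's status with its previous status and keep only the rows where it changed. This is a sketch, not a tested solution: it assumes one row per device per log message, a single-valued serialNumber field, and a status field holding onlineStatus; the field names prev_status and condition are only illustrative.

```spl
<your_search>
| eval status=lower(status)
| sort 0 serialNumber _time
| streamstats current=f window=1 global=f last(status) AS prev_status BY serialNumber
| where isnotnull(prev_status) AND status!=prev_status
| eval condition=if(status="offline", "Offline", "Offline but newly online")
| table _time serialNumber prev_status status condition
```

The search time range has to span at least two consecutive log messages (logs arrive every 2 minutes in this case), otherwise there is no previous status to compare with. If the scheduled runs overlap in time, the same transition can be returned more than once, so throttling on serialNumber and status in the alert settings is probably needed as well.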

Ciao.

Giuseppe

 


parthiban
Path Finder

Hi @gcusello 

I tried the code you gave, but it is not working; it throws this error:

"Error in 'EvalCommand': Type checking failed. 'AND' only takes boolean arguments"

index="XXXX" 
| rename "response_details.response_payload.entities{}" as status
| where name="YYYY"
| stats
count(eval(status="offline")) AS offline_count
count(eval(status="online")) AS online_count
earliest(eval(if(status="offline",_time,""))) AS offline
earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
offline_count=0 AND online_count>0,"Online",
offline_count>0 AND online_count=0,"Offline",
offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online"),
offline_count>0 AND online_count>0 AND online>offline, "Offline"),
offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

gcusello
SplunkTrust

Hi, sorry, please try this:

index="XXXX" 
| rename "response_details.response_payload.entities{}" as status
| where name="YYYY"
| stats
count(eval(status="offline")) AS offline_count
count(eval(status="online")) AS online_count
earliest(eval(if(status="offline",_time,""))) AS offline
earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
   offline_count=0 AND online_count>0,"Online",
   offline_count>0 AND online_count=0,"Offline",
   offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online",
   offline_count>0 AND online_count>0 AND online>offline, "Offline",
   offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello

This time it runs without error, but no results are found.

index="XXXX" "Genesys system is available"
| rename "response_details.response_payload.entities{}.onlineStatus" as status
| where name="YYYY"
| stats
count(eval(status="offline")) AS offline_count
count(eval(status="online")) AS online_count
earliest(eval(if(status="offline",_time,""))) AS offline
earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
offline_count=0 AND online_count>0,"Online",
offline_count>0 AND online_count=0,"Offline",
offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online",
offline_count>0 AND online_count>0 AND online>offline, "Offline",
offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

gcusello
SplunkTrust

Hi @parthiban,

I found an error in the eval definition, but it shouldn't be the issue:

index="XXXX" "Genesys system is available"
| rename "response_details.response_payload.entities{}.onlineStatus" as status
| where name="YYYY"
| stats
count(eval(status="offline")) AS offline_count
count(eval(status="online")) AS online_count
earliest(eval(if(status="offline",_time,""))) AS offline
earliest(eval(if(status="online",_time,""))) AS online
| fillnull value=0 offline_count
| fillnull value=0 online_count
| eval condition=case(
offline_count=0 AND online_count>0,"Online",
offline_count>0 AND online_count=0,"Offline",
offline_count>0 AND online_count>0 AND online>offline, "Offline but newly online",
offline_count>0 AND online_count>0 AND offline>online, "Offline",
offline_count=0 AND online_count=0, "No data")
| search condition="Offline" OR condition="Offline but newly online"
| table condition

Debug the search to understand whether the search conditions are met: remove the search statement and see which values you have.
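As a possible way to do this check, the sketch below lists every name and status combination that is actually present in the events. It assumes the payload structure shown earlier in this thread; the field name entity is only illustrative.

```spl
index="XXXX" "Genesys system is available"
| spath path=response_details.response_payload.entities{} output=entity
| mvexpand entity
| spath input=entity path=name output=name
| spath input=entity path=onlineStatus output=status
| stats count BY name status
```

Two things are worth checking in the output. First, the exact case of the status values: comparisons inside eval and where are case-sensitive, so status="offline" does not match "OFFLINE", while the search command matches values case-insensitively. Second, whether a field called name exists at all before the where clause: if only the onlineStatus path is renamed, name may be empty and the where clause may filter out every event.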

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 
If I remove the search condition below, I get this result.

| search condition="Offline" OR condition="Offline but newly online"
| table condition

 

parthiban_0-1702457118544.png

 

gcusello
SplunkTrust

Hi @parthiban ,

Use the correct field for "status" and check whether the conditions in the stats command are the correct ones.

Ciao.

Giuseppe


parthiban
Path Finder

Hi @gcusello 

I am using the correct field, the one mentioned below.

| rename "response_details.response_payload.entities{}.onlineStatus" as status

 

gcusello
SplunkTrust

Hi @parthiban,

If you run:

index="XXXX" "Genesys system is available"
| rename "response_details.response_payload.entities{}.onlineStatus" as status
| where name="YYYY"

which values do you get for the status field?

Ciao.

Giuseppe
