I have about a dozen data sources that I want to monitor for an outage... like >>> No Events in Last 60 Minutes.
Currently I have been using a separate alert for each data source / index; the alerts run every hour and trigger if there are fewer than 1 event.
I am just wondering if there is a better way to do this... I also have to contend with some sources having longer than 60-minute delays at times...
Thank you
Thank you for the suggestion, but we were not able to create alerts for individual indexes / data sources with | metadata. Maybe you know how to do that...?
So we are currently using time-interval outage alerts for each specific index AND/OR sourcetype AND/OR source.
For example >>>
| tstats count where index=<foo> sourcetype=<bar> earliest=-60m
The result is a count, so in the alert we use the custom trigger condition "search count=0".
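For what it's worth, a single scheduled search can cover several sources at once instead of a dozen separate alerts. A minimal sketch (the index names, the 24-hour lookback, and the 60-minute threshold are assumptions to adjust for your environment):
| tstats max(_time) as lastSeen where (index=<foo> OR index=<bar> OR index=<baz>) earliest=-24h by index
| eval minutesSinceLastEvent = round((now() - lastSeen) / 60)
| where minutesSinceLastEvent > 60
Trigger the alert when the number of results is greater than 0. One caveat: an index with no events at all in the lookback window will not appear in the tstats output, so catching a completely dead source would also require comparing against a static list of expected indexes. For the sources that are sometimes delayed, the threshold could be raised above 60.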
I used the searches below to do what you are doing. You can use your own trigger condition for an alert, or just monitor it in a dashboard, for example.
#01 - Monitor the incoming data from hosts
| metadata type=hosts
| convert ctime(lastTime), ctime(recentTime)
#02 - Monitor the incoming data from sourcetypes
| metadata type=sourcetypes
| convert ctime(lastTime), ctime(recentTime)
#03 - Monitor the incoming data from a specific index
| metadata type=sourcetypes index=_internal
| convert ctime(lastTime), ctime(recentTime)
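To alert on an individual index / data source with | metadata, you can restrict the command with the index= argument (as in #03) and then filter on recentTime, which is the index time of the most recent event. A minimal sketch, assuming a 60-minute threshold:
| metadata type=sourcetypes index=<foo>
| eval minutesSinceLastEvent = round((now() - recentTime) / 60)
| where minutesSinceLastEvent > 60
Trigger the alert when the number of results is greater than 0. Note that | metadata only accepts the type= and index= arguments, so to narrow to one sourcetype you would filter the results afterwards, e.g. | search sourcetype=<bar>.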
And take a look at this post on Splunk's blog: https://www.splunk.com/en_us/blog/tips-and-tricks/metadata-metalore.html