Reporting

How to monitor log source / index outages?

Glasses
Builder

I have about a dozen data sources that I want to monitor for an outage... like >>> no events in the last 60 minutes.

Currently I have been using a separate alert for each data source / index, which runs every hour and alerts if there are fewer than 1 event (i.e. no events).
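
For illustration, each of those per-source alerts is basically just a scoped search like this (the index name here is a placeholder), scheduled hourly and set to trigger when the result count is less than 1:

index=my_index earliest=-60m latest=now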

I am just wondering if there is a better way to do this... I also have to contend with some sources having a longer than 60-minute delay at times...

 

Thank you


rafamss
Contributor

I have used the commands below to do what you are describing. You can apply your own rule to trigger an alert, or just monitor it in a dashboard, for example.

#01 - Monitor the incoming data from hosts
| metadata type=hosts
| convert ctime(lastTime) ctime(recentTime)
#02 - Monitor the incoming data from sourcetypes
| metadata type=sourcetypes
| convert ctime(lastTime) ctime(recentTime)
#03 - Monitor the incoming data from a specific index
| metadata type=sourcetypes index=_internal
| convert ctime(lastTime) ctime(recentTime)

Also take a look at this post on the Splunk blog: https://www.splunk.com/en_us/blog/tips-and-tricks/metadata-metalore.html
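
To turn #03 into an alert for one index, something along these lines might work as the trigger rule (just a sketch, I have not tested this exact search; the index and the 60-minute threshold are placeholders). It keeps only the sourcetypes whose most recently indexed event is older than the allowed delay:

| metadata type=sourcetypes index=_internal
| where recentTime < relative_time(now(), "-60m")

and the alert would trigger when the number of results is greater than zero.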


Glasses
Builder

Thank you for the suggestion, but we were not able to create alerts for individual indexes / data sources with | metadata. Maybe you know how to do that...?

 

So we are currently using time-interval outage alerts for each specific index AND/OR sourcetype AND/OR source.

For example >>> 

| tstats count where index=<foo> sourcetype=<bar> earliest=-60m

The result is a count, so in the alert we use the custom trigger condition "search count=0".
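
If it helps anyone consolidating a dozen of these, a single scheduled check along these lines is also possible (a rough sketch only; the index names, the 7-day lookback, and the 60-minute threshold are placeholders, not our real values):

| tstats latest(_time) as lastSeen where earliest=-7d (index=foo OR index=bar) by index sourcetype
| eval minutesSinceLast = round((now() - lastSeen) / 60)
| where minutesSinceLast > 60

Trigger when the number of results is greater than 0; a source that is known to lag can be given a larger threshold, for example with a case() on sourcetype or a lookup of per-source limits.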
