I set up an alert that runs every 15 minutes and looks at the last 15 minutes of data.
I have a list of hosts in a lookup table; when a host in the lookup table is not reporting, the alert sends an email. But when indexing is stopped, the indexers are stopped, or indexing latency exceeds 15 minutes, I receive an alert for every host.
I want to stop receiving alerts when an environmental issue like the ones above occurs.
I have a query something like this
| inputlookup host.csv | join type=left host [ search index=* [| inputlookup host.csv ] | stats count by host ] | fillnull value=0 | where count==0
Try this:
index=* [| inputlookup host.csv | fields host ]
| append [| inputlookup host.csv ]
| stats count by host
| eval count = count - 1
| eventstats count AS hostCount
| where count == 0
| eventstats count AS hostCountMissing
| eval pct = 100.0 * (hostCountMissing / hostCount)
| where pct < 75
Adjust the 75 to suit.
All right, there are a few solutions to your problem.
The first solution is to set up your alert to "throttle" multiple alerts per host. It lets the first alert through, then holds any further alerts for a set length of time.
The second solution is to change your query so that it eliminates hosts that have not reported for over a certain length of time.
The third solution is to change your query so that it uses _indextime rather than _time.
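For the first option, throttling can be set in the alert's trigger conditions in the UI, or directly in savedsearches.conf. A sketch (the stanza name is an example; the suppress settings are the real ones):

```
# savedsearches.conf -- stanza name is hypothetical
[missing_host_alert]
# let the first alert per host through, then
# suppress further alerts for that host for 4 hours
alert.suppress = 1
alert.suppress.fields = host
alert.suppress.period = 4h
```

Throttling by the host field means each missing host alerts at most once per period, but it does not by itself distinguish a real outage from an indexing stall.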
Why not use the | metadata command?
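As a minimal sketch of that approach, the metadata command returns lastTime and recentTime per host, so stale hosts can be flagged without scanning events (the index name and 15-minute threshold are assumptions):

```
| metadata type=hosts index=main
| eval minutesSinceLast = (now() - recentTime) / 60
| where minutesSinceLast > 15
| table host lastTime recentTime minutesSinceLast
```

Note that recentTime is based on index time, so it also stops advancing when indexing stalls, which matters for the discussion below.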
I think there are simpler ways to achieve your goal of alerting on missing hosts, without the cumbersome search and the attempt to resolve environmental issues.
Also, I can't see how you would receive an alert if the indexers stopped, since your search would return no results at all.
Please elaborate on your requirements and we will surely help!
I am sorry; I want to find hosts which are down, using logs. We sometimes receive huge volumes of data and the processing queues get blocked. The alert looks at the last 15 minutes; when the queues are blocked it finds no data, considers the hosts down, and sends an email alert, which is not expected. I want to suppress the alert with some condition when the indexer stops indexing.
We still get lookup data even when the indexing queues are blocked; when we run the search we have the hostnames, but the event counts are null because the indexer has stopped indexing.
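One way to encode that condition in the alert itself is to require that at least some events were indexed in the window, so a fully blocked indexer suppresses the missing-host alert entirely. A sketch, assuming the host.csv lookup and count-by-host search from earlier in the thread:

```
index=* [| inputlookup host.csv | fields host ]
| stats count by host
| append [| inputlookup host.csv | eval count=0 ]
| stats sum(count) AS count by host
| eventstats sum(count) AS totalIndexed
| where count == 0 AND totalIndexed > 0
```

If indexing has stopped completely, every host's count is 0, so totalIndexed is 0 and no rows survive the final where, meaning no alert fires. This does not cover partial stalls where some data still trickles in.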
I used the metadata command; it returns lastTime and recentTime. If the indexer queues are blocked, nothing is indexed, so recentTime does not advance to the current time (because nothing is being indexed).
Is the hostname and event count being null a definite indication that this is the problem? Are there other scenarios where this might happen? If not, could you add that to the SPL of your alert so that it only alerts when the hostname and event count are not null?
Yes, when a host is down we get null as the event count.
First, fix your blocked queues.
Second, you can use alerts that leverage _indextime to overcome the lagging/latency issue.
Read this answer in great detail:
https://answers.splunk.com/answers/678655/how-to-trigger-alerts-when-indextime-time-1.html
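Following the idea in that answer, a minimal sketch of an _indextime-based check, which measures indexing lag directly instead of inferring it from missing events (the index name and 15-minute threshold are assumptions):

```
index=* earliest=-15m
| eval lagSeconds = _indextime - _time
| stats max(lagSeconds) AS maxLag
| where maxLag > 900
```

A search like this can serve as a separate "indexing is lagging" alert, or its result can be used as a condition to suppress the missing-host alert while latency is high.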
Sometimes it takes hours to solve the issue, and by that time I have received a lot of alerts.
Sounds like you have bigger problems to address first.