I have 5,000+ servers and 10,000+ logs coming into Splunk daily. How do I verify, using a dashboard or report, that all of the logs are coming into Splunk properly?
If any of the logs are not received, I need to be alerted through an email alert.
We use the following search. It runs for over 2 minutes in our environment (not perfect, but it works for us). We have it scheduled to run periodically; you can adjust the timing to meet your needs.
index=* earliest=-30m@m
| eval when=if(_time>relative_time(now(), "-15m@m"), "now", "earlier")
| eval h=host." - ".source
| chart count over h by when
| where earlier>0 AND now=0
PS: This will take a while to run. Depending on how long it runs in your environment, you may want to consider summary indexing (SI) for this.
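If the raw-event search above is too slow, a tstats search over indexed fields is usually much faster, since it reads index metadata rather than raw events. A sketch of the same "seen earlier but silent in the last 15 minutes" logic (verify field availability in your environment; tstats only works against indexed fields such as host and source):

| tstats count where index=* earliest=-30m@m by host, source, _time span=15m
| eval when=if(_time>=relative_time(now(), "-15m@m"), "now", "earlier")
| eval h=host." - ".source
| chart sum(count) over h by when
| where earlier>0 AND now=0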
There are a variety of reasons why logs might not be coming into Splunk properly. If you are looking for Splunk-internal ingest failures, then the splunkd log in the _internal index can be used to build an alert for Splunk failing to index because of blocked queues, etc. However, this would not cover failures outside of Splunk.
My idea would be to create a time window where you would expect at least one log to come in and alert on when that search returns no results.
For example, if you expect at least one event to be sent over from your syslog server every minute, build an alert:
Search "source=$syslogserver" over a rolling 1-minute time window, and fire the alert if the number of results == 0.
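As a minimal sketch of such a saved search (the source value here is a hypothetical example; substitute your actual syslog input):

source=udp:514 earliest=-1m@m latest=@m

Schedule it to run every minute and set the alert trigger condition to fire when the number of results equals 0.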
So for each discrete source you could generate a time window search that could tell you whether any logs have been indexed within the set time period. Absence of results would mean that something might be wrong.
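Rather than one saved search per source, this can also be generalized with Splunk's metadata command, which reports when each source was last seen in an index. A sketch, assuming the index name and the 900-second (15-minute) threshold are placeholders you would adjust:

| metadata type=sources index=main
| eval lag_seconds=now()-recentTime
| where lag_seconds > 900
| convert ctime(recentTime)

Any source listed by this search has not had events indexed within the threshold, so alerting when it returns results gives you one alert covering every source in the index.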