Alert or Show when a host is down from any index

jaywilwk · ‎10-28-2013

I have multiple indexes, and I would like to search across all of them to see when a host stops sending logs or goes down within a 24-hour window.

kristian_kolb · ‎10-28-2013

Or you can look at the metadata to see when a host last sent some data. The example below lists hosts that have not sent data in the last day (86400 seconds). This should be significantly quicker than searching through the metrics logs.

| metadata type=hosts | where recentTime < now() - 86400 | eval lastSeen = strftime(recentTime, "%F %T") | fields + host lastSeen
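Since the question is about several indexes, here is a sketch of the same search with the index named explicitly. It assumes the `index=*` argument of `metadata` (which matches all non-internal indexes) and that your role is allowed to search those indexes. Without an `index` argument, `metadata` only looks at the default index, which may be one reason a host goes missing from the list. The time range picker also matters, because `metadata` only reads buckets that overlap the selected range, so a wide range such as "All time" is a safer choice here.

```spl
| metadata type=hosts index=*
| where recentTime < now() - 86400
| eval lastSeen = strftime(recentTime, "%F %T")
| fields host lastSeen
```

Saved as an alert, this would trigger whenever the number of results is greater than zero.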

/K

jaywilwk · ‎10-28-2013

Yes it worked.

kristian_kolb · ‎10-28-2013

Well, yes, that is the purpose of the where statement in the original query. However, does lastSeen for your 'missing' host seem correct?

jaywilwk · ‎10-28-2013

OK, this one shows everything. It doesn't show only the ones that are down or not logging. Is it possible to show only the ones that are down or not logging?

kristian_kolb · ‎10-28-2013

Ok.

| metadata type=hosts | eval lastSeen = strftime(recentTime, "%F %T") | fields + host lastSeen

?

jaywilwk · ‎10-28-2013

It shows with only the first part, but not with the rest.

kristian_kolb · ‎10-28-2013

Does it show with only the first part?

| metadata type=hosts

/K

jaywilwk · ‎10-28-2013

This worked fine, but it left out some hosts. There's a host not showing that I know isn't reporting; it hasn't been reporting for over 3 days now.

somesoni2 · ‎10-28-2013

All the hosts (whether they are sending data or not) send a heartbeat to the indexer, which is recorded in the _internal index. You can query that to identify whether a host is down or not.

index=_internal source=*metrics.log group=tcpin_connections earliest=-7d@d
| eval sourceHost=coalesce(hostname, sourceHost)
| eval age = now() - _time
| stats first(age) as age, first(_time) as LastTime by sourceHost
| convert ctime(LastTime) as "Last Active On"
| eval Status = case(age < XXX, "Running", age >= XXX, "DOWN")

Here XXX is the duration in seconds after which, if there has been no heartbeat from a host, the host is considered down. Typically it can be 2-3 minutes (120 or 180).
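As an illustration only, here is a sketch of the same idea with the threshold filled in and the output limited to hosts that look down. The value 180 is an assumed threshold taken from the 2-3 minute range above, and `latest(_time)` is swapped in for `first(_time)` so the result does not depend on the order in which events are returned. Adjust both to your environment.

```spl
index=_internal source=*metrics.log group=tcpin_connections earliest=-7d@d
| eval sourceHost=coalesce(hostname, sourceHost)
| stats latest(_time) as LastTime by sourceHost
| eval age = now() - LastTime
| eval Status = if(age < 180, "Running", "DOWN")
| convert ctime(LastTime) as "Last Active On"
| where Status="DOWN"
```

A host can only appear here if it produced at least one tcpin_connections event within the 7-day window, so a forwarder that has been silent for longer than that will not be listed at all.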

somesoni2 · ‎08-28-2014

Did it show the host before you took it down (with the same search)? The query depends on the existence of the fields hostname OR sourceHost in the events from that host (the result is used in stats, so if the fields are null, those hosts won't show up).

splunker9999 · ‎11-10-2016

@somesoni2, do we need to run your search on the indexer? When I tried running this on one of my search heads, I was getting the status of my Splunk servers but not the forwarders.

thelen_m_kevin · ‎08-28-2014

I know this is a bit dated, but I was interested in finding hosts that "suddenly stop reporting to splunk" and I found this answer.

When I run this search everything looks fine, and it makes sense. But I decided to test this by issuing a stop command on one of my forwarding agents. That device no longer shows up in the list at all instead of showing up with a "down" status.

Can anyone take a stab at why that would happen? (I haven't altered the search except to add seconds where there are XXX's)

jaywilwk · ‎10-28-2013

I tried this search and it didn't yield any results. I tried changing XXX to different numbers of seconds and it still didn't yield any results.
