Getting Data In

Alerting on forwarder up/down events

Lowell
Super Champion

I have a saved search that notifies me when a forwarder goes up or down based on various TcpInputProc and TcpOutputProc messages coming from both the indexer and the forwarder machines.

The problem I'm running into is that I'm seeing a bunch of messages like this even when the forwarder is not going down. Anybody know why this is happening, or a more reliable message that I can use for this:

05-26-2010 11:24:18.346 INFO TcpInputProc - Hostname=host.domain.com closed connection

I'm wondering if this simply means that the connection was temporarily closed due to no data, but that would seem odd since I'm seeing this primarily on a few servers that are fairly busy.


For anyone interested. My full search runs ever 5 minutes, and looks like this: (Be prepared to do some scrolling)

 index=_internal sourcetype="splunkd" (TcpInputProc "closed connection" OR "Connection accepted from") NOT localhost | eval sender=if(searchmatch("TcpOutputProc"),host,"") | eval receiver=if(searchmatch("TcpInputProc"),host,"") | eval action=if(searchmatch("Connect* accepted OR to"),"up", "down") | eval sender=coalesce(Hostname,sender) | rex "to (?<receiver>[^:]+)(:\d+)?" | rex "from (?<sender>\S+)" | replace "dnsname.example.com" with "splunk.domain.com", "anotherdnsname.domain.com" with "therealservername.domain.com" in sender, receiver | stats min(_time) as start_time, max(_time) as end_time, list(action) as actions, first(action) as final_state by sender,receiver | eval start_time=strftime(start_time,"%I:%M %p") | eval end_time=strftime(end_time,"%I:%M %p")
Tags (2)
1 Solution

Simeon
Splunk Employee
Splunk Employee

I recommend using the hosts metadata and searching for events received. Metadata contains when the last event was received from a specific host, source, or sourcetype. You can use a where statement that compares the last time an event was received to ensure that data is streaming. The reasons this is better than searching the splunkd log:

  1. You will know an event has actually come through
  2. The search itself is significantly faster

The search I use is as follows:

| metadata type=hosts | eval diff=now()-recentTime | where diff < 600 | convert ctime(*Time)

This will tell you what hosts have sent data in the past 10 minutes.

View solution in original post

Simeon
Splunk Employee
Splunk Employee

I recommend using the hosts metadata and searching for events received. Metadata contains when the last event was received from a specific host, source, or sourcetype. You can use a where statement that compares the last time an event was received to ensure that data is streaming. The reasons this is better than searching the splunkd log:

  1. You will know an event has actually come through
  2. The search itself is significantly faster

The search I use is as follows:

| metadata type=hosts | eval diff=now()-recentTime | where diff < 600 | convert ctime(*Time)

This will tell you what hosts have sent data in the past 10 minutes.

Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...