Getting Data In

Where does the "fwdinfo" sourcetype come from?

Jason
Motivator

I see some useful information in the _internal index under the fwdinfo sourcetype (source fwd), but I can't figure out where this data comes from. Nothing in the conf files, on either the forwarder or the indexer, refers to it.

Edit: Annoyingly, this data doesn't use the hostname set by the Splunk instance. It always uses the reverse DNS name, which is not always the same!

It turns out this is a bug: the data should not be ending up in _internal.

1 Solution

jrodman
Splunk Employee

fwdinfo is data generated by 4.2+ forwarders that describes the state of their forwarding and their communication with the indexers they are connected to. The forwarding code itself inserts these descriptive items into the data stream bound for the receiving indexer.

Using reverse DNS (rdns) to set the hostname is what happens to any event that arrives at a tcp input without a hostname provided. This applies to plain tcp inputs as well as splunktcp. With splunktcp, however, the arriving data nearly always provides a hostname to use, so the behavior is not usually apparent.
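As a rough illustration of the fallback described above (not Splunk's actual code), the following Python sketch picks a host for an incoming event: it uses the hostname supplied with the data if there is one, and otherwise tries a reverse DNS lookup on the sender's IP. The function name `resolve_event_host` and the injectable `lookup` parameter are hypothetical, and falling back to the raw IP when the lookup fails is an assumption made for the sketch.

```python
import socket


def resolve_event_host(provided_host, peer_ip, lookup=socket.gethostbyaddr):
    """Pick a host value for an event arriving over TCP (illustrative only)."""
    # Data that carries its own hostname (the usual splunktcp case) wins.
    if provided_host:
        return provided_host
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        return lookup(peer_ip)[0]
    except OSError:
        # Assumption: fall back to the raw IP when no PTR record resolves.
        return peer_ip
```

Because the reverse DNS name depends on what the resolver returns for the sender's address, it can differ from the hostname configured on the forwarder, which matches the mismatch reported in the question.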

As of this writing (4.2.3 is the current release), it seems that this happens whenever the "v3" protocol (used by 4.2+) is in use, and there is no setting available to change it.


Edit: the fwdinfo events were never intended to reach the index. They were intended to provide data so that metrics.log would contain some information about connected forwarders.

Changes:

  • In 4.2.5+, forwarders use the same host value for fwdinfo events as for other events. Specifically, they take the value for host from the default stanza of inputs.conf.
    • If for some reason you have no host set in inputs.conf, they will use a host name of 'fwdinfo'.
  • In 4.3+, the indexer will not index this data, so you should never see it again. It was pretty much only wasting disk space; the useful data should instead be gathered from metrics.log.
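The 4.2.5+ host selection described in the list above can be sketched in Python as follows. This is only an illustration of the stated rule, not the forwarder's implementation: the function name `fwdinfo_host` is hypothetical, and the sketch reads a single inputs.conf text rather than the layered set of configuration files a real Splunk instance merges.

```python
import configparser


def fwdinfo_host(inputs_conf_text):
    """Return the host a 4.2.5+ forwarder would put on fwdinfo events (sketch)."""
    # interpolation=None keeps values literal; strict=False tolerates
    # repeated stanzas in the same text.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(inputs_conf_text)
    # Lowercase [default] is an ordinary section to configparser.
    host = parser.get("default", "host", fallback="").strip()
    # Per the answer above: with no host configured, the literal 'fwdinfo' is used.
    return host or "fwdinfo"


print(fwdinfo_host("[default]\nhost = forwarder01\n"))
print(fwdinfo_host("[monitor:///var/log]\nsourcetype = syslog\n"))
```

The first call models a forwarder with `host` set in the default stanza; the second models one with no host configured, where the events would show up under the host name 'fwdinfo'.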

yannK
Splunk Employee

This internal sourcetype is based on the forwarder's host/hostname.
