Getting Data In

How can I see the search peer that a forwarder is connected to when using indexer discovery?

Lucas_K
Motivator

Apart from seeing data coming from the forwarders arriving in an index, is there any way I can see which indexer a forwarder is currently sending data to? Either via a command, api call or log entry?

On a forwarder that isn't using indexer discovery, you can run "splunk list forward-server", and you can also see a "Connected to " message in splunkd.log.

With indexer discovery, neither of these shows the same information.
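For reference, on a non-discovery forwarder this kind of search shows the last indexer it connected to (the forwarder host value is a placeholder):

index=_internal host=<my_forwarder> source=*splunkd.log component=TcpOutputProc "Connected to idx"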


ketilolav
Explorer

Hi, I just discovered a CLI command you can run to see the active forward-servers (receiving indexers) your UF/HF is talking to:

$SPLUNK_HOME/bin/splunk list forward-server

will list the in-memory peer nodes the UF/HF is talking to.
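The output looks roughly like this (the address is just an example):

Active forwards:
    10.10.10.10:9997
Configured but inactive forwards:
    None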


landen99
Motivator

From the search head, you can see which search peers (indexers) have received the forwarder's data quite effectively using tstats.

| tstats count where index=_* by splunk_server
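To narrow it down to a single forwarder, something like this should work (the host value is a placeholder):

| tstats count where index=_internal host=<my_forwarder> by splunk_server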

mattymo
Splunk Employee

We are on the right track!

garethatiag is right, it's broken! TcpOutputProc is telling us that the connection to the indexers is down, hence the blocking for x amount of seconds.

On the master node, try searching _internal for:

index="_internal" component=CMIndexerDiscovery host=cmaster

You are looking for a message like:

CMIndexerDiscovery - Registering new forwarder <GUID> (total: 1). Heartbeat assigned for next check: 30 seconds

If you see nothing, then you need to check the master URI configured on the forwarder.
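The stanza to double-check on the forwarder looks roughly like this (the stanza name, group name and master host below are placeholders):

# outputs.conf on the forwarder
[indexer_discovery:master1]
master_uri = https://<cluster_master>:8089

[tcpout:<your_group>]
indexerDiscovery = master1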

If you do see something, then move on to checking the logs on the forwarder again:

egrep 'ERROR|WARN' splunkd.log
egrep 'HttpPubSubConnection' splunkd.log

Also, make sure the pass4SymmKey matches on both the master and the forwarder.
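To verify what each side actually has in effect, btool is handy (a rough sketch; note that splunkd rewrites pass4SymmKey to an encrypted value after startup, so compare what you originally put in the .conf files rather than the encrypted strings):

# on the forwarder
$SPLUNK_HOME/bin/splunk btool outputs list --debug | egrep -i 'indexer_discovery|pass4SymmKey|master_uri'

# on the cluster master
$SPLUNK_HOME/bin/splunk btool server list indexer_discovery --debug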

Is this a single site or multi-site cluster?

Beyond that, if you paste your master and forwarder config stanzas we can proofread them for ya 😉

- MattyMo

Lucas_K
Motivator

That's getting very close to the correct answer!

index=_internal host=clustermaster component=CMIndexerDiscovery

By looking at those messages on the cluster master I could see that the forwarder wasn't talking to the cluster master correctly for some reason.

Revisiting the doco config examples, I found that I hadn't explicitly set a forwarder password. I had the cluster password set under [clustering], but not one under the [indexer_discovery] stanza.

The doco page makes it seem like it is optional (the "If specified here" part!).

[indexer_discovery]
pass4SymmKey =
* Security key shared between master node and forwarders.
* If specified here, the same value must also be specified on all forwarders
  connecting to this master.

"The pass4SymmKey attribute specifies the security key used with communication between the master and the forwarders. Its value must be the same for all forwarders and the master node. You must explicitly set this value for each forwarder."

I had wrongly assumed that I could use the existing index cluster password that search heads use.

It seems to silently fail with no indication of why.

Once I added the additional password on both the cluster master and the forwarder, I was able to see it report in with its GUID (historically visible from prior logs: index="_internal" host=myforwarders GUID source="/opt/splunkforwarder/var/log/splunk/splunkd.log" | stats values(guid) by serverName).

This, however, seems to be a one-time message.

You can't actively see which indexer a forwarder is currently talking to from those messages.

When it is working correctly, however, the old "Connected to idx=10.11.11.1:9997" messages return.
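For anyone hitting the same thing, the working setup ended up looking roughly like this (the key, stanza name and hostname are placeholders; "Production" is just the name of my tcpout group):

# server.conf on the cluster master
[indexer_discovery]
pass4SymmKey = <discovery_key>

# outputs.conf on the forwarder
[indexer_discovery:master1]
pass4SymmKey = <discovery_key>
master_uri = https://<cluster_master>:8089

[tcpout:Production]
indexerDiscovery = master1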

mattymo
Splunk Employee

Nice work!!

- MattyMo

mattymo
Splunk Employee

Grepping splunkd.log from the CLI on your forwarder, or searching index=_internal for TcpOutputProc, should allow you to audit any TCP activity the forwarder has been up to. It is the same processor responsible for the "Connected to" messages you are referring to.

Here is an example from one of my lab forwarders. Although I am not using indexer discovery, I assume the TcpOutputProc should still be responsible for setting up the connections.

splunker@n00b-splkufwd-01:/opt/splunkforwarder/var/log/splunk$ grep TcpOutputProc splunkd.log
10-09-2016 15:26:20.515 +0000 INFO  TcpOutputProc - Detected connection to 10.10.10.10:9997 closed
10-09-2016 15:26:20.515 +0000 INFO  TcpOutputProc - Will close stream to current indexer 10.10.10.10:9997
10-09-2016 15:26:20.515 +0000 INFO  TcpOutputProc - Closing stream for idx=10.10.10.10:9997
10-09-2016 15:26:51.027 +0000 INFO  TcpOutputProc - Connected to idx=10.10.10.10:9997
10-10-2016 15:07:49.195 +0000 INFO  TcpOutputProc - begin to shut down auto load balanced connection strategy
10-10-2016 15:07:49.311 +0000 INFO  TcpOutputProc - Shutting down auto load balanced connection strategy
10-10-2016 15:07:49.311 +0000 INFO  TcpOutputProc - Auto load balanced connection strategy shutdown finished
10-10-2016 15:07:49.311 +0000 INFO  TcpOutputProc - Received shutdown control key.
10-10-2016 15:07:52.746 +0000 INFO  TcpOutputProc - Initializing with fwdtype=lwf
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - found Whitelist forwardedindex.0.whitelist , RE : .*
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - found Blacklist forwardedindex.1.blacklist , RE : _.*
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - found Whitelist forwardedindex.2.whitelist , RE : (_audit|_introspection|_internal)
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.10.10.10:9997
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - tcpout group n00b-splkidx-02 using Auto load balanced forwarding
10-10-2016 15:07:52.753 +0000 INFO  TcpOutputProc - Group n00b-splkidx-02 initialized with maxQueueSize=512000 in bytes.
10-10-2016 15:07:52.842 +0000 INFO  TcpOutputProc - Connected to idx=10.10.10.10:9997

Let me know if you still can't find the logs and I will set up indexer discovery and test!

- MattyMo

Lucas_K
Motivator

I don't see anything like that. The forwarder just stops working as soon as indexer discovery is turned on.

index=_internal host=forwarder-38* source=*splunkd.log TcpOutputProc

10-07-2016 11:41:08.118 +1100 INFO TcpOutputProc - Connected to idx=xxxxx:9997
.............
10-07-2016 13:45:22.105 +1100 WARN TcpOutputProc - Forwarding to indexer group Production blocked for 3700 seconds.

This UF is running v6.4.1.
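If you want to watch the blocking from the search side, something like this should show the blocked queues on that forwarder (a sketch, assuming the usual group=queue fields in metrics.log):

index=_internal host=forwarder-38* source=*metrics.log group=queue blocked=true | stats count by name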


gjanders
SplunkTrust

In the example you have provided, the forwarding to the indexer is simply not working.

Do you have SSL enabled on port 9997? Do you have multiple ports open on the indexer?

Indexer discovery has some limitations; I found that multiple ports (I had 9997/9998 as Splunk TCP ports) can confuse the indexer discovery...

Also, do you have any errors besides the warning about the output failing to go through?
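A quick way to check what the indexers actually have listening (a rough btool sketch, run on an indexer):

$SPLUNK_HOME/bin/splunk btool inputs list --debug | egrep -i 'splunktcp'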
