
What is the impact of an instance forwarding to itself

kevintelford
Path Finder

Setup

We have a cluster of compute nodes, call them node01-node05. They all run jobs that create data we'd like to put into Splunk, and jobs are farmed out based on available resources. Splunk indexers also co-exist on node01-node05. We could index each job's output with the local Splunk installation, but the problem is we'd see severe data skew: node01-node04 may be busy for a long while, which farms all the jobs that feed Splunk to a single node. So the solution is to redistribute the data with a forwarder (which, while not perfectly load balanced, is better). We could use a dedicated forwarder to redistribute, but then we'd pay the penalty of using the network twice - once to the forwarder and once back to the cluster. So I made each indexer also forward data.

inputs.conf

[batch://path/to/files]
# sinkhole: index the files, then delete them
move_policy = sinkhole
index = my_index
sourcetype = my_sourcetype
crcSalt = <SOURCE>
# route this input's data to the tcpout group defined in outputs.conf
_TCP_ROUTING = rest_of_the_splunk_cluster

# listen for data forwarded from the other nodes
[splunktcp:9997]

outputs.conf

[tcpout]
heartbeatFrequency = 15
maxQueueSize = 10000

[tcpout:rest_of_the_splunk_cluster]
# note: each node's own address appears in its own server list
server = node01:9997, node02:9997, node03:9997, node04:9997, node05:9997
autoLB = true
autoLBFrequency = 5

Question

OK, so now that we've done that, it works: if we're on node01, the data gets distributed to node02-node05. But what is the impact, if any, of having node01 listed as a server in its own outputs.conf? It won't send the data to itself, but does it try and fail? Time out?

The simple answer is to not include that entry in each node's own outputs.conf. My issue is that instead of 5 nodes I have a gaggle, and to further complicate matters, all of our software deployments/upgrades rely on Puppet, which doesn't make this sort of thing any easier.
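For reference, that "simple answer" would mean generating a per-node outputs.conf along these lines on node01 (a sketch only, reusing the group and ports from above, with node01 dropped from its own list) - which is exactly what's painful to template across a gaggle of nodes:

# outputs.conf on node01 - per-node variant, node01 omitted from its own list
[tcpout]
heartbeatFrequency = 15
maxQueueSize = 10000

[tcpout:rest_of_the_splunk_cluster]
server = node02:9997, node03:9997, node04:9997, node05:9997
autoLB = true
autoLBFrequency = 5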

Thanks, K

1 Solution

kevintelford
Path Finder

This can now be accomplished by installing the Splunk universal forwarder alongside Splunk itself. The only change required on the forwarder is the splunkd management port it binds to. To do this, add a web.conf to $SPLUNK_FORWARDER_HOME/etc/system/local/ that says

[settings]
mgmtHostPort = <SERVERIP:newPort>

Boom. Boots upside yo head.
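To spell out the layout this implies (a sketch, not verbatim from the post - the port value and file placement are assumptions): the universal forwarder takes over the batch input and the load-balanced output, binding its splunkd management port somewhere other than the indexer's default 8089, while the full instance on the same box just keeps listening on 9997. Something like:

# $SPLUNK_FORWARDER_HOME/etc/system/local/web.conf
# move the UF's management port off the 8089 used by the co-resident indexer
[settings]
mgmtHostPort = 127.0.0.1:8090

# $SPLUNK_FORWARDER_HOME/etc/system/local/outputs.conf
# the UF can now load-balance across every node, including the indexer on this box
[tcpout:rest_of_the_splunk_cluster]
server = node01:9997, node02:9997, node03:9997, node04:9997, node05:9997
autoLB = true

The [batch://...] input from the original inputs.conf would presumably move to the universal forwarder as well, at which point the per-input _TCP_ROUTING line is no longer needed.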



gkanapathy
Splunk Employee

You could do this with a non-universal forwarder (i.e., a standard Splunk light forwarder) if you're not on 4.2, by just installing the second instance in a second location. The disadvantage is that you can't do this on Windows machines. You also need to muck with and copy/rename /etc/init.d/splunk and the associated run-level symlinks to remove conflicts between the two instances.
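If you go that route, the second full instance at minimum needs its own management port, and its own web port unless you disable Splunk Web, so it doesn't collide with the first. For example (placeholder values, not from the thread):

# etc/system/local/web.conf on the second full Splunk instance
[settings]
mgmtHostPort = 127.0.0.1:8090
# either move Splunk Web to a free port...
httpport = 8001
# ...or turn it off entirely for a forwarding-only instance
# startwebserver = 0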
