Solved: What is the impact of an instance forwarding to itself

kevintelford · ‎11-03-2010

Setup

We have a cluster of compute nodes, call them node01-node05. They all run jobs that create data we'd like to put into Splunk. Jobs are farmed out based on available resources. Splunk indexers also co-exist on node01-node05.

We could index the data from each job with the local Splunk installation. The problem is that we'd see severe data skew: node01-node04 may be busy for a long while, so every job that feeds Splunk gets farmed out to a single node. So the solution is to redistribute the data with a forwarder (which, while not perfectly load balancing, is better). We could use a dedicated forwarder to redistribute, but we'd then pay the penalty of using the network twice: once to the forwarder and once back to the cluster. So I made each indexer also forward data.

inputs.conf

[batch://path/to/files]
move_policy = sinkhole
index = my_index
sourcetype = my_sourcetype
crcSalt = <SOURCE>
_TCP_ROUTING = rest_of_the_splunk_cluster

[splunktcp:9997]

outputs.conf

[tcpout]
heartbeatFrequency = 15
maxQueueSize = 10000

[tcpout:rest_of_the_splunk_cluster]
server = node01:9997, node02:9997, node03:9997, node04:9997, node05:9997
autoLB = true
autoLBFrequency = 5

Question

OK, so we've done that, and it works. If we're on node01, the data gets distributed to node02-node05. But what is the impact, if any, of having node01 listed as a server in its own outputs.conf? It won't send the data to itself, but does it try and fail? Time out?

The simple answer is to not include this entry in outputs.conf. My issue is that instead of 5 nodes I have a gaggle. To further complicate matters, all of our software deployments/upgrades rely on Puppet, which doesn't make this sort of thing any easier.
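For anyone who does want to go the "leave yourself out of the list" route, here is a minimal sketch of generating the per-node server line in Python. It is not Puppet code and not anything Splunk ships: the function names are made up for illustration, and it assumes the entries in the node list match whatever the local hostname looks like once the domain part is stripped.

```python
import socket


def peers_excluding_self(nodes, local_host, port=9997):
    """Build an outputs.conf `server` value listing every node except local_host.

    Hypothetical helper: hosts are compared by short name (text before the
    first dot, case-insensitive), so "node01.example.com" matches "node01".
    """
    local_short = local_host.split(".")[0].lower()
    peers = [n for n in nodes if n.split(".")[0].lower() != local_short]
    return ", ".join(f"{n}:{port}" for n in peers)


def render_tcpout_group(group, nodes, local_host, port=9997):
    """Render a [tcpout:<group>] stanza with the settings from the post above."""
    server = peers_excluding_self(nodes, local_host, port)
    return (
        f"[tcpout:{group}]\n"
        f"server = {server}\n"
        "autoLB = true\n"
        "autoLBFrequency = 5\n"
    )


if __name__ == "__main__":
    # node01..node05, as in the setup described above
    nodes = [f"node{i:02d}" for i in range(1, 6)]
    print(render_tcpout_group("rest_of_the_splunk_cluster", nodes, socket.gethostname()))
```

The same filtering could presumably live in a config-management template; the point is only that each node's list is the full list minus itself.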

Thanks, K

kevintelford · ‎03-29-2011

This can now be accomplished by installing the Splunk universal forwarder alongside Splunk itself. The only change required on the forwarder is the splunkd port it binds to. To do this, add a web.conf to $SPLUNK_FORWARDER_HOME/etc/system/local/ that says

[settings]
mgmtHostPort = <SERVERIP:newPort>

Boom. Boots upside yo head.
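If you have a gaggle of nodes, dropping that file in by hand gets old. A rough Python sketch of scripting it is below. The helper names and the 127.0.0.1:8090 value are assumptions for illustration only (8090 was picked just because it differs from the default splunkd management port, 8089), and the script overwrites any existing web.conf in that local directory, so treat it as a starting point rather than a deployment tool.

```python
from pathlib import Path


def render_web_conf(host, port):
    """Return the web.conf override text described in the post above."""
    return f"[settings]\nmgmtHostPort = {host}:{port}\n"


def write_forwarder_web_conf(forwarder_home, host, port):
    """Write etc/system/local/web.conf under the forwarder's install directory.

    Hypothetical helper: creates the directory if it is missing and
    overwrites any web.conf that is already there.
    """
    local_dir = Path(forwarder_home) / "etc" / "system" / "local"
    local_dir.mkdir(parents=True, exist_ok=True)
    target = local_dir / "web.conf"
    target.write_text(render_web_conf(host, port))
    return target


# Example call (path and port are placeholders, adjust to your install):
# write_forwarder_web_conf("/opt/splunkforwarder", "127.0.0.1", 8090)
```

The forwarder would need a restart afterwards to pick up the new port.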

gkanapathy · ‎03-29-2011

You could do this with a non-universal forwarder (i.e., a standard Splunk Light Forwarder) if you're not on 4.2, just by installing the second instance in a second location. The disadvantage is that you can't do this on Windows machines. You also need to muck with and copy/rename /etc/init.d/splunk and the associated run-level symlinks to remove conflicts between the two instances.
