Hi there,
I am writing Ansible playbooks that configure my local Splunk universal forwarders.
To set up a mock receiver for testing, I am trying to use the splunk-ansible GitHub playbook/roles correctly. I can set up a splunk_standalone OK, and it says it's ready to receive forwarded inputs on 9997, but I can't seem to connect to it correctly.
How do I run the playbook to create an unlicensed VM for a test scenario that can accept forwarders?
There aren't any great (work-out-of-the-box) examples in the documentation.
I am using Molecule to spin up a pair of Vagrant VMs: a 'splunk' CentOS VM (receiver) with the splunk_standalone role applied (github.com/splunk/splunk-ansible), and an 'ubuntu' VM with my own universal forwarder role applied. The Splunk version is the latest tgz from the free 60-day Enterprise trial.
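For context, the relevant part of my molecule.yml looks roughly like this (a sketch from memory; the box names and interface syntax are approximate, the IPs are the real ones):
driver:
  name: vagrant
platforms:
  - name: splunk              # receiver, CentOS, gets splunk_standalone
    box: centos/7
    interfaces:
      - network_name: private_network
        type: static
        ip: 10.0.0.1
        auto_config: true
  - name: ubuntu              # forwarder, gets my own UF role
    box: ubuntu/bionic64
    interfaces:
      - network_name: private_network
        type: static
        ip: 10.0.0.2
        auto_config: true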
To converge the receiver, I synced the code, added the splunk_standalone role to the test suite's roles path, and ran the role directly from a Molecule converge playbook. I had to make some guesses about which vars to define, starting from the example defaults.yml for Linux, which was a little incomplete.
Before I run the play I create the splunk directory and the splunk user and group, and afterwards I configure inputs.conf.
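Roughly, the prepare and converge playbooks look like this (a sketch; the exact module arguments and file names may differ slightly from what I actually run):
# prepare.yml - pre-create the splunk user/group and install dir
- hosts: splunk
  become: true
  tasks:
    - name: Create splunk group
      group:
        name: splunk
        gid: 500
    - name: Create splunk user
      user:
        name: splunk
        uid: 500
        group: splunk
        home: /opt/splunk
        create_home: no
    - name: Create the Splunk install directory
      file:
        path: /opt/splunk
        state: directory
        owner: splunk
        group: splunk

# converge.yml - apply the splunk_standalone role with the vars below
- hosts: splunk
  become: true
  vars_files:
    - vars/splunk_defaults.yml
  roles:
    - splunk_standalone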
The included vars I used to run the play are:
---
# https://github.com/splunk/splunk-ansible/blob/develop/docs/USING_DEFAULTS.md
hide_password: false
delay_num: 3
splunk_password: <sekret password>
splunk_gid: 500
splunk_uid: 500
# Splunk defaults plus remainder that allow play to run without error
retry_num: 100
splunk:
  # TASK [splunk_standalone : Enable HEC services] *********************************
  admin_user: molecule
  # TASK [splunk_common : Apply Splunk license] ************************************
  ignore_license: true
  # TASK [splunk_common : Download Splunk license] *********************************
  license_uri:
  # TASK [splunk_standalone : include_tasks] ***************************************
  apps_location:
  # TASK [splunk_common : Set as license slave] ************************************
  license_master_included: false
  role: splunk_standalone
  # TASK [splunk_common : include_tasks] *******************************************
  build_location: <my-desktop>/splunk-7.2.4-8a94541dcfac-Linux-x86_64.tgz
  opt: /opt
  home: /opt/splunk
  user: splunk
  group: splunk
  exec: /opt/splunk/bin/splunk
  pid: /opt/splunk/var/run/splunk/splunkd.pid
  password: "{{ splunk_password | default('invalid_password') }}"
  # This will be the secret that Splunk will use to encrypt/decrypt.
  # secret: <secret>
  svc_port: 8089
  s2s_port: 9997
  # s2s_enable opens the s2s_port for splunktcp ingestion.
  s2s_enable: 0
  http_port: 8000
  # This will turn on SSL on the GUI and sets the path to the certificate to be used.
  http_enableSSL: 0
  # http_enableSSL_cert:
  # http_enableSSL_privKey:
  # http_enableSSL_privKey_password:
  hec_port: 8088
  hec_disabled: 0
  hec_enableSSL: 1
  # The hec_token here is used for INGESTION only (receiving splunk events).
  # Setting up your environment to forward events out of the cluster is another matter entirely.
  hec_token: 00000000-0000-0000-0000-000000000000
  app_paths:
    default: /opt/splunk/etc/apps
    shc: /opt/splunk/etc/shcluster/apps
    idxc: /opt/splunk/etc/master-apps
    httpinput: /opt/splunk/etc/apps/splunk_httpinput
  # Search Head Clustering
  shc:
    enable: false
    # Change these before deploying
    secret: some_secret
    replication_factor: 3
    replication_port: 9887
  # Indexer Clustering
  idxc:
    # Change before deploying
    secret: some_secret
    search_factor: 2
    replication_factor: 3
    replication_port: 9887
When the VMs are converged, logging in with molecule login -h <hostname>, netcat shows that their SSH ports are visible to each other. The VMs communicate on the 10.0.0.0/24 network; the Splunk receiver is at 10.0.0.1 and the forwarder is at 10.0.0.2.
Running
nc -zv 127.0.0.1 9997
on the receiver reports that port 9997 connects OK. But from the forwarder,
nc -zv 10.0.0.1 9997
returns an error. This is in line with the errors seen on the forwarder in splunkd.log:
ERROR TcpOutputFd - Connection to host=10.0.0.1:9997 failed
WARN TcpOutputProc - Applying quarantine to ip=10.0.0.1 port=9997 _numberOfFailures=2
On the receiver, splunk list inputstatus shows:
<snipped local log listeners>
tcp_cooked:listenerports :
9997
There are no active firewalls on the VMs; they're lightweight configurations for testing config-management code.
Currently, the receiver's inputs at ./system/local/inputs.conf (or, if I use the web UI, ./apps/splunk_monitoring_console/local/inputs.conf) are set to:
[splunktcp://9997]
listenOnIPv6 = no
disabled = 0
acceptFrom = 10.0.0.0/24
connection_host = ip
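To rule out a precedence/merge problem between those two inputs.conf locations, I'm also planning to dump the effective configuration with btool on the receiver, e.g.:
/opt/splunk/bin/splunk btool inputs list splunktcp --debug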
I've tried this with IPv6 enabled, and/or with connection_host set to none (DNS is not configured on these hosts), but without success.
The forwarder's outputs at ./system/local/outputs.conf are set to:
[tcpout]
defaultGroup = default-autolb-group
[tcpout:default-autolb-group]
server = 10.0.0.1:9997
[tcpout-server://10.0.0.1:9997]
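For completeness, my forwarder role just writes that file and restarts the UF. A simplified sketch of the task (the path assumes the default /opt/splunkforwarder install, and 'restart splunkforwarder' is a hypothetical handler name):
- name: Configure forwarder outputs
  copy:
    dest: /opt/splunkforwarder/etc/system/local/outputs.conf
    owner: splunk
    group: splunk
    content: |
      [tcpout]
      defaultGroup = default-autolb-group

      [tcpout:default-autolb-group]
      server = 10.0.0.1:9997

      [tcpout-server://10.0.0.1:9997]
  notify: restart splunkforwarder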
My sub questions are:
Ideas? What other troubleshooting steps can I take? (I've listed the checks I can think of trying next at the end of this post.)
For example, in the docs for the Splunk containers I see there is a way to generate the defaults file instead of copy-pasting it from the docs, and the documented way to run the play is to target site.yml directly to control the process.
Having skipped both of those, I am wondering if I missed something that would allow the receiver to accept connections correctly. But since I'm new to this, it could be literally anything.
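These are the checks I can think of trying next on the receiver (assuming the usual tools are available on CentOS; better suggestions welcome):
# Is splunkd actually listening on 9997, and on which address?
ss -ltnp | grep 9997
# Confirm there really are no firewall rules in the way
iptables -L -n
# Do any packets from the forwarder arrive at all? (if tcpdump is installed)
tcpdump -i any -nn port 9997
# Anything from the TCP input processor in the receiver's splunkd.log?
grep TcpInputProc /opt/splunk/var/log/splunk/splunkd.log | tail -20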