Monitoring Splunk

Splunk Commands Hanging

P47R14RCH
Observer

I have installed Splunk Universal Forwarder 8.0.4 on a RHEL machine. The install succeeded and I am now receiving logs from it. However, on the target Universal Forwarder I cannot run local commands such as "splunk list monitor", "splunk list forward-server", or "splunk list deploy-poll". For whatever reason, it does not let me run these commands while the forwarder is on. If I turn the forwarder off (splunk stop), I can run all of these commands without hanging.

As expected, if I run "ps -ef | grep splunk" I find splunk processes. Oftentimes I'll see a splunk start process or a splunk restart process among them. I believe the command is hanging on that process, but I'm not sure. When I kill those processes, Splunk stops, and I can then run the commands (with it now off) and get the info I need.

I have installed the Splunk forwarder many times and have never encountered this.

Thoughts?

0 Karma

esix_splunk
Splunk Employee

So do you have both the Universal Forwarder and a Heavy Forwarder/full install on the same machine?

0 Karma

P47R14RCH
Observer

No. It is only the SF version 8.0.4. I had the same issue with 8.0.3 and thought it may be a Splunk update issue, so I switched to 8.0.4.

I'm still experiencing the same issue in my environment. I cannot run basic forwarder commands (splunk list monitor) unless I turn off the splunk forwarder.

0 Karma

P47R14RCH
Observer

1.  Which OS version are you on? 

I am running on two OSes, RHEL 7.7 and Solaris 11, and both are experiencing the same issue.

2.  Do you get a prompt for a username and password when you execute the command?  If not, I assume you are prompted when the service is stopped.

I do not receive any prompt for username or password. It simply hangs and doesn't proceed. 

Yes, I can stop the service, check its status, and start it. However, those are the only commands that work while it is running.

0 Karma

jcrabb_splunk
Splunk Employee

To clarify my question: when you stop Splunk and run a command such as "./splunk list monitor", are you prompted for a username and then a password? Is it only when Splunk is running that you are not prompted? You should be prompted to authenticate to run that command, regardless of whether Splunk is running. To me it appears that the command is not necessarily "hanging" but rather is not displaying the prompt for username and password. I did find an Answers post describing the same behavior, but it has no solution yet.

https://community.splunk.com/t5/Deployment-Architecture/Splunk-forwarder-on-Linux-splunk-quot-comman...

If you add auth to the command does it work?

./splunk list monitor -auth 'admin:password'

If that works, then it likely confirms that you are not getting prompted for credentials.  I can dig into that further and see what I can find.  Thanks!
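
One way to tell a true hang apart from a prompt that never appears is to put a time limit on the command. The sketch below is only an illustration: `probe_cli` is a hypothetical helper name, it assumes bash and the GNU coreutils `timeout` utility (which may not be installed under that name on Solaris 11), and the path and credentials in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# probe_cli SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under a time limit and reports
# whether it finished on its own or had to be stopped.
probe_cli() {
  local limit="$1"
  shift
  local status=0
  # GNU timeout exits with status 124 when the time limit is reached.
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}s"
  else
    echo "exited with status ${status}"
  fi
}

# Example (placeholder path and credentials):
#   probe_cli 15 /opt/splunkforwarder/bin/splunk list monitor -auth 'admin:password'
```

If the command returns within the limit when `-auth` is supplied but times out without it, that would point toward the missing prompt rather than a stuck process.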

Jacob
Sr. Technical Support Engineer
0 Karma

P47R14RCH
Observer

Yes, it is only when it is running that it will not display the results of the command or prompt me for username and password. If I have already authenticated AND if the SF is off, I can then use those commands or add the auth feature.

I'm wondering if it's how I have been deploying the forwarder, or if there is a hanging process. When I run "ps -ef | grep splunk", I occasionally see both a splunk restart and a splunk start among the processes. It doesn't make sense to me to have a splunk restart process AND a splunk start process while the SF is up and running, but the moment I kill the "restart" process, it kills the service.

Perhaps something is holding on to a process in systemd even after I stop and uninstall the forwarder. I'm not sure, though, and am at a loss.
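
The process check described above can be narrowed to the entries that carry a start or restart verb, along with how long each has been running. This is a minimal sketch, assuming bash, awk, and a `ps` that accepts `-eo pid,etime,args`; `filter_start_restart` is a hypothetical name. It also matches `splunkd`, on the assumption that the daemon's own command line may include the verb it was launched with, which would make such an entry expected rather than stuck.

```shell
#!/usr/bin/env bash
# filter_start_restart
# Hypothetical helper: reads "pid etime args..." lines on stdin and prints
# those where a splunk or splunkd executable is followed, anywhere later
# on the line, by the argument "start" or "restart".
filter_start_restart() {
  awk '{
    seen = 0
    for (i = 1; i <= NF; i++) {
      if ($i ~ /(^|\/)splunkd?$/) {
        seen = 1
      } else if (seen && ($i == "start" || $i == "restart")) {
        print
        break
      }
    }
  }'
}

# Example:
#   ps -eo pid,etime,args | filter_start_restart
```

An entry whose elapsed time matches the forwarder's uptime is more likely the running service than a leftover command.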

0 Karma

esix_splunk
Splunk Employee

Ok, so first let's get some terminology updated here to align with the industry standard. There is no SF. There is a Splunk Universal Forwarder (UF) and a Splunk Heavy Forwarder (HF). The difference is that a UF is a minimal agent without a GUI, and an HF is a full instance of Splunk that can act as a forwarder, or it can act as any role in the Splunk environment { Search Head (SH), Deployer, Deployment Server (DS), Indexer (IDX), Monitoring Console (MC), License Master (LM) etc.}

Circling back. Can you confirm there are not multiple versions of the UF or HF installed? The behavior you're describing seems very similar to what happens when you have multiple instances installed and in the executable environment path. Please confirm that this is not the case. A simple 'which splunk' may show an executable outside of the folder you're currently executing from.
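
Since 'which splunk' normally reports only the first match, a second install further down the path can go unnoticed. The following is a rough sketch, assuming bash; `list_on_path` is a hypothetical helper that walks every directory in PATH (or in a path string passed as the second argument) and prints each executable with the given name.

```shell
#!/usr/bin/env bash
# list_on_path NAME [PATH_STRING]
# Hypothetical helper: prints every executable file called NAME found in
# the colon-separated PATH_STRING (defaults to $PATH), in search order.
list_on_path() {
  local name="$1"
  local path_string="${2:-$PATH}"
  local dir
  local IFS=':'
  for dir in $path_string; do
    # An empty PATH entry means the current directory.
    [ -n "$dir" ] || dir="."
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
    fi
  done
}

# Example:
#   list_on_path splunk
```

More than one line of output would suggest that the `splunk` found first on the path may not belong to the instance being troubleshot.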

To cleanly check this, you can do a 'sudo killall splunkd && sudo killall mongod'. This will hard kill all Splunk processes on the system. And we can start this from scratch.

Your Splunk UF should be installed in /opt/splunkforwarder, or if you installed an HF it will be in /opt/splunk.

If you installed this as a systemd service, you need to start the process with 'systemctl start Splunkd.service'. If you are running ./splunk start as a user instead, there are issues around that. So please confirm which one you are doing.

Also, please read through our docs on systemd. There are some nuances to how systemd works vs initd. See docs here : https://docs.splunk.com/Documentation/Splunk/8.0.4/Admin/RunSplunkassystemdservice#What_is_systemd.3...

0 Karma

jcrabb_splunk
Splunk Employee

A few quick questions:

1.  Which OS version are you on? 

2.  Do you get a prompt for a username and password when you execute the command?  If not, I assume you are prompted when the service is stopped.

I tested this with Splunk UF 8.0.4 on CentOS 6.4 and 8.2 (as that is what I have loaded in my lab VMs) and I did not experience the issue. I can test on a relevant version of CentOS if it differs from what I have listed above. Otherwise, review splunkd.log, splunkd_access.log and audit.log from the times when you attempted the command. Example entries from a successful run:

audit.log:

06-22-2020 11:19:00.117 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.117, user=admin, action=login attempt, info=succeeded reason=user-initiated useragent="SplunkCli/6.0 (build 767223ac207f)" clientip=127.0.0.1 session=d082061bc7e043d8d6cb176472acdc85][n/a]
06-22-2020 11:19:00.118 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.118, user=admin, action=list_inputs, info=granted ][n/a]
06-22-2020 11:19:00.121 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.121, user=admin, action=list_inputs, info=granted ][n/a]
06-22-2020 11:19:00.123 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.123, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/splunk" operation=members][n/a]
06-22-2020 11:19:00.126 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.126, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/splunk/license_usage_summary.log" operation=members][n/a]
06-22-2020 11:19:00.127 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.127, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/splunk/metrics.log" operation=members][n/a]
06-22-2020 11:19:00.128 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.128, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/splunk/splunk_instrumentation_cloud.log*" operation=members][n/a]
06-22-2020 11:19:00.129 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.129, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/splunk/splunkd.log" operation=members][n/a]
06-22-2020 11:19:00.129 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.129, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/log/watchdog/watchdog.log*" operation=members][n/a]
06-22-2020 11:19:00.130 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.130, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/run/splunk/search_telemetry/*search_telemetry.json" operation=members][n/a]
06-22-2020 11:19:00.131 -0500 INFO  AuditLogger - Audit:[timestamp=06-22-2020 11:19:00.131, user=admin, action=list_inputs, info=granted object="$SPLUNK_HOME/var/spool/splunk/...stash_new" operation=members][n/a]

splunkd_access.log:

127.0.0.1 - admin [22/Jun/2020:11:19:00.113 -0500] "POST /services/auth/login HTTP/1.1" 200 213 - - - 4ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.118 -0500] "GET /services/data/inputs/monitor/ HTTP/1.1" 200 28945 - - - 2ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.121 -0500] "GET /services/data/inputs/monitor/?count=-1 HTTP/1.1" 200 28963 - - - 1ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.123 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fsplunk/members?count=-1 HTTP/1.1" 200 42072 - - - 2ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.126 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fsplunk%252Flicense_usage_summary.log/members?count=-1 HTTP/1.1" 200 4265 - - - 1ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.127 -0500] "GET /servicesNS/nobody/SplunkUniversalForwarder/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fsplunk%252Fmetrics.log/members?count=-1 HTTP/1.1" 200 4347 - - - 0ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.128 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fsplunk%252Fsplunk_instrumentation_cloud.log%2A/members?count=-1 HTTP/1.1" 200 4314 - - - 1ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.128 -0500] "GET /servicesNS/nobody/SplunkUniversalForwarder/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fsplunk%252Fsplunkd.log/members?count=-1 HTTP/1.1" 200 4347 - - - 1ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.129 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Flog%252Fwatchdog%252Fwatchdog.log%2A/members?count=-1 HTTP/1.1" 200 4188 - - - 0ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.130 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Frun%252Fsplunk%252Fsearch_telemetry%252F%2Asearch_telemetry.json/members?count=-1 HTTP/1.1" 200 1967 - - - 0ms
127.0.0.1 - admin [22/Jun/2020:11:19:00.131 -0500] "GET /servicesNS/nobody/system/data/inputs/monitor%24SPLUNK_HOME%252Fvar%252Fspool%252Fsplunk%252F...stash_new/members?count=-1 HTTP/1.1" 200 1967 - - - 0ms
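
To check whether the CLI ever reached the login step during a failed attempt, the access log can be filtered for the POST to /services/auth/login seen in the first entry above. This is only a sketch, assuming bash and awk, and assuming the log lines keep the whitespace-separated layout shown here; `login_attempts` is a hypothetical helper name, and the path in the usage comment assumes the default UF install location.

```shell
#!/usr/bin/env bash
# login_attempts [FILE...]
# Hypothetical helper: reads splunkd_access.log lines (from FILEs or stdin)
# and prints "timestamp user status" for each POST to /services/auth/login.
login_attempts() {
  awk '$6 == "\"POST" && $7 == "/services/auth/login" {
    ts = $4
    sub(/^\[/, "", ts)   # drop the leading "[" from the timestamp
    print ts, $3, $9
  }' "$@"
}

# Example (assumed default install path):
#   login_attempts /opt/splunkforwarder/var/log/splunk/splunkd_access.log
```

If no login line appears at the time the command was run, the request may never have reached splunkd, which would be worth comparing against splunkd.log for the same timestamp.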

You could also run strace against the PID while the command is executing.

Jacob
Sr. Technical Support Engineer
0 Karma

masonmorales
Influencer

Are you seeing any errors in splunkd.log?

0 Karma