Getting Data In

What's the best way to monitor Splunk itself?

Builder

I would like to start a discussion about how the community monitors their Splunk deployments. What are some of the methods you use?

How would you manage hundreds, if not thousands, of Splunk instances across multiple data centers, all of which can be clustered into groups/deployments?


Re: What's the best way to monitor Splunk itself?

Esteemed Legend

Splunk has finally answered this question definitively with v6.2's new Distributed Management Console (DMC):

http://docs.splunk.com/Documentation/Splunk/6.2.3/Admin/ConfiguretheMonitoringConsole


Re: What's the best way to monitor Splunk itself?

Builder

What if the instance or host that is running the DMC goes down? If I am correct, the DMC can only monitor a single SHC, so we would need multiple DMC setups for multiple SHCs.

Ideally I want a tool that can monitor 10+ different SHCs, with nodes numbering in the hundreds, and that is not even counting all the forwarders 😞

Really, I just want the basic check of: is splunkd running? If not, please let me know NOW. I am looking for a check that generates the smallest possible footprint.


Re: What's the best way to monitor Splunk itself?

Explorer

Or you could use the Splunk on Splunk (SoS) app. I would also advise installing the corresponding add-on (either Windows or Linux).

https://splunkbase.splunk.com/app/748/


Re: What's the best way to monitor Splunk itself?

Builder

Currently I have a script that just hits splunkd via the REST API and checks whether it gets a response; if not, the process (or something else) is down. Would there be a better way to run this check across thousands of hosts?
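
For context, a minimal sketch of that kind of check in Python might look like the one below. The hostnames are placeholders, it assumes the default management port 8089 with a self-signed certificate, and it treats any HTTP response at all (even a 401) as proof that splunkd answered:

# Minimal sketch of a splunkd liveness check over the REST API.
# Assumptions: default management port 8089, self-signed TLS cert
# (hence verify=False), placeholder hostnames.
import requests
import urllib3

urllib3.disable_warnings()  # silence the verify=False warning for self-signed certs

def splunkd_is_up(host, port=8089, timeout=5):
    url = f"https://{host}:{port}/services/server/info"
    try:
        # Any HTTP response (even 401 Unauthorized) means splunkd answered.
        requests.get(url, verify=False, timeout=timeout)
        return True
    except requests.exceptions.RequestException:
        return False

if __name__ == "__main__":
    for host in ["splunk-idx01.example.com", "splunk-sh01.example.com"]:
        print(host, "UP" if splunkd_is_up(host) else "DOWN")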


Re: What's the best way to monitor Splunk itself?

Explorer

The SoS app works well.

You can also create a search that checks splunkd.log for stopped, started, etc. See below:

index=_internal source=*splunkd.log host=* component=IndexProcessor ("shutting down: end" OR "Initializing: readonly")   | eval restart_status=if(message="shutting down: end","Stopping","Starting")  

Re: What's the best way to monitor Splunk itself?

SplunkTrust

Hi ben_leung,

Like any other IT system/server, Splunk also needs basic monitoring from the outside.
A good start is certainly to monitor the main Splunk process, splunkd, and the Splunk helper processes; this can be done with a basic script that calls the $SPLUNK_HOME/bin/splunk status command.
You can also check whether the ports are up and running; a simple telnet to the Splunk ports will do the trick.
But also keep in mind that there can be much more involved, such as SAN, NFS, the network, and so on.
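
For illustration, a rough Python sketch of those two checks (the status call and the port probe) might look like this, assuming the common defaults of $SPLUNK_HOME=/opt/splunk and management port 8089:

# Rough sketch of the two checks described above: ask the splunk binary
# for its own status, and probe the management port like a quick telnet.
# /opt/splunk and port 8089 are the usual defaults; adjust as needed.
import socket
import subprocess

SPLUNK_HOME = "/opt/splunk"

def splunk_status_ok():
    # "splunk status" exits 0 when splunkd is running.
    result = subprocess.run([f"{SPLUNK_HOME}/bin/splunk", "status"],
                            capture_output=True, text=True)
    return result.returncode == 0

def port_open(host="localhost", port=8089, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print("splunk status:", "OK" if splunk_status_ok() else "NOT RUNNING")
    print("port 8089:", "OPEN" if port_open() else "CLOSED")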

hope this helps ...

cheers, MuS


Re: What's the best way to monitor Splunk itself?

Builder

Does it make sense to install cron jobs on 3,000+ machines to watch Splunk? Automation tools like Ansible would be a great way to hit remote hosts at massive scale. How to actually check the Splunk process is what I hope others can share. So many tools to try out and test; looking for the perfect solution, hehe.
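
One alternative to per-host cron jobs, sketched below on the assumption that a plain TCP probe of the management port is an acceptable check, is to sweep the whole host list concurrently from a single monitoring box (hostnames and worker count are placeholders):

# Hypothetical sketch: probe many hosts from one monitoring box instead
# of installing cron jobs everywhere. The per-host footprint is a single
# TCP connection to the management port; the host list is a placeholder.
import socket
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["splunk%03d.example.com" % i for i in range(1, 301)]

def probe(host, port=8089, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return host, True
    except OSError:
        return host, False

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=50) as pool:
        for host, up in pool.map(probe, HOSTS):
            if not up:
                print(f"ALERT: {host} not answering on 8089")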


Re: What's the best way to monitor Splunk itself?

SplunkTrust

Sorry to say, but nobody but you will be able to provide the perfect solution for your setup 😉


Re: What's the best way to monitor Splunk itself?

Motivator

I use a product/service called Omnicenter, from Netreo. It monitors the health of ports 8000 and 8089 and sends an email alert to my group if anything is amiss there. We use this for monitoring all of our critical systems. I also wrote a small script that tests whether my RAID array is writable and puts a zero or a one into a node in the SNMP tree, which Omnicenter polls regularly (we once had a situation where the RAID got into a weird state where it was readable but not writable).
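
For what it's worth, the writability test itself can be as small as creating and deleting a scratch file on the array. The sketch below is only a guess at that part: the mount point is a placeholder, and publishing the 0/1 into the SNMP tree is left to the local SNMP agent (for example a net-snmp extend):

# Guessed-at sketch of a RAID writability probe: try to create and delete
# a scratch file on the array, then emit 1 (writable) or 0 (not writable).
# /srv/raid is a placeholder mount point; exposing the value over SNMP is
# handled separately by the local SNMP agent.
import os
import tempfile

def raid_writable(mount_point="/srv/raid"):
    try:
        fd, path = tempfile.mkstemp(dir=mount_point)
        os.close(fd)
        os.remove(path)
        return 1
    except OSError:
        return 0

if __name__ == "__main__":
    print(raid_writable())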
