Monitoring Splunk

Best way to diagnose a Splunk performance problem? Server is timing out.

echojacques
Builder

I have Splunk + Enterprise Security running on a Linux server with dual quad-core processors (Intel Xeon, 2.4 GHz) and 16 GB RAM. Indexing volume is < 20 GB/day. When I run a search over a time range longer than 24 hours (e.g. 7 days), Splunk frequently times out. The first sign of a problem is an error that the splunkd daemon has stopped responding, which kills the GUI/website. When this happens, I have to stop and start Splunk from the command line (since the GUI has stopped working) to get it working again.

How can I diagnose what is causing Splunk to stop responding? If I run the Linux "top" command while this issue is happening, I usually see the splunkd process consuming 100% of the CPU. However, I'm not sure what exactly within Splunk is causing the drag on resources.

Thanks

1 Solution

yannK
Splunk Employee

For splunkd/UI timeouts, install the SOS (Splunk on Splunk) app.

  • Look at the dashboard: Warning & errors > HTTP Response Times For splunkd > panel: High response times against other metrics.
    If the splunkd response time is above the "splunk web timeout threshold", you have your culprit.
    Check whether this happens at regular intervals; you may have expensive scheduled searches impacting splunkd performance.

  • Enable the SOS scripted inputs, ps_sos.sh (on Linux) or ps_sos.ps1 (on Windows), and let them run to collect data.
    Then check Resources usage > Splunk CPU/Memory Usage.
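If you want a quick look at which splunkd process is burning CPU before SOS has collected any data, something like the following rough Python sketch may help. It is not the SOS script and not an official Splunk tool; it only parses the output of `ps -eo pid,pcpu,pmem,args`. The function names (`parse_ps`, `busiest`, `main`) are made up for this example, and the idea that search processes carry a `--id=<search id>` argument on their command line is an assumption you should check against your own `ps` output.

```python
import re
import subprocess

# Assumption: search processes expose their search ID as "--id=<sid>" in the
# command line. Verify this against your own `ps` output.
SID_RE = re.compile(r"--id=(\S+)")


def parse_ps(text):
    """Parse `ps -eo pid,pcpu,pmem,args` output, keeping only splunkd rows."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        parts = line.split(None, 3)  # pid, %cpu, %mem, full command line
        if len(parts) < 4 or "splunkd" not in parts[3]:
            continue
        pid, pcpu, pmem, args = parts
        match = SID_RE.search(args)
        rows.append({
            "pid": int(pid),
            "cpu": float(pcpu),  # ps reports CPU averaged over process lifetime
            "mem": float(pmem),
            "sid": match.group(1) if match else None,
            "args": args,
        })
    return rows


def busiest(rows, n=5):
    """Return the n rows with the highest CPU percentage."""
    return sorted(rows, key=lambda r: r["cpu"], reverse=True)[:n]


def main():
    """Run ps once and print the busiest splunkd processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for row in busiest(parse_ps(out)):
        print(row["pid"], row["cpu"], row["mem"], row["sid"] or row["args"][:60])
```

Call `main()` a few times while the slowdown is happening. If the same search ID keeps showing up at the top, that search (scheduled or ad hoc) is a likely suspect. Note that the `%CPU` column from `ps` is a lifetime average, so it will not match the instantaneous figure shown by `top`.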

echojacques
Builder

OK, I fixed the SOS issue by making the Sideview Utils app visible. If it is not visible, SOS doesn't work.

http://answers.splunk.com/answers/37715/sideview-utils-not-found-after-sos-20-upgrade
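For anyone hitting the same error: the post doesn't say exactly how the app was made visible. One place app visibility is controlled is the `is_visible` setting in the `[ui]` stanza of the app's `app.conf`; whether that is the setting that was changed here is an assumption. Below is a minimal Python sketch that checks that setting in the text of an `app.conf`. The helper name `ui_is_visible` is hypothetical, and the set of values treated as "true" is deliberately simplified.

```python
def ui_is_visible(conf_text):
    """Return True/False for [ui] is_visible, or None if the setting is absent."""
    stanza = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            stanza = line[1:-1].strip()  # entering a new stanza
            continue
        if stanza == "ui" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "is_visible":
                # Simplification: only "1" and "true" count as visible here.
                return value.lower() in ("1", "true")
    return None
```

To use it, read the Sideview Utils `app.conf` from your install (the exact path depends on your setup) and pass its contents to `ui_is_visible`. A result of `False` would be consistent with the symptom described above. A result of `None` only means the setting is not in that particular file; it may still be set in another copy of `app.conf` for the same app.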

Thanks


echojacques
Builder

So I installed SOS and Sideview Utils, but when I try to launch SOS I get an error that "Sideview Utils" is not installed. There were several different versions of SOS; I installed version 3.0.1 (the latest). I also restarted Splunk after verifying that both SOS and Sideview Utils were installed.
