Monitoring Splunk

Why is my search taking so long to run?

Mick
Splunk Employee

When I queried a new summary index with only 100k events (16MB on disk), the simple query below took 102 seconds over a 4-hour time range and 901 seconds over a 24-hour time range.

index=auth chart="newChart" app="newApp" | stats sum(responseTime) AS totalTime, sum(confirm) AS confirm by _time,mainHost | eval respTime=totalTime/occur | timechart avg(totalTime) by mainHost

The CPU, Memory and Disk usage were low. The server has 32 CPUs and 32GB memory. In addition, I disabled all unnecessary default Splunk apps.


Stephen_Sorkin
Splunk Employee

I would start diagnosing a problem like this by breaking down the search into its components.

Here I would first run:

index=auth chart="newChart" app="newApp"

and inspect the number of events, the scan count, and the time taken.

Next I would add the stats and see how long that takes. Finally, add the timechart.
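As a sketch, the incremental breakdown of the asker's search would be the following three runs, timing each one separately (for instance via the Job Inspector):

index=auth chart="newChart" app="newApp"

index=auth chart="newChart" app="newApp" | stats sum(responseTime) AS totalTime, sum(confirm) AS confirm by _time,mainHost

index=auth chart="newChart" app="newApp" | stats sum(responseTime) AS totalTime, sum(confirm) AS confirm by _time,mainHost | timechart avg(totalTime) by mainHost

Comparing the run time of each stage should reveal which command accounts for most of the 102 seconds.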

If the bottleneck is the initial search, we would inspect whether our search terms did a good job of using the index.


Simeon
Splunk Employee

If you are running the search from the CLI, then you should examine the problem exactly as Stephen has stated. If you are running from the UI, there are additional considerations, such as field extractions, which can increase the time taken to render results. One 'trick' is to modify your search so that only the relevant fields are returned. To do this, you can add "| fields " before the stats command. Note that when viewing a search via advanced charting, searches are optimized to only return the relevant fields. When viewing search results in the flashtimeline view, there is a "preview" option which can sometimes make the search feel slow.
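Applied to the asker's search, that trick might look like the following (a sketch; it assumes the only fields the later commands need are responseTime, confirm, and mainHost):

index=auth chart="newChart" app="newApp" | fields responseTime, confirm, mainHost | stats sum(responseTime) AS totalTime, sum(confirm) AS confirm by _time,mainHost | timechart avg(totalTime) by mainHost

This keeps Splunk from extracting and returning fields that the stats and timechart commands never touch.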


Lowell
Super Champion

If you have access to your Splunk server, you may find reviewing the search.log for your search job helpful. You can find it under $SPLUNK_HOME/var/run/splunk/dispatch/<job_id>/search.log. (You may want to "save" your search so the job doesn't expire while you're looking at the log file.) Look for anything unusual: error messages, which indexes and buckets were accessed, repeating messages, etc.
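For example, from a shell on the search head (the job id shown here is hypothetical; substitute the id of your own search job):

grep -iE "error|warn" "$SPLUNK_HOME/var/run/splunk/dispatch/1300000000.123/search.log"

Skimming the timestamps in the log can also show which phase of the search the time is being spent in.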
