Splunk Search

Why does a simple Splunk search such as index=abc take a long time to complete?

sushmitha_mj
Communicator

Hi,

I am working in a distributed Splunk environment. I have created an app and a separate indexer to load data for this app. The data shows up in the data summary, but when I go to search and run, for example, "index=abc", it takes 20 minutes to load completely. If I add more complexity to my search, it takes even longer.
I do have huge volumes of data (millions of records). Is there a way to optimize this?

1 Solution

sideview
SplunkTrust

index=abc feels like a simple query, but it's actually quite an expensive search for Splunk to run. It's definitely not a good candidate if you're looking for a "speed of light" test. A better speed-of-light test would be index=abc | stats count. Paradoxically, the added search syntax often allows Splunk to do less work.

In index=abc | stats count, Splunk notes that you don't actually need any fields extracted, so it runs no lookups, returns no raw event text, and so on. It can pare this search down to a bare minimum and do the work on the indexer.

In index=abc, you're telling Splunk you want to actually see the raw events, so it will run all possible field extractions, calculate the timeline, build the summaries of all the fields and their top values, and the search head will have to pull all the raw event text and fields from the indexer to assemble the results locally.
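
To make the difference concrete, here is a toy Python model of the two searches. It is only a sketch of the data-movement idea described above, not how Splunk is actually implemented; the `Indexer` class, both function names, and the sample events are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Indexer:
    # Raw event text held by this hypothetical indexer.
    events: list[str]


def raw_search(indexers):
    """Toy model of `index=abc`: every raw event is shipped to the search head."""
    shipped = [event for ix in indexers for event in ix.events]
    bytes_moved = sum(len(event.encode("utf-8")) for event in shipped)
    return len(shipped), bytes_moved


def stats_count(indexers):
    """Toy model of `index=abc | stats count`: each indexer returns one partial count."""
    partials = [len(ix.events) for ix in indexers]
    # The search head only has to add up one small number per indexer.
    return sum(partials), len(partials)


# Three made-up indexers with 1,000 made-up events each.
indexers = [
    Indexer([f"event {i} on indexer {n}" for i in range(1000)])
    for n in range(3)
]

total_raw, bytes_moved = raw_search(indexers)
total_count, values_moved = stats_count(indexers)

print(f"raw search:  {total_raw} events, {bytes_moved} bytes sent to the search head")
print(f"stats count: {total_count} events counted, {values_moved} values sent to the search head")
```

Both paths see the same number of events, but the second one sends only one integer per indexer back to the search head, which is the sense in which the longer search does less work.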

sushmitha_mj
Communicator

@sideview
You are right... This is definitely better, but I still feel that this is taking long. I have 42 million records.
Splunk is taking almost 3-4 minutes to return the count for the query index=abc | stats count
Is this normal?
If I make a dashboard, will the speed improve?


somesoni2
Revered Legend

Are you running the search in Fast Mode? (Below the time range picker, there is a dropdown to select the search mode.)

sideview
SplunkTrust

That's a pretty good speed (~175,000 events scanned per second), indicating either that you have a lot of indexers, or you have one indexer with a very nice IO subsystem and SSD(s).
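
For reference, the arithmetic behind that figure can be checked in a few lines of Python. This is just a sketch using the numbers quoted in this thread (42 million events, 3-4 minutes); the function name is made up.

```python
def events_per_second(total_events: int, seconds: float) -> float:
    """Average scan rate for a search that touched total_events in the given time."""
    return total_events / seconds


TOTAL_EVENTS = 42_000_000  # record count reported in the question

for minutes in (3, 4):
    rate = events_per_second(TOTAL_EVENTS, minutes * 60)
    print(f"{minutes} min: {rate:,.0f} events/s")
```

The 4-minute end of the range gives the ~175,000 events per second mentioned above; at 3 minutes the rate would be higher still.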

Let's take another tour though. You can schedule your search, and then the most recent results will load instantly on dashboards. Or you can "accelerate" it. Or, because you don't actually need any fields or the raw text, you can do weird advanced things: | metasearch index=foo | stats count will be VASTLY faster, and | tstats count where index=foo is another advanced thing you may never have heard of that may take just a few seconds to return.

The weird thing about index=abc | stats count is you're using Splunk and you have the power to do any analytics on the fly and transform and mash up the data any way you want, but you're not doing any of it. Sort of like using an ICBM to transport your cat a few doors down. Seems slow but you're not appreciating the fact that your cat is briefly in space!

But forget that advanced stuff, fast though it is. I advise really slowing down and going back to the tutorial, maybe from here - http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchTutorial/Aboutthesearchapp - or further ahead to read about scheduling searches, accelerating searches, using summary indexing, etc.

sushmitha_mj
Communicator

@somesoni2

I was not, but now I am... It is slightly better 🙂
Thanks
