One search job, one thread?

nickcode · ‎05-17-2013

My deployment is: 1 forwarder + 2 indexers + 1 search head.
The forwarder has forwarded 50 GB (about 100,000,000 events) to the two indexers.
When I launch a search like "sourcetype=xxx" from the search head, I find the search performance quite disappointing. Only 10,000 events are scanned per second! That is to say, it will take about 3 hours to finish scanning all the events!
Each of my indexers has 24 CPUs. Each time I launch a search from the search head, only one CPU on each indexer runs at about 100% while the others stay idle. It seems that one search job only works on one thread. That's quite a waste of my indexers' computing capacity!
Is there any way to configure Splunk to solve this problem?
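As a quick check of the "about 3 hours" estimate above, the arithmetic can be sketched as follows. It assumes the 10,000 events per second figure is the combined scan rate for the whole search (the post does not say whether it is per indexer); the variable names are only illustrative.

```python
# Figures taken from the post; the scan rate is assumed to be the
# combined rate across both indexers.
total_events = 100_000_000      # events forwarded to the indexers
scan_rate = 10_000              # events scanned per second

seconds = total_events / scan_rate
hours = seconds / 3600

print(f"{seconds:.0f} s = {hours:.2f} h")   # 10000 s = 2.78 h
```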

nickcode · ‎05-17-2013

Hmm. Search heads distribute search requests across multiple indexers.
You can refer to http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Scaleyourdeployment

linu1988 · ‎05-17-2013

As far as I know, the indexer only does indexing and the search head does the searching. Search performance depends on the search head's capacity. If performance is slower than you expect, please check the running jobs and scheduled searches on your search head; that should indicate the cause of the degradation. Also check the role under which you are running the searches.

martin_mueller · ‎05-17-2013

The number of events scanned per second for one search usually depends on your I/O rather than on the number or speed of the CPUs.

As for your search itself, as long as you don't define any more specific filters, Splunk has to load 50 GB off the disks and shove it towards the search head... that's a lot of work. What are you trying to find with the search?

martin_mueller · ‎05-17-2013

If you specify other keys, Splunk will use those keys to narrow down what needs to be looked at in the first place. That's what indexing data is for.

As a simple example, looking for "sourcetype=x error" will only consider events from that sourcetype that contain the word error, potentially cutting the amount of data that needs to be read from disk by a huge factor.
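The narrowing described above can be illustrated with a toy inverted index. This is only a minimal sketch of the general idea of a term index, not Splunk's actual index format or search code; the sample events and the helper names `build_index` and `search` are made up for the illustration.

```python
from collections import defaultdict

# Hypothetical sample events; real events would be raw log lines.
events = [
    "sourcetype=x error disk full",
    "sourcetype=x info login ok",
    "sourcetype=y error timeout",
    "sourcetype=x error timeout",
]

def build_index(events):
    """Map each whitespace-separated token to the ids of events containing it."""
    index = defaultdict(set)
    for event_id, text in enumerate(events):
        for token in text.lower().split():
            index[token].add(event_id)
    return index

def search(index, events, *terms):
    """Return only the events that contain every term, using the index
    to pick candidates instead of scanning every event."""
    postings = [index.get(term.lower(), set()) for term in terms]
    candidates = set.intersection(*postings) if postings else set()
    return [events[i] for i in sorted(candidates)]

index = build_index(events)
hits = search(index, events, "sourcetype=x", "error")
print(hits)
```

Here only two of the four events are ever touched for "sourcetype=x error"; a search with a single broad term such as "sourcetype=x" alone still has to return every event of that sourcetype.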

Personally, I see no point in specifying a query that loads and lists 50GB of data, but then I don't know what you're trying to achieve...

nickcode · ‎05-17-2013

And iostat shows that my I/O is not the bottleneck.

nickcode · ‎05-17-2013

If I change the search to "sourcetype=xxx key1=value1 AND key2=value2", the indexers will still scan all the events, since Splunk only indexes events by timestamp, doesn't it?
