How do I troubleshoot why indexing performance is slow?

cdevoe57 · ‎10-19-2015

Operating System: Oracle Linux 3.8.13-55.1.5, 64 bit
2 CPUs (40 cores total), 128 GB memory, 1 GB network, six 300 GB 15K SAS drives

The only thing running on this system is Splunk Enterprise. Indexing performance is about 250 KB/s, or approximately 20 GB per day. According to the capacity planning documents, this system should easily handle 250 GB per day. The input files are JSON.

Memory Usage and CPU usage are WAY LOW.

What could be slowing down the indexing?

cdevoe57 · ‎10-20-2015

We tried something interesting. We moved the JSON files to a directory that is directly accessible by the Splunk instance doing the indexing. Rates went to over 20KB/s. This tells me it is an issue with the universal forwarder. We are running these inputs in batch mode, so the files are indexed and then deleted.

emiller42 · ‎10-19-2015

First thing I would look at is the state of the indexing queues.

index=_internal host=YOUR_INDEXER sourcetype=splunkd component=Metrics  group=queue  (name=aggqueue OR name=splunktcpin OR name=parsingqueue OR name=typingqueue OR name=indexqueue) 
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size) 
| eval curr=if(isnotnull(current_size_kb),current_size_kb,current_size) 
| eval fill_perc=round((curr/max)*100,2) 
| timechart p90(fill_perc) by name

Swap the aggregation function (median, max, p90) to see where the fill percentage falls for each queue. If any of them are consistently high, you've got a bottleneck in the indexing pipeline. For details of what each pipeline does, check the community wiki. That can help you dig further into the root cause.

Splunk also reports whether a queue is blocked in those events (blocked=true), so you can search for that directly to see if you have any.

index=_internal host=YOUR_INDEXER sourcetype=splunkd component=Metrics  blocked=true
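If that search does return events, one way to see which queues are blocking and when is to chart the blocked events by queue name. This is only a sketch: it reuses the YOUR_INDEXER placeholder from above, and the added group=queue filter is an assumption to keep the results limited to queue metrics.

```
index=_internal host=YOUR_INDEXER sourcetype=splunkd component=Metrics group=queue blocked=true
| timechart count by name
```

An empty result over a busy time range suggests the indexer queues are not the constraint.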

Since you're talking about JSON files, I'd wonder how big they are, and whether the bottleneck is actually on the forwarders. Parsing configs can make a BIG difference in indexing performance, especially for structured data. If props.conf on your forwarder has INDEXED_EXTRACTIONS=JSON set, then most of the legwork to index that data is actually happening on the forwarder, not the indexer, meaning the forwarder could be the bottleneck. (If you're forwarding _internal data from your forwarders, you can check their queues using a search similar to the one above.)
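As a rough sketch of that forwarder-side check, the queue search can be pointed at a forwarder instead of the indexer. YOUR_FORWARDER is a hypothetical placeholder for the forwarder's host name, and this assumes the forwarder is sending its _internal data to the indexers. The name filter is dropped here so that every queue the forwarder reports shows up in the chart.

```
index=_internal host=YOUR_FORWARDER sourcetype=splunkd component=Metrics group=queue
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size)
| eval curr=if(isnotnull(current_size_kb),current_size_kb,current_size)
| eval fill_perc=round((curr/max)*100,2)
| timechart p90(fill_perc) by name
```

A queue that sits near 100 on the forwarder while the indexer queues stay near 0 would point at the forwarder as the choke point.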

cdevoe57 · ‎10-20-2015

When I run that query, every value comes back as 0.0. There are no blocked events.

Is there a way to force Splunk to use more cores?

I just believe there is a configuration setting somewhere slowing things down.

emiller42 · ‎10-20-2015

There is a way to use more cores by adding parallel indexing pipelines. But if your queues are empty, that won't make any difference. I typically see indexers saturate 5 cores when fully loaded on indexing (processing about 20 MB/sec). If the problem were with your indexer, you'd be seeing one of those queues as a bottleneck for the rest of the pipeline. I would suspect something going wrong at the input layer. Check queues and look for ERROR/WARN messages on your forwarders. (The queues you care about there are parsingqueue and tcpout_*.)
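For the ERROR/WARN part, something along these lines might work as a starting point. It is a sketch, not a definitive search: YOUR_FORWARDER is a hypothetical placeholder, and it assumes the forwarder's _internal data reaches the indexers and that the default splunkd field extractions (log_level, component) are in place.

```
index=_internal host=YOUR_FORWARDER sourcetype=splunkd (log_level=ERROR OR log_level=WARN)
| stats count by component, log_level
| sort - count
```

The components with the highest counts are usually the first place to read the raw messages.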

