<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: How do I troubleshoot why indexing performance is slow? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276172#M52962</link>
    <description>&lt;P&gt;There is a way to use more cores by adding parallel indexing pipelines, but if your queues are empty, that won't make any difference.  I typically see indexers saturate about 5 cores when fully loaded on indexing (processing about 20 MB/s).  If the problem were with your indexer, you'd see one of those queues acting as a bottleneck for the rest of the pipeline.  I would suspect something going wrong at the input layer.  Check the queues and look for ERROR/WARN messages on your forwarders.  (The queues you care about there are &lt;CODE&gt;parsingqueue&lt;/CODE&gt; and &lt;CODE&gt;tcpout_*&lt;/CODE&gt;.)&lt;/P&gt;</description>
    <pubDate>Tue, 20 Oct 2015 14:50:05 GMT</pubDate>
    <dc:creator>emiller42</dc:creator>
    <dc:date>2015-10-20T14:50:05Z</dc:date>
    <item>
      <title>How do I troubleshoot why indexing performance is slow?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276169#M52959</link>
      <description>&lt;P&gt;Operating System:  Oracle Linux (kernel 3.8.13-55.1.5), 64-bit&lt;BR /&gt;
2 CPUs (40 cores total), 128 GB memory, 1 Gb network, 6 x 300 GB 15K SAS drives&lt;/P&gt;

&lt;P&gt;The only thing running on this system is Splunk Enterprise.  Indexing performance is 250 KB/s, approximately 20 GB per day.  According to the capacity planning documents, this system should easily handle 250 GB per day.  The files are JSON files.&lt;/P&gt;

&lt;P&gt;Memory usage and CPU usage are both way below capacity.&lt;/P&gt;

&lt;P&gt;What could be slowing down the indexing?&lt;/P&gt;</description>
      <pubDate>Mon, 19 Oct 2015 15:20:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276169#M52959</guid>
      <dc:creator>cdevoe57</dc:creator>
      <dc:date>2015-10-19T15:20:27Z</dc:date>
    </item>
    <item>
      <title>Re: How do I troubleshoot why indexing performance is slow?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276170#M52960</link>
      <description>&lt;P&gt;The first thing I would look at is the state of the indexing queues:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=_internal host=YOUR_INDEXER sourcetype=splunkd component=Metrics  group=queue  (name=aggqueue OR name=splunktcpin OR name=parsingqueue OR name=typingqueue OR name=indexqueue) 
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size) 
| eval curr=if(isnotnull(current_size_kb),current_size_kb,current_size) 
| eval fill_perc=round((curr/max)*100,2) 
| timechart p90(fill_perc) by name
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Swap in different statistics (median, max, p90) to see where fill levels sit for each queue.  If any of them are consistently high, you've got a bottleneck in the indexing pipeline.  For details of what each part of the pipeline does, check &lt;A href="https://wiki.splunk.com/Community:HowIndexingWorks"&gt;the community wiki&lt;/A&gt;; that can help you dig further into the root cause.&lt;/P&gt;
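
&lt;P&gt;For example, to see the worst case rather than the 90th percentile, only the final command of the search above needs to change:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| timechart max(fill_perc) by name
&lt;/CODE&gt;&lt;/PRE&gt;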

&lt;P&gt;Splunk also reports if a queue is blocked in those events (&lt;CODE&gt;blocked=true&lt;/CODE&gt;) so you can just search for that to see if you have any.  &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=_internal host=YOUR_INDEXER sourcetype=splunkd component=Metrics  blocked=true
&lt;/CODE&gt;&lt;/PRE&gt;
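
&lt;P&gt;For example, if your forwarders send their own &lt;CODE&gt;_internal&lt;/CODE&gt; data, the same style of queue check works against them (&lt;CODE&gt;YOUR_FORWARDER&lt;/CODE&gt; is a placeholder for the forwarder's host name):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=_internal host=YOUR_FORWARDER sourcetype=splunkd component=Metrics group=queue (name=parsingqueue OR name=tcpout_*)
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size)
| eval curr=if(isnotnull(current_size_kb),current_size_kb,current_size)
| eval fill_perc=round((curr/max)*100,2)
| timechart p90(fill_perc) by name
&lt;/CODE&gt;&lt;/PRE&gt;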

&lt;P&gt;Since you're talking about JSON files, I'd wonder how big they are, and whether the bottleneck is actually on the forwarders.  Parsing configuration can make a BIG difference in indexing performance, especially for structured data.  If props.conf on your forwarder has &lt;CODE&gt;INDEXED_EXTRACTIONS=JSON&lt;/CODE&gt; set, then the majority of the legwork to index that data actually happens on the forwarder, not the indexer, meaning the forwarder could be the bottleneck.  (If you're forwarding _internal data from your forwarders, you can check their queues using a search similar to the one above.)&lt;/P&gt;</description>
      <pubDate>Mon, 19 Oct 2015 16:55:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276170#M52960</guid>
      <dc:creator>emiller42</dc:creator>
      <dc:date>2015-10-19T16:55:46Z</dc:date>
    </item>
    <item>
      <title>Re: How do I troubleshoot why indexing performance is slow?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276171#M52961</link>
      <description>&lt;P&gt;When I run that query, I get all 0.0.  There are no blocked events.&lt;/P&gt;

&lt;P&gt;Is there a way to force Splunk to use more cores?  &lt;/P&gt;

&lt;P&gt;I just believe there is a configuration setting somewhere slowing things down.  &lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2015 12:28:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276171#M52961</guid>
      <dc:creator>cdevoe57</dc:creator>
      <dc:date>2015-10-20T12:28:51Z</dc:date>
    </item>
    <item>
      <title>Re: How do I troubleshoot why indexing performance is slow?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276172#M52962</link>
      <description>&lt;P&gt;There is a way to use more cores by adding parallel indexing pipelines, but if your queues are empty, that won't make any difference.  I typically see indexers saturate about 5 cores when fully loaded on indexing (processing about 20 MB/s).  If the problem were with your indexer, you'd see one of those queues acting as a bottleneck for the rest of the pipeline.  I would suspect something going wrong at the input layer.  Check the queues and look for ERROR/WARN messages on your forwarders.  (The queues you care about there are &lt;CODE&gt;parsingqueue&lt;/CODE&gt; and &lt;CODE&gt;tcpout_*&lt;/CODE&gt;.)&lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2015 14:50:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276172#M52962</guid>
      <dc:creator>emiller42</dc:creator>
      <dc:date>2015-10-20T14:50:05Z</dc:date>
    </item>
    <item>
      <title>Re: How do I troubleshoot why indexing performance is slow?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276173#M52963</link>
      <description>&lt;P&gt;We did an interesting thing: we moved the JSON files to a directory that is directly accessible by the Splunk instance doing the indexing, and rates went to over 20 MB/s.  This tells me it is an issue with the universal forwarder.  We are running these inputs in batch mode to index the files and then delete them.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2015 14:54:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-I-troubleshoot-why-indexing-performance-is-slow/m-p/276173#M52963</guid>
      <dc:creator>cdevoe57</dc:creator>
      <dc:date>2015-10-20T14:54:36Z</dc:date>
    </item>
  </channel>
</rss>

