<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: parallelIngestionPipelines for forwarder question in Monitoring Splunk</title>
    <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382894#M6331</link>
    <description>&lt;P&gt;Yes, good point, this is exactly what I think I am going through.&lt;BR /&gt;
Batch mode, which can read and then delete files, looks like a good option, so that only un-indexed files remain to be scanned, i.e. fewer files.&lt;/P&gt;

&lt;P&gt;This is a very good point, let me try this and get back to you.&lt;/P&gt;

&lt;P&gt;Meanwhile, is there any limit on how many files a forwarder can process comfortably? Is there any standard for this, considering I have 12 CPUs and 12 GB of RAM?&lt;/P&gt;</description>
    <pubDate>Fri, 15 Feb 2019 05:30:17 GMT</pubDate>
    <dc:creator>jiaqya</dc:creator>
    <dc:date>2019-02-15T05:30:17Z</dc:date>
    <item>
      <title>parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382883#M6320</link>
      <description>&lt;P&gt;We have a forwarder with 12 CPUs and 12 GB of memory.&lt;BR /&gt;
We have not yet set parallelIngestionPipelines.&lt;BR /&gt;
We have a lot of CSV files (over 40,000) to index almost daily and frequently see delays in the CSV files being indexed.&lt;/P&gt;

&lt;P&gt;Recently it has been taking close to 6-7 hours for the CSV files to be indexed.&lt;/P&gt;

&lt;P&gt;In this regard, can we increase the parallel ingestion pipelines to 2? Is this CPU/memory sufficient to handle the new setting?&lt;/P&gt;

&lt;P&gt;Seeking suggestions.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 09:14:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382883#M6320</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-14T09:14:20Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382884#M6321</link>
      <description>&lt;P&gt;How are the CSV files generated? If a single large file is what delays processing, adding a second pipeline will not help.&lt;BR /&gt;
Please use the DMC to check which process is actually delayed.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 09:32:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382884#M6321</guid>
      <dc:creator>HiroshiSatoh</dc:creator>
      <dc:date>2019-02-14T09:32:25Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382885#M6322</link>
      <description>&lt;P&gt;I am pretty new to the DMC. Could you point me to where I should start looking for the CSV indexing delay? There are too many options to look through. The CSV files aren't large in size, but there are a large number of them.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 09:38:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382885#M6322</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-14T09:38:22Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382886#M6323</link>
      <description>&lt;P&gt;Or can you direct me to a DMC tutorial or document?&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 09:42:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382886#M6323</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-14T09:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382887#M6324</link>
      <description>&lt;P&gt;The general rule of thumb is at least 1.5 cores per pipeline. So 12 cores should be more than sufficient to enable 2 pipelines. Of course this all depends on what else that machine is doing that takes up CPU cores. So have a look at current CPU and Memory consumption and see if there is sufficient capacity left (and of course keep a close eye on it after adding the extra pipeline).&lt;/P&gt;
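
&lt;P&gt;As a minimal sketch of the setting being discussed, assuming it goes into server.conf on the forwarder (the exact file location, for example $SPLUNK_HOME/etc/system/local, is an assumption about your deployment), two pipelines would look roughly like this. The forwarder needs a restart before the change takes effect.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# server.conf on the forwarder (location is an assumption, adjust to your deployment)
[general]
# Number of ingestion pipeline sets; the default is 1
parallelIngestionPipelines = 2
&lt;/CODE&gt;&lt;/PRE&gt;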

&lt;P&gt;To what extent it will resolve your problem is a valid question, but that is a separate topic altogether.&lt;/P&gt;</description>
      <pubDate>Thu, 14 Feb 2019 12:12:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382887#M6324</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2019-02-14T12:12:28Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382888#M6325</link>
      <description>&lt;P&gt;It is necessary to check the operation of the UF while it is delayed. The following site is helpful.&lt;BR /&gt;
&lt;A href="https://wiki.splunk.com/Community:Troubleshooting_Monitor_Inputs"&gt;https://wiki.splunk.com/Community:Troubleshooting_Monitor_Inputs&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;If you add a pipeline, 6 hours may become 3 hours, but that is not a solution.&lt;BR /&gt;
It is necessary to confirm at which stage the delay is occurring.&lt;BR /&gt;
&lt;A href="https://wiki.splunk.com/Community:HowIndexingWorks"&gt;https://wiki.splunk.com/Community:HowIndexingWorks&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Queue state&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host=your_hostname source="*metrics.log*" group=queue 
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size) 
| eval curr=if(isnotnull(current_size_kb),current_size_kb,current_size) 
| eval fill_perc=round((curr/max)*100,2) 
| timechart max(fill_perc) by name useother=false limit=15
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Block status&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host=your_hostname source="*metrics.log*" group="queue" blocked=true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;tailreader status&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host=your_hostname source="*metrics*" group=tailingprocessor name="tailreader*" 
| eval host=case(isnull(ingest_pipe),host,1=1,host."_".ingest_pipe) 
| timechart max(max_queue_size) max(current_queue_size) max(files_queued) sum(new_files_queued) max(fd_cache_size) 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;batchreader status&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="*metrics*" group=tailingprocessor name="batchreader*" 
| eval host=case(isnull(ingest_pipe),host,1=1,host."_".ingest_pipe) 
| timechart max(max_queue_size) max(current_queue_size) max(files_queued) sum(new_files_queued) max(fd_cache_size)
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 15 Feb 2019 00:59:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382888#M6325</guid>
      <dc:creator>HiroshiSatoh</dc:creator>
      <dc:date>2019-02-15T00:59:24Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382889#M6326</link>
      <description>&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.2.4/DMC/DMCoverview"&gt;https://docs.splunk.com/Documentation/Splunk/7.2.4/DMC/DMCoverview&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 01:17:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382889#M6326</guid>
      <dc:creator>HiroshiSatoh</dc:creator>
      <dc:date>2019-02-15T01:17:19Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382890#M6327</link>
      <description>&lt;P&gt;I would try to move up to 6 pipelines, but BY FAR the most important thing is that you are doing fast and efficient deletion of the files.  It is probably the case that your files are atomic and are being dropped onto the filesystem in a complete state that never changes, right?  If so, be sure that you use &lt;CODE&gt;[batch]&lt;/CODE&gt; with &lt;CODE&gt;move_policy=sinkhole&lt;/CODE&gt; so that Splunk itself deletes them as it eats them.  If you have thousands of files that Splunk has to sort through, it will NEVER be able to keep up.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 02:01:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382890#M6327</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-02-15T02:01:29Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382891#M6328</link>
      <description>&lt;P&gt;Hi Frank, Thank you for your input, that was good info ... I still have to do some more reading before implementing this..&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 05:22:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382891#M6328</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-15T05:22:43Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382892#M6329</link>
      <description>&lt;P&gt;HiroshiSatoh, here is the behavior I see when I add new files for a data input, usually CSV files.&lt;BR /&gt;
They are not detected under "number of files". The column stays blank for a very long time, 6 hours or more. Once the number of files shows up there, the indexing is immediate and I see the data.&lt;/P&gt;

&lt;P&gt;The delay is in the forwarder detecting the files. So my guess is that there are too many files for Splunk to scan, and this is what delays detection.&lt;/P&gt;

&lt;P&gt;As mentioned, once the files are detected and I see numbers under "number of files", the indexing happens quickly.&lt;/P&gt;

&lt;P&gt;Here is where I see the issue:&lt;BR /&gt;
Splunk Web -&amp;gt; Settings -&amp;gt; Data Inputs -&amp;gt; Files &amp;amp; Directories -&amp;gt; look for my input files -&amp;gt; the "number of files" column is blank -&amp;gt; it stays blank for more than 6 hours -&amp;gt; once I can see numbers here, the indexing is quick.&lt;/P&gt;

&lt;P&gt;I'll also check the points you mentioned above.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 05:27:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382892#M6329</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-15T05:27:28Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382893#M6330</link>
      <description>&lt;P&gt;Thanks for the document link &lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 05:27:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382893#M6330</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-15T05:27:52Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382894#M6331</link>
      <description>&lt;P&gt;Yes, good point, this is exactly what I think I am going through.&lt;BR /&gt;
Batch mode, which can read and then delete files, looks like a good option, so that only un-indexed files remain to be scanned, i.e. fewer files.&lt;/P&gt;
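
&lt;P&gt;From what I understand of the suggestion, the input would look something like the sketch below. The path, sourcetype and index here are made-up placeholders, not my real values, and since sinkhole deletes each file after it has been consumed, I would keep a copy of the files elsewhere first.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# inputs.conf on the forwarder (path, sourcetype and index are hypothetical)
[batch:///data/csv_drop/*.csv]
# sinkhole makes Splunk delete each file once it has been consumed
move_policy = sinkhole
sourcetype = csv
index = main
&lt;/CODE&gt;&lt;/PRE&gt;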

&lt;P&gt;This is a very good point, let me try this and get back to you.&lt;/P&gt;

&lt;P&gt;Meanwhile, is there any limit on how many files a forwarder can process comfortably? Is there any standard for this, considering I have 12 CPUs and 12 GB of RAM?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 05:30:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382894#M6331</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-15T05:30:17Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382895#M6332</link>
      <description>&lt;P&gt;3 cores per 2 pipelines is what I have heard, which maxes you out around 6.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 06:23:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382895#M6332</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-02-15T06:23:11Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382896#M6333</link>
      <description>&lt;P&gt;"Splunk PS" is required to increase the pipelines to 3 or more.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Feb 2019 08:09:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382896#M6333</guid>
      <dc:creator>HiroshiSatoh</dc:creator>
      <dc:date>2019-02-15T08:09:45Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382897#M6334</link>
      <description>&lt;P&gt;You look to have enough resources to increase the pipelines (you could try 8).&lt;BR /&gt;
Additionally, I would configure:&lt;BR /&gt;
MAX_DAYS_AGO (the value depends on how the data arrives)&lt;BR /&gt;
crcsalt (if the files are not renamed, use the )&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Check that the maxKBps limit has been tuned&lt;/LI&gt;
&lt;/UL&gt;
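
&lt;P&gt;For the maxKBps check, a rough sketch, assuming the universal forwarder is still on its default throughput cap of 256 KBps: the limit is set in limits.conf, and 0 removes the cap entirely. Whether unlimited is appropriate depends on your network and indexers, so treat the value as an example only.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# limits.conf on the forwarder (value is an example, not a recommendation)
[thruput]
# 0 means no throughput limit; the universal forwarder default is 256
maxKBps = 0
&lt;/CODE&gt;&lt;/PRE&gt;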

&lt;P&gt;Are your files local to the UF, and how do they arrive (there could be a race condition here that delays collection)? How do they get purged?&lt;/P&gt;

&lt;P&gt;If the number of files to scan and the delay are still not acceptable after tuning, then you will probably have to rethink how your collection works to improve the situation.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:17:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382897#M6334</guid>
      <dc:creator>maraman_splunk</dc:creator>
      <dc:date>2020-09-29T23:17:18Z</dc:date>
    </item>
    <item>
      <title>Re: parallelIngestionPipelines for forwarder question</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382898#M6335</link>
      <description>&lt;P&gt;Thanks for the input. I will look at your options.&lt;BR /&gt;
I do have crcsalt source and max days ago set, but we are not purging the files.&lt;BR /&gt;
I will use batch mode to purge them. I believe that should resolve my problem.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Feb 2019 09:25:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/parallelIngestionPipelines-for-forwarder-question/m-p/382898#M6335</guid>
      <dc:creator>jiaqya</dc:creator>
      <dc:date>2019-02-20T09:25:13Z</dc:date>
    </item>
  </channel>
</rss>

