<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Tail/Batch readers confusion in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577745#M102053</link>
    <description>&lt;P&gt;OK. Same thing again - after the file rotated at midnight, it started being properly tailed (at least that's what I suppose). Then I got&lt;/P&gt;&lt;PRE&gt;12-08-2021 06:01:27.167 +0100 WARN TailReader - Enqueuing a very large file=\\&amp;lt;redacted&amp;gt; in the batch reader, with bytes_to_read=224858513, reading of other large files could be delayed&lt;/PRE&gt;&lt;P&gt;And I've been getting this message every few minutes up to this point. And the forwarder shows the file as read up to the end and stopped.&lt;/P&gt;&lt;P&gt;So if I understand the behaviour correctly - if the forwarder switches from tailreader to batchreader, it's never going back to the tailreader for this file, right?&lt;/P&gt;&lt;P&gt;The only thing to check now is to raise the limit for the batchreader so the forwarder doesn't switch to batch reader so eagerly.&lt;/P&gt;</description>
    <pubDate>Wed, 08 Dec 2021 08:31:29 GMT</pubDate>
    <dc:creator>PickleRick</dc:creator>
    <dc:date>2021-12-08T08:31:29Z</dc:date>
    <item>
      <title>Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577483#M102008</link>
      <description>&lt;P&gt;I'm having more strange situations with my UF ingesting many big files.&lt;/P&gt;&lt;P&gt;OK, I managed to make the UF read the current Exchange logs reasonably quickly (it seems that there were some age limits left ridiculously high by someone so there were many files to check). So now there are several dozen (or even hundreds of) files tracked by splunkd but it seems to work somehow. The problem is that I also monitor another quite quickly growing file on this UF. And it's giving me a headache.&lt;/P&gt;&lt;P&gt;Some time after the UF starts, if restarted mid-day, I get&lt;/P&gt;&lt;PRE&gt;TailReader - Enqueuing a very large file=\\&amp;lt;redacted&amp;gt; in the batch reader, with bytes_to_read=9565503150, reading of other large files could be delayed&lt;/PRE&gt;&lt;P&gt;OK, that's understandable - the batch reader is supposed to be more effective at reading a single big file at once, why not. But the trick is - the file is not getting ingested. I don't see any new events in the index. And I checked with procexp64.exe from SysInternals and handle64.exe - the file is not opened by splunkd.exe at all.&lt;/P&gt;&lt;P&gt;So where is my file???&lt;/P&gt;&lt;P&gt;Other files are being monitored and the data is getting ingested.&lt;/P&gt;</description>
      <pubDate>Mon, 06 Dec 2021 15:46:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577483#M102008</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-06T15:46:36Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577575#M102030</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;I suppose you have already read this thread:&amp;nbsp;&lt;A href="https://community.splunk.com/t5/Getting-Data-In/File-not-being-read-by-Splunk-in-a-directory-while-others-are/m-p/374214" target="_blank"&gt;https://community.splunk.com/t5/Getting-Data-In/File-not-being-read-by-Splunk-in-a-directory-while-others-are/m-p/374214&lt;/A&gt;?&amp;nbsp;Have you also raised the thruput limit in limits.conf, and do you have several pipelines defined?&lt;/P&gt;&lt;P&gt;What does splunk list inputstatus say about this file?&lt;/P&gt;&lt;P&gt;r. Ismo&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 09:21:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577575#M102030</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2021-12-07T09:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577587#M102032</link>
      <description>&lt;P&gt;Inputstatus shows (in the section regarding TailingProcessor) ...&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="PickleRick_0-1638872369776.png" style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/17131iBE1A8A8ED3A297A8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="PickleRick_0-1638872369776.png" alt="PickleRick_0-1638872369776.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;But does it mean that after the UF switched to batch mode, it just treats the file as finished and will not tail it anymore?&lt;/P&gt;&lt;P&gt;Or does it mean that the UF will run the batch reader again some time in the future? (When I look in _internal for the occurrences of my source log name, I get several messages about enqueueing a very large file throughout the whole time after the last restart.)&lt;/P&gt;&lt;P&gt;So I don't quite get what the UF does after switching to batchreader.&lt;/P&gt;&lt;P&gt;I will have to contact the guys administering the server which provides me with the log because for now only the UF has access and I can't even manually check the file contents. &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I suppose there might be something wrong on the source side after all.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 10:28:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577587#M102032</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-07T10:28:15Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577588#M102033</link>
      <description>&lt;P&gt;As it says 100%, the UF won't read the file again (unless you remove its entry from the _fishbucket with btprobe on the UF side).&lt;/P&gt;&lt;P&gt;If you don't see the data where you are expecting it, then it was read and sent somewhere else, or it was hit by some rule which drops it.&lt;/P&gt;&lt;P&gt;How about those limits and pipelines? Do you already have those in place, or should you increase them?&lt;/P&gt;&lt;P&gt;r. Ismo&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 10:34:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577588#M102033</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2021-12-07T10:34:43Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577589#M102034</link>
      <description>&lt;P&gt;My throughput is raised quite significantly since the host is doing quite a lot of ingesting (like 32MBps or something like that). For now I have only one pipeline.&lt;/P&gt;&lt;P&gt;It's getting more and more confusing since the inputstatus is indeed - after the last UF restart which I needed because of apparent lack of users defined on the UF - from the tail processor.&lt;/P&gt;&lt;P&gt;I'd still say there's something strange going on on the source side.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 10:42:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577589#M102034</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-07T10:42:12Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577590#M102035</link>
      <description>&lt;P&gt;When the UF reads that file, it keeps it open for 3 seconds (the default time_before_close) after it has reached EOF.&lt;/P&gt;&lt;PRE&gt;time_before_close = &amp;lt;integer&amp;gt;
* The amount of time, in seconds, that the file monitor must wait for
  modifications before closing a file after reaching an End-of-File
  (EOF) marker.
* Tells the input not to close files that have been updated in the
  past 'time_before_close' seconds.
* Default: 3&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;For that reason you wouldn't see it as open once it has read the whole file. Usually the UF first checks whether the file's modification time and inode information have changed since the last check. Only after that does it start to read the file and check from the content whether there is something to read and send to the indexer.&lt;/P&gt;&lt;PRE&gt;alwaysOpenFile = &amp;lt;boolean&amp;gt;
* Opens a file to check whether it has already been indexed, by skipping the
  modification time/size checks.
* Only useful for files that do not update modification time or size.
* Only known to be needed when monitoring files on Windows, mostly for
  Internet Information Server logs.
* Configuring this setting to "1" can increase load and slow indexing. Use it
  only as a last resort.
* Default: 0&lt;/PRE&gt;</description>
      <pubDate>Tue, 07 Dec 2021 10:49:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577590#M102035</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2021-12-07T10:49:17Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577600#M102036</link>
      <description>&lt;P&gt;Ah, forgot to mention. My time_before_close was raised to 300 seconds already.&lt;/P&gt;&lt;P&gt;I'll check alwaysOpenFile but I'm afraid that it will only check the CRC which should not change mid-flight. But it's worth a try. Thanks for the hint.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 11:26:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577600#M102036</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-07T11:26:16Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577607#M102038</link>
      <description>&lt;P&gt;OK. It seems that after setting alwaysOpenFile=1 and restarting the UF, it read the file up to&lt;/P&gt;&lt;PRE&gt;file position = 14925904619&lt;BR /&gt;file size = 14084100799&lt;BR /&gt;percent = 105.98&lt;BR /&gt;type = done reading (batch)&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;and stopped again.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Dec 2021 12:07:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577607#M102038</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-07T12:07:32Z</dc:date>
    </item>
    <item>
      <title>Re: Tail/Batch readers confusion</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577745#M102053</link>
      <description>&lt;P&gt;OK. Same thing again - after the file rotated at midnight, it started being properly tailed (at least that's what I suppose). Then I got&lt;/P&gt;&lt;PRE&gt;12-08-2021 06:01:27.167 +0100 WARN TailReader - Enqueuing a very large file=\\&amp;lt;redacted&amp;gt; in the batch reader, with bytes_to_read=224858513, reading of other large files could be delayed&lt;/PRE&gt;&lt;P&gt;And I've been getting this message every few minutes up to this point. And the forwarder shows the file as read up to the end and stopped.&lt;/P&gt;&lt;P&gt;So if I understand the behaviour correctly - if the forwarder switches from tailreader to batchreader, it's never going back to the tailreader for this file, right?&lt;/P&gt;&lt;P&gt;The only thing to check now is to raise the limit for the batchreader so the forwarder doesn't switch to batch reader so eagerly.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Dec 2021 08:31:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Tail-Batch-readers-confusion/m-p/577745#M102053</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2021-12-08T08:31:29Z</dc:date>
    </item>
  </channel>
</rss>

