<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Why am I seeing duplicate events? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-am-I-seeing-duplicate-events/m-p/339568#M62676</link>
    <description>&lt;P&gt;I'm seeing the following two log messages on my UF. I'm also seeing big spikes in events every few minutes from this log file. What's going on?&lt;/P&gt;

&lt;P&gt;06-06-2017 13:55:47.047 -0400 WARN TcpOutputProc - Possible duplication of events with channel=source::/logs/mylogs/log4j/my-java-logs.log|host::myhost|log4j_6|16384, streamId=12699096867673601155, offset=48369192 onhost=10.217.104.156:9997&lt;/P&gt;

&lt;P&gt;06-06-2017 13:58:45.293 -0400 INFO WatchedFile - Logfile truncated while open, original pathname file='/logs/mylogs/log4j/my-java-logs.log', will begin reading from start.&lt;/P&gt;</description>
    <pubDate>Thu, 08 Jun 2017 03:27:08 GMT</pubDate>
    <dc:creator>davidpaper</dc:creator>
    <dc:date>2017-06-08T03:27:08Z</dc:date>
    <item>
      <title>Why am I seeing duplicate events?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-am-I-seeing-duplicate-events/m-p/339568#M62676</link>
      <description>&lt;P&gt;I'm seeing the following two log messages on my UF. I'm also seeing big spikes in events every few minutes from this log file. What's going on?&lt;/P&gt;

&lt;P&gt;06-06-2017 13:55:47.047 -0400 WARN TcpOutputProc - Possible duplication of events with channel=source::/logs/mylogs/log4j/my-java-logs.log|host::myhost|log4j_6|16384, streamId=12699096867673601155, offset=48369192 onhost=10.217.104.156:9997&lt;/P&gt;

&lt;P&gt;06-06-2017 13:58:45.293 -0400 INFO WatchedFile - Logfile truncated while open, original pathname file='/logs/mylogs/log4j/my-java-logs.log', will begin reading from start.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jun 2017 03:27:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-am-I-seeing-duplicate-events/m-p/339568#M62676</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2017-06-08T03:27:08Z</dc:date>
    </item>
    <item>
      <title>Re: Why am I seeing duplicate events?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-am-I-seeing-duplicate-events/m-p/339569#M62677</link>
      <description>&lt;P&gt;The cause of both messages is the /logs/mylogs/log4j/my-java-logs.log is being written to, and instead of rolled, its being truncated (equivalent of cat /dev/null &amp;gt; my-java-logs.log) and re-written as it grows and reaches 50MB. &lt;/P&gt;

&lt;P&gt;We found this by watching the file's size with watch, refreshing every second:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;/usr/bin/watch -n 1 ls -l /logs/mylogs/log4j/my-java-logs.log&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;We noticed that the file would grow to just under 50MB, reset to 0 bytes, and then keep receiving new data under the same name.&lt;/P&gt;
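
&lt;P&gt;If you'd rather not eyeball the watch output, the same check can be scripted. This is only a rough sketch of the idea, not something we ran at the time: sample_sizes and find_truncations are made-up helper names, and a size that drops between two samples is treated as a truncation.&lt;/P&gt;

&lt;PRE&gt;
```python
import os
import time


def find_truncations(sizes):
    """Return the positions in `sizes` where the size dropped below the previous sample."""
    drops = []
    for i in range(1, len(sizes)):
        # A file that is only appended to never shrinks, so a drop means it was truncated.
        if sizes[i - 1] > sizes[i]:
            drops.append(i)
    return drops


def sample_sizes(path, samples, interval=1.0):
    """Poll the file size `samples` times, `interval` seconds apart (like watch -n 1 ls -l)."""
    sizes = []
    for n in range(samples):
        sizes.append(os.stat(path).st_size)
        if n != samples - 1:
            time.sleep(interval)
    return sizes
```
&lt;/PRE&gt;

&lt;P&gt;For example, find_truncations(sample_sizes("/logs/mylogs/log4j/my-java-logs.log", 600)) would report each sample at which the file shrank during roughly ten minutes of polling.&lt;/P&gt;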

&lt;P&gt;The solution was to go back to the developer and convince them to change the logic to roll the log file to my-java-logs.log.1 and open a new my-java-logs.log for writing, instead of truncating.&lt;/P&gt;
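
&lt;P&gt;To illustrate what rolling means here, below is a small Python sketch (the application in question is Java, so this is not the developer's actual change). It uses the standard library's RotatingFileHandler, which renames the full file to a .1 suffix and opens a fresh file rather than truncating in place. The 200-byte limit, the logger name, and the temp directory are arbitrary choices for the demo.&lt;/P&gt;

&lt;PRE&gt;
```python
import logging
import logging.handlers
import os
import tempfile


def build_rolling_logger(path, max_bytes, backups=1):
    """Create a logger that renames `path` to `path.1` before it would exceed max_bytes."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger("rolling-demo")
    logger.setLevel(logging.INFO)
    # Drop handlers left over from an earlier call so each demo writes to one file only.
    for old in list(logger.handlers):
        logger.removeHandler(old)
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def demo():
    """Write enough lines to force at least one roll, then list the resulting files."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "my-java-logs.log")
    logger = build_rolling_logger(path, max_bytes=200)
    for n in range(30):
        logger.info("event number %d", n)
    for handler in logger.handlers:
        handler.close()
    return sorted(os.listdir(workdir))
```
&lt;/PRE&gt;

&lt;P&gt;Calling demo() leaves both my-java-logs.log and my-java-logs.log.1 in the temp directory: the older lines survive in the .1 file and the live file starts over as a new file.&lt;/P&gt;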

&lt;P&gt;We also noticed that this large file was triggering the batch reader, so we raised min_batch_size_bytes in limits.conf ([default] stanza) from 20MB to 100MB.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 14:24:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-am-I-seeing-duplicate-events/m-p/339569#M62677</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2020-09-29T14:24:02Z</dc:date>
    </item>
  </channel>
</rss>

