<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why are there many duplicate events in the indexer cluster? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374967#M67924</link>
    <description>&lt;P&gt;count : 416&lt;BR /&gt;
count : 500&lt;BR /&gt;
etc. The search returned a total of 3518 results.&lt;/P&gt;

&lt;P&gt;Some events are repeated twice, and some events are repeated hundreds of times.&lt;/P&gt;

&lt;P&gt;Can I use &lt;CODE&gt;dedup _time _raw&lt;/CODE&gt; to exclude duplicate events when I search?&lt;/P&gt;</description>
    <pubDate>Wed, 23 Aug 2017 01:45:09 GMT</pubDate>
    <dc:creator>xsstest</dc:creator>
    <dc:date>2017-08-23T01:45:09Z</dc:date>
    <item>
      <title>Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374965#M67922</link>
      <description>&lt;P&gt;I have a single-site cluster that contains 5 indexers, 4 search heads, a master node, and a deployer. There are also some universal forwarders with load balancing.&lt;/P&gt;

&lt;P&gt;All events in the indexer cluster come from universal forwarders. The data flows as follows (the most common cluster architecture):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Server/Host (UF installed here)-----TCP-----&amp;gt;indexer cluster
Server/Host (syslog)-----Universal Forwarder-----TCP-----&amp;gt;indexer cluster
Server/Host (UF monitors a file)-----TCP-----&amp;gt;indexer cluster
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;So my questions are:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;&lt;P&gt;Why does it return duplicate events when I search? Is it because I'm using TCP? See &lt;A href="https://answers.splunk.com/answers/537368/why-is-there-event-duplication-via-tcp-port.html"&gt;https://answers.splunk.com/answers/537368/why-is-there-event-duplication-via-tcp-port.html&lt;/A&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;I have disabled the &lt;CODE&gt;useACK&lt;/CODE&gt; setting in outputs.conf on the UF.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;What are the common causes of duplicate events? Please tell me so that I can rule them out one by one. Thank you.&lt;/P&gt;&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Please forgive my English.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Aug 2017 16:38:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374965#M67922</guid>
      <dc:creator>xsstest</dc:creator>
      <dc:date>2017-08-22T16:38:51Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374966#M67923</link>
      <description>&lt;P&gt;Hi xsstest, &lt;/P&gt;

&lt;P&gt;Here are some steps to debug:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Are all 5 indexers in the same network / data center? Check the network connectivity between each indexer and the cluster master. &lt;/LI&gt;
&lt;LI&gt;As an initial step, manually check the source file behind the duplicate results. If the file itself contains duplicate data, you have to update your props.conf or handle the duplicates in your search. &lt;/LI&gt;
&lt;LI&gt;Run a simple search to understand the duplicate pattern, something like  &lt;CODE&gt;index=&amp;lt;your index&amp;gt; sourcetype=&amp;lt;sourcetype&amp;gt; source="&amp;lt;source&amp;gt;" host="&amp;lt;host&amp;gt;"  | eval bucket=_bkt | eval indextime=_indextime  |table _time, indextime, bucket splunk_server _raw | convert ctime(indextime) | stats count list(*) as * by _raw | where count&amp;gt;1 | fields * _raw&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;Run the above search and check the count, indextime, and splunk_server. count is the number of times the event was indexed, indextime is when the event was indexed in Splunk, and bucket is the bucket in which the data is stored. &lt;/LI&gt;
&lt;LI&gt;If you see more than one indextime for an event, the event was indexed more than once, depending on your configuration. &lt;/LI&gt;
&lt;LI&gt;If you see multiple splunk_server values with the same bucket, your indexer is not able to connect to the master, so replicated buckets are being made searchable. In that case, check your network and indexer cluster configuration. &lt;/LI&gt;
&lt;/OL&gt;
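
&lt;P&gt;As a rough sketch of how steps 4 to 6 could be summarized in one pass (an illustration only, not the search given in step 3; the output field names distinct_indextimes, distinct_buckets and distinct_servers are made up for this example, and the index and sourcetype are placeholders), a variant that counts distinct values per duplicated event might look like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=&amp;lt;your index&amp;gt; sourcetype=&amp;lt;sourcetype&amp;gt;
| eval bucket=_bkt, indextime=_indextime
| stats count dc(indextime) as distinct_indextimes dc(bucket) as distinct_buckets dc(splunk_server) as distinct_servers by _raw
| where count&amp;gt;1
| sort - count
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Under these assumptions, a row with distinct_indextimes above 1 would suggest the event was indexed more than once, while several distinct_servers with a single bucket would match the replicated-bucket case in step 6.&lt;/P&gt;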

&lt;P&gt;Posting more information will help us assist you further.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Aug 2017 17:04:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374966#M67923</guid>
      <dc:creator>vasanthmss</dc:creator>
      <dc:date>2017-08-22T17:04:37Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374967#M67924</link>
      <description>&lt;P&gt;count : 416&lt;BR /&gt;
count : 500&lt;BR /&gt;
etc. The search returned a total of 3518 results.&lt;/P&gt;

&lt;P&gt;Some events are repeated twice, and some events are repeated hundreds of times.&lt;/P&gt;

&lt;P&gt;Can I use &lt;CODE&gt;dedup _time _raw&lt;/CODE&gt; to exclude duplicate events when I search?&lt;/P&gt;</description>
      <pubDate>Wed, 23 Aug 2017 01:45:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374967#M67924</guid>
      <dc:creator>xsstest</dc:creator>
      <dc:date>2017-08-23T01:45:09Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374968#M67925</link>
      <description>&lt;P&gt;What are the common causes of duplicate events? How can I investigate them one by one?&lt;/P&gt;</description>
      <pubDate>Wed, 23 Aug 2017 08:41:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374968#M67925</guid>
      <dc:creator>xsstest</dc:creator>
      <dc:date>2017-08-23T08:41:01Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374969#M67926</link>
      <description>&lt;P&gt;Yes, you can filter out the duplicates using dedup at search time. Have you checked whether the source file contains duplicates? If it does, you can use dedup; otherwise, try to figure out why the events were duplicated. Any luck with the splunk_server and bucket analysis? &lt;/P&gt;</description>
      <pubDate>Wed, 23 Aug 2017 17:31:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374969#M67926</guid>
      <dc:creator>vasanthmss</dc:creator>
      <dc:date>2017-08-23T17:31:53Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374970#M67927</link>
      <description>&lt;P&gt;@vasanthmss&lt;/P&gt;

&lt;P&gt;The reason was found: due to an rsyslog configuration error, the same &lt;CODE&gt;InputFileFacility&lt;/CODE&gt; was used.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Nov 2017 07:02:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/374970#M67927</guid>
      <dc:creator>xsstest</dc:creator>
      <dc:date>2017-11-15T07:02:42Z</dc:date>
    </item>
    <item>
      <title>Re: Why are there many duplicate events in the indexer cluster?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/606310#M105354</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a similar problem. First, here is my config to forward the internal Splunk indexes:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;[tcpout]&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;forwardedindex.0.whitelist = _.*&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;forwardedindex.filter.disable = false&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;defaultGroup = TEST_IDX-CLUSTER&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;[tcpout:TEST_IDX-CLUSTER]&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;forceTimebasedAutoLB = true&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;autoLBFrequency = 30&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;server = ID01.SPLUNK-TEST.local:9997,ID02.SPLUNK-TEST.local:9997,ID03.SPLUNK-TEST.local:9997&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Then I debugged with this search:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;index="_internal"&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| eval bucket=_bkt&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| eval indextime=_indextime&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| table _time, indextime, bucket splunk_server _raw&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| convert ctime(indextime)&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| stats count list(*) as * by _raw&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| where count&amp;gt;1&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| fields * _raw&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;| sort - indextime&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;Output:&lt;BR /&gt;bucket = within one event, every bucket is different&lt;BR /&gt;count = 2 or sometimes 3&lt;BR /&gt;indextime = all entries are identical&lt;BR /&gt;splunk_server = 01,02,03 or 01,01 or 02,03 or 03,03 (many different combinations)&lt;BR /&gt;&lt;BR /&gt;Does anyone have an idea?&lt;/P&gt;&lt;P&gt;Regards - Markus&lt;/P&gt;</description>
      <pubDate>Wed, 20 Jul 2022 09:21:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-there-many-duplicate-events-in-the-indexer-cluster/m-p/606310#M105354</guid>
      <dc:creator>CMEOGNAD</dc:creator>
      <dc:date>2022-07-20T09:21:08Z</dc:date>
    </item>
  </channel>
</rss>

