<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: avoid duplicate file ingestion in splunk in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396530#M95607</link>
    <description>&lt;P&gt;If you copy the same file into the folder, it's likely to be indexed again. As @FrankVl said, the Splunk monitor input checks the CRC of each file before indexing it. Please set up inputs.conf to index only the files or file patterns you need. Additionally, you can use whitelist/blacklist.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Howlogfilerotationishandled"&gt;https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Howlogfilerotationishandled&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Whitelistorblacklistspecificincomingdata"&gt;https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Whitelistorblacklistspecificincomingdata&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 25 Feb 2019 09:41:09 GMT</pubDate>
    <dc:creator>lakshman239</dc:creator>
    <dc:date>2019-02-25T09:41:09Z</dc:date>
    <item>
      <title>avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396528#M95605</link>
      <description>&lt;P&gt;How can I prevent duplicate files from being ingested in Splunk?&lt;BR /&gt;
I am monitoring a folder that contains a file named abcd.csv. When I make a copy of this file and paste it into the same folder, it gets ingested again. How can I restrict Splunk from doing so?&lt;/P&gt;</description>
      <pubDate>Mon, 25 Feb 2019 05:06:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396528#M95605</guid>
      <dc:creator>test4u</dc:creator>
      <dc:date>2019-02-25T05:06:42Z</dc:date>
    </item>
    <item>
      <title>Re: avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396529#M95606</link>
      <description>&lt;P&gt;What are your inputs.conf settings for that folder? Because by default Splunk ignores files that have the same content (based on a CRC calculated over the first 256 bytes or so).&lt;/P&gt;</description>
      <pubDate>Mon, 25 Feb 2019 09:13:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396529#M95606</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2019-02-25T09:13:34Z</dc:date>
    </item>
    <item>
      <title>Re: avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396530#M95607</link>
      <description>&lt;P&gt;If you copy the same file into the folder, it's likely to be indexed again. As @FrankVl said, the Splunk monitor input checks the CRC of each file before indexing it. Please set up inputs.conf to index only the files or file patterns you need. Additionally, you can use whitelist/blacklist.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Howlogfilerotationishandled"&gt;https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Howlogfilerotationishandled&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Whitelistorblacklistspecificincomingdata"&gt;https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Whitelistorblacklistspecificincomingdata&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Feb 2019 09:41:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396530#M95607</guid>
      <dc:creator>lakshman239</dc:creator>
      <dc:date>2019-02-25T09:41:09Z</dc:date>
    </item>
    <item>
      <title>Re: avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396531#M95608</link>
      <description>&lt;P&gt;The whole point is that by default, Splunk does &lt;EM&gt;not&lt;/EM&gt; index files again if they are an exact copy of already ingested files.&lt;BR /&gt;
If Splunk is ingesting those files again, that points to some specific config being in place that overrides the default behavior (e.g. changes to the crcSalt setting). I would look for the solution there, rather than in changing the pattern or using white/blacklists.&lt;/P&gt;
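
&lt;P&gt;As a rough sketch only, a plain file monitor that relies on the default CRC check might look something like the stanza below. The folder path, index and sourcetype are hypothetical placeholders, not taken from any real setup, and crcSalt is deliberately left unset:&lt;/P&gt;

&lt;P&gt;[monitor://C:\data\incoming]&lt;BR /&gt;
disabled = false&lt;BR /&gt;
index = main&lt;BR /&gt;
sourcetype = csv&lt;BR /&gt;
# crcSalt is intentionally not set here, so the default content-based CRC check applies&lt;/P&gt;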

&lt;P&gt;But let's see what the current config is, so we can determine the best course of action &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Feb 2019 09:48:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396531#M95608</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2019-02-25T09:48:54Z</dc:date>
    </item>
    <item>
      <title>Re: avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396532#M95609</link>
      <description>&lt;P&gt;I haven't made any changes to inputs.conf as such. The following is my inputs.conf:&lt;/P&gt;

&lt;P&gt;[script://$SPLUNK_HOME\etc\apps\S_APP\bin\S_SCRIPT_FINAL.py]&lt;BR /&gt;
disabled = false&lt;BR /&gt;
index = soc&lt;BR /&gt;
interval = 60.0&lt;BR /&gt;
sourcetype = csv&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:21:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396532#M95609</guid>
      <dc:creator>test4u</dc:creator>
      <dc:date>2020-09-29T23:21:28Z</dc:date>
    </item>
    <item>
      <title>Re: avoid duplicate file ingestion in splunk</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396533#M95610</link>
      <description>&lt;P&gt;Right, so it is a scripted input, not a file monitor as your question suggested. So the solution probably needs to be found in the workings of that script.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Feb 2019 08:04:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/avoid-duplicate-file-ingestion-in-splunk/m-p/396533#M95610</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2019-02-27T08:04:33Z</dc:date>
    </item>
  </channel>
</rss>

