<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: scripted inputs and duplicate event data in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128692#M26419</link>
    <description>&lt;P&gt;The best (and possibly only) way would be to implement this logic in your script. Splunk has no built-in way to compare incoming data with what's already in the index.&lt;/P&gt;

&lt;P&gt;My suggested approach would be to edit your script so that it keeps the last version of the XML file and, on each subsequent request, compares the newly fetched data with that saved copy and outputs only the events that are new.&lt;/P&gt;</description>
    <pubDate>Tue, 28 Jan 2014 22:25:28 GMT</pubDate>
    <dc:creator>Ayn</dc:creator>
    <dc:date>2014-01-28T22:25:28Z</dc:date>
    <item>
      <title>scripted inputs and duplicate event data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128691#M26418</link>
      <description>&lt;P&gt;Hi all.&lt;/P&gt;

&lt;P&gt;I have built a simple scripted input that grabs XML data over HTTP:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;#!/bin/bash&lt;BR /&gt;
curl http://www.a.com/EN.XML&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;This works, but Splunk indexes every event in the file each time the script runs, resulting in duplicate events.&lt;/P&gt;

&lt;P&gt;What is the best way to check the XML file against what Splunk has already indexed, so that only events that have not been indexed yet are pulled in?&lt;/P&gt;

&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jan 2014 21:58:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128691#M26418</guid>
      <dc:creator>himynamesdave</dc:creator>
      <dc:date>2014-01-28T21:58:30Z</dc:date>
    </item>
    <item>
      <title>Re: scripted inputs and duplicate event data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128692#M26419</link>
      <description>&lt;P&gt;The best (and possibly only) way would be to implement this logic in your script. Splunk has no built-in way to compare incoming data with what's already in the index.&lt;/P&gt;

&lt;P&gt;My suggested approach would be to edit your script so that it keeps the last version of the XML file and, on each subsequent request, compares the newly fetched data with that saved copy and outputs only the events that are new.&lt;/P&gt;</description>
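
&lt;P&gt;A minimal sketch of that script-side logic, not a definitive implementation. It is written in Python rather than bash for easier XML handling, and it assumes that every direct child of the root element in EN.XML is one event. The state file name &lt;CODE&gt;seen_events.txt&lt;/CODE&gt;, the script name and the function names are made up for illustration. Instead of diffing whole files, it stores a hash of each event it has already printed and skips those on later runs.&lt;/P&gt;

```python
#!/usr/bin/env python3
"""Sketch: print only the events that earlier runs have not printed yet."""
import hashlib
import os
import sys
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical state file; in practice use an absolute, writable path.
STATE_FILE = "seen_events.txt"


def event_key(element):
    # Fingerprint one event by hashing its serialized XML.
    # strip() drops trailing whitespace that depends on the element's position.
    return hashlib.sha256(ET.tostring(element).strip()).hexdigest()


def new_events(xml_bytes, seen):
    # Assumption: every direct child of the root element is one event.
    # Returns the unseen events as strings and records their keys in `seen`.
    root = ET.fromstring(xml_bytes)
    fresh = []
    for child in root:
        key = event_key(child)
        if key not in seen:
            seen.add(key)
            fresh.append(ET.tostring(child, encoding="unicode").strip())
    return fresh


def load_seen(path):
    # Read the fingerprints saved by the previous run, if any.
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return set(line.strip() for line in handle if line.strip())


def save_seen(path, seen):
    # Persist the fingerprints, one per line, for the next run.
    with open(path, "w") as handle:
        handle.write("\n".join(sorted(seen)))


def main(url):
    seen = load_seen(STATE_FILE)
    with urllib.request.urlopen(url) as response:
        body = response.read()
    for event in new_events(body, seen):
        print(event)  # a scripted input's stdout is what gets indexed
    save_seen(STATE_FILE, seen)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: dedupe_input.py URL", file=sys.stderr)
```

&lt;P&gt;Run as &lt;CODE&gt;python3 dedupe_input.py http://www.a.com/EN.XML&lt;/CODE&gt;, the first run prints every event and later runs print only events whose content has not been seen before. An event that changes in any way counts as new, which may or may not be what you want.&lt;/P&gt;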
      <pubDate>Tue, 28 Jan 2014 22:25:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128692#M26419</guid>
      <dc:creator>Ayn</dc:creator>
      <dc:date>2014-01-28T22:25:28Z</dc:date>
    </item>
    <item>
      <title>Re: scripted inputs and duplicate event data</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128693#M26420</link>
      <description>&lt;P&gt;Thought so (was hoping I could cheat) 🙂&lt;/P&gt;

&lt;P&gt;Thanks for your help!&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jan 2014 09:44:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/scripted-inputs-and-duplicate-event-data/m-p/128693#M26420</guid>
      <dc:creator>himynamesdave</dc:creator>
      <dc:date>2014-01-29T09:44:49Z</dc:date>
    </item>
  </channel>
</rss>

