<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Scheduler, to clean event data -index  in Reporting</title>
    <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138501#M3113</link>
    <description>&lt;P&gt;Strive - Yes, there is no other way to clean the index. I am using the script/commands you mentioned to clean the index.&lt;/P&gt;</description>
    <pubDate>Thu, 31 Jul 2014 03:57:09 GMT</pubDate>
    <dc:creator>rupesh_kumar</dc:creator>
    <dc:date>2014-07-31T03:57:09Z</dc:date>
    <item>
      <title>Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138496#M3108</link>
      <description>&lt;P&gt;Hello Splunk Team,&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;How can I write/schedule a program (java/python) to clean the eventdata?&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;My use case is:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;I am generating metadata and some additional information from binary files. I am dealing with a large dataset, 100-200 TB.&lt;/LI&gt;
&lt;LI&gt;Each day we produce 1040-1100 records (metadata); some of these records may be the same as ones generated the day before.&lt;/LI&gt;
&lt;LI&gt;I am using a relational database to store these records and Splunk dbx to index the data in Splunk.&lt;/LI&gt;
&lt;LI&gt;I want to index a fresh copy of the 1040-1100 records each day to avoid duplication.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Please provide your input on this.&lt;/P&gt;

&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Jul 2014 07:51:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138496#M3108</guid>
      <dc:creator>rupesh_kumar</dc:creator>
      <dc:date>2014-07-11T07:51:08Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138497#M3109</link>
      <description>&lt;P&gt;You can do this with a shell script. Write a shell script with 3 commands:&lt;BR /&gt;
Command 1: splunk stop&lt;BR /&gt;
Command 2: splunk clean eventdata -index &amp;lt;INDEX_NAME&amp;gt;&lt;BR /&gt;
Command 3: splunk start&lt;/P&gt;

&lt;P&gt;Schedule this script to run before you index new data.&lt;/P&gt;
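
&lt;P&gt;As a rough illustration, the three commands above could be wrapped in a small Python script like the one below. This is only a sketch under assumptions: the install path /opt/splunk/bin/splunk and the index name my_index are placeholders, the helper names build_commands and clean_index are made up for this example, and the -f flag (which, as far as I know, skips the confirmation prompt of clean eventdata) should be checked against the documentation for your Splunk version. It defaults to a dry run that only prints the commands.&lt;/P&gt;

```python
import subprocess

SPLUNK_BIN = "/opt/splunk/bin/splunk"  # assumed install path, adjust for your host
INDEX_NAME = "my_index"                # hypothetical index name


def build_commands(splunk_bin, index_name):
    """Return the stop, clean and start commands in the order they must run."""
    return [
        [splunk_bin, "stop"],
        # -f is assumed to skip the interactive confirmation prompt
        [splunk_bin, "clean", "eventdata", "-index", index_name, "-f"],
        [splunk_bin, "start"],
    ]


def clean_index(splunk_bin, index_name, dry_run=True):
    """Print each command (dry run) or execute it, aborting on the first failure."""
    for cmd in build_commands(splunk_bin, index_name):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the command exits non-zero
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    clean_index(SPLUNK_BIN, INDEX_NAME, dry_run=True)
```

&lt;P&gt;Note that if the clean step fails when run for real, the script aborts and Splunk stays stopped, so you may want to add your own error handling before scheduling it.&lt;/P&gt;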

&lt;P&gt;The same logic can be used in a Python script. As far as I know, you still have to schedule that Python program to run using a shell script.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Jul 2014 11:31:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138497#M3109</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-07-11T11:31:00Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138498#M3110</link>
      <description>&lt;P&gt;Strive - I am looking for another solution. I don't want to stop/start the server since this is my production server (enterprise app).&lt;/P&gt;</description>
      <pubDate>Fri, 11 Jul 2014 11:58:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138498#M3110</guid>
      <dc:creator>rupesh_kumar</dc:creator>
      <dc:date>2014-07-11T11:58:55Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138499#M3111</link>
      <description>&lt;P&gt;As per the Splunk documentation, Splunk recommends stopping Splunk before cleaning the data. I have not tried cleaning event data without stopping Splunk, so I do not know the impact.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Jul 2014 14:29:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138499#M3111</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-07-11T14:29:07Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138500#M3112</link>
      <description>&lt;P&gt;Since the number of records is small and they need to be updated frequently (daily), why don't you use a lookup table file to store this metadata instead of a Splunk index? You can use outputlookup after your dbx command to update the lookup table file from a search.&lt;/P&gt;

&lt;P&gt;Something like this&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;your dbx command | outputlookup YourLookupName.csv append=false 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Append=false will ensure data is overwritten, so you'll always have the latest data.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Jul 2014 15:04:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138500#M3112</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2014-07-11T15:04:38Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138501#M3113</link>
      <description>&lt;P&gt;Strive - Yes, there is no other way to clean the index. I am using the script/commands you mentioned to clean the index.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jul 2014 03:57:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138501#M3113</guid>
      <dc:creator>rupesh_kumar</dc:creator>
      <dc:date>2014-07-31T03:57:09Z</dc:date>
    </item>
    <item>
      <title>Re: Scheduler, to clean event data -index</title>
      <link>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138502#M3114</link>
      <description>&lt;P&gt;somesoni2 - Thank you for your response.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jul 2014 03:57:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Scheduler-to-clean-event-data-index/m-p/138502#M3114</guid>
      <dc:creator>rupesh_kumar</dc:creator>
      <dc:date>2014-07-31T03:57:39Z</dc:date>
    </item>
  </channel>
</rss>

