<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to throttle some data from being indexed in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307835#M93261</link>
    <description>&lt;P&gt;You can drop events matching a certain regex by assigning them to the null queue:&lt;BR /&gt;
&lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Discard_specific_events_and_keep_the_rest"&gt;http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Discard_specific_events_and_keep_the_rest&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;The question is how you are going to decide which 1 event to keep and which 9 events to drop. If the events are roughly uniformly distributed across time, you could send every event whose millisecond value ends in 1-9 to the null queue and let those ending in 0 continue on to indexing, but that is rather tricky. Since you know the data, perhaps you can think of a field with a roughly uniform distribution that you could use to make a 10%-90% split...&lt;/P&gt;</description>
    <pubDate>Mon, 26 Feb 2018 15:18:26 GMT</pubDate>
    <dc:creator>FrankVl</dc:creator>
    <dc:date>2018-02-26T15:18:26Z</dc:date>
    <item>
      <title>how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307832#M93258</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have an event that is a real license consumer, and I would like to throttle only this event: only 1 out of every 10 occurrences of it should be indexed. All other events should remain unchanged.&lt;/P&gt;

&lt;P&gt;How can I do that?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Michael&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 12:17:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307832#M93258</guid>
      <dc:creator>HadvoraMaya</dc:creator>
      <dc:date>2018-02-26T12:17:25Z</dc:date>
    </item>
    <item>
      <title>Re: how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307833#M93259</link>
      <description>&lt;P&gt;How are you currently ingesting that data? On a UF or a HF, through what input method?&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 12:49:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307833#M93259</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2018-02-26T12:49:07Z</dc:date>
    </item>
    <item>
      <title>Re: how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307834#M93260</link>
      <description>&lt;P&gt;I "Shoot" the data from the Application into Splunk engine via a specific port.&lt;BR /&gt;
Not using Splunk Forwarder.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 13:17:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307834#M93260</guid>
      <dc:creator>HadvoraMaya</dc:creator>
      <dc:date>2018-02-26T13:17:49Z</dc:date>
    </item>
    <item>
      <title>Re: how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307835#M93261</link>
      <description>&lt;P&gt;You can drop events matching a certain regex by assigning them to the null queue:&lt;BR /&gt;
&lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Discard_specific_events_and_keep_the_rest"&gt;http://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Discard_specific_events_and_keep_the_rest&lt;/A&gt;&lt;/P&gt;
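
&lt;P&gt;As a rough sketch of what that could look like for this case (not a tested configuration): the sourcetype name my_app_events, the transform name drop_mm_9_of_10, the token event=MM, and the timestamp layout are all assumptions made for illustration, and events are assumed to be single-line. Since the data arrives on a network port of the Splunk instance itself, these settings would presumably go on that instance. Events named MM whose millisecond value ends in 1-9 are routed to the null queue; everything else is left alone.&lt;/P&gt;

&lt;PRE&gt;
# props.conf (hypothetical sourcetype name)
[my_app_events]
TRANSFORMS-throttle_mm = drop_mm_9_of_10

# transforms.conf
# Assumes each event starts with a timestamp such as 2018-02-26 15:18:26.123
# and contains the token event=MM (both are assumptions about the data).
[drop_mm_9_of_10]
REGEX = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d\d[1-9]\s.*\bevent=MM\b
DEST_KEY = queue
FORMAT = nullQueue
&lt;/PRE&gt;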

&lt;P&gt;The question is how you are going to decide which 1 event to keep and which 9 events to drop. If the events are roughly uniformly distributed across time, you could send every event whose millisecond value ends in 1-9 to the null queue and let those ending in 0 continue on to indexing, but that is rather tricky. Since you know the data, perhaps you can think of a field with a roughly uniform distribution that you could use to make a 10%-90% split...&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 15:18:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307835#M93261</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2018-02-26T15:18:26Z</dc:date>
    </item>
    <item>
      <title>Re: how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307836#M93262</link>
      <description>&lt;P&gt;I know the event's name. I just want it to filter out 90% of this event's beeing indexed.&lt;BR /&gt;
Just need to be able to say that event name MM should be index 1 out of 10 events.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 15:28:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307836#M93262</guid>
      <dc:creator>HadvoraMaya</dc:creator>
      <dc:date>2018-02-26T15:28:41Z</dc:date>
    </item>
    <item>
      <title>Re: how to throttle some data from being indexed</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307837#M93263</link>
      <description>&lt;P&gt;And as I mentioned: the only thing I can think of to do that is find a regex that (probably more or randomly) matches 10% of the events. E.g. by triggering of the milliseconds or maybe there is some incremental eventID where you could ignore all eventIDs that end with 1-9 and only accept eventIDs ending in 0 or something.&lt;/P&gt;

&lt;P&gt;As mentioned, none of this is brilliantly reliable, but it's the best I can think of. As far as I know, there is no way to tell Splunk to let 1 out of 10 events through.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Feb 2018 08:59:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/how-to-throttle-some-data-from-being-indexed/m-p/307837#M93263</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2018-02-27T08:59:08Z</dc:date>
    </item>
  </channel>
</rss>

