<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Ignoring events during indexing Vs ignoring events during search in Monitoring Splunk</title>
    <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161972#M1806</link>
    <description>&lt;P&gt;There likely is a typo in the last capturing group near the asterisk before "vod".&lt;/P&gt;</description>
    <pubDate>Fri, 21 Feb 2014 23:42:47 GMT</pubDate>
    <dc:creator>martin_mueller</dc:creator>
    <dc:date>2014-02-21T23:42:47Z</dc:date>
    <item>
      <title>Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161970#M1804</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;We use Splunk 5.0.4. We have customers with daily log volumes ranging from 10GB to 50GB.&lt;/P&gt;

&lt;P&gt;Our customers do not want URLs matching *.jpg, *.png, etc. to show up in charts and reports.&lt;/P&gt;

&lt;P&gt;We have two options:&lt;BR /&gt;
 1. Filter out these events from indexing. &lt;BR /&gt;
 2. Ignore these events while creating the summary index.&lt;/P&gt;

&lt;P&gt;Given their log volumes, I would like to know which operation is more performance-intensive. We do not want to compromise on performance, so which option is better in that respect?&lt;/P&gt;

&lt;P&gt;Also, I have this stanza in my transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[strip_images_header]
REGEX = (?i)^(?:[^ ]*( {1,2})){6}(?P&amp;lt;URL&amp;gt;[^ ]*)(?= )=(*.net|*vod|*.jpeg)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have included this in my props.conf, but the events are not getting filtered.&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;

&lt;P&gt;Strive&lt;/P&gt;</description>
      <pubDate>Fri, 21 Feb 2014 22:58:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161970#M1804</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-02-21T22:58:53Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161971#M1805</link>
      <description>&lt;P&gt;Filtering these events during indexing should only be done if you're 100% certain you will not need those events for anything in the future.&lt;BR /&gt;
If you really are certain you will never need those events, filtering at index time is a good choice because it reduces storage, search, and license load... until you discover you do need the events after all.&lt;/P&gt;
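
&lt;P&gt;For reference, index-time filtering of this kind is usually wired up by pointing a props.conf stanza at the transforms.conf stanza that routes events to nullQueue. A minimal sketch, assuming the transform is named strip_images_header as in the question; the sourcetype name my_access_log and the class suffix strip_images are hypothetical placeholders, not taken from the original setup:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf, on the instance that parses the data (indexer or heavy forwarder)
# "my_access_log" is a hypothetical sourcetype name; substitute your own
[my_access_log]
# The suffix after "TRANSFORMS-" is an arbitrary class name;
# the value must match the stanza name in transforms.conf
TRANSFORMS-strip_images = strip_images_header
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Settings like this typically take effect only for data indexed after the change, which is part of why the decision is hard to undo.&lt;/P&gt;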

&lt;P&gt;Personally I'd define an eventtype "charting_URLs" or similar that defines what you want to see in such a chart, and then use that as a search time filter for all your charts (basically option 2). That way you have a single configuration to adapt if the charting requirements change, and you still have the option of using the events in the future.&lt;/P&gt;</description>
      <pubDate>Fri, 21 Feb 2014 23:41:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161971#M1805</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-21T23:41:12Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161972#M1806</link>
      <description>&lt;P&gt;There likely is a typo in the last capturing group near the asterisk before "vod".&lt;/P&gt;</description>
      <pubDate>Fri, 21 Feb 2014 23:42:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161972#M1806</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-21T23:42:47Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161973#M1807</link>
      <description>&lt;P&gt;It is *vod only. I kept it that way on purpose.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:05:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161973#M1807</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-02-22T00:05:29Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161974#M1808</link>
      <description>&lt;P&gt;Completely agree with your points on reduced storage, search, and license load. &lt;BR /&gt;
Filtering events at index time in high-volume systems affects indexing performance, right? It has to check each and every event. I am assuming this may affect other charts that depend on the summary indexes created every 5 minutes, and also the real-time charts.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:07:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161974#M1808</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-02-22T00:07:51Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161975#M1809</link>
      <description>&lt;P&gt;Well, what does the asterisk apply to? There's nothing in front of it that could be matched zero or more times.&lt;/P&gt;

&lt;P&gt;Same question for the asterisks before .net and .jpeg&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:10:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161975#M1809</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-22T00:10:52Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161976#M1810</link>
      <description>&lt;P&gt;Looking further into the regex, this can't be right:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(?= )=
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That's a contradiction in and of itself. "Look ahead for a space, don't consume a char, match for an equals sign"... the char can't be both a space and an equals sign.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:15:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161976#M1810</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-22T00:15:44Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161977#M1811</link>
      <description>&lt;P&gt;Filtering does affect indexing performance negatively because it has to test the filter for many events - but it also affects indexing performance positively because fewer events need to be indexed and written to disk.&lt;/P&gt;

&lt;P&gt;Which effect prevails depends on the complexity of the filter and the ratio of events tossed into nullQueue. If you test a million events to move one into nullQueue, you're going to have worse performance; if you move half a million into nullQueue with a simple filter, you may even improve performance.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:18:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161977#M1811</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-22T00:18:03Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161978#M1812</link>
      <description>&lt;P&gt;Filtering at index time affects all charts &lt;EM&gt;and all searches&lt;/EM&gt;. If the data has been ditched during indexing no search can find it, regardless of realtime or not.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 00:18:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161978#M1812</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-22T00:18:48Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161979#M1813</link>
      <description>&lt;P&gt;We have log events with space as the separator. The regex I mentioned, &lt;CODE&gt;(?i)^(?:[^ ]*( {1,2})){6}(?P&amp;lt;URL&amp;gt;[^ ]*)(?= )&lt;/CODE&gt;, is used to extract the URL field at search time, and it works perfectly to fetch the 7th field (the URL) in the log events. My use case is to take the 7th field and check whether it ends with .net, vod, or jpeg.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 01:11:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161979#M1813</guid>
      <dc:creator>strive</dc:creator>
      <dc:date>2014-02-22T01:11:48Z</dc:date>
    </item>
    <item>
      <title>Re: Ignoring events during indexing Vs ignoring events during search</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161980#M1814</link>
      <description>&lt;P&gt;Up until that point - the lookahead for a space - the regex appears fine to me. However, the part after that - &lt;CODE&gt;=(*.net|*vod|*.jpeg)&lt;/CODE&gt; - is what looks wrong to me.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Feb 2014 10:43:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/Ignoring-events-during-indexing-Vs-ignoring-events-during-search/m-p/161980#M1814</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-02-22T10:43:32Z</dc:date>
    </item>
  </channel>
</rss>

