<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Filter results before regex is applied in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37125#M6865</link>
    <description>&lt;P&gt;Use:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" *Exception
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Wed, 02 Feb 2011 05:57:36 GMT</pubDate>
    <dc:creator>gkanapathy</dc:creator>
    <dc:date>2011-02-02T05:57:36Z</dc:date>
    <item>
      <title>Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37124#M6864</link>
      <description>&lt;P&gt;I have an application log with a lot of entries.&lt;/P&gt;

&lt;P&gt;I want to be able to get only the lines with the pattern "Exception:"&lt;/P&gt;

&lt;P&gt;some examples of lines in the log file are &lt;/P&gt;

&lt;P&gt;case1: java.text.ParseException: Unparseable date:&lt;/P&gt;

&lt;P&gt;case2: com.pp.xyz.services.exception.UserException: Expected one record with user ID&lt;/P&gt;

&lt;P&gt;The following does not work&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" "Exception:"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;But doing the following matches case 2&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" "Exception"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A couple of questions regarding this:&lt;/P&gt;

&lt;P&gt;Does Splunk ignore the case of the search term, so that "Exception" also matches "exception"?&lt;/P&gt;

&lt;P&gt;Why does Splunk not match partial patterns? I expected the search for "Exception:" to match case 1.&lt;/P&gt;

&lt;P&gt;How do I get the initial search to match the pattern "Exception:"?&lt;/P&gt;

&lt;P&gt;If I can get that to work, I would then do something like the following for the full solution, which is to capture all exceptions:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" "Exception:"|rex "\w+\.(?&amp;lt;exception&amp;gt;.\w+Exception).*?\n"|timechart count by exception usenull=f 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 02 Feb 2011 05:49:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37124#M6864</guid>
      <dc:creator>tven7</dc:creator>
      <dc:date>2011-02-02T05:49:19Z</dc:date>
    </item>
    <item>
      <title>Re: Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37125#M6865</link>
      <description>&lt;P&gt;Use:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" *Exception
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 02 Feb 2011 05:57:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37125#M6865</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2011-02-02T05:57:36Z</dc:date>
    </item>
    <item>
      <title>Re: Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37126#M6866</link>
      <description>&lt;P&gt;1) Yes, Splunk search is case-insensitive with respect to indexed terms. However, boolean operators (AND, OR, NOT) MUST be written in uppercase, and field names MUST be written exactly as they appear.&lt;/P&gt;

&lt;P&gt;2) Splunk matches partial patterns if you put an asterisk into them (as gkanapathy said).
The colon in "Exception:" is considered a "segmenter", i.e. something that breaks up words. You should still be able to get results with the search in 3).&lt;/P&gt;

&lt;P&gt;3) "*Exception:" should do it.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Feb 2011 06:07:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37126#M6866</guid>
      <dc:creator>Paolo_Prigione</dc:creator>
      <dc:date>2011-02-02T06:07:12Z</dc:date>
    </item>
    <item>
      <title>Re: Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37127#M6867</link>
      <description>&lt;P&gt;Perhaps one might prefer "*Exception:"&lt;/P&gt;</description>
      <pubDate>Wed, 02 Feb 2011 06:19:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37127#M6867</guid>
      <dc:creator>jrodman</dc:creator>
      <dc:date>2011-02-02T06:19:46Z</dc:date>
    </item>
    <item>
      <title>Re: Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37128#M6868</link>
      <description>&lt;P&gt;The performance of this is really bad, even going back just 4 hours, which is not a lot of data (&amp;lt; 300 MB). I guess I/O is to blame, since nothing else is contending for resources on the server. Previously I was searching for "Exception" NOT XYZPAttern, which performed well but skipped some of the case 1 patterns.&lt;/P&gt;

&lt;P&gt;Thank you for the help&lt;/P&gt;</description>
      <pubDate>Wed, 02 Feb 2011 08:20:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37128#M6868</guid>
      <dc:creator>tven7</dc:creator>
      <dc:date>2011-02-02T08:20:56Z</dc:date>
    </item>
    <item>
      <title>Re: Filter results before regex is applied</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37129#M6869</link>
      <description>&lt;P&gt;The previous answers are right, but I'd like to point out that searching with a leading wildcard is much less efficient than searching with a trailing wildcard. In other words, looking for "Blah*" is pretty quick because Splunk can do an efficient index lookup to find the terms that start with "Blah". When searching for "*Blah", however, Splunk must scan all terms looking for the ones that end with "Blah". This type of index lookup will always take longer, but you may or may not notice; that depends on how many unique terms your index contains.&lt;/P&gt;

&lt;P&gt;So my suggestion would be to build a list of all possible exception types and put them into a big "OR" list:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Step 1&lt;/STRONG&gt;:  Figure out how many different "*Exception" patterns you really have in your data. (You may want to search over a long time period to make sure you don't miss any.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" *Exception: | regex "\.(?&amp;lt;exception&amp;gt;\w+Exception:)" | dedup exception
&lt;/CODE&gt;&lt;/PRE&gt;
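
&lt;P&gt;A possible variation on Step 1, offered as a sketch rather than a tested search: it uses rex to extract the exception name into a field, then stats and sort to list each distinct type once with its event count, most frequent first. The field name "exception" is only an illustrative choice, and the colon is kept outside the capture group so the values come out without it.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" *Exception: | rex "\.(?&amp;lt;exception&amp;gt;\w+Exception):" | stats count by exception | sort - count
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The counts may help decide which exception types are worth putting into the "OR" list.&lt;/P&gt;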

&lt;P&gt;&lt;STRONG&gt;Step 2&lt;/STRONG&gt;:  Take that list of terms and combine them into your original search, something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" (ParseException: OR UserException: OR BlahException: OR ...)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Assuming you don't have all that many exception types, you should end up with a faster search.&lt;/P&gt;
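
&lt;P&gt;As a rough sketch (an assumption, not something tested against this data), the Step 2 filter could then feed the field extraction and chart from the original question. The two exception names are just the examples from this thread, and the colon sits outside the capture group so the field values do not include it:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="/home/xyz.log" (ParseException: OR UserException:) | rex "\.(?&amp;lt;exception&amp;gt;\w+Exception):" | timechart count by exception usenull=f
&lt;/CODE&gt;&lt;/PRE&gt;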

&lt;P&gt;You'll also have to ask yourself: How often do new exception types show up? Which is preferable: (1) good performance with the possibility of missing events when new exception types show up, or (2) never missing events, but having a slower search?&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;There's a helpful video about segmentation here: &lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;A rel="nofollow" href="http://www.splunk.com/view/SP-CAAACXB"&gt;http://www.splunk.com/view/SP-CAAACXB&lt;/A&gt;&lt;/LI&gt;

&lt;LI&gt;&lt;A href="http://www.splunk.com/web_assets/video/2008/dev/PreviewPeeks/Sorkin_Segmentation.swf" target="test_blank"&gt;http://www.splunk.com/web_assets/video/2008/dev/PreviewPeeks/Sorkin_Segmentation.swf&lt;/A&gt;&lt;/LI&gt;

&lt;/UL&gt;</description>
      <pubDate>Thu, 03 Feb 2011 02:30:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Filter-results-before-regex-is-applied/m-p/37129#M6869</guid>
      <dc:creator>Lowell</dc:creator>
      <dc:date>2011-02-03T02:30:19Z</dc:date>
    </item>
  </channel>
</rss>

