<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: rex - matching everything until a tab in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22930#M3999</link>
    <description>&lt;P&gt;Thanks Ayn for the answer.&lt;BR /&gt;
I also got the same result by replacing &lt;CODE&gt;.+&lt;/CODE&gt; with &lt;CODE&gt;[^\t]+&lt;/CODE&gt;, which cannot match across a tab.&lt;/P&gt;</description>
    <pubDate>Mon, 28 Nov 2011 11:44:34 GMT</pubDate>
    <dc:creator>wsw70</dc:creator>
    <dc:date>2011-11-28T11:44:34Z</dc:date>
    <item>
      <title>rex - matching everything until a tab</title>
      <link>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22928#M3997</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am trying to parse a log from a Tipping Point IPS. Here is an example line (truncated for clarity; the real line continues):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Nov 28 07:37:50 10.22.250.151 8 4   dab8b814-b100-11e0-06b9-e527e93f10b7    00000001-0001-0001-0001-000000004270    4270: HTTP: PHP Code Injection  4270
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Everything is OK when parsing it via&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;rex "[a-zA-Z]+\\s+\\d+\\s+\\d+:\\d+:\\d+\\s+\\d+\\.\\d+\\.\\d+\\.\\d+\\s+(?P&amp;lt;ACTION&amp;gt;\\d+)\\s+(?P&amp;lt;CRIT&amp;gt;\\d+)\\s+[0-9-]+\\s+[0-9-]+\\s+(?P&amp;lt;ATTACKID&amp;gt;\\d+):"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and I get the ACTION, CRIT and ATTACKID fields. So far so good.&lt;/P&gt;

&lt;P&gt;I then wanted to extract the next piece of information, the attack description (&lt;EM&gt;HTTP: PHP Code Injection&lt;/EM&gt;). &lt;STRONG&gt;Fields are separated by a TAB&lt;/STRONG&gt;. I therefore tried:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;rex "[a-zA-Z]+\\s+\\d+\\s+\\d+:\\d+:\\d+\\s+\\d+\\.\\d+\\.\\d+\\.\\d+\\s+(?P&amp;lt;ACTION&amp;gt;\\d+)\\s+(?P&amp;lt;CRIT&amp;gt;\\d+)\\s+[0-9-]+\\s+[0-9-]+\\s+(?P&amp;lt;ATTACKID&amp;gt;\\d+):\s+(?P&amp;lt;ATTACKNAME&amp;gt;.+)\\t\\d+"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The idea was to match every character up to the tab. Instead, the capture swallows the remainder of the line (i.e. the match does not stop at the tab).&lt;/P&gt;

&lt;P&gt;I ran this through &lt;A href="http://www.rubular.com"&gt;Rubular&lt;/A&gt; with the source data copied and pasted from Splunk, and it works there, which confirms that the separator really is a tab (I also see it in the search window). So it looks like either there is a specific way to match the tab character, or &lt;CODE&gt;.+&lt;/CODE&gt; matches everything up to the end of the line.&lt;/P&gt;

&lt;P&gt;Thanks a lot for any pointers (and sorry, as the answer must be obvious to anyone used to regex) -- WoJ&lt;/P&gt;</description>
      <pubDate>Mon, 28 Nov 2011 10:54:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22928#M3997</guid>
      <dc:creator>wsw70</dc:creator>
      <dc:date>2011-11-28T10:54:51Z</dc:date>
    </item>
    <item>
      <title>Re: rex - matching everything until a tab</title>
      <link>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22929#M3998</link>
      <description>&lt;P&gt;You need to use a non-greedy match. The current greedy one looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(?P&amp;lt;ATTACKNAME&amp;gt;.+)\t
&lt;/CODE&gt;&lt;/PRE&gt;

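&lt;P&gt;Because &lt;CODE&gt;+&lt;/CODE&gt; is greedy by default, &lt;CODE&gt;.+&lt;/CODE&gt; matches as many characters as possible (a tab counts as "any character" too), so the capture runs to the last tab on the line that still lets the rest of the pattern match. The corresponding non-greedy match would be (note the "?"):&lt;/P&gt;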

&lt;PRE&gt;&lt;CODE&gt;(?P&amp;lt;ATTACKNAME&amp;gt;.+?)\t
&lt;/CODE&gt;&lt;/PRE&gt;
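
&lt;P&gt;As a quick check outside Splunk, here is a minimal Python sketch of the difference. Using Python's &lt;CODE&gt;re&lt;/CODE&gt; module is an assumption made purely for illustration; rex itself uses PCRE, which should treat these constructs the same way. The pattern is shortened to the tail of the sample line, unnamed groups replace the named ones for brevity, and the two trailing fields (&lt;CODE&gt;tcp&lt;/CODE&gt;, &lt;CODE&gt;80&lt;/CODE&gt;) are invented to stand in for the part of the log that was cut. A negated character class is included as a third option.&lt;/P&gt;

```python
import re

# Tail of the sample log line; fields are tab-separated.
# The trailing "tcp" and "80" fields are hypothetical stand-ins
# for the truncated rest of the real line.
line = "4270: HTTP: PHP Code Injection\t4270\ttcp\t80"

# Greedy: .+ consumes as much as it can, then backtracks only as far
# as needed, so it ends at the LAST tab that is followed by digits.
greedy = re.search(r"\d+:\s+(.+)\t\d+", line).group(1)

# Non-greedy: .+? grows one character at a time and stops at the
# FIRST tab that is followed by digits.
lazy = re.search(r"\d+:\s+(.+?)\t\d+", line).group(1)

# Negated class: [^\t]+ can never cross a tab, greedy or not.
negated = re.search(r"\d+:\s+([^\t]+)\t\d+", line).group(1)

print(repr(greedy))
print(repr(lazy))
print(repr(negated))
```

&lt;P&gt;With this input, the greedy capture runs through to the last tab that is followed by digits, while the other two stop at the first tab and capture only the attack description.&lt;/P&gt;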

&lt;P&gt;The &lt;CODE&gt;?&lt;/CODE&gt; makes the quantifier non-greedy: the engine takes the shortest possible match, so the capture stops at the first tab that lets the rest of the pattern match.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Nov 2011 11:39:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22929#M3998</guid>
      <dc:creator>Ayn</dc:creator>
      <dc:date>2011-11-28T11:39:52Z</dc:date>
    </item>
    <item>
      <title>Re: rex - matching everything until a tab</title>
      <link>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22930#M3999</link>
      <description>&lt;P&gt;Thanks Ayn for the answer.&lt;BR /&gt;
I also got the same result by replacing &lt;CODE&gt;.+&lt;/CODE&gt; with &lt;CODE&gt;[^\t]+&lt;/CODE&gt;, which cannot match across a tab.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Nov 2011 11:44:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/rex-matching-everything-until-a-tab/m-p/22930#M3999</guid>
      <dc:creator>wsw70</dc:creator>
      <dc:date>2011-11-28T11:44:34Z</dc:date>
    </item>
  </channel>
</rss>

