<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: regex field extraction question in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33213#M7051</link>
    <description />
    <pubDate>Mon, 23 Apr 2012 05:20:43 GMT</pubDate>
    <dc:creator>Damien_Dallimor</dc:creator>
    <dc:date>2012-04-23T05:20:43Z</dc:date>
    <item>
      <title>regex field extraction question</title>
      <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33212#M7050</link>
      <description>&lt;P&gt;This is the search I use:&lt;BR /&gt;
sourcetype="Outbound" | head 10000 | rex "(?im)^(?:[^:\n]*:){3}\d+\|\w+\s+\w+\s+\w+\s+(?P&amp;lt;Socket_time&amp;gt;.+)" | top 50 Socket_time&lt;/P&gt;

&lt;P&gt;which works and is able to extract the field: Socket_time&lt;/P&gt;

&lt;P&gt;Correctly extracted data: 0ms (or whatever time is specified)&lt;/P&gt;

&lt;P&gt;However, the moment I identify it as a fieldtype, the extracted data goes all wrong.&lt;BR /&gt;
Extracted: &lt;BR /&gt;
0ms &lt;BR /&gt;
&amp;lt;and the other remaining info from the log is included&amp;gt;, which makes this search return a lot of unique hits.&lt;/P&gt;

&lt;P&gt;Example of one Event:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;2012-03-21 00:00:12.299 - Socket connect|10.53.16.120:5000|Time taken is 2ms &lt;BR /&gt;
2012-03-21 00:00:12.299 - Compress|From 00173 to 00079| Time taken is 0ms &lt;BR /&gt;
2012-03-21 00:00:12.436 Socket send|10.53.16.120|Time taken is 136ms &lt;BR /&gt;
2012-03-21 00:00:12.436 - Send|00079|BQC911CM00314      BQC911  &amp;lt;COMPRESSED&amp;gt; &lt;BR /&gt;
2012-03-21 00:00:12.436 - &amp;gt; Process successfully|Total processing time is 160ms&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;As you can see, I'm just trying to get the 2ms out, but the search is extracting all the way to the end of the event. &lt;/P&gt;

&lt;P&gt;My question to anyone who's willing to help: which regex should I use to ignore everything after '2ms'? &lt;/P&gt;

&lt;P&gt;Thanks!&lt;/P&gt;

&lt;P&gt;EDIT: I ran it through the Field Extractor and was able to produce results:&lt;BR /&gt;
e.g.&lt;BR /&gt;
&amp;lt;fieldname&amp;gt; &amp;lt;count&amp;gt;&lt;BR /&gt;
0ms   12 &lt;BR /&gt;
12ms  21&lt;BR /&gt;
19ms  43&lt;/P&gt;

&lt;P&gt;But when I select it normally as a field in the Search app, this is what shows up:&lt;/P&gt;

&lt;P&gt;Socket_time=0ms2012-03-21 11:16:51.756 DEBUG - BQC911|Compress|From 00173 to 00078|Time taken is 0ms2012-03-21 11:16:51.877 DEBUG - BQC911|Socket send|10.53.16.120|Time taken is 120ms2012-03-21 11:16:51.877 INFO - BQC911|Send|00078|BQC911CM00413 BQC911 &amp;lt;COMPRESSED&amp;gt;2012-03-21 11:16:51.877 INFO - BQC911|Process successfully|Total processing time is 127ms&lt;/P&gt;

&lt;P&gt;Basically, the entire event has been absorbed into this field. &lt;/P&gt;</description>
      <pubDate>Mon, 23 Apr 2012 04:53:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33212#M7050</guid>
      <dc:creator>attgjh1</dc:creator>
      <dc:date>2012-04-23T04:53:06Z</dc:date>
    </item>
    <item>
      <title>Re: regex field extraction question</title>
      <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33213#M7051</link>
      <description>&lt;P&gt;You might be better off &lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Data/Indexmulti-lineevents"&gt;breaking up the log lines into individual events&lt;/A&gt; by setting the SHOULD_LINEMERGE value to "false" in &lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf"&gt;props.conf&lt;/A&gt;.&lt;/P&gt;
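
&lt;P&gt;A minimal props.conf sketch of that setting, assuming the stanza name matches the "Outbound" sourcetype used in the question's search (adjust it to your own sourcetype):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch; stanza name assumed from the search in the question)
[Outbound]
# Do not merge lines into multi-line events, so each log line becomes its own event
SHOULD_LINEMERGE = false
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Line breaking happens at index time, so this should only affect data indexed after the change.&lt;/P&gt;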

&lt;P&gt;Then use a regex like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(?im)Socket connect\|.*\|Time\staken\sis\s(?&amp;lt;socket_connect_time&amp;gt;.+)
&lt;/CODE&gt;&lt;/PRE&gt;
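
&lt;P&gt;One way to save that regex as a search-time extraction is an EXTRACT entry in props.conf. This is only a sketch: the "Outbound" stanza is taken from the question's search, and the class name after "EXTRACT-" is an illustrative choice:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch; stanza and class name are assumptions)
[Outbound]
# Search-time extraction; the named capture group becomes the field name
EXTRACT-socket_connect_time = (?im)Socket connect\|.*\|Time\staken\sis\s(?&amp;lt;socket_connect_time&amp;gt;.+)
&lt;/CODE&gt;&lt;/PRE&gt;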

&lt;P&gt;You could also add well-named field extractions for the other fields:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(?im)Compress\|.*\|Time\staken\sis\s(?&amp;lt;compression_time&amp;gt;.+)
(?im)Socket send\|.*\|Time\staken\sis\s(?&amp;lt;socket_send_time&amp;gt;.+)
(?im)Process successfully\|Total\sprocessing\stime\sis\s(?&amp;lt;total_processing_time&amp;gt;.+)
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 23 Apr 2012 05:20:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33213#M7051</guid>
      <dc:creator>Damien_Dallimor</dc:creator>
      <dc:date>2012-04-23T05:20:43Z</dc:date>
    </item>
    <item>
      <title>Re: regex field extraction question</title>
      <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33214#M7052</link>
      <description>&lt;P&gt;The whole chunk of text is one entire event, that's why it's annoying =/ &lt;BR /&gt;
Wondering if there's any regex that ignores the remaining lines?&lt;/P&gt;</description>
      <pubDate>Mon, 23 Apr 2012 05:35:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33214#M7052</guid>
      <dc:creator>attgjh1</dc:creator>
      <dc:date>2012-04-23T05:35:28Z</dc:date>
    </item>
    <item>
      <title>Re: regex field extraction question</title>
      <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33215#M7053</link>
      <description>&lt;P&gt;Well, if you really want to stick with one single merged event:&lt;/P&gt;

&lt;P&gt;(?im)Socket connect\|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{2,5}\|Time\staken\sis\s(?&amp;lt;socket_connect_time&amp;gt;\d+ms)&lt;/P&gt;</description>
      <pubDate>Mon, 23 Apr 2012 05:59:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33215#M7053</guid>
      <dc:creator>Damien_Dallimor</dc:creator>
      <dc:date>2012-04-23T05:59:03Z</dc:date>
    </item>
    <item>
      <title>Re: regex field extraction question</title>
      <link>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33216#M7054</link>
      <description>&lt;P&gt;Thanks a lot &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Apr 2012 01:50:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/regex-field-extraction-question/m-p/33216#M7054</guid>
      <dc:creator>attgjh1</dc:creator>
      <dc:date>2012-04-24T01:50:00Z</dc:date>
    </item>
  </channel>
</rss>

