<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Trying to extract from similar events in log file using regex in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Trying-to-extract-from-similar-events-in-log-file-using-regex/m-p/426917#M122306</link>
    <description>&lt;P&gt;I have two distinct events in an application log file (see below). The events are multiline and separated by a line of ------------. The fields are not in the same order between the two events. I would like to harmonize the events so that I can report on fields like Timestamp, Message, Category, etc.&lt;/P&gt;

&lt;P&gt;I tried to use the automatic field extractor, but it could not read any field other than timestamp. I also attempted to use the following regex patterns, but the fields do not line up. Is there a better single regex, or a better method using props.conf/transforms.conf, to normalize this data?&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Regex 1:&lt;/EM&gt;&lt;BR /&gt;
&lt;CODE&gt;^\w*: (?&amp;lt;timestamp&amp;gt;.*)\s^\w*: (?&amp;lt;message&amp;gt;.*)\s\w*: (?&amp;lt;category&amp;gt;.*)\s\w*: (?&amp;lt;machine&amp;gt;.*)\s\w*\s\w*: (?&amp;lt;app_domain&amp;gt;.*)\s\w*\s\w*: (?&amp;lt;process_name&amp;gt;.*)\s\w*\s\w*:(?&amp;lt;thread_name&amp;gt;[^-]*$)&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Event 1:&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;  Timestamp: 5/29/2018 8:10:46 AM
    Message: Debugging Logging Path: In HttpClientWrapper going to call HttpClientDI.GetAsync(requestUri) with ***"website URL here"***
    Category: Informational
    Machine: 907998-WEB2
    App Domain: /LM/W3SVC/6/ROOT-1-131720684325430305
    Process Name: c:\windows\system32\inetsrv\w3wp.exe
    Thread Name: 

    ----------------------------------------
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;EM&gt;Regex 2:&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;^\w*:(?&amp;lt;timestamp&amp;gt;[^\n\r]*)\s\w*:(?&amp;lt;message&amp;gt;.*)\n\w*:(?&amp;lt;category&amp;gt;.*)\n\w*:(?&amp;lt;priority&amp;gt;.*)\n\w*:(?&amp;lt;eventid&amp;gt;.*)\n\w*:(?&amp;lt;severity&amp;gt;.*)\n\w*:(?&amp;lt;title&amp;gt;[^\n\r]*)\n\w*:(?&amp;lt;machine&amp;gt;.*)\n\w*\W\w*:(?&amp;lt;app_domain&amp;gt;[^\n\r]*)\s\w*:(?&amp;lt;process_id&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;process_name&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;thread_name&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;win32_threadid&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;Extended_Properties&amp;gt;[^\n\r]*)$&lt;/CODE&gt; &lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Event 2:&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Timestamp: 5/29/2018 7:46:21 AM
Message: Debugging Logging Path: In HttpClientWrapper going to create authentication token
Category: Informational
Priority: -1
EventId: 0
Severity: Information
Title:
Machine: 907998-WEB2
App Domain: /LM/W3SVC/13/ROOT-1-131720715526291397
ProcessId: 5016
Process Name: c:\windows\system32\inetsrv\w3wp.exe
Thread Name: 
Win32 ThreadId:6056
Extended Properties: LogSource - LeadsWebsite
----------------------------------------
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 01 Jun 2018 20:08:55 GMT</pubDate>
    <dc:creator>jd0323fhl</dc:creator>
    <dc:date>2018-06-01T20:08:55Z</dc:date>
    <item>
      <title>Trying to extract from similar events in log file using regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Trying-to-extract-from-similar-events-in-log-file-using-regex/m-p/426917#M122306</link>
      <description>&lt;P&gt;I have two distinct events in an application log file (see below). The events are multiline and separated by a line of ------------. The fields are not in the same order between the two events. I would like to harmonize the events so that I can report on fields like Timestamp, Message, Category, etc.&lt;/P&gt;

&lt;P&gt;I tried to use the automatic field extractor, but it could not read any field other than timestamp. I also attempted to use the following regex patterns, but the fields do not line up. Is there a better single regex, or a better method using props.conf/transforms.conf, to normalize this data?&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Regex 1:&lt;/EM&gt;&lt;BR /&gt;
&lt;CODE&gt;^\w*: (?&amp;lt;timestamp&amp;gt;.*)\s^\w*: (?&amp;lt;message&amp;gt;.*)\s\w*: (?&amp;lt;category&amp;gt;.*)\s\w*: (?&amp;lt;machine&amp;gt;.*)\s\w*\s\w*: (?&amp;lt;app_domain&amp;gt;.*)\s\w*\s\w*: (?&amp;lt;process_name&amp;gt;.*)\s\w*\s\w*:(?&amp;lt;thread_name&amp;gt;[^-]*$)&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Event 1:&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;  Timestamp: 5/29/2018 8:10:46 AM
    Message: Debugging Logging Path: In HttpClientWrapper going to call HttpClientDI.GetAsync(requestUri) with ***"website URL here"***
    Category: Informational
    Machine: 907998-WEB2
    App Domain: /LM/W3SVC/6/ROOT-1-131720684325430305
    Process Name: c:\windows\system32\inetsrv\w3wp.exe
    Thread Name: 

    ----------------------------------------
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;EM&gt;Regex 2:&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;^\w*:(?&amp;lt;timestamp&amp;gt;[^\n\r]*)\s\w*:(?&amp;lt;message&amp;gt;.*)\n\w*:(?&amp;lt;category&amp;gt;.*)\n\w*:(?&amp;lt;priority&amp;gt;.*)\n\w*:(?&amp;lt;eventid&amp;gt;.*)\n\w*:(?&amp;lt;severity&amp;gt;.*)\n\w*:(?&amp;lt;title&amp;gt;[^\n\r]*)\n\w*:(?&amp;lt;machine&amp;gt;.*)\n\w*\W\w*:(?&amp;lt;app_domain&amp;gt;[^\n\r]*)\s\w*:(?&amp;lt;process_id&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;process_name&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;thread_name&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;win32_threadid&amp;gt;[^\n\r]*)\n\w*\W\w*:(?&amp;lt;Extended_Properties&amp;gt;[^\n\r]*)$&lt;/CODE&gt; &lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Event 2:&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Timestamp: 5/29/2018 7:46:21 AM
Message: Debugging Logging Path: In HttpClientWrapper going to create authentication token
Category: Informational
Priority: -1
EventId: 0
Severity: Information
Title:
Machine: 907998-WEB2
App Domain: /LM/W3SVC/13/ROOT-1-131720715526291397
ProcessId: 5016
Process Name: c:\windows\system32\inetsrv\w3wp.exe
Thread Name: 
Win32 ThreadId:6056
Extended Properties: LogSource - LeadsWebsite
----------------------------------------
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 01 Jun 2018 20:08:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Trying-to-extract-from-similar-events-in-log-file-using-regex/m-p/426917#M122306</guid>
      <dc:creator>jd0323fhl</dc:creator>
      <dc:date>2018-06-01T20:08:55Z</dc:date>
    </item>
    <item>
      <title>Re: Trying to extract from similar events in log file using regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Trying-to-extract-from-similar-events-in-log-file-using-regex/m-p/426918#M122307</link>
      <description>&lt;P&gt;I’d suggest using props and transforms to set up individual extractions for each field. That way the order of the fields does not matter.&lt;/P&gt;

&lt;P&gt;The props.conf and transforms.conf settings below (not tested) might even work. The regex takes the part before the first &lt;CODE&gt;:&lt;/CODE&gt; on a line as the key and the part after it (until the end of the line) as the value, and keeps matching that for each line of the event.&lt;/P&gt;
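
&lt;P&gt;To try the key/value idea outside Splunk first, here is a minimal Python sketch run against Event 2 from the question. The pattern, the &lt;CODE&gt;extract_fields&lt;/CODE&gt; helper, and the underscore clean-up of field names are assumptions made for illustration. Python's &lt;CODE&gt;re&lt;/CODE&gt; module is not Splunk's regex engine, so treat this as a way to check the logic of a per-line pattern rather than as proof that a transform behaves the same way.&lt;/P&gt;

```python
import re

# Event 2 from the question, one "Key: value" pair per line.
EVENT = r"""Timestamp: 5/29/2018 7:46:21 AM
Message: Debugging Logging Path: In HttpClientWrapper going to create authentication token
Category: Informational
Priority: -1
EventId: 0
Severity: Information
Title:
Machine: 907998-WEB2
App Domain: /LM/W3SVC/13/ROOT-1-131720715526291397
ProcessId: 5016
Process Name: c:\windows\system32\inetsrv\w3wp.exe
Thread Name:
Win32 ThreadId:6056
Extended Properties: LogSource - LeadsWebsite
----------------------------------------
"""

# Hypothetical pattern: optional indentation, a key that stops at the first
# colon, optional spaces or tabs, then the rest of that same line as the value.
# Neither group can cross a line break, so an empty value stays empty.
PAIR = re.compile(r"^[ \t]*([^:\r\n]+):[ \t]*([^\r\n]*)", re.MULTILINE)


def extract_fields(event):
    """Return a dict of field name to value for one multiline event."""
    fields = {}
    for key, value in PAIR.findall(event):
        # Replace spaces and other non-word characters in the key with "_".
        name = re.sub(r"\W+", "_", key.strip())
        fields[name] = value.strip()
    return fields


if __name__ == "__main__":
    for name, value in extract_fields(EVENT).items():
        print(name, "=", repr(value))
```

&lt;P&gt;Because every line is matched on its own, neither the order of the fields nor fields that appear in only one of the two event formats affect the result.&lt;/P&gt;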

&lt;P&gt;Props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yoursourcetype]
REPORT-extract-my-fields = extractmyfields
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[extractmyfields]
# One "Key: value" pair per line; neither the key nor the value may span lines.
REGEX = (?m)^[ \t]*([^:\r\n]+):[ \t]*([^\r\n]*)
FORMAT = $1::$2
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 02 Jun 2018 01:12:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Trying-to-extract-from-similar-events-in-log-file-using-regex/m-p/426918#M122307</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2018-06-02T01:12:31Z</dc:date>
    </item>
  </channel>
</rss>

