<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Need help with regex in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642398#M222530</link>
    <description>&lt;PRE&gt;PI-job application_1681360813939_33163 MAPREDUCE Thu May 4 04:30:14 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Spark-job application_1681360813939_33167 SPARK Thu May 4 04:31:17 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Spark Python Pi-job application_1681360813939_33169 SPARK Thu May 4 04:31:48 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Distcp job application_1681360813939_33172 MAPREDUCE Thu May 4 04:32:18 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Oozie Job on Vip 0517949-230412214950046-oozie-oozi-W Shell-Action Thu May 4 04:32:18 MST 2023 Wed Dec 31 17:00:00 MST 1969 RUNNING default [Thu May 04 04 cadence2
PI-job application_1681360775209_1286 MAPREDUCE Thu May 4 11:30:15 UTC 2023 Thu May 4 11:30:27 UTC 2023 SUCCEEDED default Fine gcsidle2
Spark-job application_1681360775209_1288 SPARK Thu May 4 11:31:18 UTC 2023 Thu May 4 11:31:24 UTC 2023 SUCCEEDED default Fine gcsidle2
Spark Python Pi-job application_1681360775209_1289 SPARK Thu May 4 11:31:49 UTC 2023 Thu May 4 11:31:57 UTC 2023 SUCCEEDED default Fine gcsidle2
Distcp job application_1681360775209_1290 MAPREDUCE Thu May 4 11:32:19 UTC 2023 Thu May 4 11:32:27 UTC 2023 SUCCEEDED default Fine gcsidle2
Oozie Job on Vip 0002335-230419024434725-oozie-oozi-W Shell-Action Thu May 4 11:32:19 UTC 2023 Thu May 4 11:32:27 UTC 2023 SUCCEEDED default gcsidle2&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you check the field "FinalState", it is only picking up "SUCCEEDED"; other events have UNDEFINED and RUNNING, but it is not picking up those.&lt;/P&gt;</description>
    <pubDate>Fri, 05 May 2023 11:21:47 GMT</pubDate>
    <dc:creator>bmanikya</dc:creator>
    <dc:date>2023-05-05T11:21:47Z</dc:date>
    <item>
      <title>How to extract fields like table below?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642125#M222434</link>
      <description>&lt;P&gt;&lt;SPAN class=""&gt;Distcp&lt;/SPAN&gt; &lt;SPAN class=""&gt;job&lt;/SPAN&gt; &lt;SPAN class=""&gt;&lt;SPAN class=""&gt;application&lt;/SPAN&gt;_1681357021637_0984&lt;/SPAN&gt; &lt;SPAN class=""&gt;MAPREDUCE&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:32&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:40&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;SUCCEEDED&lt;/SPAN&gt; &lt;SPAN class=""&gt;default&lt;/SPAN&gt; &lt;SPAN class=""&gt;Fine&lt;/SPAN&gt; &lt;SPAN class=""&gt;edmse2&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Oozie&lt;/SPAN&gt; &lt;SPAN class=""&gt;Job&lt;/SPAN&gt; &lt;SPAN class=""&gt;on&lt;/SPAN&gt; &lt;SPAN class=""&gt;Vip&lt;/SPAN&gt; &lt;SPAN class=""&gt;0306563-230428030149477-oozie-oozi-W&lt;/SPAN&gt; &lt;SPAN class=""&gt;Shell-Action&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:09&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:17&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;SUCCEEDED&lt;/SPAN&gt; &lt;SPAN class=""&gt;default&lt;/SPAN&gt; &lt;SPAN class=""&gt;nemoqee2&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Spark&lt;/SPAN&gt; &lt;SPAN class=""&gt;Python&lt;/SPAN&gt; &lt;SPAN class=""&gt;Pi-job&lt;/SPAN&gt; &lt;SPAN class=""&gt;application_1681357021637_0983&lt;/SPAN&gt; &lt;SPAN class=""&gt;SPARK&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:02&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;Wed&lt;/SPAN&gt; &lt;SPAN class=""&gt;May&lt;/SPAN&gt; &lt;SPAN class=""&gt;3&lt;/SPAN&gt; &lt;SPAN class=""&gt;04:32:11&lt;/SPAN&gt; &lt;SPAN class=""&gt;MST&lt;/SPAN&gt; &lt;SPAN class=""&gt;2023&lt;/SPAN&gt; &lt;SPAN class=""&gt;SUCCEEDED&lt;/SPAN&gt; &lt;SPAN class=""&gt;default&lt;/SPAN&gt; &lt;SPAN class=""&gt;Fine&lt;/SPAN&gt; &lt;SPAN class=""&gt;edmse2&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;I need to extract fields like those in the table below, since the events do not all have the same format.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Job Succeeded in Nemo-Stage-GLOBAL E2 on lpqecpdb0001556.phx.aexp.com&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Application-Name&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Application-Id&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Application-Type&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Start-Time&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Finish-Time&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Final-State&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Queue&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;&lt;STRONG&gt;Queue Utilization&lt;/STRONG&gt;&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;PI-job&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;application_1678348796091_805329&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;MAPREDUCE&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:30:09 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:30:22 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SUCCEEDED&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;default&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Fine&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;Spark-job&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;application_1678348796091_805342&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SPARK&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:31:10 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:31:17 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SUCCEEDED&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;default&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Fine&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;Spark Python Pi-job&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;application_1678348796091_805345&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SPARK&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:31:41 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:31:49 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SUCCEEDED&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;default&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Fine&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;Distcp job&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;application_1678348796091_805347&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;MAPREDUCE&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:32:10 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:32:18 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SUCCEEDED&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;default&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Fine&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;
&lt;P&gt;Oozie Job on Vip&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;1446459-230327031301376-oozie-oozi-W&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Shell-Action&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:32:10 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;Tue May 2 04:32:18 MST 2023&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;SUCCEEDED&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;
&lt;P&gt;default&lt;/P&gt;
&lt;/TD&gt;
&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Thu, 04 May 2023 13:44:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642125#M222434</guid>
      <dc:creator>bmanikya</dc:creator>
      <dc:date>2023-05-04T13:44:02Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642129#M222435</link>
      <description>&lt;P&gt;What have you tried so far?&amp;nbsp; How did those efforts not fulfill your requirements?&lt;/P&gt;&lt;P&gt;Please review the sample events and output as they appear to be unrelated.&amp;nbsp; The table contains timestamps and application IDs that are not in the events.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 03 May 2023 12:51:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642129#M222435</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2023-05-03T12:51:51Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642130#M222436</link>
      <description>&lt;P&gt;The format of your data examples varies a lot. Writing a pattern for those specific examples would be possible, but that doesn't guarantee that it will work predictably for the rest of your data.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've tested the following pattern on the three given examples:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex field=_raw "(?&amp;lt;ApplicationName&amp;gt;.+)\s(?&amp;lt;ApplicationId&amp;gt;[\w-]+)\s(?&amp;lt;ApplicationType&amp;gt;[\w-]+)\s(?&amp;lt;StartTime&amp;gt;\w{3}\s\w{3}[\d:\s]+[A-Z]+\s\d{4})\s(?&amp;lt;EndTime&amp;gt;\w{3}\s\w{3}[\d:\s]+[A-Z]+\s\d{4})\s(?&amp;lt;FinalState&amp;gt;[A-Z]+)\s(?&amp;lt;Queue&amp;gt;[^\s]+)\s((?&amp;lt;QueueUtilization&amp;gt;[^\s]+)\s)?\w+$"&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;You can see it parsing your examples on regex101:&lt;/P&gt;&lt;P&gt;&lt;A href="https://regex101.com/r/AkNmTb/1" target="_blank" rel="noopener"&gt;https://regex101.com/r/AkNmTb/1&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Apart from&amp;nbsp;&lt;SPAN&gt;the predictability concern, having to cover all those edge cases makes this an inefficient and relatively slow pattern.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 03 May 2023 13:38:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642130#M222436</guid>
      <dc:creator>rut</dc:creator>
      <dc:date>2023-05-03T13:38:56Z</dc:date>
    </item>
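As a rough illustration of how the suggested rex pattern behaves outside Splunk, here is a hedged Python sketch. Python uses a different named-group syntax, so numbered groups stand in for ApplicationName, ApplicationId, and the other field names; this is an approximation of the pattern for experimentation, not the Splunk implementation itself.

```python
import re

# Numbered-group approximation of the suggested rex pattern; group order:
# 1 ApplicationName, 2 ApplicationId, 3 ApplicationType, 4 StartTime,
# 5 EndTime, 6 FinalState, 7 Queue, 9 QueueUtilization (optional).
PATTERN = re.compile(
    r"(.+)\s([\w-]+)\s([\w-]+)"
    r"\s(\w{3}\s\w{3}[\d:\s]+[A-Z]+\s\d{4})"
    r"\s(\w{3}\s\w{3}[\d:\s]+[A-Z]+\s\d{4})"
    r"\s([A-Z]+)\s([^\s]+)\s(([^\s]+)\s)?\w+$"
)

def extract(line):
    """Return a dict of extracted fields, or None when the line does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    keys = ["ApplicationName", "ApplicationId", "ApplicationType",
            "StartTime", "EndTime", "FinalState", "Queue"]
    fields = dict(zip(keys, m.groups()[:7]))
    fields["QueueUtilization"] = m.group(9)  # None for the Oozie rows
    return fields
```

On the Distcp sample row this should yield ApplicationName "Distcp job" and QueueUtilization "Fine"; on the Oozie row the optional group does not participate, so QueueUtilization comes back as None, matching the empty table cell.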
    <item>
      <title>Re: Help with regex- extracting fields like below table?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642248#M222477</link>
      <description>&lt;P&gt;As&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/255848"&gt;@rut&lt;/a&gt;&amp;nbsp;hinted, you need to explicitly break down usable patterns first because only you know how those desired fields are delimited/anchored. &amp;nbsp;If you don't know, your developers would. &amp;nbsp;It's much better to ask them than volunteers who have no intimate knowledge of your set of applications. &amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/213957"&gt;@richgalloway&lt;/a&gt;&amp;nbsp;raised an important question: Do these applications even follow the same log format? &amp;nbsp;If not, no amount of regexing is going to save the day.&lt;/P&gt;&lt;P&gt;To help you get started, I'll take a crack at it by comparing your sample data with sample desired outputs.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Application ID in most (Hadoop-based?) apps has a prefix "application_" followed by numerals and underscores.&lt;/LI&gt;&lt;LI&gt;The above breaks with that Oozie job. For that, the application ID begins with a numeral followed by a no-space string.&lt;/LI&gt;&lt;LI&gt;Application name is whatever comes before application ID.&lt;/LI&gt;&lt;LI&gt;After application ID are two horrible, terrible, very bad, no good, machine-unfriendly timestamps dreadfully conjoined. (They aren't human-friendly, either.)&lt;/LI&gt;&lt;LI&gt;Final state is a no-space string after the two timestamps.&lt;/LI&gt;&lt;LI&gt;Queue name is another no-space string following final state.&lt;/LI&gt;&lt;LI&gt;In most (Hadoop-based?) applications after queue name, there is a no-space string representing queue utilization, followed by yet another no-space string that is to be discarded.&lt;/LI&gt;&lt;LI&gt;One single space is inserted between fields.&lt;/LI&gt;&lt;LI&gt;The above breaks with that Oozie job. &amp;nbsp;Whatever that final non-space string is, it is discarded.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Are the above about right? 
&amp;nbsp;If it is, the safest approach would be to use two separate regex's to handle the two different application types. &amp;nbsp;For example,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex "^(?&amp;lt;Application_name&amp;gt;.+) (?&amp;lt;Application_id&amp;gt;application_\d+\S+) (?&amp;lt;Application_type&amp;gt;\S+) (?&amp;lt;Start_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;End_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;Final_state&amp;gt;\S+) (?&amp;lt;Queue&amp;gt;\S+) (?&amp;lt;Queue_utilization&amp;gt;\S+) \S+$"
| rex "^(?&amp;lt;Application_name&amp;gt;\D+) (?&amp;lt;Application_id&amp;gt;\d+\S+) (?&amp;lt;Application_type&amp;gt;\S+) (?&amp;lt;Start_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;End_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;Final_state&amp;gt;\S+) (?&amp;lt;Queue&amp;gt;\S+) \S+$"
| eval Application_name = if(isnull(Application_name), "Analyze this! " . _raw, Application_name) ``` highlight oddballs ```&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When you have potentially disparate log formats, be very afraid and be narrow. (That is why even though the last no-space string is to be discarded, I choose to match all the way to the end of line and mark any unmatched event as needing attention.) &amp;nbsp;The above further assumes that those "oozie" job names do not contain numerals. &amp;nbsp;If this is not the case, you need some other methods to anchor these elements.&lt;/P&gt;&lt;P&gt;With that, your sample data will give&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Application_id&lt;/TD&gt;&lt;TD&gt;Application_name&lt;/TD&gt;&lt;TD&gt;Application_type&lt;/TD&gt;&lt;TD&gt;End_time&lt;/TD&gt;&lt;TD&gt;Final_state&lt;/TD&gt;&lt;TD&gt;Queue&lt;/TD&gt;&lt;TD&gt;Queue_utilization&lt;/TD&gt;&lt;TD&gt;Start_time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681357021637_0984&lt;/TD&gt;&lt;TD&gt;Distcp job&lt;/TD&gt;&lt;TD&gt;MAPREDUCE&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:40 MST 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:32 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;0306563-230428030149477-oozie-oozi-W&lt;/TD&gt;&lt;TD&gt;Oozie Job on Vip&lt;/TD&gt;&lt;TD&gt;Shell-Action&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:17 MST 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:09 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681357021637_0983&lt;/TD&gt;&lt;TD&gt;Spark Python Pi-job&lt;/TD&gt;&lt;TD&gt;SPARK&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:11 MST 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Wed May 3 04:32:02 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Thu, 04 May 2023 09:24:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642248#M222477</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2023-05-04T09:24:00Z</dc:date>
    </item>
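The two-pass approach above (try the Hadoop-style ID pattern first, fall back to the numeric Oozie-style one, and flag anything that matches neither) can be sketched roughly in Python. The day and month alternations are collapsed to \w{3} for brevity, so this illustrates the control flow rather than the exact patterns from the post.

```python
import re

# Simplified stand-ins for the two rex patterns: HADOOP expects a
# Hadoop-style "application_..." ID, OOZIE a numeric Oozie-style ID.
TS = r"\w{3} \w{3} +\d+ (?:\d+:){2}\d+ \S+ \d+"
HADOOP = re.compile(
    r"^(.+) (application_\d+\S+) (\S+) "
    r"(" + TS + r") (" + TS + r") "
    r"(\S+) (\S+) (\S+) \S+$"
)
OOZIE = re.compile(
    r"^(\D+) (\d+\S+) (\S+) "
    r"(" + TS + r") (" + TS + r") "
    r"(\S+) (\S+) \S+$"
)

def parse(line):
    m = HADOOP.match(line)
    if m:
        name, app_id, app_type, start, end, state, queue, util = m.groups()
        return dict(Application_name=name, Application_id=app_id,
                    Application_type=app_type, Start_time=start, End_time=end,
                    Final_state=state, Queue=queue, Queue_utilization=util)
    m = OOZIE.match(line)
    if m:
        name, app_id, app_type, start, end, state, queue = m.groups()
        return dict(Application_name=name, Application_id=app_id,
                    Application_type=app_type, Start_time=start, End_time=end,
                    Final_state=state, Queue=queue)
    # Mirror the defensive eval from the post: flag events neither pattern handles.
    return dict(Application_name="Analyze this! " + line)
```

The fallback ordering matters: the Oozie pattern's leading non-digit run would also consume parts of Hadoop-style names, so the stricter pattern is tried first, and anything unmatched is tagged for manual review rather than silently dropped.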
    <item>
      <title>Re: Help with regex- extracting fields like below table?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642251#M222479</link>
      <description>&lt;P&gt;+1 on that. If this is your in-house developed application, do put pressure on the dev team to be consistent about logging. I know that there are some things that are, and will always be, free-form text, but some of the common fields should be structured, even if some of them will be blank in some cases. It greatly improves the handling of such logs.&lt;/P&gt;</description>
      <pubDate>Thu, 04 May 2023 09:30:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642251#M222479</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2023-05-04T09:30:00Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642398#M222530</link>
      <description>&lt;PRE&gt;PI-job application_1681360813939_33163 MAPREDUCE Thu May 4 04:30:14 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Spark-job application_1681360813939_33167 SPARK Thu May 4 04:31:17 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Spark Python Pi-job application_1681360813939_33169 SPARK Thu May 4 04:31:48 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Distcp job application_1681360813939_33172 MAPREDUCE Thu May 4 04:32:18 MST 2023 Wed Dec 31 17:00:00 MST 1969 UNDEFINED default [Thu May 04 04 Exceeded cadence2
Oozie Job on Vip 0517949-230412214950046-oozie-oozi-W Shell-Action Thu May 4 04:32:18 MST 2023 Wed Dec 31 17:00:00 MST 1969 RUNNING default [Thu May 04 04 cadence2
PI-job application_1681360775209_1286 MAPREDUCE Thu May 4 11:30:15 UTC 2023 Thu May 4 11:30:27 UTC 2023 SUCCEEDED default Fine gcsidle2
Spark-job application_1681360775209_1288 SPARK Thu May 4 11:31:18 UTC 2023 Thu May 4 11:31:24 UTC 2023 SUCCEEDED default Fine gcsidle2
Spark Python Pi-job application_1681360775209_1289 SPARK Thu May 4 11:31:49 UTC 2023 Thu May 4 11:31:57 UTC 2023 SUCCEEDED default Fine gcsidle2
Distcp job application_1681360775209_1290 MAPREDUCE Thu May 4 11:32:19 UTC 2023 Thu May 4 11:32:27 UTC 2023 SUCCEEDED default Fine gcsidle2
Oozie Job on Vip 0002335-230419024434725-oozie-oozi-W Shell-Action Thu May 4 11:32:19 UTC 2023 Thu May 4 11:32:27 UTC 2023 SUCCEEDED default gcsidle2&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you check the field "FinalState", it is only picking up "SUCCEEDED"; other events have UNDEFINED and RUNNING, but it is not picking up those.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2023 11:21:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642398#M222530</guid>
      <dc:creator>bmanikya</dc:creator>
      <dc:date>2023-05-05T11:21:47Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with regex</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642456#M222547</link>
      <description>&lt;P&gt;As I predicted previously, a little defensive coding goes a long way in the face of such bad formatting. &amp;nbsp;Be specific rather than aggressive. &amp;nbsp;The dangling partial timestamp after the queue name is the only thing throwing off my previous solution. &amp;nbsp;As &lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/231884"&gt;@PickleRick&lt;/a&gt;&amp;nbsp;noted, there is no generic solution for bad logging. &amp;nbsp;Advocating for a better format is important.&lt;/P&gt;&lt;P&gt;The following addition handles all the variants you have posted so far. &amp;nbsp;If there are any other rule breakers, the last line will catch them.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex "^(?&amp;lt;Application_name&amp;gt;.+) (?&amp;lt;Application_id&amp;gt;application_\d+\S+) (?&amp;lt;Application_type&amp;gt;\S+) (?&amp;lt;Start_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;End_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;Final_state&amp;gt;\S+) (?&amp;lt;Queue&amp;gt;\S+)(\s+\[(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)( +\d+){2}){0,1} (?&amp;lt;Queue_utilization&amp;gt;\S+) \S+$"
| rex "^(?&amp;lt;Application_name&amp;gt;\D+) (?&amp;lt;Application_id&amp;gt;\d+\S+) (?&amp;lt;Application_type&amp;gt;\S+) (?&amp;lt;Start_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;End_time&amp;gt;(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d+ (\d+:){2}\d+ \S+ \d+) (?&amp;lt;Final_state&amp;gt;\S+) (?&amp;lt;Queue&amp;gt;\S+)(\s+\[(Sun|Mon|Tue|Wed|Thu|Fri|Sat) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)( +\d+){2}){0,1} \S+$"
| eval Application_name = if(isnull(Application_name), "Analyze this! " . _raw, Application_name) ``` highlight oddballs ```&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;Your samples yield the following:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Application_id&lt;/TD&gt;&lt;TD&gt;Application_name&lt;/TD&gt;&lt;TD&gt;Application_type&lt;/TD&gt;&lt;TD&gt;End_time&lt;/TD&gt;&lt;TD&gt;Final_state&lt;/TD&gt;&lt;TD&gt;Queue&lt;/TD&gt;&lt;TD&gt;Queue_utilization&lt;/TD&gt;&lt;TD&gt;Start_time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360813939_33163&lt;/TD&gt;&lt;TD&gt;PI-job&lt;/TD&gt;&lt;TD&gt;MAPREDUCE&lt;/TD&gt;&lt;TD&gt;Wed Dec 31 17:00:00 MST 1969&lt;/TD&gt;&lt;TD&gt;UNDEFINED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Exceeded&lt;/TD&gt;&lt;TD&gt;Thu May 4 04:30:14 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360813939_33167&lt;/TD&gt;&lt;TD&gt;Spark-job&lt;/TD&gt;&lt;TD&gt;SPARK&lt;/TD&gt;&lt;TD&gt;Wed Dec 31 17:00:00 MST 1969&lt;/TD&gt;&lt;TD&gt;UNDEFINED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Exceeded&lt;/TD&gt;&lt;TD&gt;Thu May 4 04:31:17 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360813939_33169&lt;/TD&gt;&lt;TD&gt;Spark Python Pi-job&lt;/TD&gt;&lt;TD&gt;SPARK&lt;/TD&gt;&lt;TD&gt;Wed Dec 31 17:00:00 MST 1969&lt;/TD&gt;&lt;TD&gt;UNDEFINED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Exceeded&lt;/TD&gt;&lt;TD&gt;Thu May 4 04:31:48 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360813939_33172&lt;/TD&gt;&lt;TD&gt;Distcp job&lt;/TD&gt;&lt;TD&gt;MAPREDUCE&lt;/TD&gt;&lt;TD&gt;Wed Dec 31 17:00:00 MST 1969&lt;/TD&gt;&lt;TD&gt;UNDEFINED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Exceeded&lt;/TD&gt;&lt;TD&gt;Thu May 4 04:32:18 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;0517949-230412214950046-oozie-oozi-W&lt;/TD&gt;&lt;TD&gt;Oozie Job on Vip&lt;/TD&gt;&lt;TD&gt;Shell-Action&lt;/TD&gt;&lt;TD&gt;Wed Dec 31 17:00:00 MST 
1969&lt;/TD&gt;&lt;TD&gt;RUNNING&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;Thu May 4 04:32:18 MST 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360775209_1286&lt;/TD&gt;&lt;TD&gt;PI-job&lt;/TD&gt;&lt;TD&gt;MAPREDUCE&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:30:27 UTC 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:30:15 UTC 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360775209_1288&lt;/TD&gt;&lt;TD&gt;Spark-job&lt;/TD&gt;&lt;TD&gt;SPARK&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:31:24 UTC 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:31:18 UTC 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360775209_1289&lt;/TD&gt;&lt;TD&gt;Spark Python Pi-job&lt;/TD&gt;&lt;TD&gt;SPARK&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:31:57 UTC 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:31:49 UTC 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;application_1681360775209_1290&lt;/TD&gt;&lt;TD&gt;Distcp job&lt;/TD&gt;&lt;TD&gt;MAPREDUCE&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:32:27 UTC 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;Fine&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:32:19 UTC 2023&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;0002335-230419024434725-oozie-oozi-W&lt;/TD&gt;&lt;TD&gt;Oozie Job on Vip&lt;/TD&gt;&lt;TD&gt;Shell-Action&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:32:27 UTC 2023&lt;/TD&gt;&lt;TD&gt;SUCCEEDED&lt;/TD&gt;&lt;TD&gt;default&lt;/TD&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;Thu May 4 11:32:19 UTC 2023&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Sat, 06 May 2023 03:08:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-like-table-below/m-p/642456#M222547</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2023-05-06T03:08:51Z</dc:date>
    </item>
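The defensive addition above (an optional bracketed partial timestamp such as "[Thu May 04 04" allowed between the queue and the trailing fields) can likewise be sketched in simplified Python form. Day and month alternations are again collapsed to \w{3}, so the names and structure here are illustrative rather than the exact Splunk rex.

```python
import re

# Simplified sketch of the fixed pattern: after the queue field, an optional
# dangling partial timestamp ("[Thu May 04 04") may appear before the
# queue-utilization and trailing fields.
TS = r"\w{3} \w{3} +\d+ (?:\d+:){2}\d+ \S+ \d+"
DANGLING = r"(?: +\[\w{3} \w{3}(?: +\d+){2})?"
FIXED = re.compile(
    r"^(.+) (application_\d+\S+) (\S+) "
    r"(" + TS + r") (" + TS + r") "
    r"(\S+) (\S+)" + DANGLING + r" (\S+) \S+$"
)

def state_and_utilization(line):
    """Return (Final_state, Queue_utilization) or None when the line does not match."""
    m = FIXED.match(line)
    if m is None:
        return None
    return m.group(6), m.group(8)
```

Because the dangling fragment is matched by a non-capturing optional group, the UNDEFINED events with the 1969 epoch finish time now parse the same way as the SUCCEEDED ones, which is what resolves the complaint that only SUCCEEDED was being extracted.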
  </channel>
</rss>

