<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracted fields issues in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27456#M5330</link>
    <description>&lt;P&gt;Thanks for your reply&lt;/P&gt;

&lt;P&gt;Actually, I messed up explaining the issue. I agree that I'm looking for the 5th item. My problem is that Splunk sometimes picks the 6th item instead of the 5th, for no apparent reason. Note the second line I pasted, which says LABEL=monitord: (the process name, not the hostname). The other lines in my example are fine.&lt;/P&gt;</description>
    <pubDate>Tue, 10 Aug 2010 22:20:57 GMT</pubDate>
    <dc:creator>wleroy</dc:creator>
    <dc:date>2010-08-10T22:20:57Z</dc:date>
    <item>
      <title>Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27454#M5328</link>
      <description>&lt;P&gt;I'm experiencing weird issues with extracted fields: I have a custom field that basically gets the hostname (marked with asterisks below), which is the 4th item of each log line:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Aug 10 09:42:54 172.31.55.1 **sables-garnier** monitord: RPC call failed: INTERFACE_get_link_state, aborting current process pid 164 : monitord
LABEL=monitord:

Aug  9 19:35:19 172.31.14.1 **talmont-port** monitord: RPC call failed: INTERFACE_get_link_state, aborting current process pid 158 : monitord 
LABEL=talmont-port  

Aug  9 16:25:04 172.31.38.1 **sables-olona** monitord: RPC call failed: INTERFACE_get_link_state, aborting current process pid 158 : monitord 
LABEL=sables-olona
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm using this regexp: (?i)^(?:[^ ]* ){5}(?P&amp;lt;LABEL&amp;gt;[^ ]+)&lt;/P&gt;

&lt;P&gt;Now, why does Splunk show the fifth item (the process name) as the label in the first extract above?
I'm using Splunk 4.1.4 (82143), by the way.&lt;/P&gt;

&lt;P&gt;Any help appreciated&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 10 Aug 2010 20:02:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27454#M5328</guid>
      <dc:creator>wleroy</dc:creator>
      <dc:date>2010-08-10T20:02:37Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27455#M5329</link>
      <description>&lt;P&gt;It's because of your date format.
If you look at your date format, the month and the day are separated in the string by whitespace.&lt;/P&gt;

&lt;P&gt;Splunk counts the different values in the string as different items. So in your case:
1st: Aug
2nd: 10
3rd: 09:42:54
4th: 172.31.55.1
5th: &lt;STRONG&gt;sables-garnier&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;By the way, is the date normalization working for you in this case?&lt;/P&gt;

&lt;P&gt;Cheers,&lt;/P&gt;

&lt;P&gt;Christian&lt;/P&gt;</description>
      <pubDate>Tue, 10 Aug 2010 21:02:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27455#M5329</guid>
      <dc:creator>simuvid</dc:creator>
      <dc:date>2010-08-10T21:02:32Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27456#M5330</link>
      <description>&lt;P&gt;Thanks for your reply&lt;/P&gt;

&lt;P&gt;Actually, I messed up explaining the issue. I agree that I'm looking for the 5th item. My problem is that Splunk sometimes picks the 6th item instead of the 5th, for no apparent reason. Note the second line I pasted, which says LABEL=monitord: (the process name, not the hostname). The other lines in my example are fine.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Aug 2010 22:20:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27456#M5330</guid>
      <dc:creator>wleroy</dc:creator>
      <dc:date>2010-08-10T22:20:57Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27457#M5331</link>
      <description>&lt;P&gt;Hi, sorry, my fault: I misunderstood you here!&lt;/P&gt;

&lt;P&gt;Why don't you try another regex to match your host pattern?&lt;BR /&gt;
Like: *&lt;EM&gt;[a-z]{0,}-[a-z]{0,}*&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Once again, I think it is also caused by your date format.&lt;/P&gt;

&lt;P&gt;Give it a try.&lt;/P&gt;

&lt;P&gt;Hope that helps!&lt;/P&gt;

&lt;P&gt;Cheers,&lt;/P&gt;

&lt;P&gt;Christian&lt;/P&gt;</description>
      <pubDate>Wed, 11 Aug 2010 14:46:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27457#M5331</guid>
      <dc:creator>simuvid</dc:creator>
      <dc:date>2010-08-11T14:46:03Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27458#M5332</link>
      <description>&lt;P&gt;&lt;CODE&gt;\*\*[a-z]{0,}-[a-z]{0,}\*\*&lt;/CODE&gt;&lt;BR /&gt;
Sorry, HTML changed the slashes.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Aug 2010 14:46:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27458#M5332</guid>
      <dc:creator>simuvid</dc:creator>
      <dc:date>2010-08-11T14:46:46Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27459#M5333</link>
      <description>&lt;P&gt;OK, no problem. I'd like to give your regexp a try, but the asterisks were put there by me to highlight the hostname. I'm no regexp expert, and unfortunately some of the hosts don't have a dash in their hostname, so I'm kind of stuck here. It seems the only way to extract the hostname is to pick out the 5th item of the line, no matter how many digits the day number has, just like "awk '{ print $5 }'" would do in a shell.&lt;/P&gt;

&lt;P&gt;Thanks anyway for your time. I'm going to see how I can modify the date format and/or my regexp.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Aug 2010 18:27:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27459#M5333</guid>
      <dc:creator>wleroy</dc:creator>
      <dc:date>2010-08-11T18:27:06Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27460#M5334</link>
      <description>&lt;P&gt;If I'm not mistaken, the regexp Splunk generated from my input means "extract the word located after the 4th space of the whole string", while I need "extract the 5th word of the whole string", or "extract the word located after the 4th group of one or more consecutive spaces". But after many tries with online regexp tools, I still can't translate that into regexp syntax. (Banging my head on the keyboard.)&lt;/P&gt;</description>
      <pubDate>Wed, 11 Aug 2010 19:01:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27460#M5334</guid>
      <dc:creator>wleroy</dc:creator>
      <dc:date>2010-08-11T19:01:57Z</dc:date>
    </item>
    <item>
      <title>Re: Extracted fields issues</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27461#M5335</link>
      <description>&lt;P&gt;Your regex is designed to match the 6th non-whitespace item in the string, but the first five items are allowed to be degenerate (because of the * after the [^ ]). Occasionally it will find what appears to be the fifth word, because it will accept as the second item the empty string between the two spaces between the month and the day.&lt;/P&gt;

&lt;P&gt;You want a regex like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;^(?:\S+\s+){4}(?&amp;lt;process&amp;gt;\S+)
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 25 Aug 2010 00:01:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracted-fields-issues/m-p/27461#M5335</guid>
      <dc:creator>Stephen_Sorkin</dc:creator>
      <dc:date>2010-08-25T00:01:18Z</dc:date>
    </item>
  </channel>
</rss>

