<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Conditional Rex Extraction with multiple extractions in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300871#M90579</link>
    <description>&lt;P&gt;I have events with large strings of text being output per event&lt;/P&gt;

&lt;P&gt;Sample Text:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"userDetails":{"uuid": "Lots of different values and fields" ,"offlineString":"firstName:NAME|lastName:NAME|OTHER FIELDS","much more info",\"subscriberFirstName\":\"NAME\",\"subscriberLastName\":\"NAME\","tons more data"}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A more general explanation of our structure is:  (FirstName......FirstName.....SubscriberName.....FirstName....FirstName.....FirstName.....FirstName....SubscriberName)&lt;/P&gt;

&lt;P&gt;There can be any number of FirstName between each subscriberName, and there can be any number of subscriberName in a single Splunk event. We've identified the cause of this: a lack of proper line-break settings in our props.conf, which ends up causing multiple JSON events to be joined together in Splunk.&lt;/P&gt;

&lt;P&gt;As a result, I'm trying to use regex solutions as a workaround while these events are connected together. For the problem I'm tackling right now, I'm trying to find the count/percentage of errors, where an error is any case in which FirstName and subscriberFirstName aren't equal. Once the fields are extracted, I can compare them for equality and see what ratio of events is throwing this error, or at least that's my thought process.&lt;/P&gt;

&lt;P&gt;I believe the following two rex commands should capture the fields properly:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;rex "\"firstName:(?&amp;lt;firstName&amp;gt;.*?)\|"

rex "\"subscriberFirstName\\\":\\\"(?&amp;lt;subscriberFirstName&amp;gt;.*?)\\\""
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The second rex doesn't work properly but does work when I put it into an online regex tool. The first one captures name properly.&lt;/P&gt;

&lt;P&gt;I'm trying to find a rex that can capture both first names that exist. I can write the individual rex extractions for each field, but I want to get it as a pair - and I only want subscriberfirstname IF firstname is prior to it. &lt;/P&gt;

&lt;P&gt;The challenge is that events can have multiple firstname, but every subscriberfirstname has a prior firstname. So while I can capture each one separately, is there a way to capture both together but as separate fields?&lt;/P&gt;</description>
    <pubDate>Wed, 10 Jan 2018 16:33:59 GMT</pubDate>
    <dc:creator>brajaram</dc:creator>
    <dc:date>2018-01-10T16:33:59Z</dc:date>
    <item>
      <title>Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300871#M90579</link>
      <description>&lt;P&gt;I have events with large strings of text being output per event&lt;/P&gt;

&lt;P&gt;Sample Text:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"userDetails":{"uuid": "Lots of different values and fields" ,"offlineString":"firstName:NAME|lastName:NAME|OTHER FIELDS","much more info",\"subscriberFirstName\":\"NAME\",\"subscriberLastName\":\"NAME\","tons more data"}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A more general explanation of our structure is:  (FirstName......FirstName.....SubscriberName.....FirstName....FirstName.....FirstName.....FirstName....SubscriberName)&lt;/P&gt;

&lt;P&gt;There can be any number of FirstName between each subscriberName, and there can be any number of subscriberName in a single Splunk event. We've identified the cause of this: a lack of proper line-break settings in our props.conf, which ends up causing multiple JSON events to be joined together in Splunk.&lt;/P&gt;

&lt;P&gt;As a result, I'm trying to use regex solutions as a workaround while these events are connected together. For the problem I'm tackling right now, I'm trying to find the count/percentage of errors, where an error is any case in which FirstName and subscriberFirstName aren't equal. Once the fields are extracted, I can compare them for equality and see what ratio of events is throwing this error, or at least that's my thought process.&lt;/P&gt;

&lt;P&gt;I believe the following two rex commands should capture the fields properly:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;rex "\"firstName:(?&amp;lt;firstName&amp;gt;.*?)\|"

rex "\"subscriberFirstName\\\":\\\"(?&amp;lt;subscriberFirstName&amp;gt;.*?)\\\""
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The second rex doesn't work properly but does work when I put it into an online regex tool. The first one captures name properly.&lt;/P&gt;

&lt;P&gt;I'm trying to find a rex that can capture both first names that exist. I can write the individual rex extractions for each field, but I want to get it as a pair - and I only want subscriberfirstname IF firstname is prior to it. &lt;/P&gt;

&lt;P&gt;The challenge is that events can have multiple firstname, but every subscriberfirstname has a prior firstname. So while I can capture each one separately, is there a way to capture both together but as separate fields?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 16:33:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300871#M90579</guid>
      <dc:creator>brajaram</dc:creator>
      <dc:date>2018-01-10T16:33:59Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300872#M90580</link>
      <description>&lt;P&gt;Your regexes get gross in &lt;CODE&gt;rex&lt;/CODE&gt;, because of slashies, but they are doable.  I'm not sure what you mean by "capture both together but in different fields", but this will capture both (when they exist), and put both values in one field &lt;EM&gt;after&lt;/EM&gt; &lt;CODE&gt;rex&lt;/CODE&gt;, in SPL:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval _raw="{\"userDetails\":{\"uuid\": \"Lots of different values and fields\" ,\"offlineString\":\"firstName:NAME|lastName:NAME|OTHER FIELDS\",\"much more info\",\\\"subscriberFirstName\\\":\\\"NAME\\\",\\\"subscriberLastName\\\":\\\"NAME\\\",\"tons more data\"}"
| rex "firstName:(?&amp;lt;firstName&amp;gt;[^|]+)"
| rex "subscriberFirstName\\\\\":\\\\\"(?&amp;lt;subscriberFirstName&amp;gt;[^\\\]+)"
| eval firstNames=mvappend(firstName, subscriberFirstName)
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Jan 2018 16:55:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300872#M90580</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-10T16:55:36Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300873#M90581</link>
      <description>&lt;P&gt;hey @brajaram&lt;/P&gt;

&lt;P&gt;Try this regex:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| rex field=_raw "firstName:(?P&amp;lt;firstName&amp;gt;[^|]+).*subscriberFirstName\\\\\":\\\\\"(?&amp;lt;subscriberFirstName&amp;gt;[^\\\"]+)\\\\"
&lt;/CODE&gt;&lt;/PRE&gt;
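
&lt;P&gt;If a single event holds several firstName/subscriberFirstName pairs, one possible extension (a rough sketch, not tested against your data) is to run the same pattern with &lt;CODE&gt;max_match=0&lt;/CODE&gt; and a lazy &lt;CODE&gt;.*?&lt;/CODE&gt;, then zip the two multivalue fields and compare each pair. The field names &lt;CODE&gt;pair&lt;/CODE&gt;, &lt;CODE&gt;parts&lt;/CODE&gt;, &lt;CODE&gt;status&lt;/CODE&gt;, &lt;CODE&gt;total&lt;/CODE&gt;, &lt;CODE&gt;mismatches&lt;/CODE&gt; and &lt;CODE&gt;mismatch_pct&lt;/CODE&gt; are made up for illustration, and the sketch assumes names never contain &lt;CODE&gt;::&lt;/CODE&gt;. Each match pairs a subscriberFirstName with the first firstName seen since the previous match, which may or may not be the one you want. If the joined events contain newlines, the pattern may also need a leading &lt;CODE&gt;(?s)&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;YourBaseSearch&amp;gt;
| rex max_match=0 "firstName:(?&amp;lt;firstName&amp;gt;[^|]+).*?subscriberFirstName\\\\\":\\\\\"(?&amp;lt;subscriberFirstName&amp;gt;[^\\\\\"]+)"
| eval pair=mvzip(firstName, subscriberFirstName, "::")
| where isnotnull(pair)
| mvexpand pair
| eval parts=split(pair, "::")
| eval status=if(mvindex(parts, 0)==mvindex(parts, 1), "match", "mismatch")
| stats count AS total, count(eval(status=="mismatch")) AS mismatches
| eval mismatch_pct=round(100 * mismatches / total, 2)
&lt;/CODE&gt;&lt;/PRE&gt;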

&lt;P&gt;Let me know if this helps!&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 16:56:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300873#M90581</guid>
      <dc:creator>mayurr98</dc:creator>
      <dc:date>2018-01-10T16:56:21Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300874#M90582</link>
      <description>&lt;P&gt;@brajaram, there might be an easier way to extract the fields, since your data seems to be JSON. However, since you have changed the data rather than just anonymizing it, we cannot confirm whether &lt;CODE&gt;spath&lt;/CODE&gt; will be applicable for extracting fields from it.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;YourBaseSearch&amp;gt;
| eval _raw=replace(_raw,"\\\\\"","\"")
| spath
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:17:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300874#M90582</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2018-01-10T17:17:12Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300875#M90583</link>
      <description>&lt;P&gt;That worked great, thanks! Still need to modify it more to make it work properly but it definitely has been helpful - the messy nature of our logs means any answer will only be a start, but this is exactly what I was looking for.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:20:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300875#M90583</guid>
      <dc:creator>brajaram</dc:creator>
      <dc:date>2018-01-10T17:20:02Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300876#M90584</link>
      <description>&lt;P&gt;So normally SPATH would be a good idea. However, our logs are currently in a problematic state where we have multiple json events connected together in a single splunk event, making it much more difficult.  We've identified the issue - a lack of proper linebreak definitions in props.conf, and we're currently working on setting up proper line breaks to split the events up, but in the meantime I'm trying to use regex solutions as a workaround while that occurs.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:23:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300876#M90584</guid>
      <dc:creator>brajaram</dc:creator>
      <dc:date>2018-01-10T17:23:04Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300877#M90585</link>
      <description>&lt;P&gt;Do you want this to capture &lt;CODE&gt;firstName&lt;/CODE&gt; only if &lt;CODE&gt;subscriberFirstName&lt;/CODE&gt; also exists?  It seemed to me you wanted to capture &lt;CODE&gt;firstName&lt;/CODE&gt; always, and &lt;CODE&gt;subscriberFirstName&lt;/CODE&gt; when available.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:32:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300877#M90585</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-10T17:32:34Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300878#M90586</link>
      <description>&lt;P&gt;Sure whatever works... Seems like you have found your workaround until then &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:44:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300878#M90586</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2018-01-10T17:44:13Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300879#M90587</link>
      <description>&lt;P&gt;What I was looking for is the firstname::subscriber firstname pairs, which I am able to get from that query. I can filter the initial search to always have the subscriberName show in events.&lt;/P&gt;

&lt;P&gt;Events can be structured very oddly. An event can have a structure like: "First Name...First Name...Subscriber Name...First Name...First Name...Subscriber Name". There can never be a subscriber name without a first name, but the inverse is possible. What I wanted to do is pull out the pairs of values, then compare each specific pair for equality and generate statistics from that (inequality is an error, and we want to track it). Your modification has been a huge first step in that direction for us.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:44:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300879#M90587</guid>
      <dc:creator>brajaram</dc:creator>
      <dc:date>2018-01-10T17:44:47Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300880#M90588</link>
      <description>&lt;P&gt;So the way our logs are structured is: ( FirstName......FirstName.....SubscriberName.....FirstName....FirstName.....FirstName.....FirstName....SubscriberName)&lt;/P&gt;

&lt;P&gt;There can be any number of FirstName between each subscriberName, and there can be any number of subscriberName in a single Splunk event. We've identified the cause of this: a lack of proper line-break settings in our props.conf, which ends up causing multiple JSON events to be joined together in Splunk.&lt;/P&gt;

&lt;P&gt;As a result, I'm trying to use regex solutions as a workaround while these events are connected together. For the problem I'm tackling right now, I'm trying to find the count/percentage of errors, where an error is any case in which FirstName and subscriberFirstName aren't equal. Once the fields are extracted, I can compare them for equality and see what ratio of events is throwing this error, or at least that's my thought process.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:51:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300880#M90588</guid>
      <dc:creator>brajaram</dc:creator>
      <dc:date>2018-01-10T17:51:12Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300881#M90589</link>
      <description>&lt;P&gt;Excellent description.  You might consider adding it to the question so that others have an easy time determining what the solution means, and why it was needed.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 20:09:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300881#M90589</guid>
      <dc:creator>micahkemp</dc:creator>
      <dc:date>2018-01-10T20:09:54Z</dc:date>
    </item>
    <item>
      <title>Re: Conditional Rex Extraction with multiple extractions</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300882#M90590</link>
      <description>&lt;P&gt;Here. &lt;CODE&gt;_&lt;/CODE&gt; fields are a little tricky so I would &lt;CODE&gt;eval&lt;/CODE&gt;/&lt;CODE&gt;rename&lt;/CODE&gt; them like I did here. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex
| eval no_referrer_regex="MYREGEX1"
| eval referrer_regex="MYREGEX2"
| eval regex=if(_time &amp;lt; 1579250700, no_referrer_regex, referrer_regex)
| eval raw=_raw
| map maxsearches=10000 search="| makeresults | eval mapped_raw=\"$$raw$$\" | rex field=mapped_raw \"$$regex$$\""
| table pst pst_epoch id action path num desc browser referrer
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A second approach would just be to use ad-hoc searches in SimpleXML to set token values.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Jan 2020 22:30:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Conditional-Rex-Extraction-with-multiple-extractions/m-p/300882#M90590</guid>
      <dc:creator>nick405060</dc:creator>
      <dc:date>2020-01-17T22:30:23Z</dc:date>
    </item>
  </channel>
</rss>

