Splunk Search

Extracting field based on prior field extracted

raby1996
Path Finder

Hi all,

So I have data that looks something like this, where each event contains somewhat historical data and has multiple fields that are similar to each other.

**Event 1**
Serial: xxxxxxxxx
BU1 - 84.5xx.x
#############################
Serial: xxxxxxxxx
BU2 - 83.5xx.x
#############################
Serial: xxxxxxxxx
BU3 - 83.6xx.x
#############################
Serial: xxxxxxxxx
BU4 - 85.xxx.x
#############################

Basically, I'm running a rex command that extracts all the BUs and displays the largest value within the event as largest_BU (i.e. a rex extraction followed by taking the max of those values). However, I can't use that same logic to extract the serial, so I was thinking I could correlate the serial extraction to the BU, because ultimately I want to keep the largest BU and its corresponding serial together in separate fields. Is this possible, or might there be another way to approach this?
Thank you for all the help, and let me know if more information is needed.

1 Solution

somesoni2
Revered Legend

Try something like this

| search "relevant info"
 | rex max_match=0 "(?:\n|.)\s+(?<Bu>(?:3[6]+\.\d+\.\d+\.\d+))"  
 | rex max_match=0 "(?:\n|.)\s+(?<Serial>(\s+\d+\-\d+\S\S+))" 
 | eval largest_BU=max(Bu)
 | eval temp=mvzip(Bu,Serial,"##") 
 | eval SerialForLargest_BU=replace(mvfilter(match(temp,largest_BU."##")),"(.*)##.*","\1")

Update:
I tested the above, and the regular expression argument to match() doesn't take a field as input. So try something like this instead:

| search "relevant info"
     | rex max_match=0 "(?:\n|.)\s+(?<Bu>(?:3[6]+\.\d+\.\d+\.\d+))"  
     | rex max_match=0 "(?:\n|.)\s+(?<Serial>(\s+\d+\-\d+\S\S+))" 
     | eval temp=mvzip(Bu,Serial,"##") 
     | eval temp=mvsort(temp)
     | eval temp=mvindex(temp,-1)
     | rex field=temp "(?<largest_BU>.*)##(?<SerialForLargestBU>.*)" 
     | fields - temp
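The mvzip/mvsort/mvindex trick above can be sketched outside Splunk. This Python analogue (the BU and serial values are invented for illustration) shows how zipping the two multivalue fields into one, sorting it, and taking the last element keeps each serial attached to its BU. Note that mvsort, like Python's string sort here, compares lexicographically, so this picks the true maximum only when the version strings have a uniform shape.

```python
# Python analogue of the SPL pipeline above (sample values are invented).
bu = ["84.5.1", "83.5.2", "83.6.0", "85.1.3"]              # rex-extracted Bu values
serial = ["11-1111A", "22-2222B", "33-3333C", "44-4444D"]  # matching Serial values

# eval temp=mvzip(Bu, Serial, "##")
temp = [f"{b}##{s}" for b, s in zip(bu, serial)]

# eval temp=mvsort(temp)  -- lexicographic string sort, like mvsort
temp.sort()

# eval temp=mvindex(temp, -1)  -- last element is the "largest"
largest = temp[-1]

# rex field=temp "(?<largest_BU>.*)##(?<SerialForLargestBU>.*)"
largest_bu, serial_for_largest_bu = largest.split("##", 1)
print(largest_bu, serial_for_largest_bu)
```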


maciep
Champion

This seems to work with a limited set of test data. But it's also late, so it could be way off too 🙂

Essentially, I'm trying to split the event on the lines of hash marks, and then expand those sections into separate events. So instead of the 1 event above, you'd have 4. Then I rex the serial and BU out of each of those new events (i.e. each section of the original event). Then, across all events, I do an eventstats for the largest BU by _raw (even after the expand, all of those new events still share the same _raw).

So now each event has a _raw, serial, bu, and the max(bu) that represents the highest BU across the same original event. Finally, I filter on where the max field equals the bu field, since those will only match for the events that represent the largest BU per _raw. And that's it: just table the remaining serial and bu fields.

Hopefully that makes sense (and is logically correct)

index=* sourcetype="test:serial" 
| eval blah = _raw 
| makemv blah delim="#############################" 
| mvexpand blah 
| rex field=blah "Serial:\s+(?<serial>\S+)[^-]+-(?<bu>.+)" 
| eventstats max(bu) as max by _raw 
| where max=bu 
| table serial bu
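The split-and-expand approach can likewise be sketched in plain Python (the sample event text is invented, shaped like the event in the question): split the raw event on the hash-mark separator, extract the serial and BU from each chunk, then keep the pair with the largest BU.

```python
import re

# Invented sample _raw, shaped like the event in the question
raw = """Serial: AAA111
BU1 - 84.5
#############################
Serial: BBB222
BU2 - 83.5
#############################
Serial: CCC333
BU3 - 85.1
"""

pairs = []
# makemv + mvexpand: one chunk per section
for chunk in raw.split("#############################"):
    # analogue of: rex field=blah "Serial:\s+(?<serial>\S+)[^-]+-(?<bu>.+)"
    m = re.search(r"Serial:\s+(\S+)[^-]+-\s*(\S+)", chunk)
    if m:
        pairs.append((m.group(1), m.group(2)))

# eventstats max(bu) by _raw | where max=bu, collapsed to a single max()
# (string comparison, as with the non-numeric BU values in the question)
serial, bu = max(pairs, key=lambda p: p[1])
print(serial, bu)
```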


raby1996
Path Finder

Hmm, I can't seem to get the last part of the search to work; "arguments to mvfilter are invalid" is the error I'm getting.
Here is the actual search I'm using, where I switched the "##" delimiter to the actual delimiter I want to use. However, I feel like I'm not understanding how the last command works, so I might be using it wrong.

 | search "relevant information"
    | rex max_match=0 "(?:\n|.)\s+(?<Bu>(?:8[7]+\.\d+\.\d+\.\d+))"  
    | rex max_match=0 "((?:\n|.)*?MTMS:(?<Serial>\s+\d+\-\d+\S\S+))"
    | eval Bundle=max(Bu)
    | eval temp=mvzip(Bu,Serial,"MTMS")
    | eval SerialForLargest_BU=replace(mvfilter(match(temp,Bundle."MTMS")),"(.*)MTMS.*","\1")

jkat54
SplunkTrust

Perhaps you could try something like last, latest, first, and earliest...

You might also be interested in making serial and BU multivalue with makemv and the other "mv" commands.


jkat54
SplunkTrust

Can you share your search as is?


raby1996
Path Finder

Yes, it looks similar to this:

| search "relevant info"
| rex max_match=0 "(?:\n|.)\s+(?<Bu>(?:3[6]+\.\d+\.\d+\.\d+))"  
| eval largest_BU=max(Bu)
| rex max_match=0 "(?:\n|.)\s+(?<Serial>(\s+\d+\-\d+\S\S+))" 

Where I need the Serial to correspond to the largest_BU.
Thank you
