Splunk Search

How to write regex to extract multi-value fields and graph data by time?

Path Finder

Hello,
I have a log file with a bunch of entries like this:

<carrier-index>[<error>]: 0[0], 1[0.0363152], 2[0.0228264], 3[0.0515129], 4[0.0123894], 5[0.128875], 6[0.0917961], 7[0.0460198], 8[0.81137], 9[-0.105347], 10[0.538207], 11[999], 12[999], 13[999], 14[999], 15[999], 16[0.74948], 17[0.0690911], 18[0.709686], 19[-0.184876], 20[999]

As stated at the very beginning of the line, the first number is the carrier index and the number in brackets is the error. The line above arrives as a single event, and I am trying to extract the index and the error. Since these are two fields with multiple values in one event, I think I need to use multi-value fields... I'm just not sure exactly how to do it.

This will extract only the very first pair. I think I just need to repeat it once the fields are multi-value?

(?<carrier>\d+)\[(?<error>\-?\d+\.?\d+)\]

How do I get both fields, 'carrier' and 'error', to be multi-value and pick up all the values?

My end result would then be a graph with all the carriers showing the error values over time.

1 Solution

SplunkTrust

Try something like this

sourcetype=UTAS |rex max_match=25 "(?<carrier>\d+)\[(?<error>\-?\d+\.?\d+)\]" | eval temp=mvzip(carrier,error,"#") | mvexpand temp | rex field=temp "(?<carrier>.+)#(?<error>.+)" | where NOT error=999 | timechart avg(error) by carrier
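For anyone unsure what each stage of the search above does, here is a rough Python sketch of the same idea. It is not SPL and not how Splunk implements these commands. The function names `expand_event` and `average_by_carrier` and the sample events are hypothetical, and Python's `(?P<name>...)` stands in for the `(?<name>...)` group syntax that rex accepts.

```python
import re
from collections import defaultdict

# Same pattern as in the search, written with Python's (?P<name>...) group syntax.
PAIR = re.compile(r"(?P<carrier>\d+)\[(?P<error>-?\d+\.?\d+)\]")

def expand_event(timestamp, raw):
    """Emulate rex max_match + mvzip + mvexpand: one row per carrier/error pair."""
    matches = list(PAIR.finditer(raw))
    carriers = [m.group("carrier") for m in matches]  # multivalue "carrier"
    errors = [m.group("error") for m in matches]      # multivalue "error"
    rows = []
    # zip keeps each carrier next to its own error, like mvzip(carrier,error,"#");
    # emitting one row per pair plays the role of mvexpand.
    for carrier, error in zip(carriers, errors):
        value = float(error)
        if value == 999:  # like: where NOT error=999
            continue
        rows.append((timestamp, carrier, value))
    return rows

def average_by_carrier(rows):
    """Emulate avg(error) by carrier, here for one single time bucket."""
    grouped = defaultdict(list)
    for _, carrier, value in rows:
        grouped[carrier].append(value)
    return {c: sum(v) / len(v) for c, v in grouped.items()}

# Hypothetical events: (timestamp, raw text)
events = [
    ("10:00", "1[0.25], 2[0.50], 11[999]"),
    ("10:01", "1[0.75], 2[0.50], 11[999]"),
]
rows = [row for ts, raw in events for row in expand_event(ts, raw)]
averages = average_by_carrier(rows)
print(averages)
```

The point of the zip step is that after the first rex, carrier and error are two separate lists inside one event. Pairing them before splitting into rows is what keeps carrier 1 attached to carrier 1's error. timechart then does the averaging per time bucket, which this sketch collapses into one bucket.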



SplunkTrust

The Splunk documentation has good explanations and examples.

http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/Abstract

All search commands can be reached from the links in the left-hand section of that page.

Path Finder

That gives me the chart I am looking for! I'm not sure I really get what all of this does, so I will have to research the different mv functions. Is there a good help site that explains what each function does and what the parameters are?

Path Finder

This was too long to put in the comments, so am posting it here:

tom_frotscher
That seems to work great... I can see the correct values (and multiple values) when I view the events. It correctly splits out the carriers and errors into multiple values.

rex max_match=25 "(?<carrier>\d+)\[(?<error>\-?\d+\.?\d+)\]"| search error="*"

Now that I am trying the next step (graphing it), I have a feeling that this might have been the wrong route. My ultimate goal is to have a graph of errors by carrier, which means that the carrier and error need to be related across events. So as events like this come in, the error for carrier 1 will fluctuate. I want to see a graph of the errors for carrier 1 over time. Eventually, I would put them all on the same graph, so that I have a line graph where each line represents a carrier, the x-axis is time and the y-axis is error value.

Kind of like this, but these values are all messed up:
https://www.dropbox.com/s/ie56feayop1q9ln/carrier_chart.PNG

Am I just expanding these variables incorrectly? or am I setting this up totally wrong?


Splunk Employee

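Your regex should be corrected to the one below, since yours won't capture the first entry, 0[0]. The \d+\.?\d+ part requires at least two digits, so a single-digit error value never matches: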

(?<carrier>\d+)\[(?<error>[^\[\]]+)

props.conf

[yoursourcetype]
REPORT-MYREPORT = MYREPORT

transforms.conf

[MYREPORT]
REGEX =  (?<carrier>\d+)\[(?<error>[^\[\]]+)
MV_ADD = TRUE
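The difference between the two patterns can be checked outside Splunk. The following is a minimal Python sketch, assuming Python's `re` behaves like Splunk's regex engine for these simple patterns. The sample string is a shortened, hypothetical version of the event, and `(?P<name>...)` replaces the `(?<name>...)` group syntax.

```python
import re

sample = "0[0], 1[0.0363152], 2[0.0228264], 11[999]"

# Original pattern: \d+\.?\d+ needs at least two digits, so the lone "0" fails.
original = re.compile(r"(?P<carrier>\d+)\[(?P<error>-?\d+\.?\d+)\]")
# Suggested pattern: everything up to the next bracket counts as the error value.
corrected = re.compile(r"(?P<carrier>\d+)\[(?P<error>[^\[\]]+)")

original_pairs = [(m.group("carrier"), m.group("error"))
                  for m in original.finditer(sample)]
corrected_pairs = [(m.group("carrier"), m.group("error"))
                   for m in corrected.finditer(sample)]

print(original_pairs)   # carrier 0 is missing from this list
print(corrected_pairs)  # carrier 0 is present in this list
```

Because the other 20 carriers still match, the missing carrier 0 is easy to overlook when scanning the extracted values by eye.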

Path Finder

Thank you, I definitely plan to add these as config fields. My original regex seems to work fine though, but you say that mine won't capture the first 0? It seems to, but maybe I am missing something.


SplunkTrust

And if you want to set it in config files, see this

http://answers.splunk.com/answers/112311/multi-value-field-extraction

Hi!
Based on your information, I am not 100% sure what your events really look like, but I think a rex like this can help get you started:

rex max_match=100 "(?<carrier>\d+)\[(?<error>\-?\d+\.?\d+)\]"

Because of max_match, rex doesn't stop after the first match; it keeps matching (in this case up to 100 times; a value of 0 means unlimited). If it matches more than once, the field becomes a multivalue field.
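As a rough illustration of the max_match behaviour described above, here is a small Python sketch. The helper `extract` and its `max_match` parameter are hypothetical and only mimic the idea. The default of 1 is my assumption about how rex behaves when max_match is omitted.

```python
import re
from itertools import islice

PAIR = re.compile(r"(?P<carrier>\d+)\[(?P<error>-?\d+\.?\d+)\]")
sample = "1[0.25], 2[0.50], 3[0.75]"

def extract(raw, max_match=1):
    """Collect error values; max_match=0 means 'no limit', by analogy with rex."""
    limit = None if max_match == 0 else max_match
    # islice stops pulling matches once the limit is reached.
    return [m.group("error") for m in islice(PAIR.finditer(raw), limit)]

first_only = extract(sample)               # single value, like a plain rex
up_to_two = extract(sample, max_match=2)   # stops after two matches
everything = extract(sample, max_match=0)  # all matches in the event

print(first_only, up_to_two, everything)
```

With a limit of 1 the result is a single value. Any higher limit can return several values, which corresponds to the field becoming multivalue.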

You can also read up on this in the docs:

link

Communicator

Exactly what i was looking for! Thanks mate
