Splunk Search

Field extractions

Communicator

Hi there.

I've managed to work out some regex to grab the data I want using regex101, but I'm having trouble porting it into Splunk. I believe Splunk also needs the correct information in the right place to name the extracted field.

The data I've got looks like this:

#summary project x

#parts 1 a part

#person1 4

#person2

#invoice

And the regex that gets the values after the keys is:
(?<=#summary)\s(.*?)[\r\n]
or
(?<=#parts)\s(.*?)[\r\n]
or
(?<=#invoice)\s[0-9]*

The first two will have carriage returns at the end and the last one won't, hence the different approach for that one.

I don't know where or what to add to get Splunk to call the first field Summary, for example, or Parts for the second.

I realise it's going to be something like (?<fieldname>...) in there somewhere but can't work out where.

Thanks.

0 Karma

Ultra Champion

Field extraction

You can set it here.

0 Karma

Ultra Champion
| makeresults 
| eval sample="#summary project x
#parts 1 a part
#person1 4
#person2
#invoice"
| makemv delim="
" sample
| mvexpand sample
| rex field=sample "#(?<field_name>[^\s]+)( (?<value>.+))?"
| fillnull value value="N/A"
| eval value="\"".value."\""
| eval raw=mvzip(field_name,value,"=")
| stats count by raw
| rename raw as _raw
| kv

I put N/A where there is no value.

0 Karma

Communicator

That is a cool search and extraction, thanks to4kawa. I'll definitely be able to use something like that in my project. As mentioned, though, I do want to extract these fields when the data comes in so I've got them ready to work with within my app. So I want to be able to go into field extractions and create the extraction in there. What I can't find documentation on is how to change a regex that extracts the data into a Splunk regex that extracts it and then assigns it to a field name. Like doing this bit, but in the field extraction creation section of the Splunk GUI: #(?<field_name>[^\s]+)( (?<value>.+))? which obviously assigns the matched value to field_name.

0 Karma

Ultra Champion
0 Karma

Communicator

Yeh I'd already tried that and it couldn't quite get my 5 extractions right. So I worked out the correct regex for all 5 but can't manually add it as it wants the key name somehow as well as the value which the regex pulls.

0 Karma

Communicator

So if I go into that field extractor you linked to, I need to click the option to 'write the regular expression myself'. I then have the part of the regular expression that extracts the value, but it won't work because Splunk wants the bit that tells it what the key name will be as well. So for my example of parts above (this will be 5 separate extractions), my regex to grab the value is (?<=#parts)\s(.*?)[\r\n] and this works. But Splunk needs me to change that to have the key name in it as well, in this case 'Parts'. So that needs to be built into the regex, and it looks to be along the lines of (?...), but I don't know how to merge this into that regex.
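One way to merge the key name into that pattern is to turn the plain capture group into a named capture group; the group name becomes the field name. The following is a minimal sketch, not a confirmed fix for this data: the two-line sample event is made up for illustration, and it assumes the real events contain a line break after the parts value.

```spl
| makeresults
| eval _raw="#parts 1 a part
#person1 4"
| rex field=_raw "(?<=#parts)\s(?<Parts>.*?)[\r\n]"
| table Parts
```

If this behaves as expected, the regex between the quotes (without the quotes themselves) should be what the manual field extraction form wants, since inline extractions are also driven by named capture groups.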

0 Karma

Ultra Champion

transforms.conf

It may be difficult if you do not do it here.
Or, like I did, query later

0 Karma

Champion

Hi

Check this

| makeresults 
| eval _raw="#summary project x" 
| rex field=_raw "(?<=#summary)\s(?P<Summary>(.?)+)"
0 Karma

Communicator

Thanks for this. This is a search-time extraction though, isn't it? How can I apply this logic and get the same results with an index-time extraction?

0 Karma

Communicator

Actually, also: I'm trying to find the value after the word summary, so it might not be project x but anything after the string #summary and up to the carriage return at the end of the line.

0 Karma

Communicator

Tried this too but it doesn't work: _raw "(?<=#summary)\s(?P<Summary>(.?)+)"

0 Karma

Champion

Hi

Can you please let us know the fieldname that you want to extract and also the expected output with sample.

0 Karma

Communicator

So per above for the summary for example.
The raw data is:

"#summary project x"

I would like to end up with a field named Summary and a value of project x.
In regex101, (?<=#summary)\s(.*?)[\r\n] retrieves the value "project x".
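For reference, here is a hedged sketch of how all five values might be pulled into named fields at search time. It assumes the five lines arrive as a single multi-line event (the sample below is the one used earlier in the thread), and the field names Person1, Person2 and Invoice are illustrative choices. Two small changes from the regex101 patterns: [ \t]+ is used instead of \s so that a key with no value cannot swallow the following line, and [^\r\n]+ is used instead of (.*?)[\r\n] so that the pattern does not depend on a trailing carriage return.

```spl
| makeresults
| eval _raw="#summary project x
#parts 1 a part
#person1 4
#person2
#invoice"
| rex field=_raw "#summary[ \t]+(?<Summary>[^\r\n]+)"
| rex field=_raw "#parts[ \t]+(?<Parts>[^\r\n]+)"
| rex field=_raw "#person1[ \t]+(?<Person1>[^\r\n]+)"
| rex field=_raw "#person2[ \t]+(?<Person2>[^\r\n]+)"
| rex field=_raw "#invoice[ \t]+(?<Invoice>[0-9]+)"
| table Summary Parts Person1 Person2 Invoice
```

With this sample, Person2 and Invoice have nothing after the key, so those two patterns should not match and the fields would simply be absent rather than empty.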

0 Karma