Splunk Search

Regex: skipping or jumping over segments for field extraction

Bastelhoff
Path Finder

Hey there!

I am wondering if it is possible to create a regex for field extraction that extracts a string but, at the same time, leaves out part of it.

Let's say there is a logline with:
IP: 111.222.111.222
Now the extracted field should capture the IP, but without the dots (so the result should be "111222111222"). Is this even possible right at field extraction?
Can you skip certain elements? Or can you extract each segment and then combine them somehow?
AFAIK you can exclude characters with [^ ], but then the whole match fails and you get nothing if that character occurs, which is not what I want. I want to match the whole string, but then capture only certain elements of it.

Thank you!

0 Karma
1 Solution

Bastelhoff
Path Finder

I found my solution:

In this case I do a field extraction as usual with regex, extracting the whole field, including the parts I want to exclude.

After that I use the "Calculated fields" option. That way I take the previously extracted field as input and use the replace() function to remove the characters or strings I do not want in the field content.
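
For anyone who wants to try the idea from the search bar first, the two steps can be sketched roughly as below. This is only a sketch under assumptions, not the exact configuration used: the field names raw_ip and digit_ip are made up for illustration, and the makeresults/eval lines just fake a sample event. The rex line stands in for the regular field extraction, and the replace() expression on the last line is the kind of eval expression that would go into the calculated field definition.

```
| makeresults
| eval _raw="IP: 111.222.111.222"
| rex "IP: (?<raw_ip>\S+)"
| eval digit_ip=replace(raw_ip, "\.", "")
```

For the sample event, digit_ip should come out as 111222111222 while raw_ip keeps the dotted form.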

0 Karma

Bastelhoff
Path Finder

I think there has been a misunderstanding here. I need a field extraction. From my understanding, you don't have the luxury of multiple lines of code there; you basically have just one regex to do what you want (for one field).

0 Karma

to4kawa
Ultra Champion

I think you have misunderstood. If you require a field extraction, then from my understanding you should have multiple lines of code, because the regexes are applied in order.

0 Karma

to4kawa
Ultra Champion
| rex "IP: (?<digit_ip>\S+)"
| rex mode=sed field=digit_ip "s/\D//g"
0 Karma

jpolvino
Builder

You can do quite a lot of what you're asking.

| makeresults 
| eval raw="111.222.111.222:::192.168.1.1:::8.8.8.8:::008.008.008.008:::127.0.0.1"
| makemv delim=":::" raw
| mvexpand raw | rename raw AS _raw | fields - _time | eval raw2=_raw
| rex field=raw2 mode=sed "s/\.//g"
| rex field=_raw "(?<part1>\d+)\.(?<part2>\d+)\.(?<part3>\d+)\.(?<part4>\d+)"    

A word of caution: take a look at what happens when an IP address has parts with fewer than 3 digits. If you remove the periods from 192.168.1.1 and get 19216811, could that mean 192.16.81.1 or maybe 192.16.8.11?
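
If a dot-free value is still needed, one way around that ambiguity is to zero-pad each part to three digits before joining them, so every result has exactly 12 digits. A rough sketch, assuming the eval printf() function is available in your Splunk version; the field names o1 to o4 and padded_ip are only illustrative:

```
| makeresults
| eval _raw="IP: 192.168.1.1"
| rex "IP: (?<o1>\d+)\.(?<o2>\d+)\.(?<o3>\d+)\.(?<o4>\d+)"
| eval padded_ip=printf("%03d%03d%03d%03d", tonumber(o1), tonumber(o2), tonumber(o3), tonumber(o4))
```

With padding, 192.168.1.1 should become 192168001001 instead of 19216811, so the original parts can still be recovered.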

0 Karma

sumanssah
Communicator

Try something like:

your search 
| rex field=_raw "(?<Remotehost>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| rex field=Remotehost mode=sed "s/\W+//g"
0 Karma