Splunk Search

Extracting numbers between words

ibekacyril
Explorer

Say I have this data:

c.i.m This is just a sample 23456 Yes it is true.

My question is how do I extract 23456 and pass it to a new field since there is no key-value pair in this scenario? I would also want to do a count on the new field.

Thanks and please I would not mind some explanation on your code too.

0 Karma
1 Solution

MuS
SplunkTrust

Hi ibekacyril,

Well, the straightforward approach is a regex that matches only digits:

(?<my_number>\d+)

This will create a field called my_number and you can run a stats count(my_number) on it 😉
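
If you want to check what the pattern captures outside Splunk, here is a minimal Python sketch. Two assumptions to be aware of: Python's re module spells named groups (?P<name>...) rather than the (?<name>...) form used above, and the events list and the extract_number helper are made up for illustration (only the first sample line comes from the question). The Counter at the end is a rough stand-in for counting on the new field.

```python
import re
from collections import Counter

# Same digits-only pattern as above; Python's re needs (?P<name>...) for named groups.
NUMBER = re.compile(r"(?P<my_number>\d+)")


def extract_number(event):
    """Return the first run of digits in the event, or None if there is none."""
    match = NUMBER.search(event)
    return match.group("my_number") if match else None


# Hypothetical events; the second and third lines are invented test data.
events = [
    "c.i.m This is just a sample 23456 Yes it is true.",
    "c.i.m This is just a sample 23456 Yes it is true.",
    "c.i.m This is just a sample 777 Yes it is true.",
]

# Count how many events carry each extracted value.
counts = Counter(extract_number(event) for event in events)
print(counts)
```

Note that the pattern picks up the first run of digits anywhere in the event, so it only works cleanly when no other number appears earlier in the line.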

Next example is this one:

\w+\s(?<my_number>[^\s]+)\s

That's faster, but it will capture "This" rather than the number you want ... so on to the next one:

(?:\w\s)+(?<my_number>\d+)\s

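(?:\w\s)+ is a non-capturing group, repeated between one and unlimited times, as many times as possible, giving back as needed [greedy].
\w matches any word character [a-zA-Z0-9_]; \s matches any whitespace character [\r\n\t\f ].
Each repetition consumes one word character followed by one whitespace character, so on the sample the match starts at the last letter before the number.
This one will be slower, because it needs more steps to reach the match you want. Hint: www.regex101.com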
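
To make the difference between the second and third patterns concrete, here is a small Python sketch run against the sample line from the question. It assumes Python's re module, so the named groups are written (?P<my_number>...) instead of (?<my_number>...); the variable names loose and strict are only for illustration.

```python
import re

sample = "c.i.m This is just a sample 23456 Yes it is true."

# Second pattern: the capture accepts any non-whitespace run, so it grabs a word.
loose = re.compile(r"\w+\s(?P<my_number>[^\s]+)\s")

# Third pattern: the capture is restricted to digits.
strict = re.compile(r"(?:\w\s)+(?P<my_number>\d+)\s")

loose_value = loose.search(sample).group("my_number")
strict_value = strict.search(sample).group("my_number")

print("loose :", loose_value)
print("strict:", strict_value)
```

The step counts shown on regex101.com are the better guide to relative speed; this sketch only shows which text each pattern ends up capturing.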

c\.i\.m\sThis\sis\sjust\sa\ssample\s(?<my_number>\d+)\s

This is almost as fast as the first one, but it takes a few more steps to match.
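
One thing worth knowing about the fully spelled-out pattern is that it only matches events that begin with exactly this text. A minimal Python sketch of that behaviour, again assuming Python's (?P<name>...) group syntax; the second event is invented for the comparison:

```python
import re

# Literal prefix from the sample event, followed by the digits to capture.
anchored = re.compile(
    r"c\.i\.m\sThis\sis\sjust\sa\ssample\s(?P<my_number>\d+)\s"
)

matching_event = "c.i.m This is just a sample 23456 Yes it is true."
other_event = "x.y.z Another event 99 here"  # hypothetical, different wording

hit = anchored.search(matching_event)
miss = anchored.search(other_event)

print(hit.group("my_number") if hit else None)
print(miss.group("my_number") if miss else None)
```

So this variant is the safest choice when other numbers may appear elsewhere in your events, at the cost of breaking as soon as the surrounding wording changes.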

There are almost endless ways to write a regex that matches, but you want the most efficient one - use regex101.com and learn how to use regex 😉

Hope this helps ...

cheers, MuS

ibekacyril
Explorer

Thank you so much, especially with the regex101.com. The steps you gave equally helped.
