Splunk Search

Need to understand Regular Expression

abilann
New Member

Team,

Can anyone please help me to understand the below regular expression used in field extraction?

(?i)CPU_COUNT\s+(?P[^ \n]*)?

Thanks,
Abilan

0 Karma

MuS
SplunkTrust

Hi abilann,

The regex looks for a case-insensitive match of CPU_COUNT followed by one or more whitespace characters, then captures the characters that follow, up to the next space or newline, into a field called cpu_cores (greedily).

This is a literal translation of the regex.
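
To make that translation concrete, here is a minimal Python sketch. It assumes Python's `re` module treats this particular pattern the same way as the PCRE engine Splunk uses, and the sample event is invented for illustration; it is not taken from the Splunk App for AWS.

```python
import re

# Same pattern as in the question; Python's re module also accepts the
# (?P<name>...) named-group syntax.
PATTERN = re.compile(r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?")

# Hypothetical event text, made up for this example only.
event = "host=web01 cpu_count   8 mem_mb=16384"

match = PATTERN.search(event)
cpu_cores = match.group("cpu_cores") if match else None
print(cpu_cores)  # 8
```

The lowercase `cpu_count` still matches because of `(?i)`, the three spaces are consumed by `\s+`, and the capture stops at the first space after `8`.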

Hope this helps ...

cheers, MuS

MuS
SplunkTrust

Hi abilann,

Can you also please post some sample events? With just the regex it is hard to answer.
Also, this posted regex is not correct because you have an incomplete group structure and the last ? does not have a preceding token.

cheers, MuS

0 Karma

abilann
New Member

Hi,

Actually, this is the default field extraction (hardware : EXTRACT-cpu_cores) used in the "Splunk App for AWS". I am trying to understand how it extracts cpu_cores from the events, because I could not find any keyword like "CPU" in the events.

Thanks,
Abilan

0 Karma

gcusello
SplunkTrust

Hi @abilann,
your regex isn't readable; please use the Code Sample button (the one with 101010) to display it.

In addition, I suggest putting your regex and a sample of your logs into regex101.com, where you can test the regex and read a description of it (on the right side).

Ciao.
Giuseppe

0 Karma

abilann
New Member
(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?
0 Karma

gcusello
SplunkTrust

Hi @abilann,
This is the explanation of your regex by regex101.com:

(?i) match the remainder of the pattern with the following effective flags: gmi
  i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
CPU_COUNT matches the characters CPU_COUNT literally (case insensitive)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
  + Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Named Capture Group cpu_cores (?P<cpu_cores>[^ \n]*)?
  ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
  Match a single character not present in the list below [^ \n]*
    * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    " " matches the space character literally (case insensitive)
    \n matches a line-feed (newline) character (ASCII 10)
Global pattern flags
  g modifier: global. All matches (don't return after first match)
  m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)

You can see this for yourself at https://regex101.com/r/xfUL8y/1 .
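
A few edge cases follow from that breakdown and may explain why the field is missing when the events contain no "CPU" keyword. The sketch below is only an illustration in Python: `extract_cpu_cores` is a hypothetical helper, the events are invented, and it assumes Python's `re` behaves like Splunk's PCRE engine for this pattern.

```python
import re

PATTERN = re.compile(r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?")


def extract_cpu_cores(event):
    """Return the captured value, or None when the pattern does not match."""
    m = PATTERN.search(event)
    if m is None:
        return None
    # Normalise an empty or unset optional group to an empty string.
    return m.group("cpu_cores") or ""


# No CPU_COUNT keyword in the event: no match, so no field is extracted.
print(extract_cpu_cores("instance_type=m5.large vcpus=2"))  # None

# CPU_COUNT must be followed by whitespace; "=" does not satisfy \s+.
print(extract_cpu_cores("CPU_COUNT=4"))  # None

# \s+ is greedy and also matches newlines, so the value may sit on the next line.
print(extract_cpu_cores("CPU_COUNT \n4"))  # 4

# Only space and newline end the capture; a tab is kept inside the value.
print(repr(extract_cpu_cores("Cpu_Count 2\tcores")))  # '2\tcores'
```

If the events never contain the literal text CPU_COUNT followed by whitespace, this extraction simply produces no cpu_cores field for them.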

Ciao.
Giuseppe

0 Karma