Splunk Search

Need to understand Regular Expression

New Member

Team,

Can anyone please help me to understand the below regular expression used in field extraction?

(?i)CPU_COUNT\s+(?P[^ \n]*)?

Thanks,
Abilan

0 Karma

SplunkTrust

Hi abilann,

The regex looks for a case-insensitive match of CPU_COUNT followed by one or more whitespace characters, then greedily captures the following characters that are neither a space nor a newline into a field called cpu_cores.

This is a literal translation of the regex.
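
As a rough illustration of that reading, here is a minimal Python sketch. It assumes the named group is written as (?P<cpu_cores>...), and it uses Python's re module rather than Splunk's own regex engine, so treat it as an approximation. The sample events and the helper name extract_cpu_cores are invented for illustration and do not come from the Splunk App for AWS.

```python
import re

# The extraction regex from this thread, with the named group written out.
CPU_CORES_RE = re.compile(r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?")


def extract_cpu_cores(event):
    """Return the captured cpu_cores value, or None if CPU_COUNT is absent."""
    match = CPU_CORES_RE.search(event)
    return match.group("cpu_cores") if match else None


# Hypothetical events, invented for illustration only.
events = [
    "MEMORY 16384 CPU_COUNT 4 ARCH x86_64",  # capture stops at the next space
    "cpu_count\t8",                          # lower case and a tab still match
    "instance_type=m5.large state=running",  # no CPU_COUNT token, so no field
]

for event in events:
    print(repr(event), "->", extract_cpu_cores(event))
```

An event that never contains the literal text CPU_COUNT (in any letter case) gives no match, so no cpu_cores value is produced for it.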

Hope this helps ...

cheers, MuS

SplunkTrust

Hi abilann,

Can you also please post some sample events? With just the regex it is hard to answer.
Also, the regex as posted is not valid: the group structure is incomplete, and the last ? does not have a preceding token.
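
A quick way to check this outside Splunk is to try compiling both forms. The sketch below uses Python's re module, which is an assumption on my part: Splunk uses its own regex engine, and the exact error wording differs between engines. The helper name compiles is hypothetical.

```python
import re

# The regex as it appeared in the first post, with the group name missing.
posted = r"(?i)CPU_COUNT\s+(?P[^ \n]*)?"
# The same regex with the group name written out.
intended = r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?"


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print("posted compiles:  ", compiles(posted))
print("intended compiles:", compiles(intended))
```

Python rejects the first form because (?P must be followed by a group name in angle brackets, and accepts the second.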

cheers, MuS

0 Karma

New Member

Hi,

Actually, this is the default field extraction (hardware : EXTRACT-cpu_cores) used in the "Splunk App for AWS". I am trying to understand how it extracts CPU_Cores from the events, because I could not find any keyword like "CPU" in the events.

Thanks,
Abilan

0 Karma

Legend

Hi @abilann,
Your regex isn't readable; please use the Code Sample button (the one with 101010) to display it.

In addition, I suggest putting your regex and a sample of your logs into regex101.com: you can test the regex there, and the right-hand side shows an explanation of it.

Ciao.
Giuseppe

0 Karma

New Member
(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?
0 Karma

Legend

Hi @abilann,
This is the explanation of your regex by regex101.com:

(?i) match the remainder of the pattern with the following effective flags: gmi
  i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
CPU_COUNT matches the characters CPU_COUNT literally (case insensitive)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
  + Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Named Capture Group cpu_cores (?P<cpu_cores>[^ \n]*)?
  ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
  Match a single character not present in the list below [^ \n]*
    * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    " " (space) matches the space character literally (case insensitive)
    \n matches a line-feed (newline) character (ASCII 10)
Global pattern flags
  g modifier: global. All matches (don't return after first match)
  m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)

You can see this for yourself at https://regex101.com/r/xfUL8y/1.
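
The same breakdown can be written as a commented pattern. This is only a sketch in Python's re module (not Splunk's engine): it swaps the inline (?i) for the re.IGNORECASE flag so that re.VERBOSE comments can be used, and the sample strings are invented for illustration.

```python
import re

# The thread's regex, spread over several lines with one comment per part.
CPU_CORES_VERBOSE = re.compile(
    r"""
    CPU_COUNT                # the literal text CPU_COUNT
    \s+                      # one or more whitespace characters, newlines included
    (?P<cpu_cores>[^ \n]*)?  # optional group: characters that are neither space nor newline
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Hypothetical single-line event: the value follows CPU_COUNT on the same line.
same_line = CPU_CORES_VERBOSE.search("CPU_COUNT   2 MEMORY 4096")
print(same_line.group("cpu_cores"))

# Hypothetical event where CPU_COUNT is the last token on its line:
# \s+ consumes the newline, so the group captures the first word of the next line.
next_line = CPU_CORES_VERBOSE.search("CPU_COUNT\nMEMORY 4096")
print(next_line.group("cpu_cores"))
```

The second case is worth keeping in mind: because \s also matches newlines, the captured value can come from the line after CPU_COUNT if nothing follows it on the same line.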

Ciao.
Giuseppe

0 Karma