Splunk Search

Need to understand Regular Expression

abilann
New Member

Team,

Can anyone please help me to understand the below regular expression used in field extraction?

(?i)CPU_COUNT\s+(?P[^ \n]*)?

Thanks,
Abilan

0 Karma

MuS
Legend

Hi abilann,

The regex looks for a case-insensitive match of CPU_COUNT followed by one or more whitespace characters, and captures the characters that follow, up to the next space or newline, into a field called cpu_cores (greedily).

This is a literal translation of the regex.
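
To make that translation concrete, here is a minimal sketch in Python. It assumes Python's re module behaves like Splunk's PCRE engine for this particular pattern, and the sample events are invented for illustration; they are not real output from the Splunk App for AWS.

```python
import re

# Pattern from the thread, with the group name cpu_cores written out.
# Splunk evaluates it with PCRE; Python's re is used here only as a stand-in.
CPU_CORES_RE = re.compile(r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?")

def extract_cpu_cores(event):
    """Return the captured cpu_cores value, or None if the event does not match."""
    match = CPU_CORES_RE.search(event)
    return match.group("cpu_cores") if match else None

# Hypothetical events, made up for this example.
print(extract_cpu_cores("HARDWARE cpu_count   8 MEMORY 16384"))  # 8
print(extract_cpu_cores("HARDWARE MEMORY 16384"))                # None
```

The first event matches even though the keyword is lower case, because (?i) makes the whole pattern case insensitive.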

Hope this helps ...

cheers, MuS

MuS
Legend

Hi abilann,

Can you also please post some sample events? With just the regex it is hard to answer.
Also, this posted regex is not correct because you have an incomplete group structure and the last ? does not have a preceding token.
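
A quick way to check this is to try compiling the pattern. The sketch below uses Python's re module as a stand-in for Splunk's PCRE engine, and it assumes the missing piece is a group name such as <cpu_cores>, the field name mentioned earlier in the thread.

```python
import re

# The pattern exactly as it appeared in the first post.
AS_POSTED = r"(?i)CPU_COUNT\s+(?P[^ \n]*)?"
# Assumed intended form: the named group spelled out in full.
WITH_NAME = r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?"

def compiles(pattern):
    """Return True if the regex engine accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(AS_POSTED))  # False
print(compiles(WITH_NAME))  # True
```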

cheers, MuS

0 Karma

abilann
New Member

Hi,

Actually, this is the default field extraction (hardware : EXTRACT-cpu_cores) used in the "Splunk App for AWS". I am trying to understand how it extracts CPU_Cores from the events, because I could not find any keyword like "CPU" in the events.

Thanks,
Abilan

0 Karma

gcusello
SplunkTrust

Hi @abilann,
Your regex isn't readable; please use the Code Sample button (the one with 101010) to display it.

In addition, I suggest putting your regex and a sample of your logs into regex101.com, where you can test the regex and read a description of it (on the right side).

Ciao.
Giuseppe

0 Karma

abilann
New Member
(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?
0 Karma

gcusello
SplunkTrust

Hi @abilann,
This is the explanation of your regex by regex101.com:

(?i) match the remainder of the pattern with the following effective flags: gmi
    i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
CPU_COUNT matches the characters CPU_COUNT literally (case insensitive)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
    + Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Named Capture Group cpu_cores (?P<cpu_cores>[^ \n]*)?
    ? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
    [^ \n]* matches a single character not present in the list below
        * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
        " " (space) matches the space character literally (case insensitive)
        \n matches a line-feed (newline) character (ASCII 10)
Global pattern flags
    g modifier: global. All matches (don't return after first match)
    m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)

You can see this for yourself at https://regex101.com/r/xfUL8y/1 .
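
The quantifiers described above have a few consequences that are easy to miss. The following sketch uses Python's re module, assumed to match Splunk's PCRE behaviour for this pattern, and the input strings are made up for the example.

```python
import re

PATTERN = re.compile(r"(?i)CPU_COUNT\s+(?P<cpu_cores>[^ \n]*)?")

def cpu_cores(text):
    """Return the cpu_cores capture, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return None if match is None else match.group("cpu_cores")

# \s+ needs at least one whitespace character after the keyword.
print(repr(cpu_cores("CPU_COUNT")))             # None

# [^ \n]* may match zero characters, so the field can be an empty string.
print(repr(cpu_cores("CPU_COUNT ")))            # ''

# Only space and newline end the capture; a tab is kept inside the value.
print(repr(cpu_cores("cpu_count\t4\tMEM")))     # '4\tMEM'

# \s+ also consumes newlines, so the value may come from the next line.
print(repr(cpu_cores("CPU_COUNT\n2")))          # '2'
```

If the value should be numeric only, a stricter class such as \d+ inside the group would avoid the empty and tab-containing captures, though that would change the extraction shipped with the app.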

Ciao.
Giuseppe

0 Karma