Splunk Search

Can you help me figure out what is the job of the rex field in this line?

ramanir
New Member

This is the search:

index=vha_pronto sourcetype=pronto_neopil_prd NOT [ search index=vha_pronto sourcetype=pronto_neopil_prd "SAF process started" earliest=-24h |rex field=_raw "(?ms)^[^\\[\\n]*\\[(?P\\w+\\-\\d+)" | return $SAF_pool ]

DalJeanis
Legend

1) To answer the exact question you asked: In a rex command, the default field to be analyzed is _raw, so technically, that field=_raw clause simply makes the default explicit, and has no other effect on the function of the rex.

2) The overall search says, "look for events in this index and sourcetype that do not have an SAF process started record in the last 24 hours." The function of the rex is to extract those SAF_pool values from all relevant events in the last 24 hours.

3) Please mark your code when posting. There are three easy methods: (a) put grave accents (the backtick, on the same key as ~) before and after small snippets of text; (b) put at least four spaces at the start of each line of code, with a blank line before the code; (c) highlight the code and press the "code" button (101 010).

My guess is that your rex really reads like this...

 "(?ms)^[^\[\n]*\[(?P<SAF_pool>\w+\-\d+)"

That breaks down as "from the beginning of the line, throw away everything that is not an open square bracket or a newline, until you get to an open square bracket. After that, match one or more word characters, a hyphen, and one or more digits, and capture that match as SAF_pool." As such, your SAF_pool values are probably in the format ABCD-123, up to and including something as weird as AB1_cD-00123, since \w matches letters, digits, and underscores. If the prefix is always going to be alpha, then change the \w+ to [A-Za-z]+.
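To see that breakdown in action outside Splunk, here is a minimal sketch in Python, whose re module accepts the same (?P<name>...) named-group syntax as rex. The sample event text and the helper name extract_saf_pool are made up for illustration; the real log layout is not shown in this thread.

```python
import re

# The pattern guessed above, written with single backslashes.
SAF_POOL_RE = re.compile(r"(?ms)^[^\[\n]*\[(?P<SAF_pool>\w+\-\d+)")


def extract_saf_pool(raw):
    """Return the SAF_pool value from one raw event, or None if there is none."""
    match = SAF_POOL_RE.search(raw)
    return match.group("SAF_pool") if match else None


# Hypothetical events; only the "[POOL-123" shape matters to the pattern.
print(extract_saf_pool("2017-05-01 12:00:00 INFO [ABCD-123] SAF process started"))
# ABCD-123
print(extract_saf_pool("no brackets here"))
# None
```

Swapping \w+ for [A-Za-z]+ in the pattern would make the second style of value (AB1_cD-00123) stop matching, which is the point of that suggestion.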

4) Now here's a breakdown on your entire search code.

First, when there are subsearches, you always read the code starting from the innermost square brackets and work outward.

search index=vha_pronto sourcetype=pronto_neopil_prd "SAF process started" 
earliest=-24h 
| rex field=_raw "(?ms)^[^\[\\n]*\[(?P\\w+\-\\d+)" 
| return $SAF_pool 

My guess is that the rex really reads like this...

 "(?ms)^[^\[\n]*\[(?P<SAF_pool>\w+\-\d+)"

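If I am correct, then that rex extracts the SAF_pool value from each event selected by the subsearch, and the subsearch brackets feed the answer back to the outer search. Note that return with no count passes back only the first result, and the $ prefix passes back the bare value without the field name. Ending the subsearch with fields SAF_pool (or an explicit format) instead feeds back every pool started in the last 24 hours, in a form that looks like this...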

 ( ( SAF_pool="ABC-123" ) OR ( SAF_pool="XYZ-456" ) OR ... )

If you want to know why it turns into that format, look at the documentation for the "format" command.

After returning those values, the rest of the search then looks like this...

index=vha_pronto sourcetype=pronto_neopil_prd NOT ( ( SAF_pool="ABC-123" ) OR ( SAF_pool="XYZ-456" ) OR ... )

Which, as I said before, is basically asking, "show me events whose SAF_pool has no SAF process started record in the last 24 hours."
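The expansion described above can be modeled outside Splunk. This is only a rough Python sketch of the idea, not how Splunk implements it: format_subsearch and apply_not_filter are hypothetical helper names, and events are represented as plain dicts.

```python
def format_subsearch(field, values):
    """Build the OR-of-terms string that a subsearch hands to the outer search."""
    terms = ['( {}="{}" )'.format(field, value) for value in values]
    return "( " + " OR ".join(terms) + " )"


def apply_not_filter(events, field, excluded):
    """Keep events whose field value is not one of the excluded values.

    Events that lack the field are kept too, as with NOT field=value.
    """
    excluded = set(excluded)
    return [event for event in events if event.get(field) not in excluded]


# Pools that logged "SAF process started" in the last 24 hours (made-up values).
started = ["ABC-123", "XYZ-456"]
print(format_subsearch("SAF_pool", started))
# ( ( SAF_pool="ABC-123" ) OR ( SAF_pool="XYZ-456" ) )

events = [{"SAF_pool": "ABC-123"}, {"SAF_pool": "OLD-001"}, {}]
print(apply_not_filter(events, "SAF_pool", started))
# [{'SAF_pool': 'OLD-001'}, {}]
```

The second print shows the effect of the NOT: only events from pools that did not start in the window (or that have no SAF_pool at all) survive.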


woodcock
Esteemed Legend

Look at the explanation in the upper-right pane here:

https://regex101.com/r/OXBli1/1



ramanir
New Member

thanks a lot for your detailed answer @DalJeanis


DalJeanis
Legend

@ramanir - Be sure to mark your code with the code button (101 010) or by putting at least 4 spaces in front of it. That will stop the interface from stripping out anything that looks like HTML, such as the <SAF_pool> group name in your rex.


ramanir
New Member

corrected query

index=vha_pronto sourcetype=pronto_neopil_prd NOT [ search index=vha_pronto sourcetype=pronto_neopil_prd "SAF process started" earliest=-24h |rex field=_raw "(?ms)^[^\[\n]*\[(?P\w+\-\d+)" | return $SAF_pool ]


richgalloway
SplunkTrust

The rex command is usually used to extract fields from an event using a regular expression. This rex command is garbled, however, so it's difficult to say precisely what it is doing. Please re-post the query by surrounding it with backticks (`).

---
If this reply helps you, Karma would be appreciated.