Splunk Search

Extracting and concatenating regex captured groups in a single transform / extraction

althomas
Communicator

Hi all,

I'm trying to get pivots working with a user's data, but I'm having issues getting the fields auto-extracted prior to use in the pivots.

In our example, the user has decided to include commas in the response time log message. I want to have this extracted out as an integer, but I'm not having much luck.

Example:

rex field=message "Took (?<response_time_ms>\S+) ms" | rex mode=sed field=response_time_ms "s/,//g" | where response_time_ms > 1000

This is straightforward enough at search time, but I was wondering if there as a way to do it automagically, like so:
transforms.conf

[my_response_time]
FORMAT = response_time_ms::$1$2
REGEX = [Tt]ook (?:(\d+),){0,1}(\d+) ms
SOURCE_KEY = message

props.conf

[my_sourcetype]
REPORT-my_response_time = my_response_time

Is this possible in any way? Doing the above just gives response_time_ms a value of "$1$2" literally, rather than replacing the value.

Cheers!!

Best regards,
Alex

0 Karma
1 Solution

micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to use a combination of props/transforms and eval (which can go in props).

View solution in original post

micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to use a combination of props/transforms and eval (which can go in props).

althomas
Communicator

After fiddling for a bit, I've managed to find a solution to this which will extract it out automatically for me:

transforms.conf

[response_time_extract]
REGEX = Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms

props.conf

[test]
REPORT-test_field_extr = response_time_extract
EVAL-response_time_ms = if(isnull(resp_time_1),resp_time_2,resp_time_1 . resp_time_2)

The data looks (sort of) like this:

100
500
1,100
2,300

The transforms will always extract out the numbers under 1000 and will only extract the numbers 1000 and above if they exist. It will then concatenate them if they both exist, otherwise it will only use the second capturing group.

0 Karma

micahkemp
Champion

Excellent. You should convert this comment to an answer and accept it.

0 Karma

althomas
Communicator

Tried to but failed. I've just moved it underneath yours instead and accepted yours (as it is correct as well!).

0 Karma
Get Updates on the Splunk Community!

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

  &#x1f680; Your data just got a serious AI upgrade — are you ready? Say hello to the Agentic Era with the ...

Stronger Security with Federated Search for S3, GCP SQL & Australian Threat ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Accelerating Observability as Code with the Splunk AI Assistant

We’ve seen in previous posts what Observability as Code (OaC) is and how it’s now essential for managing ...