Splunk Search

Extracting and concatenating regex captured groups in a single transform / extraction

althomas
Communicator

Hi all,

I'm trying to get pivots working with a user's data, but I'm having issues getting the fields auto-extracted prior to use in the pivots.

In our example, the user has decided to include commas in the response time log message. I want to have this extracted out as an integer, but I'm not having much luck.

Example:

rex field=message "Took (?<response_time_ms>\S+) ms" | rex mode=sed field=response_time_ms "s/,//g" | where response_time_ms > 1000

This is straightforward enough at search time, but I was wondering if there is a way to do it automagically, like so:
transforms.conf

[my_response_time]
FORMAT = response_time_ms::$1$2
REGEX = [Tt]ook (?:(\d+),){0,1}(\d+) ms
SOURCE_KEY = message

props.conf

[my_sourcetype]
REPORT-my_response_time = my_response_time

Is this possible in any way? Doing the above just gives response_time_ms a value of "$1$2" literally, rather than replacing the value.

Cheers!!

Best regards,
Alex

micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to use a combination of props/transforms and eval (which can go in props).

althomas
Communicator

After fiddling for a bit, I've managed to find a solution to this which will extract it out automatically for me:

transforms.conf

[response_time_extract]
REGEX = Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms

props.conf

[test]
REPORT-test_field_extr = response_time_extract
EVAL-response_time_ms = if(isnull(resp_time_1),resp_time_2,resp_time_1 . resp_time_2)

The data looks (sort of) like this:

100
500
1,100
2,300

The transform always extracts the trailing group of digits into resp_time_2, and extracts the leading (thousands) group into resp_time_1 only when a comma is present, i.e. for values of 1,000 and above. The EVAL then concatenates the two when both exist; otherwise it uses resp_time_2 alone.
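For anyone who wants to sanity-check the regex and the concatenation logic outside Splunk, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not how Splunk runs it: the function name response_time_ms and the "Took N ms" sample messages are made up for the example (built from the data listed above), Python writes named groups as (?P<name>...) rather than (?<name>...), and the final int() conversion is an addition that the EVAL above does not perform.

```python
import re

# Python spells named groups (?P<name>...); the Splunk REGEX uses (?<name>...).
PATTERN = re.compile(r"Took (?:(?P<resp_time_1>\d+),){0,1}(?P<resp_time_2>\d+) ms")


def response_time_ms(message):
    """Mimic the REPORT extraction plus the EVAL concatenation for one event.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(message)
    if match is None:
        return None
    thousands = match.group("resp_time_1")  # None when there is no comma
    remainder = match.group("resp_time_2")
    # Same branch as: if(isnull(resp_time_1), resp_time_2, resp_time_1 . resp_time_2)
    joined = remainder if thousands is None else thousands + remainder
    # Assumption: converting to int here is an addition for the example.
    return int(joined)


# Assumed message format, based on the sample values listed above.
samples = ["Took 100 ms", "Took 500 ms", "Took 1,100 ms", "Took 2,300 ms"]
results = [response_time_ms(m) for m in samples]
print(results)
```

One thing the sketch makes visible: because the optional group is limited to {0,1}, a value with two commas such as "Took 1,234,567 ms" does not match at all in Python. I would expect the same from the transform, though I have not verified that in Splunk.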

micahkemp
Champion

Excellent. You should convert this comment to an answer and accept it.

althomas
Communicator

Tried to but failed. I've just moved it underneath yours instead and accepted yours (as it is correct as well!).
