Extracting and concatenating regex captured groups in a single transform / extraction

althomas
Communicator

Hi all,

I'm trying to get pivots working with a user's data, but I'm having issues getting the fields auto-extracted prior to use in the pivots.

In our example, the user has decided to include commas in the response time log message. I want to have this extracted out as an integer, but I'm not having much luck.

Example:

rex field=message "Took (?<response_time_ms>\S+) ms" | rex mode=sed field=response_time_ms "s/,//g" | where response_time_ms > 1000

This is straightforward enough at search time, but I was wondering if there was a way to do it automagically in configuration, like so:

transforms.conf

[my_response_time]
FORMAT = response_time_ms::$1$2
REGEX = [Tt]ook (?:(\d+),){0,1}(\d+) ms
SOURCE_KEY = message

props.conf

[my_sourcetype]
REPORT-my_response_time = my_response_time

Is this possible in any way? Doing the above just gives response_time_ms a value of "$1$2" literally, rather than replacing the value.

Cheers!!

Best regards,
Alex

micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to combine a props/transforms extraction with a calculated field (an EVAL- setting, which can go in props.conf).
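
As a rough illustration of that combination, here is a minimal props.conf sketch. It assumes the stanza name my_sourcetype from the question and that "Took ... ms" appears in the raw event. The field name response_time_raw is hypothetical and only used for this example. It relies on calculated fields being evaluated after search-time extractions, so the EVAL can refer to the extracted field.

```ini
# props.conf (sketch only; response_time_raw is a made-up intermediate field)
[my_sourcetype]
# Inline search-time extraction: capture the number with its commas intact
EXTRACT-response_time_raw = [Tt]ook (?<response_time_raw>[\d,]+) ms
# Calculated field: remove the commas, then convert the result to a number
EVAL-response_time_ms = tonumber(replace(response_time_raw, ",", ""))
```

Because the commas are stripped with replace() rather than captured as separate groups, this variant should not depend on how many commas the value contains.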

althomas
Communicator

After fiddling for a bit, I've managed to find a solution to this which will extract it out automatically for me:

transforms.conf

[response_time_extract]
REGEX = Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms

props.conf

[test]
REPORT-test_field_extr = response_time_extract
EVAL-response_time_ms = if(isnull(resp_time_1),resp_time_2,resp_time_1 . resp_time_2)

The data looks (sort of) like this:

100
500
1,100
2,300

The transform always extracts the last group of digits into resp_time_2, and extracts the digits before the comma into resp_time_1 only when a comma is present (that is, for values of 1,000 and above). The EVAL then concatenates the two fields if both exist; otherwise it uses resp_time_2 on its own.
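
One way to sanity-check this logic without touching any configuration is to replay the regex and the EVAL expression at search time against generated events. The search below is only a sketch: the four messages are made up from the sample values above, and rex stands in for the transform.

```spl
| makeresults
| eval message=split("Took 100 ms;Took 500 ms;Took 1,100 ms;Took 2,300 ms", ";")
| mvexpand message
| rex field=message "Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms"
| eval response_time_ms = if(isnull(resp_time_1), resp_time_2, resp_time_1 . resp_time_2)
| table message resp_time_1 resp_time_2 response_time_ms
```

The table should show response_time_ms as 100, 500, 1100 and 2300, with resp_time_1 empty for the first two rows. Note that the regex allows at most one comma, so a value such as 1,234,567 would not match as written.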

micahkemp
Champion

Excellent. You should convert this comment to an answer and accept it.

althomas
Communicator

Tried to but failed. I've just moved it underneath yours instead and accepted yours (as it is correct as well!).
