Extracting and concatenating regex captured groups in a single transform / extraction

althomas
Communicator

Hi all,

I'm trying to get pivots working with a user's data, but I'm having issues getting the fields auto-extracted prior to use in the pivots.

In our example, the user has decided to include commas in the response time log message. I want to have this extracted out as an integer, but I'm not having much luck.

Example:

rex field=message "Took (?<response_time_ms>\S+) ms" | rex mode=sed field=response_time_ms "s/,//g" | where response_time_ms > 1000

This is straightforward enough at search time, but I was wondering if there is a way to do it automagically, like so:

transforms.conf

[my_response_time]
FORMAT = response_time_ms::$1$2
REGEX = [Tt]ook (?:(\d+),){0,1}(\d+) ms
SOURCE_KEY = message

props.conf

[my_sourcetype]
REPORT-my_response_time = my_response_time

Is this possible in any way? Doing the above just gives response_time_ms the literal value "$1$2" rather than substituting the captured groups.

Cheers!!

Best regards,
Alex


micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to use a combination of props/transforms and eval (which can go in props).

althomas
Communicator

After fiddling for a bit, I've managed to find a solution to this which will extract it out automatically for me:

transforms.conf

[response_time_extract]
REGEX = Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms

props.conf

[test]
REPORT-test_field_extr = response_time_extract
EVAL-response_time_ms = if(isnull(resp_time_1),resp_time_2,resp_time_1 . resp_time_2)

The data looks (sort of) like this:

100
500
1,100
2,300

The transform always extracts the final group of digits into resp_time_2 (the whole value when it is under 1,000) and extracts the thousands group into resp_time_1 only when a comma is present. The EVAL then concatenates the two when resp_time_1 exists; otherwise it uses resp_time_2 on its own.
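
For reference, a possible variation on the above (a sketch only, not tested against this data): capture the digits and commas as a single string, then strip the commas in a calculated field. The field name response_time_raw is made up for illustration, the sourcetype name is the one from the question, and the extraction assumes the "Took ... ms" text appears in _raw. Because the pattern accepts any number of comma groups, it should also cover values of 1,000,000 and above, which the two-group regex (at most one comma) does not match.

```ini
# props.conf
[my_sourcetype]
# Inline search-time extraction; runs against _raw by default.
# response_time_raw is a hypothetical intermediate field name.
EXTRACT-response_time_raw = [Tt]ook (?<response_time_raw>\d[\d,]*) ms
# Calculated fields are evaluated after field extractions, so the raw value is available here.
# replace() strips the commas; tonumber() converts the result to a number.
EVAL-response_time_ms = tonumber(replace(response_time_raw, ",", ""))
```

With something like this in place, a search such as "where response_time_ms > 1000" should work without any rex in the search itself.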

micahkemp
Champion

Excellent. You should convert this comment to an answer and accept it.

althomas
Communicator

Tried to but failed. I've just moved it underneath yours instead and accepted yours (as it is correct as well!).
