How to work around SEDCMD trumping EXTRACT and TRANSFORM

chrismmckenna
New Member

I have events that look like the following:

1pjxVfF7i84nvqrD4p24UVa|2019-05-14 20:41:04.035:[0:T][T1847][PaymentMethodLogoRepositoryImpl][1300][]Fetch logo (consulate_0704c4eb6fb5)
1pjxVfF7i84nvqrD4p24UVa|    paymentMethod=Interac
1pjxVfF7i84nvqrD4p24UVa|    countryCode=CA

Note the repetition of 1pjxVfF7i84nvqrD4p24UVa| for every line of the log - sometimes the events are hundreds of lines long. The repetition is wasteful noise.

I want to extract the repeated value into a field (e.g. transaction_id="1pjxVfF7i84nvqrD4p24UVa"). For that I've used the following EXTRACT in props.conf, together with a SEDCMD to strip the value from _raw:

[cbms_merchant_logs]
EXTRACT-transaction_id = ^(?<transaction_id>\w{23})\|\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}:
SEDCMD-strip-transaction-id = s/(\w{23})\|//g

From what I understand, the SEDCMD executes first so the data won't be available for the EXTRACT.

How can I achieve the goals of key-value EXTRACTION and SEDCMD substitution? Multiple TRANSFORMS perhaps? Examples are appreciated.

somesoni2
Revered Legend

The SEDCMD is executed at parsing time (before indexing), so it should be deployed to your heavy forwarder or indexer, whichever the data reaches first. The EXTRACT is a search-time field extraction, so it runs after indexing is done, whenever a search is run against that sourcetype. By then the SEDCMD has already rewritten _raw.
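To make the ordering concrete, here is a minimal Python sketch of the two stages applied to the sample event from the question. It is only a stand-in: `parse_time` and `search_time` are hypothetical helper names, Python's `re` module is used in place of Splunk's regex engine, and the named group is written in Python's `(?P<name>...)` syntax.

```python
import re

# Sample event from the question: every line carries the same 23-character prefix.
RAW = (
    "1pjxVfF7i84nvqrD4p24UVa|2019-05-14 20:41:04.035:[0:T][T1847]"
    "[PaymentMethodLogoRepositoryImpl][1300][]Fetch logo (consulate_0704c4eb6fb5)\n"
    "1pjxVfF7i84nvqrD4p24UVa|    paymentMethod=Interac\n"
    "1pjxVfF7i84nvqrD4p24UVa|    countryCode=CA"
)

# The EXTRACT regex from the question, with the named group in Python syntax.
EXTRACT = re.compile(
    r"^(?P<transaction_id>\w{23})\|\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}:"
)

def parse_time(raw):
    # Stand-in for the SEDCMD that strips the prefix from every line (g flag).
    return re.sub(r"(\w{23})\|", "", raw)

def search_time(indexed_raw):
    # Stand-in for the search-time EXTRACT: it only ever sees what was indexed.
    match = EXTRACT.search(indexed_raw)
    return match.group("transaction_id") if match else None

indexed = parse_time(RAW)
print(search_time(RAW))      # 1pjxVfF7i84nvqrD4p24UVa (no SEDCMD applied)
print(search_time(indexed))  # None (the prefix was stripped before indexing)
```

The second call returns nothing because the value the EXTRACT looks for is no longer in the indexed event, which is why the ID has to be preserved somewhere in _raw at parsing time.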

For your use-case try something like this (props.conf on your heavy forwarder OR indexer)

[cbms_merchant_logs]
SEDCMD-transaction_id_ext = s/^(\w{23})(\|.+)/transactionId="\1"\2/
SEDCMD-strip-transaction-id = s/(\w{23})\|//g

FrankVl
Ultra Champion

That's what I would suggest as well. Keep the value on the first line of the event and remove it elsewhere. Note: with the naming you have now, wouldn't the second one (strip-transaction-id) be executed first, because "s" sorts before "t"? If so, the ID would already be gone by the time the first one runs.

An alternative would be to write a sed expression that only triggers on the subsequent lines (where the | is followed by some white space): SEDCMD-strip-transaction-id = s/(\w{23})\|\s+//g
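A rough way to check the ordering concern is to replay the substitutions in Python against the sample event. This is a sketch only: the function names are made up for illustration, Python's `re.sub` stands in for Splunk's sed-style replacement, and whether Splunk really applies SEDCMD classes in name order is the open question above, not something this code proves.

```python
import re

# Sample event from the question.
EVENT = (
    "1pjxVfF7i84nvqrD4p24UVa|2019-05-14 20:41:04.035:[0:T][T1847]"
    "[PaymentMethodLogoRepositoryImpl][1300][]Fetch logo (consulate_0704c4eb6fb5)\n"
    "1pjxVfF7i84nvqrD4p24UVa|    paymentMethod=Interac\n"
    "1pjxVfF7i84nvqrD4p24UVa|    countryCode=CA"
)

def tag_first_line(raw):
    # s/^(\w{23})(\|.+)/transactionId="\1"\2/   (no g flag: first match only)
    return re.sub(r"^(\w{23})(\|.+)", r'transactionId="\1"\2', raw, count=1)

def strip_all(raw):
    # s/(\w{23})\|//g   (removes the prefix wherever "ID|" occurs)
    return re.sub(r"(\w{23})\|", "", raw)

def strip_continuations(raw):
    # s/(\w{23})\|\s+//g   (only matches when white space follows the pipe)
    return re.sub(r"(\w{23})\|\s+", "", raw)

# Tag first, then strip: the ID survives on line one as transactionId="...".
good = strip_all(tag_first_line(EVENT))

# Strip first, then tag: nothing is left for the tagging expression to match.
bad = tag_first_line(strip_all(EVENT))

# The white-space variant never touches line one, so the order stops mattering.
a = strip_continuations(tag_first_line(EVENT))
b = tag_first_line(strip_continuations(EVENT))

print(good.splitlines()[0][:40])
print("transactionId" in bad)
print(a == b)
```

With the original strip expression, the outcome depends on which SEDCMD runs first. With the white-space variant both orders produce the same event, at the cost of also removing the leading indentation on the continuation lines.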

sloshburch
Ultra Champion

I made this an answer because I'm hoping that the lack of response meant you "answered" the question for @chrismmckenna. We'll find out by seeing if he accepts this as an answer.
