Replace parts of a string where values of a multivalue field match

dtaylor

Unfortunately, I've hit the limit of my Splunk knowledge again, and I need some help. I'm trying to write a search that looks at an encoded email subject line and dynamically replaces the encoded parts with their decoded strings. There are two encodings: base64 and quoted-printable. To decode the encoded parts, I'm using two Splunk add-ons: MIME Decoder Add-on for Cisco ESA and Supporting Add-on for Base64 Conversion.

Take the below example:

| makeresults
| eval encoded_Subject = "\=?utf-8?B?QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy?\= \=?utf-8?Q?o:_Request_Your_Angel_Today!?\="
| eval mv_encoded_bstrings = mvappend("QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy")
| eval mv_encoded_qstrings = mvappend("\=?utf-8?Q?o:_Request_Your_Angel_Today!?\=")
| eval mv_decoded_bstrings = mvappend("Automatic reply: EXT:🧸🎁Be a Holiday Her")
| eval mv_decoded_qstrings = mvappend("o: Request Your Angel Today!")

In the above example, the subject line has one base64-encoded component and one quoted-printable component, which are extracted into the multivalue fields mv_encoded_bstrings and mv_encoded_qstrings respectively. These values are then decoded into the new multivalue fields mv_decoded_bstrings and mv_decoded_qstrings, where the value at a given index in an encoded field corresponds to the value at the same index in the matching decoded field.
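For reference, both components are RFC 2047 encoded-words, and Python's standard library can decode them in their original order. The following is only a sketch of that idea outside Splunk, not something either add-on is known to do; decode_subject is a hypothetical helper name, and the input is assumed to be the raw header value without the SPL backslash escapes.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode every RFC 2047 encoded-word in a header value, keeping the original order."""
    # decode_header splits the value into (bytes_or_str, charset) parts in order;
    # make_header reassembles those parts into a single readable string.
    return str(make_header(decode_header(raw_subject)))


raw = (
    "=?utf-8?B?QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy?= "
    "=?utf-8?Q?o:_Request_Your_Angel_Today!?="
)
decoded = decode_subject(raw)
print(decoded)
```

Note that RFC 2047 drops the whitespace between two adjacent encoded-words, so "Her" and "o:" should join into "Hero:" here rather than staying separated by a space.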

The number of encoded parts in this subject line is two, but it could just as easily be one, five, or none at all, in any combination of encodings: all base64, all quoted-printable, or a mix.

In this example, it's a mix.

I'm trying to find a way to search the original subject (the encoded_Subject field) for each value found in mv_encoded_bstrings or mv_encoded_qstrings. When a match is found, it should be replaced with the value at the same index in mv_decoded_bstrings or mv_decoded_qstrings.

The reason I can't just concatenate the values of the 'decoded' mv fields is that the order would be lost. In this case, the base64 part comes before the quoted-printable part, but it could just as easily have been the other way around. As such, something like mvjoin(mv_decoded_bstrings," ")." ".mvjoin(mv_decoded_qstrings," ") or some variation would lead to subjects with their words scrambled.
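The index-matched replacement described above can be sketched in Python to make the intent concrete. This is an illustration of the logic only, not SPL; replace_encoded_parts is a hypothetical helper name, and the sketch assumes each encoded list holds the complete encoded-word (including the =?utf-8?B? and ?= wrappers) rather than just the payload.

```python
def replace_encoded_parts(subject, encoded, decoded):
    """Replace encoded[i] with decoded[i] wherever it occurs in subject."""
    for enc, dec in zip(encoded, decoded):
        # str.replace is a literal substitution, so '+', '?' and '*' need no escaping.
        subject = subject.replace(enc, dec)
    return subject


subject = (
    "=?utf-8?B?QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy?= "
    "=?utf-8?Q?o:_Request_Your_Angel_Today!?="
)
b_encoded = ["=?utf-8?B?QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy?="]
b_decoded = ["Automatic reply: EXT:🧸🎁Be a Holiday Her"]
q_encoded = ["=?utf-8?Q?o:_Request_Your_Angel_Today!?="]
q_decoded = ["o: Request Your Angel Today!"]

# Each match is replaced where it sits in the subject, so the order of the
# two passes does not affect the order of the words in the result.
result = replace_encoded_parts(subject, b_encoded, b_decoded)
result = replace_encoded_parts(result, q_encoded, q_decoded)
print(result)
```

Because every substitution happens in place, the result keeps the layout of the original subject regardless of which encoding appears first.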

I feel like I'm hitting the wall of what's easily possible with basic SPL and encroaching on the territory of custom Python commands.

ITWhisperer

You could try something like this

| makeresults
| eval encoded_Subject = "\=?utf-8?B?QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy?\= \=?utf-8?Q?o:_Request_Your_Angel_Today!?\="
| eval mv_encoded_bstrings = ""
| eval mv_encoded_qstrings = ""
| eval mv_decoded_bstrings = ""
| eval mv_decoded_qstrings = ""
| eval mv_encoded_bstrings = mvappend(mv_encoded_bstrings,"QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy")
| eval mv_encoded_qstrings = mvappend(mv_encoded_qstrings,"\=?utf-8?Q?o:_Request_Your_Angel_Today!?\=")
| eval mv_decoded_bstrings = mvappend(mv_decoded_bstrings,"Automatic reply: EXT:🧸🎁Be a Holiday Her")
| eval mv_decoded_qstrings = mvappend(mv_decoded_qstrings,"o: Request Your Angel Today!")
| eval zipped_bstrings=mvzip(mv_encoded_bstrings,mv_decoded_bstrings,"|")
| eval zipped_qstrings=mvzip(mv_encoded_qstrings,mv_decoded_qstrings,"|")
| foreach mode=multivalue zipped_bstrings
    [| eval encoded_Subject=replace(encoded_Subject,replace(mvindex(split(<<ITEM>>,"|"),0),"\+","\\+"),mvindex(split(<<ITEM>>,"|"),1))]
| foreach mode=multivalue zipped_qstrings
    [| eval encoded_Subject=replace(encoded_Subject,replace(mvindex(split(<<ITEM>>,"|"),0),"(\+|\*|\?|\\\\)","\\\\\1"),mvindex(split(<<ITEM>>,"|"),1))]

Note the initialisation of the mv fields and the correction to the mvappend syntax; this ensures that the later foreach commands have actual mv fields to work with. This also assumes that your encoded strings contain no regex metacharacters other than +, *, ?, or \, and that neither the encoded nor the decoded values contain the | character used as the mvzip delimiter.
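To show why the escaping matters, here is a small Python sketch of the same idea (illustrative only, not SPL). The base64 payload contains a literal +, which a regex engine reads as a quantifier unless it is escaped. replace_literal is a hypothetical helper name, and "DECODED" is a placeholder value.

```python
import re


def replace_literal(subject, encoded, decoded):
    """Replace a literal substring via the regex engine, escaping all metacharacters."""
    # re.escape covers every metacharacter, not just + * ? and backslash.
    # The lambda keeps backslashes in `decoded` from being read as group references.
    return re.sub(re.escape(encoded), lambda _match: decoded, subject)


payload = "QXV0b21hdGljIHJlcGx5OiBFWFQ68J+nuPCfjoFCZSBhIEhvbGlkYXkgSGVy"
subject = "=?utf-8?B?" + payload + "?="

# Unescaped, "J+" means "one or more J", so the literal "+" never matches.
print(re.search(payload, subject))
print(replace_literal(subject, payload, "DECODED"))
```

The unescaped search finds nothing, while the escaped version replaces the payload as intended. The same reasoning is why the replace() calls in the search above escape the pattern before using it.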
