Getting Data In

Help with SEDCMD Regex SPL for testing purposes

DanAlexander
Communicator

Hello clever people,

Would anyone be able to help me build a regex that works at the SPL level, e.g. something like:

| rex mode=sed field=_raw "s/regex_example/replacement/g"

I want to test the result first, before I add it to props.conf on the indexers.

Below is the raw log. I would like to keep just the parts in bold (action, origin, dst, layer_name, src); all the rest should be dropped/cleared.

-----------------------------------------------------

[meta sequenceId="-2077347367"]10000 - [action:"Accept"; conn_direction:"Internal"; flags:"dd06212"; ifdir:"inbound"; ifname:"bond3.32"; logid:"0"; loguid:"{ 000.000.000.000}"; origin:"000.000.000.000"; originsicname:"CN=XXXXXXXX,O= XXXXXXXX. XXXXXXXX.q7vvv"; sequencenum:"1457"; time:"1686217674"; version:"5"; __policy_id_tag:"product=cccccccc-1[db_tag={ XXXXXXXX-8ED31 XXXXXXXX };mgmt= XXXXXXXX xxx1;date=168XXXXXXXX;policy_name=XXXXXXXX-1\]"; dst:"000.000.000.000"; log_delay:"168XXXXXXXX "; layer_name:" XXXXXXXX "; layer_name:" XXXXXXXX "; layer_uuid:" XXXXXXXX -49d7-a207-a90ea5dd66fb"; layer_uuid:"cdc569c2-d869- XXXXXXXX "; match_id:"14x"; match_id:"50331649"; parent_rule:"0"; parent_rule:"0"; rule_action:"Accept"; rule_action:"Accept"; rule_name:" XXXXXXXX Heartbeat -> Platfxxxx"; rule_name:" XXXXXXXX "; rule_uid:"211567a0-d33a- XXXXXXXX "; rule_uid:" XXXXXXXX -4bde-a9c0-3cbaefd188b6"; product:" XXXXXXXX "; proto:"6"; s_port:" XXXXXXXX "; service:"3002"; service_id:"xxxx-Control"; src:"000.000.000.000"]

-----------------------------------------------

Thank you all in advance!


ITWhisperer
SplunkTrust
| rex mode=sed "s/.*\[(?<action>action:\"[^\"]+\").+(?<origin>origin:\"[^\"]+\").+(?<dst>dst:\"[^\"]+\").+(?<layer_name>layer_name:\"[^\"]+\").+(?<src>src:\"[^\"]+\").*/\1 \2 \3 \4 \5/g"

DanAlexander
Communicator

Hi @ITWhisperer ,

Hope you are doing well.

You helped me once before, so I wanted to ask whether you could help me with my new challenge, please.

My original post is in Re: Help with SEDCMD raw event size reduction - Splunk Community

Thank you in advance.

 


DanAlexander
Communicator

Thanks for your reply @ITWhisperer 

Unfortunately, I get the error "Unknown search command 'db'" when I run the following:

| makeresults
| eval _raw = "[meta sequenceId="-2077347367"]10000 - [action:"Accept"; conn_direction:"Internal"; flags:"dd06212"; ifdir:"inbound"; ifname:"bond3.32"; logid:"0"; loguid:"{ 000.000.000.000}"; origin:"000.000.000.000"; originsicname:"CN=XXXXXXXX,O= XXXXXXXX. XXXXXXXX.q7vvv"; sequencenum:"1457"; time:"1686217674"; version:"5"; __policy_id_tag:"product=cccccccc-1[db_tag={ XXXXXXXX-8ED31 XXXXXXXX };mgmt= XXXXXXXX xxx1;date=168XXXXXXXX;policy_name=XXXXXXXX-1\]"; dst:"000.000.000.000"; log_delay:"168XXXXXXXX "; layer_name:" XXXXXXXX "; layer_name:" XXXXXXXX "; layer_uuid:" XXXXXXXX -49d7-a207-a90ea5dd66fb"; layer_uuid:"cdc569c2-d869- XXXXXXXX "; match_id:"14x"; match_id:"50331649"; parent_rule:"0"; parent_rule:"0"; rule_action:"Accept"; rule_action:"Accept"; rule_name:" XXXXXXXX Heartbeat -> Platfxxxx"; rule_name:" XXXXXXXX "; rule_uid:"211567a0-d33a- XXXXXXXX "; rule_uid:" XXXXXXXX -4bde-a9c0-3cbaefd188b6"; product:" XXXXXXXX "; proto:"6"; s_port:" XXXXXXXX "; service:"3002"; service_id:"xxxx-Control"; src:"000.000.000.000"]"
| rex mode=sed "s/.*\[(?<action>action:\"[^\"]+\").+(?<origin>origin:\"[^\"]+\").+(?<dst>dst:\"[^\"]+\").+(?<layer_name>layer_name:\"[^\"]+\").+(?<src>src:\"[^\"]+\").*/\1 \2 \3 \4 \5/g"

Would it be possible to share the full search you used to test, please?

Thank you!


ITWhisperer
SplunkTrust

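You need to escape the double quotes in the string you are assigning to _raw. Otherwise the first inner quote ends the string and the rest of the event is parsed as SPL, which is probably where the `db` error comes from (the [db_tag=... part looks like a subsearch).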

| makeresults
| eval _raw="[meta sequenceId=\"-2077347367\"]10000 - [action:\"Accept\"; conn_direction:\"Internal\"; flags:\"dd06212\"; ifdir:\"inbound\"; ifname:\"bond3.32\"; logid:\"0\"; loguid:\"{ 000.000.000.000}\"; origin:\"000.000.000.000\"; originsicname:\"CN=XXXXXXXX,O= XXXXXXXX. XXXXXXXX.q7vvv\"; sequencenum:\"1457\"; time:\"1686217674\"; version:\"5\"; __policy_id_tag:\"product=cccccccc-1[db_tag={ XXXXXXXX-8ED31 XXXXXXXX };mgmt= XXXXXXXX xxx1;date=168XXXXXXXX;policy_name=XXXXXXXX-1\]\"; dst:\"000.000.000.000\"; log_delay:\"168XXXXXXXX \"; layer_name:\" XXXXXXXX \"; layer_name:\" XXXXXXXX \"; layer_uuid:\" XXXXXXXX -49d7-a207-a90ea5dd66fb\"; layer_uuid:\"cdc569c2-d869- XXXXXXXX \"; match_id:\"14x\"; match_id:\"50331649\"; parent_rule:\"0\"; parent_rule:\"0\"; rule_action:\"Accept\"; rule_action:\"Accept\"; rule_name:\" XXXXXXXX Heartbeat -> Platfxxxx\"; rule_name:\" XXXXXXXX \"; rule_uid:\"211567a0-d33a- XXXXXXXX \"; rule_uid:\" XXXXXXXX -4bde-a9c0-3cbaefd188b6\"; product:\" XXXXXXXX \"; proto:\"6\"; s_port:\" XXXXXXXX \"; service:\"3002\"; service_id:\"xxxx-Control\"; src:\"000.000.000.000\"]"
| rex mode=sed "s/.*\[(?<action>action:\"[^\"]+\").+(?<origin>origin:\"[^\"]+\").+(?<dst>dst:\"[^\"]+\").+(?<layer_name>layer_name:\"[^\"]+\").+(?<src>src:\"[^\"]+\").*/\1 \2 \3 \4 \5/g"
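For a sanity check outside Splunk, the substitution above can be approximated in Python. This is only a sketch under a few assumptions: Python's re module stands in for Splunk's PCRE engine, the event is a shortened mock-up of the sample, the named groups are written as plain groups (Python spells names as (?P<name>...), and the replacement only uses the group numbers anyway), and reduce_event is a made-up helper name.

```python
import re

# Shortened mock event with the same field layout as the sample in the thread.
EVENT = (
    '[meta sequenceId="-2077347367"]10000 - [action:"Accept"; '
    'conn_direction:"Internal"; origin:"000.000.000.000"; '
    'originsicname:"CN=XXXXXXXX"; '
    '__policy_id_tag:"product=cccccccc-1[db_tag={ XXXXXXXX };policy_name=XXXXXXXX-1\\]"; '
    'dst:"000.000.000.000"; layer_name:" XXXXXXXX "; layer_name:" XXXXXXXX "; '
    'proto:"6"; service_id:"xxxx-Control"; src:"000.000.000.000"]'
)

# Same pattern as the rex expression, without the SPL string escapes
# (no backslash before the double quotes) and with unnamed groups.
PATTERN = re.compile(
    r'.*\[(action:"[^"]+").+(origin:"[^"]+").+(dst:"[^"]+")'
    r'.+(layer_name:"[^"]+").+(src:"[^"]+").*'
)


def reduce_event(raw: str) -> str:
    """Keep only the five captured key:value pairs, separated by spaces."""
    return PATTERN.sub(r"\1 \2 \3 \4 \5", raw)


if __name__ == "__main__":
    print(reduce_event(EVENT))
```

Because the .+ parts are greedy, the layer_name that survives is the last one in the event, which matters if the repeated fields ever carry different values.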

DanAlexander
Communicator

Thanks again @ITWhisperer 

I just wanted to confirm that this is going to be the result after applying 

SEDCMD-remove_unwanted_parts_from_raw_event = s/.*\[(?<action>action:\"[^\"]+\").+(?<origin>origin:\"[^\"]+\").+(?<dst>dst:\"[^\"]+\").+(?<layer_name>layer_name:\"[^\"]+\").+(?<src>src:\"[^\"]+\").*/\1 \2 \3 \4 \5/g

action:"Accept" origin:"000.000.000.000" dst:"000.000.000.000" layer_name:" XXXXXXXX " src:"000.000.000.000"

I want to make sure the regex keeps the values shown above and drops everything else before the event is indexed on the indexers, and not the other way around.

Thank you!


PickleRick
SplunkTrust

That's why I suggested you check it in a test environment. At first glance, it seems you're escaping too much in your SEDCMD. While some characters work the same way even if unnecessarily escaped, others may not.
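As a rough illustration of that last point, here is a small Python sketch. Python's re module is only a stand-in (Splunk uses PCRE), and replace_with_hash is a made-up helper name; the two cases shown behave the same way in both engines as far as I know.

```python
import re


def replace_with_hash(pattern: str, text: str) -> str:
    """Replace every match of pattern in text with '#'."""
    return re.sub(pattern, "#", text)


if __name__ == "__main__":
    # An unnecessary backslash before a double quote changes nothing:
    # both patterns match a literal quote.
    print(replace_with_hash(r'\"', 'a"b'))
    print(replace_with_hash(r'"', 'a"b'))

    # A backslash before a letter can change the meaning completely:
    # "d" matches the letter d, while "\d" matches any digit.
    print(replace_with_hash(r"d", "dst=42"))
    print(replace_with_hash(r"\d", "dst=42"))
```

So the extra backslashes before the quotes are probably harmless, but stray backslashes elsewhere in a pattern are worth checking one by one.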


ITWhisperer
SplunkTrust

To be honest, I don't know for certain, but I think it should work. I don't usually get involved with the ingestion side of things. As @PickleRick suggests, you should test it before rolling it out to your production environment.

PickleRick
SplunkTrust

1. Use regex101.com for testing your regexes.

2. Test in a pre-prod environment: use mock data and send it to a temporary index.

3. Testing using regex SPL commands might lead to confusion sometimes since you have to escape your regex to "fit" into a string.

 


DanAlexander
Communicator

Thanks for the reply @PickleRick 

"3. Testing using regex SPL commands might lead to confusion sometimes since you have to escape your regex to "fit" into a string."

Would it be possible to provide any practical examples, please? Apologies, I cannot fully understand. 

Thank you!


PickleRick
SplunkTrust

Normally, if you want to perform, for example,

s/"/|/g

you type it literally in the SEDCMD definition.

But if you want to use it in SPL, you have to escape the quotation mark so that it doesn't end the string containing the regex. So it becomes

"s/\"/|/g"

And that's the simplest example. If you have multiple quotes and some backslashes in your regex, it can get messy, and "disarming" all those escapes to get the proper regex definition for SEDCMD might introduce additional mistakes.
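The round trip between the two forms can be sketched in Python. Both helpers are hypothetical and only handle double quotes, which is an assumption made to keep the example small; backslashes that already exist in the regex are passed through untouched, so real expressions still need a manual check.

```python
def to_spl_string(sed_expr: str) -> str:
    """Wrap a sed expression in double quotes for `rex mode=sed`,
    escaping only the embedded double quotes."""
    return '"' + sed_expr.replace('"', '\\"') + '"'


def from_spl_string(spl_arg: str) -> str:
    """Inverse of to_spl_string: strip the outer quotes and turn
    each backslash-quote pair back into a plain double quote."""
    inner = spl_arg[1:-1]
    return inner.replace('\\"', '"')


PROPS_FORM = 's/"/|/g'                 # what goes into the SEDCMD definition
SPL_FORM = to_spl_string(PROPS_FORM)   # what goes into the SPL search

if __name__ == "__main__":
    print(PROPS_FORM)
    print(SPL_FORM)
    print(from_spl_string(SPL_FORM) == PROPS_FORM)
```

Keeping the props.conf form as the single source and deriving the SPL form from it (rather than the other way around) is one way to avoid the "disarming" step altogether.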

DanAlexander
Communicator

I get this now. Thank you @PickleRick 

I might create separate SEDCMD entries to avoid confusion and keep it simple?

 
