How to remove everything until the date but include it using SED (before indexing)?

newsplunker1
Path Finder

Hi All,

I'm struggling to remove everything before the date using SED.

Example 

| makeresults | eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
 

1- I want to remove everything up to 02-Feb-2021 (the date itself should be kept).

      I tried this, with no luck: rex mode=sed field=_raw "s/[^[0-9]{2}-[\w]{3}-[0-9]{4}]*//g"

The desired result would be something like this:

02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)

 

2- I want to keep only the last 3 parts of the domain 

Example 

cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com

Desired result:        fp.measure.office.com

I have a working rex that does this at search time, but how can I use SED so that nothing before the last 3 parts of the domain gets indexed?

| rex field=_raw "query:\s(?:[^\s.]+\.)*(?P<query_dns1>[^\s.]+\.[^\s.]+\.[^\s.]+)"

 

THANK YOU!!


yeahnah
Motivator

Hi @newsplunker1 

Removing content before indexing requires the SEDCMD-<class> setting in props.conf. It is applied at the parsing layer, before the event is indexed, so it has to be configured on either a Splunk heavy forwarder or an indexer, depending on your environment, which you have not described.

The SEDCMD configuration would look something like this (assuming the events have a consistent format):

props.conf

[<your event's sourcetype>]
...<existing config, if any>...
SEDCMD-shorten-event = s/.+?\] //
SEDCMD-strip-url = s/(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)/\1\3\4\6/

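This is best deployed in a simple app, e.g. $SPLUNK_HOME/etc/apps/<app_name>/local/props.conf

Restart the Splunk server to load the new configuration and test your inputs. SEDCMD only applies to events indexed after the change; data that is already indexed is not modified.

Here's a run-anywhere search example: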

| makeresults 
| eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
| rex mode=sed field=_raw "s/.+?\] //"
| rex "query: [^\.]+\.(?<query_dns1>[^ ]+)"
| rex field=_raw mode=sed "s/(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)/\1\3\4\6/"
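
If no Splunk instance is handy, the two substitutions can also be sanity-checked offline. The following is only a rough sketch: it uses Python's re module as a stand-in for Splunk's regex engine (an assumption, the two are not identical), and the names SHORTEN, STRIP_URL and apply_sed are made up for illustration. Note that it strips only the first label of each hostname, which is what the sed expressions above do.

```python
import re

# Sample event from the question.
event = (
    "Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] "
    "02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 "
    "(cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: "
    "cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
)

# Same patterns as the two sed expressions in the search above.
# Hypothetical names; Python's re is assumed to behave like Splunk here.
SHORTEN = r".+?\] "
STRIP_URL = r"(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)"


def apply_sed(raw: str) -> str:
    """Apply both substitutions once each (sed without the g flag)."""
    raw = re.sub(SHORTEN, "", raw, count=1)
    return re.sub(STRIP_URL, r"\1\3\4\6", raw, count=1)


result = apply_sed(event)
print(result)
```

The printed event should start at the 02-Feb-2021 date and show fp.measure.office.com in both places where the full hostname appeared.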

Hope this helps

 


newsplunker1
Path Finder

Thank you - That worked for me 


isoutamo
SplunkTrust

Hi

Here is one example:

| makeresults 
| eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
| rex mode=sed "s/((?:[\w\s\d\.:]*)(?:\[(?:[^]]+)\][:\s]*))//g"

It expects that the start of _raw always has the same kind of format.

A good place to test your regex: https://regex101.com/r/nGoUt3/1
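
If the start of _raw is not that consistent, one possible alternative is to anchor on the date itself instead of on the bracketed syslog prefix. The sketch below is an assumption, not something tested in Splunk: it uses Python's re module as a stand-in, and DATE_ANCHORED and strip_to_date are hypothetical names. The equivalent sed expression would presumably be s/^.*?(\d{2}-[A-Za-z]{3}-\d{4} )/\1/, which should be verified against real events before being used in SEDCMD.

```python
import re

# Sample event from the question.
event = (
    "Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] "
    "02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 "
    "(cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: "
    "cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
)

# Lazily skip everything up to the first dd-Mon-yyyy date followed by a space,
# and keep the date by capturing it (hypothetical pattern, see the assumptions above).
DATE_ANCHORED = r"^.*?(\d{2}-[A-Za-z]{3}-\d{4} )"


def strip_to_date(raw: str) -> str:
    """Drop the prefix before the first dd-Mon-yyyy date; leave other events untouched."""
    return re.sub(DATE_ANCHORED, r"\1", raw, count=1)


print(strip_to_date(event))
```

An event that contains no such date is returned unchanged, so a format change upstream would leave events intact rather than truncating them.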

r. Ismo
