Splunk Search

How to remove everything until the date but include it using SED (before indexing)?

newsplunker1
Path Finder

Hi All,

I'm struggling to remove everything before the date using SED.

Example 

| makeresults | eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
 

1- I want to remove everything before 02-Feb-2021 (the date itself should be kept).

      I tried this, with no luck:  rex mode=sed field=_raw "s/[^[0-9]{2}-[\w]{3}-[0-9]{4}]*//g"

The desired result would be something like this:

02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)

 

2- I want to keep only the last 3 parts of the domain 

Example 

cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com

Desired result: fp.measure.office.com

I have a working rex that does that, but how can I use SED to avoid indexing anything before the last 3 parts of the domain?

| rex field=_raw "query:\s(?:[^\s.]+\.)*(?P<query_dns1>[^\s.]+\.[^\s.]+\.[^\s.]+)"

 

THANK YOU!!

1 Solution

yeahnah
Motivator

Hi @newsplunker1 

Removing content before indexing requires the SEDCMD-<class> setting in props.conf. It is applied at the parsing stage, before the event is indexed, so it can be configured on either a Splunk heavy forwarder or an indexer, depending on your environment, which you have not described.

Using the SEDCMD configuration would look something like this (assuming the events have a consistent format).

props.conf

[<your event's sourcetype>]
...<existing config, if any>...
SEDCMD-shorten-event = s/.+?\] //
SEDCMD-strip-url = s/(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)/\1\3\4\6/

This is best deployed in a simple app, e.g. $SPLUNK_HOME/etc/apps/<app_name>/local/props.conf

Restart Splunk to load the new configuration, then test your inputs.

Here's a run-anywhere search example:

| makeresults 
| eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
| rex mode=sed field=_raw "s/.+?\] //"
| rex "query: [^\.]+\.(?<query_dns1>[^ ]+)"
| rex field=_raw mode=sed "s/(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)/\1\3\4\6/"
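
If you want to sanity-check the same expressions outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's PCRE for these particular patterns, which should hold here but is not guaranteed in general. The variable names are only illustrative, and the named group is spelled (?P<...>) in Python.

```python
import re

# Sample event copied from the question.
raw = (
    "Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] "
    "02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 "
    "(cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: "
    "cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
)

# Equivalent of s/.+?\] // : drop everything up to and including the first "] ".
shortened = re.sub(r".+?\] ", "", raw, count=1)

# Equivalent of the search-time extraction of query_dns1.
match = re.search(r"query: [^\.]+\.(?P<query_dns1>[^ ]+)", shortened)
query_dns1 = match.group("query_dns1")

# Equivalent of the strip-url sed: groups 2 and 5 (the leftmost label plus
# its trailing dot) are left out of the replacement.
stripped = re.sub(
    r"(.+? \()([^\.]+\.)([^\)]+)(\): query: )([^\.]+\.)(.*)",
    r"\1\3\4\6",
    shortened,
    count=1,
)

print(shortened)
print(query_dns1)
print(stripped)
```

Note that the strip-url expression removes only the leftmost label of the domain. That gives fp.measure.office.com for this sample, but a name with more or fewer labels would come out differently.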

Hope this helps

 

newsplunker1
Path Finder

Thank you - That worked for me 


isoutamo
SplunkTrust

Hi

Here is one example:

| makeresults 
| eval _raw="Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] 02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 (cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
| rex mode=sed "s/((?:[\w\s\d\.:]*)(?:\[(?:[^]]+)\][:\s]*))//g"

It expects the start of _raw to always have the same format.

A good place to test your regex: https://regex101.com/r/nGoUt3/1
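
The same expression can also be checked locally. This is a minimal Python sketch, assuming Python's re module treats this pattern the same way Splunk's PCRE does. The names pattern and cleaned are only illustrative.

```python
import re

# Sample event copied from the question.
raw = (
    "Feb 2 14:27:50 test.dh.test.com named[123456]: [ID xxxxx local6.info] "
    "02-Feb-2021 14:27:50.448 queries: info: client @172ac618 x.x.x.x#61721 "
    "(cd1ae518178748adbca4cff52b24b791.fp.measure.office.com): query: "
    "cd1ae51817ffdfdfhhh8748adbca4cff52b24b791.fp.measure.office.com IN A + (x.x.x.x)"
)

# Same pattern as in the rex command: optional leading text made of word
# characters, whitespace, dots and colons, then one [...] block and any
# trailing colons or whitespace.
pattern = re.compile(r"((?:[\w\s\d\.:]*)(?:\[(?:[^]]+)\][:\s]*))")

# re.sub replaces every match, like the g flag in the sed expression.
# Here it matches twice: once ending at "[123456]: " and once at
# "[ID xxxxx local6.info] ".
cleaned = pattern.sub("", raw)

print(cleaned)
```

Because of the g flag, any later [...] block in an event would be removed as well, so this relies on brackets appearing only in the header.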

r. Ismo
