Data filtration at field level using SED option

gvnd
Path Finder

Hi,
I am new to Splunk. I want to filter data at the field level, rather than the event level, before it is indexed. The data is pipe (|) separated.
I need only a few fields from the data below; the remaining fields are not required.
Example:
event1- 123|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
event2- 234|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
event3- 456|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12

Desired output:
event1- 123|field3|field6|field11
event2- 234|field3|field6|field11
event3- 456|field3|field6|field11

Please suggest a regex that works with the SEDCMD option in props.conf to keep only these fields.

Thanks in advance...


woodcock
Esteemed Legend

For demonstration:

| makeresults 
| eval raw="123|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
234|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
456|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12"
| makemv delim="
" raw
| mvexpand raw
| rename raw AS _raw
| rex mode=sed "s/^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$/\1\2\3\4/"

Therefore use:

SEDCMD-fields_0_3_6_11 = s/^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$/\1\2\3\4/

gvnd
Path Finder

Thanks for the quick response.
Could you please explain the meaning of this part: .*$/\1\2\3\4/

woodcock
Esteemed Legend

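Go to RegEx101.com and enter the first portion, ^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$ , and it will show you what each part does. The .* matches whatever is left of the line after the last captured field (here, |field12), and the dollar sign anchors the match to the end of the string. In the replacement, \# (a backslash followed by a number) dereferences a capture group, so \1 gives the value of the first capture group, \2 the second, and so on.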
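
If you would rather see the pieces outside of RegEx101, here is a minimal Python sketch of the same idea. It is not Splunk: Python's re module stands in for the regex engine Splunk uses, on the assumption that this particular pattern behaves the same way in both, and the variable names are made up for the example.

```python
import re

# Same pattern as in the SEDCMD above; Python's `re` is a stand-in for
# Splunk's regex engine (an assumption, not a statement about Splunk internals).
PATTERN = re.compile(
    r"^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$"
)

event = "123|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12"

# Each (...) is a capture group; each (?:...) is skipped without capturing.
match = PATTERN.match(event)
groups = match.groups()
print(groups)     # ('123|', 'field3|', 'field6|', 'field11')

# \1\2\3\4 in the replacement glues the four captured pieces back together.
rewritten = PATTERN.sub(r"\1\2\3\4", event)
print(rewritten)  # 123|field3|field6|field11
```

Note that the first three groups include their trailing pipe and the fourth does not, which is why the result comes out pipe separated with no pipe at the end.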

gvnd
Path Finder

Sorry, I still didn't get the point. Also, is it possible to give field names to the extracted fields? For example:
"ONE" for the first field, i.e. 123,
"TWO" for the second field, i.e. field3,
"THREE" for the third field, i.e. field6,
"FOUR" for the fourth field, i.e. field11, etc.

Also, don't we need the 'g' option to replace empty strings in events with the SEDCMD syntax (SEDCMD-<class> = s/<regex>/<replacement>/g)?

Thanks for your patience..

woodcock
Esteemed Legend

A RegEx that begins with ^ and ends with $ matches THE ENTIRE STRING, so we only need 1 match (i.e. no g on the end). We are saying: replace THE ENTIRE STRING with the 4 captured pieces, back to back. Run it on RegEx101.com and it walks you through each piece.
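
To see why the g makes no difference here, a small Python sketch can compare a single replacement with a replace-all. Again this is only an approximation: Python's re module is assumed to behave like Splunk's engine for this pattern, count=1 plays the role of "no g", and the default count=0 plays the role of "g".

```python
import re

# Same anchored pattern as in the SEDCMD; ^ ... $ covers the whole event.
PATTERN = r"^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$"
REPLACEMENT = r"\1\2\3\4"

event = "234|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12"

# subn returns (new_string, number_of_replacements_made).
once, n_once = re.subn(PATTERN, REPLACEMENT, event, count=1)  # like s/.../.../
every, n_every = re.subn(PATTERN, REPLACEMENT, event)         # like s/.../.../g

print(once == every)    # True
print(n_once, n_every)  # 1 1
```

Because the one match already consumes the whole event, there is nothing left over for a second match to find.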
