Remove portion of multivalue field with rex-sed

michael_vi
Path Finder

Hi.

I have a file from which I want to remove a portion at index time.

I want to remove all the text between the two lines of asterisks (**************************************), along with the asterisk lines themselves.

For example:

**********************************************************************
Started at   : 25/02/16 04:07:04
Terminated at:                                                        
Elapsed time :                                                        
                                                                                                        
Software:
   Version: 6.0.0.0
   Built  : 6.0.0.0.20141102.1-Release_
            14/11/02 10:06:52
Context:
   Account: SOC
   Machine: NEW
   IP addr: 255.555.543
   CPU    : Dual-Core

LOG Recycle Count:                                                    
**********************************************************************
25/02/16 04:07:04.834 |     7904 | TEST1
25/02/16 04:07:04.834 |     7904 | TEST2
25/02/16 04:07:04.865 |     7860 | TEST3
25/02/16 04:07:04.881 |     7860 | TEST4
...

In the end, I need to get:

25/02/16 04:07:04.834 |     7904 | TEST1
25/02/16 04:07:04.834 |     7904 | TEST2
25/02/16 04:07:04.865 |     7860 | TEST3
25/02/16 04:07:04.881 |     7860 | TEST4

Please assist

Thanks

1 Solution

kiran_panchavat
Influencer

@michael_vi 

This search-time command removes everything between the two lines of asterisks, including the asterisk lines themselves:

rex mode=sed "s/\*{10,}[\s\S]*?\*{10,}\n//g"
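To check what that sed expression matches outside Splunk, here is a minimal Python sketch that applies the same pattern to a shortened copy of the sample from the question. The function name `strip_header` and the trimmed sample are assumptions made for illustration, and Python's `re` module stands in for Splunk's regex engine here.

```python
import re

# Pattern from the reply: a run of 10+ asterisks, the shortest possible
# stretch of any characters (newlines included), another run of 10+
# asterisks, and the newline that ends the closing asterisk line.
HEADER_BLOCK = re.compile(r"\*{10,}[\s\S]*?\*{10,}\n")


def strip_header(text: str) -> str:
    """Remove every asterisk-delimited block, delimiter lines included."""
    return HEADER_BLOCK.sub("", text)


# Shortened, illustrative version of the log in the question.
sample = (
    "*" * 70 + "\n"
    "Started at   : 25/02/16 04:07:04\n"
    "Software:\n"
    "   Version: 6.0.0.0\n"
    + "*" * 70 + "\n"
    "25/02/16 04:07:04.834 |     7904 | TEST1\n"
    "25/02/16 04:07:04.834 |     7904 | TEST2\n"
)

# Only the two log lines remain after the substitution.
print(strip_header(sample), end="")
```

In Splunk the expression is applied to one event at a time, so the whole header has to fall inside a single event for it to match. If the file uses Windows line endings, the trailing `\n` would likely need to become `\r?\n`. For index time, props.conf also has a SEDCMD-<class> setting that takes the same sed-style expression.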

For index time, you can apply the following configuration in props.conf and transforms.conf:

props.conf

[YOUR_SOURCETYPE]
TRANSFORMS-remove_header = remove_header_content

transforms.conf 

[remove_header_content]
REGEX = \*{10,}[\s\S]*?\*{10,}\n
FORMAT =
DEST_KEY = _raw

I hope this helps. If any reply helps you, you could add your upvote/karma points to that reply. Thanks.

michael_vi
Path Finder

Thanks!

gcusello
SplunkTrust

Hi @michael_vi ,

As @richgalloway and @kiran_panchavat said, you can use regex101 to find the right regex for cutting part of your JSON.

One point to watch: JSON has a well-defined structure, so be careful when cutting part of an event. If you break the JSON structure, INDEXED_EXTRACTIONS = json and the spath command will not work correctly, and you will have to parse all the fields manually!
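As a rough illustration of that warning, the Python sketch below cuts a field out of a small JSON event in two ways and checks whether the result still parses. The event, its field names, and the helper `is_valid_json` are made up for this example and do not come from the thread. Python's `json` module stands in for any strict JSON parser.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical event, invented for illustration.
event = '{"host": "NEW", "header": {"version": "6.0.0.0"}, "msg": "TEST1"}'

# Careless cut: removes the field but leaves its separating comma behind,
# producing '{"host": "NEW", , "msg": "TEST1"}'.
careless = event.replace('"header": {"version": "6.0.0.0"}', "")

# Careful cut: removes the field together with its trailing separator.
careful = event.replace('"header": {"version": "6.0.0.0"}, ', "")

print(is_valid_json(careless))  # False
print(is_valid_json(careful))   # True
```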

Ciao.

Giuseppe

kiran_panchavat
Influencer

@michael_vi You can try a regex to meet your requirement.

I hope this helps. If any reply helps you, you could add your upvote/karma points to that reply. Thanks.

richgalloway
SplunkTrust

What have you tried so far?  How did those results not meet expectations?

Have you experimented with https://regex101.com?

---
If this reply helps you, Karma would be appreciated.