How to extract required data from the HEC _raw event and truncate the remaining information before indexing

Charlize
Engager

Hi,
A single event of mine is too long, so I want to extract and ingest only a specific part of it. The part is in the middle of the event, so I tried extracting it with BREAK_ONLY_BEFORE and BREAK_ONLY_AFTER. I also tried the LINE_BREAKER setting, but it does not work as expected. How can I define the start and end of the log in props.conf? Is there an alternative way to achieve this?

Log sample:

[screenshot: Charlize_0-1679905768527.png]

PickleRick
SplunkTrust

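BREAK_ONLY_BEFORE and BREAK_ONLY_AFTER are line-merging settings: they only take effect when SHOULD_LINEMERGE is set to true (which you should avoid unless there is really no other way). Note that props.conf has no setting named BREAK_ONLY_AFTER; the closest real one is MUST_BREAK_AFTER. Either way, these settings only decide where one event ends and the next begins. They do not remove anything from inside an event.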

To cut data from the event before indexing you'd normally use SEDCMD, with one caveat: you must write a regex that matches exactly the part you want to remove. With JSON that is hard. You can't simply match everything from one brace to another because:

1) The JSON can contain nested objects, so the first closing brace is not necessarily the one that ends your structure.

2) Even if you can assume there are no "substructures" in your JSON, a closing brace might simply be contained within a string value.
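
For illustration only, here is a minimal props.conf sketch of the SEDCMD approach. It assumes several things the original post does not confirm: a sourcetype called my_hec_sourcetype, a flat JSON event with no nesting, and a string field named "message" that holds the part worth keeping. All three names are hypothetical placeholders, so adjust them to your data and test against sample events first.

```ini
# props.conf on the Splunk instance that parses the HEC data
# (sourcetype and field name are hypothetical placeholders)
[my_hec_sourcetype]
# Replace the whole event with the value of the first "message" string field.
# (?s) lets "." span newlines; (?:[^"\\]|\\.)* tolerates escaped quotes
# inside the value.
SEDCMD-keep_message = s/(?s)^.*?"message"\s*:\s*"((?:[^"\\]|\\.)*)".*$/\1/
```

This only behaves well under the assumptions stated above. If "message" can also appear as a key in a nested object or as text inside another string, the regex may capture the wrong span, which is exactly the problem described in points 1 and 2. The captured value also keeps its JSON escape sequences as they are.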

You can use an ingest-time eval (INGEST_EVAL) with json_extract() to pull a given path out of the JSON structure, but that is a "heavier" solution than simple regex-based operations.
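
A rough sketch of the ingest-time eval variant could look like the following. The sourcetype my_hec_sourcetype, the transform name keep_payload_only and the JSON path payload.message are all hypothetical placeholders, and the sketch assumes a Splunk version recent enough to provide json_extract() at ingest time.

```ini
# props.conf (sourcetype name is a hypothetical placeholder)
[my_hec_sourcetype]
TRANSFORMS-keep_payload = keep_payload_only

# transforms.conf (stanza name and JSON path are hypothetical placeholders)
[keep_payload_only]
# Overwrite _raw with the value found at payload.message.
# coalesce() keeps the original event if json_extract() returns null,
# for example when the path does not exist in the event.
INGEST_EVAL = _raw=coalesce(json_extract(_raw, "payload.message"), _raw)
```

Because json_extract() parses the JSON rather than pattern-matching it, nested objects and braces inside strings should not confuse it. The price is more work per event than a single regex replacement.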

So it's not that easy. The right approach will depend on the actual data you have and how much you can safely assume about its structure.

 
