Getting Data In

Truncate Events Received on HEC Input on HF ?

vinaykumar_aib
Observer

Good day Splunkers,

We have a data flow coming from source A to a Kafka topic. A Splunk connector on Kafka uses an HEC token to forward data from the Kafka topic to a Splunk HF. A sourcetype is specified while configuring the HEC input.

This source has a huge event volume, and the events have many key-value pairs. To manage the high ingestion volume, I need to apply truncation to all these events at the heavy forwarder layer before they reach the indexing layer.

Is it possible to choose only selected fields from these events and have them indexed?

Is it possible to use a script, applied to the sourcetype, to format the data coming from the HEC input at the HF level?

0 Karma

vinaykumar_aib
Observer

@PickleRick Thanks for your response!! I'm eager to know any solution we could implement within Splunk's features (scripts included)!!

0 Karma

PickleRick
SplunkTrust

As I said - you can't spawn a script within a Splunk processing pipeline. So you're mostly limited to https://docs.splunk.com/Documentation/Splunk/9.0.4/Data/Anonymizedata

0 Karma

PickleRick
SplunkTrust

1. No. The TRUNCATE setting just cuts the event at a given byte offset and doesn't care about the logical structure of the event. And since it's applied relatively early in the event processing pipeline, all the data after the truncation point is irrevocably lost.
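For reference, this is roughly what a TRUNCATE setting looks like in props.conf on the HF (the sourcetype name here is a hypothetical placeholder; use whatever your HEC token assigns):

```
# props.conf on the heavy forwarder
# "kafka:events" is a hypothetical sourcetype name
[kafka:events]
# cut each event after 10000 bytes; everything beyond that is discarded
TRUNCATE = 10000
```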

2. No. Not while the events are being processed by Splunk's "internal" pipelines. If you want to manipulate the data prior to ingesting it, you'd have to implement some form of a data-mangling proxy in front of your HEC input so that you'd first receive the event from your source, cut and splice it, and then forward the resulting event to the HEC input.
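A minimal sketch of the core of such a proxy, assuming the events arrive as JSON and you only want to keep selected top-level fields (the field names, URL, and token are all hypothetical placeholders, and real code would need TLS verification and error handling):

```python
import json
import urllib.request

# Fields to keep -- hypothetical names, adjust to your actual event schema
KEEP_FIELDS = {"timestamp", "host", "status", "message"}

def trim_event(raw_event: str, keep=KEEP_FIELDS) -> str:
    """Drop all top-level key-value pairs except the selected ones."""
    data = json.loads(raw_event)
    return json.dumps({k: v for k, v in data.items() if k in keep})

def forward_to_hec(event: str, url: str, token: str) -> None:
    """POST the trimmed event to Splunk's HEC event endpoint."""
    req = urllib.request.Request(
        url,  # e.g. https://hf.example.com:8088/services/collector/event
        data=json.dumps({"event": json.loads(event)}).encode(),
        headers={"Authorization": f"Splunk {token}"},
    )
    urllib.request.urlopen(req)
```

The point is that the trimming happens before Splunk ever sees the event, so no Splunk license or pipeline limitation applies to it.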

Another option: if you know that your data will always be in a pretty strictly defined format, you could use regexes to "extract" only some parts of the event using SEDCMD. But I suppose with this fancy stuff of yours 😉 (I've never worked with Kafka) you're getting JSON or something like that, where this gets fragile.
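For illustration, a SEDCMD stanza in props.conf uses sed-style s/// substitution; this hypothetical example (made-up sourcetype and pattern) keeps only one captured part of the raw event:

```
# props.conf -- "kafka:events" and the payload= pattern are hypothetical
[kafka:events]
# keep only the text captured between "payload=" and the next semicolon,
# dropping everything else in the raw event
SEDCMD-keep_payload = s/^.*payload=([^;]+).*$/\1/
```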

You could try to use index-time extractions to pull out data as indexed fields and then truncate the events "manually", but that's generally not a very good idea. And in index-time processing you only have regexes and INGEST_EVAL at your disposal, so no fancy search-time parsed fields (which means that manipulating the JSON structure is not easy, next to impossible).
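As a sketch of what INGEST_EVAL can do here (stanza name, sourcetype, and the 512-character cutoff are all hypothetical), you can rewrite _raw at index time with an eval expression:

```
# transforms.conf -- rewrite _raw at index time, keeping only its
# first 512 characters
[trim_raw]
INGEST_EVAL = _raw=substr(_raw, 1, 512)

# props.conf -- apply the transform to the (hypothetical) sourcetype
[kafka:events]
TRANSFORMS-trim = trim_raw
```

Note this is still a blunt character-level cut, not a structure-aware one.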

 

0 Karma