Getting Data In

Truncate Events Received on HEC Input on HF?

vinaykumar_aib
Observer

Good day Splunkers,

We have data flowing from source A to a Kafka topic. A Splunk connector on Kafka uses an HEC token to forward data from the Kafka topic to a Splunk HF. The sourcetype is specified while configuring the HEC input.

This source has a huge event volume, and each event has many key-value pairs. To manage the high ingestion volume, I need to apply the truncate feature to all these events at the heavy forwarder layer before they reach the indexing layer.

Is it possible to choose only selected fields from these events and have them indexed?

Is it possible to apply a script to the sourcetype to format the data coming from the HEC input at the HF level?


vinaykumar_aib
Observer

@PickleRick Thanks for your response!! I'm eager to know any solution that we could use within Splunk's features (scripts included)!!


PickleRick
SplunkTrust

As I said - you can't spawn a script within a Splunk processing pipeline. So you're mostly limited to https://docs.splunk.com/Documentation/Splunk/9.0.4/Data/Anonymizedata


PickleRick
SplunkTrust

1. No. The TRUNCATE option just cuts the event at a given point and doesn't care about the logical structure of the event. And since it's applied relatively early in the event-processing pipeline, all the data after the truncation point is irrevocably lost.
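For reference, TRUNCATE is set per sourcetype in props.conf on the heavy forwarder. A minimal sketch (the sourcetype name here is a made-up example, not from the thread):

```ini
# props.conf on the heavy forwarder
# [kafka:events] is a hypothetical sourcetype name
[kafka:events]
# Cut each event at 10,000 bytes; everything past this point is discarded
# before indexing and cannot be recovered at search time.
TRUNCATE = 10000
```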

2. No. Not while the events are being processed by Splunk's "internal" pipelines. If you want to manipulate the data prior to ingesting them you'd have to implement some form of a data-mangling proxy in front of your HEC input so that you'd first receive the event from your source, cut and splice it and then forward the resulting event to the HEC input.
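The core of such a data-mangling proxy could be as simple as the sketch below: parse each JSON event, keep only a whitelist of fields, and forward the smaller event on to HEC. The field names and event shape are assumptions for illustration; the HTTP forwarding part is omitted.

```python
import json

# Hypothetical whitelist of fields to keep; adjust to your actual event schema.
KEEP_FIELDS = {"timestamp", "host", "status", "message"}

def slim_event(raw: str) -> str:
    """Parse a JSON event, drop every key not in KEEP_FIELDS,
    and return the reduced event, ready to POST to the HEC endpoint."""
    event = json.loads(raw)
    slim = {k: v for k, v in event.items() if k in KEEP_FIELDS}
    return json.dumps(slim)
```

The proxy would sit between the Kafka connector and the HF, calling something like `slim_event()` on each record before POSTing the result to the HEC collector endpoint.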

Another option: if you know that your data will always be in a pretty strictly defined format, you could use regexes with SEDCMD to "extract" only some parts of the event. But I suppose with this fancy stuff of yours 😉 (I've never worked with Kafka) you're getting JSON or something like that.
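A hedged SEDCMD sketch for props.conf (the sourcetype name and the field being stripped are assumptions; the regex only works if the field is a flat string value and would need tuning for real data):

```ini
# props.conf on the heavy forwarder
[kafka:events]
# Delete a verbose "payload" field from the raw event before indexing.
# SEDCMD rewrites _raw with a sed-style substitution; anything it removes
# never reaches the index.
SEDCMD-drop_payload = s/"payload":"[^"]*",?//g
```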

You could try to use index-time extractions to create indexed fields and then truncate events "manually", but that's generally not a very good idea. And in index-time processing you only have regexes and INGEST_EVAL at your disposal, so no fancy search-time parsed fields (which means that manipulating JSON structure is not easy/next to impossible).
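For completeness, an INGEST_EVAL-based trim would look roughly like this (stanza names and the 500-character cut-off are illustrative assumptions):

```ini
# transforms.conf
[trim_raw]
# Rewrite _raw at index time, keeping only its first 500 characters.
INGEST_EVAL = _raw=substr(_raw, 1, 500)

# props.conf - attach the transform to the sourcetype
[kafka:events]
TRANSFORMS-trim = trim_raw
```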

 
