Getting Data In

Truncate Events Received on HEC Input on HF?

vinaykumar_aib
Observer

Good day Splunkers,

We have a data flow coming from source A to a Kafka topic. The Splunk Connector for Kafka uses a HEC token to forward data from the Kafka topic to a Splunk HF. The sourcetype is specified while configuring the HEC input.

This source has a huge event volume, and the events have many key-value pairs. To manage the high ingestion volume, I need to apply the truncate feature to all these events at the heavy forwarder layer before they reach the indexing layer.

Is it possible to choose only selected fields from these events and have them indexed?

Is it possible to apply a script to the sourcetype to format the data coming from the HEC input at the HF level?


vinaykumar_aib
Observer

@PickleRick Thanks for your response! I'm eager to know about any solution we could use within Splunk's features (scripts included)!


PickleRick
SplunkTrust

As I said - you can't spawn a script within a Splunk processing pipeline. So you're mostly limited to https://docs.splunk.com/Documentation/Splunk/9.0.4/Data/Anonymizedata
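
For reference, a minimal props.conf sketch of that approach on the HF, assuming your HEC token assigns a sourcetype (the sourcetype name and the JSON key below are just placeholders):

# props.conf on the HF that hosts the HEC input
[my_kafka_sourcetype]
# Hypothetical example: mask the value of a "password" key before the event is indexed
SEDCMD-mask_password = s/"password":"[^"]*"/"password":"xxxxx"/g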


PickleRick
SplunkTrust

1. No. The TRUNCATE option just cuts the event at a given point and doesn't care about the logical structure of the event. And since it happens relatively early in the event processing pipeline, all the data after the truncation point is irrevocably lost (see the sketch after this list).

2. No. Not while the events are being processed by Splunk's "internal" pipelines. If you want to manipulate the data prior to ingesting them you'd have to implement some form of a data-mangling proxy in front of your HEC input so that you'd first receive the event from your source, cut and splice it and then forward the resulting event to the HEC input.
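
For illustration, this is roughly what the TRUNCATE setting looks like in props.conf on the HF (the sourcetype name is just a placeholder for whatever your HEC token assigns):

# props.conf on the HF hosting the HEC input
[my_kafka_sourcetype]
# Cut every event at 5000 bytes; the default is 10000 and anything past the limit is discarded
TRUNCATE = 5000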

Another option: if you know that your data will always be in a pretty strictly defined format, you could use regexes to "extract" only some parts of the event using SEDCMD, but I suppose with this fancy stuff of yours 😉 (I've never worked with Kafka) you're getting some JSON or something like that.
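
A rough sketch of that SEDCMD idea, assuming a rigidly formatted JSON event (the sourcetype and key names are hypothetical):

# props.conf on the HF
[my_kafka_sourcetype]
# Hypothetical: keep only the "timestamp" and "status" keys and drop everything else
SEDCMD-keep_selected = s/^.*("timestamp":"[^"]*").*("status":"[^"]*").*$/{\1,\2}/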

You could try to use indexed fields to extract data using index-time extractions and then truncate events "manually", but that's generally not a very good idea. And in index-time processing you only have regexes and INGEST_EVAL at your disposal, so no fancy search-time parsed fields (which means that manipulating JSON structure is not easy, if not next to impossible).
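
A minimal sketch of that "manual" truncation with INGEST_EVAL, assuming a placeholder sourcetype and an arbitrary 2000-character cutoff:

# props.conf on the HF
[my_kafka_sourcetype]
TRANSFORMS-manual_truncate = manual_truncate

# transforms.conf on the HF
[manual_truncate]
# Hypothetical sketch: rewrite _raw to its first 2000 characters at index time
INGEST_EVAL = _raw=substr(_raw, 1, 2000)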

 
