Is it possible to remove unnecessary JSON wrapping before it's ingested to save license?


Hey there, we have a large volume of data (about 500-600 GB) coming in daily, but about 200 GB of this is a JSON wrapper from Amazon Firehose. The data essentially looks like this:


    {
        "message": "ACTUAL_DATA_WE_WANT",
        "logGroup": "/use1/prod/eks/primary/containers",
        "logStream": "fluent-bit/cross-services/settings-7dbb9dbdb4-qjz5b/settings-api/81d3685eaaeae0effab5931590784016ce75a8171ad7e3e76152e30bd732a739",
        "timestamp": 1675349068034
    }


As you can see, ACTUAL_DATA_WE_WANT is what we need. This contains everything including timestamp and application information. The JSON wrapper is added by Firehose and makes up at least 250 bytes of every event.

Is it possible to remove all of this unnecessary data so that we can save ingestion for more useful things? I have heard that SEDCMD can do this, but that it is resource intensive, and we ingest almost a billion events a day.



Usually, this is done with SEDCMD. The resource use depends on the efficiency of the regex used. Test the regex and evaluate the resource usage on your dev/test instances.
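Before touching an indexer, a candidate pattern can be sanity-checked offline. The following is only a rough Python sketch: the sample event, the pattern, and the helper names `strip_wrapper` and `bytes_saved` are all hypothetical, and it assumes each event is a single JSON object with `message` as the first key. Python's `re` module is not the regex engine Splunk uses, so this checks correctness and byte savings only; CPU cost still has to be measured on a dev/test instance.

```python
import re

# Hypothetical sample event modelled on the wrapper shown in the question.
SAMPLE = (
    '{"message": "2023-02-02T14:44:28Z INFO settings-api started", '
    '"logGroup": "/use1/prod/eks/primary/containers", '
    '"logStream": "fluent-bit/cross-services/settings-api", '
    '"timestamp": 1675349068034}'
)

# Candidate pattern (an assumption, not a tested production regex):
# capture the message string, allowing backslash escapes inside it,
# and match the rest of the wrapper so it can be discarded.
WRAPPER = re.compile(r'^\{"message":\s*"((?:[^"\\]|\\.)*)".*\}$', re.DOTALL)


def strip_wrapper(event: str) -> str:
    """Return only the message payload, or the event unchanged if it does not match."""
    # A function replacement avoids backslash processing in the captured text.
    return WRAPPER.sub(lambda m: m.group(1), event, count=1)


def bytes_saved(event: str) -> int:
    """Number of UTF-8 bytes removed from this event by strip_wrapper."""
    return len(event.encode("utf-8")) - len(strip_wrapper(event).encode("utf-8"))


print(strip_wrapper(SAMPLE))
print(bytes_saved(SAMPLE), "bytes saved on the sample event")
```

Note that a regex-only approach leaves any JSON escapes (such as `\"`) inside the message as they are, which matters if ACTUAL_DATA_WE_WANT itself contains quotes or is JSON.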

Another option is to use Cribl to remove the unwanted bytes.

If this reply helps you, Karma would be appreciated.


Since your events are pure JSON, you could probably try INGEST_EVAL with json_extract.
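The idea behind this suggestion is to replace the raw event with the value of its `message` field using a JSON-aware function instead of a regex. As a rough illustration of the difference, here is a minimal Python sketch of that logic. It is not Splunk configuration: `extract_message` and the sample event are hypothetical, and whether Splunk's `json_extract` handles escapes and malformed events the same way should be verified on a test instance.

```python
import json


def extract_message(raw: str) -> str:
    """Return the decoded "message" field, or the raw event if it is not the expected JSON."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: leave the event untouched rather than lose data.
        return raw
    if isinstance(parsed, dict) and isinstance(parsed.get("message"), str):
        return parsed["message"]
    return raw


# Hypothetical event whose payload contains quotes, which the wrapper stores escaped.
event = json.dumps({
    "message": 'user="alice" action=login',
    "logGroup": "/use1/prod/eks/primary/containers",
    "timestamp": 1675349068034,
})

print(extract_message(event))
```

Because the JSON parser decodes the string, escaped quotes in the payload come back as plain quotes, and events that do not match the expected shape pass through unchanged.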
