Getting Data In

How to decompress a single field (compressed JSON file) given the data has already been indexed in Splunk?

morin
New Member

We have a compressed (via python zlib) JSON file that is "chunked" prior to being indexed by Splunk.

The multiple events in Splunk (once indexed) can be pieced together (via Splunk's transaction command) yielding one event, containing multiple fields, one of which contains the compressed JSON file.

How do we decompress this one field in Splunk given the data has already been indexed?

(Decompressing earlier in the process, like during indexing, doesn't seem reasonable because data arrives in pieces due to various size limitations.)

Thanks.


rsennett_splunk
Splunk Employee

While Splunk uses zlib for compression internally, that is not something made available via commands out of the box.

That said, it does make sense to decompress the data before indexing (as a pre-process) since on the whole it will ALL be compressed again through the indexing process, using the same methodology that you use.

All indexed data is stored as compressed data (and usually sits on disk taking up 30%-70% less room than the raw data).

The other option is for you and yours to create a custom search command that takes a field as input (in line), runs it through zlib decompression in a Python script, and feeds the output back to Splunk where you can use it. You can read about building custom commands in the Splunk documentation.
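To make the custom command idea a bit more concrete, here is a minimal sketch of just the decompression logic such a script would need. It assumes the compressed bytes were base64-encoded before indexing (raw zlib output is binary and would not survive as a text field), and the field names `payload` and `payload_json` are hypothetical. Wiring this into Splunk as an actual search command (for example with the Splunk SDK for Python) is not shown.

```python
import base64
import json
import zlib


def decompress_field(encoded_value):
    """Decode a base64 string, inflate it with zlib, and return the JSON text."""
    compressed = base64.b64decode(encoded_value)
    text = zlib.decompress(compressed).decode("utf-8")
    json.loads(text)  # raises ValueError if the result is not valid JSON
    return text


def process_records(records, field="payload", out_field="payload_json"):
    """Add a decompressed copy of `field` to each record (a dict per event).

    `field` and `out_field` are hypothetical names; adjust to your data.
    """
    for record in records:
        try:
            record[out_field] = decompress_field(record[field])
        except (KeyError, ValueError, zlib.error) as exc:
            # Keep the event, but note why decompression failed.
            record[out_field] = ""
            record["decompress_error"] = str(exc)
        yield record
```

Failed events are passed through with a `decompress_error` field rather than dropped, so a bad chunk or an incomplete transaction is still visible in the search results.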

You have not mentioned any specifics regarding why your data "arrives in pieces due to various size limitations", so it's difficult to say whether these suggestions are viable for you.

The least complicated solution would be to create a scripted input (in Python, if you like) that decompresses the data as it feeds it to the indexer (which will, in turn, compress it and make it available to you at the same time).
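As a rough sketch of what such a scripted input could look like: Splunk indexes whatever a scripted input writes to stdout, so the script only has to reassemble the pieces, decompress them, and print the result. Everything about the chunk layout here is an assumption, since the question does not say how the pieces are stored: the chunks are taken to be raw slices of a single zlib stream, saved as files with a hypothetical `.chunk` extension whose sorted file names give the correct order.

```python
import sys
import zlib
from pathlib import Path


def reassemble(chunks):
    """Inflate an ordered iterable of byte chunks that form one zlib stream."""
    decompressor = zlib.decompressobj()
    parts = [decompressor.decompress(chunk) for chunk in chunks]
    parts.append(decompressor.flush())
    return b"".join(parts).decode("utf-8")


def main(chunk_dir):
    # Assumption: zero-padded names such as 0001.chunk, 0002.chunk, ...
    chunk_paths = sorted(Path(chunk_dir).glob("*.chunk"))
    if chunk_paths:
        text = reassemble(path.read_bytes() for path in chunk_paths)
        # Whatever is written to stdout is what Splunk would index.
        sys.stdout.write(text + "\n")


if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Using `zlib.decompressobj()` means the pieces never have to be joined into one large compressed blob first; each chunk is inflated as it is read.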

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!