Getting Data In

How is LZ4 faring so far in 6.3+ compared to gzip for indexer rawdata compression?

moonhound
Explorer

Digging through the new stuff in 6.3 in preparation for some upgrades, I see LZ4 compression is available for bucket rawdata journal compression in indexes.conf. Awesome! I'm excited. Splunk bucket data seems like it should be a great fit for LZ4's strengths.
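
For anyone following along, the setting in question looks roughly like the fragment below. This is only a sketch: the stanza name `my_index` is a made-up placeholder, and as I understand the indexes.conf spec, the setting only affects newly created buckets, so existing buckets stay in whatever format they were written in.

```
# indexes.conf (sketch; "my_index" is a hypothetical index name)
[my_index]
# Compression used for the rawdata journal of new buckets.
# gzip is the long-standing default; lz4 is the option added in 6.3.
journalCompression = lz4
```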

But LZ4 should also incur a measurable hit on storage needs over gzip, and algorithm benchmarks often focus on specific interesting data cases or a broad set of varying data types. Splunk's intake focus is pretty narrow by comparison, so I'm curious to see if anyone has any real-world numbers to throw down yet, since changing to LZ4 should change the calculations for capacity planning.
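
To make the capacity-planning side concrete, here is a rough sketch of the arithmetic I have in mind. The function names are my own, and the compression ratios are placeholder inputs, not measurements; the whole point of the question is to find out what the real ratios are. It also only covers the rawdata journal, since the index (tsidx) files in a bucket are not affected by this setting.

```python
def rawdata_storage_gb(daily_ingest_gb, retention_days, rawdata_ratio):
    """Estimate on-disk size of compressed rawdata journals, in GB.

    rawdata_ratio is compressed size / raw size (e.g. 0.15 means the
    journal takes 15% of the raw ingest). It must be measured on your
    own data; any value used here is an assumption.
    """
    if not 0 < rawdata_ratio <= 1:
        raise ValueError("rawdata_ratio must be in (0, 1]")
    return daily_ingest_gb * retention_days * rawdata_ratio


def extra_storage_pct(gzip_ratio, lz4_ratio):
    """Percent of additional rawdata storage needed by LZ4 relative to gzip."""
    return (lz4_ratio / gzip_ratio - 1) * 100


if __name__ == "__main__":
    # Placeholder ratios for illustration only; substitute measured values.
    assumed_gzip_ratio = 0.25
    assumed_lz4_ratio = 0.5
    for name, ratio in (("gzip", assumed_gzip_ratio), ("lz4", assumed_lz4_ratio)):
        size = rawdata_storage_gb(daily_ingest_gb=100, retention_days=10,
                                  rawdata_ratio=ratio)
        print(f"{name}: {size:.1f} GB of rawdata")
    print(f"extra for lz4: {extra_storage_pct(assumed_gzip_ratio, assumed_lz4_ratio):.1f}%")
```

Plugging in ratios measured from a sample of your own buckets under each setting would turn this into an actual planning number.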

1 Solution

gjanders
SplunkTrust

The only item I've seen on this is the following finding from the "Architecting Splunk for Epic Performance" conference talk by the Blizzard Splunk team:

"Bonus finding: LZ4 does not yield any substantial gains in performance that would be worth the tradeoff in extra storage vs. GZIP"

2manyhobbies
Engager

I'm planning to use LZ4 for my current engagement, although the compressed data will also be riding on top of Pure All Flash, so we should gain the benefit of their dedup. Pure has advised us to use LZ4 to get better dedup rates. I'll post some additional details when we get data ingestion rolling and have some real-world numbers I can share. I will not be able to compare it to gzip, though, as we're not planning to test that, nor do we have previous metrics to look at.

wcwong0
Engager

@2manyhobbies - We're about to stand up a new installation with Pure as hot/warm and are planning to use LZ4 given their recommendation. How has this been working out for you?
