
Syslog, data storage, buckets

ihingos
Engager

I'm looking to index and store a ton of data (syslog). My question is: once Splunk has indexed the data and moved it to the various buckets, is there any dedup or compression that happens? Is there a document someplace that explains the process in more detail?

Thanks


bmacias84
Champion

Hello ihingos,

To answer your question: Splunk does not dedup raw events, but it does compress them. You can, however, dedup events at search time in the search language (yoursearch | dedup _raw …). Depending on the cardinality of your data, you can get fairly high compression ratios. Compression will also vary depending on bucket and index sizes.
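To get a feel for how cardinality affects compression, here is a small Python sketch. It is illustrative only: it uses Python's zlib on made-up syslog-style lines, not Splunk's actual indexing pipeline, and the hosts, messages, and tokens are all hypothetical.

```python
import random
import zlib


def compression_ratio(lines):
    """Return compressed size divided by raw size for newline-joined lines."""
    raw = "\n".join(lines).encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)


rng = random.Random(0)  # fixed seed so repeated runs give the same data

# Low cardinality: a handful of distinct hosts and messages (hypothetical values)
hosts = ["web01", "web02", "db01"]
messages = ["session opened for user root", "session closed for user root"]
low = [
    f"Jan  1 00:00:{i % 60:02d} {rng.choice(hosts)} sshd: {rng.choice(messages)}"
    for i in range(5000)
]

# High cardinality: every line carries a random host number and token
high = [
    f"Jan  1 00:00:{i % 60:02d} host{rng.randrange(10**6)} app: token={rng.getrandbits(128):032x}"
    for i in range(5000)
]

low_ratio = compression_ratio(low)
high_ratio = compression_ratio(high)
print(f"low-cardinality ratio:  {low_ratio:.3f}")
print(f"high-cardinality ratio: {high_ratio:.3f}")
```

The repetitive data should compress to a much smaller fraction of its raw size than the data full of unique tokens, which is the effect described above.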

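In general, the sizing formula is: (daily average indexing rate) x (retention period) x 1/2. The 1/2 factor is a rough estimate of how much disk space the compressed raw data plus the index files take up relative to the incoming data.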
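As a quick worked example of the rule of thumb above, here is a minimal Python sketch. The function name, the 100 GB/day rate, and the 90-day retention are made-up values for illustration, and the 0.5 ratio is only the rough estimate from the formula; your real ratio depends on your data.

```python
def estimate_storage_gb(daily_gb, retention_days, disk_ratio=0.5):
    """Rough disk estimate: daily indexing rate x retention x on-disk ratio.

    disk_ratio=0.5 is the rule-of-thumb 1/2 factor, not a measured value.
    """
    return daily_gb * retention_days * disk_ratio


# Hypothetical example: 100 GB/day of syslog kept for 90 days
estimate = estimate_storage_gb(100, 90)
print(f"Estimated storage: {estimate:.0f} GB")  # 100 x 90 x 0.5 = 4500 GB
```

If you index a sample of your own syslog and compare the raw size with the resulting bucket size, you can pass that measured ratio as disk_ratio instead of the default.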

Additional Reading:

Estimate your storage requirements

How Splunk calculates disk storage

