Can anyone explain the tsidxWritingLevel values 1 to 4?
tsidxWritingLevel = [1|2|3|4]
Reference - https://docs.splunk.com/Documentation/Splunk/8.1.1/Admin/Indexesconf?_ga=2.85851486.671277735.164662....
Hi,
I recently wrote a short blog article about this setting:
How tsidxWritingLevel affects storage size and performance - https://www.batchworks.de/tsidx-storage-performance/
Hope it helps,
Andreas
Hi @schose
Could you please share the Splunk query you used to check the bucket sizes before and after?
Thank you.
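One way to compare sizes without a search is to measure the bucket directories on disk. The following is only a minimal sketch and not necessarily the method used in the blog: it assumes the usual bucket layout, with `.tsidx` files at the top level of the bucket directory and the compressed journal under a `rawdata` subdirectory. The function name `bucket_sizes` is hypothetical. Splunk's `dbinspect` search command also reports per-bucket figures such as `sizeOnDiskMB` and `rawSize`, which may be the more convenient route.

```python
import os


def bucket_sizes(bucket_dir):
    """Return (tsidx_bytes, rawdata_bytes, total_bytes) for one bucket directory."""
    tsidx = raw = total = 0
    for root, _dirs, files in os.walk(bucket_dir):
        # First path component relative to the bucket, e.g. "rawdata" or ".".
        top = os.path.relpath(root, bucket_dir).split(os.sep)[0]
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total += size
            if name.endswith(".tsidx"):
                tsidx += size
            elif top == "rawdata":
                # Assumption: everything under rawdata/ counts as journal data.
                raw += size
    return tsidx, raw, total
```

Running it against buckets that hold the same data but were written with different tsidxWritingLevel values gives a before/after comparison of the tsidx share.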
@schose This is a really nice blog post on the topic, well done! Any plans to update it with a test on Splunk 9.x? From a data perspective, comparing bucket complexity (the number of sources/sourcetypes) might be interesting too. There was a bug a while back, in a corner case where high-cardinality data undermined the compression optimizations in level 3; it has long been addressed, especially since level 3 is the default in 9.x. Thanks, I will be sharing your blog to explain tsidxWritingLevel to folks.
Hi @rkantamaneni ,
Well, I can rerun the tests with a 9.x version, but I wouldn't expect different results with the same level setting.
I would also be interested in the behaviour for different sourcetypes, as we see huge differences there. Do you have an idea for good high-cardinality log files?
Best regards,
Andreas
@schose, apologies, I'm just seeing your reply.
To test for high cardinality, I'm thinking the following would work:
I'd imagine the above can be done with access to a large syslog server or some programmatic manipulation.
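One possible form of the programmatic manipulation mentioned above, assuming that "high cardinality" here means many distinct sources per bucket: generate syslog-like lines spread across a configurable number of source file names. Everything in this sketch is hypothetical, including the function name `synthetic_events`, the path pattern, and the message format; real logs would be a better test where available.

```python
import random
from datetime import datetime, timedelta


def synthetic_events(n_events, n_sources, seed=0):
    """Yield (source_path, line) pairs spread over at most n_sources sources."""
    rng = random.Random(seed)  # seeded so repeated runs produce identical data
    start = datetime(2024, 1, 1)
    for i in range(n_events):
        host = f"host{rng.randrange(n_sources):05d}"
        # Hypothetical path pattern: one log file (source) per host.
        source = f"/var/log/synthetic/{host}.log"
        ts = (start + timedelta(seconds=i)).strftime("%b %d %H:%M:%S")
        pid = rng.randrange(1, 65536)
        yield source, f"{ts} {host} app[{pid}]: synthetic message {i}"
```

Writing each line to the file named by its source and monitoring that directory would give one index with few sources and another with many, so the same event volume can be compared at different cardinalities.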
Nice work!
As you already have a test setup in place, can you do the same with
journalCompression = gzip|lz4|zstd
* The compression algorithm that splunkd should use for the rawdata journal file of new index buckets.
* This setting does not have any effect on already created buckets. There is no problem searching buckets that are compressed with different algorithms.
* "zstd" is only supported in Splunk Enterprise version 7.2.x and higher. Do not enable that compression format if you have an indexer cluster where some indexers run an earlier version of Splunk Enterprise.
* Default: gzip
r. Ismo
Hi @isoutamo ,
That's exactly what I planned for my next article, but I can share the first numbers.
After ingesting 800 MB of raw RouterOS logs into event indexes, the rawdata size is:
gzip: 74.2 MB
lz4: 136.0 MB
zstd: 56.4 MB
I'll need some more time for a performance review, but will post updates.
regards,
Andreas
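To put the journal sizes above in relation to the 800 MB of raw input, here is a small sketch of the arithmetic. The figures are the ones quoted in the post; the function name `compression_summary` is just an illustrative choice.

```python
RAW_MB = 800.0
JOURNAL_MB = {"gzip": 74.2, "lz4": 136.0, "zstd": 56.4}


def compression_summary(raw_mb, sizes_mb):
    """Map each algorithm to its share of raw size and its compression ratio."""
    return {
        algo: {
            "percent_of_raw": 100.0 * size / raw_mb,
            "ratio": raw_mb / size,
        }
        for algo, size in sizes_mb.items()
    }


for algo, stats in compression_summary(RAW_MB, JOURNAL_MB).items():
    print(f"{algo}: {stats['percent_of_raw']:.1f}% of raw, "
          f"{stats['ratio']:.1f}:1")
```

On this data set the journal ends up at roughly 9% of the raw size with gzip, 17% with lz4, and 7% with zstd. This covers storage only; it says nothing about indexing or search speed.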
I know of no documentation that describes the differences between the levels beyond the spec file ( https://docs.splunk.com/Documentation/Splunk/latest/Admin/Indexesconf )
tsidxWritingLevel = [1|2|3|4]
* Enables various performance and space-saving improvements for tsidx files.
It is set to 1 by default in case you have older Splunk versions in the cluster; I use the highest level available (4).