
Which file type consumes the most data?


I'm curious: which file type within an index bucket is the largest? I'm getting conflicting answers. Some say the .tsidx files, and others point to the bloom filter. Which is it? Thanks for your help.


Splunk Employee

It really depends on many factors. An individual tsidx file may be smaller than the bloom filter file, but as you end up with more buckets, the number of tsidx files increases, and together they may end up consuming more space than the bloom filter. It also depends on the number of unique words that the bloom filter needs to calculate and store, and on the number of fields that are indexed and stored in the tsidx files.

On my test system, my _internal index's bloom filter is 5906606 bytes in size, and I have 15 tsidx files that range from 34755 bytes to 2095069 bytes.

So many, many factors!
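
If you want to check this on your own system rather than rely on anyone's numbers, a rough sketch like the one below can total the bytes per file type under an index directory. It is not a Splunk tool: the function name `summarize_bucket_files` is made up for illustration, and it assumes that tsidx files end in `.tsidx` and that each bucket's bloom filter is a file literally named `bloomfilter`. The directory you pass in is also an assumption about your install, so point it at wherever your index actually lives.

```python
import os
import sys
from collections import defaultdict


def summarize_bucket_files(index_dir):
    """Walk index_dir and total file counts and bytes per file kind.

    Assumptions (not taken from the thread): tsidx files end in
    ".tsidx" and the bloom filter file is named "bloomfilter".
    Everything else is lumped together as "other".
    """
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            if name.endswith(".tsidx"):
                kind = "tsidx"
            elif name == "bloomfilter":
                kind = "bloomfilter"
            else:
                kind = "other"
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file may have been rolled or removed mid-walk.
                continue
            totals[kind]["files"] += 1
            totals[kind]["bytes"] += size
    return {kind: dict(stats) for kind, stats in totals.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the index directory as the first argument.
    for kind, stats in sorted(summarize_bucket_files(sys.argv[1]).items()):
        print(f"{kind:12s} {stats['files']:6d} files {stats['bytes']:14d} bytes")
```

Run it against one index at a time and compare the "tsidx" and "bloomfilter" rows. The answer will differ between indexes for the reasons given above.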
