Difference in license usage and diskspace usage

Path Finder

Hi,

I have an index with the following configuration:

[index1]
coldPath = $SPLUNK_DB/index1/colddb
homePath = $SPLUNK_DB/index1/db
thawedPath = $SPLUNK_DB/index1/thaweddb
maxDataSize = auto_high_volume
frozenTimePeriodInSecs = 31536000
maxTotalDataSizeMB = 5000000
repFactor = auto

In the license master, I can see that the cumulative raw data size for index1 is 619 GB. However, on the indexers, $SPLUNK_DB/index1/colddb and $SPLUNK_DB/index1/db take up 1.8 TB and 2.1 TB respectively.

Is this right?

Is there any way I can reduce the disk usage while keeping the data retention period unchanged?

Thanks.

Regards,
Jackie

Re: Difference in license usage and diskspace usage

Communicator

How long are your license master logs being retained? Your cumulative size reported from the license master is limited by your log retention on the _internal index, but it looks like the index you are asking about holds a full year's worth of data.

Re: Difference in license usage and diskspace usage

Path Finder

Ah! That explains the difference in the numbers. I didn't change the license master setup, so it should be the default 30 days. Thanks a lot!

Re: Difference in license usage and diskspace usage

Communicator

OK, so it sounds like the first part, the discrepancy, comes down to two different retention periods: the license master's usage logs live in the _internal index and only go back as far as that index's retention, while index1 keeps a full year of data. The disk utilization on the indexers therefore represents a much longer time period than what's reflected in _internal.
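
If you want to confirm how much history the license figure actually covers, a search along these lines may help. It's only a sketch: it assumes the license master's _internal data is searchable from where you run it, that the license_usage.log events carry the usual type, idx and b (bytes) fields, and that you run it over All time.

```spl
index=_internal source=*license_usage.log* type=Usage idx=index1
| stats min(_time) as first_event max(_time) as last_event sum(b) as bytes
| eval days_covered = round((last_event - first_event) / 86400, 1)
| eval total_GB = round(bytes / 1024 / 1024 / 1024, 1)
| table days_covered total_GB
```

If days_covered comes back at roughly 30, the 619 GB is about a month of ingest rather than the full year that index1 retains.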

To address the second part of the question, reducing disk usage without changing the retention period, here are some options:

  • Check your cluster master to see if there are excess bucket copies that you can remove to free up some space.
  • If the data is infrequently accessed past a certain age or if slower searches beyond a certain age aren't a concern, look into tsidx reduction.
  • Make sure your replication and search factors aren't too high as this will require additional space for the extra copies.
  • Use something like |dbinspect index=index1 | chart count by guId to check that you didn't happen to measure an indexer with much higher than average space usage for that index because of a bucket imbalance. If the bucket counts aren't reasonably close to even, first make sure incoming data isn't unevenly distributed, then consider a cluster rebalance for just that index (or include others if you see the same problem on other indexes too).
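
The bucket-count check in the last bullet can be extended to look at disk usage directly. This is a rough sketch, assuming the dbinspect output in your version includes the sizeOnDiskMB, rawSize (in bytes) and splunk_server fields:

```spl
| dbinspect index=index1
| stats count as buckets sum(sizeOnDiskMB) as disk_MB sum(rawSize) as raw_bytes by splunk_server
| eval disk_GB = round(disk_MB / 1024, 1)
| eval raw_GB = round(raw_bytes / 1024 / 1024 / 1024, 1)
| eval disk_to_raw = round(disk_GB / raw_GB, 2)
| table splunk_server buckets disk_GB raw_GB disk_to_raw
```

Comparing disk_GB across indexers shows whether one peer is carrying noticeably more than the others. As far as I know, replicated bucket copies are counted on whichever peer holds them, so the totals reflect replication as well as the originally indexed data.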
