We have indexers running in a clustered environment, with a 35-day retention policy for all application logs. We have started losing data: we now see only about 10 days of older data, and data keeps going missing. Could you please suggest how to investigate this issue?
Do you have enough disk space to accommodate 35 days worth of data? Do you have volume settings that allow you to consume that disk space for Splunk's use? Check out your Monitoring Console to be sure.
index=_internal component=BucketMover idx=YourIndexName
Look for when the data is being rolled with the search above. See if there are any errors such as "storage full" or "out of disk", or "permission denied", etc.
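If the BucketMover search comes back clean, it is also worth checking how much disk each index actually occupies and how old its earliest data is. A sketch using the `dbinspect` command (run from a search head; `startEpoch` is the oldest event time in each bucket):

```
| dbinspect index=*
| stats sum(sizeOnDiskMB) AS totalMB, min(startEpoch) AS oldestEvent BY index, state
| eval oldestEvent = strftime(oldestEvent, "%Y-%m-%d")
```

If `oldestEvent` for an index is only ~10 days back despite a 35-day retention setting, a size-based limit (index or volume) is the likely culprit.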
I am not seeing any of the mentioned errors. We suspect the data loss is due to restricted volume sizes on the indexer databases; the configuration is below. How can we identify whether the oldest data is being rolled off early because of size limits, rather than by the retention policy? Can you please help?
Figure out which pipeline is full using the Monitoring Console.
Look at the indexing performance searches; they should show you the pipelines.
Could be bad parsing, or it might be time to add indexers. How many indexers do you have now, and what IOPS does your storage deliver? To handle 2.4 TB/day on reference hardware, you'd need about 10 indexers just for the input load.
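One way to spot a full pipeline outside the Monitoring Console is to look for blocked queues in metrics.log. A sketch (the queue names, such as parsingqueue and indexqueue, are emitted by Splunk itself):

```
index=_internal sourcetype=splunkd source=*metrics.log* group=queue
| eval is_blocked = if(blocked == "true", 1, 0)
| timechart span=1h sum(is_blocked) AS blocked_count BY name
```

A queue that is persistently blocked points at the pipeline stage that is saturated.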
index=_internal sourcetype=splunkd source=*splunkd.log "BucketMover - will attempt to freeze" NOT "because frozenTimePeriodInSecs="
| rex field=bkt "(rb_|db_)(?P<newestDataInBucket>\d+)_(?P<oldestDataInBucket>\d+)"
| eval newestDataInBucket=strftime(newestDataInBucket, "%+"), oldestDataInBucket=strftime(oldestDataInBucket, "%+")
| table message, oldestDataInBucket, newestDataInBucket
I'll start by checking the size of your indexes (and even your indexers' disks). Splunk applies the retention policy or the size policy, whichever limit is hit first. So if, say, you have 100 GB of disk available on the indexers and you are indexing 10 GB per day, you will only get about 10 days of retention (simplified here, not accounting for compression).
So even if you set your index time retention to 300 days, the disk simply cannot hold enough data to honor it.
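In indexes.conf terms, both limits sit side by side on each index, and whichever is reached first triggers freezing. A minimal sketch (the stanza name, volume names, and sizes here are hypothetical):

```ini
[my_app_index]
homePath   = volume:hotwarm/my_app_index/db
coldPath   = volume:cold/my_app_index/colddb
thawedPath = $SPLUNK_DB/my_app_index/thaweddb
# time-based retention: 35 days = 35 * 86400 seconds
frozenTimePeriodInSecs = 3024000
# size-based retention: buckets are frozen once the index
# (or its volume) hits this cap, even if they are newer than 35 days
maxTotalDataSizeMB = 512000
```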
Thank you for the reply. Each index is assigned 500 GB, and we have 43 indexers in total. The retention policy is 35 days. That said, we also have a 630 GB limit on the total size of data model acceleration (DMA). Below are the volume settings in indexes.conf.
path = /SplunkIndexes/HotWarmIndex
maxVolumeDataSizeMB = 130000
path = /SplunkIndexes/ColdIndex
maxVolumeDataSizeMB = 500000
Below is the total disk space consumed on the indexers (size / used / available / use% / mount point):
4.8T 128G 4.5T 3% /SplunkIndexes/HotWarmIndex
14T 489G 13T 4% /SplunkIndexes/ColdIndex
Can you please advise whether the volume sizes we assigned are too small and are therefore causing the data loss, and how we can avoid it? We need 35 days of data per the requirement. Would increasing the volume sizes fix the issue? Please help us with this case.
You really are defining tiny volumes across your 43 indexers:
~130 GB for the hot/warm volume
~500 GB for the cold volume
That explains why you barely see any data in your indexes.
Read the indexes.conf documentation all the way through and modify your indexes.conf accordingly.
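As a sketch, given the ~4.8 TB hot/warm and ~14 TB cold filesystems shown above, the volume caps could be raised to use most of the available disk while leaving headroom (the stanza names and exact numbers here are hypothetical; size them for your own environment):

```ini
[volume:hotwarm]
path = /SplunkIndexes/HotWarmIndex
# roughly 80% of the 4.8 TB filesystem (hypothetical value)
maxVolumeDataSizeMB = 3800000

[volume:cold]
path = /SplunkIndexes/ColdIndex
# roughly 80% of the 14 TB filesystem (hypothetical value)
maxVolumeDataSizeMB = 11000000
```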
Also try running this search to see the reason Splunk gives for rolling buckets, and to verify your configuration going forward:
index=_internal sourcetype=splunkd component=BucketMover
Hope it helps.
Thanks once again. When I ran the query below, I got the following output.
index=_internal sourcetype=splunkd component=BucketMover | timechart span=1d count by component
Can you please tell me what it indicates?
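Since `component` is pinned by the search itself, that timechart produces a single series: simply the number of BucketMover events per day. A variant split by the `idx` field (used earlier in this thread) is more informative, because it shows which indexes are freezing buckets each day:

```
index=_internal sourcetype=splunkd component=BucketMover "will attempt to freeze"
| timechart span=1d count BY idx
```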