Hello,
I have recently inherited a Splunk Enterprise (v6.6) instance with some serious issues. The architecture is distributed, with the search head, indexer and heavy forwarder each residing on a separate host. The primary problem is that after a short period of time the queues (parsing, aggregation, typing and index) reach 100% and result in the error mentioned in the title.
Upon investigating the index directory where the errors are reported, I found 100+ .lock files that seem to replicate as file.lock, file.lock.lock, file.lock.lock.lock and so on.
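For anyone who wants to check for the same pattern, below is a minimal sketch of how the chained lock files could be listed and counted. It only reads the directory tree and deletes nothing. The function names (`lock_depth`, `find_lock_files`, `summarize`) are hypothetical helpers written for illustration, and the index path is an assumption that has to be supplied by the user; it is not taken from my environment.

```python
import os
import sys
from collections import Counter

LOCK_SUFFIX = ".lock"


def lock_depth(filename):
    """Count how many times '.lock' is repeated at the end of a file name."""
    depth = 0
    while filename.endswith(LOCK_SUFFIX):
        filename = filename[:-len(LOCK_SUFFIX)]
        depth += 1
    return depth


def find_lock_files(index_dir):
    """Walk index_dir and return (path, depth) for every file ending in .lock."""
    found = []
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            depth = lock_depth(name)
            if depth > 0:
                found.append((os.path.join(root, name), depth))
    return found


def summarize(found):
    """Map each nesting depth to the number of lock files at that depth."""
    return Counter(depth for _path, depth in found)


if __name__ == "__main__":
    # The directory to scan is passed on the command line (assumed path,
    # defaults to the current directory). Nothing is modified or removed.
    index_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    results = find_lock_files(index_dir)
    for depth, count in sorted(summarize(results).items()):
        print(f"depth {depth}: {count} file(s)")
```

Run against the affected index directory, this prints how many lock files exist at each nesting depth, which makes it easier to see whether the chain is still growing between runs.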
The machines running Splunk have more than enough RAM, CPU and IOPS. I have manually run splunk-optimize with no effect. I am at a loss as to what to do next and am close to deleting the index (not preferred) to resolve this issue.
Any help would be much appreciated.