How to index part of a log file that was not indexed after disk failure?

envato_dennis
New Member

We had a disk failure on our indexer. During the outage, Splunk behaved as though it was still indexing data. We had to stop Splunk, remount the disk, and start it again. However, for the period that the disk (which holds one of our indexes) was offline, we now have a gap where we don't have any events.

The logs are still available on the application servers and they run universal forwarders.

I want to re-index just the missing 3-hour time period. If I push the whole log via oneshot (containing events before and after the disk outage), I will get duplicate events, just as I would if I deleted the _fishbucket on the forwarders. This is production data.

What are my options in this instance?

Thanks

aljohnson_splun
Splunk Employee

Something more selective than deleting the entire _fishbucket is the btprobe command, which resets the checkpoint for a single source file:

splunk cmd btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file <source> --reset

You can read more about btprobe here.

Please see @YannK 's answer here as well.

How many files are involved in that 3-hour window? Is it all within a single file? Hypothetically, you could parse out just the portion you want to reindex and index only that section. Less than desirable, I'm sure, though 😛
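
For what it's worth, the "parse out the portion" idea could look roughly like the sketch below. It is only an illustration under assumptions: it assumes every event starts with a timestamp like `2015-06-01 03:15:42` (adjust `TS_PATTERN` and `TS_FORMAT` to your real log format), and the names `extract_window` and `extract_file` are made up for this example. Lines without a leading timestamp are treated as continuations of the previous event.

```python
import re
from datetime import datetime

# ASSUMPTION: each event begins with "YYYY-MM-DD HH:MM:SS".
# Change both values to match the actual log format.
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract_window(lines, start, end):
    """Yield lines belonging to events with start <= timestamp < end.

    A line with no leading timestamp is kept or dropped together with
    the most recent timestamped line (multi-line events, stack traces).
    """
    keep = False
    for line in lines:
        match = TS_PATTERN.match(line)
        if match:
            ts = datetime.strptime(match.group(1), TS_FORMAT)
            keep = start <= ts < end
        if keep:
            yield line


def extract_file(src_path, dst_path, start, end):
    """Copy only the events inside the window from src_path to dst_path."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(extract_window(src, start, end))


if __name__ == "__main__":
    # Hypothetical outage window and file names, for illustration only.
    extract_file(
        "app.log",
        "app.gap.log",
        datetime(2015, 6, 1, 3, 0, 0),
        datetime(2015, 6, 1, 6, 0, 0),
    )
```

The extracted file would then contain only the missing window, so it could be sent with a oneshot (for example `splunk add oneshot <file> -index <index> -sourcetype <sourcetype>`) without touching the fishbucket. Check the window boundaries against the last and first events that actually made it into the index, otherwise you can still end up with a few duplicates at the edges.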

envato_dennis
New Member

Thanks for the reply.

Yes, so the problem is that every host has at least 16 logs that need to be done and we have around 30-40 hosts that we are really interested in.

I will investigate btprobe and report back.
