Getting Data In

Re-index a file and prevent duplicates

JeremyHagan
Communicator

Hi,

I have some CSV files which were indexed, but a proportion of the events were corrupted in the index. Each file has up to 1 million records. Is there a way to ask Splunk to re-index these files and only index the events it doesn't currently have? Each event has a unique record ID field.

ShaneNewman
Motivator

An easier way might be to delete the events you have in your index now, clean the fishbucket, and just let Splunk reindex them.

ShaneNewman
Motivator

Yes. This is what I do.

Run a search that returns the events you need to delete; I assume you don't want to delete the entire index. If you do, run the command below with the name of the index you want to wipe out, then clean _thefishbucket. Otherwise, run your search to find the bad events and pipe it to "| delete".
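
For example, assuming the corrupted events are in an index called csv_data and came from a source path like /data/records.csv (both placeholders for your own values), the delete search would look something like:

index=csv_data source="/data/records.csv" | delete

Note that "| delete" requires a role with the can_delete capability, and it only marks events as unsearchable; it does not reclaim disk space.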

cd to the Splunk\bin directory. Type splunk stop, then type splunk clean eventdata -index _thefishbucket.

Then type splunk start. The rest is automatic, assuming you have fixed the files.
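
Putting the CLI steps together (run from the Splunk bin directory; _thefishbucket is the internal index Splunk uses to track which files it has already read):

splunk stop
splunk clean eventdata -index _thefishbucket
splunk start

One caveat: cleaning the fishbucket resets the file-tracking state for every monitored input on the instance, so other monitored files may be re-indexed as well, not just the repaired CSVs.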

JeremyHagan
Communicator

Clean the fishbucket?

jtrucks
Splunk Employee

You might want to make a report of the record IDs you have in Splunk, then cull those records from your input file. Then use splunk add oneshot (or some other input method) to import the file.
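
As a rough sketch, assuming the unique field is called RecordID, the data sits in an index called csv_data, and the repaired file is at /data/records_fixed.csv (all placeholders): first export the IDs Splunk already has,

index=csv_data source="/data/records.csv" | table RecordID | outputcsv indexed_ids.csv

(outputcsv writes the file to $SPLUNK_HOME/var/run/splunk/csv on the search head). Then strip those IDs out of a copy of the repaired file with whatever scripting you prefer, and finally import the trimmed copy:

splunk add oneshot /data/records_fixed.csv -index csv_data -sourcetype my_csv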

--
Jesse Trucks
Minister of Magic

JeremyHagan
Communicator

I was kind of hoping for something a little less manual....
