Splunk Search

Re-index directory data after indexing into temp

cpt12tech
Contributor

I'm having problems getting splunk to re-index data. Here are the steps I've taken:

1. Created a file data input from a shared folder on another computer
2. Indexed it into a test index
3. Checked the data and made sure everything was correct
4. Disabled the data input
5. Deleted the data in the test index using | delete
6. In the CLI, stopped Splunk, then ran:
   splunk clean eventdata -index test
   splunk start
7. Changed the data input to send to the main index
8. Re-enabled the data input

I was expecting the data to be re-indexed, but this hasn't happened.

0 Karma

davecroto
Splunk Employee

Clean the fishbucket.

0 Karma

lguinn2
Legend

Yes, but cleaning the fishbucket will reset the status of all inputs, meaning that Splunk will reindex everything again, not just the one file or directory.
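For reference, here is a minimal sketch of what a full fishbucket reset looks like from the CLI. The install path /opt/splunk and the helper names are assumptions for illustration. Because the reset affects every monitored input, the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: reset the whole fishbucket so ALL monitored files are re-read.
# Assumption: Splunk lives under $SPLUNK_HOME (default /opt/splunk here).
SPLUNK="${SPLUNK_HOME:-/opt/splunk}/bin/splunk"

# Print each command by default; set DRY_RUN=0 to really run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reset_fishbucket() {
  run "$SPLUNK" stop
  # _thefishbucket is the internal index that tracks what has been read.
  run "$SPLUNK" clean eventdata -index _thefishbucket
  run "$SPLUNK" start
}

reset_fishbucket
```

Expect every file and directory input on that instance to be indexed again afterwards, with the license usage that implies.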

0 Karma

lguinn2
Legend

You might find some help in this answer (even though you wouldn't guess from the name). It shows how to eliminate a single file entry from the fishbucket. Since the fishbucket is where data files are "remembered," this should cause Splunk to forget that it once indexed this file.
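One way to reset individual file records that is often suggested is the btprobe utility run through "splunk cmd". The sketch below assumes btprobe ships with your version and that the fishbucket database sits at the default location. The directory /mnt/shared/logs and the function name are hypothetical. It only prints one reset command per file so you can review them before running anything.

```shell
#!/bin/sh
# Sketch only: build "btprobe --reset" commands for every file in a directory.
# Assumptions: btprobe is available via "splunk cmd", and the fishbucket
# database is at the default location under $SPLUNK_HOME.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
FISHBUCKET_DB="$SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db"

# Prints one reset command per regular file in directory $1.
print_reset_commands() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    echo "$SPLUNK_HOME/bin/splunk cmd btprobe -d $FISHBUCKET_DB --file $f --reset"
  done
}

# Hypothetical monitored directory; replace with the real input path.
print_reset_commands "${1:-/mnt/shared/logs}"
```

The usual advice is to stop Splunk before running the printed commands and start it again afterwards. Check the documentation for your version before relying on this.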

0 Karma

lguinn2
Legend

I updated that post - because I was wrong! Gack!! So you might want to look again...

0 Karma

lukejadamec
Super Champion

According to that post it is no longer possible to delete single files from the fishbucket.

"Splunk no longer lets you look at the fishbucket index. You cannot manage the specific records. The format is not published and the files are kept in binary. Sorry"

cpt12tech
Contributor

I'd love to just delete the entry from the fishbucket. However, how do I find out the file name to delete? There is other valid data in the fishbucket that I don't want to get rid of. Also, this data source is a directory with 1 file per entry. I want to re-index the directory.

0 Karma

lukejadamec
Super Champion

The input you created remembers which data files have already been indexed, regardless of whether the index still exists or still holds the data. You need to create a new input exactly like the one you have, but with a slightly different name and pointing to the right index; then, poof, your data will be re-indexed.

0 Karma

linu1988
Champion

Specify a new sourcetype name and give it a try.

0 Karma

lukejadamec
Super Champion

Now that you mention it, yes. For a directory or file monitor the inputName is the path.
I was thinking last night that you should use crcSalt to reindex what is there, and then remove crcSalt (because it can cause trouble down the road).

To use crcSalt, add this line to your input stanza:
crcSalt = <SOURCE>

Here is what the documentation says about using crcSalt = <SOURCE>:

  • If set to the literal string <SOURCE> (including the angle brackets), the full directory path to the source file is added to the CRC.

Don't forget to delete the line and the original files after the original files are reindexed.
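As a rough illustration, a monitor stanza with the salt added might look like the one below. The monitored path /mnt/shared/logs, the scratch file name, and the function name are hypothetical. The script writes the example to a scratch file for review, not into a live inputs.conf. Note that the value is the literal text <SOURCE>, angle brackets included.

```shell
#!/bin/sh
# Sketch only: write an example monitor stanza to a scratch file for review.
# The monitored path and the scratch file name are hypothetical.
write_example_stanza() {
  cat > "$1" <<'EOF'
[monitor:///mnt/shared/logs]
index = main
# Adds the full source path to the CRC so already-seen files look new.
# Remove this line once the files have been re-indexed.
crcSalt = <SOURCE>
EOF
}

write_example_stanza "${TMPDIR:-/tmp}/inputs.conf.example"
cat "${TMPDIR:-/tmp}/inputs.conf.example"
```

The stanza would go in whichever inputs.conf defines the input, and Splunk normally needs a restart to pick up the change.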

0 Karma

cpt12tech
Contributor

I'm not sure where the inputName is to be changed then. I'm using the Data Input - Files & Directories method to pull data off a network shared folder. Is the input name the path to the data? So I would need to change the log location?

0 Karma

lukejadamec
Super Champion

Just to be clear, it is the inputName that is important. You need to give it a new inputName. I don't think you need to change anything else.
I can't help with the fishbucket thing cause I've not done that yet, and you are correct - there is other information in there that you don't want to delete.

0 Karma

cpt12tech
Contributor

I had tried earlier to delete the input, then re-create it, however I used the same host name. After reading your post, I deleted the input, then re-created it with a different host name. The data isn't being re-indexed. Do I need to create a new sourceType?

0 Karma

cpt12tech
Contributor

Good to know and makes sense. I'll keep that in mind.

0 Karma

lguinn2
Legend

I am not sure what happened here, but I do know one thing. If you run

splunk clean eventdata -index test
splunk start

then you don't need to use | delete first.