I accidentally imported some files into Splunk and the default line-breaking didn't work correctly. Now I want to repeat the import using a fixed props.conf and transforms.conf.
But I know that Splunk won't re-index files that it's already seen (meaning unchanged CRCs). How can I get Splunk to re-import these files?
If it's a one-time re-import, I recommend you first delete the old source with the search command:
source=/full/path/to/file | delete
(remembering that the "delete" capability is not granted to any role, including admin, by default) and then use the oneshot input at the command line, or its equivalent in Manager > Data Input > Files and Directories, option "Index a file on the Splunk server". The CLI is:
./splunk add oneshot /full/path/to/file -sourcetype mysourcetype -index myindex -host myhostparam
The parameters other than the full path to the file (not a relative path) are optional. This is preferable to setting crcSalt to <SOURCE> for a one-time reload, since you don't need to then set it back if you don't want that enabled.
This is not an "answer" exactly, but more of a helpful technique that I've found that may be useful as a way of avoiding the situation where reindexing is necessary. I'm guessing that others that stumble across this question may find this helpful to them as well.
Splunk provides a command line tool to let you test your log file configuration (specifically the sourcetype association) without actually indexing anything.
splunk test sourcetype /path/to/my/file.log
This command will dump the props settings that have been associated based on source and sourcetype properties. This helps you know what settings Splunk would use if/when the specified file is indexed. I've found this useful for testing source pattern matching rules and for verifying that all the props settings are set up correctly.
It's also possible to make a config change and then re-run the tool to see if your change had the desired effect without restarting Splunk. I've saved many hours by using this tool instead of waiting for Splunk to restart, re-index, and then search, just to find out I had a silly configuration typo.
Hope this approach helps others to avoid re-indexing by becoming more proactive in confirming props.conf settings...
The 'test' and 'train' commands have been deprecated.
Type "help [object|topic]" to view help on a specific object or topic.
What do you do if you want to add many files at once? I need the source field to end up being /full/path/to/file/x.log, and I have about 3,000 files I need to handle this way. I cannot stop the indexer just to clean my test index. I'm currently very frustrated by this functionality. I'm just trying to index data again so that it has an updated source field.
Read this for a really good answer about different ways of reindexing the data: https://answers.splunk.com/answers/72562/how-to-reindex-data-from-a-forwarder.html
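For the many-files case, one option is simply to loop the oneshot command from the earlier answer over every file. This is only a rough sketch: the function name oneshot_dir, the SPLUNK variable, and the *.log pattern are my own assumptions for illustration, and the flags are the ones already shown above. I'd expect source to end up as the full path passed in, but try it on one file first and check.

```shell
#!/bin/sh
# Sketch: run "splunk add oneshot" once per *.log file under a directory.
# SPLUNK, oneshot_dir and the *.log pattern are illustrative assumptions.

SPLUNK="${SPLUNK:-./splunk}"   # path to the splunk binary

oneshot_dir() {
    dir="$1"; sourcetype="$2"; index="$3"

    # oneshot wants a full path, so refuse relative directories.
    case "$dir" in
        /*) ;;
        *) echo "need a full path, got: $dir" >&2; return 1 ;;
    esac

    # find prints full paths because $dir is absolute.
    find "$dir" -type f -name '*.log' | sort | while IFS= read -r file; do
        # </dev/null keeps the command from consuming the file list on stdin.
        "$SPLUNK" add oneshot "$file" -sourcetype "$sourcetype" -index "$index" </dev/null
    done
}
```

Usage would be something like: oneshot_dir /full/path/to/file mysourcetype myindex. Setting SPLUNK=echo first gives a dry run that just prints the commands.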
You should use a test index for new data sources anyway, because delete is dangerous and overuse of it will hurt search performance. You can still use oneshot with a test index though, and it makes the testing cycle easier. Using delete with oneshot is really for fixing mistakes.
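A rough sketch of what that test-index cycle could look like from the CLI. The index name props_test, the SPLUNK variable, and the function names are assumptions for illustration; the clean step needs splunkd stopped, which is why it is kept separate.

```shell
#!/bin/sh
# Sketch of a scratch-index workflow for trying out props.conf changes.
# SPLUNK, TEST_INDEX and the function names are illustrative assumptions.

SPLUNK="${SPLUNK:-./splunk}"
TEST_INDEX="${TEST_INDEX:-props_test}"

# Run once: create the scratch index.
create_test_index() {
    "$SPLUNK" add index "$TEST_INDEX"
}

# Run after each config change: $1 = full path to a sample file, $2 = sourcetype.
load_sample() {
    "$SPLUNK" add oneshot "$1" -sourcetype "$2" -index "$TEST_INDEX"
}

# Optional: wipe the scratch index. Splunk must be stopped for this to work.
wipe_test_index() {
    "$SPLUNK" clean eventdata -index "$TEST_INDEX"
}
```

Once the events in the test index break the way you want, run the same oneshot against the real index.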
Dude, that's a genius idea! The only other way I knew to handle this case was with splunk clean and a test index, but your approach sounds much easier. Thanks!