I accidentally imported some files into Splunk and the default line-breaking didn't work correctly. Now I want to repeat the import using a fixed props.conf and transforms.conf.
But I know that Splunk won't re-index files that it's already seen (meaning unchanged CRCs). How can I get Splunk to re-import these files?
If it's a one-time re-import, I recommend you first delete the old source with the search command:
source=/full/path/to/file | delete
(remembering that the "delete" capability is not granted to any role, including admin, by default) and then use the oneshot input at the command line, or its equivalent in Manager > Data Input > Files and Directories, option "Index a file on the Splunk server". The CLI is:
./splunk add oneshot /full/path/to/file -sourcetype mysourcetype -index myindex -host myhostparam
The parameters other than the full path to the file (not a relative path) are optional. This is preferable to setting crcSalt = <SOURCE> in inputs.conf for a one-time reload, since you don't then need to set it back if you don't want that enabled.
Dude, that's a genius idea! The only other way I knew to handle this case was with splunk clean and a test index, but your approach sounds much easier. Thanks!
You should use a test index for new data sources anyway, because delete is dangerous and overuse of it will hurt search performance. You can still use oneshot with a test index though, and it makes the testing cycle easier. Using oneshot is really for fixing mistakes.
What do you do if you want to add many files at once? I need the source field to end up being /full/path/to/file/x.log, and I have about 3,000 files I need to handle this way. I cannot stop the indexer just to clean my test index. I'm currently very frustrated by this functionality. I'm just trying to index the data again so that it has an updated source field.
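For that many files, one option is to loop over them and call oneshot once per file. The following is only a rough sketch, not something from the answer above: the function name oneshot_all, the SPLUNK_BIN default path, the DRY_RUN switch, and the *.log pattern are all assumptions for illustration. It only uses the oneshot parameters already shown in this thread, and it assumes the CLI is already authenticated (for example via splunk login), since stdin is redirected and no password prompt can be answered. Because oneshot is given the absolute path of each file, the source field should end up as that full path.

```shell
#!/bin/sh
# Sketch: run "splunk add oneshot" for every *.log file under a directory.
# SPLUNK_BIN is a placeholder path (assumption); adjust it for your install.
SPLUNK_BIN="${SPLUNK_BIN:-/opt/splunk/bin/splunk}"
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: oneshot_all <directory> <index> <sourcetype>
oneshot_all() {
    dir="$1"
    index="$2"
    sourcetype="$3"

    # oneshot wants a full path, so resolve the directory to an absolute one.
    absdir=$(cd "$dir" && pwd) || return 1

    find "$absdir" -type f -name '*.log' | sort | while IFS= read -r file; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show what would be executed without touching the index.
            echo "$SPLUNK_BIN add oneshot $file -sourcetype $sourcetype -index $index"
        else
            # </dev/null keeps the CLI from consuming the file list on stdin.
            "$SPLUNK_BIN" add oneshot "$file" \
                -sourcetype "$sourcetype" -index "$index" </dev/null
        fi
    done
}
```

Running oneshot_all /full/path/to/file myindex mysourcetype with the default DRY_RUN=1 prints one command per file, so the list can be checked before anything is indexed. If the old events are still in the index, they would still need to be deleted first as described above, otherwise the data ends up duplicated.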
This is not an "answer" exactly, but more of a helpful technique that I've found that may be useful as a way of avoiding the situation where reindexing is necessary. I'm guessing that others that stumble across this question may find this helpful to them as well.
Splunk provides a command line tool to let you test your log file configuration (specifically the sourcetype association) without actually indexing anything.
splunk test sourcetype /path/to/my/file.log
This command will dump the props settings that have been associated based on source and sourcetype properties. This helps you know what settings Splunk would use if/when the specified file is indexed. I've found this useful for testing source pattern matching rules, and verifying that all the props settings are set up correctly.
It's also possible to make a config change and then re-run the tool to see if your change had the desired effect without restarting Splunk. I've saved many, many hours by using this tool instead of waiting for Splunk to restart, re-index, and then search, just to find out I had a silly configuration typo.
Hope this approach helps others to avoid re-indexing by becoming more proactive in confirming
The 'test' and 'train' commands have been deprecated.
Type "help [object|topic]" to view help on a specific object or topic.