HadoopConnect: How do I reset the HDFS input?

I have a folder in HDFS that log files are continuously being written to. To test the HadoopConnect app's import feature, I created a test index to store the data and then added the folder as an input via the HadoopConnect web interface. The data imported successfully, and new files were indexed correctly too.

Having decided that I would like to use the app, I deleted the test index and the input, then added the same input again. However, the HDFS file monitoring seems to keep some persistent state, because only new files are getting indexed. The old ones are not indexed again.

Is there a way to reset this persistent state? I tried deleting $SPLUNK_HOME/var/lib/splunk/persistentstorage/fschangemanager_state because it seemed like a good candidate, but to no avail. Please advise, thanks.

Re: HadoopConnect: How do I reset the HDFS input?

Found it: the state data is in $SPLUNK_HOME/var/lib/splunk/modinputs/hdfs. Deleting the file(s) in this directory seems to make Splunk index everything again.
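
For anyone who wants to script this, here is a rough Python sketch of the cleanup. It is not part of HadoopConnect: the helper name clear_modinput_state and the /opt/splunk fallback for SPLUNK_HOME are my own assumptions, and I only know from the above that removing the files in that directory seemed to trigger a full re-index. It defaults to a dry run that just lists what would be deleted.

```python
import os
from pathlib import Path


def clear_modinput_state(state_dir, dry_run=True):
    """Return the names of regular files directly inside state_dir.

    Hypothetical helper (not a Splunk or HadoopConnect API).
    With dry_run=False the files are also deleted; subdirectories
    are always left alone.
    """
    state_dir = Path(state_dir)
    removed = []
    for entry in sorted(state_dir.iterdir()):
        if entry.is_file():
            removed.append(entry.name)
            if not dry_run:
                entry.unlink()
    return removed


if __name__ == "__main__":
    # Assumption: fall back to /opt/splunk if SPLUNK_HOME is not set.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    hdfs_state = Path(splunk_home) / "var" / "lib" / "splunk" / "modinputs" / "hdfs"
    if hdfs_state.is_dir():
        for name in clear_modinput_state(hdfs_state, dry_run=True):
            print("would delete:", name)
    else:
        print("state directory not found:", hdfs_state)
```

Once the dry-run listing looks right, call it with dry_run=False. Note that this clears the state for every HDFS input that keeps files in that directory, not just one of them.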