Getting Data In

Splunk indexing data only after index was created.

pranaynanda
Path Finder

Last week, when I finally figured out indexing and sourcetypes in Splunk, I mapped them to my data input. The input monitors the same standard directory on multiple Windows machines, each running its own forwarder. However, when I search now, only events created after the index was created show up in my new index, even though the directories contain all the data. Later, I figured out that if I search using source="C:\Marvel\Avengers\CaptainAmerica*.log", all the data comes up, including the older data, and that the older data is in the 'main' index rather than the one I created. How can I change that and bring all the data into the same index? It has really been a pain to configure field extractions every time, even after setting up a sourcetype.

mdsnmss
SplunkTrust

Since the data was originally sent to an index that didn't exist, it went to the default main index. By the time the index was created, the forwarder had already sent that data, so it kept its read pointer where it was and forwarded only the rest of the data to the new index. The best way to get all of the data into the correct index is to clean and re-index the data.

  • To start, stop the universal forwarder that is sending the data.
  • Clean the indexes that contain the data (main and your new index). This could be an issue if you have other data in these indexes as well. (On the indexer, with Splunk stopped: splunk clean eventdata -index followed by the index name.) More information on cleaning indexes can be found at http://docs.splunk.com/Documentation/Splunk/6.5.3/Indexer/RemovedatafromSplunk.
  • Clear the fishbucket on the forwarder so that it re-reads the entire log and sends it to the correct index (with Splunk stopped on the forwarder: delete $SPLUNK_HOME/var/lib/splunk/fishbucket). This resets the forwarder's read position for all of its inputs, so if there are multiple inputs, every one of them will be re-sent and you will need to account for that.
  • Start the forwarder again. It should see the log as a new log and re-index the entire file into the correct index.
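
The steps above can be laid out as an ordered checklist. The following is only a sketch in Python: it executes nothing and just returns the actions in order, so they can be reviewed before anyone runs them by hand. The function name reindex_plan, the index name "avengers", and the forwarder install path are assumptions for illustration, and cleaning main is only appropriate if nothing else of value lives there.

```python
from pathlib import PureWindowsPath


def reindex_plan(forwarder_home, indexes):
    """Return (where, action) pairs in the order the answer describes.

    Nothing is executed here; this is a checklist, not an automation tool.
    """
    # Fishbucket location relative to the forwarder's SPLUNK_HOME.
    fishbucket = PureWindowsPath(forwarder_home) / "var" / "lib" / "splunk" / "fishbucket"

    plan = [("forwarder", "splunk stop")]
    # "splunk clean" needs Splunk stopped on the indexer as well.
    plan.append(("indexer", "splunk stop"))
    for name in indexes:
        # Wipes ALL events in the named index, not just one source.
        plan.append(("indexer", f"splunk clean eventdata -index {name}"))
    plan.append(("indexer", "splunk start"))
    # Removing the fishbucket resets the read position for every input.
    plan.append(("forwarder", f"delete directory {fishbucket}"))
    plan.append(("forwarder", "splunk start"))
    return plan


if __name__ == "__main__":
    # Hypothetical install path and index name.
    steps = reindex_plan(r"C:\Program Files\SplunkUniversalForwarder", ["main", "avengers"])
    for where, action in steps:
        print(f"[{where}] {action}")
```

Keeping the plan as data rather than running the commands directly makes it easy to drop the main index from the list if it holds other data.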

pranaynanda
Path Finder

Hi! Thank you for a very descriptive answer. I wanted to ask whether this is still possible if the main index already holds a lot of valuable data that was indexed there on purpose. Will the cleaning process wipe that data too?

mdsnmss
SplunkTrust

Instead of clean you could use the "delete" command: https://docs.splunk.com/Documentation/Splunk/6.5.3/SearchReference/Delete

Cleaning will wipe the entire index, so anything else in main would be gone too. Delete can selectively remove events from search results, but it will not reclaim disk space. You will still need to clear the fishbucket on the forwarder and start it again to get the desired events into the originally intended index.
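
As a rough illustration of the selective approach, here is a small Python sketch that builds the search strings: one to preview the events and the same search piped to delete. The helper name delete_search is made up for this example, and the index and source values are taken from the question. Note that running delete normally requires a role with the can_delete capability, so treat that as something to confirm in your own environment.

```python
def delete_search(index, source):
    """Build a preview search and the matching delete search.

    Returns (preview, destructive). Run the preview first and check that
    it matches only the events you intend to hide.
    """
    if '"' in source:
        # Keep the quoting simple: refuse values that would break the quotes.
        raise ValueError("source must not contain double quotes")
    preview = f'index={index} source="{source}"'
    destructive = preview + " | delete"
    return preview, destructive


if __name__ == "__main__":
    preview, destructive = delete_search("main", r"C:\Marvel\Avengers\CaptainAmerica*.log")
    print(preview)
    print(destructive)
```

Scoping the search to both the index and the source is what keeps the other data in main untouched.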

pranaynanda
Path Finder

I would definitely want to go that way but I found something and I wanted to check if this would be a viable solution:

https://answers.splunk.com/answers/72562/how-to-reindex-data-from-a-forwarder.html#comment-260156

It suggests deleting the directories in C:\Program Files\SplunkUniversalForwarder\var\lib\splunk to achieve the same result.

How advisable would that be? Do you recommend it?

Also, can I work with the oneshot command?

mdsnmss
SplunkTrust

The two options I provided are covered in that thread.

Deleting the directory you have specified would not remove the information from search, since it only removes data on the forwarder. It looks to be the same as deleting the fishbucket, which makes the forwarder reindex everything. That is something you will want to do after deleting the data already indexed for that source.

The oneshot command tells the system to index your file once. If you want the file to be watched and send any new events appended to it you will want to set up a monitor instead.
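
For comparison, a one-time load of a single file could be assembled like this. This is a sketch only: the helper name oneshot_args and the file name CaptainAmerica1.log are hypothetical, and the function just builds the argument list for "splunk add oneshot" without running it.

```python
def oneshot_args(splunk_exe, log_file, index, sourcetype):
    """Build the argument list for a one-time index of a single file.

    The list can be passed to subprocess.run() on the machine that has
    the file; nothing is executed here.
    """
    return [
        splunk_exe, "add", "oneshot", log_file,
        "-index", index,
        "-sourcetype", sourcetype,
    ]


if __name__ == "__main__":
    # Hypothetical file name; index and sourcetype as used elsewhere in the thread.
    args = oneshot_args(
        "splunk",
        r"C:\Marvel\Avengers\CaptainAmerica1.log",
        "my_custom_index",
        "captain_america",
    )
    print(" ".join(args))
```

Because oneshot reads the file once and then forgets it, any events appended later would still need a monitor input.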

To do what you are trying to do, I would recommend selectively deleting the events you want to hide with the delete command, and then clearing the fishbucket on the forwarder that has the file so that it gets reindexed.

alemarzu
Motivator

Hi there,

You have to add the attribute "index" to your input stanza, like this.

[monitor://C:\Marvel\Avengers\CaptainAmerica*.log]
index = my_custom_index
sourcetype = captain_america
disabled = 0

Hope it helps.
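
Since a monitor stanza without an index setting sends its events to the default index, it can be worth checking an inputs.conf for stanzas that lack one. The sketch below uses Python's configparser on a sample that reuses the stanza above plus a second, hypothetical IronMan stanza. It is a simplification: it ignores anything inherited from a [default] stanza and may not parse every real inputs.conf.

```python
import configparser

# Sample text: the first stanza is the one from this post, the second
# (IronMan) is a made-up stanza that lacks an index setting.
SAMPLE = r"""
[monitor://C:\Marvel\Avengers\CaptainAmerica*.log]
index = my_custom_index
sourcetype = captain_america
disabled = 0

[monitor://C:\Marvel\Avengers\IronMan*.log]
sourcetype = iron_man
"""


def monitors_without_index(conf_text):
    """Return monitor stanza names that do not set "index" themselves."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return [
        section
        for section in parser.sections()
        if section.startswith("monitor://") and not parser.has_option(section, "index")
    ]


if __name__ == "__main__":
    for stanza in monitors_without_index(SAMPLE):
        print(f"no index set: [{stanza}]")
```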
