Does installing a universal forwarder cause re-indexing?

nivedita_viswan
Path Finder

At present, we have a stand-alone Splunk server, monitoring a mapped directory of log files. In order to reduce the load, we are adding a search head, and also want to install a universal forwarder that will forward the files in the mapped directory to the indexer. Most of the log files in the mapped directory are already indexed, while about 10-15% of the files are yet to be indexed.

I believe installing a forwarder and forwarding these files to Splunk should not cause re-indexing, since Splunk keeps track of the files that have already been indexed. However, when I tried this scenario in a test environment, with a small subset of the data, I noticed all the files in the directory were re-indexed. Is this to be expected? Or is there something wrong with my configuration?

bandit
Motivator

I'm not sure how to prevent this altogether. However, you could temporarily set the MAX_DAYS_AGO value for the sourcetype on your indexer so that it doesn't reindex events more than one day old. You could then remove any remaining duplicate events with the | delete command.

# in props.conf on indexer
[my_source_type]
MAX_DAYS_AGO = 1

From documentation:
MAX_DAYS_AGO =
* Specifies the maximum number of days past, from the current date, that an extracted date can be valid.
* For example, if MAX_DAYS_AGO = 10, Splunk ignores dates that are older than 10 days ago.
* Defaults to 2000 (days), maximum 10951.
* IMPORTANT: If your data is older than 2000 days, increase this setting.

teunlaan
Contributor

Is the path of the mapped directory the same on the server as on the UF?
You could try to copy the "_fishbucket" directory from the server to the UF (and restart)

The _fishbucket index keeps track of what is indexed and what not.

Haven't tested it, but in theory it should work
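
To illustrate why a freshly installed forwarder would read everything again, here is a minimal sketch of the idea behind the fishbucket. It is a simplified model, not Splunk's actual implementation: the ToyFishbucket class and its unread method are hypothetical names, and zlib.crc32 is only a stand-in for whatever checksum Splunk uses internally. The one detail taken from the Splunk documentation is that, by default, a file is identified by a checksum of its first 256 bytes (the initCrcLength setting in inputs.conf). Each Splunk instance keeps its own tracking state, so a new forwarder starts with no record of files the old server has already read.

```python
import zlib

INIT_CRC_LENGTH = 256  # documented default of initCrcLength in inputs.conf


class ToyFishbucket:
    """Simplified, hypothetical stand-in for one instance's file-tracking state."""

    def __init__(self):
        # Maps checksum of the file head -> number of bytes already read.
        self.seen = {}

    def unread(self, data: bytes) -> bytes:
        """Return the part of `data` this instance has not read yet."""
        key = zlib.crc32(data[:INIT_CRC_LENGTH])  # assumption: crc32 as the checksum
        offset = self.seen.get(key, 0)
        self.seen[key] = len(data)
        return data[offset:]


log = b"2012-01-01 event one\n2012-01-02 event two\n"

server = ToyFishbucket()
server.unread(log)                 # the stand-alone server reads the whole file
second_pass = server.unread(log)   # empty: the server remembers the file

forwarder = ToyFishbucket()        # fresh install with no history
first_pass = forwarder.unread(log) # the whole file again
```

In this model, second_pass is empty while first_pass contains the full file, which matches the behaviour seen in the test environment. Copying the tracking state from the server to the forwarder, as suggested above, would amount to starting the forwarder with the server's `seen` table instead of an empty one.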

nivedita_viswan
Path Finder

Yes, the path is the same. Thanks for the suggestion, let me give it a shot.

nivedita_viswan
Path Finder

For now, we decided against installing the forwarder. So I won't get a chance to try out these suggestions. I'll update here if we do try this in the future.

teunlaan
Contributor

it should be "_thefishbucket" btw

nivedita_viswan
Path Finder

We just installed Splunk and wanted to index a year's worth of data, so the modification times of these files vary from 1 day to 365 days. There is no way of knowing which files are indexed and which files are still in the process of being indexed, so I still want the yet-to-be-indexed files to be indexed.

bandit
Motivator

It would be interesting to know whether the _fishbucket method mentioned below works out for you. Otherwise, you also have the option of cleaning the index after installing the Universal Forwarder, which causes all events for that index to be reindexed.

  • Make sure you have the source logs before doing the following, as it will permanently remove all events from the index you specify below.

splunk stop
splunk clean eventdata -index myindexnamehere
splunk start