We have a 125MB lookup table as a .csv file with 1.5M rows. This table is re-generated on the Search Head every 4 hours. The strategy is to deploy this csv (along with any other changed files in the deployed app) to the Indexer when it changes. The first set of experiments has resulted in slow deployment and lookup outages; the Indexer Summary Indexes cannot do a lookup when the deployment is in progress and prior to the on-disk index being built.
Several questions about deployment strategy and large lookup tables:
1) The primary issue is that there is a gap in availability of the lookup table during and slightly after it is deployed to the indexer. Splunk needs time to build the on-disk index structure. Deploying the file and preparing the index can take 5-10 mins. What steps could be taken to reduce this time?
2) The threshold for creating an on-disk index directory defaults to 10M lookup table size. Any harm in increasing that to 100M assuming there is enough memory? The assumption is the on-disk index will not be built, it will reside in memory. Will the lookup be available while the index is being prepared?
3) For large csv lookup files, what triggers the on-disk index structure to be built? Is there any control over that timing? We have observed that Splunk does not build the index until a lookup is attempted, and lookups fail until the index is ready.
4) Any specific steps to increase priority and improve the transfer speed on certain files for deployment?
There are a few things here.
First, a 125 MB lookup file should take maybe 30 seconds to index, not 5 to 10 minutes. Second, lookup tables are pushed out to the indexers from the search head automatically when they change. The entire app config is replicated this way, including the lookup table files and their indexes, so the indexing is only done once. Sending the files out via Deployment Server doesn't really do anything, because those copies are not used when you run queries; the copy replicated by the search head is used instead. If you're seeing lags, it's possible that it's actually the replication that is slow.
There are a couple of solutions here. The first is to disable replication and use NFS or other file sharing (probably not Deployment Server) to make the lookup available to the indexers. This can be done in versions below 4.2, but is a bit undocumented. In 4.2, there are settings for it in distsearch.conf. In 4.2, you can also try using asynchronous replication, which will prevent stalls.
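As a rough sketch of the shared-storage option, the distsearch.conf settings might look something like the following. The stanza and setting names are as I recall them from the mounted-bundle documentation, so check them against the distsearch.conf and limits.conf spec files for your version. The server name `my-search-head` and the path `/mnt/searchhead/etc` are placeholders, not values from your environment.

```
# On the search head: distsearch.conf
[distributedSearch]
# Stop pushing the knowledge bundle (including lookups) to the search peers
shareBundles = false

# On each indexer: distsearch.conf
# "my-search-head" is a placeholder for the search head's Splunk server name
[searchhead:my-search-head]
mounted_bundles = true
# Placeholder: where the search head's etc directory is mounted on this indexer
bundles_location = /mnt/searchhead/etc

# Alternative, on the search head: limits.conf
# Assumption: this is the setting that makes bundle replication asynchronous
[search]
sync_bundle_replication = 0
```

The mounted-bundle settings and the asynchronous setting are alternatives, not a set: the first removes replication of the lookup altogether, the second keeps replication but should stop searches from waiting on it.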