I am at a client where they are setting up a system based on a CSV lookup file. This file is managed by another system and will be output to the appropriate Splunk config directory on some interval.
Assuming they have that system write to a temporary file and then atomically move it to the CSV filename when finished, are they going to run into any issues with this lookup failing? They have summary generation searches that need to run and use this lookup. I assume Splunk keeps a lookup file in memory, but probably also a hash of it, so that it knows when to re-read the file from disk if it changes. Is this correct?
In 4.0 and above, each search runs in its own process, so when a search requires a lookup table, that Splunk search process opens it. POSIX semantics also guarantee that a deleted file remains available to any process that still holds an open handle to it, for as long as that handle stays open. As long as your move-into-place is truly atomic, in-flight searches will keep reading the old copy and new searches will pick up the new one. This includes summary generation, because it runs in a search process as well.
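To illustrate, here is a minimal sketch of the write-to-temp-then-atomic-rename pattern the producing system would use, assuming it can be scripted in Python. The function name `atomic_write` and the `.lookup.` temp-file prefix are illustrative, not part of any Splunk API:

```python
import os
import tempfile

def atomic_write(path, data):
    """Write data to a temp file in the same directory, then atomically
    replace the target. Readers that already have the old file open keep
    seeing the old inode (POSIX semantics); new opens see the new file."""
    dirname = os.path.dirname(path) or "."
    # The temp file must live on the same filesystem as the target,
    # otherwise the rename is a copy and no longer atomic.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".lookup.")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the contents hit disk first
        os.replace(tmp, path)     # atomic rename on POSIX
    except Exception:
        os.unlink(tmp)
        raise
```

The key point is that `os.replace` maps to `rename(2)`, which atomically swaps the directory entry: there is never a moment when a search process can open a half-written CSV.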
Note that you may temporarily need 2x the disk space for your lookup, since both copies exist on disk while older search processes still hold handles to the replaced file.
However, even taking the above into account -- I would treat exactly how Splunk handles lookup tables during a search as an implementation detail. It could change in the future, but I would expect the atomic-replace pattern to remain applicable.