I am working with a hosting provider (Pantheon) that allows access to the access logs but does not allow installing a forwarder on their servers. So I installed a forwarder on a server I control, set up a process to pull in the log files, and configured the forwarder to index those files.
This question has been edited after additional information was found.
I came across a scenario where I was unaware of the round-robin DNS in place: when connecting to the hostname to pull in the log file, I could have reached any one of several different servers. Each time I pulled in the log file it would overwrite the previous copy, and if I didn't connect to the same server as last time, the log file would appear as a new file to Splunk and get indexed again. And again, and again, as the DNS result switched between the various end locations.
This caused a very strange result: many, many duplicate entries, and the random nature of DNS made it hard to spot a pattern. It was confused further by a comment that mentioned log file truncation while trying to explain the results. It appeared there was a circular log rotation practice in place, but this was not the case.
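For reference, here is a minimal sketch of a pull script that resolves every address behind the round-robin name and keeps each server's copy in its own directory (one of the two layouts weighed in the question below). The hostname, paths, and SSH user are placeholder assumptions, and it presumes rsync-over-SSH access to each server:

```python
#!/usr/bin/env python3
"""Pull the access log from every server behind a round-robin DNS name,
keeping each server's copy in its own directory so copies never overwrite
each other. Hostname, paths, and SSH user below are hypothetical."""
import socket
import subprocess
from pathlib import Path

RR_NAME = "logs.example.com"               # hypothetical round-robin name
REMOTE_PATH = "/var/log/nginx/access.log"  # hypothetical remote log path
LOCAL_ROOT = Path("/opt/pulled-logs")      # root the forwarder will monitor
SSH_USER = "logpull"                       # hypothetical pull account

def all_ips(name: str) -> set[str]:
    """Return every IPv4 address the round-robin name resolves to."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    return {info[4][0] for info in infos}

def pull(ip: str) -> None:
    """rsync the log from one concrete server into its own directory."""
    dest = LOCAL_ROOT / ip
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["rsync", "-az", f"{SSH_USER}@{ip}:{REMOTE_PATH}", str(dest) + "/"],
        check=True,
    )

if __name__ == "__main__":
    for ip in sorted(all_ips(RR_NAME)):
        pull(ip)
```

Resolving all the A records up front means a copy is pulled from every server on every run, instead of from whichever one the resolver happened to return that time.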
The new question
Now that we are connecting to multiple servers to index a file of the same name, on servers whose names could change over time, would it be better to rename each file to include the hostname and monitor a single directory, or to create a directory for each host and have the forwarder recursively monitor all files in all directories under the root of where the files are stored?
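For what it's worth, if the directory-per-host layout wins out, a sketch of the matching inputs.conf stanza might look like the following, assuming the /opt/pulled-logs root from the script above; monitor inputs recurse into subdirectories by default, and host_segment can set the host field from a path segment:

```
# Recursively monitor everything under the pull root (recursion is the default
# for monitor inputs).
[monitor:///opt/pulled-logs]
# /opt/pulled-logs/<host>/access.log -> segment 3 of the path is <host>
host_segment = 3
sourcetype = access_combined
disabled = false
```

With host_segment, each event is tagged with the server it came from without having to rename any files, which is one point in favor of the directory-per-host approach.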