I'm seeing an odd issue. A high-volume log source is picked up by a Splunk forwarder and sent to a specific index through an autoLB DNS name that points to 2 indexers.
I can see from the internal metrics that both indexers are receiving the log data (over 10 GB per day spread across the 2 indexers). The problem is that when I search for this sourcetype, I see only a handful of the events (about 1,000 over the past day when there should be a few million). If I select an event that I can see in the simple sourcetype search and then do a 'view source', I see thousands of events in the source, which suggests the data is getting into Splunk fine but is just not searchable.
The forwarder is version 4.0.9 (our current standard). Other sources are set up in a similar fashion with this forwarder version and work fine. Additionally, we pointed the source to our lab/test indexer and it shows up fine.
If you can confirm the appropriate metrics on the indexing side, then the data is likely in the index. However, you need to run a search that will find out where Splunk stored it. Typically, you can run a search over all time for just that specific source or sourcetype, and that will show you where the data has been stored. Most likely, the event data is being timestamped incorrectly or merged into one large multi-line event.
A search over all time does not bring up the logs. One thing I have noticed is that when performing a search for the sourcetype, I see things like:
11 matching events|295,308 scanned events
The scanned event count corresponds closely to the number of events that would be expected for the source.
Please supply your exact search query, along with any details about how the source data is classified or input to Splunk.
We were able to confirm that the logs were getting indexed; they just were not available for viewing in Splunk. The data concerned was a series of zipped, delimited log files, each containing a header row, many data rows, and then a summary row. For some reason, Splunk made only the header and summary rows of these files searchable, despite indexing everything. After much looking around, the fix seems to have been to add
crcSalt = <SOURCE> to the inputs.conf file. I also ran
splunk clean all on the forwarder. Now the data comes in and is fully searchable.
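For reference, a minimal sketch of what such a stanza might look like. The monitor path, index name, and sourcetype name are hypothetical placeholders, not the values from this setup; only the crcSalt line reflects the change described above.

```ini
# inputs.conf on the forwarder
# NOTE: the path, index, and sourcetype below are placeholders for illustration only.
[monitor:///var/log/myapp/*.gz]
index = my_index
sourcetype = my_delimited_logs
# Mix the full source path into the checksum Splunk uses
# to decide whether it has already seen a file.
crcSalt = <SOURCE>
```

As far as I understand it, Splunk fingerprints each monitored file with a CRC of its first bytes, so files that all begin with the same header row can look like a file it has already read. Salting the CRC with the source path makes each file distinct. Treat that as a likely explanation rather than a confirmed root cause for the behaviour seen here.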