
Log not indexed entirely



I have been racking my brain over this for the past few days and cannot seem to figure it out. For testing purposes we have a "Test" Splunk indexer and a "Production" Splunk indexer that was set up recently.

The test Splunk indexer is older and has been properly indexing the forwarder's log data. The new indexer is not.

In the forwarder inputs.conf I have the following:

sourcetype = log4j
crcSalt = <SOURCE>
ignoreOlderThan = 7d
disabled = false
whitelist = \.(log|err)$
blacklist = \.zip

On the "Test" Splunk indexer, if the following query is run:

error OR ftp OR host="somehostname" (this has been simplified for brevity)

Search results are as expected. On the production instance, nothing turns up, or it looks like it can't retrieve any data from before the initial setup date.

Any assistance anyone can give would be helpful.


Splunk Employee

What wasn't mentioned here is the forwarder configuration. I think it was implied that the forwarder should be sending data to both indexers, so I'll proceed with that in mind. I would check that outputs.conf is correctly configured for data cloning, presuming that you want both indexers to receive the same data.
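
As a rough illustration of the cloning setup mentioned above, the forwarder's outputs.conf might look something like the sketch below. The group names, hostnames, and port 9997 are placeholders, not values taken from the original question, so substitute your own. Listing more than one target group in defaultGroup is what makes the forwarder send a copy of each event to every group.

```ini
# outputs.conf on the forwarder (sketch; group names, hosts and port are assumed)
[tcpout]
# Two target groups listed here means every event is cloned to both
defaultGroup = test_indexer, prod_indexer

[tcpout:test_indexer]
server = test-indexer.example.com:9997

[tcpout:prod_indexer]
server = prod-indexer.example.com:9997
```

If both hosts were instead listed in the server setting of a single group, the forwarder would load-balance between them rather than clone, and each indexer would receive only part of the data.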

I would also check splunkd.log for the TcpOutputProc component messages. You want to be sure that connections are successfully being made to the production Indexer from the forwarder. It could be a basic connectivity issue. If it isn't, we can move on to looking at other items.

Since the test indexer is receiving the data, this wouldn't seem to be an issue of the forwarder reading the data, so the TailingProcessor:FileStatus will just confirm what you already know, that data is being read by the forwarder.

The other possibility here is that you're successfully connecting to the production Indexer, and the data is being sent, but it is being hidden from you for some reason.

Usually, if you expect the data to be current, you search for the last n minutes, hours, days, etc. It could be that on the production Indexer, timestamps are being misapplied and the data is being indexed with timestamps from last year or last month. It could also be that an incorrect source or host is being applied because of a configuration on the Indexer.

For these kinds of cases, you can change the time range drop-down from whatever you've been using to 'all time, real time', which is under 'Real Time'. It shows all events currently coming in, irrespective of the timestamp applied.

Should you find that you've got misapplied timestamps, you probably want to configure Splunk for that sourcetype/source to recognize the timestamp explicitly using these instructions.
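
For illustration, explicit timestamp recognition for the log4j sourcetype could be sketched in props.conf roughly as follows. This assumes each event begins with a timestamp in the form 2014-01-14 13:09:41.128; adjust TIME_PREFIX and TIME_FORMAT if your log4j layout differs. It would go on the instance that parses the data, which is normally the indexer when a universal forwarder is in use.

```ini
# props.conf (sketch; assumes events start with e.g. 2014-01-14 13:09:41.128)
[log4j]
# Timestamp is assumed to sit at the very start of the event
TIME_PREFIX = ^
# Year-month-day, 24-hour time, then milliseconds
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
# The assumed timestamp is 23 characters long
MAX_TIMESTAMP_LOOKAHEAD = 23
```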

If it's not a time stamp problem and you're not seeing data with the search you were using, remove the source/host/sourcetype entries to see if they're not being applied in the manner you expect. If that's the case, it's time to re-examine the indexer configuration.

If you don't find the events coming in at all when searching with the "all time, real time" constraints, then it's sometimes helpful to pick a particular event on disk that Splunk should have read. Choose something semi-unique about the event, such as the timestamp. Use an "all time" search over all indexes with the unique string. Maybe something like this:

index=* "2014-01-14 13:09:41.128"

Hopefully one of the items here will reveal what is occurring. This is the process I follow when presented with these kinds of issues, and it's usually been successful in turning up the problem.


Thank you! I will look into this.



That is not working as expected, or it is not what I need. It seems to show only files read on the local system of the indexer (for both installs).


Revered Legend

Follow the steps in this post to see which files are being processed, which are not, and why.



Yes. Specifically from 1/14/14 and 1/15/14.



Do you have events newer than 7 days old that should match?
