Recently, I've begun noticing that one of our lightweight forwarders is not sending data that we expect to see on the indexer (4.1.4 Linux 64-bit).
Looking at the LWF (4.1.4 on HPUX 11.31 IA64), I see the following in the logs:
10-05-2010 12:36:27.189 INFO TailingProcessor - Could not send data to output queue (parsingQueue), retrying...
10-05-2010 12:36:27.189 INFO TailingProcessor - ...continuing.
I had a look at
but that doesn't quite seem to be my problem unless I'm missing something.
If I do "grep blocked=true var/log/splunk/metrics.log*" I get a few matches, but nothing current. I certainly see plenty of INFO events from the metrics log showing up on the indexer.
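To see which queues those blocked=true matches belong to and when each was last blocked, the grep can be extended into a small script. This is only a rough sketch: the function name count_blocked is made up for illustration, and it assumes the usual metrics.log line shape of "timestamp INFO Metrics - group=queue, name=..., blocked=true, ...", which is not quoted anywhere in this thread.

```python
import re
from collections import Counter

# Assumed metrics.log line shape (an assumption, not quoted in the thread):
# 10-05-2010 12:36:27.189 INFO Metrics - group=queue, name=parsingqueue, blocked=true, ...
NAME_RE = re.compile(r"\bname=(\w+)")


def count_blocked(lines):
    """Count blocked=true queue lines per queue and remember the last timestamp."""
    counts = Counter()
    last_seen = {}
    for line in lines:
        # Only queue metrics that report a blocked queue are of interest.
        if "group=queue" not in line or "blocked=true" not in line:
            continue
        match = NAME_RE.search(line)
        if not match:
            continue
        name = match.group(1)
        counts[name] += 1
        # The first 23 characters hold "MM-DD-YYYY HH:MM:SS.mmm".
        last_seen[name] = line[:23]
    return counts, last_seen
```

Feeding it the lines of var/log/splunk/metrics.log* (oldest file first) should give one count and one most-recent timestamp per queue, which makes it easier to tell old blockages from current ones.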
If I run "splunk list monitor", the file I expect to be monitored is indeed in the list, but none of its events show up on the indexer.
What am I missing (other than my monitored files 🙂 )?
If you run:
index=_internal source=*metrics.log* group=queue | timechart perc95(current_size) by name
what kind of numbers do you see? If you use a heat map, what colors do you see for each of the queues?
Lastly, can you confirm that the indexer is in the same timezone as the timestamps in the logs?
I'm embarrassed to say that this was my own error. I hadn't noticed the error in the logs that was indirectly telling me there was a problem with my outputs.conf. I had ruled that out early because Splunk's own log events were clearly reaching the indexer, even though my local app events were not. But it was ultimately a bad outputs.conf file that had slipped in there.
The symptom I originally posted about was not the real culprit. Sorry for the confusion and thanks to those who tried to help me out.
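The thread does not say what exactly was wrong with the outputs.conf, so the following is only a hedged sketch of the kind of sanity check that might catch a bad one early. It assumes the common layout of a [tcpout] stanza with defaultGroup plus one [tcpout:<group>] stanza per group carrying a server setting; the function name check_outputs and the two rules it applies are illustrative choices, not Splunk's own validation.

```python
import configparser


def check_outputs(text):
    """Return a list of problems found in outputs.conf-style text."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep setting names case-sensitive (defaultGroup)
    parser.read_string(text)

    problems = []
    groups = [s for s in parser.sections() if s.startswith("tcpout:")]
    if not groups:
        problems.append("no [tcpout:<group>] stanza found")

    # Assumed rule: every target group should name at least one server.
    for stanza in groups:
        if not parser.get(stanza, "server", fallback="").strip():
            problems.append(f"[{stanza}] has no server setting")

    # Assumed rule: every group listed in defaultGroup should exist.
    default = parser.get("tcpout", "defaultGroup", fallback="")
    for name in filter(None, (g.strip() for g in default.split(","))):
        if f"tcpout:{name}" not in parser.sections():
            problems.append(f"defaultGroup names unknown group {name!r}")
    return problems
```

Passing the contents of each outputs.conf on the forwarder (the system one and any shipped inside an app) returns an empty list when both assumed rules hold. Python's configparser is only an approximation of Splunk's conf parsing, so a clean result here does not prove the file is valid.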
The LWF's most recent "blocked=true" message is from 10/1 on the parsingqueue. The indexer has quite a few of them for several queues: aggqueue, indexqueue and typingqueue. 83 in total over the last few days on the indexer. While I'd be the first to admit that the indexer is a woefully underpowered piece of hardware (we're working to replace it), it's still only being asked to index less than 1GB/day. Events from this particular LWF show up eventually, but as of right now, the most recent events are from noon today.