"Files & Directories" Monitoring not reading all files?

KLOSTR2
Engager

So here's the deal; I've pulled down a week’s worth of logs in a hierarchically structured folder from our local server, where each log file is arranged like so:

C:\User\UserNameHere\…\DirectoryPathGivenToSplunk\HostHere\…\…\ApplicationHere\...\LogTree\DAY[1-7]\application.log

I’ve passed this file tree into Splunk, giving it the <DirectoryPathGivenToSplunk> folder, and it indexed almost all of the files. The key word being almost; some files don’t register as being indexed. (i.e. their events aren't coming up in my searches.) I’ve double-checked that I have no blacklists or whitelists being enforced on upload, that the file is present, the correct type, not empty/null, not read-only or hidden, and that its contents are formatted properly; frankly, I’m stumped at what else may be causing the disconnect. Any ideas?

PS: I’m using Splunk Enterprise (trial, I think), version 6.2.3, on Windows 7. Please ask if more information is required.

Edits: I've used the list monitor command and double-checked that the files whose logs are missing are indeed in the list of monitored files. In addition, the entire file structure is only ~100MB, a mere fifth of my daily indexing volume, and immediately after indexing the directory, almost all the logs appear in searches, so I'm rather doubtful that it's an issue with volume or speed. Even giving it 24+ hours to look for the missing files hasn't helped. And before you ask, the missing files aren't significantly bigger or smaller than any of the others.

I would use a simpler file structure if given the chance, but I've been using the source paths to carry information about the logs that isn't present in the logs themselves. Uploading the missing files individually would be an option, if it weren't for 2 issues:

  1. There are about 30-40 missing files. (And that's just from a cursory glance.)
  2. As mentioned earlier, I'm trying to "smuggle" some data regarding the logs in their source file's paths; this would be lost if I were to just upload them individually. (Monitor each missing file individually, you say? Well, it's possible....)
1 Solution

KLOSTR2
Engager

It seems that Splunk is smarter than I am...

Upon closer inspection, the contents of the "missing files" were just duplicates of the previous day's logs; as such, I don't believe Splunk saw fit to index the same events twice. The problem was that my searches to make sure that all files were being monitored counted distinct sources, not distinct events within the files themselves.

Takeaway lesson: If you seem to have files that aren't being indexed, double-check that the contents aren't duplicates of already existing logs.
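
For anyone wanting to run the same check before digging into Splunk itself, here is a minimal Python sketch that groups log files by a hash of their full contents. The function name find_duplicate_logs and the "*.log" pattern are my own assumptions for illustration; nothing here is part of Splunk, and it only tells you which files on disk are byte-for-byte identical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_logs(root, pattern="*.log"):
    """Return lists of files under root whose full contents are identical.

    Hypothetical helper for illustration; it is not part of Splunk.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            # Hash the whole file; fine for small trees like the ~100MB one here.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only the groups that contain more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling find_duplicate_logs on the folder you gave to Splunk returns one list per set of identical files, so a DAY2 log that merely repeats DAY1 shows up next to it.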

sharan928
Engager

Please add the crcSalt parameter to inputs.conf so that Splunk treats the files as separate even if their first 256 bytes match.

[monitor:///]
index = temp
crcSalt = <SOURCE>
ignoreOlderThan = 24h

Write <SOURCE> literally, with no spaces between the angle brackets and SOURCE.

sumitsaha
Engager

saved my day

jnussbaum_splun
Splunk Employee

You could try to run: splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus from your forwarder to see if the tailing processor is reading, has read, or is skipping them. Please let us know what you find.

santhoshi
Explorer

Hello,

I am facing the same issue. Logs are indexed only when I restart the forwarder; nothing is indexed after that. I ran this command (splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus) on the forwarder and found no stanzas at all for the newer files. I also see no "watched file" line for the newer files that are not being indexed. There is no connectivity issue with the DS or the indexer, and I do not see any errors in splunkd.log.

Can someone please help troubleshoot or suggest where the issue could be? 😞

ankitarath2011
Path Finder

@santhoshi
Is your issue fixed? If so, could you please share the solution? Thanks.

martin_mueller
SplunkTrust

Make sure your files start differently and don't, for example, share a long identical header. A shared header can make Splunk think it has already seen the file (as it would with a rotated log) and skip indexing it.
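
One rough way to check for this outside Splunk is to group files by a checksum of their first bytes. The Python sketch below uses the 256-byte figure quoted elsewhere in this thread and zlib.crc32 as a stand-in checksum; both are assumptions for illustration, and this does not reproduce Splunk's actual CRC calculation. The name group_by_prefix and the "*.log" pattern are hypothetical too.

```python
import zlib
from collections import defaultdict
from pathlib import Path

PREFIX_BYTES = 256  # figure quoted in this thread; treat it as an assumption


def group_by_prefix(root, pattern="*.log", prefix_bytes=PREFIX_BYTES):
    """Return lists of files under root that share the same leading bytes.

    Hypothetical helper: zlib.crc32 stands in for whatever checksum
    Splunk really uses, so this only approximates its behaviour.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            with path.open("rb") as fh:
                head = fh.read(prefix_bytes)  # only the start of the file
            groups[zlib.crc32(head)].append(path)
    # Files alone in their group start differently from all the others.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Any list it returns is a set of files that begin identically, which makes them candidates for being mistaken for one another, even if they differ further down.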

woodcock
Esteemed Legend

Indexing is not instantaneous, and if you have a big batch of files on a forwarder, it is going to take a while for it to clear the backlog. What does this show on one of your lagging forwarders?

/opt/splunk/bin/splunk list monitor

It probably shows all the files you are expecting but the single Splunk instance has not been able to get through them all yet. How long have you given it?
