How to monitor newly written log lines from a log file whose modtime is older than 12 hours?

strive
Influencer

Hi,

Our monitor configuration is:

[monitor:///opt/diags.log*]
disabled = false
host = $decideOnStartup
sourcetype = diag_snapshot
blacklist = \.(gz)$
index = my_index
initCrcLength = 1024
ignoreOlderThan = 12h

When we deployed this configuration, there was a log file whose modtime was older than 12 hours. When I restart Splunk, it checks the modtime, sees that it is older than 12 hours, and does not forward the data to the indexer. A few hours later some lines are written to the log file, but the Splunk forwarder still does not send that log file to the indexer.

According to the Splunk documentation:

A file whose modtime falls outside this time window when seen for the first time will not be indexed at all.

Is there a way to make the Splunk forwarder forward the newly written log lines to the indexer?

I know for a fact that if I restart Splunk again, the new lines will be indexed. Please note that this is a production system, so manually monitoring files and restarting Splunk is not possible.

Please let me know if there is any setting that will work along with ignoreOlderThan.

Thanks,
Strive

somesoni2
Revered Legend

The issue you're seeing is caused by the ignoreOlderThan setting. When Splunk restarts, it builds a monitoring list (the files and directories it will monitor) and an ignore list (files and directories that are inside a monitored directory or match a monitor stanza, but should not be monitored because of filter settings such as blacklist or ignoreOlderThan).

When Splunk started, your file's modification time was older than the ignoreOlderThan setting, so the file went onto the ignore list. Files on the ignore list are not checked again until the forwarder restarts or the monitoring configuration is reloaded.

So your options are either to increase the ignoreOlderThan setting (generally it should be equal to or greater than the interval at which the file gets updated; in your case the file sometimes goes more than 12h between modifications, so a fixed 12h will not always work), or to set up a scripted input that either restarts the forwarder or reloads the data input configuration.
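
For the first option, here is a minimal sketch of the stanza from the question with a wider window. The 7d value is only an illustration and an assumption on my part, not something taken from your environment; pick a value at least as long as the longest gap you expect between writes to the file.

```
[monitor:///opt/diags.log*]
disabled = false
host = $decideOnStartup
sourcetype = diag_snapshot
blacklist = \.(gz)$
index = my_index
initCrcLength = 1024
# Illustrative value only (assumption): should cover the longest quiet period of the file.
ignoreOlderThan = 7d
```

The changed setting only takes effect once the forwarder is restarted or the monitor inputs are reloaded.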

You can reload the data input configuration using the CLI:

$SPLUNK_HOME/bin/splunk _internal call /services/data/inputs/monitor/_reload -auth admin:YourAdminPwdOnFwd
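
For the scripted input option, a rough sketch of a wrapper script could look like the one below. It only wraps the CLI call shown above. The file name reload_monitor.sh, the SPLUNK_AUTH variable, the fallback install path, and the one-hour interval in the commented stanza are all hypothetical choices for illustration, so treat this as a starting point rather than a tested deployment.

```shell
#!/bin/sh
# Sketch of a scripted-input wrapper (hypothetical file name: bin/reload_monitor.sh).
#
# Assumed inputs.conf stanza in the same app (interval in seconds, illustrative):
#   [script://./bin/reload_monitor.sh]
#   interval = 3600
#   disabled = false

# The fallback path is an assumption; SPLUNK_HOME is expected to be set already.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunkforwarder}"

# Calls the reload endpoint quoted in the answer above.
# SPLUNK_AUTH ("user:password") is a hypothetical variable, used here to keep
# credentials out of the script body.
reload_monitor_inputs() {
    "$SPLUNK_HOME/bin/splunk" _internal call \
        /services/data/inputs/monitor/_reload -auth "$SPLUNK_AUTH"
}

# Only attempt the reload when the splunk binary is actually present.
if [ -x "$SPLUNK_HOME/bin/splunk" ]; then
    reload_monitor_inputs
else
    echo "splunk binary not found under $SPLUNK_HOME" >&2
fi
```

Anything the script prints to stdout is treated as event data by the scripted input, so you may want to send it to a dedicated index or sourcetype.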

horsefez
Motivator

Hi Strive,

I'm not entirely sure how this problem occurs on your system.
I do understand the part about files older than 12h being written to.

But when there is a change in the log file, why isn't the modtime updated?

Quick example:

It's 6 AM:
file A was modified at 1 AM
file B was modified at 10 PM (previous day)
file C was modified at 5 PM (previous day)

If Splunk runs, files A and B get collected (or rather, the changes to them).

Three hours later:

It's 9 AM now:
file A was modified at 1 AM
file B was modified at 10 PM (previous day)
file C was modified at 5 PM (previous day)

Somebody adds some lines of events to file C.

File C's modtime changes to 9 AM.

If Splunk reruns, files A and C get collected (or rather, the changes to them).

That's what I think should happen when Splunk operates normally.
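
To see which files sit outside the 12h window at a given moment, a quick check along these lines could be run on the forwarder host. It is only a sketch: the function name list_older_than is made up for illustration, it relies on find's -mmin and -maxdepth tests (available in GNU and BSD find, not in POSIX), and it only looks at modtimes, not at what Splunk has decided internally.

```shell
# List regular files in a directory that match a name pattern and whose
# modtime is older than the given number of minutes.
# Arguments: directory, name pattern, window in minutes.
list_older_than() {
    dir=$1
    pattern=$2
    minutes=$3
    find "$dir" -maxdepth 1 -type f -name "$pattern" -mmin "+$minutes"
}
```

With the stanza from the question, 12h is 720 minutes, so the call would be list_older_than /opt 'diags.log*' 720.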

Could you please describe your problem in a bit more detail?

Thanks!
pyro_wood
