How NOT to index old logs

nembela
Path Finder

My problem is the following:
When I install a universal forwarder on a Windows host, it begins forwarding every entry from the event logs. But I don't need the entries that are older than 10 days.
I tried the following props.conf file (both on the indexer and the forwarder):

[default]
MAX_DAYS_AGO = 10

But it still forwards every entry. What have I done wrong?

Thanks in advance

1 Solution

jbsplunk
Splunk Employee

That should be what you'd need to do on your indexer, as that is where the data is being parsed. I'm not sure why it isn't working, but it could be a file precedence issue. Have you put this entry into $SPLUNK_HOME/etc/system/local? Settings in that directory take the highest precedence. I'm also not sure what behavior to expect from putting it under the [default] stanza. I would suggest trying it under a [<sourcetype>], [host::<hostname>], or [source::<source>] stanza to see whether that changes anything.
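As a rough sketch of that suggestion, the setting could be moved out of [default] and into a sourcetype stanza in $SPLUNK_HOME/etc/system/local/props.conf on the indexer. The sourcetype name below is only an assumption for illustration; substitute whatever sourcetype your events actually carry.

```
# $SPLUNK_HOME/etc/system/local/props.conf (on the indexer)
# "WinEventLog:Security" is an assumed sourcetype name; check your own data.
[WinEventLog:Security]
MAX_DAYS_AGO = 10
```

One caveat, as far as I understand the props.conf spec: MAX_DAYS_AGO limits which extracted timestamps are treated as valid rather than discarding events, so it may not filter out old entries on its own.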

You could also try the following setting in inputs.conf on the UF:

ignoreOlderThan = <time window>
 * Causes the monitored input to stop checking files for updates if their modtime has passed this threshold.
  This improves the speed of file tracking operations when monitoring directory hierarchies with large numbers
  of historical files (for example, when active log files are colocated with old files that are no longer
  being written to).
 * A file whose modtime falls outside this time window when seen for the first time will not be indexed at all.
 * Value must be: <number><unit> (e.g., 7d is one week).  Valid units are d (days), m (minutes), and s (seconds).
 * Default: disabled.
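
A minimal sketch of how this might look in inputs.conf on the forwarder, assuming a file monitor input. The monitored path is hypothetical; the 10d value mirrors the ten-day window from the question.

```
# inputs.conf on the universal forwarder
# The path below is a hypothetical example, not taken from this thread.
[monitor://C:\logs\myapp]
# Skip files whose modtime is more than 10 days old
ignoreOlderThan = 10d
```

Because the check is made against the file's modification time, a file that is still being written to but contains old events would still be read in full.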

Another useful one on the forwarder might be:

followTail = [0|1]
 * Determines whether to start monitoring at the beginning of a file or at the end (and then index all events 
  that come in after that). 
 * If set to 1, monitoring begins at the end of the file (like tail -f).
 * If set to 0, Splunk will always start at the beginning of the file. 
 * This only applies to files the first time Splunk sees them. After that, Splunk's internal file position 
  records keep track of the file. 
 * Defaults to 0.
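
For comparison, a sketch of followTail on a monitor stanza, again with a hypothetical path:

```
# inputs.conf on the universal forwarder
# Hypothetical path, for illustration only.
[monitor://C:\logs\myapp\app.log]
# Start at the end of the file the first time it is seen
followTail = 1
```

Both ignoreOlderThan and followTail are described for file monitor inputs. For Windows Event Log stanzas, the inputs.conf spec lists current_only as a comparable option, which may be closer to what the original question needs. Treat that as a pointer to check rather than a tested recommendation.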

bimord
Path Finder

Valid units for ignoreOlderThan also include h (hours):

ignoreOlderThan = <non-negative integer>[s|m|h|d]

Reference: https://docs.splunk.com/Documentation/Splunk/7.2.6/Admin/Inputsconf

jhedgpeth
Path Finder

Adding a comment for posterity. With a large app deployment configuration, and given the recommendation not to use followTail in an ongoing fashion, I've found ignoreOlderThan to be a safety valve that avoids pulling in GBs worth of data I don't want when deploying to new servers.

I put this at the top of most of my new apps' inputs.conf:

[default]
ignoreOlderThan = 7d

[other stuff...]
...

This effectively caps how many old or rotated logs the forwarder pulls in during the first run, and avoids having to deploy twice to set and then unset followTail.
