We had an Exchange index go hog-wild yesterday, so last night we commented out the inputs.conf stanza and ran 'splunk reload deploy-server' to stop taking input and protect the license.
This morning I uncommented the stanza and changed ignoreOlderThan from its previous value to 'ignoreOlderThan = 1d'. I thought 1d would mean today only, but it indexed everything from yesterday too. I will likely have the same problem again today and have to stop the forwarding, but when I come in Monday I'll want to restart the forwarding and indexing, and I want to know how to make it take only the current day's logs in that directory. Does 'ignoreOlderThan = 0d' make any sense? I don't think the documentation is clear on this at all. Also, it mentions days, minutes and seconds, but not hours. Is that intentional? Are hours not supported?
ignoreOlderThan looks at the file's modification timestamp. If the file keeps getting appended to, it will keep being checked for new data.
What you might need is MAX_DAYS_AGO from props.conf, but I don't think that can be tuned to hours. Worth a try, though.
There is a new log file every day. So if ignoreOlderThan = 1d meant today only, then yesterday's log entries should have been ignored. I'm thinking that it must mean today + 1 day ago.
Can MAX_DAYS_AGO be used in props.conf on the forwarder? I'd rather drop the data there than on the indexer.
"Older than one day" sounds like "older than 24 hours", which would not be true for a log file that stopped being modified at midnight today.
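To make the 24-hour reading concrete, here is a minimal inputs.conf sketch. The monitor path is a placeholder, not taken from the original post, and the comments reflect my understanding of the setting rather than a tested configuration.

```ini
# inputs.conf on the forwarder (path below is hypothetical)
[monitor:///path/to/exchange/logs]
# Compared against each file's modification time, not the calendar date.
# With 1d, a file last written 23 hours ago (yesterday's log) is still
# inside the window and gets read.
ignoreOlderThan = 1d
```

So a log that was last written at 23:59 yesterday stays eligible until 23:59 today.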
As for MAX_DAYS_AGO, that can only be tested if a timestamp was parsed, hence not on a universal forwarder.
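For reference, a sketch of where that setting would live. The sourcetype name is hypothetical, and this assumes an indexer or heavy forwarder, since that is where timestamps are parsed. As far as I know the setting only limits how old an extracted timestamp may be and still be accepted as valid, so check the props.conf documentation before relying on it to keep old events out.

```ini
# props.conf on the indexer or a heavy forwarder
# (sourcetype name below is hypothetical)
[exchange_logs]
# Whole days only; there is no hour-level granularity.
MAX_DAYS_AGO = 1
```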
You could just blacklist yesterday's file.
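A sketch of that blacklist, assuming the date is part of the file name. The path and the file name pattern are made up for illustration; blacklist takes a regular expression that is matched against the file path.

```ini
# inputs.conf on the forwarder (path and file name are hypothetical)
[monitor:///path/to/exchange/logs]
# Skip the file whose name carries yesterday's date, e.g. app_20240114.log
blacklist = app_20240114\.log$
```

The downside is that the pattern is static, so it has to be edited each time the date to skip changes.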
Using 'ignoreOlderThan = 0d' in the forwarder's inputs.conf file works for retrieving today's logs and does ignore yesterday's, so that is the way to do it.
Do not use this setting. I can confirm that
ignoreOlderThan = 0d
will be interpreted as
ignoreOlderThan = 0
and cause all of your data to be indexed.
Instead use a nonzero value. For example, a good candidate for ignoring data from 2 days ago is:
ignoreOlderThan = 24h
I found out later that the sysadmin of the system I was taking the logs from had zipped up a bunch of log files. That made their data "today" and the forwarder just sucked 'em up! Apparently Splunk can index zipped files!
Indeed it can. The most prominent example would be the tutorial sample data zipfile 🙂