Hello,
I am about to onboard 1000+ Windows UFs. These hosts have Windows event logs going back many years. Is there a way to exclude any Windows event log entries older than 7 days from being ingested during the initial onboarding?
For log files there's an option for inputs.conf on the UF, but nothing similar for eventlog?
Kind Regards
Andre
Support confirmed: there is no way to exclude old Windows event logs from being imported.
"The Splunk Universal Forwarder's Windows Event Log input doesn't offer a built-in way to filter events based on age during initial data collection. This means you can't directly configure the forwarder to only send events newer than 7 days when it first starts monitoring.
You'll need to use other methods, like filtering at the indexer level or leveraging props/transforms.conf after the data is indexed, to remove older events. Or else take the backup of the event viewer logs which are older than 7 days in the source machine and remove them before onboarding to splunk."
Kind Regards
Andre
Wait a second. You can't use props/transforms after the events have been indexed. You can do that during indexing after initial ingestion by input.
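For reference, a minimal sketch of what index-time filtering could look like on the indexers (or a heavy forwarder). It assumes INGEST_EVAL is available in your version and that _time still holds the original event time when the transform runs. The stanza name drop_old_winevents and the 7-day cutoff are illustrative and not something confirmed in this thread, so try it on a test indexer first.

```ini
# transforms.conf (hypothetical stanza name)
[drop_old_winevents]
# Send events whose timestamp is more than 7 days in the past to the nullQueue.
INGEST_EVAL = queue=if(_time < relative_time(now(), "-7d"), "nullQueue", queue)

# props.conf (same app, on the indexers)
[WinEventLog]
TRANSFORMS-drop_old_winevents = drop_old_winevents

[XmlWinEventLog]
TRANSFORMS-drop_old_winevents = drop_old_winevents
```

With this approach the UF still reads and forwards the old events; they are only discarded before being written to an index. If MAX_DAYS_AGO is also lowered for these sourcetypes, old events could be re-timestamped before the transform sees them, which would defeat the comparison.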
/opt/splunk/bin/splunk btool props list XmlWinEventLog:Security --debug | grep MAX_DAYS_AGO
/opt/splunk/etc/system/local/props.conf MAX_DAYS_AGO = 7
that should work, right? Present on all indexers. All indexers restarted. (Splunk Enterprise 9.4.2)
Time to log a support call?
Hi @Andre_
As @inventsekar mentioned, you could use MAX_DAYS_AGO as follows:
== props.conf ==
# If within 3 days old.
[WinEventLog]
MAX_DAYS_AGO = 3
[XmlWinEventLog]
MAX_DAYS_AGO = 3
This will then only apply to XmlWinEventLog/WinEventLog
Hello,
So I've created a props.conf on the indexer under the Windows_TA local folder and put this in:
[WinEventLog]
MAX_DAYS_AGO = 7
[XmlWinEventLog]
MAX_DAYS_AGO = 7
I onboarded another Windows server, and it still ingested Windows event logs going back a few years.
Any ideas why that's not working?
Kind Regards
Andre
Hi @Andre_
1) After updating/creating props.conf, did you restart splunkd on the indexer?
2) If yes, then please use the btool command to check whether the props.conf got applied or not (you can search for splunk btool options here in the community).
I've done a rolling restart of the cluster and checked. Looks like it "should" work but doesn't.
Since then, I tried this approach:
Put a "blacklist_all_WinEvent" app on the UF for its initial start. It is just an inputs.conf that has "blacklist1 = ." for all WinEventLog sources.
Let the UF do its initial thing, and an hour later I remove that app from the UF and restart the UF.
While not optimal, that would do the trick for onboarding existing servers, and automating that is easy enough.
Kind Regards
Andre
That does not work. Once you remove the blacklist, it ingests the old events.
trying that now, does it require a restart?
Hi @Andre_
Pls check the MAX_DAYS_AGO option on the props.conf
https://docs.splunk.com/Documentation/Splunk/9.4.2/Admin/Propsconf
MAX_DAYS_AGO = <integer>
* The maximum number of days in the past, from the current date as provided by the input layer (For example forwarder current time, or modtime for files), that an extracted date can be valid.
* Splunk software still indexes events with dates older than 'MAX_DAYS_AGO' with the timestamp of the last acceptable event.
* If no such acceptable event exists, new events with timestamps older than 'MAX_DAYS_AGO' uses the current timestamp.
* For example, if MAX_DAYS_AGO = 10, Splunk software applies the timestamp of the last acceptable event to events with extracted timestamps older than 10 days in the past. If no acceptable event exists, Splunk software applies the current timestamp.
* If your data is older than 2000 days, increase this setting.
* Highest legal value: 10951 (30 years).
* Default: 2000 (5.48 years).
MAX_DAYS_AGO doesn't cut it. It will make Splunk still index the events but it will assume that the timestamp parsed from the event was wrong so it would just assume another timestamp (whatever that would effectively be).
@Andre_ What you could do to prevent some data from being indexed is to add whitelists/blacklists in inputs matching certain timestamp values. That's a relatively ugly solution and is not something to be kept forever, but for a start it might be the way to go. Just be careful, because you do it differently when you're ingesting Windows logs as "classic" than when they are rendered as XML.
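A rough, untested sketch of the XML variant of that idea, assuming the input is rendered as XML (renderXml = true) and that each event carries a TimeCreated SystemTime attribute starting with the year. The Security channel and the hard-coded year range 2000-2024 are examples only and would need adjusting to your cutoff; classic (non-XML) events need a different key and are not shown here.

```ini
# inputs.conf on the UF (temporary onboarding app)
[WinEventLog://Security]
renderXml = true
# Skip events whose TimeCreated SystemTime year is 2000-2024.
# '%' is the regex delimiter because the XML attribute value is itself quoted;
# '.' stands in for that quote character.
blacklist1 = $XmlRegex=%SystemTime=.20(0\d|1\d|2[0-4])-%
```

Because the cutoff is baked into the regex, this only approximates "older than 7 days" and has to be maintained by hand, which is why it is better treated as a short-lived onboarding measure.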
MAX_DAYS_AGO - I would set this on the indexer? (our setup is UF -> Indexer)
Will that be a global setting for all incoming data?
Kind Regards
Andre
Another option you can consider is to change the destination path for the Windows Event Logs in Event Viewer and configure Splunk to monitor this new location. This approach allows you to start collecting only new events, effectively avoiding the indexing of historical data. Additionally, by using the standard Splunk input settings (without current_only = 1), you ensure that no events are missed during restarts, as Splunk will continue to track and ingest all new events from the updated log file.
Regards,
Prewin
Hi Giuseppe,
"ignoreOlderThan" only applies to log files, not windows event logs (like security events, application events, etc)
Kind Regards
Andre
Won't work. As you can see in the spec - it doesn't work on the event level but on the file's mtime. It is a setting for this particular input type and doesn't make sense in another context.
BTW, if you had files created sufficiently long time ago but containing events with present timestamps, it still wouldn't ingest those files.
It's not in the spec file, I tried and it does not work.
Hi @Andre_ ,
as you can read at https://docs.splunk.com/Documentation/Splunk/9.4.2/Admin/Inputsconf , to read only the events newer than 7 days, you have to use the ignoreOlderThan option in your inputs.conf:
ignoreOlderThan = <non-negative integer>[s|m|h|d]
* The monitor input compares the modification time on files it encounters
with the current time. If the time elapsed since the modification time
is greater than the value in this setting, Splunk software puts the file
on the ignore list.
* Files on the ignore list are not checked again until the Splunk
platform restarts, or the file monitoring subsystem is reconfigured. This
is true even if the file becomes newer again at a later time.
* Reconfigurations occur when changes are made to monitor or batch
inputs through Splunk Web or the command line.
* Use 'ignoreOlderThan' to increase file monitoring performance when
monitoring a directory hierarchy that contains many older, unchanging
files, and when removing or adding a file to the deny list from the
monitoring location is not a reasonable option.
* Do NOT select a time that files you want to read could reach in
age, even temporarily. Take potential downtime into consideration!
* Suggested value: 14d, which means 2 weeks
* For example, a time window in significant numbers of days or small
numbers of weeks are probably reasonable choices.
* If you need a time window in small numbers of days or hours,
there are other approaches to consider for performant monitoring
beyond the scope of this setting.
* NOTE: Most modern Windows file access APIs do not update file
modification time while the file is open and being actively written to.
Windows delays updating modification time until the file is closed.
Therefore you might have to choose a larger time window on Windows
hosts where files may be open for long time periods.
* Value must be: <number><unit>. For example, "7d" indicates one week.
* Valid units are "d" (days), "h" (hours), "m" (minutes), and "s"
(seconds).
* No default, meaning there is no threshold and no files are
ignored for modification time reasons
Ciao.
Giuseppe
I've seen the "current_only" option but discarded it, as it will not ingest any historical data.
If I set "current_only = 1" during initial deployment, it will not ingest old data. So far so good.
If the UF goes down for a period of time, after a restart it will not process the events that occurred while the UF was down. That is bad.
What happens if I deploy the UF with "current_only = 1" and remove the setting after a week? Will it start ingesting all historical events? Or could I use that as a temporary setting during the onboarding phase and remove it for the production phase?
Kind Regards
Andre
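For anyone who wants to try this, a minimal sketch of the two-phase idea. The Security channel is just an example, and whether switching back to current_only = 0 resumes from the saved checkpoint or backfills the full history is not confirmed anywhere in this thread, so test it on a single host before rolling it out.

```ini
# inputs.conf on the UF, onboarding phase only
[WinEventLog://Security]
disabled = 0
# Only collect events that arrive while the forwarder is running.
current_only = 1

# Intended production settings once the onboarding window has passed
# (replace the line above with these and restart the UF):
# current_only = 0
# start_from = oldest
```

Note that the blacklist-then-remove attempt reported earlier in this thread ended up ingesting the old events once the blacklist was removed, so a similar outcome here would not be surprising.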