Getting Data In

What would cause a Windows forwarder to re-scan an entire event log?

jeff
Contributor

Windows Server 2008 R2 x64 (Windows AD Domain Controller) / Splunk 4.1.1 set up as a full forwarder (custom app via deployment server).

Upon booting following a BSOD, Splunk re-sent the entire Windows Security log from its earliest event (>4 GB worth; >8 million events). Other logs did not seem to be similarly affected.

  1. Any ideas what would cause this?
  2. How might this be avoided in the future without losing events (I believe events would be lost with current_only=1)?
  3. Any recommendations on cleaning up the duplicate entries, or would it be best to leave them be? The events are commingled with data from other Splunk forwarders (AD domain controllers) in an index dedicated to AD events.

CerielTjuh
Path Finder

Just an idea, but this is how I would do it:

  • Delete the indexed events up until the date/time of the earliest entry in the current event log.
  • Tell Splunk to index the entire Windows event log (change inputs.conf).
  • After Splunk is done, change inputs.conf back to pick up only the newest events.

http://www.splunk.com/base/Documentation/latest/Admin/MonitorWindowsdata

start_from = oldest
current_only = 1
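
A minimal sketch of what the relevant inputs.conf stanza might look like for the backfill step (the [WinEventLog:Security] stanza name and settings here are assumptions; adjust to match the stanza your deployment app already uses):

[WinEventLog:Security]
disabled = 0
start_from = oldest
current_only = 0

# Once the backfill is done, switch to picking up only new events:
# current_only = 1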


jeff
Contributor

We'd end up losing events that way. The AD domain controller logs are quite chatty and purge their oldest events as quickly as new ones arrive (~40 events per second). Thanks for the thought, though.


jrodman
Splunk Employee

The markers for the event logs are stored on disk in $SPLUNK_HOME/var/lib/splunk/persistentstorage/... something or other. I'm presuming they somehow got corrupted or erased during the BSOD. Can you evaluate what's in this directory?
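
If it helps, a quick way to see what's in that directory (assuming the default Windows install path; yours may differ):

dir /s "C:\Program Files\Splunk\var\lib\splunk\persistentstorage"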

To identify the dupes you could run a search over a particular time range, something like:

host=problemhost sourcetype=WinEventLog* | stats count as frequency by _raw | where frequency > 1

If this reliably identifies them, someone smarter than me can figure out a search that keeps one event from each duplicated set and deletes the rest.
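
As a rough, untested sketch of that idea (streamstats numbers the copies of each identical event in the order search returns them, so everything past the first copy gets deleted; note that | delete requires the can_delete role and only masks events from search results, it does not reclaim disk space):

host=problemhost sourcetype=WinEventLog* | streamstats count as occurrence by _raw | where occurrence > 1 | delete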

jeff
Contributor

Yeah, I'd already figured out the search. If I cared enough I'd work out a way to safely purge the duplicates, but for now I'm going to let it be.
