Getting Data In

Re-ingest Windows logs across enterprise

reswob
Loves-to-Learn Lots

 

Hello, we had a multi-day outage in the connectivity between our UFs and our indexers.  This prevented all of the UFs (5k or so) on our Windows servers from sending logs to Splunk.  Once connectivity was restored, for reasons yet to be determined, the UFs did not backfill; they just kept sending current data.  In other words, the UFs apparently never registered that they could not send, so they did not pause their reads.  As a result we have roughly a 22-hour gap in our Windows logs, and we are trying to figure out how to get Splunk to re-ingest that data.  Everything I have found on re-ingesting Windows logs says to delete the checkpoint file for the time period and restart Splunk.  That would work for one or a few servers, but we need to do it at scale.

It seems the options for re-ingesting past data at scale are limited to:

1. Use something like SCCM to script stopping the Splunk UF, deleting the checkpoint files, and restarting the Splunk UF (a rough sketch of this follows the list below).

2. Use something like SCCM to completely uninstall the Splunk UF and reinstall it with an inputs.conf that covers the missing timeframe, accepting that we will duplicate everything after that point.
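For option 1, I'm picturing something like the PowerShell sketch below, pushed out via SCCM.  The install path, checkpoint directory, and service name are assumptions based on a default UF install and are untested; adjust for your environment.  Note that clearing the checkpoints makes the UF re-read every event still retained in the Windows Event Logs, not just the 22-hour gap, so anything already indexed would be duplicated.

```powershell
# Rough sketch (untested): stop the UF, clear the WinEventLog checkpoints, restart the UF.
# Paths and service name below assume a default UF install and may differ per environment.

$splunkHome    = 'C:\Program Files\SplunkUniversalForwarder'                      # assumed install path
$checkpointDir = Join-Path $splunkHome 'var\lib\splunk\modinputs\WinEventLog'     # assumed checkpoint location

# Stop the forwarder service (default Windows service name for the UF)
Stop-Service -Name SplunkForwarder -Force

# Remove the per-channel checkpoint files so the WinEventLog input re-reads each
# channel from the oldest available event on next start
if (Test-Path $checkpointDir) {
    Remove-Item -Path (Join-Path $checkpointDir '*') -Force
}

# Restart the forwarder
Start-Service -Name SplunkForwarder
```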

Is there another option?

Thanks

 

What I have found so far (but it seems like it would only work for a few servers, not 5k):

https://splunk.my.site.com/customer/s/article/Splunk-UF-not-onboarding-Previous-Winevent-Security-lo...

https://community.splunk.com/t5/Getting-Data-In/How-do-I-trigger-the-re-indexing-of-events-from-a-lo...

 


reswob
Loves-to-Learn Lots

1. Good point. I should have been clearer: I was hoping someone else had gone through this and could describe, in general terms, what they had done.

3.  Noted.  

4.  Noted.

5. Noted.

 

Appreciate the feedback.


PickleRick
SplunkTrust

It's not that easy.

Firstly, we have no idea what your configuration is (mostly the inputs and outputs are of interest here).

Secondly, there is a very good question as to why the forwarders didn't stop and wait.

Thirdly, there is generally no native way to manipulate a forwarder's internal state remotely. There are some ugly hacks to do it, but I won't promote them here, since it's very easy to shoot yourself in the foot that way.

Fourthly, on a normal desktop edition of Windows, 22 hours should not produce too many logs, but on a busy server, depending on your configuration, that data could already have been overwritten if you hit the log size limit (see the snippet at the end of this post for a quick way to check).

Fifthly, if you do still have the data, removing the checkpoints means re-reading all available events from scratch, which could overload your license and/or infrastructure.
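As a quick sanity check for the fourth point, something like this rough, untested PowerShell snippet run per host would show whether the Security channel still reaches back past the gap (channel name and cut-off time are just examples):

```powershell
# Example cut-off; replace with the actual start of your gap
$gapStart = (Get-Date).AddHours(-48)

# Size/retention configuration for the channel
Get-WinEvent -ListLog Security | Select-Object LogName, MaximumSizeInBytes, RecordCount

# Oldest event still present in the channel, compared against the gap start
$oldest = (Get-WinEvent -LogName Security -MaxEvents 1 -Oldest).TimeCreated
"Oldest Security event: $oldest (gap start: $gapStart; still recoverable: $($oldest -lt $gapStart))"
```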
