Getting Data In

How to troubleshoot why a Windows universal forwarder is sending metrics, but not Windows event logs?

reswob4
Builder

Here's my setup: I have three clustered indexers, two search heads, a deployment server, and several Heavy Forwarders (three Windows and three Linux). I've been collecting Windows logs remotely from the HFs via WMI with no problems for a while. This week, I decided to install a universal forwarder on two servers as a pilot in preparation for further deployments.

After installing, I found I was getting no log events at all. So I commenced troubleshooting.

First I checked to see if the indexers were receiving data by running tcpdump and I saw the logs and metrics coming over the wire to the indexers. CHECK

Then I checked to see if the records were in ANY index by running the following search:

index = * host=hostnames

This returned nothing. So I searched:

index=* hostnames

And while this returned multiple events, none were FROM those machines.

Then, I checked to see if there were records in the _internal index from those servers. CHECK

Then, I looked to see if any of those _internal records contained errors. No entries that said ERROR, so tentative CHECK

Then I looked on each server where the UF was installed and checked splunkd.log for errors. Just one:

AuditTrailManager - Private key error Error opening C:\Program Files\SplunkUniversalForwarder\etc\auth\audit\private.pem: The system cannot find the path specified.

But I was kind of expecting this, as I told the UF to use Splunk's own internal certificate during install. Not sure if this is a factor....

So no other errors.

Here's C:\Program Files\SplunkUniversalForwarder\etc\apps\Splunk_TA_Windows\local\inputs.conf

[WinEventLog://Application]
disabled = 0
index = wineventlog

[WinEventLog://Security]
disabled = 0
index = wineventlog

[WinEventLog://System]
disabled = 0
index = wineventlog

[WinEventLog://Windows Powershell]
disabled = 0
index = wineventlog

Here's C:\Program Files\SplunkUniversalForwarder\etc\system\local\outputs.conf

# BASE SETTINGS

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = ip1:9997, ip2:9997, ip3:9997

## autolbsettings
autoLB = true
autoLBFrequency = 15
forceTimebasedAutoLB = true

Some other posts have mentioned that there could be a permissions issue. Is there a way to verify that? I installed this UF with the same domain admin account that the HFs are using to pull logs via WMI, so there shouldn't be a permissions issue?

What other steps can I take to fix this?

Thanks.

0 Karma

rewritex
Contributor

I'm currently having this issue. I am seeing metrics logs coming in when searching "index=_internal" on my cluster.
But I do not see the data coming in. At one time it did work, and things seem to work intermittently. I am currently trying to troubleshoot. Setup: Windows Universal Forwarder 6.4.3 -> (port 9997) intermediate forwarder 6.4.3 -> index cluster.
Any other information to help isolate this issue? Thank You

0 Karma

reswob4
Builder

OK, I ended up opening a ticket on this, doing some more troubleshooting and giving them a diag, but no smoking gun was really found.

Then I had some work things come up and I didn't get to work on the problem for a couple of days. During that time things started working and events from all three main Windows logs showed up in my indexers, though not for any reason I could identify, as I hadn't had a chance to implement the latest suggestions from tech support.

I think in the end, the MSI needed to be installed (at least in my environment) by running it from an administrator command prompt and choosing to install it as the local administrator account. Then, after the install finished, I had to wait a few hours or days before events started showing up in my indexes.

I'm NOT saying this is the right answer, but I didn't want to leave this question hanging.

Thanks to @jkat54 for all the help.

0 Karma

jkat54
SplunkTrust

Sorry I went cold on you. I lost visibility on the question. So some group policy or something was getting in the way. I've always installed Splunk under non-privileged accounts, and I've always run into a different issue that was always related to some silly something / policy implemented by who knows who and who knows when, etc. One time I spent weeks trying to solve something and it turned out the vendor had disabled service accounts somehow. You could add them, give them passwords, etc., but when you tried to use one as a service account, whatever service it was would fall on its face... SMH. As with everything computer, you just never know...

0 Karma

reswob4
Builder

The events are not in the _internal log.

Furthermore, I performed a general search index=* host=hostname and found that I HAD gotten some results.

From 2pm to midnight on 3 May 2016, I received 100,000+ events per hour. Then it dropped off to maybe one event per hour, and even then it's only been events from the System log.

0 Karma

reswob4
Builder

I just tried the SPL99687 suggestion from http://docs.splunk.com/Documentation/Forwarder/latest/Forwarder/KnownIssues

and when I stopped and restarted splunk, THOSE TWO log entries showed in the search. But still nothing else.

double checking spelling again....

0 Karma

jkat54
SplunkTrust

I think I've hit a limit on comments, because it keeps discarding my latest ones.

Are the events in the internal index?

 index=_internal source="WinEventLog*"
0 Karma

reswob4
Builder

I didn't notice that my reply to your comment didn't get posted.

The events are NOT in the internal log

0 Karma

jkat54
SplunkTrust

To check permissions the account has...

runas /noprofile /env /netonly /user:domain\username "c:\windows\system32\eventvwr.msc"

you will be asked for a password.

0 Karma

reswob4
Builder

So here are the results:

runas /noprofile /env /netonly /user:domain\username "c:\windows\system32\eventvwr.msc"

RUNAS ERROR: Unable to run - eventvwr.msc
193: eventvwr.msc is not a valid Win32 application.

To verify I ran just eventvwr.msc. That worked

I ran runas /noprofile /env /netonly /user:domain\username "notepad.exe"

That worked.

I tried both of the above from the command prompt AND the elevated command prompt with the exact same results.

Suggestions?

0 Karma

jkat54
SplunkTrust

Sorry, changing .msc to .exe should work fine. Since notepad.exe works without a path, removing c:\windows\system32\ should be OK too.

0 Karma

reswob4
Builder

Yes, I can read the logs.

0 Karma

jkat54
SplunkTrust

well then the account has permission 😉

Are there a LOT of events in the logs? Maybe from 2006 and beyond? If so, it will take a while for the newer events to be read (it depends on everything from the size of the box to network throughput). Also, events older than 6 years might be getting rolled to frozen as soon as they arrive.
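
One thing that can stretch that catch-up time, as far as I know, is the universal forwarder's default bandwidth cap: it ships with maxKBps = 256 under [thruput] in limits.conf. Below is a sketch of lifting the cap while a backlog drains. The file placement is an assumption, and 0 means unlimited, so consider the impact on the network and indexers first.

```
# C:\Program Files\SplunkUniversalForwarder\etc\system\local\limits.conf
# (placement is an assumption; an app's local directory would also work)

[thruput]
# Universal forwarders default to 256 KB/s; 0 removes the cap entirely.
maxKBps = 0
```

The forwarder needs a restart for the change to take effect.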

0 Karma

reswob4
Builder

The server is three years old and yes I forgot to put that limit into the .conf files.

Is there a way to determine where those events are going? If they are in any index?

While I had turned off the UF last night, it's been running now for four hours today and still nothing is showing up (I just checked)

0 Karma

jkat54
SplunkTrust

I think there's been enough time by now.

You did create an index called wineventlog, correct?

No typos, etc.?

0 Karma

reswob4
Builder

Just verified that it's spelled correctly in both inputs.conf files.

0 Karma

reswob4
Builder

OK, I checked _internal and found no entries. So then I re-ran the search for anything from that server, just to see if anything turned up (index=wineventlog host=hostnames), and lo and behold, ONE event showed up!

But only one and from the system log.

So I expanded the search to all time (because why not) and it seems that your previous theory was right, I have VERY few events after midnight 3 May 2016, hundreds of thousands of events between 2pm and midnight and then almost nothing prior to that. 2pm on 3 May was about when I installed the UF.

So that leads to another question, can I stop the UF, add in the history limit of 3 days and restart? Or at this point will it ignore that config?

And why isn't it getting any logs after midnight?
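
For reference on the history question: the WinEventLog input has a current_only setting in inputs.conf that, when set to 1, collects only events that arrive while the forwarder is running instead of reading the existing backlog. That is not a three-day window, just a way to skip the history entirely, and nothing in this thread confirms how it interacts with a checkpoint the input has already written. A sketch for one of the stanzas above:

```
[WinEventLog://Security]
disabled = 0
index = wineventlog
# 1 = only events generated after the input starts;
# 0 (the default) = read the existing events in the log as well
current_only = 1
```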

0 Karma

jkat54
SplunkTrust

is it in internal?

index=_internal source="WinEventLog:*"

0 Karma

jkat54
SplunkTrust

Very interesting... can you check to be sure they're not ending up in index=_internal?

 index=_internal source="WinEventLog:*"
0 Karma