
Ingesting only some IIS logs but not all

snix
Communicator

We have multiple IIS web servers that all host multiple sites. Each site's logs are saved to the default location:
C:\inetpub\logs\LogFiles\W3SVC*\*.log

This is the stanza I use on each server:
[monitor://C:\inetpub\logs\LogFiles*\*.log]
disabled = 0
sourcetype = iisw3c
index = iis

The odd thing is that, depending on the day and time, I might see logs from W3SVC2 but nothing from W3SVC1 or 3. Then on another server I might see W3SVC2 and 3 but not 1. It just seems random which logs it will pull from on any given server on any given day.

I checked a couple of the servers to make sure logs were being generated even though Splunk was not showing any as ingested, and they were. Am I missing something here, like needing a stanza per website?

1 Solution

PavelP
Motivator

It could be a CRC problem. IIS logs are notorious for having a long header, which can mislead Splunk into ignoring log changes. Check the documentation for the crcSalt and initCrcLength parameters, and google "splunk iis logs initCrcLength problem".
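
To illustrate the idea, here is a minimal Python sketch of why two IIS logs can look like the same file when only their first 256 bytes (the default initCrcLength) are checksummed. It is not Splunk's actual code: zlib.crc32 is only a stand-in for whatever checksum Splunk uses, and the header, host name, and events are made up.

```python
import zlib

# Hypothetical IIS W3C header. Two sites on one server that roll their logs
# over at the same second start with exactly these bytes.
HEADER = (
    "#Software: Microsoft Internet Information Services 10.0\r\n"
    "#Version: 1.0\r\n"
    "#Date: 2020-03-02 00:00:00\r\n"
    "#Fields: date time s-sitename s-computername s-ip cs-method cs-uri-stem "
    "cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) "
    "sc-status sc-substatus sc-win32-status time-taken\r\n"
)

# Made-up first events; the two files differ only in the site name.
LINE = "2020-03-02 00:00:04 {site} WEB01 10.0.0.5 GET / - 80 - 10.0.0.9 curl - 200 0 0 12\r\n"
site1_log = (HEADER + LINE.format(site="W3SVC1")).encode("ascii")
site2_log = (HEADER + LINE.format(site="W3SVC2")).encode("ascii")


def head_crc(data: bytes, length: int) -> int:
    """Checksum of the first `length` bytes (stand-in for the initial CRC)."""
    return zlib.crc32(data[:length])


for length in (256, 400):
    same = head_crc(site1_log, length) == head_crc(site2_log, length)
    print(f"first {length} bytes look identical: {same}")
```

With this made-up header, the first 256 bytes never reach the first event, so both files produce the same checksum. Checking 400 bytes includes the site name and tells them apart.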

snix
Communicator

@PavelP I think you are on the correct path. I took your advice and searched "splunk iis logs initCrcLength problem". The first page that popped up talked about this same issue and said they found an error in the _internal index. I checked for this error and found it myself.
https://answers.splunk.com/answers/530434/indexing-issue-with-iis-logs-file-will-not-be-read.html

I then continued searching and found many people recommending putting this into your inputs.conf file:

crcSalt = <SOURCE>

I also read that, to get this to work, I needed to create a new log file on the web server, so I stopped IIS, renamed the current log file to something.log.old on each site, and started IIS again:

https://medium.com/@anon5123/splunk-sometimes-doesnt-index-logs-entirely-e611efe55eca

I could see it generated a new log file. That same site then said you needed to go here https://localhost:8000/debug/refresh and do a refresh. I then checked Splunk and found that I am not indexing my IIS logs at all.

Any advice on what I might have done wrong?

PavelP
Motivator

If you changed settings on the UF, then you need to restart the UF.

snix
Communicator

I have it set to auto-restart after a deployment, but I took your advice and restarted it manually on a couple of the servers, and I did start to see logs populate. The odd thing is that they populated up to the time of the restart but then stopped ingesting new logs again.

efavreau
Motivator

Depending on settings, web servers don't log in real time. They buffer events in memory and then flush to disk. When you restart, it will flush whatever it's holding onto. I'm guessing the web server is under low volume. So it's not updating the log, and Splunk has nothing to read. Is that correct?

###

If this reply helps you, an upvote would be appreciated.

snix
Communicator

@efavreau I have not noticed that in the past. Our IIS logs were usually maybe a minute behind getting into Splunk, but they always came in pretty consistently.

That said, I just happened to have a call yesterday morning with an outfit that provides Splunk professional support. They took a look at it and we tried a couple of things.

First we tried changing from

crcSalt = <SOURCE>

to

initCrcLength = 400

They said they had had issues with crcSalt in the past, so we tried initCrcLength instead. I set it to 400 because, with all the comments and field names at the beginning of the logs, I wanted Splunk to check enough characters to get past the comments and reach the time, server name, and site, so we could say for sure that each log was unique.
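
If you would rather measure than guess at a value like 400, a small helper along these lines (a rough sketch, not anything Splunk ships; the function name and the 4096-byte limit are my own choices) reports where two log files first differ. initCrcLength then needs to be larger than that offset.

```python
def first_differing_offset(path_a, path_b, limit=4096):
    """Return the 0-based offset of the first byte that differs between the
    two files within their first `limit` bytes, or None if that range is
    identical in both files."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a = fa.read(limit)
        b = fb.read(limit)
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One file is a prefix of the other; they differ where the shorter ends.
        return min(len(a), len(b))
    return None
```

Run it against the current log files of two sites on the same server, for example one file under W3SVC1 and one under W3SVC2. If it returns None, the files are identical over the range checked and a longer initCrcLength alone will not tell them apart.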

After the change I noticed the same issue: after a UF restart you would see logs up to that point, but nothing new after it. The next thing they noticed was that we were hitting the maximum allowed throughput on the UF on the web servers. The default is 256 KBps, so we bumped it up to 750 KBps.

We could see the UF was using the full 750 KBps and left it at that because my meeting with them was over. Afterwards I checked Splunk and noticed that, once again, new events stopped appearing after the UF on each box had restarted.

The next morning, though, I came back and found all the logs were working as expected, with events coming in from all the log files for each site hosted on each server.

What I think happened is that after switching to crcSalt or initCrcLength, Splunk needs to re-index all the existing IIS logs on each server, because it is now using a new way to track the log files. This is an intensive process that takes a long time: each server has IIS logs going all the way back to 2018, which adds up to 20+ GB of logs in total across all the web servers. I think it started indexing from the point when the UF was restarted and worked its way back until it had pulled in all the old logs.

Since this was going to take some time, it just looked like logging to Splunk had stopped while it caught up on the backlog. This also explains why the UF on each web server was maxing out the 256 KBps transfer limit as it sent over all the old logs. When I came in the next morning, it had caught up and was now showing traffic coming in live from each site's logs on each server.
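
A back-of-the-envelope check of that theory, as a short Python sketch. It assumes the throughput limit is in kilobytes per second and that the forwarder sends at the full limit the whole time. The 20 GB figure is the total across all web servers, and since each UF has its own limit, the real per-server time would be shorter. The per-server split is not known here.

```python
def catch_up_hours(backlog_gb: float, max_kbps: float) -> float:
    """Hours needed to forward `backlog_gb` gigabytes at `max_kbps` KB/s."""
    backlog_kb = backlog_gb * 1024 * 1024
    return backlog_kb / max_kbps / 3600


# Worst case: the whole 20 GB behind a single forwarder.
print(f"256 KBps: {catch_up_hours(20, 256):.1f} hours")  # about 22.8 hours
print(f"750 KBps: {catch_up_hours(20, 750):.1f} hours")  # about 7.8 hours
```

Even in the worst case, the higher limit brings the catch-up time down to something that fits in an overnight window, which matches logs looking stalled in the evening and current the next morning.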

In conclusion, I think crcSalt would probably work with no issues (and may be the better option of the two), but what I can verify for sure is that initCrcLength resolves the issue I was running into, provided you set the length high enough that the checked portion of each log is unique.

PavelP
Motivator

Thank you @snix for getting back and reporting this. Indeed, another setting such as ignoreOlderThan needs to be applied if you change the CRC method and have a lot of logs. 😕
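
To get a feel for how much an age cutoff would save, a sketch like the following estimates the re-index backlog for a given window. It only looks at file modification times on disk and is not how Splunk itself evaluates ignoreOlderThan. The function name and parameters are made up for illustration.

```python
import time
from pathlib import Path


def backlog_bytes(log_root, max_age_days, now=None):
    """Total size in bytes of *.log files under `log_root` that were
    modified within the last `max_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    total = 0
    for path in Path(log_root).rglob("*.log"):
        stat = path.stat()
        if stat.st_mtime >= cutoff:
            total += stat.st_size
    return total
```

For example, comparing backlog_bytes(r"C:\inetpub\logs\LogFiles", 7) with the same call for a very large window shows roughly how much of the old data a 7-day cutoff would skip.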

efavreau
Motivator

How are you sure this is Splunk? Do you have a load balancer out in front of the web servers? To be certain you have traffic on all servers and all sites on those servers, you'll need to generate artificial traffic. Many companies do this constantly, using a unique username or a unique user agent (easier if the site's pages are public), with software or a script that runs locally on each server. If you want to do it manually, send requests locally to each site on each box (and note when you did it and on which server). Then check the logs for those sites and servers to see whether the requests appear there (keep in mind that logs don't write to file immediately; it may take some time). Once you can confirm all of that works without Splunk, confirm those same logs using Splunk.
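
A script for this can be very small. The sketch below uses only the Python standard library. The "log-probe/" user agent prefix, the function names, and the example URL are placeholders I made up, so adjust them to your sites.

```python
import urllib.request


def build_probe(url: str, marker: str) -> urllib.request.Request:
    """Build a GET request whose user agent carries a unique marker,
    so the hit is easy to find in the IIS log afterwards."""
    return urllib.request.Request(url, headers={"User-Agent": f"log-probe/{marker}"})


def send_probe(url: str, marker: str, timeout: float = 5.0) -> int:
    """Send the probe and return the HTTP status code.
    urlopen raises HTTPError for 4xx/5xx responses; the web server
    should still log those requests."""
    with urllib.request.urlopen(build_probe(url, marker), timeout=timeout) as resp:
        return resp.status


# Example (hypothetical URL): run locally on each web server, once per site,
# and note the time and the marker you used.
# send_probe("http://localhost/", "web01-site1-test1")
```

Afterwards, look for the marker in the cs(User-Agent) field of that site's log file, and then search for the same marker in Splunk.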

snix
Communicator

I had the same thought, and you are right: with so many sites and servers, and load balancers in the mix, it is hard to tell what traffic is going where.

So I decided to focus on just one server and wanted to see if there was traffic getting logged to each site on that server. Then I wanted to make sure if there was traffic getting logged that it was then showing up in Splunk.

I did this by remoting into the server, looking at the most current log file for each site, and making sure traffic was getting logged under W3SVC1, W3SVC2, and W3SVC3. Once that was verified, I went to Splunk, searched that host for all IIS-related events, and checked which sources the events were logged under.

This is the query I used to validate all the log files getting logged into Splunk for a specific server:

index=iis host="servernamehere" | stats count by source

In Splunk I could only see events from the W3SVC2 logs for the entire day, even though I saw events in the log files for all three sites.
