I am trying to save on space and licensing with my IIS logs. Currently the vast majority of my logs are just constant health checks from our load balancers or security tools. I would like to filter these out by their user agent strings before they are indexed.
Currently I have user agent strings that come from KEMP, Cloudflare, Nessus, and Tenable that I would like to filter out.
On the indexer I went into \SPLUNKHOME\etc\apps\Splunk_TA_microsoft-iis\local, modified the props.conf file, and added this line to the sourcetype stanza I use for the logs:
TRANSFORMS-null = setnull
Also in the same folder I modified the transforms.conf file and added this stanza:
[setnull]
REGEX = cs_User_Agent_="(?i)(\S*kemp*[^\s]+|\S*Cloudflare*[^\s]+|\S*Nessus*[^\s]+|\S*tenable*[^\s]+)"
DEST_KEY = queue
FORMAT = nullQueue
Should the filtering happen on the indexer, or should I move the settings to the props.conf and transforms.conf files in the app I deploy to the UF? Maybe my regex is just not right. I could not find a good example and guessed at how to reference the field to parse. Hopefully someone can let me know if I am even close to getting it right.
Okay I think I have it working by adjusting the regex to this:
REGEX = (?i)(\S*kemp*[^\s]+|\S*Cloudflare*[^\s]+|\S*Nessus*[^\s]+|\S*tenable*[^\s]+)
More testing will need to be done to verify it works as expected.
Even then I think my regex may be inefficient to the point that it is causing issues on the indexer. Under the "Health Status of Splunkd" area, TailReader-0 sometimes shows as yellow or red and then goes back to green.
If anyone has tips on the best way to implement this filter please let me know.
As most of these User-Agent strings are static, by matching longer literal strings instead of single words and avoiding regex wildcards you can reduce both the probability of false positives and the unnecessary load on the indexer.
Instead of just "kemp", use a literal match:
" KEMP\+1\.0 "
Instead of just "cloudflare", use a literal match:
"Mozilla/5\.0\+\(compatible;\+Cloudflare-Traffic-Manager/1\.0;"
Including the spaces and plus characters can help avoid false positives, because web servers usually URL-encode such characters. So even if somebody sends a query like:
"http://server.com/search?q= KEMP+1.0 "
it will hopefully be stored in the log as:
http://server.com/search?q=%20KEMP%2B1.0%20
so it will no longer match in Splunk.
Please check if IIS url-encodes such characters.
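To make the encoding argument concrete, here is a small Python sketch of standard percent-encoding. It only shows what a typical encoder does to the example value above; whether IIS writes the query to the log this way is exactly the part that still needs checking, so treat the result as an assumption about IIS.

```python
import re
from urllib.parse import quote

# The example query value from above (made up, not taken from real logs).
query_value = " KEMP+1.0 "

# safe="" asks quote() to percent-encode every reserved character.
encoded = quote(query_value, safe="")

# The literal user agent match suggested above.
literal = re.compile(r" KEMP\+1\.0 ")

print(encoded)
print("matches raw value:    ", literal.search(query_value) is not None)
print("matches encoded value:", literal.search(encoded) is not None)
```

If the log really contains the encoded form, the literal pattern with spaces and the plus sign cannot be triggered from the query string.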
I understand your point. I don't want to keep it so generic that I filter out potentially useful logs and overload the indexer. But at the same time I want to be able to filter on just the general names in the part of the log that would be the user agent string field once the fields are extracted.
That way, if the vendor changes to a new user agent string or adds a new one, I don't have to go find it and add it to the filter. I can see Cloudflare doing this; it already has 5+ user agent strings I will have to filter on. I know I am trying to have my cake and eat it too in this situation, but I am just trying to find a good balance.
I still have not had a chance to filter at the IIS level but I will update after I have a chance to look at it. Not a fan of modifying IIS in this situation but it has the added benefit of not filling up the web servers with useless logs as well.
Hi snix,
It is better to restrict your regex to the User-Agent field only, as you did in your first example, to avoid nullQueueing events that contain these strings in the URL. Filtering on short strings like "kemp" can lead to false positives.
To test your regex, send the following requests to IIS (replace your-website.com with your website):
http://your-website.com/?kemp
http://your-website.com/?cloudflare
http://your-website.com/?tenable
etc.
You can test your configuration for a week by sending such events to another index or sourcetype instead of nullQueueing them, checking the events for false positives, and only then switching to nullQueue.
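Before sending real requests, the same check can be approximated offline. This is only a sketch: the two log lines are made up (hypothetical host, IPs and paths) and Python's re module is not the engine Splunk uses, but for a pattern this simple the match decision should be the same.

```python
import re

# The short, unanchored pattern discussed in this thread.
broad = re.compile(r"(?i)(kemp|Cloudflare|Nessus|tenable)")

# Made-up IIS-style lines: a real health check and a user request
# whose query string happens to be "kemp".
health_check = ("2020-04-06 00:00:45 W3SVC2 WEB01 10.0.0.5 GET /health - 443 - "
                "10.0.0.9 HTTP/1.1 KEMP+1.0 - - www.example.com 200 0 0 5400 161 56")
user_request = ("2020-04-06 00:01:02 W3SVC2 WEB01 10.0.0.5 GET / kemp 443 - "
                "203.0.113.7 HTTP/1.1 Mozilla/5.0+(Windows+NT+10.0) - - "
                "www.example.com 200 0 0 5400 161 56")

def would_be_dropped(pattern, line):
    """Return True if the pattern matches anywhere in the raw line."""
    return pattern.search(line) is not None

print("health check dropped:", would_be_dropped(broad, health_check))
print("user request dropped:", would_be_dropped(broad, user_request))
```

Both lines are dropped here, which is the false positive to look for in the test index.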
Good Luck!
@PaveIP I agree with you, I would like to have it more targeted. That said, the version in my original post that references the cs_User_Agent_ field does not actually work, since I think that field is added after the filtering happens. If I am wrong and the field should be there, let me know and I can try again.
I could tighten the filter but I am worried that it is putting too much strain on the indexer. Is there a more efficient way to do this? Can I do the filtering on the UF before it even gets to the indexer?
If your indexers are overloaded, then it is better to add more indexers to the cluster.
If you have more health check requests than user requests, then you can check the frequency of the health checks. How many health checks do you have a day?
Another solution can be log filtering (https://docs.microsoft.com/en-us/iis/extensions/advanced-logging-module/advanced-logging-for-iis-log...) on the IIS side, to split the access log into two, one for health checks and the other for everything else, and let Splunk monitor only the second log.
Here are a couple of events I pulled from my IIS logs:
2020-04-06 00:27:11 W3SVC2 <Server Host Name Here> <Destination IP Address Here> GET <Sub Directory Path> - 443 - <Source IP Address Here> HTTP/1.1 Mozilla/5.0+(compatible;+Cloudflare-Traffic-Manager/1.0;++https://www.cloudflare.com/traffic-manager/;+pool-id:<Removed some ID string>) - - <Website Domain Address> 200 0 0 265 219 62 -
2020-04-06 00:00:45 W3SVC2 <Server Host Name Here> <Destination IP Address Here> GET <Sub Directory Path> - 443 - <Source IP Address Here> HTTP/1.1 KEMP+1.0 - - <Website Domain Address> 200 0 0 5400 161 56
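Based on these two events, one way to target the user agent without relying on an extracted field is to anchor on the protocol version that precedes it in the raw line. The sketch below assumes that "HTTP/1.1" is always logged immediately before the user agent, as in the samples; the placeholder values (WEB01, the IPs, pool-id:abc123) are hypothetical, and the pattern would still need testing in Splunk itself.

```python
import re

# Assumption: cs-version is logged right before the user agent, as in the
# two sample events. Adjust the anchor if your field list differs.
anchored = re.compile(
    r" HTTP/\d\.\d (?:KEMP\+1\.0 "
    r"|Mozilla/5\.0\+\(compatible;\+Cloudflare-Traffic-Manager/)"
)

prefix = "2020-04-06 00:00:45 W3SVC2 WEB01 10.0.0.5 GET "
suffix = " - - www.example.com 200 0 0 5400 161 56"

kemp_check = prefix + "/health - 443 - 10.0.0.9 HTTP/1.1 KEMP+1.0" + suffix
cloudflare_check = (prefix + "/health - 443 - 10.0.0.9 HTTP/1.1 "
                    "Mozilla/5.0+(compatible;+Cloudflare-Traffic-Manager/1.0;"
                    "++https://www.cloudflare.com/traffic-manager/;+pool-id:abc123)"
                    + suffix)
# A normal browser request whose query string is "kemp".
user_request = (prefix + "/ kemp 443 - 203.0.113.7 HTTP/1.1 "
                "Mozilla/5.0+(Windows+NT+10.0)" + suffix)

for name, line in [("kemp", kemp_check),
                   ("cloudflare", cloudflare_check),
                   ("user", user_request)]:
    print(name, "dropped:", anchored.search(line) is not None)
```

The trade-off is the one discussed above: this is stricter than matching the vendor name alone, so a new vendor user agent string would need a new alternative.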
The health checks are usually 5-10 seconds apart, depending on the load balancer and how many different sites are hosted and being checked on the web server. Since we need to verify each site is up and move off of it quickly, we need to check constantly, but it does look to be eating gigs of logs each day.
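For a rough sense of scale, the daily volume can be estimated from the check interval. The numbers in this sketch (20 sites, 2 checkers, 300 bytes per log line) are assumptions for illustration only, not measurements from my environment.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def health_check_bytes_per_day(interval_seconds, sites, checkers, bytes_per_line):
    """Estimate bytes of health check log lines written per server per day."""
    checks_per_day = (SECONDS_PER_DAY // interval_seconds) * sites * checkers
    return checks_per_day * bytes_per_line

# Assumed values: one check every 5 seconds, 20 sites, 2 checkers
# (for example a load balancer and a traffic manager), 300 bytes per line.
daily_bytes = health_check_bytes_per_day(5, 20, 2, 300)
print(f"{daily_bytes / 1e6:.0f} MB per server per day")
```

Multiplied across several web servers, that adds up to gigabytes quickly.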
Thanks for the Microsoft document link, I will look at doing the filtering that way and see if it ends up being a better way to filter in the end.
please provide two or more log lines: one without "kemp" and one with "kemp". Please anonymize internal usernames, domains and IP addresses, but keep the structure of logs unchanged.
REGEX = (?i)(kemp|Cloudflare|Nessus|tenable)
How about this?
I will keep an eye on it and report back if I continue to see issues, but I think your much simplified regex did the trick. I have not seen any red or yellow since I switched to it. Thank you!
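For anyone comparing the two patterns later, here is a small sketch of how they differ. In the longer regex, "kemp*" means "kem" followed by zero or more "p", and the leading \S* and trailing [^\s]+ add nothing to the decision to drop an event while giving the engine more to backtrack over. The log lines are made up, and Python's re module stands in for the engine Splunk uses, so this illustrates the matching logic only, not indexer performance.

```python
import re

original = re.compile(
    r"(?i)(\S*kemp*[^\s]+|\S*Cloudflare*[^\s]+|\S*Nessus*[^\s]+|\S*tenable*[^\s]+)"
)
simplified = re.compile(r"(?i)(kemp|Cloudflare|Nessus|tenable)")

# Hypothetical lines: a KEMP health check and an ordinary request
# for a path that merely contains "kem".
health_check = ("2020-04-06 00:00:45 W3SVC2 WEB01 10.0.0.5 GET /health - 443 - "
                "10.0.0.9 HTTP/1.1 KEMP+1.0 - - www.example.com 200 0 0 5400 161 56")
ordinary = ("2020-04-06 00:02:10 W3SVC2 WEB01 10.0.0.5 GET /users/akemi - 443 - "
            "203.0.113.7 HTTP/1.1 Mozilla/5.0+(Windows+NT+10.0) - - "
            "www.example.com 200 0 0 5400 161 56")

for name, line in [("health check", health_check), ("ordinary", ordinary)]:
    print(name,
          "| original:", original.search(line) is not None,
          "| simplified:", simplified.search(line) is not None)
```

Both patterns drop the health check, but only the original one also drops the ordinary request.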
transforms.conf
SOURCE_KEY = field:cs_User_Agent_
DEST_KEY = queue
FORMAT = nullQueue
REGEX = (?i)(kemp|Cloudflare|Nessus|tenable)
Is there a cs_User_Agent_ field? If so, there should be no false positives (FP).