I've been Googling and searching through Splunkbase trying to find an example of using the new structuredparsing queue with the nullQueue to exclude events from being forwarded to the indexer using the new Splunk 6 UF.
I found this: http://answers.splunk.com/answers/118668/filter-iis-logs-before-indexing, but although it has a lot of good information on structuredparsing, I'm not finding detailed information on how to apply that knowledge.
If someone could break down a props.conf/transforms.conf example that prevents IIS events from forwarding by HTTP Status code, that would be a huge help and get us moving in the right direction.
Thanks!
I just posted an answer explaining how to use the index-time fields from INDEXED_EXTRACTIONS to throw away events:
https://answers.splunk.com/answers/118668/filter-iis-logs-before-indexing.html#answer-119031
The regex search and the sc_status search now yield the same results.
The first search has an error, somehow an extra + was added. Try this one:
sourcetype=youriissourcetype |regex "200\s\d+\s\d+$"
The first yields 113,744 and the second yields 113,908 ... so, nearly identical.
You should create a test index. Copy one of the IIS log files to a temp directory on the indexer, and use the GUI to add it as a file input to your test index. The bigger the log file, the better.
I should say, we don't have any IIS indexed data - we're just getting this added to our Splunk environment today.
We don't have any indexed data - we're hoping to filter it out at the UF before it goes to our indexer, ideally working the first time 😉 ...
Run this on all of your indexed IIS data:
sourcetype=youriissourcetype |regex "200+\s\d+\s\d+$"
And compare the results or result count to this:
sourcetype=youriissourcetype sc_status=200
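To illustrate the comparison suggested here outside of Splunk, below is a minimal Python sketch that counts status-200 events two ways: with the end-of-event regex, and by reading the parsed field. The events are hypothetical lines made up to resemble the sample in this thread, and the assumption that sc-status is the third field from the end is mine, not something confirmed for your logs. The regex is the one without the stray `+`.

```python
import re

# Hypothetical events in the same layout as the sample in this thread.
# Assumption: the last three fields are sc-status, sc-substatus, sc-win32-status.
EVENTS = [
    "2014-03-05 01:32:45 W3SVC1 10.2.101.194 GET /a.mp3 - 80 - 10.2.101.20 - 200 0 0",
    "2014-03-05 01:32:46 W3SVC1 10.2.101.194 GET /b.mp3 - 80 - 10.2.101.20 - 404 0 2",
    "2014-03-05 01:32:47 W3SVC1 10.2.101.194 GET /200 - 80 - 10.2.101.20 - 304 0 0",
    "2014-03-05 01:32:48 W3SVC1 10.2.101.194 GET /c.mp3 - 80 - 10.2.101.20 - 200 0 64",
]

# Same pattern as the corrected search: match "200 <digits> <digits>" at the end.
PATTERN = re.compile(r"200\s\d+\s\d+$")


def count_by_regex(events):
    """Count events whose tail matches the status-200 pattern."""
    return sum(1 for e in events if PATTERN.search(e))


def count_by_field(events):
    """Count events whose third-from-last whitespace-separated field is 200."""
    return sum(1 for e in events if e.split()[-3] == "200")


regex_count = count_by_regex(EVENTS)
field_count = count_by_field(EVENTS)
print(regex_count, field_count)
```

If the two counts disagree on real data, the events that only one method catches are the ones worth inspecting before trusting the regex for filtering.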
So, you're looking to drop status 200?
Gotcha, ok - so here's what our logs are looking like:
2014-03-05 01:32:45 W3SVC1098397332 10.2.101.194 GET /Resources/example.mp3 - 80 - 10.2.101.20 - 200 0 0
For Windows logs, Splunk 6 made life much easier when it comes to Event Logs, where you can blacklist EventCodes very easily. There is nothing similar for IIS logs yet.
In my experience, regex statements that pull out the status codes you're looking for, as well as all the other IIS log fields, have been pretty solid. The status codes and other codes found at the end of the event are especially easy and reliable to match, because the end of the event is typically very clean and made up of numeric fields.
The specific regex will depend on your log structure.
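As a rough way to sanity-check such a regex before putting it anywhere near a forwarder, here is a small Python sketch that runs an end-anchored pattern against the sample event posted in this thread. It assumes the last three fields are sc-status, sc-substatus and sc-win32-status, which is a guess from the sample and not confirmed by a `#Fields` header. The pattern uses only syntax common to Python's `re` and PCRE, but the helper name `is_status_200` is hypothetical.

```python
import re

# Sample event copied from this thread.
# Assumption: the trailing "200 0 0" is sc-status, sc-substatus, sc-win32-status.
SAMPLE = (
    "2014-03-05 01:32:45 W3SVC1098397332 10.2.101.194 GET "
    "/Resources/example.mp3 - 80 - 10.2.101.20 - 200 0 0"
)

# Anchor on the end of the event; the leading \s keeps the match from
# starting in the middle of a longer number.
STATUS_200_AT_END = re.compile(r"\s200\s\d+\s\d+$")


def is_status_200(event: str) -> bool:
    """Return True if the event ends with status 200 plus two numeric fields."""
    return STATUS_200_AT_END.search(event) is not None


print(is_status_200(SAMPLE))                                  # True
print(is_status_200(SAMPLE.replace(" 200 0 0", " 404 0 0")))  # False
```

If your logs carry extra fields after the status codes (for example a time-taken column), the pattern would need to be adjusted to your field order.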
Maybe I'm misunderstanding the structuredparsing queue, but I thought it allowed you to target parsed fields, i.e. without using a regex (which for IIS access logs is going to be fairly ugly and possibly brittle, right?)
Removing the IIS logs you don't want based on HTTP status code is no different from removing the header lines. The only change is the regex, which has to identify the events with the unwanted HTTP status code. See this post instead:
http://answers.splunk.com/answers/104297/avoid-duplicate-data-and-ignore-fields
In that post you will see how to remove the header lines (they all start with a #) and specify the field names for the CSV events.
We can't give you a specific regex because we don't know your IIS log structure (fields and field positions).
Can you post some examples?
Try this: http://apps.splunk.com/app/1579/ and see if this will help you.
Is this something that can filter events on the UF before sending to the indexer? It doesn't seem like it, but maybe I'm missing something. This seems like an app on the search head/indexer.