I've upgraded to Splunk 6.01 and noticed the improved handling of Windows events prior to indexing, and wondered whether there were any similar improvements for IIS logs. To minimize license usage, I'd like to index only IIS log events with 404 or 500 errors, and I'd rather not depend on a REX filter to pull out the sc_status field value at index time.
Does 6.01 handle the IIS filtering any differently? If not, I guess I can use a REX to pull the error events.
@shogan is right, and here's an example config of how to make this happen when ingesting INDEXED_EXTRACTIONS logs on a Universal Forwarder:
Given sample csv file:
tak_tak.log:
foo,bar,baz
abc,123,456
abcd,234,567
abcd,abcd,567
bcd,345,678
If you want to exclude only the lines where the value of field 'bar' is 'abcd', set up a sourcetype as usual:
props.conf:
[jiggy_jiggy]
INDEXED_EXTRACTIONS=csv
TRANSFORMS-throw_some_away=throw_some_away
And add a transform that makes use of "SOURCE_KEY" and "field:"
transforms.conf:
[throw_some_away]
SOURCE_KEY=field:bar
REGEX=abcd
DEST_KEY=queue
FORMAT=nullQueue
To test, ingest the file on the UF while setting the correct sourcetype:
$ ./myfwder/bin/splunk add oneshot ~/tak_tak.log -sourcetype jiggy_jiggy
And you should see 3 lines indexed, instead of 4. Note that the line where only column 'foo' contains 'abcd' WILL be present, while the line where column 'bar' contains 'abcd' will NOT be present.
Looks like a duplicate question:
@montgomeryam: I had our Windows admin take care of it. There is a screen where you can configure IIS logging and select (through checkboxes) which fields are written. So just filter them out at the Windows level and Splunk never sees them.
I appreciate the info! I will investigate that in our test lab to see if it will work for all of our Exchange logs.
Seems weird how hard it is to send those fields to null values...
Expanding on @amrit's example code, I was able to build a filter to exclude some HTTP Status codes from IIS logs using:
props.conf
[iis]
TRANSFORMS-HttpErrorsOnly=HttpErrorsOnly
transforms.conf
[HttpErrorsOnly]
SOURCE_KEY=field:sc_status
REGEX=[123]\d+
DEST_KEY=queue
FORMAT=nullQueue
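The stanza above drops 1xx, 2xx and 3xx responses and keeps everything else. For the original question (keep only 404 and 500), one possible variant uses the same mechanism with a negative lookahead, in place of the HttpErrorsOnly transform. This is an untested sketch: the stanza name Keep404And500 is made up for illustration, and it assumes sc_status holds just the numeric status code.

```ini
# props.conf (on the Universal Forwarder)
[iis]
TRANSFORMS-Keep404And500=Keep404And500

# transforms.conf
[Keep404And500]
SOURCE_KEY=field:sc_status
# Matches any numeric status that is not exactly 404 or 500;
# matching events are sent to the nullQueue (discarded).
REGEX=^(?!(?:404|500)$)\d+
DEST_KEY=queue
FORMAT=nullQueue
```

With this regex a status of 200 or 403 matches and is dropped, while 404 and 500 fail the lookahead and pass through untouched.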
Hi Amrit,
Your example is good, but I have another use case.
I want the entire field=bar not to be indexed. Specifically, for IIS logs, I want the field cs_cookie not to be indexed, and I'm hoping to achieve this at the UF level.
Any ideas on what the props/transforms configuration will be?
I would just replace the contents of the cs_cookie field with an empty string. The following change to the above configuration should do it, although I haven't tested this:
transforms.conf:
[throw_some_away]
SOURCE_KEY=field:bar
REGEX=.
FORMAT=
I tried this config, but it produces the warning below in the UF internal logs:
WARN regexExtractionProcessor - Too few groups in regex: setnull-cs_Cookie; captures: 0, args: 1
And the filtering of cs_cookie still fails.
Below are my two files:
props.conf
[iis]
TRANSFORMS-set=setnull-cs_Cookie
transforms.conf
[setnull-cs_Cookie]
SOURCE_KEY=field:cs_Cookie
REGEX=.
FORMAT=
Based on "too few groups in regex", how about if you change the regex line to: REGEX=(.)
Hi Amrit,
REGEX=(.) has made the WARN disappear, but cs_cookie field is still not getting filtered (or is not being set to null)
Hi Guys,
Any further thoughts on this? It's hard to believe that setting a field to null is so difficult in Splunk.
@amrit or @vsingla1 - - did you ever figure this out? I am looking to do something similar by dropping a number of IIS log fields.
Thanks for any assistance!
@montgomeryam I could not figure this out on the Splunk side. But if you contact your Windows admin, there is a place where you can uncheck checkboxes to exclude cs-cookie and other fields from logging, so there is no data for Splunk to pick up.
Just wondering if this was actually figured out successfully. We'd like to use the structuredparsing queue to filter events before forwarding to the indexer and the samples linked to don't seem to be all that useful?
Did you ever get this working? We are wanting to drop a number of fields in an IIS log and the above solution from amrit doesn't work.
New in Splunk 6.x, a Universal Forwarder can perform nullQueue filtering for inputs that use the INDEXED_EXTRACTIONS setting. $SPLUNK_HOME/etc/system/default/props.conf enables this attribute by default for iis, csv, and msexchange.
You can review the new **structuredparsing** queue information here: http://wiki.splunk.com/Community:HowIndexingWorks
So if you add a props.conf/transforms.conf to the Universal Forwarder's $SPLUNK_HOME/etc/system/local directory with the proper filtering then it will be done locally on the Universal Forwarder before sending to the indexer.
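As a rough illustration of that layout, the files on the forwarder could look like the sketch below. It has not been tested: the monitored path assumes the default IIS log directory, the stanza name drop_2xx is made up, and dropping 2xx responses is only an example filter.

```ini
# $SPLUNK_HOME/etc/system/local/inputs.conf
# Path is an assumption; point it at your own IIS log directory.
[monitor://C:\inetpub\logs\LogFiles]
sourcetype=iis

# $SPLUNK_HOME/etc/system/local/props.conf
[iis]
TRANSFORMS-drop_2xx=drop_2xx

# $SPLUNK_HOME/etc/system/local/transforms.conf
[drop_2xx]
# field: reads the value produced by INDEXED_EXTRACTIONS on the forwarder
SOURCE_KEY=field:sc_status
REGEX=^2\d\d$
DEST_KEY=queue
FORMAT=nullQueue
```

The forwarder needs a restart after these files change for the new settings to take effect.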
Do you have any examples of the configuration ?
You don't need rex to pull any iis field if the search time extractions are configured correctly.
First, group your iis logs by content - different web sites can have different log content, i.e. a different number or type of fields in the header.
Second, specify a unique sourcetype in inputs.conf on the forwarder that is collecting the logs. Use the same sourcetype for each input that has the same iis header.
Third, create a props.conf and a transforms.conf stanza on the indexer in splunk/etc/system/local that sets delims and fields for that sourcetype.
Since this is a search time extraction it will work on all indexed iis logs so long as the header has not changed, and if you make a mistake it can be corrected without changing the indexed data.
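A minimal sketch of the third step, assuming a hypothetical sourcetype named iis_site1 and the default IIS 7 field list. The FIELDS list has to match the #Fields header of your own logs, in the same order, so treat the names below as placeholders.

```ini
# props.conf
[iis_site1]
# Search-time extraction that points at the transform below
REPORT-iis_site1_fields=iis_site1_fields

# transforms.conf
[iis_site1_fields]
# IIS W3C logs are space-delimited
DELIMS=" "
FIELDS="date", "time", "s_ip", "cs_method", "cs_uri_stem", "cs_uri_query", "s_port", "cs_username", "c_ip", "cs_user_agent", "sc_status", "sc_substatus", "sc_win32_status", "time_taken"
```

A second group of sites with a different header would get its own sourcetype and its own transform with a different FIELDS list.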
There are few things better than iis logs that are extracted properly. If you need the details, let me know.
Well then, to answer your question about iis log filtering changes in Splunk 6 - no, there were no changes similar to those for Windows events.
Yes, you can filter iis logs prior to indexing.
Configure the input on the forwarder, and configure props.conf and transforms.conf on the indexer in splunk/etc/system/local.
The format for the props and transforms will be the same as regular Windows events, the difference will be the regex you use to identify the 404 and 500 errors.
See this post:
http://answers.splunk.com/answers/29218/filtering-windows-event-logs
Actually, I'm trying to perform the filtering prior to indexing. I'd like to reduce the amount of license usage.