Filter IIS logs before indexing

trodenbaugh
Explorer

I've upgraded to Splunk 6.0.1 and noticed the improved handling of Windows events prior to indexing, and wondered whether there were any similar improvements for IIS logs. To minimize license usage, I'd like to index only IIS events with 404 or 500 errors, and I'd prefer not to depend on a REX filter to pull out the sc_status field value at index time.

Does 6.0.1 handle IIS filtering any differently? If not, I guess I can use a REX to pull the error events.

1 Solution

amrit
Splunk Employee

@shogan is right, and here's an example config of how to make this happen when ingesting INDEXED_EXTRACTIONS logs on a Universal Forwarder:

Given sample csv file:

tak_tak.log:
foo,bar,baz
abc,123,456
abcd,234,567
abcd,abcd,567
bcd,345,678

If you want to exclude only the lines where the value of field 'bar' is 'abcd', set up a sourcetype as usual:

props.conf:
[jiggy_jiggy]
INDEXED_EXTRACTIONS=csv
TRANSFORMS-throw_some_away=throw_some_away

Then add a transform that makes use of "SOURCE_KEY" and the "field:" prefix:

transforms.conf:
[throw_some_away]
SOURCE_KEY=field:bar
REGEX=abcd
DEST_KEY=queue
FORMAT=nullQueue

To test, ingest the file on the UF while setting the correct sourcetype:
$ ./myfwder/bin/splunk add oneshot ~/tak_tak.log -sourcetype jiggy_jiggy

And you should see 3 lines indexed, instead of 4. Note that the line where only column 'foo' contains 'abcd' WILL be present, while the line where column 'bar' contains 'abcd' will NOT be present.

vsingla1
Communicator

@montgomeryam: I had our Windows admin take care of it. IIS has a logging configuration screen where you can select, via checkboxes, which fields are written to the log. So just filter the fields out at the Windows level and Splunk never sees them.

0 Karma

montgomeryam
Path Finder

I appreciate the info! I will investigate that in our test lab to see if it will work for all of our Exchange logs.

Seems weird how hard it is to send those fields to null values...

0 Karma

aaronwalker
Engager

Expanding on @amrit's example code, I was able to build a filter to exclude some HTTP Status codes from IIS logs using:

props.conf

[iis]
TRANSFORMS-HttpErrorsOnly=HttpErrorsOnly

transforms.conf

[HttpErrorsOnly]
SOURCE_KEY=field:sc_status
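# Anchored so that only 1xx, 2xx, and 3xx codes match; unanchored, [123]\d+ also matches inside codes such as 410 or 412
REGEX=^[123]\d\d$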
DEST_KEY=queue
FORMAT=nullQueue
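
For the original goal of keeping only 404 and 500 events, the same pattern could in principle be inverted with a negative lookahead. The following is an untested sketch: it assumes the default iis sourcetype exposes sc_status through INDEXED_EXTRACTIONS, and the stanza name keep_404_500_only is made up for illustration.

```ini
# props.conf (on the Universal Forwarder)
[iis]
TRANSFORMS-keep_404_500_only = keep_404_500_only

# transforms.conf (on the Universal Forwarder)
[keep_404_500_only]
SOURCE_KEY = field:sc_status
# Matches any sc_status value that is not exactly 404 or 500 ...
REGEX = ^(?!(?:404|500)$)
# ... and routes those events to the nullQueue so they are discarded.
DEST_KEY = queue
FORMAT = nullQueue
```

Events whose sc_status is anything other than exactly 404 or 500 would be dropped. It is worth confirming the behavior against a small test file (for example with splunk add oneshot, as in the accepted answer) before relying on it.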

0 Karma

vsingla1
Communicator

Hi Amrit,
Your example is good, but I have another use case.
I want the entire field (field=bar in your example) not to be indexed. Specifically, for IIS logs, I want the cs_cookie field not to be indexed, and I'm hoping to achieve this at the UF level.
Any ideas on what the props/transforms configuration would be?

0 Karma

amrit
Splunk Employee

I would just replace the contents of the cs_cookie field with an empty string. The following change to the above configuration should do it, although I haven't tested this:

transforms.conf:
[throw_some_away]
SOURCE_KEY=field:bar
REGEX=.
FORMAT=

0 Karma

vsingla1
Communicator

I tried this config, but it produces the following warning in the UF's internal logs:
WARN regexExtractionProcessor - Too few groups in regex: setnull-cs_Cookie; captures: 0, args: 1

And the filtering of cs_cookie still fails.

Below are my two files:
props.conf
[iis]
TRANSFORMS-set=setnull-cs_Cookie

transforms.conf
[setnull-cs_Cookie]
SOURCE_KEY=field:cs_Cookie
REGEX=.
FORMAT=

0 Karma

amrit
Splunk Employee

Based on "too few groups in regex", how about if you change the regex line to: REGEX=(.)

0 Karma

vsingla1
Communicator

Hi Amrit,
REGEX=(.) has made the WARN disappear, but the cs_cookie field is still not being filtered (or set to null).

0 Karma

vsingla1
Communicator

Hi Guys,
Any further thoughts on this? It's hard to believe that setting a field to null is so difficult in Splunk.

0 Karma

montgomeryam
Path Finder

@amrit or @vsingla1: did you ever figure this out? I am looking to do something similar by dropping a number of IIS log fields.

Thanks for any assistance!

0 Karma

vsingla1
Communicator

@montgomeryam I could not figure this out on the Splunk side. But if you contact your Windows admin, there is a place where you can uncheck checkboxes to exclude cs-cookie and other fields from logging, so there is no data for Splunk to pick up.

0 Karma

bdruth
Path Finder

Just wondering if this was actually figured out successfully. We'd like to use the structuredparsing queue to filter events before forwarding to the indexer, but the linked samples don't seem to be all that useful.

0 Karma

montgomeryam
Path Finder

Did you ever get this working? We want to drop a number of fields from an IIS log, and the above solution from amrit doesn't work.

0 Karma

shogan_splunk
Splunk Employee

New in Splunk 6.x, a Universal Forwarder can perform nullQueue filtering for inputs that use the INDEXED_EXTRACTIONS setting. By default, $SPLUNK_HOME/etc/system/default/props.conf enables this attribute for iis, csv, and msexchange.

You can review the new **structuredparsing** queue information here: http://wiki.splunk.com/Community:HowIndexingWorks

So if you add props.conf and transforms.conf files with the proper filtering to the Universal Forwarder's $SPLUNK_HOME/etc/system/local directory, the filtering is done locally on the Universal Forwarder before the data is sent to the indexer.
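
As a rough illustration of that layout, the three files below would sit in the forwarder's local directory. This is a sketch under assumptions: the monitored path is the usual IIS default and may differ on your servers, the stanza name drop_http_200 is hypothetical, and dropping sc_status 200 is only an example filter.

```ini
# $SPLUNK_HOME/etc/system/local/inputs.conf on the Universal Forwarder
# (the monitored path is an assumption; point it at your IIS log directory)
[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
sourcetype = iis

# $SPLUNK_HOME/etc/system/local/props.conf on the Universal Forwarder
# (the default iis sourcetype already enables INDEXED_EXTRACTIONS)
[iis]
TRANSFORMS-drop_http_200 = drop_http_200

# $SPLUNK_HOME/etc/system/local/transforms.conf on the Universal Forwarder
[drop_http_200]
# Read the already-extracted field rather than the raw event text
SOURCE_KEY = field:sc_status
REGEX = ^200$
# Discard matching events before they are forwarded
DEST_KEY = queue
FORMAT = nullQueue
```

The forwarder will most likely need a restart before changes to these files take effect.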

yannK
Splunk Employee

Do you have any examples of the configuration?

0 Karma

lukejadamec
Super Champion

You don't need rex to pull any IIS field if the search-time extractions are configured correctly.

First, group your IIS logs by content: different web sites can have different log content, i.e. a different number or type of fields in the header.

Second, specify a unique sourcetype in inputs.conf on the forwarder that is collecting the logs. Use the same sourcetype for each input that has the same IIS header.

Third, create a props.conf and a transforms.conf stanza on the indexer in splunk/etc/system/local that sets the delims and fields for that sourcetype.

Since this is a search-time extraction, it will work on all indexed IIS logs so long as the header has not changed, and if you make a mistake it can be corrected without changing the indexed data.

There are few things better than IIS logs that are extracted properly. If you need the details, let me know.
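
The three steps above might look roughly like the following. Treat it as a sketch rather than a tested configuration: the sourcetype name iis_site1, the monitored path, and the stanza name are hypothetical, and the FIELDS list is an assumption that must be replaced with the exact order from the #Fields header of your own logs.

```ini
# inputs.conf on the forwarder (path and sourcetype name are hypothetical)
[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
sourcetype = iis_site1

# props.conf on the indexer, in splunk/etc/system/local
[iis_site1]
# REPORT- applies the named transform at search time
REPORT-iis_site1_fields = iis_site1_fields

# transforms.conf on the indexer, in splunk/etc/system/local
[iis_site1_fields]
# IIS W3C logs are space-delimited
DELIMS = " "
# Must match the order of the #Fields header for this group of logs
FIELDS = "date", "time", "s_ip", "cs_method", "cs_uri_stem", "cs_uri_query", "s_port", "cs_username", "c_ip", "cs_User_Agent", "sc_status", "sc_substatus", "sc_win32_status", "time_taken"
```

A site that logs a different set of fields would get its own sourcetype and its own FIELDS list.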

0 Karma

lukejadamec
Super Champion

Well then, to answer your question about IIS log filtering changes in Splunk 6: no, there were no changes similar to those for Windows events.
Yes, you can filter IIS logs prior to indexing.
Configure the input on the forwarder, and configure props.conf and transforms.conf on the indexer in splunk/etc/system/local.
The format for the props and transforms will be the same as for regular Windows events; the difference will be the regex you use to identify the 404 and 500 errors.
See this post:
http://answers.splunk.com/answers/29218/filtering-windows-event-logs
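
A sketch of what that could look like on the indexer, using the usual two-transform pattern (discard everything, then route the matches back to the indexQueue). It rests on several assumptions: the sourcetype name iis_raw and the stanza names are hypothetical, the regex assumes every line ends with sc-status, sc-substatus, sc-win32-status, and time-taken in that order, and the events must actually be parsed on the indexer (data already handled by INDEXED_EXTRACTIONS on a forwarder may bypass these transforms).

```ini
# props.conf on the indexer (sourcetype name iis_raw is hypothetical)
[iis_raw]
# Order matters: drop everything first, then rescue the events to keep
TRANSFORMS-keep_errors = iis_drop_all, iis_keep_404_500

# transforms.conf on the indexer
[iis_drop_all]
# Send every event to the nullQueue ...
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[iis_keep_404_500]
# ... then route back to the indexQueue any event whose sc-status is 404 or 500.
# Assumes each line ends with: sc-status sc-substatus sc-win32-status time-taken
REGEX = \s(?:404|500)\s\d+\s\d+\s\d+\s*$
DEST_KEY = queue
FORMAT = indexQueue
```

If your logs use a different field order, the regex has to be adjusted so that it anchors on the position of sc-status in your header.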

0 Karma

trodenbaugh
Explorer

Actually, I'm trying to perform the filtering prior to indexing. I'd like to reduce the amount of license usage.

0 Karma