Getting Data In

IIS log files are not read properly - parts of multiple lines getting put together as one

Explorer

I have a report that groups webpage requests from an IIS log by sc_status. The results are really bad because Splunk appears to be getting confused about which line, and which part of a line, it's reading, resulting in data like "myurl.com" showing up where "200" should be for sc_status.

I have Splunk set up to monitor the folder where log files are stored in real time and I manually selected IIS logs when identifying the format of the files.

This is what Splunk has stored for one request:
2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET /AppThemes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) stuffid=stuff;+user=stuff;+persistcookie=True;+stuffSelection=STUFF1,STUFF2,STUFF3,STUFF4,STUFF5,;+MYWEBSITE=R285025761;+ASP.NETSessionId=3sgbsssgrvbwizta31fcynmx;+MyWebSite.ASPXAUTH=D2E24F7A75F2114DCF6AFB5DA65C739A2972D39870A74C1735EF0B3A819F27D5E743DE70EB6C5D7ADF944507DA71042D235483889FEA3A736EFBA2E81AB02F47A08BA93D51C6563422CE17055236EA5BBDCC03A03B4389CE042ADDFB89AA7A7D6C7246376DB20045AD709BE50444332F048A79BD65269C0919B0A5ADA4EE415EE1E96BCFBF3D5D33507D663A5671DE9E https://m5.0+(Macintosh;+Intel+Mac+OS+X+1010)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+Safari/600.1.25 MYWEBSITE=R285025761;+ASP.NET_SessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome 200 0 0 24916 515 31

This request appears to be a mashup of two or more requests:
Part 1: 2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET /AppThemes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+Trident/7.0;+rv:11.0)+like+Gecko stuffid=stuff;+user=stuff;+persistcookie=True;+datalistSelection=OFAC,PEPFO,;+MYWEBSITE=R285025761;+ASP.NETSessionId=ykvwd2cgbhjcjck45jcy1w13 https://mywebsite.thisisawesome.com/logon.aspx mywebsite.thisisawesome.com 304 0 0 92 593 62

Part 2: 2015-12-30 15:06:38 W3SVC3 MYWEBSERVER 192.111.11.11 GET /Includes/jquery-1.4.2.min.js - 80 - 209.15.236.88 HTTP/1.1 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+1010)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+Safari/600.1.25 MYWEBSITE=R285025761;+ASP.NETSessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome.com 200 0 0 24916 515 31

and part of another request in the middle.

I can see at least one place where the lines were mashed together. In this snippet, "5671DE9E https://m5.0+(Macintosh;+Int", you can see that "https://m" is part of a URL and "5.0+" is part of a user agent, but they're joined without a space as if they were one field.

Other than that, I'm not sure where the data is coming from in the log file to put that one request together in Splunk.
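
One rough way to confirm the mash-up is to count whitespace-separated fields per event. The sketch below is only an illustration: the 22-column layout and the position of sc-status are assumptions inferred from the "Part 1" and "Part 2" samples above (the log's own "#Fields:" header is the real authority), and `UA`, `COOKIE`, and `COOKIE2` are placeholders standing in for the long user-agent and cookie values.

```python
# Assumed layout: 22 space-separated columns, inferred from the sample events.
# The real layout is whatever the log file's "#Fields:" header declares.
EXPECTED_FIELDS = 22
SC_STATUS_INDEX = 16  # zero-based position of sc-status in the assumed layout


def check_event(event):
    """Return (field_count, value found at the sc-status position)."""
    fields = event.split()
    status = fields[SC_STATUS_INDEX] if len(fields) > SC_STATUS_INDEX else None
    return len(fields), status


# Shortened copy of a well-formed event (UA and COOKIE are placeholders).
good = ("2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET "
        "/AppThemes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 UA COOKIE "
        "https://mywebsite.thisisawesome.com/logon.aspx "
        "mywebsite.thisisawesome.com 304 0 0 92 593 62")

# Same shape as the merged event: two extra tokens spliced in after the cookie.
merged = ("2015-12-30 15:06:54 W3SVC3 MYWEBSERVER 192.111.11.11 GET "
          "/AppThemes/Blue/Blue.css - 80 - 54.69.58.243 HTTP/1.1 UA COOKIE "
          "https://m5.0+(Macintosh) COOKIE2 "
          "https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f "
          "mywebsite.thisisawesome 200 0 0 24916 515 31")

for name, event in (("good", good), ("merged", merged)):
    count, status = check_event(event)
    flag = "ok" if count == EXPECTED_FIELDS else "MISALIGNED"
    print(name, count, status, flag)
```

With two extra tokens in the merged event, every later column shifts right by two, which would explain a URL landing in the position where the status code is expected.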

My question is, how do I get Splunk to read my IIS logs properly and not mash up multiple lines into one line?

Thanks!

SplunkTrust

It appears you have a line-merging / line-breaking problem. Check inputs.conf to find the sourcetype assigned to these logs, then find the matching stanza in props.conf, confirm that SHOULD_LINEMERGE = false, and configure a line breaker. Breaking on the date at the start of each event looks like the best option.

inputs.conf:

[<input stanza>]
...
sourcetype=sourcetypeName

props.conf:

[sourcetypeName]
...
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}

Check the docs for reference. If you're sending from a universal forwarder, you'll need to put props on the forwarder.
http://docs.splunk.com/Documentation/Splunk/latest/admin/Propsconf
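
To get a feel for what a date-based line breaker does to the raw stream, here is a rough Python emulation. It is a sketch, not Splunk's implementation. It assumes the documented LINE_BREAKER behavior: the pattern must contain a capture group, the text matched by the first group is discarded, and the next event starts right after it. The pattern below wraps the newline in that group and requires a timestamp to follow. The two shortened log lines are made up for illustration.

```python
import re

# Assumed pattern: newline(s) in the first capture group (discarded),
# followed by the timestamp that must begin the next event.
LINE_BREAKER = re.compile(r"([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}")


def split_events(raw):
    """Break raw text at each match, dropping only the first capture group."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the group
        start = match.end(1)                      # next event starts after it
    events.append(raw[start:])
    return events


# Hypothetical, shortened IIS lines joined by a Windows line ending.
raw = (
    "2015-12-30 15:06:38 W3SVC3 MYWEBSERVER GET /Includes/jquery-1.4.2.min.js 200\r\n"
    "2015-12-30 15:06:54 W3SVC3 MYWEBSERVER GET /AppThemes/Blue/Blue.css 304"
)

events = split_events(raw)
for event in events:
    print(repr(event))
```

Because the timestamp sits outside the capture group, it stays at the front of the second event instead of being thrown away with the newline.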

Explorer

I believe I should be looking for a sourcetype of "iis", since that's what I configured the data input as.

In the inputs.conf file, I do not see anything for iis so I'm not sure if any changes are necessary.

I see this in the props.conf file. SHOULD_LINEMERGE is already set to false.

[iis]
pulldown_type = true
MAX_TIMESTAMP_LOOKAHEAD = 32
SHOULD_LINEMERGE = False
INDEXED_EXTRACTIONS = w3c
detect_trailing_nulls = auto
category = Web
description = W3C Extended log format produced by the Microsoft Internet Information Services (IIS) web server

I'll add this below the description line: LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
... and see what happens.

SplunkTrust

Make it a lowercase false. And yes, with SHOULD_LINEMERGE=false you'll want LINE_BREAKER to control where events split; MUST_BREAK_AFTER, BREAK_ONLY_BEFORE, and the related settings only take effect when SHOULD_LINEMERGE is true.

SplunkTrust

The inputs.conf will be located on the forwarder on the IIS servers, or wherever Splunk is reading the log files from.

You can run $SPLUNK_HOME/bin/splunk cmd btool inputs list --debug to see which inputs.conf stanzas are loaded and which app they're loaded from.
