Getting Data In

IIS sourcetype switching by time

lsouzek
Explorer

Hello,

We're seeing kind of a strange issue with IIS sourcetypes for two IIS servers that are forwarding logs to the same Splunk indexer. From midnight until between 2 and 6 hours later (local server time; GMT/log time is +5), the logs are showing up as sourcetype "iis". Then at some given point (2:41 a.m. on host 1, 5:40 a.m. on host 2) the sourcetype is switching to "iis-3". There's no overlap between the sourcetypes and the log entries have the same number of fields. See below:

Host 1

12/8/10 2:41:07.000 AM   2010-12-08 07:41:07 HOST1 [IP_ADDRESS_1] GET / - 80 - [IP_ADDRESS_2] - - 403 14 5 218 7 0

host=HOST1, sourcetype=iis, source=D:\webserver\logs\W3SVC1\ex101208.log

12/8/10 2:41:12.000 AM   2010-12-08 07:41:12 HOST1 [IP_ADDRESS_1] GET / - 80 - [IP_ADDRESS_2] - - 403 14 5 218 7 0

host=HOST1, sourcetype=iis-3, source=D:\webserver\logs\W3SVC1\ex101208.log

Host 2

12/8/10 5:40:24.000 AM   2010-12-08 10:40:24 HOST2 [IP_ADDRESS_1] GET / - 80 - [IP_ADDRESS_2] - - 403 14 5

host=HOST2, sourcetype=iis, source=D:\webserver\logs\W3SVC1\ex101208.log

12/8/10 5:40:29.000 AM   2010-12-08 10:40:29 HOST2 [IP_ADDRESS_1] GET / - 80 - [IP_ADDRESS_2] - - 403 14 5

host=HOST2, sourcetype=iis-3, source=D:\webserver\logs\W3SVC1\ex101208.log

Any idea what might be causing the sourcetypes to flip back and forth like that?


southeringtonp
Motivator

While Splunk's automatic sourcetyping is a convenient feature, trusting the software to make that decision can cause problems sometimes.

It sounds like Splunk's automatic sourcetyping is active, and it has decided to create different sourcetypes (perhaps something to do with seeing or not seeing the IIS header rows, especially since they can appear in the middle of the file when IIS restarts).

Your best bet is to do two things:

  • Explicitly assign the sourcetype as iis, via inputs.conf or props.conf.
  • Rename the sourcetype iis-3 to iis to make the already-indexed logs consistent.
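
As a rough sketch of those two steps: the monitor path below is assumed from the `source` value in the question, so adjust it to your own layout. The explicit assignment goes in inputs.conf on the forwarder, and the rename goes in props.conf where searches run. The `rename` setting only applies at search time, so the indexed data itself is not changed.

```ini
# inputs.conf on the forwarder (path assumed from the question's source field)
[monitor://D:\webserver\logs\W3SVC1]
sourcetype = iis

# props.conf where searches run: present already-indexed iis-3 events as iis
[iis-3]
rename = iis
```

The original value should remain searchable through the `_sourcetype` field if you ever need to find the renamed events again.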

  • Edit:

    Since you mentioned CHECK_FOR_HEADER, be aware that CHECK_FOR_HEADER can occasionally get confused about the format. Also, by design it will create new sourcetypes if the field list does not match across all of your IIS sources.

    In this case, it's safer to use a fixed field list or a regex-based field extraction, instead of using CHECK_FOR_HEADER. Here's one approach...

    props.conf:

    [rule::sourcetype_iis]
    sourcetype=iis
    MORE_THAN_75 = W3SVC
    
    [iis]
    CHECK_FOR_HEADER = False
    TIME_PREFIX = :\s
    MAX_TIMESTAMP_LOOKAHEAD = 128
    TIME_FORMAT = %Y-%m-%d %H:%M:%S
    TZ = GMT
    REPORT-iisfields = iis-fields
    

    transforms.conf:

    [iis-fields]
    REGEX = \d\d\d\d-\d\d-\d\d [\d:]+ (W3\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) (\S+)? (\S+)? (\S+)?
    FORMAT = cs_sitename::$1 cs_ip::$2 method::$3 uri_stem::$4 uri_query::$5 s_port::$6 user::$7 src_ip::$8 useragent::$9 statuscode::$10 substatuscode::$11 sc_win32_status::$12
    


    Keysofsandiego
    Path Finder

    I found out why this is happening - though I do not have a good answer on how to fix.

    <-------basically the IIS log Starts at midnight (or your time setting)------->
    ' #Software: Microsoft Internet Information Services 7.5
    ' #Version: 1.0
    ' #Date: 2015-06-03 00:00:00
    ' #Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken
    ' then log data occurs LOGS LOGS LOGS DATA HOST LATENCY YADDA YADDA ETC

    <-------later in the day the headers appear again------->
    ' continued from above log data occurs LOGS LOGS LOGS DATA HOST LATENCY YADDA YADDA ETC
    ' #Software: Microsoft Internet Information Services 7.5
    ' #Version: 1.0
    <------- this is where the Splunk forwarder parses the log and creates sourcetype "iis-#" ------->
    ' #Date: 2015-06-03 00:00:00
    ' #Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken
    ' continued from above log data occurs LOGS LOGS LOGS DATA HOST LATENCY YADDA YADDA ETC

    This will be the standard until a new IIS log is created.

    VICIOUS CYCLE REPEATS UNTIL ADMIN GOES CRAZY AND MOVES TO MEXICO
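
    One common way to at least keep the repeated header lines out of the index is to route them to the nullQueue at parse time. This is only a sketch: the transform name `iis-drop-headers` is made up for illustration, the settings have to live on the instance that parses the data (indexer or heavy forwarder), and they do not by themselves stop the sourcetype from being renumbered, which the other replies attribute to CHECK_FOR_HEADER.

```ini
# props.conf
[iis]
TRANSFORMS-dropheaders = iis-drop-headers

# transforms.conf
[iis-drop-headers]
# IIS header lines (#Software, #Version, #Date, #Fields) all start with '#'
REGEX = ^#
DEST_KEY = queue
FORMAT = nullQueue
```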

    yannK
    Splunk Employee

    Recommendations:

    • When you set CHECK_FOR_HEADER = False, set it on the forwarder (UF and LWF) too, because this props setting is applied during sourcetype learning on the forwarder (otherwise you will still get iis-2, iis-3, etc.).
     #on the forwarder in props.conf
    [iis]
    CHECK_FOR_HEADER = False
    
     # on the indexer in props.conf
    [iis]
    CHECK_FOR_HEADER = False
    TIME_PREFIX = :\s
    MAX_TIMESTAMP_LOOKAHEAD = 128
    TIME_FORMAT = %Y-%m-%d %H:%M:%S
    TZ = GMT
     # add field extraction if needed, see below
    
    • Because IIS logs can have many different field layouts, it's better to define the field extraction yourself, based on your real logs, than to copy a model you find on Answers.

    Example:


    # my iis log header
    # date : sssss
    # FIELDS: date time c-ip cs-username s-sitename etc ....

    use:

    props.conf
    [iis]
    REPORT-iisfields = iis_w3c

    transforms.conf
    [iis_w3c]
    FIELDS = "date","time","c-ip","cs-username","s-sitename" etc....
    DELIMS = " "
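
    For instance, taking the 14-field #Fields header from IIS 7.5 quoted earlier in this thread, a delimiter-based extraction might look like the sketch below. The underscore field names are an assumption for illustration, and the list has to follow the order of your own #Fields line.

```ini
# props.conf
[iis]
REPORT-iisfields = iis_w3c

# transforms.conf
[iis_w3c]
# Split each event on spaces and name the pieces in #Fields order
DELIMS = " "
FIELDS = "date","time","s_ip","cs_method","cs_uri_stem","cs_uri_query","s_port","cs_username","c_ip","cs_user_agent","sc_status","sc_substatus","sc_win32_status","time_taken"
```

    Note that the name after REPORT-iisfields has to match the transforms.conf stanza name exactly, otherwise no fields are extracted.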

    lsouzek
    Explorer

    Thanks for the suggestion. I ended up using the approach outlined in this post (http://answers.splunk.com/questions/7205/w3c-fields-with-light-forwarder-still-dont-have-it), which is similar but seemed a little simpler. One strange thing is that restarting Splunk on the IIS servers today seemed to kick the logs back to iis-3 but I'm guessing/hoping they will go back to a sourcetype of iis when they roll over at midnight.

    southeringtonp
    Motivator

    CHECK_FOR_HEADER is probably the issue then. It can get confused sometimes. I would definitely suggest not using it, and using a fixed field list or regex transform to pull out the fields instead. Also, make sure your IIS servers are all configured to log the same list of fields. For the already indexed data, you'll need to use rename unless you want to completely reindex those events. See edits above for more information.

    lsouzek
    Explorer

    My apologies, I should have mentioned this in the first post, but we're explicitly setting the sourcetype to iis in both inputs.conf and props.conf in the deployed application. CHECK_FOR_HEADER is also set to true in props.conf. Is renaming the sourcetype iis-3 to iis my only option?
