How to configure timestamp and other data formatting for multiline Exchange Autodiscover logs?
I have been asked to take on some logs that have a predictable format, but a one-shot test input shows that Splunk does not parse them correctly on its own. Here is a sample log entry, which spans multiple lines:
20141021_150239.928_128.200.22.13: Request Begin. User Agent: Microsoft Office/14.0 (Windows NT 6.1; Microsoft Outlook 14.0.7128; Pro)
20141021_150239.928_128.200.22.13: XML Message: <?xml version="1.0" encoding="utf-8"?><Autodiscover xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/requestschema/2006"><Request><EMailAddress>bvarela@uci.edu</EMailAddress><AcceptableResponseSchema>http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a</AcceptableResponseSchema></Request></Autodiscover>
20141021_150239.928_128.200.22.13: **** Start Header Dump ****
20141021_150239.928_128.200.22.13: Cache-Control: no-cache
20141021_150239.928_128.200.22.13: Connection: Keep-Alive
20141021_150239.928_128.200.22.13: Pragma: no-cache
20141021_150239.928_128.200.22.13: Content-Length: 348
20141021_150239.928_128.200.22.13: Content-Type: text/xml
20141021_150239.928_128.200.22.13: Cookie: OutlookSession="{54AE4359-2E0C-4A13-9486-1DD48DAD6B66}"
20141021_150239.928_128.200.22.13: Host: autodiscover.uci.edu
20141021_150239.928_128.200.22.13: User-Agent: Microsoft Office/14.0 (Windows NT 6.1; Microsoft Outlook 14.0.7128; Pro)
20141021_150239.928_128.200.22.13: X-User-Identity: bvarela@uci.edu
20141021_150239.928_128.200.22.13: Depth: 0
20141021_150239.928_128.200.22.13: **** End Header Dump ****
20141021_150239.928_128.200.22.13: Email address "bvarela@uci.edu" retrieved from XML request.
20141021_150239.928_128.200.22.13: Request: bvarela@uci.edu; Redirect: bvarela@exchange.uci.edu
20141021_150239.928_128.200.22.13: End Request. Took 44ms.
In Splunk, under Manager >> Data Inputs >> Files & Directories >> Data Preview, I chose "Specify a pattern or regex to break before" and gave it this regex:
(\d{8}_\d{1,6}\.\d{3}_)(\d{1,3}\.){3}\d{1,3}: Request Begin\.
This produced the following as the first record (the event number and the timestamp Splunk assigned, followed by the event text):
1 10/14/01 4:15:28.000 PM
20141021_150239.928_128.200.22.13: Request Begin. User Agent: Microsoft Office/14.0 (Windows NT 6.1; Microsoft Outlook 14.0.7128; Pro)
20141021_150239.928_128.200.22.13: XML Message: <?xml version="1.0" encoding="utf-8"?><Autodiscover xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/requestschema/2006"><Request><EMailAddress>bvarela@uci.edu</EMailAddress><AcceptableResponseSchema>http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a</AcceptableResponseSchema></Request></Autodiscover>
20141021_150239.928_128.200.22.13: **** Start Header Dump ****
20141021_150239.928_128.200.22.13: Cache-Control: no-cache
20141021_150239.928_128.200.22.13: Connection: Keep-Alive
20141021_150239.928_128.200.22.13: Pragma: no-cache
20141021_150239.928_128.200.22.13: Content-Length: 348
20141021_150239.928_128.200.22.13: Content-Type: text/xml
20141021_150239.928_128.200.22.13: Cookie: OutlookSession="{54AE4359-2E0C-4A13-9486-1DD48DAD6B66}"
20141021_150239.928_128.200.22.13: Host: autodiscover.uci.edu
20141021_150239.928_128.200.22.13: User-Agent: Microsoft Office/14.0 (Windows NT 6.1; Microsoft Outlook 14.0.7128; Pro)
20141021_150239.928_128.200.22.13: X-User-Identity: bvarela@uci.edu
20141021_150239.928_128.200.22.13: Depth: 0
20141021_150239.928_128.200.22.13: **** End Header Dump ****
20141021_150239.928_128.200.22.13: Email address "bvarela@uci.edu" retrieved from XML request.
20141021_150239.928_128.200.22.13: Request: bvarela@uci.edu; Redirect: bvarela@exchange.uci.edu
20141021_150239.928_128.200.22.13: End Request. Took 44ms.
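For anyone who wants to check the break pattern outside Splunk, the "break before" idea can be approximated in a few lines of Python. This is only a sketch: `split_events` is a hypothetical helper, it uses Python's `re` module rather than Splunk's regex engine, and it assumes a new event starts at any line where the pattern is found.

```python
import re

# The same pattern given to Data Preview, unchanged.
BREAK_BEFORE = re.compile(
    r"(\d{8}_\d{1,6}\.\d{3}_)(\d{1,3}\.){3}\d{1,3}: Request Begin\."
)

def split_events(lines):
    """Group raw log lines into events, starting a new event
    whenever a line matches the break-before pattern."""
    events = []
    for line in lines:
        if BREAK_BEFORE.search(line) or not events:
            # First line overall, or a "Request Begin." line: open a new event.
            events.append([line])
        else:
            # Any other line is merged into the current event.
            events[-1].append(line)
    return ["\n".join(event) for event in events]
```

Feeding it the sample above should give a single event that runs from "Request Begin." to "End Request.", which matches what Data Preview showed.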
The prefix before the first colon on each line holds the timestamp and the IP address, in this format:
YYYYmmdd_(24hr)(min)(sec).(millisec)_(ipnumber)
or, put another way:
YYYYmmdd_HHMMss.mmm_(ipnumber)
So for this last example (20141021_150239.928_128.200.22.13) I would expect a timestamp of 10/21/2014 3:02:39.928 PM, but Splunk is not extracting it. It would also be nice to pull the IP address out into its own field (dropping the '_', so that IP=128.200.22.13), and to strip the redundant timestamp/IP prefix from the remaining lines of each log entry.
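To make the intended parse concrete, here is a minimal Python sketch of how I read the prefix. `parse_prefix` is a hypothetical helper, not anything Splunk provides, and the `strptime` format string is my own translation of the layout described above.

```python
import re
from datetime import datetime

# timestamp, underscore, dotted-quad IP, colon, then the message text
PREFIX = re.compile(r"^(\d{8}_\d{6}\.\d{3})_((?:\d{1,3}\.){3}\d{1,3}):\s*")

def parse_prefix(line):
    """Return (timestamp, ip, message) for one raw log line."""
    match = PREFIX.match(line)
    if match is None:
        raise ValueError("line does not start with the expected prefix")
    # %f accepts the three millisecond digits and treats them as a fraction.
    stamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S.%f")
    return stamp, match.group(2), line[match.end():]
```

For the line ending in "End Request. Took 44ms." this should yield 2014-10-21 15:02:39.928 as the timestamp, 128.200.22.13 as the IP, and the bare message text.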
Any ideas?

Give this a try (either in props.conf directly OR in Data Preview -> Advanced mode)
BREAK_ONLY_BEFORE=(\d{8}_\d{1,6}\.\d{3}_)(\d{1,3}\.){3}\d{1,3}: Request Begin\.
MAX_TIMESTAMP_LOOKAHEAD=25
NO_BINARY_CHECK=1
SEDCMD-ipaddr=s/(\d{8}_\d{6}\.\d{3})_(.*)/\1 IP=\2/
SEDCMD-removeextra=s/(\d{8}_\d{6}\.\d{3}_\d+\.\d+\.\d+\.\d+\:\s*)//g
SHOULD_LINEMERGE=true
TIME_FORMAT=%Y%m%d_%H%M%S.%3Q_
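To show what the two SEDCMD expressions are meant to do to the event text, here is a rough Python sketch. It assumes the substitutions run in the order ipaddr, then removeextra, and it uses Python's `re` module rather than Splunk's regex engine. `apply_sedcmds` is a hypothetical name, and the sketch says nothing about how Splunk itself reads these props.conf lines.

```python
import re

def apply_sedcmds(event):
    """Approximate the two SEDCMD substitutions on one multiline event."""
    # SEDCMD-ipaddr has no g flag, so only the first prefix is rewritten:
    # "<timestamp>_<rest of line>" becomes "<timestamp> IP=<rest of line>".
    event = re.sub(r"(\d{8}_\d{6}\.\d{3})_(.*)", r"\1 IP=\2", event, count=1)
    # SEDCMD-removeextra has the g flag, so every remaining
    # "<timestamp>_<ip>: " prefix is deleted. The first line no longer
    # matches because its underscore has already been replaced.
    event = re.sub(r"\d{8}_\d{6}\.\d{3}_\d+\.\d+\.\d+\.\d+:\s*", "", event)
    return event
```

On a three-line event, the first line should come out as "20141021_150239.928 IP=128.200.22.13: Request Begin. ..." and the other lines should lose their prefix entirely, leaving just "Host: autodiscover.uci.edu" and "End Request. Took 44ms.".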
This works brilliantly, except for the SEDCMD-ipaddr sed command. After much head scratching I realized that the equals sign in the replacement string was causing that sed to fail. Then, of course, the SEDCMD-removeextra sed command removed every prefix, leaving no IP address at all.
If I use IP: instead, it works fine. Is there some way to include the = sign, though? I tried the same expression in regular sed on Linux and it had no problem with the = sign, so the problem seems to be specific to Splunk.
