<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Blackboard bb-access-logs not parsing correctly in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265172#M79706</link>
    <description>&lt;P&gt;Updated &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER = (\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)\s((\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)|\-)\s
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;We've tried variations of the regex including&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;()\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 06 Oct 2015 22:09:44 GMT</pubDate>
    <dc:creator>cyndiback</dc:creator>
    <dc:date>2015-10-06T22:09:44Z</dc:date>
    <item>
      <title>Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265171#M79705</link>
      <description>&lt;P&gt;Blackboard has changed the format of the bb-access-logs to include session information. With the new data, individual log entries are being split into several events, and some are being indexed with the wrong timestamp. Is anyone else working with these logs and indexing them successfully?&lt;/P&gt;

&lt;P&gt;Splunk universal forwarder, inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///usr/local/blackboard/apps/tomcat/logs/bb-access-log.*.txt]
index=oc_dev
sourcetype=bb-access-log
ignoreOlderThan=1h
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Splunk indexer props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[bb-access-log]
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z
MAX_TIMESTAMP_LOOKAHEAD = 85
SHOULD_LINEMERGE = False
TRUNCATE = 25000
LINE_BREAKER = ([\n\r]+)(\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)\s((\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)|\-)\s
EXTRACT-webcampus_access-fields = (?&amp;lt;src_ip&amp;gt;\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b)\s(?&amp;lt;dest_ip&amp;gt;.*?)\s(?&amp;lt;HTTPComponent&amp;gt;.*?)\s(?&amp;lt;duid&amp;gt;.*?)\s(?&amp;lt;dateInternal&amp;gt;\[(.*?)\])\s(?&amp;lt;HTTP&amp;gt;\"(.*?)\")\s(?&amp;lt;http_status_code&amp;gt;.*?)\s(?&amp;lt;unknownDigits2&amp;gt;.*?)\s(?&amp;lt;http_user_agent&amp;gt;\".*?\")\s
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Example of possible log formatting:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;ip_address ip_address http_code duid [%d/%b/%Y:%H:%M:%S %z] "http activity" digit digit "browser information" "session information" digit digit
ip_address - http_code duid [%d/%b/%Y:%H:%M:%S %z] "http activity" digit digit "browser information" "session information" digit digit
ip_address - http_code {unset id} [%d/%b/%Y:%H:%M:%S %z] "http activity" digit digit "-" "-" digit digit
- - - {unset id} [%d/%b/%Y:%H:%M:%S %z] "http activity" digit digit "-" "-" digit digit
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Example of possible logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;1.1.1.1 1.1.1.1 http-nio-8081-exec-3 _1_2 [06/Oct/2015:13:58:22 -0700] "POST /webapps/bb-social-learning-BBLEARN/dwr_open/call/plaincall/ToolActivityService.getActivityForAllTools.dwr HTTP/1.1" 200 116 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36" "JSESSIONID=123456789123456879; BIGipServer~Part-Apps~blackboard-8081=155738122.37151.0000; cookies_enabled=yes; JSESSIONID=123456789123456879; web_client_cache_guid=57a5cf06-3530-48d4-a15f-24f7b7ad0c86; __utma=180787233.108259193.1444163123.1444163123.1444163123.1; __utmb=180787233.17.10.1444163123; __utmc=180787233; __utmz=180787233.1444163123.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); session_id=123456789123456879; s_session_id=123456789123456879" 14 116

1.1.1.1 - http-nio-8081-exec-7 {unset id} [06/Oct/2015:14:45:52 -0700] "GET / " 200 975 "-" "-" 78 975

1.1.1.1 1.1.1.1 http-nio-8081-exec-39 _1_2 [06/Oct/2015:14:45:50 -0700] "POST /webapps/blackboard/dwr_open/call/plaincall/ToolActivityService.getActivityForAllTools.dwr HTTP/1.1" 200 116 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36" "JSESSIONID=123456789123456879; AMCV_774C31DD5342CAF40A490D44%40AdobeOrg=793872103%7CMCIDTS%7C16709%7CMCMID%7C54023606677090264832941195089008334173%7CMCAAMLH-1444243530%7C7%7CMCAAMB-1444243530%7CNRX38WO0n5BH8Th-nqAG_A%7CMCAID%7CNONE; s_vnum=1446231162345%26vn%3D1; optimizelySegments=%7B%22183805944%22%3A%22none%22%2C%22183871552%22%3A%22direct%22%2C%22183916464%22%3A%22false%22%2C%22184368091%22%3A%22gc%22%7D; optimizelyEndUserId=123456789123456879.123456789123456879; optimizelyBuckets=%7B%7D; s_fid=123456789123456879-1F5618C69F6F5C95; mbox=session#1443638730148-106896#1443641446|PC#1443638730148-106896.17_60#1444849186; _ga=GA1.2.1929637067.1443278384; BIGipServer~Part-Apps~blackboard-8081=155738122.37151.0000; xythosdrive=0; web_client_cache_guid=6251af67-d5e0-4752-a2f7-0a05da0b8839; __utma=184114681.1929637067.1443278384.1443715769.1444002370.6; __utmc=184114681; __utmz=184114681.1443278384.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); JSESSIONID=123456789123456879; __utma=180787233.1929637067.1443278384.1444154819.1444159011.35; __utmc=180787233; __utmz=180787233.1443320404.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); session_id=123456789123456879; s_session_id=123456789123456879" 3 116
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Original post did not update successfully.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2015 22:03:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265171#M79705</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2015-10-06T22:03:39Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265172#M79706</link>
      <description>&lt;P&gt;Updated &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER = (\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)\s((\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)|\-)\s
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;We've tried variations of the regex including&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;()\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 06 Oct 2015 22:09:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265172#M79706</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2015-10-06T22:09:44Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265173#M79707</link>
      <description>&lt;P&gt;Other LINE_BREAKER regex tried:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;()\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}
(\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)\s((\b(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[1-9])){3}\b)|\-)\s
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 06 Oct 2015 22:18:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265173#M79707</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2015-10-06T22:18:50Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265174#M79708</link>
      <description>&lt;P&gt;I don't know the format of these logs, but aren't the log entries one event per line anyway? In other words, wouldn't the default line breaking on \r\n work just fine?&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2015 22:42:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265174#M79708</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2015-10-06T22:42:52Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265175#M79709</link>
      <description>&lt;P&gt;I agree with @ssievert.  Try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[bb-access-log]
TIME_PREFIX=^[^\[]+\[
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z
MAX_TIMESTAMP_LOOKAHEAD = 26
SHOULD_LINEMERGE = false
TRUNCATE = 25000
# LINE_BREAKER &amp;lt;- leave as default
EXTRACT-webcampus_access-fields = (?&amp;lt;src_ip&amp;gt;\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:(?:2([0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9]))\b)\s(?&amp;lt;dest_ip&amp;gt;.*?)\s(?&amp;lt;HTTPComponent&amp;gt;.*?)\s(?&amp;lt;duid&amp;gt;.*?)\s(?&amp;lt;dateInternal&amp;gt;\[(.*?)\])\s(?&amp;lt;HTTP&amp;gt;\"(.*?)\")\s(?&amp;lt;http_status_code&amp;gt;.*?)\s(?&amp;lt;unknownDigits2&amp;gt;.*?)\s(?&amp;lt;http_user_agent&amp;gt;\".*?\")\s
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 07 Oct 2015 13:46:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265175#M79709</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-10-07T13:46:29Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265176#M79710</link>
      <description>&lt;P&gt;My original confs did not contain LINE_BREAKER, and the issue started with that configuration.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Oct 2015 16:37:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265176#M79710</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2015-10-09T16:37:57Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265177#M79711</link>
      <description>&lt;P&gt;Found the issue: the logs are being written in incomplete chunks by Blackboard. Splunk is indexing the logs as it sees them.&lt;BR /&gt;
Hard-learned lesson: tail your files to see how they are being written.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Oct 2015 17:41:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265177#M79711</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2015-10-09T17:41:03Z</dc:date>
    </item>
    <item>
      <title>Re: Blackboard bb-access-logs not parsing correctly</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265178#M79712</link>
      <description>&lt;P&gt;Actual solution: the issue was that Blackboard does not write complete events. Instead it queues data, writes, pauses to queue again, and writes again. Splunk support advised that Splunk's tailing process can see the pause, interpret it as the complete event having been written, and then try to ingest the data. The majority of the time the data is written as complete events, but not always. An easy way to check bb-access-log accuracy is to search for events that do not contain a source IP (e.g. NOT src_ip=*).&lt;/P&gt;

&lt;P&gt;To resolve this, add time_before_close to the forwarder's inputs.conf.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;time_before_close = 30&lt;/STRONG&gt;&lt;/P&gt;
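
&lt;P&gt;As a rough sketch, assuming the setting goes into the same monitor stanza shown in the original post (the stanza path, index, and sourcetype are copied from there; the comment line is added here only for illustration), the forwarder's inputs.conf might look like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///usr/local/blackboard/apps/tomcat/logs/bb-access-log.*.txt]
index=oc_dev
sourcetype=bb-access-log
ignoreOlderThan=1h
# Assumed placement: do not close the file on EOF until it has gone 30 seconds without modification (default is 3)
time_before_close = 30
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A forwarder restart is typically needed for the change to take effect. The value of 30 is what worked for these logs; a different write pattern may need a different value, and the NOT src_ip=* search is one way to confirm whether partial events are still being indexed.&lt;/P&gt;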

&lt;P&gt;Doc:  &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.6/Admin/Inputsconf" target="_blank"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.6/Admin/Inputsconf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;time_before_close = &lt;EM&gt;integer&lt;/EM&gt;&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Modtime delta required before Splunk can close a file on EOF.&lt;/LI&gt;
&lt;LI&gt;Tells the system not to close files that have been updated in the past &lt;EM&gt;integer&lt;/EM&gt; seconds.&lt;/LI&gt;
&lt;LI&gt;Defaults to 3.&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 29 Sep 2020 08:11:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Blackboard-bb-access-logs-not-parsing-correctly/m-p/265178#M79712</guid>
      <dc:creator>cyndiback</dc:creator>
      <dc:date>2020-09-29T08:11:03Z</dc:date>
    </item>
  </channel>
</rss>

