Splunk for BlueCoat app problem

Path Finder

I'm currently sending BlueCoat logs in W3C ELFF format to Splunk. I've also installed the latest Splunk for Blue Coat app.

However, the log fields do not seem to be extracted correctly. None of the dashboard panels show the correct values. For instance, Top Websites shows "application/x-www-form-urlencoded;%20charset=utf-8" and "application/soap+msbin1" as the top 2 sites...

All logs have 39 fields which are separated by a space (" "). The fields are: date time time-taken c-ip sc-status s-action sc-bytes cs-bytes cs-method cs-uri-scheme cs-host cs-uri-path cs-uri-query cs-username s-hierarchy s-supplier-name cs(Content-Type) cs(User-Agent) sc-filter-result sc-filter-category x-virus-id s-ip s-sitename sc(Content-Encoding) x-bluecoat-release-version s-icap-info s-icap-status x-exception-reason x-exception-sourcefile x-virus-details x-icap-error-code x-icap-error-details cs-uri-stem cs-auth-group cs-auth-type x-cs-user-authorization-name sc-auth-status rs(Content-Type) rs(Content-Encoding).

However, the "cs(User-Agent)" value is wrapped in double quotes and can itself contain spaces, so splitting purely on spaces breaks it into several pieces.

The regex in the Splunk for BlueCoat app is the following:

[mainExtractions]
REGEX = \d+-\d+-\d+\s\d+:\d+:\d+\s(?<time_taken>\d+)\s(?<c_ip>\d+.\d+.\d+.\d+)\s(?<sc_status>[^\s]+)\s(?<s_action>[^\s]+)\s(?<sc_bytes>[^\s]+)\s(?<cs_method>[^\s]+)\s\"(?<cs_uri_scheme>[^\s]+)\"\s(?<cs_host>[^\s]+)\s+(?<cs_uri_port>[^\s]+)\s(?<cs_uri_path>[^\s]+)\s(?<cs_uri_query>[^\s]+)\s(?<cs_username>[^\s]+)\s(?<cs_auth_group>[^\s]+)\s(?<s_hierarchy>[^\s]+)\s(?<s_supplier_name>[^\s]+)\s(?<rs_content_type>[^\s]+)\s(?<cs_referer>[^\s]+)\s(?<cs_UserAgent>[^\s]+)\s\"(?<sc_filter_result>.*)\"\s(?<cs_categories>[^\s]+)\s(?<x_virus_id>[^\s]+)\s(?<s_ip>[^\s]+)

I think it doesn't handle the double quotes around cs_UserAgent correctly. Can anyone help with this?
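For what it's worth, the quoting problem can be reproduced outside Splunk. Below is a minimal Python sketch, not the app's actual extraction logic: it treats a token as either a double-quoted run or a run of non-space characters, so a quoted User-Agent stays in one piece. The sample line is a shortened, hypothetical line in the 39-field format listed in the question, and `TOKEN` and `split_elff` are illustrative names that do not come from the app.

```python
import re

# The 39 field names listed in the question, in order.
FIELD_NAMES = """
date time time-taken c-ip sc-status s-action sc-bytes cs-bytes cs-method
cs-uri-scheme cs-host cs-uri-path cs-uri-query cs-username s-hierarchy
s-supplier-name cs(Content-Type) cs(User-Agent) sc-filter-result
sc-filter-category x-virus-id s-ip s-sitename sc(Content-Encoding)
x-bluecoat-release-version s-icap-info s-icap-status x-exception-reason
x-exception-sourcefile x-virus-details x-icap-error-code
x-icap-error-details cs-uri-stem cs-auth-group cs-auth-type
x-cs-user-authorization-name sc-auth-status rs(Content-Type)
rs(Content-Encoding)
""".split()

# A token is either a double-quoted run (spaces allowed inside)
# or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def split_elff(line):
    """Split one log line into values, keeping quoted values whole."""
    return [tok.strip('"') for tok in TOKEN.findall(line)]


# Hypothetical sample line with a shortened User-Agent (assumption).
sample = (
    '2010-11-26 11:28:55 113 x.x.x.x 200 TCP_NC_MISS 42168 1691 POST https '
    'the.web.site /ProcessLegend.aspx - - DEFAULT_PARENT fqdn.host.name '
    'application/x-www-form-urlencoded;%20charset=utf-8 '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" OBSERVED none - '
    'x.x.x.x SG-HTTPS-Reverse-Proxy-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" '
    '- - - - https://the.web.site/ProcessLegend.aspx - - - - '
    'text/html;%20charset=utf-8 -'
)

values = split_elff(sample)
record = dict(zip(FIELD_NAMES, values))

print(len(values))
print(record["cs(User-Agent)"])
print(record["cs-host"])
```

If the number of values matches the number of field names, the quoted User-Agent has been kept together. A plain split on spaces would produce more tokens than fields for this line.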


Path Finder

Ok, I've found the issue. The log format should be "bcreportermain_v1". After changing that, everything works!


Path Finder

My transforms.conf:

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","src_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","dvc_ip","sc_bytes","cs_bytes","x_virus_id"

Example log files:

2010-11-26 11:28:55 113 x.x.x.x 200 TCP_NC_MISS 42168 1691 POST https the.web.site /ProcessLegend.aspx - - DEFAULT_PARENT fqdn.host.name application/x-www-form-urlencoded;%20charset=utf-8 "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)" OBSERVED none - x.x.x.x SG-HTTPS-Reverse-Proxy-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" - - - - https://the.web.site/ProcessLegend.aspx - - - - text/html;%20charset=utf-8 -

2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS 562 491 POST http another.web.site /Blablabla.svc - - DEFAULT_PARENT another.host.name application/soap+msbin1 - OBSERVED none - x.x.x.x SG-HTTP-Service - 5.5.3.1 - ICAP_NOT_SCANNED "-" - - - - http://another.web.site/Blablabla.svc - - - - application/soap+msbin1 -

Edited of course 😉
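To see why the dashboards show content types as websites, it can help to line up the app's 26-name FIELDS list against a line in the 39-field format. The Python sketch below is only an approximation: it assumes the space delimiter behaves like a plain whitespace split and that the Top Websites panel is driven by `dest_host`, neither of which is confirmed in this thread. The line is the second sample above.

```python
# FIELDS list from the [delimExtractions] stanza (26 names).
APP_FIELDS = [
    "date", "time", "time_taken", "src_ip", "user", "user_group",
    "x_exception_id", "filter_result", "category", "http_referrer",
    "holder", "http_response", "action", "http_method",
    "http_content_type", "uri_scheme", "dest_host", "dest_port",
    "uri_path", "uri_query", "uri_extension", "http_user_agent",
    "dvc_ip", "sc_bytes", "cs_bytes", "x_virus_id",
]

# Second sample line from the thread (39 space-separated values).
line = (
    '2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS 562 491 POST http '
    'another.web.site /Blablabla.svc - - DEFAULT_PARENT another.host.name '
    'application/soap+msbin1 - OBSERVED none - x.x.x.x SG-HTTP-Service - '
    '5.5.3.1 - ICAP_NOT_SCANNED "-" - - - - '
    'http://another.web.site/Blablabla.svc - - - - application/soap+msbin1 -'
)

tokens = line.split()

# Positional assignment: the Nth name gets the Nth value,
# regardless of what that value actually means.
mapped = dict(zip(APP_FIELDS, tokens))

print(len(APP_FIELDS), len(tokens))
print("dest_host =", mapped["dest_host"])
print("user      =", mapped["user"])
```

Under these assumptions the 17th value, which is cs(Content-Type) in this log format, ends up in `dest_host`, and the HTTP status code ends up in `user`. That matches the Top Websites symptom in the original question and suggests the log format, not the quoting, is the main mismatch.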


Path Finder

I think I finally got it working correctly 🙂 It seems that the transforms.conf file in the Splunk for BlueCoat app is wrong.

Original transforms.conf

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","dvc_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","src_ip","sc_bytes","cs_bytes","x_virus_id"

[nullPound]
REGEX = ^\#
DEST_KEY=queue
FORMAT=nullQueue

When I switch "dvc_ip" and "src_ip" in the above, all graphs are displayed correctly. According to the Blue Coat documentation ("SGOS Volume 8: Access Logging"), "src_ip" is actually the 4th field and "dvc_ip" is the 4th field from the end.

After copying the default transforms.conf file to the local directory and changing it like this:

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","src_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","dvc_ip","sc_bytes","cs_bytes","x_virus_id"

everything works.
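If you would rather not edit the long FIELDS line by hand, the swap can be scripted. This is a small Python sketch under the assumption that the FIELDS value is a simple comma-separated list of double-quoted names, as in the stanzas above. `swap_fields` is a hypothetical helper, not part of Splunk or the app.

```python
def swap_fields(fields_value, first, second):
    """Return the FIELDS value with two field names exchanged."""
    names = [name.strip().strip('"') for name in fields_value.split(",")]
    i, j = names.index(first), names.index(second)
    names[i], names[j] = names[j], names[i]
    return ",".join('"{}"'.format(name) for name in names)


# FIELDS value as shipped in the app's default transforms.conf (from this thread).
original = (
    '"date","time","time_taken","dvc_ip","user","user_group",'
    '"x_exception_id","filter_result","category","http_referrer","holder",'
    '"http_response","action","http_method","http_content_type",'
    '"uri_scheme","dest_host","dest_port","uri_path","uri_query",'
    '"uri_extension","http_user_agent","src_ip","sc_bytes","cs_bytes",'
    '"x_virus_id"'
)

fixed = swap_fields(original, "dvc_ip", "src_ip")
print("FIELDS=" + fixed)

# Position check: src_ip should now be the 4th name,
# dvc_ip the 4th name counting from the end.
names = [n.strip('"') for n in fixed.split(",")]
print(names.index("src_ip") + 1, len(names) - names.index("dvc_ip"))
```

The printed line can then be pasted into the [delimExtractions] stanza of the copy in the local directory.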

Path Finder

Silvermail, can you confirm your setup? (log format etc...)


Path Finder

Can you post the header and a few lines from your logs, as well as your transforms.conf? The field names produced by your extractions are probably a bit different from what the app expects, which would explain why the dashboards are not populating.


Path Finder

Hmmm, still not working correctly... When I go to "Dashboards" -> "Traffic Dashboard", the "Top Websites" and "Top Clients" are still wrong 😞

Anybody running Splunk for BlueCoat 100% correctly? 😉


Path Finder

Instead of using the regex, I am actually using the delimiters option, which I find much easier to configure.

This is an example of what mine looks like. You will need to adjust the delimiters in transforms.conf to match what your Blue Coat outputs.

props.conf

[bcoat_proxysg]
TRANSFORMS-main=nullPound
REPORT-main=delimExtractions
SHOULD_LINEMERGE=false
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none

transforms.conf

[delimExtractions]
DELIMS=" "
FIELDS="date","time","time_taken","dvc_ip","user","user_group","x_exception_id","filter_result","category","http_referrer","holder","http_response","action","http_method","http_content_type","uri_scheme","dest_host","dest_port","uri_path","uri_query","uri_extension","http_user_agent","src_ip","sc_bytes","cs_bytes","x_virus_id"

[nullPound]
REGEX = ^\#
DEST_KEY=queue
FORMAT=nullQueue
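As a rough illustration of what these settings are meant to do, here is a Python sketch that mimics two of them outside Splunk: the [nullPound] pattern that discards lines starting with "#", and the timestamp settings that read the first 19 characters of each event. It is not how Splunk implements them. The header lines are hypothetical examples of ELFF-style comment lines, and `keep_event` and `event_time` are illustrative names.

```python
import re
from datetime import datetime

# Same pattern as the [nullPound] stanza: lines beginning with "#".
NULL_POUND = re.compile(r"^\#")


def keep_event(line):
    """True for lines that would NOT be routed to the null queue."""
    return NULL_POUND.search(line) is None


def event_time(line):
    """Parse the leading timestamp.

    MAX_TIMESTAMP_LOOKAHEAD=19 covers exactly 'YYYY-MM-DD HH:MM:SS';
    %T in TIME_FORMAT is shorthand for %H:%M:%S.
    """
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


lines = [
    "#Software: SGOS 5.5.3.1",             # hypothetical header line
    "#Fields: date time time-taken c-ip",  # hypothetical header line
    "2010-11-26 11:28:31 109 x.x.x.x 200 TCP_NC_MISS",
]

events = [line for line in lines if keep_event(line)]
for event in events:
    print(event_time(event).isoformat(), "|", event)
```

Only the data line survives the filter, which is why the "#" header lines written at the top of each log never reach the field extraction.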

Path Finder

Silvermail, how are you sending the logs from the Blue Coat to Splunk and in which format?


Path Finder

How do you extract the User-Agent field correctly? As I said in my post above, the User-Agent value is everything between the two double quotes ("Mozilla 4.5 whatever"), and it is not extracted correctly if you use a space as the delimiter, because the quoted value can contain several space-separated words.

Which log format are you using on your BlueCoat?
