Splunk for Blue Coat ProxySG 3.0.7: Regex for user agent fails when it is set to a dash "-"; regex needs to be updated

konrads
Explorer

Hello,

The regex in 3.0.7 fails when the user agent is just a dash ("-"), as in this example:

2015-11-10 02:00:00 100 xxx.xx.xxx.xxx XYZ abc\ryolo - OBSERVED "Search Engines/Portals" -  200 TCP_NC_MISS GET text/html;%20charset=ISO-8859-1 http www.google.co.uk 80 / ?.... - - 1.2.3.4 57738 1012 - "none" "none"

A better regex is:

^(?<date>[^\s]+)\s+(?<time>[^\s]+)\s+(?<time_taken>[^\s]+)\s+(?<c_ip>[^\s]+)\s+(?<cs_username>[^\s]+)\s+(?<cs_auth_group>[^\s]+)\s+(?<x_exception_id>[^\s]+)\s+(?<filter_result>[^\s]+)\s+\"(?<category>[^\"]+)\"\s+(?<http_referrer>[^\s]+)\s+(?<sc_status>[^\s]+)\s+(?<action>[^\s]+)\s+(?<cs_method>[^\s]+)\s+(?<http_content_type>[^\s]+)\s+(?<cs_uri_scheme>[^\s]+)\s+(?<cs_host>[^\s]+)\s+(?<cs_uri_port>[^\s]+)\s+(?<cs_uri_path>[^\s]+)\s+(?<cs_uri_query>[^\s]+)\s+(?<cs_uri_extension>[^\s]+)\s+[\"]{0,1}(?<http_user_agent>[^\"]+)[\"]{0,1}\s+(?<s_ip>[^\s]+)\s+(?<sc_bytes>[^\s]+)\s+(?<cs_bytes>[^\s]+)\s+\"?(?<x_virus_id>[^\"]+)\"?\s+\"(?<x_bluecoat_application_name>[^\"]+)\"\s+\"(?<x_bluecoat_application_operation>[^\"]+)\"
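
A quick way to check the pattern outside Splunk is to run it against the sample line. The sketch below is only an approximation of what Splunk does: it uses Python's re module as a stand-in for Splunk's PCRE engine, rewrites the named groups into Python's (?P<name>...) form, and the helper name extract_fields is made up for illustration.

```python
import re

# Regex from the post above, copied verbatim and split after each \s+ for
# readability. It uses PCRE-style named groups: (?<name>...).
PCRE_PATTERN = (
    r'^(?<date>[^\s]+)\s+'
    r'(?<time>[^\s]+)\s+'
    r'(?<time_taken>[^\s]+)\s+'
    r'(?<c_ip>[^\s]+)\s+'
    r'(?<cs_username>[^\s]+)\s+'
    r'(?<cs_auth_group>[^\s]+)\s+'
    r'(?<x_exception_id>[^\s]+)\s+'
    r'(?<filter_result>[^\s]+)\s+'
    r'\"(?<category>[^\"]+)\"\s+'
    r'(?<http_referrer>[^\s]+)\s+'
    r'(?<sc_status>[^\s]+)\s+'
    r'(?<action>[^\s]+)\s+'
    r'(?<cs_method>[^\s]+)\s+'
    r'(?<http_content_type>[^\s]+)\s+'
    r'(?<cs_uri_scheme>[^\s]+)\s+'
    r'(?<cs_host>[^\s]+)\s+'
    r'(?<cs_uri_port>[^\s]+)\s+'
    r'(?<cs_uri_path>[^\s]+)\s+'
    r'(?<cs_uri_query>[^\s]+)\s+'
    r'(?<cs_uri_extension>[^\s]+)\s+'
    r'[\"]{0,1}(?<http_user_agent>[^\"]+)[\"]{0,1}\s+'
    r'(?<s_ip>[^\s]+)\s+'
    r'(?<sc_bytes>[^\s]+)\s+'
    r'(?<cs_bytes>[^\s]+)\s+'
    r'\"?(?<x_virus_id>[^\"]+)\"?\s+'
    r'\"(?<x_bluecoat_application_name>[^\"]+)\"\s+'
    r'\"(?<x_bluecoat_application_operation>[^\"]+)\"'
)

# Python's re module only accepts (?P<name>...), so rewrite the group syntax.
PY_PATTERN = re.compile(re.sub(r'\(\?<(\w+)>', r'(?P<\1>', PCRE_PATTERN))

# Sample log line from the post (raw strings keep the backslash literal).
SAMPLE = (
    r'2015-11-10 02:00:00 100 xxx.xx.xxx.xxx XYZ abc\ryolo - OBSERVED '
    r'"Search Engines/Portals" -  200 TCP_NC_MISS GET '
    r'text/html;%20charset=ISO-8859-1 http www.google.co.uk 80 / ?.... '
    r'- - 1.2.3.4 57738 1012 - "none" "none"'
)


def extract_fields(line):
    """Return a dict of named groups, or None when the line does not match."""
    match = PY_PATTERN.match(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE)
print(fields["http_user_agent"])  # prints "-"
```

Because http_user_agent is matched with [^\"]+ and the surrounding quotes are optional, a bare dash is accepted as well as a quoted string. Python's re and PCRE can differ in edge cases, so a match here is a sanity check and not proof of how the extraction behaves inside Splunk.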
1 Solution

konrads
Explorer

Posting an improved regex:

^(?<date>[^\s]+)\s+(?<time>[^\s]+)\s+(?<time_taken>[^\s]+)\s+(?<c_ip>[^\s]+)\s+(?<cs_username>[^\s]+)\s+(?<cs_auth_group>[^\s]+)\s+(?<x_exception_id>[^\s]+)\s+(?<filter_result>[^\s]+)\s+\"(?<category>[^\"]+)\"\s+(?<http_referrer>[^\s]+)\s+(?<sc_status>[^\s]+)\s+(?<action>[^\s]+)\s+(?<cs_method>[^\s]+)\s+(?<http_content_type>[^\s]+)\s+(?<cs_uri_scheme>[^\s]+)\s+(?<cs_host>[^\s]+)\s+(?<cs_uri_port>[^\s]+)\s+(?<cs_uri_path>[^\s]+)\s+(?<cs_uri_query>[^\s]+)\s+(?<cs_uri_extension>[^\s]+)\s+[\"]{0,1}(?<http_user_agent>[^\"]+)[\"]{0,1}\s+(?<s_ip>[^\s]+)\s+(?<sc_bytes>[^\s]+)\s+(?<cs_bytes>[^\s]+)\s+\"?(?<x_virus_id>[^\"]+)\"?\s+\"{0,1}(?<x_bluecoat_application_name>[^\"]+)\"{0,1}\s+\"{0,1}(?<x_bluecoat_application_operation>[^\"]+)\"{0,1}

daniel_augustyn
Contributor

None of these regexes work to extract http_user_agent. Did anyone get it right?
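
The lines that failed here are not shown in the thread, so the cause cannot be determined from the posts alone. One way to narrow it down for a given log line is to match ever longer prefixes of the field chain and report the first field at which matching stops. The sketch below does that for the improved regex, using Python's re module as a stand-in for Splunk's PCRE engine. The names PIECES, first_failing_field and BROKEN are hypothetical, and the malformed line is invented for the demonstration.

```python
import re

# Field sub-patterns of the improved regex, in order. Joining them with \s+
# reproduces the posted pattern; the named groups are written in Python's
# (?P<name>...) form because the re module rejects the PCRE (?<name>...) form.
PIECES = [
    r'^(?P<date>[^\s]+)',
    r'(?P<time>[^\s]+)',
    r'(?P<time_taken>[^\s]+)',
    r'(?P<c_ip>[^\s]+)',
    r'(?P<cs_username>[^\s]+)',
    r'(?P<cs_auth_group>[^\s]+)',
    r'(?P<x_exception_id>[^\s]+)',
    r'(?P<filter_result>[^\s]+)',
    r'\"(?P<category>[^\"]+)\"',
    r'(?P<http_referrer>[^\s]+)',
    r'(?P<sc_status>[^\s]+)',
    r'(?P<action>[^\s]+)',
    r'(?P<cs_method>[^\s]+)',
    r'(?P<http_content_type>[^\s]+)',
    r'(?P<cs_uri_scheme>[^\s]+)',
    r'(?P<cs_host>[^\s]+)',
    r'(?P<cs_uri_port>[^\s]+)',
    r'(?P<cs_uri_path>[^\s]+)',
    r'(?P<cs_uri_query>[^\s]+)',
    r'(?P<cs_uri_extension>[^\s]+)',
    r'[\"]{0,1}(?P<http_user_agent>[^\"]+)[\"]{0,1}',
    r'(?P<s_ip>[^\s]+)',
    r'(?P<sc_bytes>[^\s]+)',
    r'(?P<cs_bytes>[^\s]+)',
    r'\"?(?P<x_virus_id>[^\"]+)\"?',
    r'\"{0,1}(?P<x_bluecoat_application_name>[^\"]+)\"{0,1}',
    r'\"{0,1}(?P<x_bluecoat_application_operation>[^\"]+)\"{0,1}',
]

# Sample log line from the original question.
SAMPLE = (
    r'2015-11-10 02:00:00 100 xxx.xx.xxx.xxx XYZ abc\ryolo - OBSERVED '
    r'"Search Engines/Portals" -  200 TCP_NC_MISS GET '
    r'text/html;%20charset=ISO-8859-1 http www.google.co.uk 80 / ?.... '
    r'- - 1.2.3.4 57738 1012 - "none" "none"'
)


def first_failing_field(line):
    """Name of the first field at which the line stops matching, else None."""
    for count in range(1, len(PIECES) + 1):
        prefix = r'\s+'.join(PIECES[:count])
        if re.match(prefix, line) is None:
            # Every piece contains exactly one named group; report its name.
            return re.search(r'\(\?P<(\w+)>', PIECES[count - 1]).group(1)
    return None


# Invented malformed line: the category field has lost its quotes.
BROKEN = SAMPLE.replace('"Search Engines/Portals"', 'unquoted')

print(first_failing_field(SAMPLE))  # None
print(first_failing_field(BROKEN))  # category
```

The reported field is the first point at which no assignment of earlier fields can make the line fit, so the real cause may sit at that field or slightly before it. A log format whose field order differs from the one this regex assumes would show up the same way.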

konrads
Explorer

Slightly tweaked to match more weirdness:

^(?<date>[^\s]+)\s+(?<time>[^\s]+)\s+(?<time_taken>[^\s]+)\s+(?<c_ip>[^\s]+)\s+(?<cs_username>[^\s]+)\s+(?<cs_auth_group>[^\s]+)\s+(?<x_exception_id>[^\s]+)\s+(?<filter_result>[^\s]+)\s+\"(?<category>[^\"]+)\"\s+(?<http_referrer>[^\s]+)\s+(?<sc_status>[^\s]+)\s+(?<action>[^\s]+)\s+(?<cs_method>[^\s]+)\s+(?<http_content_type>[^\s]+)\s+(?<cs_uri_scheme>[^\s]+)\s+(?<cs_host>[^\s]+)\s+(?<cs_uri_port>[^\s]+)\s+(?<cs_uri_path>[^\s]+)\s+(?<cs_uri_query>[^\s]+)\s+(?<cs_uri_extension>[^\s]+)\s+[\"]{0,1}(?<http_user_agent>[^\"]+)[\"]{0,1}\s+(?<s_ip>[^\s]+)\s+(?<sc_bytes>[^\s]+)\s+(?<cs_bytes>[^\s]+)\s+\"?(?<x_virus_id>[^\"]+)\"?\s+\"{0,1}(?<x_bluecoat_application_name>[^\"]+)\"{0,1}\s+\"{0,1}(?<x_bluecoat_application_operation>[^\"]+)\"{0,1}

ppablo
Retired

Hi @konrads

Thanks for sharing this tip with the community. Can you actually post the regex as a formal answer in the "Enter your answer here..." box below and accept it to resolve this post? It'll make it easier for other users to find.

Also, you might want to consider formally submitting this issue as a bug here:
http://www.splunk.com/r/bugs
