All Apps and Add-ons

Blue Coat Field extractor name=custom_client_events is unusually slow

TonyLeeVT
Builder

We are running the Blue Coat ProxySG App for Splunk app (https://splunkbase.splunk.com/app/2815) and associated TA downloaded from the BTO site.

When running on a distributed environment with multiple indexers, we receive the following message:

Field extractor name=custom_client_events is unusually slow (max single event time=1146ms)

The transforms.conf contains:

[custom_client_events]
REGEX = (?<date>[^\s]+)\s+(?<time>[^\s]+)\s+(?<duration>[^\s]+)\s+(?<src_ip>[^\s]+)\s+(?<user>[^\s]+)\s+(?<cs_auth_group>[^\s]+)\s+(?<x_exception_id>[^\s]+)\s+(?<filter_result>[^\s]+)\s+\"(?<category>[^\"]+)\"\s+(?<http_referrer>[^\s]+)\s+(?<status>[^\s]+)\s+(?<action>[^\s]+)\s+(?<http_method>[^\s]+)\s+(?<http_content_type>[^\s]+)\s+(?<cs_uri_scheme>[^\s]+)\s+(?<dest>[^\s]+)\s+(?<uri_port>[^\s]+)\s+(?<uri_path>[^\s]+)\s+(?<uri_query>[^\s]+)\s+(?<uri_extension>[^\s]+)\s+\"(?<http_user_agent>[^\"]+)\"\s+(?<dest_ip>[^\s]+)\s+(?<bytes_in>[^\s]+)\s+(?<bytes_out>[^\s]+)\s+\"*(?<x_virus_id>[^\"]+)\"*\s+\"*(?<x_bluecoat_application_name>[^\"]+)\"*\s+\"*(?<x_bluecoat_application_operation>[^\"]+)

Is it possible to optimize this regex and get rid of the "unusually slow" error message? Thanks

0 Karma
1 Solution

TonyLeeVT
Builder

Figured out a solution since the one provided does not scale to large enterprises. The solution is not to make Splunk adapt, but instead change the way data is sent to it. The Blue Coat app and TA require sending data in the bcreportermain_v1 format--which is an ELFF format. Then the app and TA try to parse this space separated data using the complex regex seen above. Instead of doing that, you can instruct Blue Coat to send the data in a different format such as key value pair--which Splunk likes and natively parses.

Have the Blue Coat admins define a custom log format with the following fields:

Bluecoat|date=$(date)|time=$(time)|duration=$(time-taken)|src_ip=$(c-ip)|user=$(cs-username)|cs_auth_group=$(cs-auth-group)| x_exception_id=$(x-exception-id)|filter_result=$(sc-filter-result)|category=$(cs-categories)|http_referrer=$(cs(Referer))|status=$(sc-status)|action=$(s-action)|http_method=$(cs-method)|http_content_type=$(rs(Content-Type))|cs_uri_scheme=$(cs-uri-scheme)|dest=$(cs-host)| uri_port=$(cs-uri-port)|uri_path=$(cs-uri-path)|uri_query=$(cs-uri-query)|uri_extension=$(cs-uri-extension)|http_user_agent=$(cs(User-Agent))|dest_ip=$(s-ip)|bytes_in=$(sc-bytes)|bytes_out=$(cs-bytes)|x_virus_id=$(x-virus-id)|x_bluecoat_application_name=$(x-bluecoat-application-name)|x_bluecoat_application_operation=$(x-bluecoat-application-operation)|target_ip=$(cs-ip)|proxy_name=$(x-bluecoat-appliance-name)|proxy_ip=$(x-bluecoat-proxy-primary-address)|$(x-bluecoat-special-crlf)

Since this data comes into Splunk as key=value pair now, Splunk parses it natively.

Remove the TAs from the indexer and replace it with a simpler props.conf file of this:

[bluecoat:proxysg:customclient]
SHOULD_LINEMERGE = false

This just turns off line merging which is on by default and makes the parsing even faster. Also remember to rename the props.conf and transforms.conf (ex: .bak files) included in the app if you have it installed on your search head--that contains the same complicated regex which will slow down data ingestion. By the way, by defining your own format, you can add other fields you care about--such as the target IP (cs-ip) which is not included in the default bcreportermain_v1 format for some reason. Hope this helps others than run into this situation.

View solution in original post

0 Karma

themrkeys
New Member

Another solution I've been working on is a bit more efficient than cef like formating and allows for use of an actual syslog server updated TA is published here https://bitbucket.org/SPLServices/splunk_ta_bluecoat_proxysg/

0 Karma

shubham87
Explorer

Custom format defined in this TA does not work.

$(gmttime) $(x-bluecoat-appliance-name) bluecoat[0]: SPLV5 c-ip=$(c-ip) Content-Type=$(quot)$(rs(Content-Type) )$(quot) cs-auth-group=$(cs-auth-group) cs-bytes=$(cs-bytes) cs-categories=$(cs-categories) cs-host=$(cs-host) cs-ip=$(cs-ip) cs-method=$(cs-method) cs-uri-extension=$(cs-uri-extension) cs-uri-path=$(cs-uri-path) cs-uri-port=$(cs-uri-port) cs-uri-query=$(cs-uri-query) cs-uri-scheme=$(cs-uri-scheme) cs-User-Agent=$(quot)$(cs(User-Agent) )$(quot) cs-username=$(cs-username) dnslookup-time=$(dnslookup-time) duration=$(duration) rs-status=$(rs-status) rs-version=$(rs-version) s-action=$(s-action) s-ip=$(s-ip) s-sitename=$(s-sitename) s-supplier-ip=$(s-supplier-ip) s-supplier-name=$(s-supplier-name) sc-bytes=$(sc-bytes) sc-filter-result=$(sc-filter-result) sc-filter-result=$(sc-filter-result) sc-status=$(sc-status) time-taken=$(time-taken) x-bluecoat-appliance-name=$(x-bluecoat-appliance-name) x-bluecoat-appliance-primary-address=$(x-bluecoat-appliance-primary-address) x-bluecoat-application-name=$(x-bluecoat-application-name) x-bluecoat-application-operation=$(x-bluecoat-application-operation) x-bluecoat-proxy-primary-address=$(x-bluecoat-proxy-primary-address) x-bluecoat-transaction-uuid=$(x-bluecoat-transaction-uuid) x-exception-id=$(x-exception-id) x-virus-id=$(x-virus-id) c-url=$(quot)$(url)$(quot) cs-Referer=$(quot)$(cs(Referer) )$(quot)

0 Karma

TonyLeeVT
Builder

Figured out a solution since the one provided does not scale to large enterprises. The solution is not to make Splunk adapt, but instead change the way data is sent to it. The Blue Coat app and TA require sending data in the bcreportermain_v1 format--which is an ELFF format. Then the app and TA try to parse this space separated data using the complex regex seen above. Instead of doing that, you can instruct Blue Coat to send the data in a different format such as key value pair--which Splunk likes and natively parses.

Have the Blue Coat admins define a custom log format with the following fields:

Bluecoat|date=$(date)|time=$(time)|duration=$(time-taken)|src_ip=$(c-ip)|user=$(cs-username)|cs_auth_group=$(cs-auth-group)| x_exception_id=$(x-exception-id)|filter_result=$(sc-filter-result)|category=$(cs-categories)|http_referrer=$(cs(Referer))|status=$(sc-status)|action=$(s-action)|http_method=$(cs-method)|http_content_type=$(rs(Content-Type))|cs_uri_scheme=$(cs-uri-scheme)|dest=$(cs-host)| uri_port=$(cs-uri-port)|uri_path=$(cs-uri-path)|uri_query=$(cs-uri-query)|uri_extension=$(cs-uri-extension)|http_user_agent=$(cs(User-Agent))|dest_ip=$(s-ip)|bytes_in=$(sc-bytes)|bytes_out=$(cs-bytes)|x_virus_id=$(x-virus-id)|x_bluecoat_application_name=$(x-bluecoat-application-name)|x_bluecoat_application_operation=$(x-bluecoat-application-operation)|target_ip=$(cs-ip)|proxy_name=$(x-bluecoat-appliance-name)|proxy_ip=$(x-bluecoat-proxy-primary-address)|$(x-bluecoat-special-crlf)

Since this data comes into Splunk as key=value pair now, Splunk parses it natively.

Remove the TAs from the indexer and replace it with a simpler props.conf file of this:

[bluecoat:proxysg:customclient]
SHOULD_LINEMERGE = false

This just turns off line merging which is on by default and makes the parsing even faster. Also remember to rename the props.conf and transforms.conf (ex: .bak files) included in the app if you have it installed on your search head--that contains the same complicated regex which will slow down data ingestion. By the way, by defining your own format, you can add other fields you care about--such as the target IP (cs-ip) which is not included in the default bcreportermain_v1 format for some reason. Hope this helps others than run into this situation.

0 Karma

bnicolasmia
New Member

Hello,
We are trying to apply your solution but we are facing issues with URL field.
The parsing work pretty well but the Bluecoat keep the special character like "|" in the URL.

Do you know a workaround to force the bluecoat to reformat the URL and remove the special character ?

Regards,
Nicolas

0 Karma
Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...