
Transform for Cisco IronPort Web Security Appliance logs in squid_detail format

ashabc
Contributor

I have installed the Splunk app for Cisco IronPort Web Security Appliance (WSA). Everything seems to be working. It uses the sourcetype cisco_wsa_squid.

However, the app supports only the squid log format, and I have a large volume of historical logs that were collected in squid_detail format instead. If I import a squid_detail log into the app, the fields are not extracted correctly, which makes the data useless.

I hope someone can help me tweak the transforms so that I can import the historical squid_detail logs into the app. Below are the headers for the squid and squid_detail formats, a sample line of each, and the relevant contents of props.conf and transforms.conf. My goal is to create a sourcetype called cisco_wsa_squid_detail and use it for the historical logs within the app. I need help creating the correct transform.


log format squid

header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%.

sample data

1381962068.488 538 10.71.66.56 TCP_CLIENT_REFRESH_MISS/200 287 POST hxxp://202.7.177.46/idle/K2emdz02xSLyCk3Z/81 "WILDFIRE\davidm2@RFS_NTLM" DIRECT/202.7.177.46 text/plain DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <IW_srch,-5.9,0,"-",0,0,0,1,"-",-,-,-,"-",1,-,"-","-",-,-,IW_srch,-,"-","trojan","Flash Video","Media","-","-",4.27,0,-,"-","-"> -

log format squid_detail

header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c CMF:%M DCF:%j ERR:%E %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%. %u,%N

sample data

1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-","Unknown","Unknown","-","-",0.90,0,-,"-","-"> - "Mozilla/4.0 (compatible)",216.115.208.230

relevant contents of props.conf

################
# Squid Format #
################

[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-http_method = cs_method AS http_method 
FIELDALIAS-user = cs_username AS user
FIELDALIAS-http_content_type  = cs_mime_type AS http_content_type
FIELDALIAS-dest = s_hostname AS dest
FIELDALIAS-src = c_ip AS src
FIELDALIAS-status = sc_http_status AS status
FIELDALIAS-action = sc_result_code AS action
FIELDALIAS-bytes_in = sc_bytes AS bytes_in

relevant contents of transforms.conf

##################################
# Regex to read WSA squid format #
##################################

[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16
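
One way to approach this, offered as a sketch rather than anything shipped with the app: the squid_detail header differs from the squid header only by the three columns CMF:%M DCF:%j ERR:%E after the MIME type and by the trailing %u,%N. The existing regex could therefore be extended with one optional non-capturing group that skips those three columns. The Python below only checks that idea against the two sample lines above. The names DETAIL_REGEX, FIELDS and extract are hypothetical, and it assumes Python's re module handles this pattern the same way Splunk's regex engine does, which should be confirmed in Splunk itself.

```python
import re

# Field names copied from the FORMAT line of kv_for_cisco_wsa_squid, in group order.
FIELDS = [
    "end_time", "c_ip", "sc_result_code", "sc_http_status", "sc_bytes",
    "cs_method", "cs_url", "cs_username", "s_hierarchy", "s_hostname",
    "cs_mime_type", "x_acltag", "x_webcat_code_abbr", "x_wbrs_score",
    "x_webroot_threat_name", "x_mcafee_virus_name",
]

# Assumption: the app's regex plus one optional non-capturing group that skips
# "CMF:.. DCF:.. ERR:.." when present. Capture group numbers stay unchanged.
DETAIL_REGEX = re.compile(
    r'^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) '
    r'([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*)'
    r'(?: CMF:[^ ]* DCF:[^ ]* ERR:[^ ]*)?'
    r' ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),'
    r'[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)'
)

# The two sample lines from this post (squid, then squid_detail).
SQUID_LINE = (
    r'1381962068.488 538 10.71.66.56 TCP_CLIENT_REFRESH_MISS/200 287 POST '
    r'hxxp://202.7.177.46/idle/K2emdz02xSLyCk3Z/81 "WILDFIRE\davidm2@RFS_NTLM" '
    r'DIRECT/202.7.177.46 text/plain '
    r'DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup '
    r'<IW_srch,-5.9,0,"-",0,0,0,1,"-",-,-,-,"-",1,-,"-","-",-,-,IW_srch,-,"-",'
    r'"trojan","Flash Video","Media","-","-",4.27,0,-,"-","-"> -'
)
DETAIL_LINE = (
    r'1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT '
    r'tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" '
    r'DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 '
    r'DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup '
    r'<nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-",'
    r'"Unknown","Unknown","-","-",0.90,0,-,"-","-"> - '
    r'"Mozilla/4.0 (compatible)",216.115.208.230'
)


def extract(line):
    """Return a field dict for a squid or squid_detail line, or None if no match."""
    match = DETAIL_REGEX.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None


if __name__ == "__main__":
    for sample in (SQUID_LINE, DETAIL_LINE):
        fields = extract(sample)
        print(fields["c_ip"], fields["cs_mime_type"], fields["x_acltag"])
```

If the pattern holds up on real data, the same regex string could serve as the REGEX of a new transforms.conf stanza referenced from a [cisco_wsa_squid_detail] stanza in props.conf. The FORMAT line could stay as it is, because a non-capturing group does not shift the $1 to $16 numbering.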
1 Solution

ashabc
Contributor

I ended up writing a VBScript that converts squid_detail logs to squid format. I am posting it here in the hope that it helps someone in the future.


' Replaces every match of strPattern in strOriginalString with strReplacement.
' varIgnoreCase must be True (case-insensitive match) or False (case-sensitive match).
Function ereg_replace(strOriginalString, strPattern, strReplacement, varIgnoreCase)
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    End With
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

Const ForReading = 1

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set folder = objFSO.GetFolder("D:\Ash\Download\splunk\data\squiddetail\")

For Each file In folder.Files

    Set testfile = objFSO.OpenTextFile(file.Path, ForReading)
    Set outfile = objFSO.CreateTextFile("D:\Ash\Download\splunk\data\modified\" & file.Name)

    Do While Not testfile.AtEndOfStream
        line = testfile.ReadLine
        ' Drop the three squid_detail-only columns (CMF:%M DCF:%j ERR:%E).
        line1 = ereg_replace(line, "CMF:[^ ]* [^ ]* [^ ]* ", "", False)
        ' Drop everything after "> -" (the trailing %u,%N fields).
        line2 = ereg_replace(line1, "> -.*", "> -", False)
        outfile.WriteLine(line2)
    Loop

    testfile.Close
    ' Close each output file inside the loop, not only the last one.
    outfile.Close

Next
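
For anyone without a Windows script host, the same two substitutions can be sketched in Python. This is an illustrative port, not part of the original script: the function names to_squid and convert_folder, the UTF-8 encoding, and the directory arguments are all assumptions.

```python
import re
from pathlib import Path


def to_squid(line):
    """Strip the squid_detail-only columns so the line matches the squid layout."""
    # Remove "CMF:.. DCF:.. ERR:.. " (three space-separated tokens starting at CMF:).
    line = re.sub(r'CMF:[^ ]* [^ ]* [^ ]* ', '', line)
    # Cut everything after "> -", i.e. the trailing %u,%N values.
    return re.sub(r'> -.*', '> -', line)


def convert_folder(src_dir, dst_dir):
    """Write a converted copy of every file in src_dir into dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for src_file in Path(src_dir).iterdir():
        if not src_file.is_file():
            continue
        # Assumption: the logs are readable as UTF-8; undecodable bytes are replaced.
        with src_file.open(encoding="utf-8", errors="replace") as infile, \
             (dst / src_file.name).open("w", encoding="utf-8") as outfile:
            for line in infile:
                outfile.write(to_squid(line.rstrip("\n")) + "\n")
```

Calling convert_folder with the source and destination directories would mirror the loop in the script above, one output file per input file.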



alacercogitatus
SplunkTrust

Before I run down all of the extractions you posted, I'll say this: if your squid_detail sourcetype is extracting fields correctly (meaning that a search for sourcetype=squid_detail returns fields that match those of the cisco_wsa_squid sourcetype), then you can go into the application folder and add this to Splunk_CiscoIronportWebSecurity/local/eventtypes.conf:

[ironport_proxy]
search = sourcetype="cisco_wsa_*" OR sourcetype=squid_detail

This will override the app defaults and allow your second source to be used with minimal work.


alacercogitatus
SplunkTrust

Why not just import the squid_detail log as sourcetype=squid_detail? Then make the edit I listed for eventtypes.


ashabc
Contributor

Thank you, alacercogitatus, for taking the time to respond to my post.

Field extraction with the squid sourcetype works only for squid-format logs, not for squid_detail logs. I tried importing a squid_detail log with the squid sourcetype, and the field extractions did not work.

I don't even mind discarding the additional information in the squid_detail log during extraction, as long as I get all the fields that the squid sourcetype specifies.

By the way, if you are playing with the sample data: I had to change http to hxxp because the Splunk website would not let me post otherwise.


ashabc
Contributor

transforms.conf

[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16


ashabc
Contributor

props.conf

[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-user = cs_username AS user
