I have installed the Splunk app for Cisco IronPort Web Security Appliance (WSA), and it seems to be working fine. It uses the sourcetype cisco_wsa_squid.
However, the app supports only the squid log format, and I have a large number of historical logs that were collected in squid_detail format instead. If I import a squid_detail log into the app, the fields are not extracted correctly, which makes the data useless.
I hope someone can help me tweak the transforms file so that I can import the historical squid_detail logs into the app. Below I have included the header for the squid and squid_detail log formats, a sample line of each, and the relevant contents of props.conf and transforms.conf. My objective is to create a sourcetype called cisco_wsa_squid_detail and use it for the historical logs within the app. I need help creating the correct transform.
log format squid
header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%.
sample data
1381962068.488 538 10.71.66.56 TCP_CLIENT_REFRESH_MISS/200 287 POST hxxp://202.7.177.46/idle/K2emdz02xSLyCk3Z/81 "WILDFIRE\davidm2@RFS_NTLM" DIRECT/202.7.177.46 text/plain DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <IW_srch,-5.9,0,"-",0,0,0,1,"-",-,-,-,"-",1,-,"-","-",-,-,IW_srch,-,"-","trojan","Flash Video","Media","-","-",4.27,0,-,"-","-"> -
log format squid_detail
header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c CMF:%M DCF:%j ERR:%E %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%. %u,%N
sample data
1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-","Unknown","Unknown","-","-",0.90,0,-,"-","-"> - "Mozilla/4.0 (compatible)",216.115.208.230
relevant contents of props.conf
################
# Squid Format #
################
[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-http_method = cs_method AS http_method
FIELDALIAS-user = cs_username AS user
FIELDALIAS-http_content_type = cs_mime_type AS http_content_type
FIELDALIAS-dest = s_hostname AS dest
FIELDALIAS-src = c_ip AS src
FIELDALIAS-status = sc_http_status AS status
FIELDALIAS-action = sc_result_code AS action
FIELDALIAS-bytes_in = sc_bytes AS bytes_in
relevant contents of transforms.conf
##################################
# Regex to read WSA squid format #
##################################
[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16
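One possible starting point for the transform, as a sketch rather than a tested Splunk configuration: the squid_detail header differs from the squid header by the three tokens CMF:%M DCF:%j ERR:%E between the MIME type and the ACL tag, plus trailing fields after the closing ">". The Python snippet below checks that idea against the sample squid_detail line by inserting a non-capturing "CMF:[^ ]* DCF:[^ ]* ERR:[^ ]* " into the kv_for_cisco_wsa_squid regex. It assumes every squid_detail line carries exactly those three tokens in that order; the names HEAD, TAIL, EXTRA and extract are illustrative only, and Python's re module stands in for Splunk's PCRE engine.

```python
import re

# Front part of the app's kv_for_cisco_wsa_squid REGEX, up to the MIME type.
HEAD = (r'^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) '
        r'([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ')
# Tail: ACL tag, then the selected values from the <...> block.
TAIL = r'([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+)' + r',[^,]+' * 8 + r',([^,>]+)'
# Assumption: squid_detail always has exactly these three extra tokens here.
EXTRA = r'CMF:[^ ]* DCF:[^ ]* ERR:[^ ]* '

SQUID = re.compile(HEAD + TAIL)                 # same pattern as the app's regex
SQUID_DETAIL = re.compile(HEAD + EXTRA + TAIL)  # candidate for squid_detail

# Field names in the order used by the FORMAT line.
FIELDS = ('end_time c_ip sc_result_code sc_http_status sc_bytes cs_method '
          'cs_url cs_username s_hierarchy s_hostname cs_mime_type x_acltag '
          'x_webcat_code_abbr x_wbrs_score x_webroot_threat_name '
          'x_mcafee_virus_name').split()

# The squid_detail sample line from the post.
DETAIL_LINE = (
    r'1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT '
    r'tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" '
    r'DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 '
    r'DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup '
    r'<nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-",'
    r'"Unknown","Unknown","-","-",0.90,0,-,"-","-"> - '
    r'"Mozilla/4.0 (compatible)",216.115.208.230'
)


def extract(line):
    """Return a field dict for a squid_detail line, or None if it does not match."""
    match = SQUID_DETAIL.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None


if __name__ == '__main__':
    print('squid regex matches detail line:', SQUID.match(DETAIL_LINE) is not None)
    for name, value in extract(DETAIL_LINE).items():
        print(name, '=', value)
```

If the pattern holds on real data, the same REGEX (HEAD + EXTRA + TAIL written out in full) with the unchanged FORMAT line could go into a new transforms.conf stanza, for example a hypothetical [kv_for_cisco_wsa_squid_detail], referenced from a [cisco_wsa_squid_detail] stanza in props.conf that otherwise copies [cisco_wsa_squid]. That stanza name is an assumption, not something the app ships.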
I ended up writing a VBScript that converts squid_detail logs to squid format. I am posting it here in the hope that it helps someone in the future.
' Replaces every match of strPattern in strOriginalString with strReplacement.
' varIgnoreCase must be True (match is case insensitive) or False (match is case sensitive).
Function ereg_replace(strOriginalString, strPattern, strReplacement, varIgnoreCase)
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    End With
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

Const ForReading = 1
Const ForWriting = 2
Const ForAppending = 8
Const TriStateTrue = -1

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set folder = objFSO.GetFolder("D:\Ash\Download\splunk\data\squiddetail\")
For Each file In folder.Files
    Set testfile = objFSO.OpenTextFile(file.Path, ForReading)
    Set outfile = objFSO.CreateTextFile("D:\Ash\Download\splunk\data\modified\" & file.Name)
    Do While Not testfile.AtEndOfStream
        line = testfile.ReadLine
        ' Drop the three squid_detail-only tokens (CMF:, DCF:, ERR:).
        line1 = ereg_replace(line, "CMF:[^ ]* [^ ]* [^ ]* ", "", False)
        ' Drop everything after the "> -" that ends a squid-format line.
        line2 = ereg_replace(line1, "> -.*", "> -", False)
        outfile.WriteLine line2
    Loop
    testfile.Close
    ' Close each output file inside the loop, not only the last one.
    outfile.Close
Next
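For anyone without a Windows scripting host, the same two substitutions can be sketched in Python. This is only an illustration of the approach in the script above, applied to the sample squid_detail line from the question; the function name squid_detail_to_squid is made up for this example, and file handling is left out.

```python
import re


def squid_detail_to_squid(line):
    """Strip the squid_detail-only parts so the line looks like squid format."""
    # Drop the three extra tokens (CMF:, DCF:, ERR:) and their trailing spaces.
    line = re.sub(r'CMF:[^ ]* [^ ]* [^ ]* ', '', line)
    # Drop everything after the "> -" that ends a squid-format line.
    return re.sub(r'> -.*', '> -', line)


# The squid_detail sample line from the question.
SAMPLE = (
    r'1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT '
    r'tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" '
    r'DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 '
    r'DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup '
    r'<nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-",'
    r'"Unknown","Unknown","-","-",0.90,0,-,"-","-"> - '
    r'"Mozilla/4.0 (compatible)",216.115.208.230'
)

converted = squid_detail_to_squid(SAMPLE)

if __name__ == '__main__':
    print(converted)
```

Like the VBScript, this relies on "> -" appearing only once per line, at the end of the bracketed block, so a URL or user name containing that sequence would be cut short.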
Before I run down all of the extractions you posted, I'll say this: if your squid_detail sourcetype is extracting fields correctly (meaning that a search for sourcetype=squid_detail returns fields that match those of the cisco_wsa_squid sourcetype), then you can go into the application folder and add this to Splunk_CiscoIronportWebSecurity/local/eventtypes.conf:
[ironport_proxy]
search = sourcetype="cisco_wsa_*" OR sourcetype=squid_detail
This will override the app defaults and allow your second source to be used with minimal work.
Why not just import the squid_detail log as sourcetype=squid_detail? Then make the edit I listed for eventtypes.
Thank you, alacercogitatus, for taking the time to respond to my post.
Field extraction using the squid sourcetype works for squid logs only, not for squid_detail logs. I tried importing a squid_detail log using the squid sourcetype, and the field extractions do not work.
I don't even mind discarding the additional information in the squid_detail log during extraction, as long as I get all the fields that are specified in the squid sourcetype.
By the way, if you are playing with the sample data, note that I had to replace http with hxxp because the Splunk website would not let me post otherwise.
transforms.conf
[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16
props.conf
[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-user = cs_username AS user