Transform for Cisco IronPort Web Security Appliance logs in squid_detail format

ashabc
Contributor

I have installed the Splunk app for Cisco IronPort Web Security Appliance (WSA). Everything seems to be working, and the app uses the sourcetype cisco_wsa_squid.

However, I have a problem. The Splunk app for WSA supports only the squid log format, and I have a large number of historical logs that were collected in squid_detail format instead. If I import a squid_detail log into the app, the fields are not extracted correctly, which makes the data useless.

I hope someone can help me tweak the transform so that I can import the historical squid_detail logs into the Cisco IronPort WSA app. Below are the headers for the squid and squid_detail formats, a sample line of each, and the relevant contents of props.conf and transforms.conf. My objective is to create a sourcetype called cisco_wsa_squid_detail and use it for the historical logs within the app. I need help creating the correct transform.


log format squid

header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%.

sample data

1381962068.488 538 10.71.66.56 TCP_CLIENT_REFRESH_MISS/200 287 POST hxxp://202.7.177.46/idle/K2emdz02xSLyCk3Z/81 "WILDFIRE\davidm2@RFS_NTLM" DIRECT/202.7.177.46 text/plain DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <IW_srch,-5.9,0,"-",0,0,0,1,"-",-,-,-,"-",1,-,"-","-",-,-,IW_srch,-,"-","trojan","Flash Video","Media","-","-",4.27,0,-,"-","-"> -

log format squid_detail

header
#Fields: %t %e %a %w/%h %s %2r %A %H/%d %c CMF:%M DCF:%j ERR:%E %D %Xr %?BLOCK_SUSPECT_USER_AGENT,MONITOR_SUSPECT_USER_AGENT?%<User-Agent:%!%-%. %u,%N

sample data

1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-","Unknown","Unknown","-","-",0.90,0,-,"-","-"> - "Mozilla/4.0 (compatible)",216.115.208.230

relevant contents of props.conf

################
# Squid Format #
################

[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-http_method = cs_method AS http_method 
FIELDALIAS-user = cs_username AS user
FIELDALIAS-http_content_type  = cs_mime_type AS http_content_type
FIELDALIAS-dest = s_hostname AS dest
FIELDALIAS-src = c_ip AS src
FIELDALIAS-status = sc_http_status AS status
FIELDALIAS-action = sc_result_code AS action
FIELDALIAS-bytes_in = sc_bytes AS bytes_in

relevant contents of transforms.conf

##################################
# Regex to read WSA squid format #
##################################

[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16
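
Comparing the two headers, squid_detail differs from squid only by three extra fields (CMF:, DCF:, ERR:) before the ACL tag and by the trailing %u,%N. The sketch below is one way to check, outside Splunk, whether the app's REGEX copes with both formats once an optional non-capturing group skips those three fields. It uses Python's re module rather than Splunk's PCRE engine, which should behave the same for the simple constructs used here. The names DETAIL_REGEX, FIELDS and extract are made up for illustration, and the result is untested against the app itself.

```python
import re

# The app's original REGEX plus one assumed addition: an optional
# non-capturing group that skips "CMF:.. DCF:.. ERR:.. " when present.
DETAIL_REGEX = re.compile(
    r'^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) '
    r'([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) '
    r'(?:CMF:[^ ]* DCF:[^ ]* ERR:[^ ]* )?'
    r'([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+)'
    r',[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)'
)

# Field names in the order given by the FORMAT line ($1 .. $16).
FIELDS = [
    "end_time", "c_ip", "sc_result_code", "sc_http_status", "sc_bytes",
    "cs_method", "cs_url", "cs_username", "s_hierarchy", "s_hostname",
    "cs_mime_type", "x_acltag", "x_webcat_code_abbr", "x_wbrs_score",
    "x_webroot_threat_name", "x_mcafee_virus_name",
]

# The two sample lines from this post.
SQUID_SAMPLE = r'1381962068.488 538 10.71.66.56 TCP_CLIENT_REFRESH_MISS/200 287 POST hxxp://202.7.177.46/idle/K2emdz02xSLyCk3Z/81 "WILDFIRE\davidm2@RFS_NTLM" DIRECT/202.7.177.46 text/plain DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <IW_srch,-5.9,0,"-",0,0,0,1,"-",-,-,-,"-",1,-,"-","-",-,-,IW_srch,-,"-","trojan","Flash Video","Media","-","-",4.27,0,-,"-","-"> -'
DETAIL_SAMPLE = r'1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-","Unknown","Unknown","-","-",0.90,0,-,"-","-"> - "Mozilla/4.0 (compatible)",216.115.208.230'


def extract(line):
    """Return a dict of field name -> value, or None if the line does not match."""
    match = DETAIL_REGEX.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None


for sample in (SQUID_SAMPLE, DETAIL_SAMPLE):
    fields = extract(sample)
    print(fields["sc_result_code"], fields["x_acltag"], fields["x_webcat_code_abbr"])
```

If the pattern holds up on real data, it could serve as the REGEX line of a new transforms.conf stanza (a hypothetical kv_for_cisco_wsa_squid_detail) referenced from a [cisco_wsa_squid_detail] stanza in props.conf. The FORMAT line would stay unchanged, because the added group is non-capturing and does not shift the $1 to $16 numbering.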

ashabc
Contributor

I ended up writing a VBScript that converts squid_detail logs to squid format. I am posting it here in the hope that it helps someone in the future.


Function ereg_replace(strOriginalString, strPattern, strReplacement, varIgnoreCase)
    ' Replaces every match of strPattern in strOriginalString with strReplacement.
    ' varIgnoreCase must be True (case-insensitive match) or False (case-sensitive match).
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    End With
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

Const ForReading = 1

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set folder = objFSO.GetFolder("D:\Ash\Download\splunk\data\squiddetail\")

For Each file In folder.Files

    Set testfile = objFSO.OpenTextFile(file.Path, ForReading)
    Set outfile = objFSO.CreateTextFile("D:\Ash\Download\splunk\data\modified\" & file.Name)

    Do While Not testfile.AtEndOfStream
        line = testfile.ReadLine
        ' Drop the three squid_detail-only fields (CMF:, DCF:, ERR:).
        line1 = ereg_replace(line, "CMF:[^ ]* [^ ]* [^ ]* ", "", False)
        ' Drop everything after the "> -" that ends a squid-format line.
        line2 = ereg_replace(line1, "> -.*", "> -", False)
        outfile.WriteLine(line2)
    Loop

    ' Close both files before moving on to the next input file.
    testfile.Close
    outfile.Close

Next
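
For anyone not on Windows, the same two substitutions can be sketched in Python. This is only a rough equivalent of the script above, not a tested replacement for it. The function names detail_to_squid and convert_folder are made up for illustration, and the folder arguments are whatever source and destination directories you choose.

```python
import re
from pathlib import Path

# Same two patterns as the VBScript above.
CMF_FIELDS = re.compile(r"CMF:[^ ]* [^ ]* [^ ]* ")  # the CMF:, DCF:, ERR: fields
TRAILER = re.compile(r"> -.*")                      # everything after "> -"


def detail_to_squid(line):
    """Convert one squid_detail line (no trailing newline) to squid format."""
    line = CMF_FIELDS.sub("", line)
    return TRAILER.sub("> -", line)


def convert_folder(src_dir, dst_dir):
    """Convert every file in src_dir and write the result to dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for src in Path(src_dir).iterdir():
        if not src.is_file():
            continue
        with src.open(encoding="utf-8", errors="replace") as fin, \
                (dst / src.name).open("w", encoding="utf-8") as fout:
            for line in fin:
                fout.write(detail_to_squid(line.rstrip("\n")) + "\n")


# The squid_detail sample line from the question.
DETAIL_SAMPLE = r'1381840464.285 363273 10.72.4.25 TCP_MISS/200 41070 CONNECT tunnel://216.115.208.230:443/ "WILDFIRE\warwickh@RFS_NTLM" DIRECT/216.115.208.230 application/octet-stream CMF:40 DCF:20 ERR:0 DEFAULT_CASE_11-Internet_Access-RFS_AD-NONE-NONE-NONE-DefaultGroup <nc,-3.5,1,"-",-,-,-,1,"-",-,-,-,"-",1,-,"-","-",-,-,nc,-,"-","-","Unknown","Unknown","-","-",0.90,0,-,"-","-"> - "Mozilla/4.0 (compatible)",216.115.208.230'

print(detail_to_squid(DETAIL_SAMPLE))
```

Calling convert_folder with the input and output directories would then process a whole folder of logs, one output file per input file.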



alacercogitatus
SplunkTrust

Before I work through all the extractions you posted, I'll say this: if your squid_detail sourcetype is extracting fields correctly (meaning a search for sourcetype=squid_detail returns fields that match those of the cisco_wsa_squid sourcetype), then you can go into the application folder and add this to Splunk_CiscoIronportWebSecurity/local/eventtypes.conf:

[ironport_proxy]
search = sourcetype="cisco_wsa_*" OR sourcetype=squid_detail

This will override the app defaults and allow your second source to be used with minimal work.


alacercogitatus
SplunkTrust

Why not just import the squid_detail log as sourcetype=squid_detail? Then make the edit I listed for eventtypes.


ashabc
Contributor

Thank you, alacercogitatus, for taking the time to respond to my post.

Field extraction with the squid sourcetype works only for squid-format logs, not for squid_detail logs. I tried importing a squid_detail log using the squid sourcetype, and the field extractions do not work.

I don't even mind discarding the additional information in the squid_detail log during extraction, as long as I get all the fields that are specified in the squid sourcetype.

By the way, if you are playing with the sample data: I had to replace http with hxxp in it, as the Splunk website would not let me post otherwise.


ashabc
Contributor

transforms.conf

[kv_for_cisco_wsa_squid]
REGEX = ^([0-9.]*) *[0-9]* ([0-9.]*) ([A-Z_]*)/([0-9]*) ([0-9]*) ([A-Z]*) ([^ ]*) ([^ ]*) ([^/]*)/([^ ]*) ([^ ]*) ([^ ]+) <([^,]+),([^,]+),[^,]+,([^,]+),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,([^,>]+)
FORMAT = end_time::$1 c_ip::$2 sc_result_code::$3 sc_http_status::$4 sc_bytes::$5 cs_method::$6 cs_url::$7 cs_username::$8 s_hierarchy::$9 s_hostname::$10 cs_mime_type::$11 x_acltag::$12 x_webcat_code_abbr::$13 x_wbrs_score::$14 x_webroot_threat_name::$15 x_mcafee_virus_name::$16


ashabc
Contributor

props.conf

[cisco_wsa_squid]
KV_MODE = none
SHOULD_LINEMERGE = True
MAX_TIMESTAMP_LOOKAHEAD=19
REPORT-extract = kv_for_cisco_wsa_squid
REPORT-x_webroot_threat_name_as_signature = x_webroot_threat_name_as_signature
REPORT-x_mcafee_virus_name_as_signature = x_mcafee_virus_name_as_signature
lookup_table = cat_lookup x_webcat_code_abbr
EXTRACT-cs_username = "(?P<cs_username>[^-@]*)@
FIELDALIAS-srcip = c_ip AS src_ip
LOOKUP-vendor_info_for_cisco_wsa = cisco_wsa_vendor_info_lookup sourcetype OUTPUT vendor,product,ids_type
FIELDALIAS-url = cs_url AS url
FIELDALIAS-user = cs_username AS user
