I have a CSV file (test.csv) that contains malicious domains, and I want to use it to check the Squid logs for anyone who has visited any of the bad sites.
The search:
sourcetype=squid [|inputlookup test.csv | rename domain as uri_host | fields uri_host]
The test.csv line entry format:
domain,category,reference,date,isbad
"bad-domain.com",harmful,"safebrowsing.clients.google.com",20110603,true
So, the above works perfectly if the user visited bad-domain.com, but NOT if they visited www.bad-domain.com.
I tried a basic regex using the first search string as the field, crazy things like:
sourcetype=squid | regex uri_host=".*"[|inputlookup test.csv | rename domain as uri_host | fields uri_hosts]
but as you probably know, that did not work.
So my question is: how can I use regex/rex so that I provide just a high-level domain name in the CSV (bad-domain.com) and get back all hits to any site in that domain (www.bad-domain.com)?
Thanks
The easiest way is to add a field extraction to your [squid] sourcetype that extracts just the base domain name from the log line. You already have one that pulls the full URI; it's a matter of a different regex to get the high-level domain.
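As a rough sketch of that idea, the base domain can also be pulled out inline with rex instead of a configured extraction, which makes it easy to try before touching any config. The field name base_domain is made up for this example, and the regex naively keeps only the last two dot-separated labels of uri_host, so it assumes domains like bad-domain.com and will get multi-part suffixes such as example.co.uk wrong:

```
sourcetype=squid
| rex field=uri_host "(?<base_domain>[^.]+\.[^.]+)$"
| search [| inputlookup test.csv | rename domain as base_domain | fields base_domain]
```

With this, both bad-domain.com and www.bad-domain.com should end up with base_domain=bad-domain.com and match the CSV entry. If it behaves as expected, the same regex could be moved into a search-time field extraction for the squid sourcetype so the subsearch can go directly in the base search, as in the original query.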