I have a CSV file (test.csv) that contains malicious domains, and I want to use it to check the Squid logs for anyone who has visited one of the bad sites.
sourcetype=squid [|inputlookup test.csv | rename domain as urihost | fields urihost]
The test.csv line entry format:
So, the above works perfectly if the user visited bad-domain.com, but NOT if they visited www.bad-domain.com.
I tried a basic regex combined with the subsearch from the first query, crazy things like:
sourcetype=squid | regex urihost=".*"[|inputlookup test.csv | rename domain as urihost | fields uri_hosts]
but, as you probably know, that did not work.
So, my question: how can I use regex/rex so that I can list just a high-level domain name in the CSV (bad-domain.com) and get back all hits to any site in that domain (e.g. www.bad-domain.com)?
The easiest way is to add a field extraction to your [squid] sourcetype that extracts just the base domain name from the log line. You already have one that pulls the full URI; it's a matter of using a different regex to get the high-level domain.
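As a rough sketch of that idea done inline at search time (rather than as a saved field extraction), something like the following might work. The field name base_domain is made up for illustration, and the regex simply keeps the last two dot-separated labels of urihost, so it assumes two-label domains like bad-domain.com and would not handle suffixes such as .co.uk correctly. It also assumes the CSV column is named domain, as in the original search.

```
sourcetype=squid
| rex field=urihost "(?<base_domain>[^.]+\.[^.]+)$"
| search [| inputlookup test.csv | rename domain as base_domain | fields base_domain]
```

Here the subsearch expands to base_domain="bad-domain.com" OR ..., so a hit on www.bad-domain.com should match because its extracted base_domain is bad-domain.com. If this works for your data, the same regex could be moved into a field extraction on the [squid] sourcetype.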