BLUF: Is there a good way to search for double TLDs?
I have been trying to find a way to hunt for double TLDs in a firewall index, and I'm looking to improve on some basic searching. If you search tools like PhishTank, you will see that a lot of phishing domains spoof legitimate sites with something along the lines of "mybank.com.thisisabadguy.com", so you think you're navigating to "mybank.com".
I've run some very simple searches to see how things work, such as `index=firewall .com*.com | dedup url`, just to see what results I get. There are lots of false positives, what with Google Ad redirects, Bing search results, etc. I can work with FPs though; I'll create a whitelist CSV for that. Is there a way to get Splunk to recognize double TLDs, so that instead of having to search individually for .com*.com, .org*.org, .ru*.ru, etc., I can search for .tld*.tld?
It's a bit ugly, but this should work. It compares the last label of the URL against up to three labels before it.
You could experiment with the `map` command to make it check deeper levels.
| makeresults
| eval myurl="one",url="fake.com.com"
| append [| makeresults | eval myurl="two",url="google.com.some.com"]
| append [| makeresults | eval myurl="three",url="google.com"]
| append [| makeresults | eval myurl="four",url="msn.org.org"]
| append [| makeresults | eval myurl="five",url="genuine.gen2.com"]
| table myurl,url
| rex field=url "(?<tld1>[^\.]+$)"
| rex field=url "(?<tld2>[^\.]+)\.[^\.]+$"
| rex field=url "(?<tld3>[^\.]+)\.[^\.]+\.[^\.]+$"
| rex field=url "(?<tld4>[^\.]+)\.[^\.]+\.[^\.]+\.[^\.]+$"
| eval double_tld=case(tld1=tld2, "found_tld2", tld1=tld3,"found_tld3", tld1=tld4, "found_tld4")
| eval double_tld_status=if(len(double_tld)>0,"double_tld_found","not_found")
| fields - tld*
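
If you want to sanity-check the same idea outside Splunk (for example while building the whitelist CSV), here is a minimal Python sketch of the logic. It is not SPL and not part of the search above; the function names `hostname_of` and `repeated_tld` are made up for illustration. It assumes the field may hold either a bare hostname or a full URL, and it compares the last label against every earlier label rather than stopping at four.

```python
import re
from urllib.parse import urlsplit

# Matches a leading scheme such as "http://" or "https://".
_SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def hostname_of(url: str) -> str:
    """Return the lowercased hostname for a bare host or a full URL."""
    # urlsplit only recognizes the host when a "//" prefix is present,
    # so add one for bare values such as "google.com/path".
    candidate = url if _SCHEME.match(url) else "//" + url
    return (urlsplit(candidate).hostname or "").lower()


def repeated_tld(url: str):
    """Return the last label if it also appears earlier in the hostname, else None."""
    labels = [label for label in hostname_of(url).split(".") if label]
    if len(labels) < 2:
        return None
    tld, earlier = labels[-1], labels[:-1]
    return tld if tld in earlier else None


# Same sample values as the makeresults rows above, plus one full URL.
samples = [
    "fake.com.com",
    "google.com.some.com",
    "google.com",
    "msn.org.org",
    "genuine.gen2.com",
    "http://mybank.com.thisisabadguy.com/login",
]

for url in samples:
    status = "double_tld_found" if repeated_tld(url) else "not_found"
    print(f"{url}: {status}")
```

Like the SPL version, this only looks for a repeat of the final label, so it will not catch a spoof such as "mybank.com.badguy.ru" where the embedded TLD differs from the real one.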