I am trying to search through logs for unusual domains generated by DGAs (domain generation algorithms). I want to use a regex to find domain names that are 7-12 alphanumeric characters long, followed by a specific TLD.
For example, abc1djdfkf.xyz
I tried the following regex pattern, but did not get the desired results.
rex field=URL "(?\w{7,12}.(XYZ))$"
So... you're looking for seven to twelve alphanumeric characters where at least one is a digit and at least one is a letter?
I'll be lazy and cheat:
| rex field=URL "(?<url_dga>(?=\w*\d)(?=\w*[a-zA-Z])\w{7,12}\.xyz)"
| regex URL="(?=\w*\d)(?=\w*[a-zA-Z])\w{7,12}\.xyz"
Note 1: I've added regex, in case you're trying to filter and not extract a field.
Note 2: djdhdjahdja.xyz is technically alphanumeric 😛
Note 3: To add more laziness, take a look at https://splunkbase.splunk.com/app/3435 - one of its examples targets algorithmically generated domains.
I've added back the rex command to extract fields rather than searching by regex.
Well, it does match the example you gave that should match, and doesn't match the example you gave that shouldn't match.
Are you trying to extract a new field (| rex) or filter results (| regex)?
Martin- I am looking to extract the field.
Martin- Thanks, but the query you mentioned is not providing the desired results. For example, the results include abc.zybdkdke12.xyz and www.dahdha2ddalk.xyz, when I am only interested in the main domain itself (zybdkdke12.xyz and dahdha2ddalk.xyz).
Try this
...| rex field=URL "(?<domain>\w{7,12}\.xyz)"
somesoni2- The DGA I am observing generates domains that mix letters and digits, so I want my regex to match only domains that contain both. For example, I want to get a hit on ababdbdb233.xyz and not on djdhdjahdja.xyz.
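One way to combine the two requirements from this thread (a label with both a letter and a digit, and no leading subdomain in the extracted value) might look like the sketch below. It rests on a few assumptions that are not stated in the original posts: the URL field holds a bare hostname with no scheme, port, or path (hence the $ anchor), the field name domain is just an illustrative choice, and underscores should be excluded (\w would allow them, so the character class is spelled out as [a-z0-9] with the (?i) flag). The makeresults lines only build test data from the hostnames mentioned above.

```
| makeresults
| eval URL=split("abc.zybdkdke12.xyz,www.dahdha2ddalk.xyz,djdhdjahdja.xyz,ababdbdb233.xyz", ",")
| mvexpand URL
| rex field=URL "(?i)(?:^|\.)(?<domain>(?=[a-z0-9]*\d)(?=[a-z0-9]*[a-z])[a-z0-9]{7,12}\.xyz)$"
| where isnotnull(domain)
```

The (?:^|\.) prefix forces the match to begin at the start of a label, so a subdomain such as abc. or www. stays outside the captured group, and a label longer than 12 characters cannot match on its tail. The two lookaheads require at least one digit and at least one letter within that label, so djdhdjahdja.xyz should be dropped while the other three hostnames should yield zybdkdke12.xyz, dahdha2ddalk.xyz, and ababdbdb233.xyz. If the real URL field carries a path or port, the $ anchor would need to be relaxed.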