I want to extract the top-level domain from the CN field of a certificate in Splunk. The CN may have multiple levels, and I only want the last "something dot something" (e.g. mysite.com), not the full CN (e.g. server.content.mysite.com). Any suggestions?
You can anchor your regex to the end of the line using $. Match a literal period, then capture one or more non-period characters, a literal period, and one or more non-period characters, all anchored to the end of the line. See below (the capture group name "domain" is only a placeholder; use whatever field name you want):
\.(?<domain>[^.]+\.[^.]+)$
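If you want to sanity-check the end-anchored pattern outside Splunk, here is a minimal Python sketch. It assumes the periods are meant as escaped literals and that the capture group is named; the group name "domain", the helper last_two_labels, and the sample CNs are made up for illustration. Python writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# A literal dot, then the last two dot-separated labels, anchored to the
# end of the string. "domain" is a placeholder group name (an assumption).
LAST_TWO = re.compile(r"\.(?P<domain>[^.]+\.[^.]+)$")


def last_two_labels(cn):
    """Return the last two labels of cn, or None if the pattern does not match."""
    match = LAST_TWO.search(cn)
    return match.group("domain") if match else None


print(last_two_labels("server.content.mysite.com"))  # mysite.com
print(last_two_labels("mysite.com"))  # None: the leading literal dot has nothing to match
```

Because the pattern starts with a literal period, a CN that is already just two labels does not match, so you may want to handle that case separately.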
While that answers your question, the better solution I would suggest is the URL Toolbox app (https://splunkbase.splunk.com/app/2734/). Determining the TLD and other subcomponents of URLs, or computing stats around them, can get pretty complicated, and this app provides a lot of tools for it so you don't have to reinvent the wheel. The download includes a README that explains how the app works and which parts are most relevant to you.
Thank you for the quick response.
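This will match the last two elements of a CN. By default rex runs against _raw, so add field=<your CN field> if you want to apply it to the CN field only: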
| rex "^.*?(?<theField>[^.]+\.\w+)$"
See regex101: https://regex101.com/r/ev3p1K/2
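As a rough cross-check of the rex pattern above, the same expression can be run in Python. This is only a sketch: the helper name extract and the sample CNs are assumptions, and the named group is written in Python's (?P<name>...) syntax.

```python
import re

# Python port of the rex pattern: a lazy prefix, then the last two
# dot-separated elements captured into theField.
PATTERN = re.compile(r"^.*?(?P<theField>[^.]+\.\w+)$")


def extract(cn):
    """Return the captured last two elements of cn, or None if there is no match."""
    match = PATTERN.search(cn)
    return match.group("theField") if match else None


for cn in ("server.content.mysite.com", "mysite.com", "www.example.co.uk"):
    print(cn, "->", extract(cn))
```

This version also matches a CN that has only two elements. For a name such as www.example.co.uk it returns co.uk, because it always takes the last two elements.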