
How to find duplicate values inside a single field?

New Member

For data from DNS that looks like these examples:
www.abc.com.www.bca.com
www.abc.net.www.bca.net

How can I report that .com or .net appears more than once in the field? After finding the values that match, I would run stats on them and push the result into a count by client.


Splunk Employee

If you have a single value per field, you can split off the trailing domain suffix into a new field using a regex:

    <mysearch> | rex field=mydomainfield "(?<start_domain>.*)(?<end_domain>\.\w+)$" | table mydomainfield start_domain end_domain

Then add:

      | stats count by start_domain

If the field holds multiple values, you have to start by splitting it.
See this article:
http://docs.splunk.com/Documentation/Splunk/6.3.2/Search/Parsemultivaluefields
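
As a rough sketch of the splitting step, assuming the values in mydomainfield are separated by commas (an assumption, adjust delim to your data) and that `<mysearch>` stands for your base search, something like this should give one row per value before the regex is applied:

    <mysearch>
    | makemv delim="," mydomainfield
    | mvexpand mydomainfield
    | rex field=mydomainfield "(?<start_domain>.*)(?<end_domain>\.\w+)$"
    | stats count by start_domain

If the field is already multivalued at search time, the makemv step can probably be dropped and mvexpand used on its own.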


Splunk Employee

Try this:

    | rex field=domain max_match=10 "(?<top_level_domain>\b(?:com|net|edu|org)\b)" | stats count by domain top_level_domain | where count > 1
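
A possible variation, if the goal is to count how many times a suffix occurs inside each individual event, is to count the extracted values with mvcount before running stats. This is only a sketch: `<mysearch>` stands for your base search, the field name client is an assumption taken from the question, and the regex only matches a suffix that is followed by a dot or the end of the value.

    <mysearch>
    | rex field=domain max_match=0 "\.(?<top_level_domain>com|net|edu|org)(?=\.|$)"
    | eval tld_count=mvcount(top_level_domain)
    | where tld_count > 1
    | stats count by client

For www.abc.com.www.bca.com the rex should extract com twice, so tld_count is 2 and the event is kept. max_match=0 removes the limit on the number of matches.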


New Member

Yes, it will be a multivalued field. We want to report clients who for some reason append additional domains in a DNS query, which will fail except in rare cases.

I did get around this by matching on ".net." OR ".com."; the trailing dot shows that there is more to the domain after the suffix.

Paula
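
One way the trailing-dot workaround might look as a search, assuming the queried name is in a field called query and the requesting host in a field called client (both names are assumptions, as is the `<mysearch>` placeholder for the base search):

    <mysearch>
    | regex query="\.(com|net)\."
    | stats count by client

The regex command keeps only events where .com or .net is followed by another dot, which is the case for www.abc.com.www.bca.com but not for a plain www.abc.com.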
