Splunk Search

Report with multi level domain

Path Finder

Background

I'm building a list of bad domains, each monitored at the 2nd, 3rd, or 4th level of the hostname in a URL.

Here's a sample of the list, which I created using event types:

[Bad_Domain_Red]
search = sourcetype="bcoat_proxysg" dest_host="*2o7.net" OR dest_host="*123.ddns.org"

[Bad_Domain_Orange]
search = sourcetype="bcoat_proxysg" dest_host="*abc.net"

After creating the event types above, a search on an event type returns the matching events. For example:

sourcetype="bcoat_proxysg" eventtype="Bad_Domain_Red"

1) dest_host=abc.2o7.net

2) dest_host=splunk.2o7.net

3) dest_host=manwin.2o7.net

4) dest_host=myportal.123.ddns.org

5) dest_host=yourportal.123.ddns.org

etc.

I can't build a report of counts per monitored domain, because the matching hosts sit at different subdomain levels.

I'm trying to create a report like this:

dest_host      Count    
2o7.net        50
123.ddns.org   100

As shown above, the report is supposed to consolidate the counts for subdomains (abc.2o7.net, splunk.2o7.net, etc.) into 2o7.net.

Is there any way to do this? I have about 200 different domains split into 4 categories by color code: red, orange, yellow, and green.

Some are monitored at the 2nd level while others are monitored at 3rd or 4th level.

1 Solution

Path Finder

Just to update this very old thread: I found a workaround to get this to work.
I added field extractions for the 2nd-, 3rd-, and 4th-level domains, and tagged each domain at the level it is supposed to be grouped by.

My reports can now be displayed at the specified domain level.
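
The poster did not share the actual extractions, so the following is only a rough sketch of what per-level extractions could look like. The field names domain_l2, domain_l3, and domain_l4 are made up for illustration, and the eval/case step is a stand-in for the tagging described above, not the poster's configuration.

```spl
sourcetype="bcoat_proxysg" eventtype="Bad_Domain_Red"
| rex field=dest_host "(?<domain_l2>[^\.]+\.[^\.]+)$"
| rex field=dest_host "(?<domain_l3>[^\.]+\.[^\.]+\.[^\.]+)$"
| rex field=dest_host "(?<domain_l4>[^\.]+\.[^\.]+\.[^\.]+\.[^\.]+)$"
| eval monitored_domain=case(domain_l3=="123.ddns.org", domain_l3, domain_l2=="2o7.net", domain_l2)
| stats count by monitored_domain
```

Each rex keeps the last 2, 3, or 4 dot-separated labels of dest_host; a field is simply not created when the hostname has fewer labels than the pattern needs. The case() call then picks the level at which each known domain is monitored.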

Motivator

You can use rex to extract a new field containing the second-level domain, and run your report based on that.

For example:

sourcetype="bcoat_proxysg" eventtype="Bad_Domain_Red" | rex field=dest_host "(?<xdomain>([^\.]+\.)?[^\.]+$)"

This should pull out a new field named xdomain which will contain the top two levels. The second-level domain will be optional, in case of unqualified names.
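
To turn that extraction into the count report asked for in the question, a stats clause can be appended. This is a minimal sketch that assumes the host is in the dest_host field, as in the question's sample events:

```spl
sourcetype="bcoat_proxysg" eventtype="Bad_Domain_Red"
| rex field=dest_host "(?<xdomain>([^\.]+\.)?[^\.]+$)"
| stats count by xdomain
```

With the sample hosts from the question, abc.2o7.net and splunk.2o7.net would both be counted under 2o7.net, while myportal.123.ddns.org would fall under ddns.org, since this pattern always keeps only the last two labels.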

If you want something fancier, this might work:

| rex field=dest_host "((?<xhost>[^\.]+)\.)?(?<xdomain>([^\.]+\.)+[^\.]+)"

For hostnames with exactly two components/segments, xdomain will contain the entire string. When there are at least three components in the name, the first will go into xhost and xdomain will contain everything else. A single-label name has no dot, so it won't match and neither field is extracted.

It works because the ([^\.]+\.)+ section requires xdomain to contain at least one dot. On a two-component name, the regex engine first puts the leading segment into xhost, finds that what is left cannot satisfy xdomain, and backtracks to skip the optional ((?<xhost>[^\.]+)\.)? group. It's worth noting that backtracking can be really bad for regex performance, so this isn't ideal. It can probably be cleaned up with more effort, but should get you going.

See also - http://www.splunk.com/base/Documentation/4.1.5/SearchReference/Rex

Motivator

Yeah, the heuristic approach of assuming that 3-level and greater contain a hostname might not work in some cases. If you're looking at just a fixed list, the more elegant solution may be to not extract the field at all, but to use a lookup table instead.
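
As a rough sketch of the lookup idea: assume a lookup named bad_domains (a hypothetical name) backed by a CSV with the columns dest_host, monitored_domain, and color, holding rows such as *2o7.net,2o7.net,red and *123.ddns.org,123.ddns.org,red. Also assume its transforms.conf stanza sets match_type = WILDCARD(dest_host) so the leading asterisk is treated as a wildcard, and max_matches = 1 so each host maps to a single row. Whether wildcard matching is available depends on your Splunk version. The report could then look like this:

```spl
sourcetype="bcoat_proxysg"
| lookup bad_domains dest_host OUTPUT monitored_domain color
| where isnotnull(monitored_domain)
| stats count by color, monitored_domain
```

Because each row names the domain at the level it is monitored, 2-, 3-, and 4-level entries can sit in the same table without any regex heuristics.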

Path Finder

Thanks, I will test that out. At the same time, I'm thinking of using the collect command together with more granular tagging to differentiate the types of domains (2/3/4 level), sending them to different indexes and applying a different regular expression in each index to pull out the wanted domains.

The difficulty is finding out which domains are supposed to be 2, 3, or 4 level, as these are all human-defined.

Motivator

Ah, now I understand. Answer edited above, it's more readable there.

Path Finder

As in my example, where I'm looking at reports which may include both.
I need to identify 3-level domains because some of them come from dynamic DNS.

Path Finder

Thanks, I've tried that. However, the challenge is that some of these domains are monitored at 2 levels while others are at 3 or 4 levels.
That's where I'm having the problem.

Any suggestions?
