Splunk Search
Highlighted

Using CIDR in a lookup table

Path Finder

Fellow Splunkers

I am building a query where I want to report on location based on source IP address. For example within our internal network the subnet 10.50.6.0/24 corresponds to Sydney, whereas 10.50.7.0/24 corresponds to Melbourne. Thus far I have been able to prove my approach using the cidrmatch function as follows:

eval location=case(cidrmatch("10.50.6.0/24",src_ip),"Sydney", cidrmatch("10.50.7.0/24",src_ip),"Melbourne")

Using a case statement doesn't scale when I have hundreds subnets spread across Australia and New Zealand. I was thinking of having a lookup table of the following format:

Subnet,Location
10.50.6.0/24,Sydney
10.50.7.0/24,Melbourne

My problem is that I have not been able to find a way to perform the appropriate lookup.

I have the following questions:

  1. Can any one suggest a way to execute CIDR-based query against a lookup table?
  2. Is there an alternative/better solution to the above that will allow me to match an internal source IP to location of my choosing?

Many thanks

Tags (1)
Highlighted

Re: Using CIDR in a lookup table

Splunk Employee
Splunk Employee

No. CSV lookup tables in the current version (4.1.4) must have an exact string match (optionally case-insensitive) to the field.

But you could do this using a scripted lookup instead that executed the logic above. You can take a look at $SPLUNK_HOME/etc/system/bin/external_lookup.py or here.

Highlighted

Re: Using CIDR in a lookup table

Splunk Employee
Splunk Employee

As gkanapathy said, this is not possible with CSV lookup tables, but is feasible with external lookups. That said, I think that eventtypes are probably the right knowledge primitive to use with version 4.1 (and 4.2, when it comes out) to solve this problem. For example, in our internal web analytics app, we use eventtypes to "classify" events as being from bots, browsers, internal IPs and in turn pageviews are driven by eventtypes.

You can configure eventtypes.conf to enumerate the regions like:

[location-australia-sydney]
search = Subnet=10.50.6.0/24

[location-australia-melbourne]
search = Subnet=10.50.7.0/24

To retrieve, you can search for eventtype=location-australia-sydney. If you want all of Australia, you can search for eventtype=location-australia-*. Since these are eventtypes, events are tagged with these for later filtering or reporting.

If you just want to report on these locations, you can add to your search:

... | eval location = mvfilter(eventtype LIKE "location-%") | ...
Highlighted

Re: Using CIDR in a lookup table

Splunk Employee
Splunk Employee

Probably the searches for the eventtypes should be e.g., src_ip=10.50.6.0/24, not Subnet=..., to match against the field that is present in the original event.

0 Karma
Highlighted

Re: Using CIDR in a lookup table

Motivator

As a matter of personal preference, I don't like using event types the way Stephen suggests, especially if there are lots of subnets.

Try this...

  1. Download this this script (hackish but functional) and put it in apps/search/bin:
    http://www.southerington.com/redir.php?id=7

  2. Populate a CSV file with 3 columns, like this:
        base,length,subnet_name
        127.0.0.0,8,"Loopback"
        10.50.6.0,24,"Name of Subnet Six"
        10.50.7.0,24,"Name of Subnet Seven"
        ...

    Sort the CSV by base, and then in descending order by length so that the most precise match will come first. If you put the file in search/lookups, you can still use other searches to generate it, but be aware there are possible race conditions.

  3. Put the following in transforms.conf:
    [subnetlookup]
        external_cmd  = subnetlookup.py
        external_type = python
        fields_list   = ip, subnet_name
  4. Run your search, e.g:
    src_ip=* | head 20 | lookup subnetlookup ip as src_ip OUTPUT subnet_name
Highlighted

Re: Using CIDR in a lookup table

Explorer

It's not clear to me how this works. I was able to configure everything to apparently work though.

When running this search:

host="servername" AND src_ip= | lookup subnetlookup ip as srcip OUTPUT subnetname

It only returns 196 matches, yet when I run a search on just one of the three subnets that I have in the subnets.csv file like, "123.123..", it returns thousands of results. why is it filtering it down to only a couple hundred?

Secondly, how do I do a search for IPs NOT in the subnets.csv file? I've tried adding "NOT" to various parts of the search string and it doesn't work.

0 Karma
Highlighted

Re: Using CIDR in a lookup table

Path Finder

For IP's not in the subnets.csv file, I can think of two approaches. One is to put "0.0.0.0,0,Unknown" as the last row in your file and then filter on subnetname=unknown. The database guy in me recoils at that. The other is to try isnull(subnetname).

To your first question, shouldn't it be src_ip=* ? The asterisk is missing in your post above.

0 Karma
Highlighted

Re: Using CIDR in a lookup table

Path Finder

As a heads-up to others, I ran into the case where this worked fine for me when I included "| head n" in the stream, but not when that head was taken out. The issue is that I had installed the script on a search head, but the search was running on an indexer. Using "src_ip=* | localop | lookup subnetlookup [...]" worked -- localop forced the lookup to run on the search head. And thanks to the IRC folks on EFNet (via my local splunk admin) for the answer.

0 Karma
Highlighted

Re: Using CIDR in a lookup table

Path Finder

There is a bug in the script as provided by southeringtonp. In the loadsubnetlist function, the reverse sort needs to be on the second element of the list. I fixed this by adding "import operator" to the top of the file and replacing subnet.reverse() with subnets.sort(key=operator.itemgetter(1),reverse=True)

0 Karma
Highlighted

Re: Using CIDR in a lookup table

Ultra Champion

I realize that my answer may come a bit late, but you could do it through lookups. Perhaps these options were not available at the time of your posting.

Create a CSV lookup table with your subnets (in my example I've used a different type of classification for identifying potentially bad site users). We'll call this file knownips.csv and put it in /opt/splunk/etc/system/lookups.

clientip, honesty, comment
170.192.178.10/32, BAD, open_proxy
63.236.6.247/32, BAD, open_proxy
177.23.21.223/32, BAD, TOR_exit_node
27.35.0.0/16, OK, friendly_partner
27.96.0.0/16, BAD, hostile_competitor
195.69.252.0/24, BAD, suspicious_ISP

in transforms.conf

[checkip]
filename = knownips.csv
max_matches = 1
min_matches = 1
default_match = OK
match_type = CIDR(clientip)

in props.conf

[access_combined]
LOOKUP-check = checkip clientip OUTPUT honesty comment

This lets you run searches like;

sourcetype="access_combined" honesty=BAD | stats count by clientip, comment

with results like;

clientip            comment     count
170.192.178.10  open_proxy   78
177.23.21.223   TOR_exit_node    12
189.222.1.22    TOR_exit_node    26
195.69.252.22   suspicious_ISP  138

Hope this helps,

Kristian

View solution in original post