Splunk Search

How to group IP addresses by their autonomous system number (ASN) using a lookup?

adepasquale
Path Finder

I'm monitoring access_combined logs from our Apache servers.

My goal is to group IP addresses by their ASN for reporting. In order to do this, I must match each IP address to its ASN name first.

The existing app that does something similar has no "how to" for Splunk running on a Windows server and I believe it is not currently supported for version 6.3

I've downloaded an open source database (in CSV format) which has "startip (as unsigned int)", "endip (as unsigned int)", and the AS Name.

I get how a lookup table works when there is a 1-to-1 relationship between the source lookup field and the result. But what about when your source lookup field needs to fall with in a range of two fields in your lookup table?

I'm guessing I need to do the following:
1. Convert the IP address to an unsigned int. ( source="..." | ip2decimal ) ?
2. Perform the lookup where ipdecimal is >= "startip" && ipdecimal <= "endip"

I've gotten part 1 to work, but how would i do number 2?

0 Karma
1 Solution

adepasquale
Path Finder

That's exactly what I did, i was just trying to avoid it since A) The ranges listed do not easily convert to CIDR notation (i.e. one line of ranges may have to be expressed as several different CIDR notations) and B) it makes the lookup table longer.

I wrote a script in bash that would convert all the ranges to CIDR notation and then uploaded it into splunk. Seems to be working alright after I upped the maximum amount of memory allowed for indexing lookups when using cidr or wildcard searches. Once I have all the kinks worked out of my script, I'll post it here for others to use.

644340736 644415743 AS174 Cogent Communications

becomes...

83.103.220.0/22, AS174 Cogent Communications
83.103.224.0/19, AS174 Cogent Communications
83.104.0.0/16, AS174 Cogent Communications
83.105.0.0/24, AS174 Cogent Communications

Also, i had to manually create a limits.conf to up the memory and had to manually modify transforms.conf to set the match_Type to cidr. There's no way to do this through the GUI?

View solution in original post

sundareshr
Legend

Can you create an external lookup script to accept the ipdecimal and return ASN if `ipdecimal = median(startip, endip, ipdecimal)

lguinn2
Legend

Nice idea @sundareshr

However, if you are going to write an external lookup script, you might as well return the final value that is needed, not the ASN.

0 Karma

Richfez
SplunkTrust
SplunkTrust

Another thought - Could you treat the converted IP address as a time and use that in a time bounded lookup? It's a total hack, but it just might be able to be made to work.

0 Karma

adepasquale
Path Finder

One small problem... the time bound lookup uses a single column for the lookup. This would be great and all if ROW N, COL 1 was equal to (ROW (N-1), COL 2) + 1 but it's not that nice; there are large gaps.

16777216,16777471,"AS15169 Google Inc."
[ GAP HERE means that numbers between 16777471 16778240 would be considered Google Inc, when they are in fact "no match"]
16778240,16779007,"AS56203 Big Red Group"
16779008,16779263,"AS38803 Goldenit Pty ltd Australia"
[ GAP HERE means that numbers between 16779263 16786944 would be considered Goldenit pty, when they are in fact "no match"]
16786944,16787199,"AS24155 Asia Pacific Broadband Wireless Communications Inc"
16793600,16809983,"AS18144 Energia Communications,Inc."
16809984,16810495,"AS23969 TOT Public Company Limited"

0 Karma

Richfez
SplunkTrust
SplunkTrust

Well, shucks. 😞

0 Karma

adepasquale
Path Finder

That's exactly what I did, i was just trying to avoid it since A) The ranges listed do not easily convert to CIDR notation (i.e. one line of ranges may have to be expressed as several different CIDR notations) and B) it makes the lookup table longer.

I wrote a script in bash that would convert all the ranges to CIDR notation and then uploaded it into splunk. Seems to be working alright after I upped the maximum amount of memory allowed for indexing lookups when using cidr or wildcard searches. Once I have all the kinks worked out of my script, I'll post it here for others to use.

644340736 644415743 AS174 Cogent Communications

becomes...

83.103.220.0/22, AS174 Cogent Communications
83.103.224.0/19, AS174 Cogent Communications
83.104.0.0/16, AS174 Cogent Communications
83.105.0.0/24, AS174 Cogent Communications

Also, i had to manually create a limits.conf to up the memory and had to manually modify transforms.conf to set the match_Type to cidr. There's no way to do this through the GUI?

lguinn2
Legend

As @sundareshr mentions in the other thread, you could create an external lookup.

For the external lookup, you could supply the ASN, and have the external lookup script return the value that you want. You could still upload a CSV to Splunk for your script to use; it wouldn't even need to be set as a lookup table. Just use the lookup interface to do the upload (under Settings>Lookups).

I think this is a good idea, since you are writing scripts anyway.

0 Karma

esix_splunk
Splunk Employee
Splunk Employee

Those settings have to be configured from the cli unfortunately at this time.

adepasquale
Path Finder

what a PITA, every time I have to update this CSV I'll need to re-do the transforms.conf.

0 Karma

lguinn2
Legend

Actually, no - just re-upload the CSV to the same name and it will replace the existing file. All the transforms.conf settings, etc. should remain the same.

0 Karma

lguinn2
Legend

Can the startip/endip range be expressed as a subnet in CIDR notation? If so, you could use the CIDR options for lookup tables.

adepasquale
Path Finder

I suppose this should be marked as the "answer" since it's the only one that really seems to work.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...