I'm monitoring access_combined logs from our Apache servers.
My goal is to group IP addresses by their ASN for reporting. In order to do this, I must match each IP address to its ASN name first.
The existing app that does something similar has no "how to" for Splunk running on a Windows server and I believe it is not currently supported for version 6.3
I've downloaded an open source database (in CSV format) which has "startip (as unsigned int)", "endip (as unsigned int)", and the AS Name.
I get how a lookup table works when there is a 1-to-1 relationship between the source lookup field and the result. But what about when your source lookup field needs to fall with in a range of two fields in your lookup table?
I'm guessing I need to do the following:
1. Convert the IP address to an unsigned int. ( source="..." | ip2decimal
) ?
2. Perform the lookup where ipdecimal is >= "startip" && ipdecimal <= "endip"
I've gotten part 1 to work, but how would i do number 2?
That's exactly what I did, i was just trying to avoid it since A) The ranges listed do not easily convert to CIDR notation (i.e. one line of ranges may have to be expressed as several different CIDR notations) and B) it makes the lookup table longer.
I wrote a script in bash that would convert all the ranges to CIDR notation and then uploaded it into splunk. Seems to be working alright after I upped the maximum amount of memory allowed for indexing lookups when using cidr or wildcard searches. Once I have all the kinks worked out of my script, I'll post it here for others to use.
644340736 644415743 AS174 Cogent Communications
becomes...
83.103.220.0/22, AS174 Cogent Communications
83.103.224.0/19, AS174 Cogent Communications
83.104.0.0/16, AS174 Cogent Communications
83.105.0.0/24, AS174 Cogent Communications
Also, i had to manually create a limits.conf to up the memory and had to manually modify transforms.conf to set the match_Type to cidr. There's no way to do this through the GUI?
Can you create an external lookup script to accept the ipdecimal and return ASN if `ipdecimal = median(startip, endip, ipdecimal)
Nice idea @sundareshr
However, if you are going to write an external lookup script, you might as well return the final value that is needed, not the ASN.
Another thought - Could you treat the converted IP address as a time and use that in a time bounded lookup? It's a total hack, but it just might be able to be made to work.
One small problem... the time bound lookup uses a single column for the lookup. This would be great and all if ROW N, COL 1 was equal to (ROW (N-1), COL 2) + 1 but it's not that nice; there are large gaps.
16777216,16777471,"AS15169 Google Inc."
[ GAP HERE means that numbers between 16777471 16778240 would be considered Google Inc, when they are in fact "no match"]
16778240,16779007,"AS56203 Big Red Group"
16779008,16779263,"AS38803 Goldenit Pty ltd Australia"
[ GAP HERE means that numbers between 16779263 16786944 would be considered Goldenit pty, when they are in fact "no match"]
16786944,16787199,"AS24155 Asia Pacific Broadband Wireless Communications Inc"
16793600,16809983,"AS18144 Energia Communications,Inc."
16809984,16810495,"AS23969 TOT Public Company Limited"
Well, shucks. 😞
That's exactly what I did, i was just trying to avoid it since A) The ranges listed do not easily convert to CIDR notation (i.e. one line of ranges may have to be expressed as several different CIDR notations) and B) it makes the lookup table longer.
I wrote a script in bash that would convert all the ranges to CIDR notation and then uploaded it into splunk. Seems to be working alright after I upped the maximum amount of memory allowed for indexing lookups when using cidr or wildcard searches. Once I have all the kinks worked out of my script, I'll post it here for others to use.
644340736 644415743 AS174 Cogent Communications
becomes...
83.103.220.0/22, AS174 Cogent Communications
83.103.224.0/19, AS174 Cogent Communications
83.104.0.0/16, AS174 Cogent Communications
83.105.0.0/24, AS174 Cogent Communications
Also, i had to manually create a limits.conf to up the memory and had to manually modify transforms.conf to set the match_Type to cidr. There's no way to do this through the GUI?
As @sundareshr mentions in the other thread, you could create an external lookup.
For the external lookup, you could supply the ASN, and have the external lookup script return the value that you want. You could still upload a CSV to Splunk for your script to use; it wouldn't even need to be set as a lookup table. Just use the lookup interface to do the upload (under Settings>Lookups).
I think this is a good idea, since you are writing scripts anyway.
Those settings have to be configured from the cli unfortunately at this time.
what a PITA, every time I have to update this CSV I'll need to re-do the transforms.conf.
Actually, no - just re-upload the CSV to the same name and it will replace the existing file. All the transforms.conf settings, etc. should remain the same.
Can the startip/endip range be expressed as a subnet in CIDR notation? If so, you could use the CIDR options for lookup tables.
I suppose this should be marked as the "answer" since it's the only one that really seems to work.