Make map command process values in series

iKate
Builder

Hi all!

How can I make the map command process the entire list of values submitted to its input (thousands), not just the first maxsearches of them? I don't need to run all the queries simultaneously; running them in series, in chunks of maxsearches or even one search at a time, would be fine.

What I need to do at the moment is match each clientip (no subnet information) against the integer IP ranges in a lookup file to determine the ASN of each IP.

Here is my query, maybe you can suggest how to rewrite it.

...
| table clientip
| rex field=clientip "(?<o1>\d+)\.(?<o2>\d+)\.(?<o3>\d+)\.(?<o4>\d+)"
| eval integer_ip=16777216*o1+65536*o2+256*o3+o4
| map search="| inputlookup GeoIPASNum2.csv | where start<=$integer_ip$ AND end>=$integer_ip$ | eval clientip=\"$clientip$\" | table clientip ASN" maxsearches=10

The lookup file looks like this and was downloaded from MaxMind:

ASN                        range_start  range_end
"AS56203 Big Red Group"    16778240     16779007
...
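
To illustrate the matching described above outside of Splunk, here is a minimal Python sketch. It assumes the lookup rows are sorted by range_start and do not overlap; the names ip_to_int, find_asn, RANGES and STARTS are made up for this example, and the only data row is the one shown in the sample.

```python
import bisect


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to an integer (same arithmetic as the eval)."""
    o1, o2, o3, o4 = (int(part) for part in ip.split("."))
    return 16777216 * o1 + 65536 * o2 + 256 * o3 + o4


# (range_start, range_end, ASN) rows; assumed sorted by range_start, non-overlapping.
RANGES = [
    (16778240, 16779007, "AS56203 Big Red Group"),
]
STARTS = [row[0] for row in RANGES]


def find_asn(ip):
    """Return the ASN whose range contains ip, or None if no range matches."""
    value = ip_to_int(ip)
    # Index of the last range whose start is <= value.
    idx = bisect.bisect_right(STARTS, value) - 1
    if idx >= 0 and RANGES[idx][0] <= value <= RANGES[idx][1]:
        return RANGES[idx][2]
    return None
```

A binary search over sorted range starts handles each IP in logarithmic time, which is roughly what a range-aware lookup gives you compared with running one subsearch per IP.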

Thanks in advance!

1 Solution

masonmorales
Influencer

I would not recommend using the map command for this use case. Map is better for a small number of results, but won't scale to thousands.

CIDR lookups are designed to do exactly what you're describing. Check out this post for an example: https://answers.splunk.com/answers/5916/using-cidr-in-a-lookup-table.html

For the MaxMind GeoLite ASN data, it looks like someone has already built a TA to solve the exact problem you're describing: https://splunkbase.splunk.com/app/3531/

It ships with a CIDR lookup too, so you can simply do this after it's been installed:

...your base search... | lookup asn ip as clientip OUTPUT asn autonomous_system | table clientip asn autonomous_system

After you install, make sure you populate the initial asn lookup using the query shown in the TA's screenshot as well.
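
For anyone wondering how integer ranges relate to CIDR blocks, here is a rough Python sketch using the standard ipaddress module. It is only an illustration of the idea, not how the TA or Splunk implements it; range_to_cidrs and cidr_match are hypothetical helper names, and the range is the sample row from the question.

```python
import ipaddress


def range_to_cidrs(start, end):
    """Turn an inclusive integer IP range into the CIDR blocks that cover it."""
    first = ipaddress.IPv4Address(start)
    last = ipaddress.IPv4Address(end)
    return list(ipaddress.summarize_address_range(first, last))


def cidr_match(ip, networks):
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# The sample range from the question (16778240 to 16779007).
blocks = range_to_cidrs(16778240, 16779007)
```

A single integer range can expand to more than one CIDR block, so a CIDR-based lookup file may contain several rows for the same ASN.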

niketnilay
Legend

@iKate... I am able to use where with < and > comparisons in a Splunk search, but the same query fails in a dashboard with &lt; and &gt; in the map command. The following is a run-anywhere example (I checked both the where and search commands inside the map query, and both behave the same).

 | makeresults 
 | eval clientip="101.201.100.99"
 | rex field=clientip "(?<o1>\d+)\.(?<o2>\d+)\.(?<o3>\d+)\.(?<o4>\d+)"
 | eval integer_ip=16777216*o1+65536*o2+256*o3+o4
 | table integer_ip clientip
 | map search="| makeresults | eval range_start=16778240 | eval range_end=16779007 | eval asn=\"Big Red Group\"| where range_start<=$integer_ip$ AND range_end<=$integer_ip$| eval clientip=$clientip$| eval integer_ip=$integer_ip$| table clientip integer_ip asn range_start range_end"
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

iKate
Builder

This app works exactly how I need! But first I had to fix the Python code a bit)

After installing the app and setting the proxies, an error occurred:
NameError at "/opt/splunk/etc/apps/TA-asngen/bin/asngen.py", line 37 : global name 'ProxyHandler' is not defined
It was fixed by adding import urllib2 and prefixing the calls in this part of the code with urllib2:

import urllib2

# Route requests through the configured proxies, if any are set.
if proxies['http'] is not None or proxies['https'] is not None:
    proxy = urllib2.ProxyHandler(proxies)
    opener = urllib2.build_opener(proxy)
    urllib2.install_opener(opener)

try:
    url = urllib2.urlopen("https://download.maxmind.com/download/geoip/database/asnum/GeoIPASNum2.zip")
except Exception:
    raise Exception("Please check app proxy settings")

Then I created an entry in transforms.conf as described here: https://answers.splunk.com/answers/5916/using-cidr-in-a-lookup-table.html
I also increased the lookup limit in limits.conf: max_memtable_bytes=20000000
Reloaded configs and done! 🙂
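
The fix above is for Python 2, where these names live in urllib2. On Python 3 the same classes are found in urllib.request, so a rough equivalent of the proxy setup might look like the sketch below. The helper name build_proxy_opener is made up for illustration and is not part of the TA; the proxies dict is assumed to have the same shape as in the snippet above.

```python
import urllib.request


def build_proxy_opener(proxies):
    """Return an opener that uses the configured proxies, or None if none are set."""
    # Keep only the schemes that actually have a proxy URL configured.
    active = {scheme: url for scheme, url in proxies.items() if url is not None}
    if not active:
        return None
    handler = urllib.request.ProxyHandler(active)
    return urllib.request.build_opener(handler)
```

If an opener is returned, passing it to urllib.request.install_opener makes later urllib.request.urlopen calls go through the proxy, mirroring the urllib2.install_opener call in the fix.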
Thanks again!

woodcock
Esteemed Legend

I think this app just came out (or a new version was recently released), so contact the author so he can fix it.

iKate
Builder

Already. I was even mentioned) https://splunkbase.splunk.com/app/3531/

masonmorales
Influencer

Awesome! Glad this worked for you. I upvoted your comment for sharing that fix too. I am sure it will help others in the future. 🙂

iKate
Builder

Wow! Looks like this app was made especially for me) After installing it, I'll post the results here.
Thank you!

DalJeanis
SplunkTrust

Here's a run-anywhere proof of concept for turning that lookup table into a big honking case statement at run time.

| makeresults | eval test="1 5 11" | makemv test | mvexpand test | rename test as integer_ip
| eval ASN= 
    [
    | makeresults  
    | eval mydata = "1,3,A 4,7,B 8,12,C" 
    | makemv mydata 
    | mvexpand mydata 
    | makemv delim="," mydata 
    | eval XXX1=mvindex(mydata,0),XXX2=mvindex(mydata,1),XXX3=mvindex(mydata,2) 
    | table XXX1 XXX2 XXX3 
    | format "case(" "" AND "," "" "true(),\"unknown\")" 
    | rex field=search mode=sed "s/XXX1=\"/integer_ip>=/g" 
    | rex field=search mode=sed "s/XXX2=\"/integer_ip<=/g"
    | rex field=search mode=sed "s/\" AND/ AND/g"
    | rex field=search mode=sed "s/AND XXX3=/,/g"
    | table search
    ]

So your code would look like this:

...
| table clientip
| rex field=clientip "(?<o1>\d+)\.(?<o2>\d+)\.(?<o3>\d+)\.(?<o4>\d+)"
| eval integer_ip=16777216*o1+65536*o2+256*o3+o4
| eval ASN= 
    [
    | inputlookup GeoIPASNum2.csv  
    | rename  start as XXX1, end as XXX2, ASN as XXX3     
    | table XXX1 XXX2 XXX3 
    | format "case(" "" AND "," "" "true(),\"unknown\")" 
    | rex field=search mode=sed "s/XXX1=\"/integer_ip>=/g" 
    | rex field=search mode=sed "s/XXX2=\"/integer_ip<=/g"
    | rex field=search mode=sed "s/\" AND/ AND/g"
    | rex field=search mode=sed "s/AND XXX3=/,/g"
    | table search
    ]

The renames turned out to be necessary in order to force the sort order of the terms before converting to a case statement. I just used generic XXX# as the field names, because the order would then be fixed and obvious.

I do not know whether this method is totally practical. That will depend on the number of IP records and on Splunk's limits on the size of the final expanded search string, but I thought it would be a fun thing to try.
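
To make the end result of the format and sed steps easier to picture, here is a small Python sketch that builds the same kind of case() expression from range rows and evaluates it the way case() would (first matching clause wins). The function names are made up for this illustration, and the exact whitespace of the string Splunk generates will differ.

```python
def build_case_expression(rows, field="integer_ip", default="unknown"):
    """Build a case(...) expression string from (start, end, label) rows."""
    clauses = [
        '{f}>={s} AND {f}<={e},"{l}"'.format(f=field, s=start, e=end, l=label)
        for start, end, label in rows
    ]
    # Final catch-all clause, like true(),"unknown" in the search above.
    clauses.append('true(),"{}"'.format(default))
    return "case(" + ",".join(clauses) + ")"


def evaluate_case(rows, value, default="unknown"):
    """Mimic case(): return the label of the first range containing value."""
    for start, end, label in rows:
        if start <= value <= end:
            return label
    return default


# Same toy data as the proof of concept: "1,3,A 4,7,B 8,12,C".
ROWS = [(1, 3, "A"), (4, 7, "B"), (8, 12, "C")]
EXPRESSION = build_case_expression(ROWS)
```

The expression grows by one clause per lookup row, which is why the size of the expanded search string becomes the limiting factor for large lookup files.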

koshyk
Super Champion

@DalJeanis, great idea

iKate
Builder

What a trick! :) I've never used either the format command or sed mode in rex, so after learning their syntax I can see it working well when you don't have that many rows. In my situation, with several hundred thousand rows, I guess it won't work well as a case clause) But anyway, this construction can help in other cases where I would previously have tried to use map. Thanks! Great insight into the usage of Splunk commands.

DalJeanis
SplunkTrust

I suspected as much.

Upvote answers if you like them or find them useful, even if they didn't completely solve your issue.

Accept the best one that actually helped in the solution.

iKate
Builder

No problem!
