Splunk Search

How do you match ip address with ip's with CIDR notations in the same lookuptable?

UMDTERPS
Communicator

We have a CSV with a field called application and another called IP. Within the field ip there are ip addresses and some ip addresses with CIDR notation. We have hundreds of field entries for applications and IP, below is smaller dummy data version of the list:

Application IP
sec_system 192.168.4.0/26
sec_system 192.168.1.0/25
sec_system 192.168.2.0/24
sec_system 192.168.3.0/24
internal_system 192.168.2.5
internal_system 192.168.3.50
internal_system 192.168.4.32
internal_system 192.168.1.4
win_system 192.168.1.50
win_system 192.168.1.3

Is there a way to match applcations/ips, with applications/ips with CIDR notations? (I've seen some people say you need to use the tranforms.conf or props.conf file, I can't use that file because I don't have access to it)

0 Karma
1 Solution

rmmiller
Contributor

It's a little cumbersome, but this is entirely doable without modifying system files like transforms.conf or props.conf. I'll also add the disclaimer that performance might be awful if your lookup CSV file really needs that many values and you have a lot of results in your search to lookup.

There are a lot of steps below, but don't let them intimidate you. Follow me...

First, let's get your CSV in proper shape or it's a non-starter.

  1. Move your IP column to be the first, not second column.
  2. Make sure you have a comma between the first and second column. It is a CSV, after all, right?
  3. Make sure all entries in the IP column are in CIDR format. That means changing the specific IP addresses you have like 192.168.2.5 to 192.168.2.5/32 instead.
  4. Sort your list from most-specific to least-specific. Sorting as decreasing subnet mask length, and you should be fine.

Based on your example, here's what I used as my lookup file after making these changes:

IP,Application
192.168.1.3/32,win_system
192.168.1.4/32,internal_system
192.168.1.50/32,win_system
192.168.1.0/24,sec_system
192.168.2.5/32,internal_system 
192.168.2.0/24,sec_system
192.168.3.50/32,internal_system
192.168.3.0/25,sec_system
192.168.4.32/32,internal_system
192.168.4.0/26,sec_system
10.0.0.0/8,desktops
172.16.0.0/12,servers
0.0.0.0/0,catchall

I took the liberty of adding some additional ranges to represent stuff that would fall outside the ranges in your CSV file. Presumably, you don't have your entire network mapped out in this CSV file. 😉 Also, notice I didn't exactly follow my own advice and sort by decreasing subnet mask, but you'll notice the entries are that way by subnet for easy human readability.


Next, with the CSV file issues sorted, I had to simulate results from some searches you'd run. I used the following SPL to generate some random IP addresses in my results. You don't need to do this if you're getting IP addresses in the real searches you're using in your environment.

| makeresults count=30
| eval raw=split("192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,10.X.Y.Z,172.W.X.Y",",")
| eval raw_ip=mvindex(raw,random()%8)
| eval rnd1 = random()%5
| eval rnd2 = random()%60
| eval rnd3 = random()%256
| eval rnd4 = 16 + random()%16
| eval IP=replace(replace(replace(replace(raw_ip,"W",rnd4),"Z",rnd3),"Y",rnd2),"X",rnd1)
| fields _time,IP
`comment("Mocked-up sample data with credit to to4kawa")`

You should get 30 IP addresses in the private ranges from this. Don't fret if you don't understand this search -- remember this just replicates your search results, i.e., it has nothing to do with the lookup itself.


OK, we have some random data, we have good lookup data. Now it's time to tackle setting up the lookup within Splunk. No elevated rights necessary!

  1. Logged into Splunk, click Settings --> Lookups in the upper right.
  2. Click Add new next to Lookup table files.
  3. Change the Destination app drop-down menu to match whatever app you're using in your environment for this search. This might be irrelevant for you.
  4. Click the Browse button and select the CSV file after you've made the required corrections I outlined above.
  5. I recommend making the Destination filename the same as the name of your CSV for the sake of simplicity. Whatever you use here arbitrary, but you'll refer to it by this name in a later step. I'll use the name priv_ip_lookup.csv for this explanation.
  6. Click Save to save your new lookup. You should be taken back to the list of all lookup files registered in the system, and your new file will be listed here unless you ran into an error.
  7. Go back to Settings --> Lookups. This time around, click Add new next to Lookup definitions.
  8. As before, change the Destination app drop-down menu to match whatever app you're using in your environment for this search. This might be irrelevant for you. If it was irrelevant before, it is here, too.
  9. For Name, come up with a name for this lookup definition. It's arbitrary, but you'll refer to the lookup definition using this name in your SPL. I'll use priv_ip_lookup for this example.
  10. Don't change the Type -- keep it at File-based.
  11. Find your lookup file in the Lookup file drop-down menu. For this example, I would select priv_ip_lookup.csv since that's what I uploaded in step 5.
  12. Now the really important part...check Advanced options to expose additional options.
  13. Set Maximum matches to 1 unless you want to deal with multi-value fields after your lookup. If you've sorted your data properly (remember?), you will get the expected result and only 1 result after your lookup.
  14. In the Match type field, type this: CIDR(IP)
  15. Click Save to save your new lookup definition. You should be taken back to the list of all lookup definitions registered in the system, and your new definition will be listed here unless you ran into an error.

Almost there! All the setup is done. Now it's just a matter of implementing the all-important lookup command in your SPL.

Add this line to the end of whatever your search is:

| lookup priv_ip_lookup IP OUTPUT Application

Anticlimactic, isn't it?
The lookup command uses the priv_ip_lookup definition you created, which is configured to lookup by CIDR match of the IP field and output the Application column, adding that column to your search results.

Here's my final SPL while developing this answer:

| makeresults count=30
| eval raw=split("192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,10.X.Y.Z,172.W.X.Y",",")
| eval raw_ip=mvindex(raw,random()%8)
| eval rnd1 = random()%5
| eval rnd2 = random()%60
| eval rnd3 = random()%256
| eval rnd4 = 16 + random()%16
| eval IP=replace(replace(replace(replace(raw_ip,"W",rnd4),"Z",rnd3),"Y",rnd2),"X",rnd1)
| fields _time,IP
`comment("Mocked-up sample data with credit to to4kawa")`
| lookup priv_ip_lookup IP OUTPUT Application

My sample output looks like the following, but your results will vary due to the random() calls to generate the IPs in my results.

_time        IP      Application    
2020-02-10 13:22:27     192.168.4.5     sec_system
2020-02-10 13:22:27     192.168.3.55    sec_system
2020-02-10 13:22:27     192.168.2.54    sec_system
2020-02-10 13:22:27     192.168.3.36    sec_system
2020-02-10 13:22:27     192.168.2.5     internal_system
2020-02-10 13:22:27     192.168.3.20    sec_system
2020-02-10 13:22:27     192.168.4.18    sec_system
2020-02-10 13:22:27     192.168.4.9     sec_system
2020-02-10 13:22:27     192.168.4.8     sec_system 

Note the internal_system match on the specific 192.168.2.5 address in your example. This works ONLY after you change that reference in your lookup file to 192.168.2.5/32, otherwise you'll get sec_system instead.

IP lookups are messy, but they're great once you get them configured properly.

Hope that helps!
rmmiller

View solution in original post

rmmiller
Contributor

Did the answer I detailed below provide you with the information you needed?
rmmiller

rmmiller
Contributor

It's a little cumbersome, but this is entirely doable without modifying system files like transforms.conf or props.conf. I'll also add the disclaimer that performance might be awful if your lookup CSV file really needs that many values and you have a lot of results in your search to lookup.

There are a lot of steps below, but don't let them intimidate you. Follow me...

First, let's get your CSV in proper shape or it's a non-starter.

  1. Move your IP column to be the first, not second column.
  2. Make sure you have a comma between the first and second column. It is a CSV, after all, right?
  3. Make sure all entries in the IP column are in CIDR format. That means changing the specific IP addresses you have like 192.168.2.5 to 192.168.2.5/32 instead.
  4. Sort your list from most-specific to least-specific. Sorting as decreasing subnet mask length, and you should be fine.

Based on your example, here's what I used as my lookup file after making these changes:

IP,Application
192.168.1.3/32,win_system
192.168.1.4/32,internal_system
192.168.1.50/32,win_system
192.168.1.0/24,sec_system
192.168.2.5/32,internal_system 
192.168.2.0/24,sec_system
192.168.3.50/32,internal_system
192.168.3.0/25,sec_system
192.168.4.32/32,internal_system
192.168.4.0/26,sec_system
10.0.0.0/8,desktops
172.16.0.0/12,servers
0.0.0.0/0,catchall

I took the liberty of adding some additional ranges to represent stuff that would fall outside the ranges in your CSV file. Presumably, you don't have your entire network mapped out in this CSV file. 😉 Also, notice I didn't exactly follow my own advice and sort by decreasing subnet mask, but you'll notice the entries are that way by subnet for easy human readability.


Next, with the CSV file issues sorted, I had to simulate results from some searches you'd run. I used the following SPL to generate some random IP addresses in my results. You don't need to do this if you're getting IP addresses in the real searches you're using in your environment.

| makeresults count=30
| eval raw=split("192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,10.X.Y.Z,172.W.X.Y",",")
| eval raw_ip=mvindex(raw,random()%8)
| eval rnd1 = random()%5
| eval rnd2 = random()%60
| eval rnd3 = random()%256
| eval rnd4 = 16 + random()%16
| eval IP=replace(replace(replace(replace(raw_ip,"W",rnd4),"Z",rnd3),"Y",rnd2),"X",rnd1)
| fields _time,IP
`comment("Mocked-up sample data with credit to to4kawa")`

You should get 30 IP addresses in the private ranges from this. Don't fret if you don't understand this search -- remember this just replicates your search results, i.e., it has nothing to do with the lookup itself.


OK, we have some random data, we have good lookup data. Now it's time to tackle setting up the lookup within Splunk. No elevated rights necessary!

  1. Logged into Splunk, click Settings --> Lookups in the upper right.
  2. Click Add new next to Lookup table files.
  3. Change the Destination app drop-down menu to match whatever app you're using in your environment for this search. This might be irrelevant for you.
  4. Click the Browse button and select the CSV file after you've made the required corrections I outlined above.
  5. I recommend making the Destination filename the same as the name of your CSV for the sake of simplicity. Whatever you use here arbitrary, but you'll refer to it by this name in a later step. I'll use the name priv_ip_lookup.csv for this explanation.
  6. Click Save to save your new lookup. You should be taken back to the list of all lookup files registered in the system, and your new file will be listed here unless you ran into an error.
  7. Go back to Settings --> Lookups. This time around, click Add new next to Lookup definitions.
  8. As before, change the Destination app drop-down menu to match whatever app you're using in your environment for this search. This might be irrelevant for you. If it was irrelevant before, it is here, too.
  9. For Name, come up with a name for this lookup definition. It's arbitrary, but you'll refer to the lookup definition using this name in your SPL. I'll use priv_ip_lookup for this example.
  10. Don't change the Type -- keep it at File-based.
  11. Find your lookup file in the Lookup file drop-down menu. For this example, I would select priv_ip_lookup.csv since that's what I uploaded in step 5.
  12. Now the really important part...check Advanced options to expose additional options.
  13. Set Maximum matches to 1 unless you want to deal with multi-value fields after your lookup. If you've sorted your data properly (remember?), you will get the expected result and only 1 result after your lookup.
  14. In the Match type field, type this: CIDR(IP)
  15. Click Save to save your new lookup definition. You should be taken back to the list of all lookup definitions registered in the system, and your new definition will be listed here unless you ran into an error.

Almost there! All the setup is done. Now it's just a matter of implementing the all-important lookup command in your SPL.

Add this line to the end of whatever your search is:

| lookup priv_ip_lookup IP OUTPUT Application

Anticlimactic, isn't it?
The lookup command uses the priv_ip_lookup definition you created, which is configured to lookup by CIDR match of the IP field and output the Application column, adding that column to your search results.

Here's my final SPL while developing this answer:

| makeresults count=30
| eval raw=split("192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,192.168.X.Y,10.X.Y.Z,172.W.X.Y",",")
| eval raw_ip=mvindex(raw,random()%8)
| eval rnd1 = random()%5
| eval rnd2 = random()%60
| eval rnd3 = random()%256
| eval rnd4 = 16 + random()%16
| eval IP=replace(replace(replace(replace(raw_ip,"W",rnd4),"Z",rnd3),"Y",rnd2),"X",rnd1)
| fields _time,IP
`comment("Mocked-up sample data with credit to to4kawa")`
| lookup priv_ip_lookup IP OUTPUT Application

My sample output looks like the following, but your results will vary due to the random() calls to generate the IPs in my results.

_time        IP      Application    
2020-02-10 13:22:27     192.168.4.5     sec_system
2020-02-10 13:22:27     192.168.3.55    sec_system
2020-02-10 13:22:27     192.168.2.54    sec_system
2020-02-10 13:22:27     192.168.3.36    sec_system
2020-02-10 13:22:27     192.168.2.5     internal_system
2020-02-10 13:22:27     192.168.3.20    sec_system
2020-02-10 13:22:27     192.168.4.18    sec_system
2020-02-10 13:22:27     192.168.4.9     sec_system
2020-02-10 13:22:27     192.168.4.8     sec_system 

Note the internal_system match on the specific 192.168.2.5 address in your example. This works ONLY after you change that reference in your lookup file to 192.168.2.5/32, otherwise you'll get sec_system instead.

IP lookups are messy, but they're great once you get them configured properly.

Hope that helps!
rmmiller

View solution in original post

UMDTERPS
Communicator

My apologies for taking a while to get back to you.  I wasn't able to get the data back to test this on that data, but I did some testing and it appears this works.

Thanks!

0 Karma

rmmiller
Contributor

@UMDTERPS did this solve your problem?

0 Karma

rmmiller
Contributor

@UMDTERPS ?
Were you able to use this to complete the work you wanted to do or do you need additional help?

Thanks,
rmmiller

0 Karma

UMDTERPS
Communicator

Sorry, I was away and thought this question was dead. When I get the IP list csv, I will try this today.

First question, We use Excel spreadsheet formatted as a CSV, does that affect the first step? Can a Excel spreadsheet formatted as a CSV be used instead of formatting a file with commas?

Second Question, If you have hundreds of CIDR notations, would have every CIDR block out in your SPL (Final SPL)???

Thanks! =0)

0 Karma

rmmiller
Contributor

I thought this question was dead, too!

1) When Excel saves files to CSV, there will be a comma between the fields. I stated what I did only because there were spaces delimiting your fields in your sample file.
2) No, sorry, I think you misunderstood my "final SPL". You would not need to list out all of your CIDR notations in the SPL, only in the CSV. Once you get your CSV in good shape and the lookup configured in Splunk, line 11 ( | lookup priv_ip_lookup IP OUTPUT Application) is the only line you need to worry about in your SPL.

I'll start watching this question a little more closely!
rmmiller

0 Karma

to4kawa
SplunkTrust
SplunkTrust

lookup

Do you create lookup with matchtype CIDR?