I'm trying to come up with a way to output to a lookup file a list of calculated network addresses given a list of IP addresses. By using a destination MAC address and a source IP, I'm able to group together a list of IPs that are using the same gateway with the following:
`index=zeek sourcetype=zeek_conn
| stats values(src) by resp_l2_addr`
But from here, I need to take those src values and have Splunk give me the smallest subnet that covers that range of addresses. For example, if I have 10.0.0.5 and 10.0.1.5 in the same list, I would need the query to say, based on these two addresses, there is a 10.0.0.0/23 network.
I feel this is more of a machine learning type of problem, but wanted to see if anyone has come up with something similar that could solve this. Thanks!
I am guessing that the machine learning remark is sarcasm, but the straight-faced answer is no. Machine learning deals with probabilistic computation, whereas this problem is deterministic. The challenge is that SPL may not be the best tool for this type of calculation. But almost any mathematical computation can be achieved in SPL if it is your only choice. (SPL does provide a couple of convenience functions/commands.)
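To illustrate why the calculation is deterministic, here is a minimal sketch outside Splunk, in Python, using only the standard `ipaddress` module. The function name `smallest_covering_subnet` is hypothetical, and the sketch assumes IPv4 addresses given as dotted-quad strings. The prefix length is simply the number of leading bits shared by the lowest and highest address.

```python
import ipaddress

def smallest_covering_subnet(addresses):
    """Return the smallest IPv4 network that contains every address given."""
    ints = [int(ipaddress.IPv4Address(a)) for a in addresses]
    lo, hi = min(ints), max(ints)
    # XOR marks the bits where lo and hi differ; every bit to the left of
    # the highest differing bit belongs to the shared prefix.
    prefix_len = 32 - (lo ^ hi).bit_length()
    # Zero the host bits of lo to obtain the network address.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return ipaddress.IPv4Network((lo & mask, prefix_len))

print(smallest_covering_subnet(["10.0.0.5", "10.0.1.5"]))  # 10.0.0.0/23
```

Only the minimum and maximum addresses matter, which is the same observation the SPL approach relies on.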
The following is an example of a labored "solution" in SPL. It makes a small allowance for the case where the number of raw events is huge.
index=zeek sourcetype=zeek_conn
| stats values(src) as src by resp_l2_addr
| mvexpand src ``` in order to take advantage of SPL's IP sort ```
| sort ip(src)
| stats list(src) as src by resp_l2_addr ``` src is sorted in IP order ```
| eval minip = mvindex(src, 0), maxip = mvindex(src, -1)
| eval mask = mvrange(1, 33) ``` do not consider 0 ```
| eval match = mvmap(mask, if(cidrmatch(minip . "/" . mask, minip) AND cidrmatch(minip . "/" . mask, maxip), "yes", null()))
| eval minsubnet = minip . "/" . mvcount(match)
Note that the resulting minsubnet from your example will be based on minip, 10.0.0.5, not on a network address such as 10.0.0.0. Strictly speaking, this departs from the usual CIDR convention, in which the host bits of the address part are zero. But I suspect that you are just seeking a convenient notation rather than seeking to configure routing with this calculation. If you want a more conformant representation, you can adjust the endpoints, for example by subtracting 1 from minip and adding 1 to maxip.
Alternatively, you can arbitrarily force 0 onto the least significant octet of minip and 255 onto the least significant octet of maxip. The possibilities are endless, yet any choice is arbitrary.
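For comparison, the adjustments described above can be sketched in Python with the standard `ipaddress` module. This is an illustration rather than part of the SPL search. The helper names `normalize_cidr` and `widen_to_octet` are hypothetical, and IPv4 is assumed.

```python
import ipaddress

def normalize_cidr(cidr):
    """Zero the host bits of a CIDR string such as the minip-based result."""
    # strict=False accepts an address with host bits set and masks them off.
    return str(ipaddress.ip_network(cidr, strict=False))

def widen_to_octet(minip, maxip):
    """Force 0 onto the last octet of minip and 255 onto the last octet of maxip."""
    lo = int(ipaddress.IPv4Address(minip)) & 0xFFFFFF00
    hi = int(ipaddress.IPv4Address(maxip)) | 0x000000FF
    return str(ipaddress.IPv4Address(lo)), str(ipaddress.IPv4Address(hi))

print(normalize_cidr("10.0.0.5/23"))           # 10.0.0.0/23
print(widen_to_octet("10.0.0.5", "10.0.1.5"))  # ('10.0.0.0', '10.0.1.255')
```

The first helper turns the minip-based notation into a network address. The second only widens the endpoints, so the prefix length would still need to be computed from the widened range.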
Here is an emulation using the sample data you provided. You can play with it and compare the results with your real data.
| makeresults
| fields - _time
| eval src=mvappend("10.0.0.5", "10.0.0.100", "10.0.1.5")
``` data emulation above ```