I need to map a clientip to its hostname and MAC address. This environment is DHCP-driven and hosts move around a lot, which means an IP could belong to one host one minute and to another the next.
I've implemented an NBTSTAT lookup script that works really well. Unfortunately, it uses the current snapshot of what's in WINS/DNS for the IPs returned by my query, so I have no way of ensuring that an IP didn't change hostnames over time.
I'm not logging my DHCP logs yet.
Is there a way to get the event processor to run the lookup script before indexing the event? I know this would be heavy, but until I start gathering my DHCP logs it's the only reliable way I can think of.
Also, am I correct that once I start monitoring the DHCP logs I will be able to do some sort of join between the DHCP logs and my log of interest, based on the DHCP lease time and the timestamp of the event of interest?
Thanks in advance for any help on this.
Thanks for the input guys, here's a followup after I've been playing with splunk for a little while.
Since no DHCP logs are available, the only way I can solve this is to execute an NBTSTAT against a list of IP addresses. The question was when and how to do this.
First off, NBTSTAT isn't available through Splunk or Ubuntu, so I installed the nmblookup binary from my distro (on Ubuntu, I ran sudo apt-cache search nmblookup and installed the resulting package).
I then created the following Python script in $SPLUNKBASE/etc/apps/myapp/bin/nmblookup.py:
import csv
import re
import subprocess
import sys

def lookup(ip):
    if ipval.match(ip):
        process = subprocess.Popen([nmb_bin, '-A', ip],
                                   executable=nmb_bin,
                                   stdout=subprocess.PIPE)
        process.wait()
        output = process.stdout.read()
        lines = output.split('\n')
        for l in lines:
            ipmatch = ipre.match(l)
            if ipmatch is not None:
                return ipmatch.group(1)
    else:
        print ip + " is an invalid ip address"
    return None

def main():
    if len(sys.argv) != 3:
        print "Usage: python nmblookup.py [ip field] [host field]"
        sys.exit(0)
    ipf = sys.argv[1]
    hostf = sys.argv[2]
    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True
    for line in r:
        if first:
            header = line
            if ipf not in header or hostf not in header:
                print "Host and IP fields not in header"
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue
        result = {}
        i = 0
        while i < len(header):
            if i < len(line):
                result[header[i]] = line[i]
            else:
                result[header[i]] = ''
            i += 1
        # Perform the lookup only when the host field is still empty
        if len(result[ipf]) and len(result[hostf]):
            w.writerow(result)
        elif len(result[ipf]):
            hostname = lookup(result[ipf])
            result[hostf] = hostname
            w.writerow(result)

nmb_bin = "/usr/bin/nmblookup"
ipval = re.compile(r"\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b")
ipre = re.compile(r"^\s+(.*?)\s+<00>\s+-\s+M\s+")

main()
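Before wiring the script into Splunk, it's worth sanity-checking the two regexes on their own. The snippet below is a standalone sketch; the sample nmblookup -A line is hypothetical, and your Samba version's output may differ slightly:

```python
# Standalone sanity check for the two regexes used by nmblookup.py.
import re

# IP validation: four dotted octets, each in the range 0-255
ipval = re.compile(
    r"\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\."
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)

# Pull the NetBIOS name out of the "<00>" unique-name line
ipre = re.compile(r"^\s+(.*?)\s+<00>\s+-\s+M\s+")

# Hypothetical nmblookup -A response line (format may vary by Samba version)
sample = "        MYHOST          <00> -         M <ACTIVE>"

valid = ipval.match("192.168.1.10") is not None    # well-formed address
invalid = ipval.match("999.1.1.1") is not None     # out-of-range octet
m = ipre.match(sample)
hostname = m.group(1) if m else None
```

The non-greedy group keeps trailing padding spaces out of the captured hostname.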
Once this was done, I registered the command inside of transforms.conf:
[nbtstat_hostlookup]
external_cmd = nmblookup.py clientip clienthost
fields_list = clientip, clienthost
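Once the stanza is in place, a search can call the external lookup explicitly. This is just a sketch; the lookup name and field names match the transforms.conf example above, but the sourcetype here is only an illustration:

sourcetype=access_combined | lookup nbtstat_hostlookup clientip OUTPUT clienthost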
Now comes the crux of the matter. I created a scheduled search that queries my log sources every 7 minutes (DHCP lease churn varies, so this interval is a best-effort choice based on how many logs you get).
This search builds a list of IP addresses observed in my indexed log sources along with the time, executes the nbtstat lookup, and then generates a lookup table that I can use at a later date!
Here's the search string, I'll leave the remainder of the setup to you (don't hesitate to ask if you need help figuring out the rest).
I had to tweak the script above to make it work properly on RHEL. I got it working via the command line:
/opt/splunk/bin/splunk cmd python ../bin/nmblookup.py nbthost clientip < temp.txt
where temp.txt is a CSV file of nbthost and source IP addresses.
When using the lookup command with the transforms.conf settings above and restarting the Splunk service, Splunk isn't even running the Python script unless I pipe table to lookup first. Weird.
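For example, an invocation along these lines does trigger the script (the sourcetype name is only illustrative):

sourcetype=squid | table clientip clienthost | lookup nbtstat_hostlookup clientip OUTPUT clienthost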
What is the scenario/use case? (There may be a better way to accomplish this.) What IPs are you trying to track, and how are you discovering them? With NBTSTAT, you should only see hosts in your broadcast domain, or those that unicast with your host. Are you running this on the WINS server, looking for cached NetBIOS names from clients who have made NBN registrations? The NBN cache has a default timeout of something like 10 minutes -- you might need to adjust the registry entry to cache them longer.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\NetBT\Parameters\CacheTimeout
Your script that runs NBTSTAT should do the translation, if you want them done prior to indexing. I recommend having it provide both the IP and NetBIOS name to the indexer. This way, you won't have to rely on the DHCP logs, but can use them as another measure to ensure the validity of your data.
You are correct: Splunk can do the correlation between your DHCP logs and NBTSTAT script data, based on lease time and timestamp. You might even want to correlate them with WINS events, and Splunk can do this, too.
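As a sketch of how the time-based piece could be wired up: transforms.conf supports temporal lookups via time_field and the offset settings. The stanza below is illustrative only -- the lookup file and field names are hypothetical:

[dhcp_lease_lookup]
filename = dhcp_leases.csv
time_field = lease_time
time_format = %s
max_offset_secs = 86400
min_offset_secs = 0

With that in place, Splunk matches each event against lease records whose lease_time falls within the configured offsets of the event's timestamp.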
As a side note, the Windows DHCP app might be of assistance here: http://splunkbase.splunk.com/apps/All/4.x/app:Windows+DHCP
Hi Ron,
Thanks for the input. Basically, I have Squid logs that provide me with the IP addresses of clients and the sites they visited. I'm crossing this with a list of known malware domains to spot possible botnet infections.
The problem with a DHCP environment (as well as some VPN connections) is that the source IP changes over time.
Can you point me towards how to do the translation prior to indexing? I'm assuming I'll just need to add something to transforms.conf.