So after a very cool talk on lookups at .conf2011, I decided to dive into Python and create an external lookup using Redis! I should say upfront that I'm pretty new to both Redis and Python, so I may have missed something obvious.
So I wrote my lookup.py script based on the whois and external_lookup.py examples and was able to get that working. My lookup takes the host field and, as long as it's either a hostname or an IP, looks up internal info about the host: network, VLAN, ACL, description, etc. I'm doing a socket connection to a listening netcat server, passing the host field, and building my CSV output from the JSON results.
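For anyone following along, the shape of such a script looks roughly like this. This is a minimal sketch, not the poster's actual code: the field names (`network`, `vlan`, `acl`, `description`) and the `fetch` callable standing in for the netcat/JSON call are assumptions for illustration.

```python
import csv
import sys

# Assumed output schema for the lookup; substitute your own field names.
LOOKUP_FIELDS = ("network", "vlan", "acl", "description")

def merge_info(row, info):
    """Copy the server's JSON reply into the CSV row, defaulting to ""
    when the server had no data for a field."""
    out = dict(row)
    for field in LOOKUP_FIELDS:
        out[field] = info.get(field, "")
    return out

def main(fetch):
    """Splunk external lookups receive rows as CSV on stdin and emit CSV
    on stdout; fetch(host) is the socket call to the netcat server."""
    reader = csv.DictReader(sys.stdin)
    writer = csv.DictWriter(sys.stdout,
                            fieldnames=reader.fieldnames + list(LOOKUP_FIELDS))
    writer.writeheader()
    for row in reader:
        writer.writerow(merge_info(row, fetch(row["host"])))
```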
I got this working totally fine without Redis; however, as soon as I started using Redis on my search head, the lookups failed and I never saw my new fields.
I'm using the RHEL6 version of the Redis server, and a downloaded copy of python-redis which lives in site-packages.
Running

/opt/splunk/bin/splunk cmd python /opt/splunk/etc/apps/search/bin/lookup.py host

works as root and as the splunk user, and gives me the results as expected.
The CSV output from Redis looks OK. It was slightly different from the direct lookup, but I resolved this by checking the Redis DB for the keys, looking them up and populating Redis on a miss, and then using ONLY the Redis results.
In addition, I thought it might be empty values being returned from my lookup (not all of the fields always have data), so I populated any missing value with a placeholder string in Redis, so I'd always get a result.
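That cache-then-fill pattern can be sketched like this. The key prefix, placeholder string, and field names are assumptions, and `r` can be any `redis.Redis`-style client (`hgetall`, and `hset` with the `mapping=` keyword, which exists in redis-py 3.5+):

```python
PLACEHOLDER = "none"  # assumed sentinel; any non-empty string works

# Assumed field names, matching the lookup's output schema.
LOOKUP_FIELDS = ("network", "vlan", "acl", "description")

def fill_missing(info):
    """Replace absent or empty values so every field stored in Redis
    carries a non-empty string."""
    return dict((f, info.get(f) or PLACEHOLDER) for f in LOOKUP_FIELDS)

def cached_lookup(r, host, fetch):
    """Check Redis first; on a miss, call fetch(host), cache the filled-in
    record as a Redis hash, and return it."""
    key = "lookup:" + host
    cached = r.hgetall(key)
    if cached:
        return cached
    info = fill_missing(fetch(host))
    r.hset(key, mapping=info)
    return info
```

Note that a real `redis.Redis` client returns bytes from `hgetall` unless constructed with `decode_responses=True`.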
I'm pretty stumped, so if anyone has any suggestions or has successfully used a custom Redis lookup, I'd be grateful for the feedback! My apologies if this question is a little vague.
So as an update:
Nimish graciously had a conversation with me offline and pointed me in the right direction! In short, I'm doing Redis lookups from a search head against a number of indexers. When the search gets distributed, the indexers run the search and need to reach the Redis DB. I suspect you could run multiple Redis databases, or do what I did, which is open the Redis server on the search head to the indexers.
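Opening the Redis server on the search head to the indexers generally means two things on RHEL: letting redis listen on a non-loopback address, and allowing the indexers through the firewall. A hedged sketch — the bind address and indexer subnet below are placeholders for your own network:

```shell
# /etc/redis.conf — widen (or comment out) the bind directive so redis
# listens on the search head's real interface, not just 127.0.0.1:
#   bind 0.0.0.0

# iptables (RHEL5/6): allow the indexer subnet (placeholder) to port 6379
iptables -I INPUT -p tcp --dport 6379 -s 10.0.0.0/24 -j ACCEPT
service iptables save
```

If the Redis DB holds anything sensitive, the `requirepass` directive in redis.conf is worth setting too, since an open port 6379 otherwise accepts any client.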
So when I built my Redis connection in the script, rather than hitting localhost I do something like this:
red = redis.Redis(host='splunksearch.example.com', port=6379, db=0)
In addition, I found that I could use the EPEL version of python-redis by doing this in my script:
sys.path.append("/usr/lib/python2.4/site-packages/")
Unfortunately, I've got RHEL5 and RHEL6 indexers and search heads in my cluster, which have slightly different paths to Python (/usr/lib/python2.4 vs. /usr/lib/python2.6), so I'm creating a symlink on all my Splunk servers to /usr/lib/python.
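As an alternative to maintaining a symlink on every server, the script itself can append whichever system site-packages directory exists, so RHEL5 (python2.4) and RHEL6 (python2.6) boxes both work unmodified. A sketch, not the poster's actual code:

```python
import glob
import sys

def add_system_site_packages():
    """Append every /usr/lib/python2.*/site-packages that exists, so the
    EPEL python-redis package is importable regardless of OS version."""
    for path in sorted(glob.glob("/usr/lib/python2.*/site-packages")):
        if path not in sys.path:
            sys.path.append(path)

add_system_site_packages()
```

This trades a one-time sysadmin step (the symlink) for a few lines in every script that needs the package.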
So far this seems to be working pretty well! Some of this may be totally self-evident, but I'm pretty new to Python/Redis. In any case, I hope this helps someone else out who wants Redis running with distributed search!
Any chance you can post your lookup script? Without seeing the actual script, I don't know how much help I can be.
Have you looked at ndoshi's Redis lookup app? It's a great starting point for Splunk+Redis (http://splunk-base.splunk.com/apps/27106/redis-lookup).