Splunk - Example external scripted lookup

Hello Splunk Community:

I'm trying to convert several standalone Python scripts into Splunk external lookups and am running into problems.
Any thoughts?

I've looked at the external_lookup.py example that ships with Splunk and created a simple script that should pass the first field through unchanged and fill in the second field.

It works from the Splunk CLI:

sh-3.2# cat csv_test.csv 
field1,field2
mydataexample,

sh-3.2# cat csv_test.csv | /Applications/Splunk/bin/splunk cmd python test_output.py field1 field2 
field1,field2
mydataexample,NoField2_Data

But not from the Splunk UI:

index="ex_firewall" accept or allowed 
| stats count by dst_ip 
| lookup test_output.py dst_ip as field1 

It throws this error:
Error in 'lookup' command: Could not construct lookup 'test_output, dst_ip, as, field1'. See search.log for more details.

I've also placed the script in the proper location:

/Applications/Splunk/etc/apps/splunk/etc/system/bin/test_output.py

And added it to transforms.conf:

sh-3.2# cat /Applications/Splunk/etc/apps/splunk/etc/system/local/transforms.conf 
# Example external lookup
#[dnslookup]
#external_cmd = external_lookup.py clienthost clientip
#fields_list = clienthost,clientip
# Test output external lookup 
[test_output.py] 
external_cmd = test_output.py field1 field2 
fields_list = field1, field2 
external_type = python

This is the script itself

#!/usr/bin/env python
### 
#
# Testing Stub - Splunk lookup external/scripted 
#
### 
import csv 
import sys 
import socket 
def main(): 
    #Check input 
    if len(sys.argv) != 3: 
        print("Usage: python thisfile.py [field1] [field2]")
        sys.exit(1) 
    #Input 
    field1 = sys.argv[1]
    field2 = sys.argv[2]
    infile = sys.stdin
    outfile = sys.stdout 
    r = csv.DictReader(infile) 
    header = r.fieldnames 
    w = csv.DictWriter(outfile, fieldnames=r.fieldnames)
    w.writeheader() 
    # Do something with the fields 
    for result in r: 
        # if both fields are there write out 
        if result[field1] and result[field2]:
            w.writerow(result) 
        elif result[field1]:
            result[field2] = "NoField2_Data" 
            w.writerow(result)
        elif result[field2]: 
            result[field1] = "NoField1_Data"
            w.writerow(result)
main()

Explorer

After some frustrating results, I believe it's finally working.

I wanted to share the script so that others don't go through the same frustration.

It turns out that modeling the script on external_lookup.py, which is used as 'dnslookup' in searches, provides some useful context.

I have now explicitly called out all of the options, I hope.

1st
Splunk feeds the external lookup script CSV data (two columns in this example) and expects CSV data back. That's why we use 'field1' as the input to the external script and 'field2' as the response back to Splunk; 'field2' is initially empty until the script fills it in.

2nd
I've put placeholder strings in for every result, so that nobody gets an empty result back from the external lookup and is left wondering whether it was an error.
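
The two points above can be sketched in memory, without Splunk. This is only an illustration of the CSV-in, CSV-out contract as described here, assuming Python 3; `fill_lookup`, `NO_DATA` and the `resolve` parameter are hypothetical names that are not part of Splunk or of the script below.

```python
import csv
import io

# Placeholder string, same idea as the "NoField2_Data" value used in this thread.
NO_DATA = "NoField2_Data"


def fill_lookup(csv_text, field1, field2, resolve=None):
    """Read lookup CSV text, fill empty field2 cells, and return CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Echo back the same columns that came in; "\n" keeps the output easy to compare.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[field1] and not row[field2]:
            # Use the optional resolver; fall back to the placeholder if it
            # is absent or returns nothing.
            value = resolve(row[field1]) if resolve else None
            row[field2] = value or NO_DATA
        writer.writerow(row)
    return out.getvalue()


# Same shape as csv_test.csv in the question: field2 arrives empty.
sample = "field1,field2\nmydataexample,\n"
result = fill_lookup(sample, "field1", "field2")
print(result)
```

Passing a function as `resolve` (for instance a DNS lookup) swaps the static placeholder for a computed value, which mirrors the static and dynamic variants in the script.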

3rd

SOURCE file "test_output.py"

#!/usr/bin/env python 

##### 
#
# Code Stub - Splunk lookup external/scripted 
#
# 
######## 
# Usage 
# 
# Place this file in $SPLUNK_HOME/etc/system/bin/ and make it owned by 
# the user that runs Splunk, readable and executable: 
#
# chown splunkuser:splunkgroup thisfile.py 
# chmod a+xr thisfile.py 
# 
# Edit transforms.conf at 
# $SPLUNK_HOME/etc/system/local/transforms.conf 
# and add this stanza: 
#[test_output]
#external_cmd = test_output.py field1 field2
#fields_list = field1,field2
# 
# 'test_output' is the name used to call the lookup in a search, for example: 
#   mysearch | lookup test_output field1 
# and field1, field2 are the field names Splunk expects. 
# 
#  

# Python libraries 
import csv 
import sys 
import socket 


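##### Take an action on field1 data: reverse DNS lookup (IP -> hostname) 
def takeactiononfield1data(ip):
    try:
        hostname, aliaslist, ipaddrlist = socket.gethostbyaddr(ip)
        return hostname 
    except Exception:
        # Resolution failed; return a placeholder so the field is never empty 
        return 'NoFQDN'

##### Take an action on field2 data: forward DNS lookup (hostname -> IPs) 
def takeactiononfield2data(host):
    try: 
        # gethostbyname_ex() returns (hostname, aliaslist, ipaddrlist); 
        # gethostbyname() returns only a single IP string 
        hostname, aliaslist, ipaddrlist = socket.gethostbyname_ex(host)
        return ipaddrlist 
    except Exception: 
        return ['NO_IPs_Returned'] 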

##### Main function 
def main():
    #Check input 
    if len(sys.argv) != 3:
        print("Usage: python thisfile.py [field1] [field2]")
        sys.exit(1) 

    # Input 
    field1 = sys.argv[1]
    field2 = sys.argv[2]
    infile = sys.stdin
    outfile = sys.stdout 

    # Read the CSV header that Splunk sends and echo the same columns back 
    r = csv.DictReader(infile) 
    header = r.fieldnames 
    w = csv.DictWriter(outfile, fieldnames=header)
    w.writeheader()

    #Do something with the fields 
    for result in r: 
        # if both fields have value write out 
        if result[field1] and result[field2]: 
            w.writerow(result) 
        # if only field1 provide field2 
        elif result[field1]:
            # Static Just put string data 
            #result[field2] = "NoField2_data"
            # Dynamic 
            result[field2] = takeactiononfield1data(result[field1])
            w.writerow(result)
        # if only field2 provide field1 
        elif result[field2]:
            # Static: just put string data 
            result[field1] = "NoField1_data"
            # Dynamic: run a function instead (it returns a list, so take the first entry) 
            #result[field1] = takeactiononfield2data(result[field2])[0]
            if result[field1]:
                w.writerow(result)

##### Call main function and exit cleanly when run as a script 
if __name__ == "__main__":
    main() 
    sys.exit()
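
To check a lookup script outside the Splunk UI, the CLI test from the question (piping a CSV file into the script) can also be driven from Python. The sketch below is a rough harness under a few assumptions: it uses the current interpreter rather than `splunk cmd python`, it assumes Python 3, and both `STUB` (a minimal stand-in script, not the one above) and `run_lookup` are hypothetical names made up for this example.

```python
import os
import subprocess
import sys
import tempfile

# Minimal stand-in lookup script (hypothetical; written to a temp file below).
STUB = '''
import csv, sys
field1, field2 = sys.argv[1], sys.argv[2]
reader = csv.DictReader(sys.stdin)
writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames, lineterminator="\\n")
writer.writeheader()
for row in reader:
    if row[field1] and not row[field2]:
        row[field2] = "NoField2_Data"
    writer.writerow(row)
'''


def run_lookup(script_path, csv_text, *fields):
    """Pipe csv_text into the script's stdin and return what it prints."""
    proc = subprocess.run(
        [sys.executable, script_path] + list(fields),
        input=csv_text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return proc.stdout


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stub_lookup.py")
    with open(path, "w") as handle:
        handle.write(STUB)
    # Same input as csv_test.csv in the question.
    output = run_lookup(path, "field1,field2\nmydataexample,\n", "field1", "field2")

print(output)
```

Pointing `run_lookup` at the real test_output.py instead of the stub should behave the same way as the `cat csv_test.csv | ... python test_output.py field1 field2` check, as long as the interpreter can run the script.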