Splunk Search

Lookup Script doesn't return search results

bansi
Path Finder

Below is the props.conf at $SPLUNK_HOME/etc/system/local:

[SPLUNK_SERVICE_Log]
lookup_table = namelookup Id OUTPUT Name

Below is the transforms.conf at $SPLUNK_HOME/etc/system/local:

[namelookup]
external_cmd = namelookup.py Id Name
external_type = python
fields_list = Id, Name

Script location:

$SPLUNK_HOME/etc/system/bin/namelookup.py


# File namelookup.py
# ------------------------------
import os,csv
#import pyodbc
import sys
import logging
import logging.config

def main():
    #if len(sys.argv) != 3:
    #    print "Usage: python name_lookup.py [id field] [name field]"
    #    sys.exit(0)
    logging.config.fileConfig("logging.conf")
    # create logger
    logger = logging.getLogger("namelookup")
    # "application" code
    logger.debug("====Inside Main=====")
    idf = sys.argv[1]
    namef = sys.argv[2]
    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True
    d1 = {}
    # Add items
    d1["006981166"] = "John"
    d1["007094117"] = "Mike"
    d1["007094118"] = "Scott"
    for line in r:
        if first:
            header = line
            print "Header:", header
            if idf not in header or namef not in header:
                print "Id and Name fields must exist in CSV data"
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue

        # Read the result
        result = {}
        i = 0
        while i < len(header):
            if i < len(line):
                result[header[i]] = line[i]
            else:
                result[header[i]] = ''
            i += 1

        # Perform the lookup
        if len(result[idf]) and len(result[namef]):
            w.writerow(result)

        elif len(result[idf]):
            result[namef] = lookup(result[idf], d1)
            if len(result[namef]):
                w.writerow(result)


# Given a Id, find its Name
def lookup(id, d1):
    try:
        for key in d1.keys():
            if key == id:
                #print "Value=", d1[key]
                return d1[key]
    except:
        return []

main()

However, when I run the search below, it doesn't return any results under name:

source="Test_Log.txt" | xmlkv entry | lookup namelookup  Id OUTPUT Name | table Id, name

Please let me know where I am going wrong in the script, or where exactly the script is failing. Is there a way to debug the script using the Komodo Edit IDE? I want the debugger to launch the moment I hit Enter in the Splunk Web interface, because I am not even sure the script is invoked by Splunk. I would like to see at least the first print statement in the script printed to the console.

When I tried to run it as a standalone program using the command

splunk cmd namelookup.py 123

it opens a command prompt and immediately closes it, so I am not sure what is going on with this script.

bansi
Path Finder

Thanks for the wonderful explanation, hexx.

I did exactly as you recommended, and my scripted lookup behaves in the same manner as yours, i.e. the results displayed on stdout are as desired:

C:\Splunk\etc\system\bin>db_lookup.py memberId memberName < memberInput.csv

produces the following output, retrieving memberName from the database:

memberId,memberName
006,RANDY
007,LEONY
009,RANDOLPH

However, when I invoke the scripted lookup from a Splunk search as shown below, it doesn't return any results under the memberName column:

source="Test_Log.txt" | xmlkv entry | lookup namelookup  memberId OUTPUT memberName | table memberId, memberName

Please let me know what I am missing.

stefano_guidoba
Communicator

Any update on this?

hexx
Splunk Employee

Hello Bansi.

In agreement with jrodman, I believe that at this point it is important to focus on the validation of the external lookup script at a low level, from the Splunk command line.

To better understand the constraints that an external lookup script must satisfy, I would recommend a careful read of this section of our documentation:

http://www.splunk.com/base/Documentation/4.1.6/Knowledge/Addfieldsfromexternaldatasources#More_about...

As an example, I will demonstrate here how the host/ip external lookup script that ships with Splunk (external_lookup.py) can be validated that way:

1 - Create a partially empty input CSV file that simulates the input that splunk-search will pass to the lookup script for resolution:

This file should contain a header listing the fields we want to work with. One of these should be the "input" field (in our example, "host") and the other should be the "output" field (in our example, "ip"); both are passed as arguments to the script.

Here's what my input file "input.csv" looks like:

host,ip
www.hardware.fr,
www.bash.org,
www.somafm.com,

Note that only the "host" column is populated here. When I feed this file as input to external_lookup.py while specifying that it should look at the "host" and "ip" fields as they are defined in the CSV header, the script will use external DNS resolution to fill in the blanks.

2 - Call the lookup script from the command line, feeding the CSV input file as stdin and specifying the fields to consider within that file as arguments:

# $SPLUNK_HOME/bin/splunk cmd python $SPLUNK_HOME/etc/system/bin/external_lookup.py host ip < input.csv

3 - Check that the results displayed on stdout are as desired:

In our case, we are getting the expected result: the lookup script shows a now complete CSV file on stdout. The "ip" field has been populated for each line by performing DNS resolution on the value of the "host" field for that line:

host,ip
www.hardware.fr,83.243.20.80
www.bash.org,69.61.106.93
www.somafm.com,64.147.167.20

In the context of search, the splunk-search process would use that CSV output to enrich events by adding an "ip" field and populating it with the values generated.

My recommendation to you is to make sure that your own scripted lookup behaves in this manner when tested and can operate with the same arguments and inputs. I also think that it would be a good idea to make your lookup bi-directional, so that it can also look up an ID given a name.
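The three steps above can also be scripted, which makes it easier to rerun the check after each change. What follows is only a rough sketch in Python 3 syntax (the script in the question is Python 2); the helper name run_lookup_script is made up for illustration, and the command in the comment is a placeholder to adapt to your own installation.

```python
import subprocess


def run_lookup_script(command, csv_text):
    """Run a lookup command with csv_text on stdin.

    Returns (stdout, stderr, exit code) so each can be inspected separately.
    """
    completed = subprocess.run(
        command,
        input=csv_text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    return completed.stdout, completed.stderr, completed.returncode


# Example call (placeholder command; adjust paths and field names):
# out, err, code = run_lookup_script(
#     ["splunk", "cmd", "python", "external_lookup.py", "host", "ip"],
#     "host,ip\nwww.bash.org,\n",
# )
# print(out)
```

Keeping stderr separate from stdout matters here, because only stdout is expected to carry the CSV result.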

bansi
Path Finder

Thanks for the wonderful explanation. I did exactly as you recommended, and my scripted lookup behaves in the same manner as yours, i.e. it works fine.

However, when I run the search below, it doesn't return any results under name:

source="Test_Log.txt" | xmlkv entry | lookup namelookup Id OUTPUT Name | table Id, name

bansi
Path Finder

Please note the purpose of the lookup script in my case is to retrieve the Name from the database for a given Id in a Splunk search query, i.e.

.....| lookup namelookup  Id OUTPUT Name 

Please note I modelled the lookup script on external_lookup.py, shipped with the installation, or http://blogs.splunk.com/2009/09/14/enriching-data-with-db-lookups-part-2/

These scripts take "CSV input from Splunk via standard input". This doesn't seem to be working in my case. So is there a way to debug the lookup script to make sure CSV input from Splunk is really supplied to the lookup script via stdin?

To prove my point, I modified the lookup script by commenting out the following lines, and it runs perfectly fine as a standalone program:

#namef = sys.argv[2]  // This doesn't really make sense.
#r = csv.reader(sys.stdin)
#w = None
#header = []
#first = True

#csv.writer(sys.stdout).writerow(header)
#w = csv.DictWriter(sys.stdout, header)

So the question boils down to how to make the script

  • take "CSV input from Splunk via standard input"
  • produce Splunk search results with the Name value from the database. Please note I don't want CSV standard output.
  • most importantly, how to launch a debugger

Any pointers or suggestions to make the script work will be greatly appreciated.

bansi
Path Finder

jrodman, thanks for the suggestion. Would you mind providing an example code snippet showing how to perform the logging you suggested, i.e. open a logfile (lf = open(logfile, "a", 1)) at the beginning of the script and log variable states and debug lines to it with lf.write(), using such things as repr(), str(), and the pprint module?

jrodman
Splunk Employee

I don't know how to use Komodo. A lookup script is typically invoked multiple times and is extremely short-lived. If you know a good recipe to attach a debugger to a very short-lived process, go for it. Personally, I find this approach slow and cumbersome. Opening a logfile at the beginning of the script (lf = open(logfile, "a", 1)) and logging variable states and debug lines to it with lf.write(), using such things as repr(), str(), the pprint module, and so on, produces repeatable, rapid results across tests and modifications.
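As a minimal sketch of the logfile approach described above: the helper names (open_debug_log, log_state, log_pretty) and the log file name are made up for illustration, and the code uses Python 3 syntax, so it may need small adjustments for the Python 2 script in the question.

```python
import pprint
import time


def open_debug_log(path):
    """Open the log in append mode, line-buffered, so each run adds to it."""
    return open(path, "a", 1)


def log_state(lf, label, value):
    """Write one timestamped line; repr() keeps quotes and stray whitespace visible."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    lf.write("%s %s=%s\n" % (stamp, label, repr(value)))


def log_pretty(lf, label, value):
    """Use pprint for nested structures such as a header list or a result dict."""
    lf.write("%s:\n%s\n" % (label, pprint.pformat(value)))


# Typical use at the top of a lookup script (file name is hypothetical):
# lf = open_debug_log("namelookup_debug.log")
# log_state(lf, "argv", sys.argv)
# log_pretty(lf, "header", header)
```

Because the log is a separate file, none of these messages end up on stdout, so they cannot corrupt the CSV that the script hands back to Splunk.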

jrodman
Splunk Employee

I've only read the script diagonally, but it looks like it maps ID to Name, but not Name to ID. Splunk maps in reverse in order to build an efficient search, and then maps forward in order to decorate / enrich the events.

Therefore the script should be able to handle the case where it receives names, and emit completed table entries with both the ID and the name. If this is not possible, you may get what you want by emitting an asterisk as the ID.
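To illustrate the two directions, here is a rough sketch of a lookup that fills in whichever of the two fields is blank. It is written in Python 3 syntax, reuses the hard-coded sample values from the question in place of a database, and the function name run_lookup is an assumption made for this example, not part of any Splunk API.

```python
import csv
import sys

# Sample data copied from the question; a real script would query a database.
ID_TO_NAME = {"006981166": "John", "007094117": "Mike", "007094118": "Scott"}
NAME_TO_ID = dict((name, id_) for id_, name in ID_TO_NAME.items())


def run_lookup(infile, outfile, idf, namef):
    """Read CSV rows from infile and write them back with blanks filled in."""
    reader = csv.DictReader(infile)
    header = reader.fieldnames
    if header is None or idf not in header or namef not in header:
        return  # no usable header: emit nothing rather than non-CSV text
    writer = csv.DictWriter(outfile, fieldnames=header)
    writer.writeheader()
    for row in reader:
        row.pop(None, None)  # discard cells beyond the header width
        id_value = row.get(idf) or ""
        name_value = row.get(namef) or ""
        if id_value and not name_value:
            row[namef] = ID_TO_NAME.get(id_value, "")  # forward: Id -> Name
        elif name_value and not id_value:
            row[idf] = NAME_TO_ID.get(name_value, "")  # reverse: Name -> Id
        writer.writerow(row)


# Only runs when called as: script.py <id field> <name field>
if __name__ == "__main__" and len(sys.argv) == 3:
    run_lookup(sys.stdin, sys.stdout, sys.argv[1], sys.argv[2])
```

Unlike the script in the question, this sketch writes every input row back, leaving the missing field empty when no match is found.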

As for your test method, I would recommend:

  1. Open a command prompt
  2. From the command prompt, run splunk cmd python namelookup.py 123

You could also try this with any Python, such as one you downloaded yourself from the internets.

As a side note, you may want to configure the csv reader to handle large data sizes if you could imagine that you might ever receive malformed data.
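One way to read that side note, sketched here as an assumption rather than the only interpretation: Python's csv module raises an error when a single field exceeds its size limit (131072 characters by default), which an unclosed quote in malformed input can trigger. The limit can be raised with csv.field_size_limit(); the helper name read_rows and the 10 MB figure are arbitrary choices for this example.

```python
import csv


def read_rows(infile, max_field_chars=10 * 1024 * 1024):
    """Read all CSV rows with a temporarily raised per-field size limit."""
    previous = csv.field_size_limit(max_field_chars)  # returns the old limit
    try:
        # list() forces the whole read to happen while the new limit is active
        return list(csv.reader(infile))
    finally:
        csv.field_size_limit(previous)  # restore the earlier setting
```

In a short-lived lookup script, restoring the previous limit is optional; it is kept here only so the helper has no lasting side effects.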

jrodman
Splunk Employee

More clearly, your print statement is a bug. Please test your script externally, and validate that the output is CSV only. Once you are certain the script is producing valid CSV, if the lookup is still not working you may wish to engage Splunk support.
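As a rough illustration of what "CSV only" could mean in practice, the sketch below checks captured stdout against the expected header and column count. The function name check_lookup_output is hypothetical, and the check is deliberately simple; it is written in Python 3 syntax.

```python
import csv
import io


def check_lookup_output(text, expected_header):
    """Return a list of problems found in captured stdout (empty list = looks clean)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["no output at all"]
    problems = []
    if rows[0] != expected_header:
        problems.append(
            "first line is %r, expected header %r" % (rows[0], expected_header)
        )
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_header):
            problems.append(
                "line %d has %d fields, expected %d"
                % (number, len(row), len(expected_header))
            )
    return problems
```

A debug line such as the "Header:" print in the original script would show up here as an unexpected first line, since it lands on stdout ahead of the real CSV header.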

jrodman
Splunk Employee

Sorry, I assumed this part worked. It looks like your script is currently configured to produce debug output (print "Header:"...). Did you start with a specific example?

bansi
Path Finder

I modified the script to run as a standalone program by commenting out these lines:

csv.writer(sys.stdout).writerow(header)
w = csv.DictWriter(sys.stdout, header)

So the question boils down to how to make it work with Splunk, or how to make it write to a CSV file.

bansi
Path Finder

The purpose of the lookup script in my case is, given an Id, to connect to the database and retrieve the Name for that Id. So I am not sure what you are suggesting. Could you please elaborate? Please note we don't have the Name value initially available to us; in fact we are retrieving it from the database by passing the Id value as an argument in the "SELECT" clause. Anyhow, that script is far from working, so I hard-coded the values in a dictionary, with Ids as keys and Names as values, as a proof of concept that Splunk will be able to call the lookup script namelookup.py.
