Splunk Search

[HowTo] Script to automatically fill lookups

christian_l
Path Finder

Hi Splunkers,

one great way to bring additional information into your data is the use of lookups.
One problem I found regarding this was the data for the lookups was changing dynamically.
I had the requirement to build lookups for dynamic data several times and decided to build some type of automation.
The result was a script which I'd like to introduce here in case some else could find this useful. It could be used as some kind of template. The functions itself have to be written individual. I commented the script itself so there shouldn't be a need to explain it.

#!/bin/bash
########################################################################
#
# This script automatically fills lookupfiles for your Splunk app.
# Call the script with the filename of the lookupfile you'd like to fill.
# Lookupfiles must be located in the lookups-folder of your app.
#    Exampl0e: fill_lookups.sh hostnames.csv
#
# author at splunkbase: christian_l
#
########################################################################

# location of the splunk app folder
APP_PATH=/opt/splunk/etc/apps/Your-AppName

LOOKUP_FILE=$APP_PATH/lookups/$1

# count the lines before the update
LINES_BEFORE=`wc -l $LOOKUP_FILE | cut -d' ' -f1`

# create a backup of the lookupfile
cp $LOOKUP_FILE $LOOKUP_FILE.bak


# the following functions are doing the real work. They have to be developed for your individual needs. 
# I'll give two examples, one for a SQL-DB lookup to fill the file, and a second which fills the file 
# from content and filename of multiple files.

# SQL-query example
function hostname_lookup() {

# SQL-host variables
HOST=192.168.1.1
USER=splunk
PASSWORD=pwd4splunk

# write the header line which describes the column content
echo "hostname,hostip,location,device" > $LOOKUP_FILE

    # build the query and send it to SQL-Server       
    mysql -u $USER -h $HOST -p$PASSWORD --execute "SELECT a.host, b.hostip, b.location, c.device AS delete_row
    FROM host-db a, ip-list b, host-models c
    WHERE a.host = b.hostname
    AND a.host NOT LIKE 'staging';" | grep -v "delete_row" >> $LOOKUP_FILE

    # manual corrections by sed

    # delete the splunkhost from the file
    sed -i '/splunkhost/d' $LOOKUP_FILE

    # replace a device-name
    sed -i 's/WRT54G$/OpenWRT-device/g' $LOOKUP_FILE

    # replace the tabs from the SQL-query with spaces and optimize comma-usage
    sed -i 's/\t/,/g' $LOOKUP_FILE
    sed -i 's/,\s/,/g' $LOOKUP_FILE
}


# gather the information from several .txt files in the sources folder
# the information is located in the filenames itself and in the content of the txt-files
function hostname2_lookup {

# write the header line which describes the column content
echo "hostname,hostip,location,device" > $LOOKUP_FILE

for FILENAME_COMPLETE in $APP_PATH/lookups/sources/*.txt
do
    FILENAME=`basename $FILENAME_COMPLETE`
    LOCATION=`echo $FILENAME | cut -d'-' -f3-9 | cut -d'.' -f1`
        while read LINE;
        do
            HOSTNAME=`echo $LINE | cut -d ' ' -f1`
            HOSTIP=`echo $LINE | cut -d' ' -f2`
            DEVICE=`echo $LINE | cut -d' ' -f3-4`
            echo $HOSTNAME,$HOSTIP,$LOCATION,$DEVICE >> $LOOKUP_FILE
        done < $FILENAME_COMPLETE
done

# manual corrections
sed -i 's/Long\,Island\,/Long-Island\,Long-Island\,/g' $LOOKUP_FILE

}


case "$1" in
    hostnames.csv)
            hostname_lookup
    ;;
    hostnames2.csv)
            hostname2_lookup
    ;;
esac

# convert lookup file splunk-readable
iconv -f iso-8859-1 -t utf-8 $LOOKUP_FILE > $LOOKUP_FILE.tmp

# replace old lookup-file with the new, currently generated version
mv $LOOKUP_FILE.tmp $LOOKUP_FILE

# count lines after update
LINES_AFTER=`wc -l $LOOKUP_FILE | cut -d' ' -f1`

# calculate difference
DIFFERENCE=`echo $(($LINES_AFTER - $LINES_BEFORE))`

# write information about changes into the splunk index
if [ $DIFFERENCE -ge 0 ] ; then
    echo "Lookup-File $1 was updated. $DIFFERENCE lines were added."
else
    DIFFERENCE=`echo $(($DIFFERENCE * -1))`
    echo "Lookup-File $1 was updated. $DIFFERENCE lines were removed."
fi

exit 0

Next step is to run the script automatically. It would be possible to schedule it per crontab. But I'd prefer a way which doesn't need a part outside Splunk so I can deploy the app without doing additional changes on the operating-system-level.
I decided to run the script via a scripted input. One additional feature this method brings is to write the update-information (at the bottom of the script) into a index. This information can then be used for a Splunk alert - for example if more than 20 lines were removed from the lookupfile.
The content of the inputs.conf for this could look like the following:

# fill_lookups.sh isn't a real input script - just scheduled to renew the lookup data
[script://./bin/fill_lookups.sh hostnames.csv]
disabled = 0
interval = 1 20 * * *
host = splunk
source = lookup-filler
sourcetype = lookup
index = MyAppIndex

That's all!
Feel free to give feedback and improvement-suggestions.
Thank you.

0 Karma

krugger
Communicator

Here is an idea:

  • install the app Splunk DB Connect, this will create a new External Databases in the Manager.
  • Configure and test the access to the database.
  • Schedule a database input using a database dump, this allow you to run a custom query and save the results in an index.
  • Use the index instead of the lookup. If that doesn't work schedule a search that uses the outputlookup function to write a lookup file from that index.

This might be a bit more messy, than the script depending on how complex a tash you are attempting.

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!