Backfill automated bash script timeout: Is there a best practice on how much data can be backfilled per thread/search?

Powers64
Explorer

I have created a bash script to automate backfilling missing data without overloading the server. However, when I increase the number of threads and the time span of each search, some backfills are skipped due to an error. Given the settings below (within the script), is there a best practice for how much data can be backfilled per thread/search?

#!/bin/bash
# Backfills summary-index data for scheduled searches that failed to run
# (because of license issues or other problems) over data already ingested into Splunk.

# Timestamp used for log file names
_now=$(date +"%Y-%b-%d_%Hh_%Mm_%Ss")

############################
# Required information
############################
# Splunk path
splunk_dir=/opt/splunk/bin
# Log path
log_dir=/opt/scripts/logs
# Splunk username (not the Linux username) to run the backfill under
username=Powers64
# Name of the app the search resides in
app="SmartyPants"
# Name of the saved search. It is passed below as a single quoted argument, so spaces and hyphens in the name are fine.
search_name="Summary - SmartyPants - 5 minutes"
# Search earliest time (epoch)
et=1463247600
# Search latest time (epoch)
lt=1463605200
# How often does the search run? [in seconds]
seconds=300
# Maximum number of scheduled runs to backfill per invocation of fill_summary_index.py
maxq=10
# When this option is set to true, the script does not run saved searches for a scheduled timespan if data already exists in the summary index for that timespan.
dedup="true"
# Specifies that the summary indexes are not on the search head but are on the indexers instead. To be used in conjunction with -dedup
nolocal="true"
# Maximum number of concurrent searches to run
concurrent=2
####
# For more info on managing backfill visit http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Managesummaryindexgapsandoverlaps
####
############################
# End of required information
############################

echo "Please enter the password for $username (input is hidden; press [Enter] when done): "
read -rs password

cd "$splunk_dir" || exit 1
# Seconds of search time covered by one backfill invocation
queries=$((seconds * maxq))

# Runs one backfill invocation per window of $queries seconds
for ((current = et; current < lt; current += queries))
do
    # End of this window, clamped to $lt so the final (possibly shorter) window does not overshoot
    qrun=$((current + queries))
    if [ "$qrun" -gt "$lt" ]; then
        qrun=$lt
    fi

    echo "Running backfill from $current to $qrun"
    # tee -a appends, so the output of earlier windows is not overwritten
    ./splunk cmd python fill_summary_index.py -app "$app" -name "$search_name" -et "$current" -lt "$qrun" -dedup "$dedup" -nolocal "$nolocal" -showprogress true -j "$concurrent" -auth "$username:$password" 2>&1 | tee -a "$log_dir/$_now.output"
    echo "$_now - $app '$search_name' $current $qrun" >> "$log_dir/backfill_history.log"

    if [ "$qrun" -lt "$lt" ]; then
        completed=$(((qrun - et) * 100 / (lt - et)))
        echo "$completed% complete - pausing for 15 seconds to avoid overloading the server"
        sleep 15
    else
        echo "100% complete - Backfill completed! Yippy"
    fi
done
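
Because the question is about windows that get skipped on error, one option is to check the exit status of each invocation and record the failed windows for a later retry. The following is a minimal sketch, assuming fill_summary_index.py returns a non-zero exit status when a backfill fails (not verified here). `run_backfill` and `backfill_windows` are hypothetical names, and `run_backfill` only simulates a failure so the sketch can run without Splunk.

```shell
#!/bin/bash
# Hypothetical stand-in for the real fill_summary_index.py call.
# Replace the body with the ./splunk cmd python ... command from the script.
run_backfill() {
    local start=$1 end=$2
    # Simulated failure for illustration only: the window starting at 3000 fails.
    [ "$start" -ne 3000 ]
}

# Walks [et, lt) in steps of $step seconds, clamps the last window to lt,
# and collects the windows whose command returned a non-zero status
# in the global array "failed".
backfill_windows() {
    local et=$1 lt=$2 step=$3 current end
    failed=()
    for ((current = et; current < lt; current += step)); do
        end=$((current + step))
        if [ "$end" -gt "$lt" ]; then end=$lt; fi
        if ! run_backfill "$current" "$end"; then
            failed+=("$current-$end")
        fi
    done
}

backfill_windows 0 7000 3000
echo "Failed windows: ${failed[*]}"
```

When the real command is piped into tee, `$?` reports the status of tee rather than of the Splunk command; `${PIPESTATUS[0]}` or `set -o pipefail` gives the status of the first command in the pipeline.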

woodcock
Esteemed Legend

This looks pretty good. If you are still looking for performance or safety improvements, I suggest converting from summary indexing (SI) to accelerated data models + tstats.


Raghav2384
Motivator

First, the script looks very good. It offers a lot of options to choose from. I do a lot of backfilling myself, based on search names, apps, etc.

I have modified fill_summary_index.py to better suit a search head clustering environment, and I pick a time when few scheduled searches are running. The -j option cannot effectively exceed the number of cores on the search head: I can put 1000 in there, but if my machine has only 16 cores, 16 concurrent searches is all it can run at any given time.
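
The cap described above can be applied before calling the script. This is a minimal sketch, assuming the core count of the search head is the right ceiling (that is the rule of thumb from this post, not a documented Splunk limit). `cap_jobs` is a hypothetical helper, and `nproc` comes from GNU coreutils.

```shell
#!/bin/bash
# cap_jobs REQUESTED [CORES]
# Prints the smaller of the requested -j value and the core count.
# CORES defaults to the output of nproc (GNU coreutils).
cap_jobs() {
    local requested=$1
    local cores=${2:-$(nproc)}
    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}

# Ask for 1000 concurrent searches; get at most one per core.
concurrent=$(cap_jobs 1000)
echo "Using -j $concurrent"
```

The optional second argument makes it possible to pass the core count of a different machine, for example when the script is prepared on one host and run on the search head.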

I typically rely heavily on -dedup true, as it does not hurt performance (it simply skips a job that has already run).

That being said, there is no backfilling best practice per se. I do, however, pick a list of searches that have a lot in common (for example, schedule time ranges). If I have 10 searches that run within 15 minutes of each other, I pick an -et and -lt that cover the search window for all 10 and use -dedup true to skip the ones that already ran.

./splunk cmd python fill_summary_index.py -app search -name "All the crazy summaries" -dedup true -showprogress true -j 16 -owner admin -auth admin:admin

(-j 16 is all my search head can handle: 16 searches concurrently.)

Since the script you wrote covers everything, the only way to improve performance further is to run some of the summary backfills from a different SHC member (if you have search head clustering) or even search head pooling. The reason I had to edit fill_summary_index.py is that I do not store any summary data on the search heads; I forward everything from the SHC to the indexers.

Hope this helps!

Thanks,
Raghav


Powers64
Explorer

Raghav2384, thanks for the reply. I noticed that when I try to backfill a search job that runs every 5 minutes with over 100k events per search, it errors out if I use a wide backfill time window. On the other hand, when I backfill a search job that runs every hour with 9k events per search over a very large backfill window, it has no issue.

I figured there is a limitation on how many events per search job can be backfilled.
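
To compare the two cases, it can help to estimate how many scheduled runs, and roughly how many events, a backfill window covers. This is a rough sketch using the figures above (100k events per run for the 5-minute search, 9k per run for the hourly one). The one-day window is an arbitrary example and `window_load` is a hypothetical helper; it only does the arithmetic and says nothing about where an actual limit, if any, lies.

```shell
#!/bin/bash
# window_load WINDOW_SECONDS INTERVAL_SECONDS EVENTS_PER_RUN
# Prints "<runs> <events>": the number of scheduled runs that fall in the
# window and the approximate total events those runs would process.
window_load() {
    local window=$1 interval=$2 events_per_run=$3
    local runs=$((window / interval))
    echo "$runs $((runs * events_per_run))"
}

day=86400  # one day of backfill, chosen only as an example

read -r runs events < <(window_load "$day" 300 100000)
echo "5-minute search: $runs runs, ~$events events per day of backfill"

read -r runs events < <(window_load "$day" 3600 9000)
echo "Hourly search: $runs runs, ~$events events per day of backfill"
```

For the same window, the 5-minute search covers 12 times as many scheduled runs as the hourly one, on top of the larger event count per run.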

As for your change to fill_summary_index.py, there is a -nolocal argument that "Specifies that the summary indexes are not on the search head but are on the indexers instead. To be used in conjunction with -dedup"


ddrillic
Ultra Champion

Where does the fill_summary_index.py Python script come from?


Powers64
Explorer

It is a script that ships with Splunk for backfilling the summary data generated by scheduled searches.
http://docs.splunk.com/Documentation/Splunk/6.4.1/Knowledge/Managesummaryindexgapsandoverlaps
