Hi All,
I have an SPL query that runs against an index/sourcetype which has millions of jobnames.
I want my SPL to read a list of jobnames produced by a different query and use it as a subsearch,
OR
I have created a lookup .csv for this list of 16,000 jobnames and want to run my search against it.
How do I do that?
Main SPL that runs on millions of jobnames:
earliest=-7d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta | table jobname, prediction_status, predicted_end_time
Below is an input lookup, freq_used_jobs_bmp_3months.csv, which is a simple two-column file:
jobName, freq_count
I tried to join the main query with this input file. I want to write SPL queries that operate on this list of jobNames only.
earliest=-7d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta | table jobname, prediction_status, predicted_end_time
| lookup freq_used_jobs_bmp_3months.csv jobName output freq_count
|table jobname, freq_count
The above query fails with this error:
na_prod_secure-ist-indexer-1_iapp724.randolph.ms.com-23000] Streamed search execute failed because: Error in 'lookup' command: Could not construct lookup 'freq_used_jobs_bmp_3months.csv, jobName, output, freq_count'. See search.log for more details
I removed any null rows from the file, but I still get the same error.
The other option is to somehow combine/join the main query with a subsearch instead of a lookup file.
Main query:
earliest=-7d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta | table jobname, prediction_status, predicted_end_time
Subsearch that lists the smaller set of jobNames used in the last 3 months:
earliest=-90d index="log-13120-prod-c" sourcetype="autosys_service_secondary:app" OR "autosys_service_primary:app" "request:JobSearch" installation="P*" NOT"*%*" | stats count as freq_count by jobName
Now, how do I join the above two?
As you already have the lookup, just use the contents of this lookup as a subsearch. Don't use join!
Your first attempt might be failing simply due to a field-name case mismatch: you had jobname in your events and jobName in the lookup.
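For example, assuming your event field really is the lowercase jobname, the lookup command lets you map the two names explicitly with its standard AS clause:

| lookup freq_used_jobs_bmp_3months.csv jobName AS jobname OUTPUT freq_count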
I'm not sure, however, which approach would be more effective here (you'd have to check for yourself). As I understand it, you want to return only those events from the "main search" which are included in your lookup.
One approach, which you already tried, is to list all the events from the main search, do a lookup to fill the destination field (in your case, that freq_count field), and filter to include only the results with that field filled.
But that approach has its downside: you have to process the entire huge result set from the main search.
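For illustration, a minimal sketch of that first approach (reusing the explicit field mapping from above and keeping only the matched events):

earliest=-7d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta
| lookup freq_used_jobs_bmp_3months.csv jobName AS jobname OUTPUT freq_count
| where isnotnull(freq_count)
| table jobname, prediction_status, predicted_end_time, freq_count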
As an alternative approach, you can simply use a subsearch to generate a list of jobNames:
<your_search_conditions> [ | inputlookup freq_used_jobs_bmp_3months.csv | table jobName | rename jobName as jobname ] | table ...
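To be clear about what this does: before the outer search runs, Splunk expands the subsearch into a set of OR-ed field conditions, conceptually something like the following (job names made up):

( ( jobname="JOBA001" ) OR ( jobname="JOBB002" ) OR ... )

so the indexers only ever retrieve events for those jobnames.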
One more thing: don't do search | table | lookup. In fact, you almost never want to do search | table | anything at all. The table command moves the processing of your events from the indexers to the search head, which kills any parallelization for the commands further down the pipeline. table should almost never be used as anything other than the last command in the chain.
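If you do need to trim the field set mid-pipeline, fields is the distributable alternative to table; a sketch against the main search above:

earliest=-7d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta
| fields jobname, prediction_status, predicted_end_time
| lookup freq_used_jobs_bmp_3months.csv jobName AS jobname OUTPUT freq_count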
Trying a subsearch, and this fails:
earliest=-1d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta
| join jobname [search earliest=-90d index="log-13120-prod-c" sourcetype="autosys_service_secondary:app" OR "autosys_service_primary:app" "request:JobSearch" installation="P*" NOT"*%*" | stats count by jobName | table jobName, freq_count]
| table jobname, jobName, freq_count
Please note that jobname is the outer query's field, while jobName is the subsearch's field name. Both searches run successfully in two different windows.
No. Don't use subsearches for stuff like this. Remember that subsearches have a limit on the number of returned results (50k by default) and a limited execution time (60 seconds? I don't remember). And what's most annoying about them is that they fail silently if they hit any of those limits. So an inputlookup with a predictable number of results is a relatively good candidate for a subsearch. A complicated search with a long execution time and many returned results is not.
Anyway, your subsearch has one mistake (you do stats count and then try to table the non-existent freq_count field; I assume it's a mistake in re-typing the search here) and one very ugly condition - the "NOT *%*" part - which forces Splunk to go and read all your events from the last 90 days to check whether they match or not; it can't use its indexes. So the search will most definitely time out and, as I wrote above, fail silently and produce incomplete results, if any at all.
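For completeness, here is that subsearch as (I assume) it was intended, with the count named so the final table has something to show. I'm also assuming the second sourcetype term was meant as a sourcetype filter; the NOT condition is kept as-is despite the caveat above:

earliest=-90d index="log-13120-prod-c" (sourcetype="autosys_service_secondary:app" OR sourcetype="autosys_service_primary:app") "request:JobSearch" installation="P*" NOT "*%*"
| stats count AS freq_count by jobName
| table jobName, freq_count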
The below works, slowly. It returns a correct count of approximately 16,300 records:
earliest=-1d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta
| join
[ | inputlookup freq_used_jobs_bmp_3months.csv ]
| stats count, values(freq_count) by jobname
But can I do the reverse, like this?
My logic: only the inputlookup's list of jobs should be considered, and then the other columns matched from the index/laas_appId. Will this be optimized?
Rather than iterating through millions of jobnames from the index, how can I have only those 16.5 thousand jobnames iterated?
inputlookup freq_used_jobs_bmp_3months.csv
| join
[ search earliest=-1d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta ]
| stats count, values(freq_count) by jobname
What is the workaround?
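The subsearch pattern from earlier suggests one candidate workaround (a sketch, with the same field-case assumptions as above): let the inputlookup subsearch restrict which events are fetched from the index in the first place, then enrich the survivors with the lookup:

earliest=-1d index=log-13120-nonprod-c laas_appId=qbmp.prediction* "jobPredictionAnalysis" prediction lastEndDelta [ | inputlookup freq_used_jobs_bmp_3months.csv | table jobName | rename jobName as jobname ]
| lookup freq_used_jobs_bmp_3months.csv jobName AS jobname OUTPUT freq_count
| stats count, values(freq_count) by jobname

This way only the ~16.5K jobs' events are ever retrieved, instead of pulling millions of events and discarding most of them in a join.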
To put the problem statement generically:
I want Splunk queries and data analysis to run on a smaller list of jobs (around 20 thousand in number). I have that list as a CSV and created a lookup.
I now want my SPLs to read ONLY this pool of jobs for analysis.
As you already have the lookup, just use the contents of this lookup as a subsearch. Don't use join!
| lookup freq_used_jobs_bmp_3months.csv jobname output freq_count ... this does NOT work!
Only the below SPL works!
index=log-13120-nonprod-c laas_appId=qbmp.prediction* sourcetype="qbmp.prediction_analyzer:app" "jobPredictionAnalysis" "prediction" "lastEndDelta"
| eval accuracy_category = case( abs(lastEndDelta) <= 600, 10, (abs(lastEndDelta) > 600 and abs(lastEndDelta) <= 1200), 20, (abs(lastEndDelta) > 1200 and abs(lastEndDelta) <= 1800), 30, 1==1,40)
| eval timeDistance_category = case(timeDistance <= 3600, 1, (timeDistance>3600 and timeDistance<=7200),2,(timeDistance>7200 and timeDistance<=10800),3,1==1,4)
| where instance like "P%" and timeDistance_category<=3
| join
[ inputlookup freq_used_jobs_bmp_3months.csv ]
| chart count(predicted_end_time) by timeDistance_category,accuracy_category
And there has to be a join.
Without a join, how will my large number of events (in the index) match, equijoin-style, against the list of 17K jobs in the lookup file?
More generically:
(a lookup file that has one column of 2,000 jobnames)
(join this to a search such that the execution is for the 2,000 jobs above, not the millions of jobs and other fields in the index)
If you want to use a lookup to generate a set of conditions, you can use a subsearch (keep in mind all the restrictions regarding subsearches).
So if you have a lookup called jobs that has a field called ID, and you only want to find events whose jobid field corresponds to that field, you can construct your search like this:
<your_base_search> [ | inputlookup jobs | table ID | rename ID as jobid ]
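If in doubt, you can preview exactly what this subsearch will inject by running it on its own with format appended (format renders the results as a single OR-ed condition string):

| inputlookup jobs | table ID | rename ID as jobid | format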
But this works relatively well for a "reasonable" number of entries in the lookup. I have no idea how (in)effectively it will work with several thousand conditions.
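Applied to this thread's actual searches, that pattern would look something like the following (a sketch, assuming jobname is available as a search-time field; the case() range checks are shortened but equivalent, since branches evaluate in order):

index=log-13120-nonprod-c laas_appId=qbmp.prediction* sourcetype="qbmp.prediction_analyzer:app" "jobPredictionAnalysis" "prediction" "lastEndDelta" [ | inputlookup freq_used_jobs_bmp_3months.csv | table jobName | rename jobName as jobname ]
| eval accuracy_category = case(abs(lastEndDelta) <= 600, 10, abs(lastEndDelta) <= 1200, 20, abs(lastEndDelta) <= 1800, 30, 1==1, 40)
| eval timeDistance_category = case(timeDistance <= 3600, 1, timeDistance <= 7200, 2, timeDistance <= 10800, 3, 1==1, 4)
| where instance like "P%" AND timeDistance_category <= 3
| chart count(predicted_end_time) by timeDistance_category, accuracy_category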