I Have a look up file called dataset.csv which will have one field,
dataset_name |
dataset1 |
dataset2 |
dataset3 |
I need to display the dataset_names in a table which are not present in the search results but present in lookup , i use the below query to retrieve all the datasets . but using lookup it does not show the missing names.
QUERY :
index = dataIntegration source=piedpiper lambda_events| search event.proc_stat_cd = "SCSS" AND event.s3_location = "*"s3:*//lake/depositescrow*"*" AND "event.module_name"=EMR | rename event.dataset_id as "Id" |rename event.regrd_dataset_nm as Name |rename event.proc_ts as "DateTime"| table "Id" , "Name","DateTime"
I want the results like,how can i achieve this
DatasetName | FilesMoved | Time |
dataset1 | YES | |
dataset2 | NO |
Well... you can use |set diff to check for the values not present in a common subset of two searches https://docs.splunk.com/Documentation/Splunk/8.2.1/SearchReference/Set
Unfortunately it doesn't show which search it is from.
So in order to compare two sets I usually use a little trick
<search one> | eval setsource=1 | append [ <search two> | eval setsource=2 ] | stats sum(setsource) by <all important fields>
This way in the resulting table you have a setsource field showing whether the line comes from first search (1), second (2) or is the line included in both results (3).
Try something
index = dataIntegration source=piedpiper lambda_events
| search event.proc_stat_cd = "SCSS" AND event.s3_location = "*"s3:*//lake/depositescrow*"*" AND "event.module_name"=EMR
| rename event.dataset_id as "Id"
| rename event.regrd_dataset_nm as dataset_name
| rename event.proc_ts as "DateTime"
| table "Id" , "Name","DateTime"
| lookup dataset.csv dataset_name output dataset_name as dataset_name_lp
| eval FilesMoved = if(isnull(dataset_name_lp),"NO","YES")
KV
It is not working , as expected.
It displays the matching search results - which means the files are moved .But not displaying the files missed .
ex: i have two datasets --> first.dat ,second.dat
lets assume first.dat is the only file processed,and the splunk search results have only one log
datasetId | dataset_name | FilesMoved |
1 | first.dat | YES |
2 | second.dat | NO |
Can you please try this?
index = dataIntegration source=piedpiper lambda_events
| search event.proc_stat_cd = "SCSS" AND event.s3_location = "*"s3:*//lake/depositescrow*"*" AND "event.module_name"=EMR
| rename event.dataset_id as "Id"
| rename event.regrd_dataset_nm as dataset_name
| rename event.proc_ts as "DateTime"
| table "Id" , "Name","DateTime"
| rename Name as dataset_name
| append [| inputlookup dataset.csv] | stats count values(*) as * by dataset_name
| eval FilesMoved = if(count==1,"NO","YES")
My Sample Search :
| makeresults | eval _raw="Id,Name,DateTime
1,dataset1,t1
2,dataset2,t2"| multikv forceheader=1 |table Id,Name,DateTime
| rename comment as "Upto now is data only"
| rename Name as dataset_name
| append [| inputlookup car.csv] | stats count values(*) as * by dataset_name
| eval FilesMoved = if(count==1,"NO","YES")
Anyway , i tried to add the input lookup first like below and retrieved the missing dataset names only. and it worked .
| inputlookup data.csv | fields "dataset_name" | search NOT [search index = asvdataintegration source=piedpiper sts_asvdataintegration_symphony_lambda_clewriter_events | search event.proc_stat_cd = "SCSS" OR event.proc_stat_cd = "FAIL" AND event.s3_location = "*"s3://cof-*/"*"/lake/depositescrow*"*" AND "event.module_name"=EMR | rename event.regrd_dataset_nm as dataset_name | table dataset_name | format]
Hi @nandhiniG,
Thank you for sharing your inputs and the SPL that helped you get the results which are present in lookup table and not in search results.
I also have a similar usecase, and I tried following your approach, but I am still getting the outputs which are present in lookup table and in search results.
|inputlookup table.csv |fields index, host |search NOT [search index="xxx" |rename orig_* AS * |table index, host |format]
Thus, please help by sharing your inputs to correct the SPL.
Thank you