Getting Data In

How to filter dynamically based on string match across two different lookups

rohankin
New Member

Hi,
I am new to Splunk and am stuck at the this problem. To elaborate:

I have attached example of datasets and the desired result table that I am working with here.
Datasets that I am using are KVStore lookups.

But basically I am trying to connect dataset 1 to dataset 2 bringing over attributes (Flag A,B & C) based on condition.
Condition is applied to column "Application Name" and there is many to one mapping which is confusing me.

Any help is greatly appreciated !

(PS: Key is to have value of Flag= True in the output if it is true for any of the application mapped to that device name)

  • Rohan alt text
0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

HI rohankin,
try something like this:

| inputlookup Dataset1.csv
| rename "Application Name" AS ApplicationName
| makemv delim="," ApplicationName
| mvexpand ApplicationName
| lookup Dataset2.csv "Application Name" AS ApplicationName OUTPUT Flag-A Flag-B Flag-C
| stats values(ApplicationName) AS ApplicationName values(Flag-A) AS Flag-A values(Flag-B) AS Flag-B values(Flag-C) AS Flag-C BY "Device name"
| rename ApplicationName AS "Application Name"

If possible, avoid spaces in field names!

Ciao.
Giuseppe

View solution in original post

0 Karma

gcusello
SplunkTrust
SplunkTrust

HI rohankin,
try something like this:

| inputlookup Dataset1.csv
| rename "Application Name" AS ApplicationName
| makemv delim="," ApplicationName
| mvexpand ApplicationName
| lookup Dataset2.csv "Application Name" AS ApplicationName OUTPUT Flag-A Flag-B Flag-C
| stats values(ApplicationName) AS ApplicationName values(Flag-A) AS Flag-A values(Flag-B) AS Flag-B values(Flag-C) AS Flag-C BY "Device name"
| rename ApplicationName AS "Application Name"

If possible, avoid spaces in field names!

Ciao.
Giuseppe

View solution in original post

0 Karma

rohankin
New Member

Thanks for the quick reply !!!!!

Did some work and as it turns out "Application Name" field has NULLs which might be affecting the results. And number do add up.

So, dataset 1 has 9959 records that do not have NULL in it and final results after using your query have 9959 records so I guess it is excluding those which do not have any value in "Application Name" field.

Any way we can handle those NULL values ? Those devices that have nothing "Application Name" field should just get all the flags as "False"

0 Karma

gcusello
SplunkTrust
SplunkTrust

HI rohankin,
I was answering to your previous comment, but now I cannot see it, probably you deleted it, is it correct?

Anyway you can filter results with a simple option in the first row

| inputlookup Dataset1.csv WHERE "Application Name"=*

Ciao.
Giuseppe

0 Karma

rohankin
New Member

Hi gcusello,
Ya. I deleted the the comments after realizing the mix-up wit the NULL values but to answer you , I did try using Application Name"=*. But I dont want to exclude any of the devices that have NULL values in "Application Name" field in dataset 1.

For ex.
Dataset 1(Devices): Number of records: 27,223

Dataset 2(Applications): Number of records:4,919 

Desired Output: Expected number of results:27,223
where 9959 records will have Flag values based match between Dataset 1 and Dataset 2(Your initial query is doing this)

and

remaining records (with "Application Name" NULL in dataset 1) will have all the flags set to False.

I feel adding "Application Name"=* will only keep the rows from dataset 1 that has values in
"Application Name" which I dont want to do. I want to keep all the 27,223 records in the output.

0 Karma

gcusello
SplunkTrust
SplunkTrust

HI rohankin,
to add also the empty "Application Name", use this command:
| eval Flag-A=if(isnull("Application Name"),"False",Flag-A), Flag-B=if(isnull("Application Name"),"False",Flag-B), Flag-C=if(isnull("Application Name"),"False",Flag-C)
Ciao.
Giuseppe

0 Karma

rohankin
New Member

Thanks Gcusello ! that worked like a charm. I will have to do some extra work on the data though as it has many duplicates, NULLs and garbage data.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Good work and happy splunking!
If this answer solved your problem, please accept and/ot upvote it.

Ciao and next time!
Giuseppe

0 Karma

rohankin
New Member

Hi Giuseppe,

Ya. Deleted the comment after I realized the mix-up with the NULL values. I thought about using * as you have stated but the problem is I don't want to limit results only to the devices that have the "Application Name ".

Rather for devices that do not have anything in "Application Name" , I would want to display them in the output with all three flags set to "False".

Meaning, the final output should have all the 27,223 records from Dataset 1.
Out of which 9959 records will have flag values based on "Application Name" field while remaining will just have "False" flags as there is nothing in "Application Name" field in dataset 1 for those devices.

Does that make sense ?

  • Rohan
0 Karma