Getting Data In

How to compare and list the difference between search result and lookup file?

cheriemilk
Path Finder

hi team,

1. I have a query with below 2 columns returned only

PQ, ACT
pq1, act1
PQ1, act2
pQ1, act3
pq2, act4
QP2, act5
Pq2, act6
pq3, act7
Pq3, act8
pq_3, act9
...

 

2. Then I have a standard pq list csv file uploaded in splunk. In the csv file, there is a clumn called 'pq' with a standard pq values defined. please check below sample.

PQ
pq1
pq2
pq3
pq4
pq5
pq6
...

 

3. I want to compare the pq values in the splunk query result with the lookup csv file to find out

      a) return the  PQ  and ACT that the PQ value is not exactly matched with the one defined in lookup file, include case sensitive issue.

     b) return the PQ and ACT that the PQ value is in the query result but not in the lookup table

     c) return the PQ and ACT that PQ value is not in the query but in the lookup table.

 

How to compose the query to meet the 3 requirements in step3?

 

BestRegards!

  

Labels (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust
<your search>
| eval from="query"
| append [| inputlookup pq.csv | eval from="csv"]
| stats values(ACT) as ACT values(from) as from by PQ
| where mvcount(from) > 1

View solution in original post

0 Karma

FrankVl
Ultra Champion

Can you clarify a bit what you mean with point a about the non-exact matches? Can you (given this sample data) sketch what the expected output of the query would be?

E.g. having Pq3, pq3 and pq_3, do you want the query to just tell you that pq3 exists in the lookup but Pq3 and pq_3 don't? Or should the query try to also match those 'imperfect' cases and if so should the query highlight that it is a non-exact match?

What you could do to cover all that is this:

First: create a lookup definition "pq" that refers to the pq.csv lookup file and configure that lookup definition to be case sensitive (as it is by default I believe)
Then, run this query (not tested, so you may need to debug the SPL a bit in case I made some typo or so):

...your current query...
| eval cleanedPQ = replace(lower(PQ),"[^a-z0-9]","")
| eval from="query"
| append [ | inputlookup pq.csv | eval from="csv" | eval cleanedPQ = PQ]
| eventstats dc(from) as exact_match by PQ
| eventstats dc(from) as nonexact_match by cleanedPQ
| eval match = case(exact_match>1,"exact",non_exact_match>1,"non-exact",true(),"no match")
| stats values(from) as from values(match) as match values(ACT) as ACT values(cleanedPQ) as cleanedPQ by PQ
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @cheriemilk,

please try something like this:

<your_search>
| eval PQ=lower(PQ)
| stats count values(ACT) AS ACT BY PQ
| eval status="main"
| append [ | inputlookup your_lookup.csv | eval PQ=lower(PQ), status="lookup" | fields PQ count status ]
| stats sum(count) AS total values(ACT) AS ACT values(status) AS status dc(status) AS status_count BY PQ
| eval status=if(status-count=2,"both",status)
| table PQ ACT status

Ciao.

Giuseppe

0 Karma

cheriemilk
Path Finder

hi @gcusello 

Thank you for the help.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @cheriemilk ,

if one answer solves your need, please accept one answer for the other people of Community or tell us how we can help you.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the Contributors;-)

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
<your search>
| eval from="query"
| append [| inputlookup pq.csv | eval from="csv"]
| stats values(ACT) as ACT values(from) as from by PQ
| where mvcount(from) > 1
0 Karma

cheriemilk
Path Finder

Hi @ITWhisperer 

Thank you for the help. is the stats  values() function case senstive? Does it treat 'PQ1' , 'pq2', 'Pq2' and 'pQ2' as a same value or treat them to 4 different values?

| stats values(ACT) as ACT values(from) as from by PQ

 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The by PQ will treat the values in these fields case-sensitively i.e. PQ1 is not the same as Pq1 or pQ1 or pq1

0 Karma
Get Updates on the Splunk Community!

Detecting Brute Force Account Takeover Fraud with Splunk

This article is the second in a three-part series exploring advanced fraud detection techniques using Splunk. ...

Buttercup Games: Further Dashboarding Techniques (Part 9)

This series of blogs assumes you have already completed the Splunk Enterprise Search Tutorial as it uses the ...

Buttercup Games: Further Dashboarding Techniques (Part 8)

This series of blogs assumes you have already completed the Splunk Enterprise Search Tutorial as it uses the ...