Solved: Re: Lookups exclude non matching results

bowesmana · ‎06-02-2017

I have a CSV containing wine names, vintages and prices, e.g.

Description,Vintage,Price
A,2012,100
A,2013,
B,2014,
B,2015,30
C,2010,50

I want to lookup a chosen wine and vintage and return the price for a wine/vintage match, which is easy

| lookup Price Description, Vintage OUTPUT Price

However, I want to be able to show a 'guide' price where the vintage wanted has no price, so I've played around with searching all results, and then trying to filter out any exact matches.

If I am searching for Wine A, 2013 then I would get a hit, with no price, so I would want to see (2012)100
If I am searching for Wine C, 2012, then I would not get a match at all, so I would want to see (2010)50

Getting all matches is also easy

| lookup DMPrice Description OUTPUT Price as Prices, Vintage as Vintages

But then I can't filter out an exact matches I may have found. I have tried using eval statements to replace the output mv field Vintages in the above lookup, but you can't use fields as parameters to replace. Same with rex, so I am a bit stumped.

As with all things Splunk, there are 101 ways to achieve the same end, so hoping someone else can see a solution

acharlieh · ‎06-02-2017

If we can make a slight change to your premise, and never have rows in your lookup without prices, e.g...

| inputlookup Price where Price=* | outputlookup Price

Then one way to solve your puzzle would be...

<base search> | lookup Price Description, Vintage OUTPUT Price 
| lookup Price Description OUTPUT Price as APrice Vintage as AVintage 
| eval APrice = "(".mvindex(mvzip(AVintage,APrice,") "),0), Price=coalesce(Price,APrice)
| fields - APrice,AVintage

What's happening here is that we perform both the exact price lookup and approximate price/vintage lookups, as you have found already. Next we use mvzip to combine the corresponding Vintage and Price pairs for the guide prices (Why it's important that every row in the lookup has a Price, as if a vintage did not, the corresponding lists would not necessarily line up correctly.)... we then use mvindex to select the first one, and an append to finish up the value correctly. Then using coalesce, we're able to pick the exact match price, for those rows with such, but if not present we pick the built guide price.

A variation on this, you could set the maximum number of matches on your lookup to 1, then you only need to use string append to build the guide price, coalesce would then be the same.

View solution in original post

acharlieh · ‎06-02-2017

If we can make a slight change to your premise, and never have rows in your lookup without prices, e.g...

| inputlookup Price where Price=* | outputlookup Price

Then one way to solve your puzzle would be...

<base search> | lookup Price Description, Vintage OUTPUT Price 
| lookup Price Description OUTPUT Price as APrice Vintage as AVintage 
| eval APrice = "(".mvindex(mvzip(AVintage,APrice,") "),0), Price=coalesce(Price,APrice)
| fields - APrice,AVintage

What's happening here is that we perform both the exact price lookup and approximate price/vintage lookups, as you have found already. Next we use mvzip to combine the corresponding Vintage and Price pairs for the guide prices (Why it's important that every row in the lookup has a Price, as if a vintage did not, the corresponding lists would not necessarily line up correctly.)... we then use mvindex to select the first one, and an append to finish up the value correctly. Then using coalesce, we're able to pick the exact match price, for those rows with such, but if not present we pick the built guide price.

A variation on this, you could set the maximum number of matches on your lookup to 1, then you only need to use string append to build the guide price, coalesce would then be the same.

bowesmana · ‎06-03-2017

Hi acharlieh,

You have used mvindex just to get the first one of n, but if I wanted to get all, i.e.

(2012),80.00
(2013),90.00

I have not been able to get the string prefix/suffix of the brackets around the mv elements of the field as you have done in the mvindex command. Any idea how to do that. The idea behind wanting all is that there is also a rating for the wine, so I can compare the price for the year in light of the rating. Just having the first may not be an indicative guide.

acharlieh · ‎06-03-2017

Depending on the size of the lookup and the number of descriptions you're looking for vs the number in your lookup, this might not be the most efficient solution, but it's a possible solution:

<base search> | lookup Price Description, Vintage OUTPUT Price 
| append 
    [ inputlookup Price where Price=* 
    | eval APrice="(".Vintage.")".Price 
    | fields Description, APrice ] 
| eventstats values(APrice) as APrice by Description 
| eval Price=coalesce(Price,APrice) 
| where isnotnull(Vintage) | fields - APrice

Using a subsearch like this has limits for execution time and number of results... But here we're tacking on a formatted guide price for every entry in our lookup table. Using eventstats, we then collect the guide prices for each description together, eval coalesce lets us keep the exact prices already found like before, and the where gets rid of our appended lookup values.

An alternative solution... If you build a scripted lookup, call it GPrice, that uses the existing lookup and returns formatted Guide prices per Description... Your Splunk query could be as easy as:

<base search> | lookup Price Description, Vintage OUTPUT Price 
| lookup GPrice Description OUTPUTNEW GPrice as Price

But I'll note this depends on how you do your scripted lookup, and you're offloading some of the heavy lifting to the python script to transform your existing CSV. (For that matter you could embed all this logic in the script, and have just the one lookup in your Splunk Search).

bowesmana · ‎06-03-2017

Whoa! Thanks heaps, another door opens - scripted lookups... sounds like a interesting approach, I'll go read some docs and try improve my python skills. I'm not a great fan of subsearches due to the side effects you mention, although in this case, it wouldn't be a big problem with the number of results, but I'd rather learn some new technique 🙂

bowesmana · ‎06-03-2017

That's neat, I can actually handle the empty price as there are other fields, such as rating, that may be present even if there is no price, but the mvzip/coalesce() did the trick to get what I wanted.

Thanks!

Lookups exclude non matching results

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM