Other Usage

NOT IN Subquery syntax

ycho1
Explorer

hello, everyone

I have a question about how to write a subquery in Splunk.

for example
I would like to get a list of productId that was returned, but later was not purchased again.
NOT IN Subquery part.

How can I accomplish this?


index=main sourcetype=access_combined_wcookie action=returned
NOT IN [search index=main sourcetype=access_combined_wcookie action=purchase
|table clientip]
| stats count, dc(productId) by clientip

Thank you in advance

 

Labels (2)
Tags (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

There can be probably more than one approach to your problem (one was already presented) but the subquery will not work this way.

As subquery is executed and the results are returned, they are "pasted" into the original query as a condition using field names and values returned from the subquery.

So the IN operator will not with them. With it after subquery expansion you'd have (hypoteticaly - it's not a valid syntax) something like

index=main sourcetype=access_combined_wcookie action=returned
NOT IN (clientip=value1 OR clientip=value2 OR ...)

The last() approach that @bowesmana showed is a neat trick but relies on the time succession.

It might be ok, but you might as well need something a bit different. Depends on your actual data.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Technically it is possible to get the subsearch to return a search string that will work with NOT IN, the syntax would be

<search> NOT your_field IN  [
  search <search>
  | stats count by your_field
  | fields your_field
  | rename your_field as search
  | format "(" "" "" "" "" ")" ]

but there is no value in this for the OP's problem, as this will not handle the basic problem of how to determine which event came after.

 

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Nice to know. I always thought you could only return results as key/value pairs which get interpreted as additional conditions. Can you return any arbitrary string?

0 Karma

bowesmana
SplunkTrust
SplunkTrust

@PickleRick 

Take a look at the format command, the 6 components give you quite a lot of flexibility in returning different ways

https://docs.splunk.com/Documentation/Splunk/8.2.2/SearchReference/Format

 

PickleRick
SplunkTrust
SplunkTrust

Good to know. That's what I needed.

Thanks!

bowesmana
SplunkTrust
SplunkTrust

In order to find an event that did not occur later than one that did you are going to have to use a different method. I don't believe it will be possible to use the subquery to filter the parent query as you are trying.

Instead I would suggest you need to use stats, e.g.

index=main sourcetype=access_combined_wcookie (action=returned OR action=purchase)
| stats latest(action) as lastAction by productId clientIp
| where lastAction="returned"

This will give you the last action for the product/clientIp and then you just want to see if the last action is returned.

Naturally within your search results, the clientIp may have had any number of 'transactions' for that productId, so you will need to consider how you want to address that. 

Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...