Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Splunk Search

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Community
- :
- Splunk Answers
- :
- Using Splunk
- :
- Splunk Search
- :
- equivalence of sql join of two different group bys

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

eyaler

Explorer

03-30-2015
07:59 AM

i have data of the form:

day, hour, seller, buyer

i want to find all instances where a seller appears only on a single day and sells to more than 10 different buyers in the same hour.

in sql i would do:

select a.* from (select day, hour, seller, count(distinct buyer) as cnt*buy from sales group by day, hour, seller) a join (select seller, count(distinct day) as cnt*date from sales group by seller) b on a.seller=b.seller where cnt*buy>10 and cnt*date=1

clarification: I would like to combine the two following queries:

(1) find all sellers which appear only on single day:

`| stats dc(day) as cnt_date by seller| where cnt_date=1`

(2) find all instances where one seller sold to more than 10 buyers in

one hour:`| stats dc(buyer) as cnt_buy by day, hour, seller | where cnt_buy>10`

now i want to join the above two so that only the sellers in (1) are used in (2). what would be an efficient

splunk way to do so?

1 Solution

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Re: equivalence of sql join of two different group bys

rsennett_splunk

Splunk Employee

03-30-2015
08:39 AM

This may seem oversimplified... but if your data in Splunk holds day,hour,seller,buyer fields for each event, all you would have to do is group the fields, so that you were counting the "buy" and sorting by the other stuff so it narrowed down by seller:

`|stats count by day hour seller|where count>=10`

In this case, the "buyer" is present on all of them... and how you order the fields after the 'by' will determine what exactly you are counting. In other words, the last field in the list:seller is counted per hour, per day

the transition from thinking in terms of rows and columns to thinking in terms of events and how to manipulate them, does take a bit of acclimation...

If your data is really in two physical data sources, you need to show the data more clearly so we can give you the syntax but unless I'm mistaken, the join was recursive so you could get the count. (my sql is quite rusty at this point). In this case, you are counting the events so you don't have to create something to represent the "count". It's more of a by product of the calculation...

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Re: equivalence of sql join of two different group bys

eyaler

Explorer

04-09-2015
08:34 AM

1. find all sellers which appear only on single day:

| stats dc(day) as cnt*date by seller| where cnt*date=1

2. find all instances where one seller sold to more than 10 buyers in one hour:

| stats dc(buyer) as cnt*buy by day, hour, seller | where cnt*buy>10

now i want to join the above two so that only the sellers in (1) are used in (2).

what would be an efficient splunk way to do so?

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

somesoni2

SplunkTrust

04-09-2015
09:13 AM

Try this

```
base search | stats dc(buyer) as buyers by seller, day, hour | eventstats dc(day) as days by seller | where buyers>10 AND days=1
```

Highlighted
##

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Re: equivalence of sql join of two different group bys

rsennett_splunk

Splunk Employee

04-09-2015
09:15 AM

@somesoni2: I was actually thinking sub search given the clarification... but I'm stuck on the time factor at the moment...

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

Highlighted
##

looks good. it would however seem more efficient to first filter sellers by the second condition (days=1) and only then do the hourly counts on those sellers. could you provide a way to do it?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Re: equivalence of sql join of two different group bys

eyaler

Explorer

04-09-2015
09:53 AM