
Time from subquery lost when joining to lookup table

mcantrell
Explorer

I have a report that I'm trying to enhance with a lookup table. The existing query does a simple aggregation to count by date and identifier:

type="REST" resource="Order" status="FINISHED" | bucket span=1d _time | stats count by client_id, _time


_time                    client_id   count
6/7/12 12:00:00.000 AM   20006       123
6/8/12 12:00:00.000 AM   20006       117
6/7/12 12:00:00.000 AM   20008       36
6/7/12 12:00:00.000 AM   20009       101

I'm using a lookup table to enhance the data but I also want to see records from the lookup table which had no events. Something like this:


_time                    client_id   Name       count
6/7/12 12:00:00.000 AM   20006       Client A   123
6/8/12 12:00:00.000 AM   20006       Client A   117
6/7/12 12:00:00.000 AM   20008       Client B   36
6/8/12 12:00:00.000 AM   20008       Client B   0
6/7/12 12:00:00.000 AM   20009       Client C   101
6/8/12 12:00:00.000 AM   20009       Client C   0
6/7/12 12:00:00.000 AM   20010       Client D   0
6/8/12 12:00:00.000 AM   20010       Client D   0

I thought I'd be able to accomplish this by using inputlookup and left-joining to the results of the original query:

| inputlookup client_lookup.csv | join type=left client_id [ search type="REST" resource="Order" status="FINISHED" | bucket span=1d _time | stats count by client_id, _time ]

When I do this, the _time field from the subsearch is lost from the output:


client_id   Name       count
20006       Client A   123
20008       Client B   36
20009       Client C   101
20010       Client D

I obviously have my head stuck in RDBMS land! I'm sure there's a better way of doing this, or I'm missing something obvious, but I'm a little stuck at the moment. Any help would be greatly appreciated.


sowings
Splunk Employee

Assuming a client_lookup.csv like:

client_id,Name
20006,Client A

You could do something like:

type="REST" resource="Order" status="FINISHED" | bucket span=1d _time 
| stats count by client_id, _time
| lookup client_lookup.csv client_id OUTPUT Name

No join needed.
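
The lookup approach handles the enrichment, but by itself it still drops clients with no events. As a rough join-free sketch of keeping them visible (field and file names taken from this thread; note the appended zero rows carry no _time, so they appear undated rather than once per day):

type="REST" resource="Order" status="FINISHED" | bucket span=1d _time 
| stats count by client_id, _time
| append [| inputlookup client_lookup.csv | fields client_id | eval count=0]
| eventstats sum(count) as total by client_id
| where count > 0 OR total == 0
| fields - total
| lookup client_lookup.csv client_id OUTPUT Name

The eventstats/where pair keeps every real row, and keeps an appended zero row only for clients whose total activity is zero.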


mcantrell
Explorer

OK, I think I have it now. The join command has a max option, which defaults to 1. I adjusted it to match the number of days I was searching over, and all of the records are showing up.
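
For reference, the change is just the max argument on join; a sketch against the original search (max=0 lifts the one-match-per-key cap entirely, which avoids hard-coding a day count, though this alone still loses _time, per the rename workaround discussed further down the thread):

| inputlookup client_lookup.csv 
| join type=left max=0 client_id 
    [ search type="REST" resource="Order" status="FINISHED" 
      | bucket span=1d _time 
      | stats count by client_id, _time ]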


mcantrell
Explorer

Instead of the expected output:

sent                     client_id   Name       count
6/7/12 12:00:00.000 AM   20006       Client A   123
6/8/12 12:00:00.000 AM   20006       Client A   117
6/7/12 12:00:00.000 AM   20008       Client B   36
6/8/12 12:00:00.000 AM   20008       Client B   0
6/7/12 12:00:00.000 AM   20009       Client C   101
6/8/12 12:00:00.000 AM   20009       Client C   0
6/7/12 12:00:00.000 AM   20010       Client D
6/8/12 12:00:00.000 AM   20010       Client D

It appears it doesn't work exactly like it would in SQL. Maybe there's a better approach to accomplish the same end.


mcantrell
Explorer

Hmmmm, renaming the field helps a little. The date is available now, but something is weird in how it's joining the records: it's finding the first match for each client and then dropping the other dates.

Example:

sent                     client_id   Name       count
6/7/12 12:00:00.000 AM   20006       Client A   123
6/7/12 12:00:00.000 AM   20008       Client B   36
6/7/12 12:00:00.000 AM   20009       Client C   101
6/7/12 12:00:00.000 AM   20010       Client D


gkanapathy
Splunk Employee

Join is really not preferred, but if you want to mess with your original query, rename the _time field to something not beginning with an underscore inside the subsearch, then rename it back after the join.
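
Applied to the original search, that workaround looks something like this sketch (day is an arbitrary stand-in name; max=0 is the cap adjustment from the comment above, and fillnull turns the blank counts from the left join into the zeros shown in the desired output):

| inputlookup client_lookup.csv 
| join type=left max=0 client_id 
    [ search type="REST" resource="Order" status="FINISHED" 
      | bucket span=1d _time 
      | stats count by client_id, _time 
      | rename _time as day ]
| rename day as _time
| fillnull value=0 count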

mcantrell
Explorer

The reason for the join is so that I can also display the clients that don't have any matching events. Without the left join, they drop off the report.

Ultimately, I want to draw attention to clients that have low or no activity. I realize this is sort of the reverse of what Splunk is meant to do, but I've never been shy about driving a square peg through a round hole 😉
