Solved: Combine two tables without the use of join and wit...

moonie · ‎08-16-2021

Hello,

I'm working on a really complex search where I need to combine results from different lookup tables. One lookup table is really big with multiple million entries, while the other one is quite small with only a thousand entries.
Both tables have one common field, let's call it "office". The big tables has entries for task which are applied to a certain office. The other table has more information about the office.
Some example data for the task lookup:

office	city	country	importance
xxx	madrid	spain	very important
yyy	paris	france	important

Office table looks similar to this:

office	group	name
xxx	this	aaa
yyy	that	bbb

I want to add the group and name fields to the first task table, without loosing any entries from the task table, so I can continue working with it. I've tried a lot of different approaches but none of them work. I got the best results with this search, but it's still not the outcome I want:

| inputlookup task_lookup 
| eval importance_very_important=if(match(importance, "very important"), 1, 0), importance_important=if(match(importance, "important"), 1, 0), importance_less_important=if(match(importance, "less important"), 1, 0)
| eval source="task"
| append [| inputlookup office_lookup | eval source="office"]
| stats values(source) as source, values(country) as country, values(city) as city, sum(importance_*) as *, values(group) as group, values(name) as name by office
| where mvcount(source)=2

This search gives me the right combination of fields BUT it also combines the different cities and countries, which I don't want, since I need them seperated so I can filter them. I get the following outcome (e.g.):

office

country

city

name

group

very_important

important

less_important

xxx

madrid

paris

spain

france

italy

aaa

this

3

7

8

yyy

rome

paris

france

spain

bbb

that

5

3

4

So all in all I need a result table that doesn't combine any values so I can work with them seperately. I'm at a point where I have no clue how to accomplish this, so any help would be highly appreciated! 🙂
Additional info: I don't want to use join since the first lookup has so many entries, I don't thinks that's going to work. I also can't just use mvexpand, since it doesn't properly expand the counts for the different task counts with their importance.

richgalloway · ‎08-16-2021

While the inputlookup command can be powerful, many seem to forget the usefulness and simplicity of lookup.

| inputlookup task_lookup 
| eval importance_very_important=if(match(importance, "very important"), 1, 0), importance_important=if(match(importance, "important"), 1, 0), importance_less_important=if(match(importance, "less important"), 1, 0)
| eval source="task"
| lookup office_lookup office

---
If this reply helps you, Karma would be appreciated.

View solution in original post

richgalloway · ‎08-16-2021

While the inputlookup command can be powerful, many seem to forget the usefulness and simplicity of lookup.

| inputlookup task_lookup 
| eval importance_very_important=if(match(importance, "very important"), 1, 0), importance_important=if(match(importance, "important"), 1, 0), importance_less_important=if(match(importance, "less important"), 1, 0)
| eval source="task"
| lookup office_lookup office

---
If this reply helps you, Karma would be appreciated.

ITWhisperer · ‎08-16-2021

Alternatively, if your second "table" was not a lookup but a search, you could use eventstats to "join" the "tables"

| gentimes start=-1 increment=10m 
| rename starttime as _time 
| fields - endhuman endtime starthuman 
| eval count=random() % 20 
| eval office=mvindex(split("xxx,yyy,zzz",","),random() % 3)
| eval importance=mvindex(split("very important,important,less important",","),random() % 3)
| eval source="task"
| append
    [| makeresults 
    | eval range=mvrange(1,4)
    | mvexpand range
    | eval office=mvindex(split("xxx,yyy,zzz",","),range % 3)
    | eval country=mvindex(split("Spain,France,Italy",","),range % 3)
    | eval city=mvindex(split("Madrid,Paris,Rome",","),range % 3)
    | fields - range]
| eventstats values(country) as country values(city) as city by office
| where isnotnull(count)

moonie · ‎08-16-2021

This works perfectly, thank you so much!!

I didn't even think of lookup, but I'll def keep it in mind now for further searches. What a great solution. 🙂

Combine two tables without the use of join and without aggregating values

other

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM