Dashboards & Visualizations

Subsearch - Add a column

Vikas_Sharma
Explorer

Hello experts,

How can I get an additional column in the output by matching on a common field. Lets say, below query gives me:
index=1 sourcetype=b | stats sum(price) as total by products, region

products region total
a v 111
b w 222
c x 333
d y 444
e z 555

Now, I have another source that may have tags for each product and there can be multiple tags associated to a single products. The second source has different list and count, i.e, it may have more number of products. So, how can I get something like below for the matching ones for column products from above query:
products region total tags
a v 111 qw,as
b w 222 we
c x 333
d y 444 sd,zx,xc
e z 555

Can I get some help with the query please.

Tags (1)
0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try (will work fine unless the subsearch has too many rows to produce and/or takes lots of time to run)

index=1 sourcetype=b | stats sum(price) as total by products, region
| append [search index=2 sourcetype=c | stats values(tags) as tags by product]
| stats values(region) as region, values(total) as total values(tags) as "total tags" by product
| where isnotnull(region)

View solution in original post

0 Karma

somesoni2
Revered Legend

Give this a try (will work fine unless the subsearch has too many rows to produce and/or takes lots of time to run)

index=1 sourcetype=b | stats sum(price) as total by products, region
| append [search index=2 sourcetype=c | stats values(tags) as tags by product]
| stats values(region) as region, values(total) as total values(tags) as "total tags" by product
| where isnotnull(region)
0 Karma

DalJeanis
Legend

@somesoni2 - what do you think about something like this as a way of getting around the possibility of the second search returning too much data?

index=1 sourcetype=b | stats sum(price) as total by product, region
 | appendpipe 
     [ | stats by product | format | rename search as myproductlist 
       | map search="search index=2 sourcetype=c $myproductlist$ | stats values(tags) as tags by product" ]
 | stats values(region) as region, values(total) as total, values(tags) as "total tags" by product
 | where isnotnull(region)
0 Karma

Vikas_Sharma
Explorer

Thank you for your responses @somesoni2 and @DalJeanis.

I may have multiple product values in my second source which again can have 0 or more tags. I think it will make more sense if I add some data for the second source here.

index=2 sourcetype=c | stats values(tags) as tags by product

product tags
a,f qw,as
b we
c,g,h

d sd,zx,xc
e

So, the product 'a' gets tags 'qw,as' in the final output.

0 Karma

somesoni2
Revered Legend

Did you get a chance to try any of the answers?

0 Karma

Vikas_Sharma
Explorer

There is a problem.
The products column from source1 is single valued but the productlist column in source2 is multivalued.

index=1 sourcetype=b | stats sum(price) as total by products, region

products region total
a v 111
b w 222
c x 333
d y 444
e z 555

index=2 sourcetype=c | stats values(tags) as tags by productlist

productlist tags
a,f qw,as
b we
c,g,h

d sd,zx,xc
e

Output:
products region total tags
a v 111 qw,as
b w 222 we
c x 333

d y 444 sd,zx,xc
e z 555

0 Karma

somesoni2
Revered Legend

I guess the problem is not productlist being multivalued field (the stats should've made it single valued), but the different column names in two sources. Try these

index=1 sourcetype=b | stats sum(price) as total by products, region
 | append [search index=2 sourcetype=c | stats values(tags) as tags by productlist | rename productlist as product]
 | stats values(region) as region, values(total) as total values(tags) as "total tags" by product
 | where isnotnull(region)

AND

( index=1 sourcetype=b) OR (index=2 sourcetype=c) 
| eval product=coalesce(product,productlist)
 | eventstats values(tags) as tags_all by product
 | eval tags=coalesce(tags,tags_all)
 | search  index=1 sourcetype=b
 | stats sum(price) as total values(tags) as "total tags" by product region
0 Karma

Vikas_Sharma
Explorer

Umm, I still think its the problem with the multivalue. All of these queries return data for only those rows which have single values in productlist.

We are renaming productlist as product but 'productlist' may not always have the same matching column values as that in 'product'.

product
a
b
c
d
e

productlist
a,f
b
c,g,h
d
e

For the examples that I have provided, the output is going to be like below and there is not going to be any tags for product 'a':

products region total tags
a v 111
b w 222 we
c x 333
d y 444 sd,zx,xc
e z 555

0 Karma

Vikas_Sharma
Explorer

subsearch query:
append [ search index=2 sourcetype=c | eval temp=split(productlist,",") | mvexpand temp | stats values(tags) as tags by temp | rename temp as product ]

Thank you @somesoni2 and @DalJeanis for helping with this.

0 Karma

somesoni2
Revered Legend

Map command in a subsearch can be brutal (there are good chances that it'll get auto-finalized after 2 mins if there are many products). I would rather use an eventstats implementation for that (to avoid subsearches completely) like this

( index=1 sourcetype=b) OR (index=2 sourcetype=c) 
| eventstats values(tags) as tags_all by product
| eval tags=coalesce(tags,tags_all)
| search  index=1 sourcetype=b
| stats sum(price) as total values(tags) as "total tags" by product region
Get Updates on the Splunk Community!

New Case Study Shows the Value of Partnering with Splunk Academic Alliance

The University of Nevada, Las Vegas (UNLV) is another premier research institution helping to shape the next ...

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...