Splunk Search

Using Multiple stats list

Hazel
Communicator

Hello,

I am trying to build up a report using multiple stats, but I am having issues with duplication.

I will do one search, eg

index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | stats list(queues) by instance

It splits the events into single lines and then I use stats to group them by instance

I have the following search that does the same for topics

index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics | stats list(topics) by instance

But now I want to join them into one search like this -

index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics] | stats list(queues),list(topics) by instance

The issue that I am having is that at the time I join the topics in, the topics show up multiple times - it will join by instance, so for every queue line it fines it adds the topic line

eg if queues are queue1, queue2 and topics are topic1, you will get

queue1  topic1
queue2 topic1

So, when I do the lists, I get multiple not unique values in list(topics). If you add a uniq/dedup after, it doesnt have any effect. Is there a way to remove the duplicates? I am not approaching this the right way?

Thanks

0 Karma
2 Solutions

Lowell
Super Champion

Have you tried replacing:

| stats list(my_field) by my_group

with:

| stats values(my_field) by my_group

values(field) will give you a multi-valued field with a single occurrence of each unique value. Where as list(field) will give you a multi-value field that contains all of the values of that field in the order they were given.

See Common Stats Functions in the online docs.

It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:

index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance

Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.

View solution in original post

Hazel
Communicator

I have worked this out with 2 points

  1. By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.

  2. Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.

View solution in original post

Hazel
Communicator

I have worked this out with 2 points

  1. By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.

  2. Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.

Lowell
Super Champion

Have you tried replacing:

| stats list(my_field) by my_group

with:

| stats values(my_field) by my_group

values(field) will give you a multi-valued field with a single occurrence of each unique value. Where as list(field) will give you a multi-value field that contains all of the values of that field in the order they were given.

See Common Stats Functions in the online docs.

It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:

index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance

Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.

Lowell
Super Champion

Maybe try using append. As long as have you do something like search1 | append [ search2 ] | stats values()..., by instance could work.

0 Karma

Lowell
Super Champion

To be honest, I've never really been able to make join work for me. I guess I keep doing something wrong, but I always find a different solution that actually works, like using transaction, stats, sometimes even loadcsv append=t... sometimes I use set but that one can be a pain too. Good luck

0 Karma

Hazel
Communicator

Hello. Thankyou for your reply, this is interest - it has helped to remove that problem but has unmasked another. Where I have multiple values for the topic results, when it joins the searches together, it only links in the top value to each, rather than keeping all... any ideas?

Eg. if topic results were sample, sample.topic, sample.topic2, only sample is passed in and kept in the joint search. If i run the topic search separately, I get multiple results for each instance!

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...