Splunk Search

Using Multiple stats list

Hazel
Communicator

Hello,

I am trying to build up a report using multiple stats, but I am having issues with duplication.

I will do one search, eg

index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | stats list(queues) by instance

It splits the events into single lines and then I use stats to group them by instance

I have the following search that does the same for topics

index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics | stats list(topics) by instance

But now I want to join them into one search like this -

index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics] | stats list(queues),list(topics) by instance

The issue that I am having is that at the time I join the topics in, the topics show up multiple times - it will join by instance, so for every queue line it fines it adds the topic line

eg if queues are queue1, queue2 and topics are topic1, you will get

queue1  topic1
queue2 topic1

So, when I do the lists, I get multiple not unique values in list(topics). If you add a uniq/dedup after, it doesnt have any effect. Is there a way to remove the duplicates? I am not approaching this the right way?

Thanks

0 Karma
2 Solutions

Lowell
Super Champion

Have you tried replacing:

| stats list(my_field) by my_group

with:

| stats values(my_field) by my_group

values(field) will give you a multi-valued field with a single occurrence of each unique value. Where as list(field) will give you a multi-value field that contains all of the values of that field in the order they were given.

See Common Stats Functions in the online docs.

It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:

index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance

Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.

View solution in original post

Hazel
Communicator

I have worked this out with 2 points

  1. By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.

  2. Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.

View solution in original post

Hazel
Communicator

I have worked this out with 2 points

  1. By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.

  2. Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.

Lowell
Super Champion

Have you tried replacing:

| stats list(my_field) by my_group

with:

| stats values(my_field) by my_group

values(field) will give you a multi-valued field with a single occurrence of each unique value. Where as list(field) will give you a multi-value field that contains all of the values of that field in the order they were given.

See Common Stats Functions in the online docs.

It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:

index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance

Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.

Lowell
Super Champion

Maybe try using append. As long as have you do something like search1 | append [ search2 ] | stats values()..., by instance could work.

0 Karma

Lowell
Super Champion

To be honest, I've never really been able to make join work for me. I guess I keep doing something wrong, but I always find a different solution that actually works, like using transaction, stats, sometimes even loadcsv append=t... sometimes I use set but that one can be a pain too. Good luck

0 Karma

Hazel
Communicator

Hello. Thankyou for your reply, this is interest - it has helped to remove that problem but has unmasked another. Where I have multiple values for the topic results, when it joins the searches together, it only links in the top value to each, rather than keeping all... any ideas?

Eg. if topic results were sample, sample.topic, sample.topic2, only sample is passed in and kept in the joint search. If i run the topic search separately, I get multiple results for each instance!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...