Hello,
I am trying to build up a report using multiple stats, but I am having issues with duplication.
I will do one search, eg
index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | stats list(queues) by instance
It splits the events into single lines and then I use stats to group them by instance
I have the following search that does the same for topics
index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics | stats list(topics) by instance
But now I want to join them into one search like this -
index="ems" sourcetype="queueconfig" | multikv noheader=true | rename Column_1 as queues | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | rename Column_1 as topics] | stats list(queues),list(topics) by instance
The issue that I am having is that at the time I join the topics in, the topics show up multiple times - it will join by instance, so for every queue line it fines it adds the topic line
eg if queues are queue1, queue2 and topics are topic1, you will get
queue1 topic1 queue2 topic1
So, when I do the lists, I get multiple not unique values in list(topics). If you add a uniq/dedup after, it doesnt have any effect. Is there a way to remove the duplicates? I am not approaching this the right way?
Thanks
Have you tried replacing:
| stats list(my_field) by my_group
with:
| stats values(my_field) by my_group
values(field)
will give you a multi-valued field with a single occurrence of each unique value. Where as list(field)
will give you a multi-value field that contains all of the values of that field in the order they were given.
See Common Stats Functions in the online docs.
It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:
index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance
Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.
I have worked this out with 2 points
By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.
Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.
I have worked this out with 2 points
By default, join will only join one result to each which is why my others get lost. Adding "join max=0..." means that it will join all results where they match, not just the first match.
Ive then followed this all with a stats values(queues),values(topics) by instance as suggested by Lowell which takes only the unique values for each and creates my columns.
Have you tried replacing:
| stats list(my_field) by my_group
with:
| stats values(my_field) by my_group
values(field)
will give you a multi-valued field with a single occurrence of each unique value. Where as list(field)
will give you a multi-value field that contains all of the values of that field in the order they were given.
See Common Stats Functions in the online docs.
It may also beneficial to do multiple stats operations. I couldn't test this, but here's a guess at slightly different approach:
index="ems" sourcetype="queueconfig" | multikv noheader=true | stats values(Column_1) as queues by instance | join instance [search index="ems" sourcetype="topicconfig" | multikv noheader=true | stats values(Column_1) as topics by instance] | stats values(queues),values(topics) by instance
Side note: You may find this search could be a good candidate for leveraging macros, since a large portion of it is duplicated. Sometime that helps readability and reusablility.
Maybe try using append
. As long as have you do something like search1 | append [ search2 ] | stats values()..., by instance
could work.
To be honest, I've never really been able to make join
work for me. I guess I keep doing something wrong, but I always find a different solution that actually works, like using transaction
, stats
, sometimes even loadcsv append=t
... sometimes I use set
but that one can be a pain too. Good luck
Hello. Thankyou for your reply, this is interest - it has helped to remove that problem but has unmasked another. Where I have multiple values for the topic results, when it joins the searches together, it only links in the top value to each, rather than keeping all... any ideas?
Eg. if topic results were sample, sample.topic, sample.topic2, only sample is passed in and kept in the joint search. If i run the topic search separately, I get multiple results for each instance!