Splunk Search

How do I get a count from multiple sources after mvexpand on multiple fields?

jlosee
Path Finder

Hi, I'm redoing a search to avoid using join as it was truncating results. I'm trying to get a count after searching multiple sources and using values(field) followed by mvexpand(field), and I'm not getting the counts I expect. This is my query:

(index=index1) OR (index=index2 source="source1" OR source="source2") | stats values(field1) values(field2) values(field3) values(field4) values(field5) by field6| rename values(field1) AS Field1 values(field2) AS Field2 values(field3) as Field3 values(field4) as Field4 values(field5) as Field5 | mvexpand Field1 | mvexpand Field2  | mvexpand Field3 | mvexpand Field4 | mvexpand Field5   | stats count by Field1 , Field2 , Field3 , Field4 , Field5 | eventstats count as Total by Field1 | addcoltotals

So in the old query with join I was getting count values in the hundreds or thousands but with this, most are just 1. Is there something I'm doing wrong with mvexpand? Should I be using eventstats or streamstats?

Hopefully this question makes sense. Thanks!

0 Karma
1 Solution

jlosee
Path Finder

Okay well after talking to someone else on my team we figured a way around all this. We're still using join but we don't have any missing entries. It look like this:

index=index1
| rex field=_raw "rex_field=xxx(?<rexA>\w+.\w+)%23(?<rexB>\w+)" 
| dedup rexA rexB
| top limit=50 showperc=false rexB  
| sort count desc  

| join type=left rexB [search index=index2 source="source1"
| table rexA fieldA fieldB... 
| dedup fieldA] 

| table fieldA rexB fieldB...count 
| search fieldA!="" | head 25 
| addcoltotals labelfield=fieldA label="Top 25 Total"

We also added information from source2 to source1 to simplify things. Thanks for your help everyone!

View solution in original post

0 Karma

jlosee
Path Finder

Okay well after talking to someone else on my team we figured a way around all this. We're still using join but we don't have any missing entries. It look like this:

index=index1
| rex field=_raw "rex_field=xxx(?<rexA>\w+.\w+)%23(?<rexB>\w+)" 
| dedup rexA rexB
| top limit=50 showperc=false rexB  
| sort count desc  

| join type=left rexB [search index=index2 source="source1"
| table rexA fieldA fieldB... 
| dedup fieldA] 

| table fieldA rexB fieldB...count 
| search fieldA!="" | head 25 
| addcoltotals labelfield=fieldA label="Top 25 Total"

We also added information from source2 to source1 to simplify things. Thanks for your help everyone!

0 Karma

somesoni2
Revered Legend

Try something like this

index=index1 OR (index=index2 source="source1" source="source2" source="source3") |  rex field=rex_field "aaa<partA1>aaa" | rex field=rex_field "aaa<partB1> | eval AB=if(index="index1",partA1.":".partB1, partA.":".partB) | dedup AB | fillnull value="0"  | stats count by field1,AB,field2,field3,field4, field5,count | addcoltotals 
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

if() and case() can do the same things, they just look different.

0 Karma

jlosee
Path Finder

It seems that it will only search whichever index I specify in the if clause. What about using case?

0 Karma

woodcock
Esteemed Legend

Give us 2 or 3 events from each source and then give us (based on that data) the desired output followed by a description of the steps take to produce the output. I have no idea what you mean based on what you have shown us so far.

richgalloway
SplunkTrust
SplunkTrust

What are you trying to count? The stats values(field) command dedups its results. If you want to count all values rather than unique ones, use stats list(field).

---
If this reply helps you, Karma would be appreciated.
0 Karma

jlosee
Path Finder

Okay I've been trying out list instead and it looks promising. How would I get a sum of the counts that share some of the fields? e.g. I would like to sum up the ones that have the same field1 and field2.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Try this: ... | stats sum(eval x=if(field1=field2,1,0)) ...

---
If this reply helps you, Karma would be appreciated.
0 Karma

jlosee
Path Finder

Oh sorry what I meant was something like this:
field1 field2 field3
bar foo something
bar foo somethingElse

0 Karma

woodcock
Esteemed Legend

What was the old query?

0 Karma

jlosee
Path Finder
index=index1  | rex field=rex_field "aaa<partA>aaa" | rex field=rex_field "aaa<partB> | eval AB=partA.":".partB | dedup field_foo,AB |  join type=left AB [search index=index2 source="source1" |   eval AB=partA.":".partB | dedup AB| table AB,field1,field4,field5]  | join type=left AB [search index=index2 source="source2"   |   eval AB=partA.":".partB | dedup AB |  table field1 ,AB, field7] | dedup AB | fillnull value="0"  | table  field1,AB,field2,field3,field4, field5,count | addcoltotals 
0 Karma

woodcock
Esteemed Legend

Wow. I strongly suspect (mostly because of the dedups) that this search is also not doing what you would like it to. Will you back up and explain what is in the data sets and how you are trying to connect them?

0 Karma

jlosee
Path Finder

Well at a very high level, index1 has a list of assets we are sending out, whereas index2 is more of a list of assets that we have available. We want to count the assets going through the first index with more information from the second (and the two sources of that index have some of their own data). We have similar fields in both indexes but their not quite called the same thing, so we use regex to combine fields in such a way that we have a matching field in both indexes. I think without the dedups we would just have a lot of unnecessary repeats. In my new query I was trying to avoid the regex but it may be necessary after all.

Sorry I don't think I can be more detailed. Does that clarify things?

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...