Splunk Search

mvcount and stats count give different results

Path Finder

I have a log file where each line has an itemId and a clusterId.
When I run the following two queries, I get different results:

| stats count(itemId) as clusterSize by clusterId
| sort - clusterSize

vs

| stats list(itemId) AS items BY clusterId
| eval clusterSize=mvcount(items)
| sort -clusterSize

I don't know if it's a coincidence, but in the second query the largest clusterSize values are exactly 100.

Does anybody have an idea?


Splunk Employee

Per the Splunk documentation, list(X) "returns a list of up to 100 values of the field X as a multivalue entry."
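If you want to see the cap for yourself, here is a minimal sketch that should reproduce it on synthetic data. It assumes a default configuration; the event count of 250, the single clusterId value "A", and the field names countSize and listSize are made up for illustration and are not from the original question.

```spl
| makeresults count=250
| streamstats count AS itemId
| eval clusterId="A"
| stats count(itemId) AS countSize, list(itemId) AS items BY clusterId
| eval listSize=mvcount(items)
| fields clusterId countSize listSize
```

With the default limit in place, countSize should report all 250 events while listSize stops at 100, which matches the behavior described in the question.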


Super Champion

The list() function returns at most 100 values. If a clusterId has more than 100 itemId values, the multivalue field is truncated at 100, so mvcount() in the second query can never report more than 100.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/CommonStatsFunctions#Supporte...

If you're looking for a total count of itemIds by clusterId, the first query works well. If you want to know how many unique itemIds are in each clusterId, try | stats dc(itemId) as clusterSize by clusterId
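As a rough sketch, both numbers can also be computed in a single stats call so they can be compared side by side. The output field names totalItems and uniqueItems are just illustrative choices, and this goes after your existing base search.

```spl
| stats count(itemId) AS totalItems, dc(itemId) AS uniqueItems BY clusterId
| sort - totalItems
```

If the two columns differ for a cluster, that cluster contains repeated itemIds. Neither count() nor dc() builds a multivalue field, so neither is affected by the 100-value cap on list().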



SplunkTrust

Hey,

list(X)
Returns a list of up to 100 values of the field X as a multivalue entry. The order of the values reflects the order of input events.

Have a look at the official documentation: http://docs.splunk.com/Documentation/Splunk/7.0.1/SearchReference/Multivaluefunctions#list.28X.29

So your first query's output is correct. In the second query, list() stops at 100 values per clusterId, so mvcount() undercounts every cluster that has more than 100 items and the largest clusterSize values come out as exactly 100. That is why the two results don't match.

Let me know if this helps!
