I have a table with attributes ProductName and TotalSales, and I would like to extract the rows that are in the top 50% of total sales. Naively, I would pipe this into search TotalSales>=median(TotalSales). However, since search doesn't support the median function, Splunk returns no events.
I can make this work via the following hack:
| eventstats median(TotalSales) as MTS | where TotalSales>=MTS | fields - MTS
I'm worried about the efficiency of this hack. If I omit the fields - MTS command, then the output table includes an MTS field, with the median value replicated across every row. With only 20 products this isn't a big deal, but with 500,000 products it is an enormous amount of redundancy in memory.
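For anyone who wants to reproduce the behavior without my data, here is a minimal sketch that uses generated rows. The product names, the sales values, and the helper field n are made up purely for illustration; only the last three commands come from my actual search.

```spl
| makeresults count=5
| streamstats count as n
``` n is a hypothetical helper field used only to fabricate sample rows ```
| eval ProductName="product_".n, TotalSales=n*100
| fields ProductName TotalSales
``` the hack from above: attach the median to every row, filter, then drop it ```
| eventstats median(TotalSales) as MTS
| where TotalSales>=MTS
| fields - MTS
```

Removing the final fields - MTS line shows the MTS field repeated on every remaining row, which is the redundancy I am asking about. The inline comments use the triple-backtick syntax, which I believe requires a reasonably recent Splunk version; they can be deleted without changing the result.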
My question: what is Splunk doing under the hood? That is,
Is Splunk literally replicating the median value in memory dc(ProductName) times, then deleting it once I remove the MTS field?
Or is Splunk smart enough to use the median value just once, and replicate it only if I insist on viewing the entire table with the MTS field?