Splunk Search

How do I create an multivalue field after stats that don't include it?

dj69
Explorer

Is there a way to aggregate data and then show additional fields as mv fields without running another search? I want to aggregate on only 2 fields but then show additional "thrown away" fields as multi value in the final results.

Sample data:

User    Machine Function
Adam    1           x
Adam    1           y
Adam    1           y
Betty   2           x

Working query:

stats count by User, Machine, Function

Results:

User    Machine     Function    Count
A       1           x           1
A       1           y           2
B       2           x           1

Tried query (doesn't work):

stats count by User, Machine | mvcombine Function | stats sum(Count), User, Machine, Function

Desired results:

User    Machine     Function    Count
A       1           x           3
                      y
B       2           x           1
0 Karma

elliotproebstel
Champion

I think this query will get you the table you want:

| stats values(Function) AS Function, count AS Count BY User, Machine

dj69
Explorer

A similar query over a week's worth of my real data shows 260 seconds for the eventstats version and 330 seconds for the values version. Yours is simpler to write but not as quick. Will confirm by running both again though.

0 Karma

dj69
Explorer

You beat me to my own answer but my approach was slightly different. Waiting for the moderator to approve it but basically I used eventstats which keeps the other values:
| eventstats count by User, Machine | mvcombine Function | eval Function=mvdedup(Function)

0 Karma

dj69
Explorer

Actually I think I just answered my own question, eventstats keeps the data in a form I can combine:

eventstats count by User, Machine | mvcombine Function

Results:

Function    Machine User    count
x           1       Adam    3
y
y
x           2       Betty   1
0 Karma

dj69
Explorer

So now my question, which approach is "better" as far as query costs would be concerned?

0 Karma

dj69
Explorer

Eventstats with many other fields has to be cleaned up, however by removing _* fields and specifying the exact fields to keep, so it becomes:
eventstats count by User, Machine | fields Count, User, Machine, Function | fields - _* | mvcombine Function | eval Function=mvdedup(Function) | sort -Count

0 Karma

elliotproebstel
Champion

You can certainly test both approaches over your data set to see which completes more efficiently, but I'm pretty sure my approach is more efficient. Stats is almost always more efficient than eventstats, I believe, specifically because it discards extra fields immediately.

dj69
Explorer

I would have thought the same, but wanted to know for sure.

So I've run each version of the query over and over again, both for short durations and for long. The variability of server responses while everyone else is running queries makes this hard to test. However, when running them simultaneously, they are mostly in a dead heat (~977 seconds over 30 days, ~10-15 seconds over 4 hours).

When running one of them over and over again, and then doing the same with the other, the average performance of the eventstats is slightly better. I think what might be going on is that while eventstats is using more memory during the query, this doesn't cost much in the way of performance. Stats values, on the other hand, might be wasting performance in upkeep while it throws things away.

elliotproebstel
Champion

Fair enough! Thanks for running the experiments and sharing your results. I've definitely found with Splunk that it's worth the time to run experiments like this up front to tune my approaches, because what works well for one data set may not be as efficient for another. So you're very smart to set up these A/B tests and choose your final approach accordingly.

0 Karma

dj69
Explorer

I pasted the wrong "tried" query but it doesn't really matter. I'm just trying to show that I can't use mvcombine once stats throws away the other fields. The alternative, to mvcombine at the end, can't combine if the Count column changes.

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...