split-by ("BY") clause of CHART only takes 2 dimensions, we want 3

Splunk Employee

(The 2-dimension restriction is not mentioned in http://www.splunk.com/base/Documentation/latest/SearchReference/Chart, but is real nonetheless.)

SplunkTrust

This might not answer your question, but I think it might help: there's a difference between stats and chart that often goes unnoticed because the syntax is so similar.

To make a long story short, when you run chart avg(eps) by series,date_hour, the first field after the by ends up one-per-row. (We tend to call these 'group by' fields here at Splunk.)
The second field after the by ends up one-per-column. (We tend to call these 'split by' fields.)

The stats command, although its syntax looks very similar, often identical, actually does something quite different from the chart command.

Take this search:

index=_internal source=*metrics.log group=per_sourcetype_thruput
| chart avg(eps) by series, date_minute

Run it in your UI. You'll see each row is a 'series' value, and each column is a 'date_minute' value.

Now compare it to its equivalent using the stats command:

index=_internal source=*metrics.log group=per_sourcetype_thruput 
| stats avg(eps) by series, date_minute

You'll see the first by field in stats, series, ends up as one per row, and... so does date_minute.
Each row is a distinct combination of the N group-by fields.
And there's only one other column, and that's 'avg(eps)'.

So because stats never uses the second dimension, and just keeps expanding out combinations, you can use as many fields as you like.
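As a rough sketch of that point (not part of the original answer), the same search can take a third by field. This assumes the default date_hour field mentioned earlier is present on the events; each row of the result is then one series / date_hour / date_minute combination, with avg(eps) as the only other column.

```
index=_internal source=*metrics.log group=per_sourcetype_thruput
| stats avg(eps) by series, date_hour, date_minute
```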

Incidentally, this confusing similarity of syntax is why we created a new syntax for chart:

| chart avg(eps) over series by date_minute

to show that the first field and the second field are really treated differently.

That said, the downside of using stats instead of chart is that the FlashChart module will not understand the data format...
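One possible way around that, offered here as an assumption rather than something the original answer states, is to pivot the stats output back into a row-per-series, column-per-date_minute table with xyseries, the command used elsewhere in this thread. The alias avg_eps is a hypothetical name chosen for illustration.

```
index=_internal source=*metrics.log group=per_sourcetype_thruput
| stats avg(eps) as avg_eps by series, date_minute
| xyseries series date_minute avg_eps
```

The three arguments to xyseries are the row field, the field whose values become column names, and the field that supplies the cell values.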

Splunk Employee

You may also wish to look at: http://www.splunk.com/support/forum:SplunkReporting/3928

That site will soon be replaced with this one, so to summarize: the desired search

index="MyClusterIndex" | timechart sum(handledRequests), avg(sessions) by source

and its equivalent

index="MyClusterIndex" | chart sum(handledRequests), avg(sessions) by _time,source

are not allowed, but the desired result can be obtained in general with:

index=MyClusterIndex
| stats sum(handledRequests) as hRs, avg(sessions) as ssns by _time, source
| eval s1="handledReqs sessions"
| makemv s1
| mvexpand s1
| eval yval=case(s1=="handledReqs",hRs,s1=="sessions",ssns)
| eval series=source+":"+s1
| xyseries _time,series,yval

Splunk Employee

Interesting; thank you! Not the exact problem this question's about, but good to know.

Splunk Employee

In the special case of the _time field, though, the timechart and chart commands will bucket it automatically. When using stats, you have to apply the bucket command to the _time field explicitly before the stats command.
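A minimal sketch of that explicit bucketing, reusing the index and field names from the searches above. The span=1h value is purely an assumption for illustration; pick whatever bucket size suits the data.

```
index="MyClusterIndex"
| bucket _time span=1h
| stats sum(handledRequests) as hRs, avg(sessions) as ssns by _time, source
```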

Splunk Employee

Say we want to split by build, platform, and test. We can create a synthetic field by concatenating 2 of the 3 desired fields:

... | EVAL platformAndTest = platform . ", " . test | CHART median(value) BY build, platformAndTest

Splunk Employee

Hmm, a most penetrating analysis, sideview. Now if only "stats" were equivalent to "chart". But it's not. So your suggestion does not, how you say, hold water yes yes?

SplunkTrust

Not a bad trick, but if #-of-platforms * #-of-tests is very large, this will start to scale badly, both in the UI and eventually in the backend. What I suspect is more useful is just "stats median(value) by build, platform, test".
