Streamstats by clause


Hi splunkers,

I'm using the streamstats command with the by clause to split the results using another field but the results are not what I expect:

| streamstats last(used) by S_NAME

What fields does streamstats create to hold its results when you use the by clause?
So far I've only been able to use it when I specify the destination field using the "as" clause:

| streamstats last(CPU) as CPULast

But using it like this means I have to specify every field name I want streamstats to use. I'd like to use the "by" clause instead and have streamstats automatically generate stats for each value of the field specified in the "by" clause.

Any ideas? : /




Let's back up here just a bit. I think you may be trying to use streamstats in a way that doesn't really work for your problem. Unfortunately, the community doesn't have the perspective on the problem that you do... Here is my understanding of the problem:

  • Your data occasionally has missing values for different fields. Sometimes CPU is missing, other times MemoryUse is missing, and so on.
  • You want to fill the missing data with the last value of the field. However, the last value of the field is not simply the most recent value. It is the "most recent value for this S_NAME".
  • I think we are confused about S_NAME - what is it?

Two things would help complete this picture: (1) what do you want to do with this data - are you creating a report? Can you describe it? and (2) a sanitized sample of a few events.

I would prefer to come up with a solution that avoids streamstats, partly because it is relatively inefficient. But try this:

| sort S_NAME _time
| streamstats current=f window=1 last(*) as LAST* by S_NAME
| foreach * [eval <<FIELD>>=if(isnull('<<FIELD>>'),'LAST<<FIELD>>','<<FIELD>>') ]
| fields - LAST*

If any of your existing fields start with LAST, you should change that in the example above.

But it would be better if we could come up with a search that calculates the end result directly. This search only gives you events with the fields filled in, with no indication of which events are "real" and which events had missing data.
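If you do need to tell filled-in values apart from real ones, one possible approach is to record a flag before filling. This is only a sketch for a single field: it assumes the field is called used, as in the question, and used_was_missing is a made-up name for the flag.

```
| sort 0 S_NAME _time
| eval used_was_missing=if(isnull(used), 1, 0)
| streamstats current=f last(used) as LASTused by S_NAME
| eval used=coalesce(used, LASTused)
| fields - LASTused
```

The sort 0 is there because sort returns at most 10,000 results by default. The flag has to be set before the eval that overwrites used; after that point the original nulls are gone.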


Also, you might want to put | fillnull after the sort, depending on how your incoming data is defined.



If streamstats works just fine with the "as" clause when using the split "by" clause, I'm doing something wrong because after using:

streamstats last(used) as UsedLast by S_NAME | eval used=if(ifnull(Used,UsedLast,Used))

I don't see a new field called "UsedLast" pop up.

Am I missing something here?




That might very well be the case though the documentation is not helping me, Ayn : /

What I'm trying to do is use streamstats to fill null values. So I use something similar to:

..| streamstats last(used) as UsedLast | eval used=if(ifnull(Used,UsedLast,Used))

But since the data in the "used" field is really not useful unless it's sorted by the "S_NAME" field, I'm resorting to using multiple streamstats statements (one for each known value of "S_NAME").

.. timechart avg(used) by S_NAME | streamstats last(CPUUse) as CPU | streamstats last(MemoryUse) as Memory | streamstats...



No, you're doing something else wrong in that case. streamstats works perfectly fine with what lguinn suggested. streamstats isn't limited to one field name. I'm not saying you're not having problems, because you obviously are, but I think you're jumping to conclusions regarding what the cause of these problems is.


Hi lguinn,

I tried that too. It doesn't work. I'm guessing that since I'm splitting by S_NAME, streamstats needs a different field for each value of S_NAME. And since the "as" clause only lets you specify one field name, it limits me to one field name... In any case, even when there is only one unique value for S_NAME, it does not work : /

Thanks for the thought though!



Why not use both clauses?

| streamstats last(used) as lastUsed by S_NAME
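To see the two clauses working together, here is a small self-contained sketch you can paste into a search bar. The sample data is invented with makeresults (two made-up S_NAME values, hostA and hostB, with one gap each), and n is just a helper counter. It also assumes you want current=f so that each event only looks at earlier events for the same S_NAME. Note that the field names in the eval have to match the case used in streamstats (used, not Used), and that the null check is written with coalesce here rather than if(ifnull(...)).

```
| makeresults count=6
| streamstats count as n
| eval S_NAME=if(n%2==0, "hostA", "hostB")
| eval used=if(n==3 OR n==4, null(), n*10)
| sort 0 S_NAME n
| streamstats current=f last(used) as lastUsed by S_NAME
| eval used=coalesce(used, lastUsed)
```

With this sample, lastUsed should be empty on the first event of each S_NAME (there is nothing earlier to take a value from), and the two events that started with no used value should pick up the previous value for their own S_NAME rather than the one from the other host.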