For each eventID I have a, b, c fields on multiple hosts. I need to build report to present daily values of a, b, c field and avg(a) for a day for each host.
Something like select avg(a), b, c from ... group by a,b,c. What would be equivalent of this statement in Splunk search?
Probably the direct translation that you are looking for:
... | stats avg(a), first(a), first(b), first(c) by a,b,c
although if you are grouping by
c, it doesn't make a lot of sense to also be asking for
c, as the values will always be identical. (It's similarly pointless in the SQL version.) Note that with Splunk, you must specify
first(b) as b for example, rather than just
b, because you must say what to do if there happen to be multiple values of
b in one of the groups. When you just ask for
b in SQL, it's basically either taking the first, last, or some arbitrary value from the group. In the particular case where you are splitting by
b (among other fields), there will only be one possible value, so it's unnecessary to ask for it. So you could actually get the exact same information with:
... | stats avg(a) by a,b,c
However, you might actually mean to query for:
... | stats avg(a), first(a) as a, first(b) as b, first(c) as c by x,y,z
which would be the equivalent of the SQL:
select avg(a),a,b,c from ... group by x,y,z
(Again, we have to tell Splunk how to resolve potential multiple values of
c. Using the function
first() is basically what SQL does if you don't specify an aggegration function.)
Then, if you need it split by day:
... | bucket _time span=1d | stats avg(a) first(a) as a first(b) as b first(c) as c by _time,x,y,z
will do it.
I'm not sure what a, b, c are, you can't get the average of a and group by it at the same time, I don't think. That doesn't make sense.
If you are searching one day a time, then try one of these:
... | stats avg(a), avg(b), avg(c) by host ... | stats avg(a) by b, c, host
If you are looking for a daily breakdown of the avg(a) over a larger range, then you have to do some trickery do do a "group by" over multiple columns:
... | eval group_field = b." ".c." ".host | timechart span=1d avg(a) by group_field