For each eventID I have a, b, c fields on multiple hosts. I need to build report to present daily values of a, b, c field and avg(a) for a day for each host.
Something like select avg(a), b, c from ... group by a,b,c. What would be equivalent of this statement in Splunk search?
Probably the direct translation that you are looking for:
... | stats avg(a), first(a), first(b), first(c) by a,b,c
although if you are grouping by a
, b
, and c
, it doesn't make a lot of sense to also be asking for a
, b
, and c
, as the values will always be identical. (It's similarly pointless in the SQL version.) Note that with Splunk, you must specify first(b) as b
for example, rather than just b
, because you must say what to do if there happen to be multiple values of b
in one of the groups. When you just ask for b
in SQL, it's basically either taking the first, last, or some arbitrary value from the group. In the particular case where you are splitting by b
(among other fields), there will only be one possible value, so it's unnecessary to ask for it. So you could actually get the exact same information with:
... | stats avg(a) by a,b,c
However, you might actually mean to query for:
... | stats avg(a), first(a) as a, first(b) as b, first(c) as c by x,y,z
which would be the equivalent of the SQL:
select avg(a),a,b,c from ... group by x,y,z
(Again, we have to tell Splunk how to resolve potential multiple values of a
, b
, or c
. Using the function first()
is basically what SQL does if you don't specify an aggegration function.)
Then, if you need it split by day:
... | bucket _time span=1d | stats avg(a) first(a) as a first(b) as b first(c) as c by _time,x,y,z
will do it.
I'm not sure what a, b, c are, you can't get the average of a and group by it at the same time, I don't think. That doesn't make sense.
If you are searching one day a time, then try one of these:
... | stats avg(a), avg(b), avg(c) by host
... | stats avg(a) by b, c, host
If you are looking for a daily breakdown of the avg(a) over a larger range, then you have to do some trickery do do a "group by" over multiple columns:
... | eval group_field = b." ".c." ".host | timechart span=1d avg(a) by group_field
Could you provide a sample event? I'm not sure I understand how your data is structured at the Splunk event-level.