Okay, so I ended up having to change how the grouping worked and used a bit of trickery to preserve the granularity I wanted. In essence, I pulled fields out of the "by" clause and put them into "values" columns. The part that gave me grief was getting back to the granular counts.
So I took this query:
index=logging sourcetype=mylogs
| sistats count by application, eventcode, eventtext, host
Then I moved a couple of the "by" groupings into "values" columns, like this:
index=logging sourcetype=mylogs
| sistats count, values(host) as host, values(eventcode) as eventcode by application, eventtext
This caused a problem, though, because now I can't easily get the count by host. I did notice that the summary index (SI) event contains a private field called psrsvd_vm_host, with values like this:
server1 ; 2 ; server2 ; 4 ; server3 ; 6 ;
I'm pretty sure I could use that field to get back to the original counts by host, but what I need is the count for each combination of host AND eventcode. The SI fields work individually but not in tandem. I ended up modifying the query to look like this:
index=logging sourcetype=mylogs
| eval errorLocation_{host}_{eventcode} = 1
| sistats sum(errorLocation_*) as *, values(host) as host, values(eventcode) as eventcode by application, eventtext
By doing this I'm able to get back to the host+eventcode counts, and so far it's working well. The number of events has decreased dramatically, and performance when querying the data back seems good. It looks like having fewer, wider events performs better than having more, narrower ones. Thanks.
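For anyone trying to picture what the eval + sum trick does, here is a rough Python model of it. This is not SPL and not how Splunk implements sistats internally; it is just a sketch of the logic. The sample events and the summarize helper are made up for illustration, and the column naming (host_eventcode) assumes the wildcard rename in "sum(errorLocation_*) as *" keeps only the part matched by the wildcard.

```python
from collections import defaultdict

# Hypothetical sample events; field names mirror the post, values are invented.
events = [
    {"application": "app1", "eventtext": "timeout", "host": "server1", "eventcode": "500"},
    {"application": "app1", "eventtext": "timeout", "host": "server1", "eventcode": "503"},
    {"application": "app1", "eventtext": "timeout", "host": "server2", "eventcode": "500"},
    {"application": "app1", "eventtext": "timeout", "host": "server2", "eventcode": "500"},
]


def summarize(events):
    """Build one wide row per (application, eventtext) group.

    Each row holds one counter column per host+eventcode combination,
    modeling: eval errorLocation_{host}_{eventcode} = 1
    followed by: sum(errorLocation_*) as * by application, eventtext
    """
    rows = defaultdict(lambda: defaultdict(int))
    for e in events:
        group = (e["application"], e["eventtext"])
        column = f'{e["host"]}_{e["eventcode"]}'  # assumed column name
        rows[group][column] += 1  # each event contributes 1 to its own column
    return {group: dict(columns) for group, columns in rows.items()}


summary = summarize(events)
print(summary)
```

In this sample, four events collapse into a single wide row, and the joint counts survive as separate columns. Per-host counts alone (server1: 2, server2: 2) would not tell you which eventcodes occurred on which host, which is why the individual SI fields were not enough.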