I am writing a query to look for rises in error messages over the past hour. It looks at 15-minute bins from 0 to 60 minutes ago. Rows with 0 error messages are missing from the table, but I need to keep them so that when I run a median over the last 3 time bins, it includes the 0s.
Each API has its own error messages when it fails, and not every error message occurs in every 15-minute bin for a given API.
So far I have this, in run-anywhere SPL, but it's not correct.
The resulting table is too large to show, but it doesn't carry the total traffic values for each 15-minute bin.
I looked at the following solutions, but each is different enough from my case that it only partially worked: I have 2 group-by fields, and the APIs and error messages are not known until the query runs.
Hi Rich, thanks for the help. Unfortunately, this doesn't preserve api and errormsg; those fields are gone from the output.
I did, however, discover a quick solution: I can add an eventstats at the end, after the stats command.
| eventstats values(Traffic) as Traffic by api, Minute
This doesn't fill in any data where that api + minute combination doesn't yet exist, though, so it isn't complete. I think I need to redesign the whole query, as this is getting a little complex. All I'm trying to do is track the failure rate for each type of failure for each API in 15-minute spans over the past hour, so I can see whether the rates are rising or falling. The math does still work, though: if there's no row for a combination, there were no errors, so the failure rate is zero regardless of the traffic value that's missing.
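For the missing api + errormsg + minute rows, one pattern that might work is to pivot the counts wide, fill the gaps with 0, and unpivot again. This is only a sketch on made-up data: the sample rows, the Errors and series field names, and the "::" separator are all assumptions (the separator must not occur in any API name or error message). It also only fills bins in which at least one series had an error, and Traffic would still have to be attached afterwards.

```
| makeresults count=1
| eval raw = "0,apiA,timeout;0,apiA,timeout;15,apiA,timeout;45,apiA,timeout;0,apiB,auth failed;30,apiB,auth failed"
| makemv delim=";" raw
| mvexpand raw
| rex field=raw "^(?<Minute>\d+),(?<api>[^,]+),(?<errormsg>.+)$"
| stats count as Errors by api, errormsg, Minute
| eval series = api . "::" . errormsg
| xyseries Minute series Errors
| fillnull value=0
| untable Minute series Errors
| rex field=series "^(?<api>.+?)::(?<errormsg>.+)$"
| fields - series
| sort 0 api errormsg num(Minute)
```

Combining the two group-by fields into one series field is what lets xyseries and untable work here, since each takes a single series field; the final rex splits it back into api and errormsg.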
If there's a better way to track failure rates over the past hour to see whether they are rising, I'm all ears, but this may do for the time being.
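As a rough sketch of the rising/falling check itself, assuming the rows are already zero-filled and each carries its Traffic value, streamstats can compute a median over the previous 3 bins per api and errormsg. The sample rows and the FailureRate, BaselineRate and Rising field names are made up for illustration, and Minute is assumed to increase with time (oldest bin first).

```
| makeresults count=1
| eval raw = "apiA,timeout,0,4,100;apiA,timeout,15,0,120;apiA,timeout,30,2,110;apiA,timeout,45,9,90"
| makemv delim=";" raw
| mvexpand raw
| rex field=raw "^(?<api>[^,]+),(?<errormsg>[^,]+),(?<Minute>\d+),(?<Errors>\d+),(?<Traffic>\d+)$"
| eval FailureRate = round(tonumber(Errors) / tonumber(Traffic), 4)
| sort 0 api errormsg num(Minute)
| streamstats window=3 current=f global=f median(FailureRate) as BaselineRate by api, errormsg
| eval Rising = if(isnotnull(BaselineRate) AND FailureRate > BaselineRate, "yes", "no")
| table api errormsg Minute Errors Traffic FailureRate BaselineRate Rising
```

Here current=f keeps the current bin out of its own baseline, and global=f makes the 3-row window apply separately to each api and errormsg pair rather than across all rows.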