We are having issues with Kubernetes containers sometimes spamming Splunk with hundreds of GBs of logs. We would like to put together a search that tracks containers with a sudden log spike and generates an alert. More specifically: 1) look at the average rate of events, 2) find the peak, 3) decide on a percentage of that peak, and 4) trigger an alert when a container breaches that threshold.
The closest I have come up with is the search below, which calculates the average hourly event count and its standard deviation per container:
index="apps" sourcetype="kube"
| bucket _time span=1h
| stats count as CountByHour by _time, kubernetes.container_name
| eventstats avg(CountByHour) as AvgByKCN stdev(CountByHour) as StDevByKCN by kubernetes.container_name
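
For reference, here is a rough sketch of how I picture the full logic, extending the search above. The 7-day baseline window and the 80% threshold are arbitrary placeholders, and I am not sure excluding the most recent hour from the baseline this way is the idiomatic approach (the ``` ``` inline comments need Splunk 8.0+):

index="apps" sourcetype="kube" earliest=-7d@h latest=@h
| bucket _time span=1h
| stats count as CountByHour by _time, kubernetes.container_name
``` flag the most recent complete hour; everything earlier is the baseline ```
| eval isCurrentHour = if(_time >= relative_time(now(), "-1h@h"), 1, 0)
| eval BaselineCount = if(isCurrentHour=0, CountByHour, null())
``` per-container peak and average over the baseline window only ```
| eventstats max(BaselineCount) as PeakByKCN, avg(BaselineCount) as AvgByKCN by kubernetes.container_name
``` alert threshold as a percentage of the historical peak (80% here, purely as an example) ```
| eval Threshold = PeakByKCN * 0.80
| where isCurrentHour=1 AND CountByHour > Threshold
| table _time, kubernetes.container_name, CountByHour, AvgByKCN, PeakByKCN, Threshold

The idea would be to schedule this hourly and alert when it returns any rows. Alternatively, since the existing search already has StDevByKCN, I could compare the latest hour against something like AvgByKCN + 3*StDevByKCN, which would catch spikes relative to normal variation instead of a fixed percentage of the peak. Is one of these approaches better, or is there a cleaner way to do this?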