Hi @Karthikeya,

False positive, false negative, etc. have the same definitions in Splunk that they have in statistics. I'm in the United States, and I find the NIST/SEMATECH e-Handbook of Statistical Methods, Chapter 6, "Process or Product Monitoring and Control," a useful day-to-day reference: https://www.itl.nist.gov/div898/handbook/index.htm.

In your example, you're counting events. For example, a basic search scheduled to run every minute:

index=web status=400 earliest=-15m@m latest=@m | stats count | where count>5

counts the status=400 events over the prior 15 minutes and returns a result only when that count exceeds 5.

In this context, false positives and false negatives can relate to the time the events were generated and the delay between that time and the index time. If a status=400 event occurred at 00:14:59 but was indexed by Splunk at 00:15:04, then a search that executes at 00:15:01 for the interval [00:00:00, 00:15:00) would not count the event because Splunk has not yet indexed it. This is a false negative.

You can reduce the probability of false negatives by adding a backoff to your search (1 minute in this example):

index=web status=400 earliest=-16m@m latest=-1m@m | stats count | where count>5

However, that will not eliminate all false negatives: there is still a non-zero probability that an event will be indexed more than 1 minute after it occurred, after the search covering its time range has already run.

False positives are more typically associated with measuring against a model. Let's say you've modeled your application's behavior and determined that more than 5 status=400 events over a 15-minute interval likely indicates a client-side code deployment issue as opposed to "normal" client behavior. "More than 5" is associated with a control limit, for example a deviation from a mean; however, the number of status=400 events is a random variable. A bad client-side code deployment may trigger only 4 status=400 events, which is a false negative, and a good client-side deployment may trigger 6 status=400 events, which is a false positive.
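To put rough numbers on that last point, here is a minimal Python sketch. It assumes the 15-minute status=400 count follows a Poisson distribution, which is my assumption and not something your data guarantees. The two rates (NORMAL_RATE and BAD_RATE) are made up for illustration; you would replace them with values estimated from your own history.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


# Alert when count > 5, matching "where count>5" in the search.
THRESHOLD = 5

# Hypothetical mean counts per 15-minute window (assumptions, not measured).
NORMAL_RATE = 2.0  # "normal" client behavior
BAD_RATE = 9.0     # bad client-side deployment

# False positive: behavior is normal, but the count still exceeds the threshold.
false_positive_rate = 1.0 - poisson_cdf(THRESHOLD, NORMAL_RATE)

# False negative: the deployment is bad, but the count stays at or below the threshold.
false_negative_rate = poisson_cdf(THRESHOLD, BAD_RATE)

print(f"P(false positive) = {false_positive_rate:.4f}")
print(f"P(false negative) = {false_negative_rate:.4f}")
```

Raising the threshold lowers the false positive rate and raises the false negative rate, and vice versa, so the choice of "more than 5" is a trade-off between the two.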
Several Splunk value-added products, like Splunk Enterprise Security and Splunk IT Service Intelligence, provide ready-to-run modeling and monitoring solutions. In general, though, you would model your application's behavior yourself, either with traditional methods outside Splunk or inside Splunk with statistical functions or an add-on like the Splunk Machine Learning Toolkit. You would then apply your model using custom Splunk searches.
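As one example of a "traditional method outside Splunk," here is a small Python sketch that derives an upper control limit from historical counts using the c-chart formula for count data (mean plus 3 times the square root of the mean), the kind of chart covered in the NIST handbook chapter mentioned above. The choice of a c-chart and the sample history are my assumptions for illustration, not a recommendation for your data.

```python
import math
from statistics import mean


def c_chart_upper_limit(counts, sigmas=3.0):
    """Upper control limit for a c-chart: c_bar + sigmas * sqrt(c_bar)."""
    c_bar = mean(counts)
    return c_bar + sigmas * math.sqrt(c_bar)


def is_out_of_control(count, upper_limit):
    """True when a window's count exceeds the upper control limit."""
    return count > upper_limit


# Made-up status=400 counts from ten past 15-minute windows of "normal" behavior.
history = [2, 1, 3, 0, 2, 4, 1, 2, 3, 2]

ucl = c_chart_upper_limit(history)
print(f"upper control limit = {ucl:.2f}")
print("alert on 7 events:", is_out_of_control(7, ucl))
```

The resulting limit would then go into the where clause of your scheduled search in place of the hard-coded 5.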