Calculating a baseline

aenagy
Observer

Warning: Splunk noob question.

I have a base search:

source="Administrator_logs" name="An account failed to log on"

Using the approach from https://community.splunk.com/t5/Splunk-Search/Getting-Average-Number-of-Requests-Per-Hour/m-p/73506, I can calculate hourly averages:

source="Administrator_logs" name="An account failed to log on" | eval reqs = 1 | timechart span=1h per_hour(reqs) as AvgReqPerHour


What I would like to do is calculate a baseline. Having never done this before, my thought is to calculate the hourly average along with the standard deviation and/or some percentile (e.g. the 90th) across all events, as opposed to only the last day/week/month, although that would be interesting too.

Eventually, this baseline calculation will be the basis for an alert, e.g. create alert if hourly count is outside 1 stddev or 90th percentile.

Q1: How do I calculate the hourly average for all events?

Q2: How do I calculate the hourly standard deviation for all events?

Q3: How do I calculate the hourly 90th percentile for all events?


tscroggins
Champion

@aenagy 

The approach below assumes your hourly counts are approximately normally distributed. If they are not, you may need to transform the data before calculating statistics.
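
As a rough sketch of what such a transformation could look like: event counts are often right-skewed, and one common option is to take a log of the hourly count before computing the mean and standard deviation, then convert the threshold back to a count. Whether a log transform suits this particular data is an assumption, and the field names log_count, avg_log, sd_log and upper_count are made up for illustration.

```spl
source="Administrator_logs" name="An account failed to log on" earliest=-7d@h latest=@h
| timechart span=1h count
| eval log_count=ln(count + 1)
| stats avg(log_count) as avg_log stdev(log_count) as sd_log
| eval upper_count=exp(avg_log + sd_log) - 1
```

The +1 keeps hours with zero events defined under ln(), and the final eval undoes it so that upper_count is back on the original scale.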

The timechart count aggregation should be sufficient for counting by hour.

Following that, you can extract the hour from _time and use the stats command to calculate the average, standard deviation, and 90th percentile by hour.

Here's an example using random counts:

| makeresults count=10000
| eval _time=_time-_time%3600-604800*random()/2147483647 ```uniformly distributed over 7 days```
| timechart fixedrange=f span=1h count
| eval date_hour=strftime(_time, "%H")
| stats avg(count) as avg_count stdev(count) as sd_count p90(count) as p90_count by date_hour
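
If "for all events" is meant as a single baseline across every hour, rather than one baseline per hour of the day, a minimal variation on the same random data is to drop the date_hour grouping. This is only a sketch of that reading of the question:

```spl
| makeresults count=10000
| eval _time=_time-_time%3600-604800*random()/2147483647
| timechart fixedrange=f span=1h count
| stats avg(count) as avg_count stdev(count) as sd_count p90(count) as p90_count
```

This returns one row with the average, standard deviation, and 90th percentile of the hourly counts over the whole window.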

Using your source:

source="Administrator_logs" name="An account failed to log on" earliest=-7d@h latest=@h
| timechart span=1h count
| eval date_hour=strftime(_time, "%H")
| stats avg(count) as avg_count stdev(count) as sd_count p90(count) as p90_count by date_hour
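
For the eventual alert, one possible sketch is to attach the per-hour baseline to every row with eventstats, keep only the most recent complete hour, and return it only when it breaches a threshold. The thresholds (1 standard deviation, 90th percentile) come from the question. The rest is an assumption: in particular, the latest hour is included in its own baseline here, and 7 days gives only 7 samples per hour of day, so a longer window is probably worth considering.

```spl
source="Administrator_logs" name="An account failed to log on" earliest=-7d@h latest=@h
| timechart span=1h count
| eval date_hour=strftime(_time, "%H")
| eventstats avg(count) as avg_count stdev(count) as sd_count p90(count) as p90_count by date_hour
| tail 1
| where abs(count - avg_count) > sd_count OR count > p90_count
```

Saved as an hourly scheduled search, this could trigger when the number of results is greater than zero.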
