I have Splunk logs with data that is roughly like this:
Timestamp | adapterName | responseCode |
xx | A | 1 |
xx | A | 2 |
xx | B | 1 |
xx | B | 2 |
For each combination of (adapterName, responseCode), I want to:
1. determine its average count for the last 24 hours (excluding the most recent 1 hour)
2. determine its count for the most recent hour
3. create alerts if its last hour count exceeds 1 standard deviation from the average counts of the last 24 hours
There are other similar questions, and I have referenced them (e.g. https://community.splunk.com/t5/Alerting/Alert-when-sample-is-2-standard-deviations-from-moving-aver...). However, the difference is that I have multiple (adapterName, responseCode) combinations, so the query is not as straightforward and I cannot use `time chart` etc easily.
Here's my own attempt so far:
index=itms "searchTerm" earliest=-25h@h latest=@h
| rex "\"adapterName[\":]*(?<adapter>[A-Z]+)\""
| rex "\"name[\":]*(?<responseCode>[\w]+)\""
| eval StartTime=relative_time(now(), "-1h@h")
| eval series=if(_time>=StartTime, "lastHour", "last24Hours")
| bin _time span=1h
| stats count As hourlyCount BY adapter, responseCode, series, _time
| stats avg(hourlyCount) AS averageCount stdev(hourlyCount) as standardDev by adapter, responseCode, series
And I am very stuck with a table like this:
adapterName | responseCode | series | averageCount | standardDev |
A | 1 | lastHour | ... | 0 |
A | 1 | last24Hours | ... | ... |
A | 2 | lastHour | ... | 0 |
A | 2 | last24Hours | ... | ... |
B | 1 | lastHour | ... | 0 |
B | 1 | last24Hours | ... | ... |
B | 2 | lastHour | ... | 0 |
B | 2 | last24Hours | ... | ... |
And I have no idea how to proceed. I think the query also made the table more complicated, but I am not sure. Can someone give some advice here?
index=itms "searchTerm" earliest=-25h@h latest=@h
| rex "\"adapterName[\":]*(?<adapter>[A-Z]+)\""
| rex "\"name[\":]*(?<responseCode>[\w]+)\""
| eval StartTime=relative_time(now(), "-1h@h")
| eval series=if(_time>=StartTime, "lastHour", "last24Hours")
| bin _time span=1h
| stats count As hourlyCount BY adapter, responseCode, series, _time
| stats avg(hourlyCount) AS averageCount stdev(hourlyCount) as standardDev by adapter, responseCode, series
``` as you had it ```
``` keep 24hour std ```
| eval standardDev=if(series="last24Hours", standardDev, null())
``` save 24hour avg ```
| eval average24Count=if(series="last24Hours", averageCount, null())
``` lose original 24hour avg ```
| eval averageCount=if(series="last24Hours", null(), averageCount)
``` gather values by adapter and response code (there should only be one value for each) ```
| stats values(standardDev) as standardDev values(averageCount) as averageCount values(average24Count) as average24Count by adapter, responseCode
``` calculate if averageCout (last hour) is more than 1 std from 24h avg ```
| eval alert=if(averageCount < average24Count - standardDev OR averageCount > average24Count + standardDev, 1, 0)
| where alert=1
Thanks so much! Appreciate the step-by-step comments as well
index=itms "searchTerm" earliest=-25h@h latest=@h
| rex "\"adapterName[\":]*(?<adapter>[A-Z]+)\""
| rex "\"name[\":]*(?<responseCode>[\w]+)\""
| eval StartTime=relative_time(now(), "-1h@h")
| eval series=if(_time>=StartTime, "lastHour", "last24Hours")
| bin _time span=1h
| stats count As hourlyCount BY adapter, responseCode, series, _time
| stats avg(hourlyCount) AS averageCount stdev(hourlyCount) as standardDev by adapter, responseCode, series
``` as you had it ```
``` keep 24hour std ```
| eval standardDev=if(series="last24Hours", standardDev, null())
``` save 24hour avg ```
| eval average24Count=if(series="last24Hours", averageCount, null())
``` lose original 24hour avg ```
| eval averageCount=if(series="last24Hours", null(), averageCount)
``` gather values by adapter and response code (there should only be one value for each) ```
| stats values(standardDev) as standardDev values(averageCount) as averageCount values(average24Count) as average24Count by adapter, responseCode
``` calculate if averageCout (last hour) is more than 1 std from 24h avg ```
| eval alert=if(averageCount < average24Count - standardDev OR averageCount > average24Count + standardDev, 1, 0)
| where alert=1