Hi,
I successfully created an SPL that does what I need for a single host but I cannot get it to work for all hosts.
This works
index=<my_index> host=<specific_host> sourcetype=<my_sourcetype> instance=_Total counter="% Processor Time"
| sort host, -_time
| dedup 2 host
| lookup <my_lookup> resource_name as host output businessprocess_name
| search businessprocess_name = "<my_business_process_name>"
| eval Value = round(Value,2)
| delta Value AS ValueDelta
| eval lowerThreshold = -25
| eval upperThreshold = 25
| eval CreateEvent = if((ValueDelta > upperThreshold OR ValueDelta < lowerThreshold),"Yes","No")
| search CreateEvent = "Yes"
| eval metric_type = "CPU Usage Anomaly"
| eval description = if(ValueDelta < 0,"CPU Usage is now: " + Value + "%. A decrease of " + ValueDelta,"CPU Usage is now: " + Value + "%. An increase of " + ValueDelta)
| table _time, host, businessprocess_name, metric_type, description
The output of that SPL is as follows (I changed the lower and upper thresholds to trigger a result):
_time | host | businessprocess_name | metric_type | description |
2021-05-05 12:35:57 | <specific_host> | <my_business_process_name> | CPU Usage Anomaly | CPU Usage is now: 57.52951309736281%. A decrease of -3.69007662538445 |
I know the sort on host does not make sense in this SPL, but it nicely takes the last two values and compares them, and based on the difference the result is what it needs to be.
When I remove the host=<specific_host> and run it on all the hosts in the system the output is wrong.
It seems that it compares the value of row 1 (server A) with the value of row 2 (server A), then the value of row 2 (again server A) with the value of row 3 (server B), and so on. I guess that makes sense, but it is not what I am looking for.
What would be needed to run the calculation of the delta for only the two records that belong to the same host?
What I am aiming to do is create an event when the difference in CPU usage between the last two values is more than the configured threshold, whether it drops or increases. Maybe I am going about it the wrong way with the delta command?
For a similar approach, you could look to use autoregress in place of delta - a few more lines, but it may get you what you want. This will capture the host and value of the previous event (as in top-to-bottom). So then just keep events where the host of the previous event matches the current event, grab your delta and move on from there.
index=<my_index> host=<specific_host> sourcetype=<my_sourcetype> instance=_Total counter="% Processor Time"
| sort host, -_time
| dedup 2 host
| lookup <my_lookup> resource_name as host output businessprocess_name
| search businessprocess_name = "<my_business_process_name>"
| eval Value = round(Value,2)
| autoregress host AS prev_host p=1
| autoregress Value AS prev_value p=1
| where host=prev_host
| eval DeltaValue = prev_value - Value
.
.
.
You may also be able to use stats list(Value) by host, then use mvindex to grab the first two values for each host (so no dedup'ing), and I'm sure there are other ways as well.
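One of those other ways, as a rough sketch only and not tested against your data, is streamstats with a by clause, which carries the previous Value forward per host so rows from different hosts are never compared. The field name prev_value, the threshold of 25, and the "sort 0" (to lift the default 10,000-row sort limit) are my assumptions; the placeholders are the same as in your search.

```
index=<my_index> sourcetype=<my_sourcetype> instance=_Total counter="% Processor Time"
| sort 0 host, -_time
| dedup 2 host
``` previous row's Value within the same host only (global=f keeps one window per host) ```
| streamstats current=f window=1 global=f last(Value) as prev_value by host
``` the newest row of each host has no predecessor, so drop it ```
| where isnotnull(prev_value)
``` rows are newest first, so prev_value is the newer sample and Value the older one ```
| eval DeltaValue = round(prev_value - Value, 2)
| where abs(DeltaValue) > 25
```

Note that the row that survives is the older of the two samples, so _time on it is the time of the older reading, and the triple-backtick comments need a reasonably recent Splunk version; remove them if your version rejects them.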
I accepted your post as the solution, as it pointed me in a direction that got me there. 😉
Thank you for your response. It helped me along nicely.
Did not get where I needed to be with autoregress but your list(Value) suggestion got me further.
Got the following SPL now and it seems to be doing what I am looking for.
index=<my_index> host=<my_server> sourcetype=<my_sourcetype> instance=_Total counter="% Processor Time"
| stats list(Value) as Value by host
| eval mValues = mvindex(Value,0,1)
| eval value1 = mvindex(mValues,0), value2 = mvindex(mValues,1)
| eval myDelta = value1 - value2
| eval createEvent = if(myDelta >2 OR myDelta < -2,"Create Event","Nothing to worry")
| table host, mValues, value1, value2, myDelta, createEvent
In the above SPL the threshold to compare against does not make sense and the table has more information than needed, but for now it is only there to produce a result.
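For completeness, here is a sketch of how I expect the final version to look once the host filter is removed and the lookup and description from my first search are added back. It is untested as written. It assumes the events reach stats newest first (the default search order), since list() keeps values in the order of the input events, and that a threshold of 25 is what I end up using. Also, list() only keeps a limited number of values per host (100 by default, as far as I know), which is fine here because only the first two are used.

```
index=<my_index> sourcetype=<my_sourcetype> instance=_Total counter="% Processor Time"
| stats list(Value) as Value by host
``` first entry = newest sample, second entry = the one before it ```
| eval value1 = tonumber(mvindex(Value,0)), value2 = tonumber(mvindex(Value,1))
| eval myDelta = round(value1 - value2, 2)
``` hosts with only one sample have a null myDelta and are dropped here ```
| where abs(myDelta) > 25
| lookup <my_lookup> resource_name as host output businessprocess_name
| search businessprocess_name = "<my_business_process_name>"
| eval metric_type = "CPU Usage Anomaly"
| eval description = "CPU Usage is now: " . round(value1,2) . "%. " . if(myDelta < 0, "A decrease of ", "An increase of ") . abs(myDelta)
| table host, businessprocess_name, metric_type, description
```

Because stats collapses the events, _time is no longer available in the result; adding latest(_time) to the stats command would be one way to bring it back.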