I'm trying to optimize my alerts because I'm running into a problem. Where I work, it takes a while (1 to 3 days) to resolve the underlying issue once an alert is triggered, so the alert keeps triggering for that whole period. I can't use Throttle because my alerts do not depend on a single host or event. For example:
index=os_pci_windowsatom host IN (HostP1 HostP2 HostP3 HostP4) source=cnt_mx_pci_sql_*_status_db
|dedup 1 host state_desc
| streamstats values(state_desc) as State by host
| eval Estado=case(
State!="ONLINE", "Critico",
State="ONLINE", "Safe"
)
| table Estado host State _time
| where Estado="Critico"
When the status of a host changes to critical, the alert triggers. I cannot use Throttle because, while the alert is silenced, one of the other hosts may go critical, and that alert would be suppressed entirely.
My idea is to build logic that compares the results of the last triggered alert with the current results. If the host and status are the same, nothing should happen. If the host and status differ from the previous trigger, the alert should fire. I thought about using the stored results of previous alerts, but I don't know how to search for this information. Does anyone have an idea?
Any comment is greatly appreciated.
Let me first try to understand the problem: You want to find servers whose end state is offline, but whose immediate previous reported state is not offline, i.e., those whose state newly becomes offline. Is this correct? In other words, given these mock events
_time | host | state_desc |
2024-12-20 18:00 | host1 | not online |
2024-12-20 16:00 | host2 | not online |
2024-12-20 14:00 | host3 | ONLINE |
2024-12-20 12:00 | host4 | not online |
2024-12-20 10:00 | host0 | not online |
2024-12-20 08:00 | host1 | ONLINE |
2024-12-20 06:00 | host2 | not online |
2024-12-20 04:00 | host3 | not online |
2024-12-20 02:00 | host4 | ONLINE |
2024-12-20 00:00 | host0 | not online |
2024-12-19 22:00 | host1 | not online |
2024-12-19 20:00 | host2 | ONLINE |
2024-12-19 18:00 | host3 | not online |
2024-12-19 16:00 | host4 | not online |
2024-12-19 14:00 | host0 | ONLINE |
2024-12-19 12:00 | host1 | not online |
2024-12-19 10:00 | host2 | not online |
2024-12-19 08:00 | host3 | ONLINE |
2024-12-19 06:00 | host4 | not online |
2024-12-19 04:00 | host0 | not online |
2024-12-19 02:00 | host1 | ONLINE |
2024-12-19 00:00 | host2 | not online |
2024-12-18 22:00 | host3 | not online |
2024-12-18 20:00 | host4 | ONLINE |
2024-12-18 18:00 | host0 | not online |
2024-12-18 16:00 | host1 | not online |
2024-12-18 14:00 | host2 | ONLINE |
2024-12-18 12:00 | host3 | not online |
2024-12-18 10:00 | host4 | not online |
2024-12-18 08:00 | host0 | ONLINE |
2024-12-18 06:00 | host1 | not online |
2024-12-18 04:00 | host2 | not online |
2024-12-18 02:00 | host3 | ONLINE |
2024-12-18 00:00 | host4 | not online |
2024-12-17 22:00 | host0 | not online |
You want to alert on host1 and host4 only.
To do this with streamstats, you will need to sort events one way and then another, and I usually count sorts as a cost. (I am also quite fuzzy on streamstats. :-) So I consider this one of the few good uses of transaction. Something like
index=os_pci_windowsatom host IN (HostP1 HostP2 HostP3 HostP4) source=cnt_mx_pci_sql_*_status_db
| transaction host endswith=state_desc=ONLINE keepevicted=true
| search eventcount = 1 state_desc != ONLINE
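For comparison, here is a rough sketch of the streamstats route mentioned above. It has not been tested against real data. It assumes state_desc is exactly "ONLINE" for healthy hosts (where is case-sensitive), and prev_state and last_time are field names made up for this illustration.

```
index=os_pci_windowsatom host IN (HostP1 HostP2 HostP3 HostP4) source=cnt_mx_pci_sql_*_status_db
| sort 0 host _time
| streamstats current=f window=1 last(state_desc) as prev_state by host
| eventstats max(_time) as last_time by host
| where _time == last_time AND state_desc != "ONLINE" AND prev_state == "ONLINE"
| table _time host prev_state state_desc
```

The sort puts each host's events in chronological order, so streamstats with current=f window=1 copies the previous report's state into prev_state. The where clause then keeps only each host's latest event, and only if it flipped from ONLINE to something else. One difference from the transaction version: a host with a single non-ONLINE event and no earlier report in the search window has no prev_state, so it is dropped here.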
Here is an emulation of the mock data for you to play with and compare with real data.
| makeresults count=35
| streamstats count as state_desc
| eval _time = relative_time(_time - state_desc * 7200, "-0h@h")
| eval host = "host" . state_desc % 5, state_desc = if(state_desc % 3 > 0, "not online", "ONLINE")
``` the above emulates
index=os_pci_windowsatom host IN (HostP1 HostP2 HostP3 HostP4) source=cnt_mx_pci_sql_*_status_db
```
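To try it end to end, the emulation can be piped straight into the transaction search in place of the index search. This is just the two snippets above joined together. Timestamps are computed relative to when you run it, so they will not match the dates in the mock table.

```
| makeresults count=35
| streamstats count as state_desc
| eval _time = relative_time(_time - state_desc * 7200, "-0h@h")
| eval host = "host" . state_desc % 5, state_desc = if(state_desc % 3 > 0, "not online", "ONLINE")
| transaction host endswith=state_desc=ONLINE keepevicted=true
| search eventcount = 1 state_desc != ONLINE
```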
Output from the search is
_time | closed_txn | duration | eventcount | field_match_sum | host | linecount | state_desc |
2024-12-20 18:00 | 0 | 0 | 1 | 1 | host1 | 1 | not online |
2024-12-20 12:00 | 0 | 0 | 1 | 1 | host4 | 1 | not online |
The rest of your search is simply manipulation of the display strings.
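As a sketch of that last step, assuming you want the same Estado, host, State, _time columns as in your original search, you could append something like the following. Every row that survives the filter is already non-ONLINE, so Estado can be set directly instead of using case().

```
index=os_pci_windowsatom host IN (HostP1 HostP2 HostP3 HostP4) source=cnt_mx_pci_sql_*_status_db
| transaction host endswith=state_desc=ONLINE keepevicted=true
| search eventcount = 1 state_desc != ONLINE
| rename state_desc as State
| eval Estado = "Critico"
| table Estado host State _time
```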