Hi Team,
I have set an alert for the query below:
index= "abc" "ebnc event did not balanced for filename" sourcetype=600000304_gg_abs_dev source!="/var/log/messages" | rex "-\s+(?<Exception>.*)" | table Exception source host sourcetype _time
And I got the result below:
I have set up the alert as below:
And I have set up incident creation for it with the SAHARA Forwarder, but I am getting only 1 incident even though the statistics showed 6 results.
6 incidents should have been created.
Also, incidents are arriving very late: if the event was triggered at 8:20, the incident arrives at 9:16.
Can someone guide me on this?
Your alert is searching a 15-minute window but running every 5 minutes, so at 8:20 it will search 8:05 to 8:20, at 8:25 it will search 8:10 to 8:25, and so on, so you will get duplicate alerts.
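As an illustration (times assumed): an event at 8:12 falls inside the windows searched at 8:15 (8:00 to 8:15), at 8:20 (8:05 to 8:20), and at 8:25 (8:10 to 8:25), so the same event can trigger three separate alerts.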
If your event time is 8:20 but it is not indexed until 9:16, you will never see that alert: when the search runs at 8:25 the data is not yet present, and when the search runs at 9:20 the event time of 8:20 is outside the search window.
If your events are arriving late, you need to check how those events are being forwarded.
You can look at event lag by adding
| eval index_time=strftime(_indextime, "%F %T.%Q")
before your table statement, and then adding the index_time field to your table statement, so you can see when each event was indexed.
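For example, applied to your search (a sketch; lag_seconds is just an illustrative field name, computed from the two epoch times):
index="abc" "ebnc event did not balanced for filename" sourcetype=600000304_gg_abs_dev source!="/var/log/messages"
| rex "-\s+(?<Exception>.*)"
| eval index_time=strftime(_indextime, "%F %T.%Q")
| eval lag_seconds=_indextime - _time
| table Exception source host sourcetype _time index_time lag_seconds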
If you KNOW you have lag and there is nothing you can do about it, then you may need to adjust the time window of the search to something like
earliest=-60m@m
latest=-55m@m
so that you are searching a 5 minute window 1 hour ago. The search window should generally match the frequency of the cron schedule.
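For example (a sketch, assuming the cron is */5 * * * *): the run at 10:00 would search 09:00 to 09:05 and the run at 10:05 would search 09:05 to 09:10, so consecutive runs cover contiguous, non-overlapping 5-minute slices.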
I have added what you suggested to my search query, as below:
index= "abc" "ebnc event did not balanced for filename" sourcetype=600000304_gg_abs_dev source!="/var/log/messages" | rex "-\s+(?<Exception>.*)" |eval index_time=strftime(_indextime, "%F %T.%Q")| table Exception source host sourcetype _time index_time
I am getting the result below:
@bowesmana could you please guide me on what I should change in my alert settings?
Should I change it from Last 15 minutes to something else, and should I also change this cron schedule?
*/5 * * * *
I want the incident to be created as soon as the event occurs in Splunk.
How often do you want the alert to run? When you decide that, change the cron schedule accordingly.
The time window should really be the same as the frequency - only you can decide what that is.
When making these searches, it is normal to search for a window that is a little bit in the past, e.g. as I suggested in my previous post.
If your frequency is 5 minutes, then your time window would be something like
earliest=-7m@m
latest=-2m@m
so you search a 5 minute window from -7 to -2 minutes ago.
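For example, a sketch of a complete configuration (your search unchanged, only the schedule and window set):
Cron schedule: */5 * * * *
earliest=-7m@m
latest=-2m@m
The run at 12:30 then searches 12:23 to 12:28, the run at 12:35 searches 12:28 to 12:33, and so on: each event time falls into exactly one window, with 2 minutes of allowance for ingest lag.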
It's not clear from your original post what you mean by the incident coming in at 9:16 when the event is 8:20
I have set my window as below:
I have set the time range to Last 15 minutes and the cron schedule to */15 * * * *
I am still not getting emails and incidents on time.
If an event is generated at 12:30 pm IST, I want the alert to trigger at that time itself, via email or via incident.
@bowesmana can you please help me with what time range and cron schedule I should set to get alerts on time?
If you run the alert manually does it find any data?
If you want to find an event that is generated at 12:30, then that event will probably not be picked up until 12:45 when the alert runs on your cron schedule.
Your time range is set to Last 15 minutes, so depending on exactly WHEN your alert runs, you may miss events: if the event occurs at 12:30 and is indexed by Splunk at 12:30:04, but your search ran at 12:30:02, then it will not find it. The next search, which might run at 12:45:06, will also not find it, as it only searches between 12:30:06 and 12:45:06.
So please set your search to run with exact time specifiers, with "snap to time" using @m.
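For example (a sketch matching a 15-minute cron; the exact offsets are up to you):
earliest=-17m@m
latest=-2m@m
Each run then searches a snapped 15-minute window ending 2 minutes in the past, so a few seconds of scheduling jitter no longer creates gaps between windows.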
Can you please guide me on what changes I should make?
Do I need to change the cron schedule expression, or do I need to make changes in my queries?
Please guide.
See my earlier message that suggested you change the earliest and latest times to reflect the cron interval, allowing for a short time in the past to make sure the events have arrived.
Can you please guide me on it?
This is my current query:
index= "abc" "pace api iaCode - YYY no valid pace arrangementId as response!!!" OR "pace api iaCode - ZZZ no valid pace arrangementId as response!!!" source!="/var/log/messages" sourcetype=600000304_gg_abs_ipc1| rex "-\s+(?<Exception>.*)" | table Exception source host sourcetype _time
And my cron schedule is shown below:
@bowesmana can you please suggest what changes I need to make in my query and cron schedule to get incidents and emails on time?
I am not sure what else to suggest - for some reason you have gone back to a 5-minute cron schedule with a 15-minute time range, which is something I earlier suggested you change.
I also suggested using a specific earliest/latest time window, which you do not appear to be doing.
It is also not clear what you meant in your original post about incidents coming in at 9:16 with events at 8:20
Unless you are able to give details about events/times and specifics of the problem, it is impossible for anyone to offer concrete advice that will help you.
You would need to provide an example showing the event time, the _indextime, and the time you actually received the alert.
Hi @bowesmana
This is the event that occurred. The time the event occurred is 2023-11-30 00:00:33.789.
I have set the cron expression to run every 15 minutes, and the time range is Last 15 minutes.
I am getting incidents and alerts 30 minutes, or sometimes 45 minutes, after the event is triggered in Splunk.
This is my query:
index= "abc" "pace api iaCode - YYY no valid pace arrangementId as response!!!" OR "pace api iaCode - ZZZ no valid pace arrangementId as response!!!" source!="/var/log/messages" sourcetype=600000304_gg_abs_ipc2| rex "-\s+(?<Exception>.*)" | table Exception source host sourcetype _time
OK, please do the following
1. For that specific event, run your search for that time range and show what the _indextime of your event is
index= "abc" "pace api iaCode - YYY no valid pace arrangementId as response!!!" OR "pace api iaCode - ZZZ no valid pace arrangementId as response!!!" source!="/var/log/messages" sourcetype=600000304_gg_abs_ipc2
| eval index_time=strftime(_indextime, "%F %T.%Q")
| table _time index_time _raw
2. Then run this search for the time range 00:00 to 00:20 on that day
index=_internal YOUR_ALERT_NAME sourcetype=scheduler
and you should see details of the scheduler running your alert (see the sketch after these steps)
3. HOW are you getting your alert? Is it being sent by email? If so, what is the SENT time of the email?
Then from (1) you will see, from the index time for that event, when the data became VISIBLE in Splunk. That will show you whether the event was present in Splunk when the alert ran at 00:15.
From (2) you will see the result count of the alert that runs
From (3) you can see when the alert was actually sent from Splunk.
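For step (2), a sketch of narrowing the scheduler output down (savedsearch_name, scheduled_time, run_time, result_count and status are standard scheduler.log fields, but verify them in your environment):
index=_internal sourcetype=scheduler savedsearch_name="YOUR_ALERT_NAME"
| table _time scheduled_time run_time result_count status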
I have suggested twice before that you change the time range of your search to look a little into the past to account for ingest lag. Please can you ensure you are doing that, and set the search time range to
earliest=-16m@m
latest=-1m@m
in your alert time picker. That will allow for 1 minute of lag between event creation and index time.
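As a worked example (times are illustrative): an event with time 08:29:58 that is indexed at 08:30:05 is missed by a Last 15 minutes search running at 08:30:02 (not yet indexed) and by the next run at 08:45:03 (the event time is now outside that window). With earliest=-16m@m latest=-1m@m on a */15 * * * * cron, the 08:45 run searches event times 08:29:00 to 08:44:00 and finds it.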