This is my first post here, but I have learned a lot from this website just through Google searches.
At work, the server admins asked whether I could "silence" Splunk email alerts while they do maintenance, so that they do not get error emails during that time. I was able to do this by creating a maintenance.log in the /var/log/ folder that Splunk monitors.
If the admins write:
"start of maintenance..." then any alert that monitors this log stops sending emails.
When the admins write:
"end of maintenance..."
then Splunk knows it can resume sending emails because the maintenance period is over.
This was useful for silencing Apache access log alerts during maintenance: the admins did not get alerts for anything the Apache access log recorded between the _time of "start of maintenance..." and the _time of "end of maintenance...".
Now I have to show search results in a dashboard that exclude anything reported during a maintenance period.
This means that any results between the _time of "start of maintenance..." and the _time of "end of maintenance..." should not be included.
Moreover, maintenance may happen several times within the searched range. For example, maintenance might be done twice in one day, or a user might search a period of, say, 1 month that contains 3 "start of maintenance..." entries and 3 corresponding "end of maintenance..." entries.
I have written SPL that returns all the results:
(host="Server-web" source="/var/log/httpd24/error_log") OR
(host="Server-Web" index=bizapps source=/var/log/bizapps_maintenance.log)
I am not sure whether SPL can pull this off, but I am confident someone can help me out.
If you need more info, let me know.
| rex "(?<maint_start>start of maintenance)"
| rex "(?<maint_end>end of maintenance)"
| stats list(eval(if(isnotnull(maint_start), _time, null()))) as maint_starts list(eval(if(isnotnull(maint_end), _time, null()))) as maint_ends by host
| eval maint_period=mvzip(maint_starts,maint_ends,",")
| mvexpand maint_period
| rex field=maint_period "(?<maint_start>[\d.]+)\s*,\s*(?<maint_end>[\d.]+)"
| eval duration=maint_end-maint_start, _time=maint_start
| fillnull value="MAINTENANCE ONGOING" duration maint_end
Maybe something like this to get the maintenance timeframes and durations? And then use those results with an append or join to filter out the alert timeframes.
You could also write the maintenance periods into a lookup file and then use that to filter out the timeframes that fall into maintenance windows.
How would I be able to:
"use those results with an append or join to filter out the alert timeframes"?
This is mainly what I want: to iterate through the start and end _time values (similar to what you suggested). But I do not know how to filter by these results once I have them.
For example, if I do not know how many "start of maintenance" and "end of maintenance" entries are in the search results, how can I use append or join?
I cannot know in advance, because the query will be in a dashboard and the user might select a time frame that contains several starts/ends of maintenance, or none at all.
Any help is appreciated, as I am racking my brain over this.
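One way to avoid append and join altogether, so the number of maintenance windows does not matter, is to sort the combined events by time and carry the most recent maintenance marker forward onto every later event. What follows is a rough sketch, not a tested solution. The field names marker and maint_state are made up for illustration. It also assumes that both sources belong to the same host, that the marker text is lowercase exactly as written above (like() is case-sensitive), and that each "start of maintenance" entry falls inside the selected time range.

```
(host="Server-web" source="/var/log/httpd24/error_log") OR
(host="Server-Web" index=bizapps source=/var/log/bizapps_maintenance.log)
| sort 0 _time
| eval marker=case(like(_raw, "%start of maintenance%"), "start", like(_raw, "%end of maintenance%"), "end")
| eval maint_state=marker
| filldown maint_state
| where isnull(marker) AND (isnull(maint_state) OR maint_state="end")
| fields - marker maint_state
```

After the sort, filldown copies the last seen "start" or "end" value onto each following event, so the where clause keeps only non-marker events whose most recent marker is "end" or that have no earlier marker at all. If the selected range contains no maintenance entries, every event passes through. If the range begins in the middle of a maintenance window, the events before the first "end" are not filtered, so the dashboard search may need to look back further than the displayed range.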