Splunk Search
Highlighted

Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

I am working with a set of transactions data where in each transaction could relate to any of our numerous systems/products.

The data could be simplified for the sake of this question to contain fields similar to the below:

Event 1:

TransationID: 1000
Product: System1
ResponseTime: 1234

Event 2:

TransactionID: 1001
Product: System2
ResponseTime: 4321

Event 3:

TransactionID: 1002
Product: System3
ResponseTime: 5678

Etc.

So I am interested in setting up an alert which works in the following manner:

The search query will return the average response time for each product for yesterday, and compare that to the average response time for each product today and for any products where there has been an X% increase - send an email alert - If possible, the alert will advise the product in question which has experienced the fluctuation in response time.

I feel like perhaps a custom trigger condition may be suitable for this instance?

The search query below will successfully provide me with the average response times per product for yesterday:

source="transactionLog" type="report" earliest=-1d@d latest=@d | stats avg(responsetime) by product

I saw a slightly similar question on Splunk Answers in which the person was attempting to use a subsearch in order to generate his alert query - however my situation differs as I am breaking down by products, so I am not sure that I can use a Number of Results trigger condition.

UPDATE:

I have put together the below query - I feel this is suitable as it actually compares the responsetime - I am having trouble understanding the logic from some of the comments in the answers below.

source="transactionLog" type="report" earliest=-1w@w latest=@w | 
stats avg(responsetime) as Previous_Week_Response | 
appendcols [search source="transactionLog" type="report" earliest=@w latest=now | 
stats avg(responsetime) as Current_Week_Response] | 
eval Week=if(_time>=relative_time(now(),"@w"),"Current_Week","Previous_Week") |
chart avg(responsetime) over product by Week |
where Current_Week_Response>=2*Previous_Week_Response
0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

SplunkTrust
SplunkTrust

I think you can achieve this alert for each product via single search only. Try something like this (example of checking 50% increase over yesterday)

source="transactionLog" type="report" earliest=-1d@d latest=now | eval Day=if(_time>=relative_time(now(),"@d"),"Today","Yesterday") 
| chart avg(responsetime) over product by Day | where Today>1.5*Yesterday

Above will list all the products where avg response time of today is 1.5 (50% ) time more than yesterday. You can then setup alert as , Alert options, "Once per search" (one alert for all products triggering the alert condition) OR "once per result" for one email for each product triggering the alert condition.

View solution in original post

Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

@somesoni2

If I wanted to edit this to compare this week's average vs last weeks - could I so by changing to the following:

earliest=-1w@w latest=now | eval Week=if(_time>=relative_time(now(),"@w"),"This week","Last week")
0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

SplunkTrust
SplunkTrust

Absolutely. I would avoid using spaces in the labels though so that comparison later is easier. Elae you need to keep them within single quotes in where clause.

0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

@somesoni2

I am just curious on how I can set the alert trigger conditions for this alert to function correctly.

The problem is, my search is returning over 75,000 events, and these are not restricted to the events which fit the where clause - it is simply returning all events in that time period.

However, the chart command is correctly displaying the products that fit the where clause.

Do I need to set a CUSTOM trigger as below;

Trigger condition: Custom
Custom condition: CurrentWeek>=2*PreviousWeek

0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

SplunkTrust
SplunkTrust

Use the whole search as your alert search

source="transactionLog" type="report" earliest=-1d@d latest=now | eval Day=if(_time>=relative_time(now(),"@d"),"Today","Yesterday") 
 | chart avg(responsetime) over product by Day | where Today>1.5*Yesterday

In alert condition, use "if number of events greater than 0", as above search will only return results when there are products satisfying where clause.

Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

@somesoni2

I have a concern about the query you have provided above.

It appears that the where clause is actually comparing Today being 1.5x Yesterday - however, Today and Yesterday have no relation to the responsetime - they are only relating to the _time field???

Is this correct?

0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

SplunkTrust
SplunkTrust

After the chart command 'Today' is the column which stores the avg response time for today and 'Yesterday' is the avg response time cor yesterday. Using this you're getting avg response time for both period in single search hence its the optimum way. The same can be achieved by running two searches and joining them using appendcols or similar.

0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

@somesoni2

I dont understand how "Today" or "Yesterday" can contain response time data as the eval for those only refer's to the _time field:

eval Day=if(_time>=relative_time(now(),"@d"),"Today","Yesterday")
0 Karma
Highlighted

Re: Alert search query to monitor for fluctuation in system performance times between today and yesterday

Communicator

@somesoni2

Should it not say something like:

eval Day=responsetime if(_time>=relative_time(now(),"@d"),"Today","Yesterday")
0 Karma
Speak Up for Splunk Careers!

We want to better understand the impact Splunk experience and expertise has has on individuals' careers, and help highlight the growing demand for Splunk skills.