Alerting

How do I define a token from an event field to display in the subject line of an alert email?

SplunkTrust

I have 5 basic SOAP web services that get logged by Splunk, each with a different name (example: 'DeliveryScheduleRequest'). I did a field extraction for the web service names, which works. I then set up an alert that sends an email whenever one of these web services has a response time longer than 5 seconds.

Now I would like that extracted field to appear in the subject line of the email whenever the alert fires. How would I do that?

1 Solution

SplunkTrust

I was finally able to figure it out. I'll post my findings here for others to find in the future.

I had an extracted field called 'TestCall' that represented the web service calls in my events. The token wasn't working because 'TestCall' did not appear anywhere in my search query; it was only defined as an extracted field. Below is my query, which works as expected:

index=uvtrans | transaction GUID startswith="*request" endswith="*response" | where duration>5 AND isnotnull(TestCall)

SplunkTrust

If your search results have a field named web_service, you can use its value in email alerts with the token $result.web_service$. See http://docs.splunk.com/Documentation/Splunk/6.2.2/Alert/Setupalertactions#Use_tokens_in_email_notifi... for reference. Note that this token may not be available if you're running an older Splunk version.
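
To make the token behavior concrete, here is a minimal Python sketch of the substitution as I understand it from the linked documentation: the token is filled from the first result row only. This is an illustration, not Splunk's actual implementation; render_subject and the sample rows are hypothetical.

```python
import re

# Matches tokens of the form $result.<fieldname>$
TOKEN = re.compile(r"\$result\.([A-Za-z_][A-Za-z0-9_]*)\$")

def render_subject(template, results):
    """Fill $result.<field>$ tokens from the first result row only.

    A field that is missing from that row is replaced with an empty string.
    """
    first = results[0] if results else {}
    return TOKEN.sub(lambda m: str(first.get(m.group(1), "")), template)

template = "Splunk Alert: $result.TestCall$ Resp > 5 Sec"

# Hypothetical result rows: one carries the field, the other does not.
with_field = [{"TestCall": "DeliveryScheduleRequest", "duration": 7.2}]
without_field = [{"duration": 7.2}]

print(render_subject(template, with_field))
print(render_subject(template, without_field))
```

In this sketch, a first row without the field leaves a blank where the service name should be, which is one way an otherwise correct subject template can end up looking empty.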

SplunkTrust

Now I'm thoroughly confused... if the email tokens part of your issue is solved, you should probably close this question and open a new one with your transaction issue along with some sample data so people can reproduce the issue easily.

SplunkTrust

Using that very token over here works well, and it seems your alert definition is correct as well. What values does your search return for the field?

SplunkTrust

I discovered the mistake yesterday, and you also hit the nail on the head: it's an issue with my search query. I did not include TestCall="*" in my search, so the alert does not see TestCall. I thought that defining it as an extracted field was enough, but I have to put it in my search query.

Now my other issue is that when I add TestCall="*", a few things happen:

The unique identifier "GUID" is attached to both the request and the response. This allows me to pipe the events into transaction, group them together, and measure the response time for each web service call. Today there were 10 grouped transactions with a response time longer than 5 seconds.

Query 1:

index=uvtrans  | transaction GUID startswith="*request" endswith="*response" | WHERE duration>5

This returns 10 events with response times greater than 5 seconds.

Query 2:

index=uvtrans TestCall="*"  | transaction GUID startswith="*request" endswith="*response" 

This returns results, but none of the durations is longer than 5 seconds, unlike the original search.

Query 3:

index=uvtrans TestCall="*" | transaction GUID startswith="*request" endswith="*response" | WHERE duration>5

Adding WHERE duration>5 now returns 0 results.

SplunkTrust

Something more general about your search: you will miss a lot of long-running calls.

First, you're running the search every minute over -1m to now. Assuming your data has on average two seconds of latency and clock inaccuracy, you will on average miss two seconds of data for every execution.

Second, you're using transaction to merge the events of each call together. Say a call starts at 01:02:55 and ends at 01:03:05: it'd be ten seconds long and alertable. However, your search running at 01:03:00 will only see the start, and your search running at 01:04:00 will only see the end.

Here's an alternative suggestion to be scheduled every minute with a time range of -2m@m to @m:

index=uvtrans  | transaction GUID startswith="*request" endswith="*response" | WHERE duration>5 AND _time < relative_time(now(), "-1m@m")

That will search each event twice, but only return calls that started in the first minute of the two-minute time range. Hence a call starting in the first minute and ending in the second will be covered.
The timing relies on two assumptions: that you have low-ish latency, and that calls don't take longer than a minute minus that latency. If either assumption doesn't hold for you, you'll need to extend the time range or increase the offset into the past.
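
The window arithmetic can be checked with a small Python simulation. This is only a sketch under stated assumptions: it ignores indexing latency, it assumes each search runs exactly on the minute, and it models the cutoff as the start of the second minute of the window. The date and all function names are hypothetical; the call times are the ones from the example above.

```python
from datetime import datetime, timedelta

def floor_minute(t):
    """Snap a timestamp down to the start of its minute (like @m)."""
    return t.replace(second=0, microsecond=0)

def one_minute_window(run_time):
    """Original schedule: -1m to now."""
    return run_time - timedelta(minutes=1), run_time

def two_minute_window(run_time):
    """Suggested schedule: -2m@m to @m."""
    end = floor_minute(run_time)
    return end - timedelta(minutes=2), end

def sees_whole_call(window, call):
    """True if both the request and the response fall inside the window."""
    start, end = window
    return start <= call[0] and call[1] < end

def reported(run_time, call):
    """Suggested search: the whole call is visible and it began in the first minute."""
    window = two_minute_window(run_time)
    cutoff = window[0] + timedelta(minutes=1)
    return sees_whole_call(window, call) and call[0] < cutoff

# The example call: starts 01:02:55, ends 01:03:05 (the date is arbitrary).
call = (datetime(2015, 1, 1, 1, 2, 55), datetime(2015, 1, 1, 1, 3, 5))
runs = [datetime(2015, 1, 1, 1, m, 0) for m in range(2, 7)]

old_hits = [r for r in runs if sees_whole_call(one_minute_window(r), call)]
new_hits = [r for r in runs if reported(r, call)]
print(len(old_hits), len(new_hits))
```

With these inputs, no one-minute window ever contains both the request and the response, while the two-minute schedule reports the call exactly once, in the run at 01:04:00.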

SplunkTrust

Thanks for the suggestion, I applied it to my alert. Any ideas on how to get this token working so I can see which web service is running slow in the email subject line?

SplunkTrust

You could call a script, sure... Bash or Python work pretty well out of the box.

Alternatively, you could post your savedsearches.conf entry for this search so we can see where the issue is and get email working like it does for thousands of others.
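
For the script route, a rough Python sketch might look like the following. It rests on assumptions that should be verified against the documentation for your version: that the alert script receives the path to a gzip-compressed CSV of the results as its eighth command-line argument, and that the field is named TestCall. The helper names are hypothetical, and the script only builds a subject string; actually sending the email is left out.

```python
import csv
import gzip
import sys

def slow_services(results_path, field="TestCall"):
    """Return the distinct non-empty values of `field` from a gzipped CSV file."""
    seen = []
    with gzip.open(results_path, "rt", newline="") as handle:
        for row in csv.DictReader(handle):
            value = (row.get(field) or "").strip()
            if value and value not in seen:
                seen.append(value)
    return seen

def build_subject(services):
    """Build an email subject naming every slow web service found."""
    names = ", ".join(services) if services else "unknown service"
    return "Splunk Alert: " + names + " Resp > 5 Sec"

if __name__ == "__main__" and len(sys.argv) > 8:
    # Assumption: argument 8 is the path to the results file.
    print(build_subject(slow_services(sys.argv[8])))
```

Unlike a single token, this lists every distinct service in the results rather than only the one in the first row.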

SplunkTrust

Here's my entry from savedsearches.conf. Thanks for the help so far.

[Response Time > 5 Sec]
    action.email = 1
    action.email.include.trigger = 1
    action.email.reportServerEnabled = 0
    action.email.subject.alert = Splunk Alert: $result.Call5$ Resp > 5 Sec
    action.email.to = DOTCOM_PERFORMANCE_MONITORING_ALERTS@xxxxxxx.com
    action.email.useNSSubject = 1
    alert.digest_mode = 0
    alert.suppress = 0
    alert.track = 1
    counttype = number of events
    cron_schedule = * * * * *
    description = This alert goes off when a web service call takes longer than 5 seconds
    dispatch.earliest_time = -1m
    dispatch.latest_time = now
    display.events.fields = ["host","TYPE","CLASSPATH","splunk_server","Status","City","Code","GUID","req","resp","duration","CLASS","linecount","GUID1"]
    display.page.search.mode = verbose
    display.visualizations.chartHeight = 279
    display.visualizations.charting.chart = line
    enableSched = 1
    quantity = 0
    relation = greater than
    request.ui_dispatch_app = search
    request.ui_dispatch_view = search
    search = index=uvtrans  | transaction GUID startswith="*request" endswith="*response" | WHERE duration>5

SplunkTrust

I went ahead and created another field extraction to replace Call5, to make sure that wasn't the issue. So I now have a new extracted field called 'TestCall'.

I searched for all web service calls that took longer than 5 seconds, then did a field extraction for the request names (TestCall). There were only about 10 such calls, and my TestCall extraction picked up all the request names as expected.

I then changed my alert subject to 'Splunk Alert: $result.TestCall$ Resp > 5 Sec'

Now I have to wait for a request that takes longer than 5 seconds to see if this works. I also included my savedsearches.conf entry above. Please advise on what else I should try if this doesn't work.

SplunkTrust

What version of Splunk are you on?
Also, my bad - it is $result.Call5$ without the s... http://docs.splunk.com/Documentation/Splunk/6.2.2/Alert/Setupalertactions#Use_tokens_in_email_notifi...

SplunkTrust

I'm running 6.2

I'm pretty sure I tried both $result.Call5$ and $results.Call5$ with no luck. It looks like my only option left is to call an external script to report which web service is running slow. Do you know if it's possible for me to use JavaScript to do so?

SplunkTrust

My extracted field is named 'Call5'. I went into my alert and set the email subject to:

Splunk Alert: $results.Call5$ Response > 5 Seconds

Now the only thing appearing in the subject line is 'Splunk Alert:'
