Splunk Search

Report on first event in an incident which contains a cluster of abnormal transactions

Builder

Hello, I am grouping some events using transaction and, from there, identifying what we will call a performance degradation in our application, which manifests as longer-than-normal durations as calculated by transaction. I would like to characterize the incidents by the number of "slow" transactions and the times of the first and last slow transaction in the span defined by the cluster of slow transactions. I have looked at localize and some of the other commands but can't get anything to give me something useful. For now, let's just say that abnormal transactions are anything over 3 seconds, that an incident could contain 1 abnormal transaction or 100, and that they generally last 5 to 20 seconds.

Thanks in advance.
Sean

1 Solution

Legend

I tried a lot of stuff, but it seems to me that the problem is really "how do you define the cluster". You can use the kmeans command, but to do that you have to specify the number of clusters in the command - which means that you have to already have looked at the output.

But maybe the following will give you more ideas to try:

yoursearchhere
| stats range(_time) as duration earliest(_time) as startX latest(_time) as endX by transactionId
| where duration > 3
| sort startX
| streamstats window=1 current=f last(startX) as prevStartX
| eval beforeTimeDiff = startX - prevStartX
| reverse
| streamstats window=1 current=f last(startX) as afterStartX
| eval afterTimeDiff = afterStartX - startX
| eventstats p60(beforeTimeDiff) as beforeTimeCutoff p60(afterTimeDiff) as afterTimeCutoff
| where beforeTimeDiff > beforeTimeCutoff AND afterTimeDiff > afterTimeCutoff
| sort startX
| table startX duration 

Note that I didn't use the transaction command; I used stats instead to calculate the duration. This works great if you have a unique identifier for the transaction; otherwise, you might have to use the transaction command.
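
To make the stats step concrete outside of Splunk, here is a minimal Python sketch of the same idea: group events by a unique transaction identifier, take the range of their timestamps as the duration, and keep only the "slow" ones. The function name slow_transactions, the (transaction_id, epoch_seconds) input shape, and the sample data are assumptions for illustration only; the 3-second threshold comes from the question.

```python
from collections import defaultdict


def slow_transactions(events, threshold=3.0):
    """Return slow transactions sorted by start time.

    events: iterable of (transaction_id, epoch_seconds) pairs (assumed shape).
    threshold: a transaction is "slow" if its duration exceeds this many seconds.
    """
    # Collect every timestamp seen for each transaction id.
    times = defaultdict(list)
    for txn_id, ts in events:
        times[txn_id].append(ts)

    rows = []
    for txn_id, stamps in times.items():
        start, end = min(stamps), max(stamps)  # earliest(_time), latest(_time)
        duration = end - start                 # range(_time)
        if duration > threshold:               # where duration > 3
            rows.append({"transactionId": txn_id, "startX": start,
                         "endX": end, "duration": duration})

    rows.sort(key=lambda r: r["startX"])       # sort startX
    return rows


# Made-up sample events: "a" lasts 1 s, "b" lasts 4.5 s, "c" lasts 8 s.
events = [("a", 100.0), ("a", 101.0),
          ("b", 100.5), ("b", 105.0),
          ("c", 102.0), ("c", 110.0)]
slow = slow_transactions(events)
for row in slow:
    print(row)
```

As with the search above, this only works if every event carries a unique transaction identifier.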


Builder

Thanks for viewing and at least thinking about this, Splunk Answers community. I solved my problem by writing a Python search command, which I named clusterstats, and I shared it with the world at http://apps.splunk.com/app/1869/

Yay me!

-sean


Builder

Thank you for your effort, Lisa. I will play with that at some point. Transaction was working smoothly for me thanks to some of its options, and when I plugged in your suggestion without it I was not getting anywhere. localize seems like it would do what is needed, but perhaps it does not play well with transaction. The behavior seems easy enough to write as a Python command, so I am going to explore that to get the behavior I want. I am going to write the command to take filtered results and an argument representing the longest span of seconds that I want to consider a cluster.
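
The behavior described here could be sketched roughly as below. This is plain Python, not a Splunk search command and not the clusterstats code. It assumes the input is the list of start times of the already-filtered slow transactions, and it reads the "longest span" argument as the largest gap allowed between consecutive slow transactions in one cluster (called max_gap here); a cap on total cluster length would be a different rule. The function name and sample data are made up for illustration.

```python
def cluster_by_gap(start_times, max_gap):
    """Group timestamps into incidents.

    A timestamp joins the current incident if it falls within max_gap
    seconds of that incident's most recent timestamp; otherwise it
    starts a new incident. Returns one dict per incident with the
    first time, last time, and number of slow transactions.
    """
    incidents = []
    for ts in sorted(start_times):
        if incidents and ts - incidents[-1]["last"] <= max_gap:
            # Close enough to the previous slow transaction: extend the incident.
            incidents[-1]["last"] = ts
            incidents[-1]["count"] += 1
        else:
            # Gap too large (or first timestamp): open a new incident.
            incidents.append({"first": ts, "last": ts, "count": 1})
    return incidents


# Made-up start times (epoch seconds) of slow transactions.
starts = [100.5, 102.0, 104.0, 300.0, 900.0, 903.0]
incidents = cluster_by_gap(starts, max_gap=10)
for incident in incidents:
    print(incident)
```

With this rule an incident can hold a single slow transaction or many, which matches the "1 or 100" case in the question, and no cluster count has to be chosen up front as it would with kmeans.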
