Knowledge Management

How do I find the last occurrence of a pattern in an event (based on a transaction)

RickPeters
Engager

I have a search on a application log file which uses transaction to combine several events into one based on a common transactionid.
In this result I want to find the last occurrence of a certain pattern

Given this example transaction event:

2015-09-24 15:16:59.473 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.deployit.engine.tasker.Task - Publishing state change UNREGISTERED -> PENDING
2015-09-24 15:16:59.480 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.deployit.engine.tasker.Task - Publishing state change PENDING -> QUEUED
2015-09-24 15:16:59.481 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.deployit.engine.tasker.Task - Publishing state change QUEUED -> EXECUTING
2015-09-24 15:17:00.838 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.deployit.engine.tasker.Task - Publishing state change EXECUTING -> EXECUTED
2015-09-24 15:17:00.838 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.d.e.tasker.TaskManagingActor - Task [d4fd6917-158d-407f-bae5-432bb44bc368] is completed with state [EXECUTED]
2015-09-24 15:17:01.509 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.deployit.engine.tasker.Task - Publishing state change EXECUTED -> DONE
2015-09-24 15:17:02.508 [TaskExecutionEngine-akka.actor.default-dispatcher-15158] {taskId=d4fd6917-158d-407f-bae5-432bb44bc368, username=zijlsm, stepDescription=Running garbage collection on the repository} INFO  c.x.d.e.tasker.TaskManagingActor - Task [d4fd6917-158d-407f-bae5-432bb44bc368] is completed with state [DONE]

I want to retrieve the last state change (in this case EXECUTED -> DONE of which I need only the text DONE)
I tried several regular expression but can't get it to work 😞

Basic idea was looking for -> which is not followed by (.*->.*) (anything containing ->) but I can't get this to work.

My latest try was this rex field=_raw "(?m).*(-> )(?!(.*->.*))(?<curstate>.*)" bit this just captures the first state (PENDING)

I've been trying for quite some time now and any help would be welcome.
I could try and do this without using the transaction by taking just the last event with a state transition, however I also need the other events in this transaction the don't contain state transitions but other interesting fields (hurraah for this log format in which the aplication logs using a lot of different formats and the only recurring attribute is the taskId)

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

If you define fields based on the events, then when you group the events into a transaction, the fields will become multi-valued.

So, defining the state change field before creating your transaction might be helpful:

| your search here
| rex "state change\s*(?<state_change>.*)" 
| transaction here
| eval last_state_change = mvindex(state_change,-1)
| eval last_state_change = replace(last_state_change,".*?-\>\s*(.*)","\1")
| whatever you want to do next

This picks off the last value for the state_change field and then does a replace that keeps only the portion of the field following the ->

HTH!

View solution in original post

lguinn2
Legend

If you define fields based on the events, then when you group the events into a transaction, the fields will become multi-valued.

So, defining the state change field before creating your transaction might be helpful:

| your search here
| rex "state change\s*(?<state_change>.*)" 
| transaction here
| eval last_state_change = mvindex(state_change,-1)
| eval last_state_change = replace(last_state_change,".*?-\>\s*(.*)","\1")
| whatever you want to do next

This picks off the last value for the state_change field and then does a replace that keeps only the portion of the field following the ->

HTH!

RickPeters
Engager

Great Answer!

I was already thinking about evaluating the last_state_change before the transaction but didn't know that a multi-value field would be created 🙂

Next problem was that the ordering of multi-value fields is lexical by default and not in the event order (which seems very strange to me). So I had to take a lot of steps to solve that.
This is what I came up with:

| transaction whatever you want mvlist=state_change
| full_state=mvfilter(state_change!="NULL")
| eval last_state = mvindex(full_state, -1)

The mvlist is necessary to get the event order
The mvfilter is necessary to skip NULL values, which you get if you use mvlist, that I had to use "NULL" as a string was also something I didn't expect.
The final mvindex gives the last known state I needed

TFTH!

0 Karma

lguinn2
Legend

Dang, I forgot that it sorts the multi-value fields - how annoying. And "NULL" seems weird, can you use one of the following?

full_state=mvfilter(isnotnull(state_change))

full_state=mvfilter(state_change!=null())

AFAIK, there is actually no keyword called NULL, even though that is implied in the documentation.

0 Karma

RickPeters
Engager

Hi, I tried your suggestions:

state_change!=null() delivers a Error in 'eval' command: Typechecking failed. The '!-' operator received different types.

isnotnull(state_change) does not hit an error, but does not work as intended (it sees "NULL" values as real values)

I really think that the conversion of the separate events into a multivalue field has some kind of error when not sorting lexically. It should not add the NULL values as strings "NULL" to the multivalue field as it seems to do now?

thanks for all the help and the follow-up !

0 Karma

lguinn2
Legend

I agree with you, that is just weird. I assume that the string "NULL" appears nowhere in the data?

Quick question, how are you forming the transaction? Or better said, what are the parameters that you use to the transaction command?

0 Karma

RickPeters
Engager

I use this transaction statement:

transaction taskId maxpause=120m maxevents=30000 mvlist=state_change

I'm still tuning the maxpause and maxevents parameters based on real transactions. These are tasks running in a task scheduler. Some of them are really long and contain lots of steps. Most of them are ready running within a few minutes, but is specific cases (error situation) a task does not end correctly and stays as a kind of zombie in the system (hence the long maxpause). It's a bit brute force, but Splunk seems to handle this pretty wel 🙂

0 Karma
Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...