How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

Engager

Hi,

I have a setup with a message producer and a message consumer, and I am using ActiveMQ for messaging.

Sometimes I notice that the consumer didn't receive messages. I log events like this:

Producer code: log.info("Status=Produced, TransactionId=123");
Consumer code: log.info("Status=Consumed, TransactionId=123");

I also have a dead-letter queue (DLQ) consumer, which logs something like:

DLQConsumer: log.info("Status=Discarded, TransactionId=123");

The whole producer/consumer flow is asynchronous.

I need Splunk to alert me when it sees a transaction that was not processed by the Consumer.

How do I write a Splunk search to alert me for these?

In a nutshell, what I would like reported is this:

Every message produced should be consumed; if one is not, I need an alert that includes its TransactionId.

I also don't want Splunk to report a message that was just produced and simply hasn't been consumed yet.
Maybe I can set the time range from (current time - 15 minutes) to (current time - 1 minute) to avoid that situation.

Re: How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

Communicator

... | eval isnotable_event=if(condition,"t",null()) | transaction somefield | where isnull(isnotable_event)
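
Applied to the fields in the question, the pattern above might look like the following sketch. It assumes that Status and TransactionId are extracted as fields from the log lines, that index=<your_index> is a placeholder for your own base search, and that the 15-minute/1-minute window suggested in the question is acceptable:

```spl
index=<your_index> earliest=-15m latest=-1m (Status=Produced OR Status=Consumed)
| eval isnotable_event=if(Status="Consumed","t",null())
| transaction TransactionId
| where isnull(isnotable_event)
| table TransactionId
```

Saved as an alert that triggers when the number of results is greater than 0, this would list each TransactionId that has no Consumed event in the window. One caveat: a message produced just before the one-minute cutoff and consumed just after it would still be reported.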

Re: How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

Esteemed Legend

I don't know why you mentioned the DLQ, but something like this should work for you:

... | reverse | streamstats current=t count(eval(Status="Produced")) AS sessionID by TransactionId | stats earliest(_time) AS startTime latest(_time) AS endTime count by TransactionId sessionID | where count=1 | eval waitingSeconds = now() - startTime | where waitingSeconds > (15*60)

Re: How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

Engager

The reason I mentioned the DLQ is that I want a report telling me how many messages were not processed on the Consumer layer. Ideally, if I produce X messages, I want to consume all X. Regardless of where the unprocessed messages end up (sent to the DLQ or simply never consumed), I need a report that clearly tells me that X were produced and X - n were consumed, and it should contain just the n unconsumed records along with their TransactionIds.
Yesterday I ran into an issue where the Producer dropped off messages and I saw no activity on the Consumer side. The messages were processed by the DLQConsumer after a while because the Consumer had some issue (likely its connectivity to ActiveMQ was broken). Although a simple restart resolved the issue, I had no idea why the Consumer was not processing messages. The issue lasted for a few hours. I would have reacted if I had had a Splunk alert for a situation like this, which is why I posted this question yesterday.
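
A minimal sketch of such a report, under the same assumptions as elsewhere in this thread: Status and TransactionId are extracted fields, index=<your_index> is a placeholder for the real base search, and the time window is the one proposed in the question. The field names produced, consumed, discarded, and firstSeen are made up for illustration:

```spl
index=<your_index> earliest=-15m latest=-1m (Status=Produced OR Status=Consumed OR Status=Discarded)
| stats earliest(_time) AS firstSeen count(eval(Status="Produced")) AS produced count(eval(Status="Consumed")) AS consumed count(eval(Status="Discarded")) AS discarded by TransactionId
| where produced > 0 AND consumed = 0
| eval firstSeen=strftime(firstSeen, "%Y-%m-%d %H:%M:%S")
| table TransactionId firstSeen discarded
```

Each row would be one of the n unconsumed transactions, and the discarded column shows whether the DLQConsumer logged it within the same window.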

Re: How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

Esteemed Legend

That was my point: for the purposes of your question, DLQ is irrelevant. My answer should suffice as-is.
