Splunk Search

How do I create an alert when Splunk sees a transaction that is missing a certain log event, but avoid false positives?

servlette
Engager

Hi,

I have something like the following, where I have a message producer and consumer.
I am using ActiveMQ for messaging.

Sometimes I notice that the consumer didn't receive messages. I'm logging like this:

Producer code: log.info("Status=Produced, TransactionId=123");
Consumer code: log.info("Status=Consumed, TransactionId=123");

I also have a Dead Letter queue consumer, which logs something like:

DLQConsumer: log.info("Status=Discarded, TransactionId=123");

The whole Producer/Consumer flow is asynchronous.

I need Splunk to alert me when it sees a transaction that was not processed by the Consumer.

How do I write a Splunk search that alerts me about these?

In a nutshell, what I would like reported is this:

All messages produced should be consumed. If one is not, I need to be alerted with its TransactionId.

Also, I don't want Splunk to report a message that was only just produced and has not been consumed yet.
Maybe I can set the time range from 15 minutes ago to 1 minute ago to avoid that situation.


woodcock
Esteemed Legend

I don't know why you mentioned the DLQ, but something like this should work for you:

... | reverse | streamstats current=t count(eval(Status="Produced")) AS sessionID by TransactionId | stats earliest(_time) AS startTime latest(_time) AS endTime count by TransactionId sessionID | where count=1 | eval waitingSeconds = now() - startTime | where waitingSeconds > (15*60)
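
For comparison, here is a minimal stats-only sketch of the same check. It assumes the Status and TransactionId fields are extracted at search time from the key=value log lines. The index name app_logs is hypothetical, and the 15-minute and 1-minute bounds are simply the values suggested in the question, not tested settings.

```spl
index=app_logs earliest=-15m (Status=Produced OR Status=Consumed OR Status=Discarded)
| stats min(eval(if(Status="Produced", _time, null()))) AS producedTime
        count(eval(Status="Consumed")) AS consumed
        count(eval(Status="Discarded")) AS discarded
        BY TransactionId
| where isnotnull(producedTime) AND consumed=0
| where producedTime <= relative_time(now(), "-1m")
| table TransactionId producedTime discarded
```

The search runs up to the current time but only reports transactions produced more than a minute ago, so a Consumed event that arrives a little late can still be matched before the transaction is flagged. Saved as a scheduled alert that triggers when the number of results is greater than zero, each result row carries one unconsumed TransactionId, and the discarded column shows whether the DLQ consumer picked it up.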

servlette
Engager

The reason I mentioned the DLQ is that I wanted a report telling me how many messages were not processed on the Consumer layer. Ideally, if I produce X messages, I want to consume all X. Regardless of where the messages end up (sent to the DLQ or simply not consumed), I need a report that clearly tells me X were produced and X - n were consumed, and the report should contain just the n unconsumed records along with their TransactionIds.
Yesterday I ran into an issue where the Producer dropped off messages and I saw no activity on the Consumer side. The messages were processed by the DLQConsumer after a while, because the Consumer had some issue (likely a broken connection to ActiveMQ). Although a simple restart resolved it, I had no clue why the Consumer was not processing messages. The issue lasted for a few hours. I would have reacted sooner if I had had a Splunk alert for a situation like this, which is why I posted this question yesterday.


woodcock
Esteemed Legend

That was my point: for the purposes of your question, DLQ is irrelevant. My answer should suffice as-is.


carmackd
Communicator

... | eval is_notable_event=if(condition,"t",null()) | transaction some_field | where isnull(is_notable_event)
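
Filled in for the logs in the question, that pattern might look like the sketch below. This is an assumption about how the placeholders map onto this case: the condition becomes Status="Consumed", some_field becomes TransactionId, and the index name app_logs is hypothetical.

```spl
index=app_logs earliest=-15m (Status=Produced OR Status=Consumed)
| eval is_notable_event=if(Status="Consumed", "t", null())
| transaction TransactionId
| where isnull(is_notable_event)
| where _time <= relative_time(now(), "-1m")
| table TransactionId _time
```

After transaction, _time is the timestamp of the earliest event in each group, so the last filter skips transactions that started less than a minute ago. On large volumes, transaction tends to be more expensive than an equivalent stats search.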
