Hi,
I have an async producer and consumer, each logging something like:
producer:
log.info("id=123, status=produced");
consumer:
log.info("id=123, status=consumed");
where id is the transaction ID.
I want to be alerted only when the producer is producing and, for some reason, the consumer has stopped consuming.
I wrote something like:
index="myindex" sourcetype="mysourcetype" | transaction id startswith=(status="produced") endswith=(status="consumed") keepevicted=true maxevents=10 | stats count by closed_txn
Then I ran both producer and consumer simultaneously and observed Splunk showing both 0 and 1 for closed_txn.
I expected to see only closed_txn=1, since both consumer and producer were running.
Later I killed the consumer and let the producer keep running.
I still get closed_txn values of both 1 and 0, whereas I expected Splunk to report only 0, because a transaction cannot complete without a log line from the consumer.
I am not sure if I am doing it right.
In summary I want to get alerted when there is production but no consumption.
I don't want to get alerted when there is no production.
Hi @Aravind_Sridharan@intuit.com,
Your assumptions about closed_txn are right, as you can see here:
https://docs.splunk.com/Documentation/Splunk/7.2.6/SearchReference/Transaction
Your problem could be due to the search being too memory intensive; see the sentences marked with ** in this excerpt from the doc above:
keepevicted
Syntax: keepevicted=<bool>
Description: Whether to output evicted transactions. Evicted transactions can be distinguished from non-evicted transactions by checking the value of the 'closed_txn' field. The 'closed_txn' field is set to '0', or false, for evicted transactions and '1', or true for non-evicted, or closed, transactions. The 'closed_txn' field is set to '1' if one of the following conditions is met: maxevents, maxpause, maxspan, startswith. For startswith, because the transaction command sees events in reverse time order, it closes a transaction when it satisfies the start condition. **If none of these conditions is specified, all transactions are output even though all transactions will have 'closed_txn' set to '0'. A transaction can also be evicted when the memory limitations are reached**.
Default: false or 0
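To see which ids end up evicted, a diagnostic search along these lines might help. This is only a sketch: it reuses the index, sourcetype and transaction options from the question, and relies on the eventcount and duration fields that the transaction command adds to its output.

```spl
index="myindex" sourcetype="mysourcetype"
| transaction id startswith=(status="produced") endswith=(status="consumed") keepevicted=true maxevents=10
| where closed_txn=0
| table id, eventcount, duration, closed_txn
```

If ids that you know were consumed show up here, that would suggest eviction for reasons other than a missing consumer event, such as the memory limits mentioned in the excerpt.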
You could handle your query in a more processor-efficient way that avoids transaction altogether:
index="myindex" sourcetype="mysourcetype" status="produced" OR status="consumed"
|stats values(status) as status, dc(status) as count by id
|where count <2 AND status!="consumed"
Cheers,
David
Thank you.
@Aravind_Sridharan@intuit.com ,
Try this and test it against your requirement:
index="myindex" sourcetype="mysourcetype" (status="produced" OR status="consumed")
|stats values(status) as status, dc(status) as count by id
|where count <2 AND isnotnull(mvfind(status,"produced"))
This alerts when a transaction has only one status and that status is produced.
You may change the where condition according to your final requirement.
When I run this, I do see a few search results when the consumer is not fast enough to process a message. Is there a way to add a delay of, say, 5-10 seconds? Sometimes the consumer receives a message 5+ seconds after it was produced.
Try adding a condition based on time as well. Please test it thoroughly:
index="myindex" sourcetype="mysourcetype" (status="produced" OR status="consumed")
|stats values(status) as status,dc(status) as count, latest(eval(if(status=="produced",_time,null()))) as time by id
|where count <2 AND isnotnull(mvfind(status,"produced")) AND (now()-time) > 10
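To sanity-check the where clause without waiting for real traffic, a sketch with synthetic events might help. Everything generated here is an assumption for illustration only: makeresults creates three fake events, id 123 gets both statuses, id 456 gets only produced, and every event is stamped 60 seconds in the past.

```spl
| makeresults count=3
| streamstats count as n
| eval id=case(n<=2,"123", true(),"456"), status=case(n==2,"consumed", true(),"produced"), _time=now()-60
| stats values(status) as status, dc(status) as count, latest(eval(if(status=="produced",_time,null()))) as time by id
| where count <2 AND isnotnull(mvfind(status,"produced")) AND (now()-time) > 10
```

Only id 456 has a produced event with no matching consumed event, so it is the one the where clause is meant to keep. Changing now()-60 to now()-5 makes the synthetic events younger than the 10-second threshold, which should be filtered out. When you schedule the real search, the time range probably needs to be wider than the delay you want to tolerate.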