I've got the following query to detect that a worker instance of mine is actually doing what it's supposed to on a regular basis. If it doesn't in a particular environment, the query won't return a row for that environment. I thought perhaps I could join the results with a literal dataset of environments, to ensure there is a row for each environment, but despite looking over the documentation, I can't find a way to make the join work. Admittedly, I'm new to Splunk querying, so might be missing something obvious, or there might be some other way of doing this without `join`.
| mstats sum(worker.my_metric) AS my_metric
WHERE index="service_metrics"
AND host=my-worker-*
earliest=-2h
BY host
| eval env = replace(host, "^my-worker-(?<env>[^-]+)$", "\1")
| stats sum(my_metric) AS my_metric
BY env
| eval active = IF(my_metric > 0, "yes", "no")
| join type=right left=M right=E WHERE M.env = E.env
from [{ env: "dev" }, { env: "beta" }, { env: "prod" }]
Finding something that is not there is not Splunk's strong suit. See this blog entry for a good write-up on it.
https://www.duanewaddle.com/proving-a-negative/
FTR, if an <env> value is missing, it will be absent from the stats command results, not zero.