I have an unstable data feed that sometimes only reports on a fraction of all assets. I do not want such periods to show any numbers. The best way I can figure to exclude those time periods is to detect a sudden drop in some sort of total. So, I set up a condition after timechart like this:
| addtotals
| delta "Total" as delta
| foreach *
[eval <<FIELD>> = if(-delta > Total OR Total < 5000, null(), '<<FIELD>>')]
The algorithm works well for Total, and for some series in timechart, but not for all of them, and not all the time.
Here are two emulations using index=_internal on my laptop. One groups by source, the other groups by sourcetype.
index=_internal earliest=-7d
| timechart span=2h count by source
``` data emulation 1 ```
When grouping by source, all series blank out as expected.
Now, I can run the same tally by sourcetype, like this:
index=_internal earliest=-7d
| timechart span=2h count by sourcetype
``` data emulation 2 ```
This time, every gap has at least one series that is not null; some series go to zero instead of null, and some even stay clearly above zero.
What is the determining factor here?
If you have suggestions about alternative approaches, I would appreciate them as well.
Your foreach eval statement is wrong; it should test for the Total and delta fields:
[eval <<FIELD>> = if("<<MATCHSTR>>"!="Total" AND "<<MATCHSTR>>"!="delta" AND (-delta > Total OR Total < 5000), null(), '<<FIELD>>')]
You are not excluding the 'delta' and 'Total' fields from the eval, so Total is set to null() before the other fields are processed, which breaks the eval for subsequent passes.
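To see the mechanics, here is a minimal Python sketch simulating one foreach pass over a single timechart row, assuming (per the diagnosis above) that `foreach *` visits fields in ASCII order and that an eval comparison against a null field is never true. The field names and values are made up for illustration.

```python
def foreach_pass(row, threshold=5000):
    """Simulate `foreach *` applying the original if() condition to each field."""
    out = dict(row)
    for field in sorted(out):  # assumed ASCII enumeration order
        total, delta = out.get("Total"), out.get("delta")
        # In SPL, a comparison involving null is not true, so once Total
        # has been nulled the condition fails and later fields survive.
        if total is not None and delta is not None and (-delta > total or total < threshold):
            out[field] = None
    return out

# Source-like names ("/..." sorts before "Total") are visited first, so they
# are all nulled before Total is; delta, visited after Total, survives.
print(foreach_pass({"/var/log/a": 10, "/var/log/b": 20, "Total": 30, "delta": -4000}))

# Sourcetype-like names (lowercase, sorting after "Total") are visited after
# Total has already been nulled, so they keep their values.
print(foreach_pass({"Total": 30, "delta": -4000, "splunkd": 10, "splunkd_ui": 20}))
```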
Excellent, thanks for the diagnosis! So, the ASCII order of the * enumeration ruined the conditions. In my emulations, the source series begin with a slash (/), which precedes T, but the sourcetype series all begin with a lower-case letter, which comes after T. This explains why the two group-bys behave differently.
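The ordering claim is easy to check; a quick Python illustration with hypothetical series names:

```python
# A source value starts with "/" (0x2F, before "T" at 0x54), while a
# sourcetype value starts with a lowercase letter (after "T" in ASCII).
fields = ["splunkd", "Total", "delta", "/var/log/splunk/metrics.log"]
print(sorted(fields))
# → ['/var/log/splunk/metrics.log', 'Total', 'delta', 'splunkd']
```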
But I still need to null out Total. So, a better (and yet simpler) approach is to place "Total" at the end of the enumeration, taking advantage of Splunk's globbing rule:
| addtotals
| delta "Total" as delta
| foreach * Total delta
[eval <<FIELD>> = if(-delta > Total OR Total < 5000, null(), '<<FIELD>>')]
| fields - delta
Update: In the real world, it is often undesirable to use an arbitrary threshold like Total < 5000. For this technique to work, I also need to make sure all other fields are nulled before delta. So, I must explicitly specify the order of these two fields in foreach.
Update 2: In addition to nullifying Total, I also need to remove delta. So, my best approach is a hybrid of hacking the field name and ensuring order:
| addtotals
| delta "Total" as _delta
| foreach * Total
[eval <<FIELD>> = if(-_delta > Total OR Total < 5000, null(), '<<FIELD>>')]
This way, field deletion is also unnecessary. Thanks again, @bowesmana, for the inspiration!
Or you can use the _ field prefix to hide it from the foreach, i.e.
| addtotals fieldname=_T
| delta "_T" as _delta
| foreach *
[eval <<FIELD>> = if(-'_delta' > '_T' OR '_T' < 5000, null(), '<<FIELD>>')]
``` To show that the _ fields are present ```
| eval y=_T, x=_delta
Nice! Love these little Splunk quirks aka tricks. (For anyone who stumbles upon the same need in the future: using _ would be perfect if Total doesn't have to be nullified. I need to null Total, so the amount of work is similar to reordering foreach.)