I am trying to find the rate at which parts fail. Parts send me a message every x amount of time. If I do not get a message from a part after around 20 days, I consider it dead. I have calculated the rate of "death" for each day. The last step is to multiply this value with the value above.
For instance, I am running this:
base search
| bucket _time span=1d
| stats latest(_time) AS LatestTime, earliest(_time) AS EarliestTime by uniqueID
| eval delayDays=(NOW()-LatestTime)/60/60/24
| eval dead=if((delayDays>20),1,0)
| eval lifeLength=(LatestTime - EarliestTime)/60/60/24
| chart count(eval(dead=0)) as "numberAlive", count(eval(dead=1)) as "numberDead" by lifeLength span=1
| eval sumOfDeadAlive=numberAlive+numberDead
| eventstats sum(sumOfDeadAlive) as sumOfDeadAlive
| eval survivalRate = (sumOfDeadAlive - numberDead)/(sumOfDeadAlive)
| streamstats list(survivalRate)
This will return something like this:
lifeLength numberAlive numberDead list(survivalRate) sumOfDeadAlive survivalRate
0-1 13 249 0.674935 766 0.674935
1-2 1 3 0.674935 766 0.996084
0.996084
2-3 0 0 0.674935 766 1
0.996084
1
etc
Is there a way to multiply each result in the "list(survivalRate)" together? Maybe an eval command I do not know? Or can you use mvexpand to expand the results into columns somehow instead of new rows?
EDIT:
I would like to add a column that shows the runningSurvivalRate.
In a perfect world it would be like running streamstats product(survivalRate) if that function existed (similar to sum(survivalRate) where instead of adding each result, it multiplies each result)
In the example above, the running survival rate would be like this:
lifeLength numberAlive numberDead list(survivalRate) sumOfDeadAlive survivalRate runningSurvivalRate
0-1 13 249 0.674935 766 0.674935 0.674935
1-2 1 3 0.674935 766 0.996084 0.67229(0.674935*0.996084)
0.996084
2-3 0 0 0.674935 766 1 0.67229(0.674935*0.996084*1)
0.996084
1
etc
Okay, using that clarification I'd get rid of this mv field altogether and do something like this:
...
| eval sumOfDeadAlive=numberAlive+numberDead
| eventstats sum(sumOfDeadAlive) as sumOfDeadAlive
| streamstats sum(numberDead) as sumOfDead
| eval survivalRate = (sumOfDeadAlive - sumOfDead)/(sumOfDeadAlive)
This slightly shuffles around the existing parts of the search, without getting into mv fields.
Okay, using that clarification I'd get rid of this mv field altogether and do something like this:
...
| eval sumOfDeadAlive=numberAlive+numberDead
| eventstats sum(sumOfDeadAlive) as sumOfDeadAlive
| streamstats sum(numberDead) as sumOfDead
| eval survivalRate = (sumOfDeadAlive - sumOfDead)/(sumOfDeadAlive)
This slightly shuffles around the existing parts of the search, without getting into mv fields.
This is pretty close. But mathmatically not quite the same. I really do need to multiply the result of row 1 with row 2, then the result of that with row 3, and that with the result of row 4 etc.
Then I don't understand the meaning of your survivalRate 🙂
Basic example: Half dies on the first day, the other half dies on the second day. First-day-survivalRate is 0.5, obviously. Using your multiplication, the second-day-survivalRate is 0.25... which is where I'm confused. 100% is dead, 50% died during the second day, how is the survival rate 0.25?
You are 100% right. I had my math wrong in my head. Thank you for fixing me up.
Marking your answer as correct.
Could you be more specific as to how you'd like your final result be calculated, independent of Splunk SPL?
To me it looks like your sample data doesn't match the search provided, so I'm confused about your intentions.
Sorry, updated for clarity.