I have a column chart that works great, but I want to add a single value to each column.
The columns represent the sum of run times for a series of daily sub-jobs. Jobs are variable, but lets say for example there are 5 jobs that run, and maybe 5 sub-jobs. If I run my stats and chart using:
stats sum(subJobRunTime) as totalRunTime by Job Date |
xyseries Date, Job, totalRunTime
I get a nice column chart. The table looks like this:
Date | Job 1 | Job 2 | Job 3 | Job 4 | Job 5 |
6/1/2020 | 78.7 | 11.11666667 | 3.366666667 | 48.01666667 | 2.283333333 |
6/2/2020 | 93.55 | 13.71666667 | 39.88333333 | 53 | 3.233333333 |
6/3/2020 | 99.01666667 | 11.86666667 | 5.7 | 35.91666667 | 2.666666667 |
6/4/2020 | 77.36666667 | 34.98333333 | 86.31666667 | 50.78333333 | 2.183333333 |
6/5/2020 | 117.2166667 | 12.25 | 58.65 | 2.133333333 |
I would like to simply add a row at the bottom that is the average plus one standard deviation for each column, which I would then like to add as an overlay on the chart as a "limit line" that the user can use as a visual of "above this, job is taking too long."
I have tried append, appendpipe, and even appendcols, as well as trying eventstats to add the average as an additional field to each event, but these either add no value or worse, make the chart unnecessarily complicated (converting my columns into runTimeJob1, avgRunTimeJob1, runTimeJob2, avgRunTimeJob2, etc., so I don't have a single series "average" that I can overlay on the column chart. Would like something like:
Date | Job 1 | Job 2 | Job 3 | Job 4 | Job 5 |
6/1/2020 | 78.7 | 11.11666667 | 3.366666667 | 48.01666667 | 2.283333333 |
6/2/2020 | 93.55 | 13.71666667 | 39.88333333 | 53 | 3.233333333 |
6/3/2020 | 99.01666667 | 11.86666667 | 5.7 | 35.91666667 | 2.666666667 |
6/4/2020 | 77.36666667 | 34.98333333 | 86.31666667 | 50.78333333 | 2.183333333 |
6/5/2020 | 117.2166667 | 12.25 | 58.65 | 2.133333333 | |
Average+SD | 100 | 20 | 60 | 55 | 3.5 |
(and yes, the values in the last row are totally made up, but I hope you're seeing my point). Then I'd have the ability to chart job runtime by date, with an overlay of "Average+SD"
Any suggestions?
| makeresults
| eval _raw="Date Job 1 Job 2 Job 3 Job 4 Job 5
6/1/2020 78.7 11.11666667 3.366666667 48.01666667 2.283333333
6/2/2020 93.55 13.71666667 39.88333333 53 3.233333333
6/3/2020 99.01666667 11.86666667 5.7 35.91666667 2.666666667
6/4/2020 77.36666667 34.98333333 86.31666667 50.78333333 2.183333333
6/5/2020 117.2166667 12.25 58.65 2.133333333"
| multikv forceheader=1
| rename COMMENT as "this is your sample. from here, the logic"
| table Date Job_*
| appendpipe
[| untable Date Job data
| eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job
| eval "Average+SD" = avg_Job + sd_Job
| xyseries Date Job Average+SD
| head 1
| eval Date = "Average+SD"]
results:
Date | Job_1 | Job_2 | Job_3 | Job_4 | Job_5 |
6/1/2020 | 78.7 | 11.11666667 | 3.366666667 | 48.01666667 | 2.283333333 |
6/2/2020 | 93.55 | 13.71666667 | 39.88333333 | 53 | 3.233333333 |
6/3/2020 | 99.01666667 | 11.86666667 | 5.7 | 35.91666667 | 2.666666667 |
6/4/2020 | 77.36666667 | 34.98333333 | 86.31666667 | 50.78333333 | 2.183333333 |
6/5/2020 | 117.2166667 | 12.25 | 58.65 | 2.133333333 | |
Average+SD | 109.53838672500768 | 27.002833047932405 | 74.15337519346284 | 54.548603008500876 | 2.960223375576494 |
The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?
This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.
| makeresults
| eval _raw="Date Job 1 Job 2 Job 3 Job 4 Job 5
6/1/2020 78.7 11.11666667 3.366666667 48.01666667 2.283333333
6/2/2020 93.55 13.71666667 39.88333333 53 3.233333333
6/3/2020 99.01666667 11.86666667 5.7 35.91666667 2.666666667
6/4/2020 77.36666667 34.98333333 86.31666667 50.78333333 2.183333333
6/5/2020 117.2166667 12.25 58.65 2.133333333"
| multikv forceheader=1
| rename COMMENT as "this is your sample. from here, the logic"
| table Date Job_*
| appendpipe
[| untable Date Job data
| eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job
| eval "Average+SD" = avg_Job + sd_Job
| xyseries Date Job Average+SD
| head 1
| eval Date = "Average+SD"]
results:
Date | Job_1 | Job_2 | Job_3 | Job_4 | Job_5 |
6/1/2020 | 78.7 | 11.11666667 | 3.366666667 | 48.01666667 | 2.283333333 |
6/2/2020 | 93.55 | 13.71666667 | 39.88333333 | 53 | 3.233333333 |
6/3/2020 | 99.01666667 | 11.86666667 | 5.7 | 35.91666667 | 2.666666667 |
6/4/2020 | 77.36666667 | 34.98333333 | 86.31666667 | 50.78333333 | 2.183333333 |
6/5/2020 | 117.2166667 | 12.25 | 58.65 | 2.133333333 | |
Average+SD | 109.53838672500768 | 27.002833047932405 | 74.15337519346284 | 54.548603008500876 | 2.960223375576494 |
The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?
This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.
@to4kawa 's solution is elegant, and I can only scavenge his idea to propose this.
As appendpipe will be used, the non-streaming eventstats can be replaced by stats:
| appendpipe [
| untable Date, Job, data
| stats avg(data) as avg_Job stdev(data) as sd_Job by Job
| eval avgSD = avg_Job + sd_Job
| transpose header_field=Job column_name=Date
]
.. but as you know, it's easier to modify/adapt/criticize than to create
Yes. In appendpipe, stats is better
| appendpipe
[| untable Date Job data
| stats avg(data) as avg_Job stdev(data) as sd_Job by Job
| eval AvgSD = avg_Job + sd_Job
| eval Date="Average+SD"
| xyseries Date Job AvgSD]
transpose makes extra rows. so xyseries is better, I guess.