Solved: Re: Add avg to xyseries

seomaniv · ‎06-09-2020

I have a column chart that works great, but I want to add a single value to each column.

The columns represent the sum of run times for a series of daily sub-jobs. Jobs are variable, but lets say for example there are 5 jobs that run, and maybe 5 sub-jobs. If I run my stats and chart using:

stats sum(subJobRunTime) as totalRunTime by Job Date | 
xyseries Date, Job, totalRunTime

I get a nice column chart. The table looks like this:

Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65		2.133333333

I would like to simply add a row at the bottom that is the average plus one standard deviation for each column, which I would then like to add as an overlay on the chart as a "limit line" that the user can use as a visual of "above this, job is taking too long."

I have tried append, appendpipe, and even appendcols, as well as trying eventstats to add the average as an additional field to each event, but these either add no value or worse, make the chart unnecessarily complicated (converting my columns into runTimeJob1, avgRunTimeJob1, runTimeJob2, avgRunTimeJob2, etc., so I don't have a single series "average" that I can overlay on the column chart. Would like something like:

Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65		2.133333333
Average+SD	100	20	60	55	3.5

(and yes, the values in the last row are totally made up, but I hope you're seeing my point). Then I'd have the ability to chart job runtime by date, with an overlay of "Average+SD"

Any suggestions?

to4kawa · ‎06-09-2020

| makeresults 
| eval _raw="Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65	 	2.133333333" 
| multikv forceheader=1 
| rename COMMENT as "this is your sample. from here, the logic" 
| table Date Job_* 
| appendpipe 
    [| untable Date Job data 
    | eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval "Average+SD" = avg_Job + sd_Job 
    | xyseries Date Job Average+SD 
    | head 1 
    | eval Date = "Average+SD"]

results:

Date	Job_1	Job_2	Job_3	Job_4	Job_5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65		2.133333333
Average+SD	109.53838672500768	27.002833047932405	74.15337519346284	54.548603008500876	2.960223375576494

The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?

This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.

View solution in original post

to4kawa · ‎06-09-2020

| makeresults 
| eval _raw="Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65	 	2.133333333" 
| multikv forceheader=1 
| rename COMMENT as "this is your sample. from here, the logic" 
| table Date Job_* 
| appendpipe 
    [| untable Date Job data 
    | eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval "Average+SD" = avg_Job + sd_Job 
    | xyseries Date Job Average+SD 
    | head 1 
    | eval Date = "Average+SD"]

results:

Date	Job_1	Job_2	Job_3	Job_4	Job_5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65		2.133333333
Average+SD	109.53838672500768	27.002833047932405	74.15337519346284	54.548603008500876	2.960223375576494

The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?

This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.

ldongradi_SPL · ‎06-10-2020

@to4kawa 's solution is elegant, and I can only scavenge his idea to propose this.

As appendpipe will be used, the non-streaming eventstats can be replaced by stats:

| appendpipe [
| untable Date, Job, data 
| stats avg(data) as avg_Job stdev(data) as sd_Job by Job 
| eval avgSD = avg_Job + sd_Job
| transpose header_field=Job column_name=Date
]

.. but as you know, it's easier to modify/adapt/criticize than to create

to4kawa · ‎06-10-2020

Yes. In appendpipe, stats is better

| appendpipe 
    [| untable Date Job data 
    | stats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval AvgSD = avg_Job + sd_Job 
    | eval Date="Average+SD"
    | xyseries Date Job AvgSD]

transpose makes extra rows. so xyseries is better, I guess.

Add avg to xyseries

chart

stats

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Splunk Community Badges!

How to find the worst searches in your Splunk environment and how to fix them

Share Your Feedback: On Admin Config Service (ACS)!

Join the Conversation