Splunk Search

Add avg to xyseries

seomaniv
Explorer

I have a column chart that works great, but I want to add a single value to each column.

The columns represent the sum of run times for a series of daily sub-jobs. Jobs are variable, but lets say for example there are 5 jobs that run, and maybe 5 sub-jobs. If I run my stats and chart using:

 

stats sum(subJobRunTime) as totalRunTime by Job Date | 
xyseries Date, Job, totalRunTime

 

I get a nice column chart. The table looks like this:

 Date Job 1 Job 2 Job 3 Job 4 Job 5
6/1/2020 78.7 11.11666667 3.366666667 48.01666667 2.283333333
6/2/2020 93.55 13.71666667 39.88333333 53 3.233333333
6/3/2020 99.01666667 11.86666667 5.7 35.91666667 2.666666667
6/4/2020 77.36666667 34.98333333 86.31666667 50.78333333 2.183333333
6/5/2020 117.2166667 12.25 58.65   2.133333333

 

I would like to simply add a row at the bottom that is the average plus one standard deviation for each column, which I would then like to add as an overlay on the chart as a "limit line" that the user can use as a visual of "above this, job is taking too long." 

I have tried append, appendpipe, and even appendcols, as well as trying eventstats to add the average as an additional field to each event, but these either add no value or worse, make the chart unnecessarily complicated (converting my columns into runTimeJob1, avgRunTimeJob1, runTimeJob2, avgRunTimeJob2, etc., so I don't have a single series "average" that I can overlay on the column chart. Would like something like:

 Date Job 1 Job 2 Job 3 Job 4 Job 5
6/1/2020 78.7 11.11666667 3.366666667 48.01666667 2.283333333
6/2/2020 93.55 13.71666667 39.88333333 53 3.233333333
6/3/2020 99.01666667 11.86666667 5.7 35.91666667 2.666666667
6/4/2020 77.36666667 34.98333333 86.31666667 50.78333333 2.183333333
6/5/2020 117.2166667 12.25 58.65   2.133333333
Average+SD 100 20 60 55 3.5

 

(and yes, the values in the last row are totally made up, but I hope you're seeing my point). Then I'd have the ability to chart job runtime by date, with an overlay of "Average+SD"

Any suggestions?

Labels (2)
0 Karma
1 Solution

to4kawa
Ultra Champion

 

| makeresults 
| eval _raw="Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65	 	2.133333333" 
| multikv forceheader=1 
| rename COMMENT as "this is your sample. from here, the logic" 
| table Date Job_* 
| appendpipe 
    [| untable Date Job data 
    | eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval "Average+SD" = avg_Job + sd_Job 
    | xyseries Date Job Average+SD 
    | head 1 
    | eval Date = "Average+SD"]

 

results:

DateJob_1Job_2Job_3Job_4Job_5
6/1/202078.711.116666673.36666666748.016666672.283333333
6/2/202093.5513.7166666739.88333333533.233333333
6/3/202099.0166666711.866666675.735.916666672.666666667
6/4/202077.3666666734.9833333386.3166666750.783333332.183333333
6/5/2020117.216666712.2558.65 2.133333333
Average+SD109.5383867250076827.00283304793240574.1533751934628454.5486030085008762.960223375576494

The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?

This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.

View solution in original post

to4kawa
Ultra Champion

 

| makeresults 
| eval _raw="Date	Job 1	Job 2	Job 3	Job 4	Job 5
6/1/2020	78.7	11.11666667	3.366666667	48.01666667	2.283333333
6/2/2020	93.55	13.71666667	39.88333333	53	3.233333333
6/3/2020	99.01666667	11.86666667	5.7	35.91666667	2.666666667
6/4/2020	77.36666667	34.98333333	86.31666667	50.78333333	2.183333333
6/5/2020	117.2166667	12.25	58.65	 	2.133333333" 
| multikv forceheader=1 
| rename COMMENT as "this is your sample. from here, the logic" 
| table Date Job_* 
| appendpipe 
    [| untable Date Job data 
    | eventstats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval "Average+SD" = avg_Job + sd_Job 
    | xyseries Date Job Average+SD 
    | head 1 
    | eval Date = "Average+SD"]

 

results:

DateJob_1Job_2Job_3Job_4Job_5
6/1/202078.711.116666673.36666666748.016666672.283333333
6/2/202093.5513.7166666739.88333333533.233333333
6/3/202099.0166666711.866666675.735.916666672.666666667
6/4/202077.3666666734.9833333386.3166666750.783333332.183333333
6/5/2020117.216666712.2558.65 2.133333333
Average+SD109.5383867250076827.00283304793240574.1533751934628454.5486030085008762.960223375576494

The results don't match yours, but I'm adding up the mean and standard deviation.
how about this?

This will be my first response on the new platform.
Please Accept and Upvote, if you'd like.

ldongradi_splun
Splunk Employee
Splunk Employee

@to4kawa 's solution is elegant, and I can only scavenge his idea to propose this.

As appendpipe will be used, the non-streaming eventstats can be replaced by stats:

| appendpipe [
| untable Date, Job, data 
| stats avg(data) as avg_Job stdev(data) as sd_Job by Job 
| eval avgSD = avg_Job + sd_Job
| transpose header_field=Job column_name=Date
]

 

.. but as you know, it's easier to modify/adapt/criticize than to create

Tags (1)

to4kawa
Ultra Champion

Yes. In appendpipe, stats is better

 

| appendpipe 
    [| untable Date Job data 
    | stats avg(data) as avg_Job stdev(data) as sd_Job by Job 
    | eval AvgSD = avg_Job + sd_Job 
    | eval Date="Average+SD"
    | xyseries Date Job AvgSD]

transpose makes extra rows. so xyseries is better, I guess.

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...