I have a requirement to exclude data collected on weekend days from my search results. I can filter these days out accurately, but when I run a calculation over the resulting set, it still seems to include data from these days as 'zeros'. Here is the search string:
index=edistats RealName="ShipmentOpen" | `isWeekDay(_time)` | where weekDay < 6 AND weekDay > 0 | timechart span=1d avg(Elapsed) as average | `lineartrend(_time,average)` | timechart max(average) as mean sum(newY) as regression
The included macros are:
isWeekDay:
eval weekDay = strftime($time$,"%w")
and lineartrend:
eventstats count as numevents sum($x$) as sumX sum($y$) as sumY sum(eval($x$*$y$)) as sumXY sum(eval($x$*$x$)) as sumX2 sum(eval($y$*$y$)) as sumY2 | eval slope=((numevents*sumXY)-(sumX*sumY))/((numevents*sumX2)-(sumX*sumX)) | eval yintercept= ((sumY*sumX2)-(sumX*sumXY))/((numevents*sumX2)-(sumX*sumX))| eval newY=(yintercept + (slope*$x$)) | eval R=((numevents*sumXY) - (sumX*sumY))/sqrt(((numevents*sumX2)-(sumX*sumX))* ((numevents*sumY2)-(sumY*sumY))) | eval R2=R*R
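For reference, the lineartrend macro above reads as ordinary least squares written out with running sums. Writing n for numevents, x for $x$ and y for $y$ (the symbols m, b and \hat{y} are labels chosen here for slope, yintercept and newY, not names from the macro), it computes:

```latex
\begin{aligned}
m &= \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}}
  && \text{(slope)} \\
b &= \frac{\sum y \sum x^{2} - \sum x \sum xy}{n\sum x^{2} - \left(\sum x\right)^{2}}
  && \text{(yintercept)} \\
\hat{y} &= b + m\,x
  && \text{(newY)} \\
R &= \frac{n\sum xy - \sum x \sum y}
          {\sqrt{\left(n\sum x^{2} - \left(\sum x\right)^{2}\right)
                 \left(n\sum y^{2} - \left(\sum y\right)^{2}\right)}},
  \qquad R^{2} = R \cdot R
\end{aligned}
```

Every sum runs over whatever rows reach eventstats, so any extra rows in the input change n and the x sums along with everything derived from them.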
The first macro does the job and identifies the days I want to drop. However, once I drop them, the fitted linear regression comes out lower than it should be. This is clear from the chart, as the raw trend is mostly flat.
Do I need to do something in the lineartrend macro to ensure it avoids these dropped days?
Thanks,
Stan
timechart always adds the empty days back as 0-values, but stats does not, so do it like this:
index=edistats RealName="ShipmentOpen" | `isWeekDay(_time)` | where weekDay < 6 AND weekDay > 0 | bucket _time span=1d | stats avg(Elapsed) as average by _time | `lineartrend(_time,average)` | stats max(average) as mean sum(newY) as regression BY _time
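To see why the extra rows pull the fit down, here is a minimal sketch in Python (not Splunk code) that uses the same sums as the lineartrend macro. The data is made up for illustration: 14 consecutive days with a flat weekday average of 10, and it assumes the empty weekend rows enter the sums as if average were 0.

```python
def linear_fit(xs, ys):
    """Least-squares fit built from the same sums the lineartrend macro uses."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x * sum_x
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return slope, intercept


# Hypothetical data: day 0 is a Sunday, so days 0, 6, 7 and 13 are weekend days.
days = list(range(14))
is_weekday = [d % 7 not in (0, 6) for d in days]

# Case 1: only weekday rows exist (the bucket + stats approach).
weekday_x = [d for d, w in zip(days, is_weekday) if w]
weekday_y = [10.0 for _ in weekday_x]

# Case 2: every day has a row and weekend rows count as 0 (assumed model
# of what the regression sees after timechart).
filled_y = [10.0 if w else 0.0 for w in is_weekday]

weekday_fit = linear_fit(weekday_x, weekday_y)
filled_fit = linear_fit(days, filled_y)

print("weekdays only: slope=%.3f intercept=%.3f" % weekday_fit)
print("with 0 rows:   slope=%.3f intercept=%.3f" % filled_fit)
```

With the weekend rows left out, the fitted line sits on the flat weekday level. With them counted as zeros, the whole line drops below it, which matches the symptom described in the question.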
MuS,
Thanks for that. Perfect!
@woodcock, it's hard to explain without a visual. Here is a PDF of the plots:
https://app.box.com/s/ub1axvu0yz3t6z2wvxl51h6g8zew5lwz
The top one does not have the value:
<option name="charting.chart.nullValueMode">connect</option>
Once this is added, the outcome is like the bottom one.
Thanks for all your help,
Stan
woodcock,
Thanks so much. This computes correctly now. My search string is now:
index=edistats RealName="ShipmentOpen" | `isWeekDay(_time)` | where weekDay < 6 AND weekDay > 0 | bucket _time span=1d | stats avg(Elapsed) as average by _time | `lineartrend(_time,average)` | timechart max(average) as mean sum(newY) as regression
I want to see the linear fit superimposed on the raw data. The raw plot is done as columns. However, now that the blank days are left out, the linear trend is not drawn contiguously, but rather as a set of segments. Is there a charting option where they can be linked as a single line?
Thanks
Hi brutecat,
see the docs http://docs.splunk.com/Documentation/Splunk/6.1.1/AdvancedDev/CustomChartingConfig-chartlegend#linec... and the nullValueMode option.
I do not understand your question but if you need more help, work on clarifying it and post a new question.