We have a set of logs from different hosts that specify a metric. I want to display a line graph over a user-selectable time period that plots the metric as a percentage difference from the 30 day average for each host.
That is, get the 30-day average for each host and then plot ((metric/average)*100)-100. This would mean that 80% of the average plots as -20, 100% as 0, 120% as 20, and so on. Additionally, we would like to highlight where the value falls outside the -50 to 50 range. The next stage would be to alert if the value stays outside -50 to 50 for more than, say, the last 20 minutes.
What we are looking for is to spot where the metric is larger than average for an extended period (flood condition) or lower (drought). The range could change.
I'm new to Splunk so I don't really know exactly where to start.
I can get the graph to work when the average is computed over the same period as the plot; the difficulty is computing the average over a different (longer) time period than the one being plotted. Hosts are selected by a pattern, so the set is dynamic (hosts matching the pattern CVM_AGG). We may extend this to either a second query for BUS_AGG or merge the two _AGG sets. Since we are plotting percentage against average, they will scale the same.
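One common pattern for "average over a longer window than the plot" is to search back 30 days so the baseline can be computed, work out the percentage difference, and then restrict what actually gets charted. A rough sketch, with placeholder names (your_index, metric, the CVM_AGG* host wildcard, and the -24h@h display window all stand in for the real values):

index=your_index host=CVM_AGG* earliest=-30d@d
| rename COMMENT as "search back 30 days so the baseline can be computed"
| eventstats avg(metric) as avg30 by host
| eval pct_diff=((metric/avg30)*100)-100
| rename COMMENT as "only chart the user-selected window, e.g. the last 24 hours"
| where _time >= relative_time(now(), "-24h@h")
| timechart span=15m avg(pct_diff) by host

In a dashboard the hard-coded -24h@h would normally be driven by the time picker token instead, so the displayed window stays user-selectable while the average still covers the full 30 days.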
Adding as comment to attach examples.
[Attached screenshots: Pivot editor, Run report]
sample:
| tstats count where index=_internal source="*metrics.log" by _time span=1h PREFIX("name=")
| rename "name=" as name
| rename COMMENT as "from here, the logic"
| bin _time span=1d
| stats sum(count) as count by _time name
| streamstats global=f window=30 avg(count) as 30days_avg by name
visualization > Line Chart with trellis by name
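To carry that sample through to the percentage-difference requirement, it could be extended roughly like this (a sketch; note that a field name starting with a digit has to be wrapped in single quotes inside eval):

| eval pct_diff=((count/'30days_avg')*100)-100
| eval out_of_range=if(pct_diff < -50 OR pct_diff > 50, 1, 0)

The out_of_range flag can then drive a chart overlay or an alert condition.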
Why don't you try the following:
Well, this chart will (if usable) fit into an existing dashboard, but that bit may be irrelevant.
This is my data model
index = "tivoli_omnibus_prod" sourcetype=objsvr_stats_log source=CVM_AGG earliest=-30d@d | eventstats avg(Status_Inserts) as Inserts by Omnibus | eval percstat=(100*(Status_Inserts/Inserts))-100
I know the "earliest" statement is what fixes the date range.
Then it goes into this report:
| pivot Netcool_Insert_Model RootObject avg(percstat) AS "Compare to 30 day average" SPLITROW _time AS _time PERIOD auto SPLITCOL Omnibus SORT 0 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 0
Is this the right way to go about things? Where does your logic fit?
I don't know. I can't understand the pivot query.
This is what I get when saving the pivot as a report using the interface, so it's generated by Splunk.
If I view it in the Pivot editor I can see two lines and it all looks fine; I can change the time period and the graph follows that selection. If I run it as a report, though, everything is fixed at 30 days regardless of the time picker.
| tstats avg(percstat) as count from datamodel=Netcool_Insert_Model by _time Omnibus
| xyseries _time Omnibus count
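As a side note on what xyseries is doing there: it just reshapes the stats-style rows into one column per Omnibus value. A standalone toy illustration (makeresults data, nothing to do with the real model):

| makeresults count=4
| streamstats count as n
| eval _time=relative_time(now(), "-" . n . "h"), Omnibus=if(n%2=0, "A", "B"), count=n*10
| xyseries _time Omnibus count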
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Tstats
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Pivot
I tried the tstats search in the Search app and get:
Error in 'DataModelCache': Invalid or unaccelerable root object for datamodel
The search job has failed due to an error. You may be able to view the job in the Job Inspector.
Editing the data model to use acceleration gives:
You can only accelerate data models that include at least one event-based dataset or one search-based dataset that does not include streaming commands.
| pivot Netcool_Insert_Model RootObject avg(percstat) AS "Compare to 30 day average" earliest=-4h SPLITROW _time AS _time PERIOD auto SPLITCOL Omnibus SORT 0 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 0
Thanks, but this leads to:
Error in 'PivotProcessor': Error in 'PivotCell': The dataset 'RootObject' has no field 'earliest=-4h'.
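Given that the model can't be accelerated and pivot won't accept earliest as a cell value, one fallback (just a sketch reusing the fields from the base search above; the 1h span is an assumption) is to drop pivot/tstats and chart straight from the raw events:

index="tivoli_omnibus_prod" sourcetype=objsvr_stats_log source=CVM_AGG earliest=-30d@d
| rename COMMENT as "30 day baseline per Omnibus, then percentage difference, as in the base search"
| eventstats avg(Status_Inserts) as Inserts by Omnibus
| eval percstat=(100*(Status_Inserts/Inserts))-100
| timechart span=1h avg(percstat) by Omnibus

The same where-clause trick sketched earlier could then narrow the plotted window without losing the 30-day average.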
I'm going to see what I can play with. I'm just still getting used to the terminology, etc.
I don't even know the data, so it's a dead end.
Well, the data is pretty simple: one file per Netcool instance, and the lines in the log have a timestamp and a number of numeric values, one of which is a count of inserts.
But isn't earliest the part of Splunk that limits how far back the search can go?
This all works nicely in the Pivot editor but not when run as a report, so there must be something in the editor that obeys the time picker which is ignored in the report.
What we are trying to highlight is where the inserts are outside a range around the 30-day average, that is, if inserts fall outside 50-150% of the average. If they are too low for an extended period, there is possibly a data feed issue; if too high, we could be seeing a data flood that, if maintained, could bring the system down.
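For the "outside the range for an extended period" alert, one hedged sketch (building on the percstat field from above, and assuming data points arrive at least every few minutes) is to flag each out-of-range point and only fire when every point in a rolling 20-minute window is out of range:

| eval out_of_range=if(percstat < -50 OR percstat > 50, 1, 0)
| rename COMMENT as "breaches = out-of-range points in the last 20 minutes, points = all points in that window"
| streamstats time_window=20m sum(out_of_range) as breaches count as points by Omnibus
| where points > 0 AND breaches = points

Saved as an alert that runs every few minutes, any result row would indicate a sustained flood or drought; the -50/50 thresholds and the 20m window are just the values from the question and could be adjusted if the range changes.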