Hi everyone! To start off, let me give a bit of context: we’re currently using Splunk to report on a custom in-house application that we’ve developed. As part of this report, we’ve created a plot of the application's response times. We expect the response times to follow something like an exponential distribution.
As a next step, we’d like to start doing some more advanced statistical analysis on the response time data to tell us whether this distribution is changing over time. Specifically, we’d like to be able to calculate the quartiles of the distribution and construct the cumulative distribution function (CDF) each time we run the search. The ideal solution would be to plot the raw data, the CDF, and the quartiles on the same graph; if that’s not possible, then we’d still be happy if we could simply post all this data on a dashboard somewhere. As best I can tell, there’s no easy way to do either of these things in Splunk 4.3.3 (our current version) or Splunk 5.0.1 (which we’ll be upgrading to soon).
I know that the Python SDK makes it possible to write a Python script that runs a Splunk search, grabs the data, and then calculates the desired quantities using the numpy and scipy modules. Once this is done, is there an easy way to pass the results back to Splunk to be displayed on a dashboard? Does Splunk’s bundled Python support numpy and scipy? Or is there an easier way to do this that I’m not thinking of?
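To make the target concrete, here is a minimal sketch of the calculation described above: quartiles and an empirical CDF from a list of response times. It deliberately uses only the standard library, so it does not depend on numpy or scipy being importable. The sample values and the function names `quantile` and `empirical_cdf` are made up for illustration, and the linear interpolation is just one common quantile convention, not necessarily the one Splunk uses.

```python
import math


def quantile(sorted_values, q):
    """Linearly interpolated quantile of already-sorted data, with 0 <= q <= 1."""
    if not sorted_values:
        raise ValueError("no data")
    pos = q * (len(sorted_values) - 1)      # fractional index into the sorted data
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def empirical_cdf(values):
    """Return (sorted_values, cdf) with cdf[i] = (i + 1) / n, a rank-based empirical CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / float(n) for i in range(n)]


# Hypothetical response times in milliseconds, purely for illustration.
response_times = [120, 80, 200, 95, 150]

ordered, cdf = empirical_cdf(response_times)
quartiles = [quantile(ordered, q) for q in (0.25, 0.5, 0.75)]

print("quartiles:", quartiles)
for value, p in zip(ordered, cdf):
    print(value, p)
```

Plotting `cdf` against `ordered` gives the step-shaped CDF, and the three quartiles can be drawn as reference lines on the same axes.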
You could definitely implement this yourself using a custom search command that does exactly what you're thinking of. There's good guidance on how to do this in the docs: http://docs.splunk.com/Documentation/Splunk/5.0/Search/Aboutcustomsearchcommands
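As a rough sketch of what such a command could look like, the script below reads the search results through `splunk.Intersplunk`, adds a `cdf` field to each event, and hands the results back to the search pipeline so they can be charted or shown on a dashboard. The field name `response_time` and the helper `add_cdf` are assumptions for illustration, not anything from your setup, and the script would still need to be registered in `commands.conf` as the docs describe. Treat it as a starting point rather than a finished implementation.

```python
try:
    import splunk.Intersplunk as si  # available when run by Splunk's bundled Python
except ImportError:
    si = None  # lets add_cdf() be tried out outside Splunk


def add_cdf(results, field="response_time"):
    """Attach a rank-based empirical CDF value to every result with a numeric `field`.

    `response_time` is an assumed field name; substitute whatever your events use.
    """
    usable = []
    for row in results:
        try:
            usable.append((float(row[field]), row))
        except (KeyError, TypeError, ValueError):
            continue  # skip events without a parsable value
    usable.sort(key=lambda pair: pair[0])  # order by response time
    n = len(usable)
    out = []
    for rank, (value, row) in enumerate(usable, 1):
        row["cdf"] = "%.4f" % (rank / float(n))  # fraction of events at or below this rank
        out.append(row)
    return out


def main():
    # Read the incoming search results, transform them, and hand them back to Splunk.
    results, dummy_results, settings = si.getOrganizedResults()
    si.outputResults(add_cdf(results))


if __name__ == "__main__" and si is not None:
    main()
```

Because the command emits ordinary results, anything downstream in the search (a chart, a table on a dashboard) can use the new `cdf` field like any other field.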