yuanliu

Builder

07-28-2014
05:49 PM

1 Solution


yuanliu

Builder

08-06-2014
05:18 PM

Following @martin_mueller's R-rated suggestion and help from R-rated app author @rfujara_splunk😉 as well as a frantic search for cheap interpolation, the following is a recipe to analyse event count.

```
| timechart count
| appendpipe [
| stats count
| addinfo
| eval temp=info_min_time."##".info_max_time
| fields temp count
| makemv temp delim="##"
| mvexpand temp
| rename temp as _time
] | timechart max(count) as COUNT
| fillnull
| eventstats count as TOTAL
| r "output=transform(input,FFT=Mod(fft(COUNT)),Freq=((1:TOTAL)-1)/(TOTAL*X_span))"
```

Application notes

- You need to install the **R app**. See @martin_mueller's answer above.
- For event counts, gaps should be interpreted as 0. The largest part of the above search does just that, thanks to @somesoni2's answer to my question.
- The `eventstats` to obtain `TOTAL` is superfluous and a waste of computation. There should be a better way to do this within R.
- The above only outputs the modulus of the transform because counts are all real numbers. You can output the complex numbers by removing `Mod()` from the above. (Interestingly, although Splunk lacks complex-number arithmetic, its stats functions accept complex numbers. Maybe it takes the real part and discards the imaginary part as NaN.)
- `Freq` is a dummy sequence for interpretation, expressed in hertz. You can chart over `Freq`, for example.
- The maximum frequency you can analyse is 0.5/`span`. `span` in both `timechart` calls must be equal.
- Beware of an undesirable side effect of `timechart` used to fill gaps: it forces an extra interval.
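The zero-filling and the `Freq` expression can be checked outside Splunk. The following is a minimal Python sketch, not the search itself: the function names, the event timestamps, and the 10-second `span` are all made up for illustration.

```python
def bucket_counts(event_times, start, end, span):
    """Count events per `span`-second bucket over [start, end); empty buckets stay 0."""
    n = (end - start + span - 1) // span  # ceiling division, integer seconds assumed
    counts = [0] * n
    for t in event_times:
        if start <= t < end:
            counts[(t - start) // span] += 1
    return counts


def freq_axis(total, span):
    """Mirror of Freq=((1:TOTAL)-1)/(TOTAL*X_span): bin k maps to k/(total*span) Hz."""
    return [k / (total * span) for k in range(total)]


def nyquist(span):
    """Highest frequency that can be analysed with one sample every `span` seconds."""
    return 0.5 / span


# Hypothetical event timestamps (seconds) over an 80-second search window.
events = [1, 3, 25, 61, 62, 79]
counts = bucket_counts(events, start=0, end=80, span=10)

freqs = freq_axis(len(counts), 10)
usable = [f for f in freqs if f <= nyquist(10)]
```

Buckets with no events come out as 0 rather than being dropped, which is what the `appendpipe` and `fillnull` steps are for. Only the bins up to 0.5/`span` are worth charting.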

A few F(FT)-words

- As the discrete Fourier transform goes, you only look at half of the output sequence (positive frequencies) when inputs are all real.
- When analyzing (all-positive) event counts, output at frequency 0 is meaningless, as this component contains the strong DC bias.
- `fft()` uses a square (rectangular) sampling window. Spectral leakage could blur your analysis, especially when dealing with black-and-white data such as event counts.
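These three points can be seen on a toy series. The sketch below is Python rather than R, uses a naive DFT instead of `fft()`, and the counts are invented. The Hann window is my own choice of one common taper; nothing in the recipe above applies it.

```python
import cmath
import math


def dft(xs):
    """Naive O(n^2) discrete Fourier transform; fine for a small illustration."""
    n = len(xs)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xs))
        for k in range(n)
    ]


def hann(n):
    """Hann window, one common alternative to the implicit rectangular window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]


# Hypothetical event counts per bucket (made up for illustration).
counts = [3, 0, 4, 0, 5, 0, 3, 1]
spectrum = [abs(c) for c in dft(counts)]

dc = spectrum[0]                         # equals sum(counts): the DC bias
half = spectrum[1:len(counts) // 2 + 1]  # positive frequencies up to Nyquist

# Subtracting the mean removes the DC bias before the window is applied;
# the Hann window tapers both ends toward zero to reduce leakage.
mean = sum(counts) / len(counts)
windowed = [(c - mean) * w for c, w in zip(counts, hann(len(counts)))]
windowed_spectrum = [abs(c) for c in dft(windowed)]
```

For real input the moduli mirror around the middle bin, which is why only `half` needs charting.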

R-rated notes

- Object `input` from Splunk is in the "data frame" class. You need to "transform" it into arrays that most R functions deal with. The `transform()` function in the above has nothing to do with Fourier *transformation*: it only adds the computed `FFT` and `Freq` columns to the data frame. The Fourier transform itself is performed by the `fft()` function.
- In addition to fields you pass to R, `input` also passes certain Splunk internal fields as X-rated objects. In the above, `X_span` is `span` in the last stats function (`timechart`); you also have access to `X_time`, which corresponds to `_time` in Splunk. (This is perhaps not limited to the R app.)

The above doesn’t address how to separate data series into R arrays then output transformed objects. That will be my end goal. But it’s a good start.



yuanliu

Builder

08-13-2014
05:24 PM

Finally figured out how to handle multiple Splunk data series. R also has this concept of "multivalue", hence `mvfft()`, which applies the transform to each column of a matrix.

```
| r "
D=length(input)-1
N=length(input[[1]])
N_span=N*input$X_span
output=data.frame(Freq=((1:N)-1)/(N_span),Mod(mvfft(as.matrix(input[2:D]))))
"
```

Here, `X_span` is from Splunk `_span`. (You can also access Splunk `_time` in `X_time`.) R app adds "X" to input series names. For example, if you do `timechart count as COUNT by host`, it will output `Freq` and `Xhost1`, `Xhost2`, etc.

Filling 0 in timechart is not the best interpolation for FFT. Better use R's own capability.
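For readers without the R app, the per-series idea can be sketched in plain Python: one frequency column plus one modulus column per series. The function names, host names, and counts below are hypothetical, and a naive DFT stands in for `mvfft()`.

```python
import cmath


def dft_mod(xs):
    """Modulus of a naive discrete Fourier transform of one series."""
    n = len(xs)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xs)))
        for k in range(n)
    ]


def per_series_spectrum(table, span):
    """table maps series name -> equally spaced values; returns Freq plus one column per series."""
    n = len(next(iter(table.values())))
    out = {"Freq": [k / (n * span) for k in range(n)]}
    for name, values in table.items():
        out[name] = dft_mod(values)
    return out


# Hypothetical two-host table with made-up counts in 10-second buckets.
table = {"host1": [1, 0, 1, 0], "host2": [2, 2, 2, 2]}
result = per_series_spectrum(table, 10)
```

A constant series such as `host2` puts all of its energy in the frequency-0 bin, while the alternating `host1` series also peaks at the Nyquist bin.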


yuanliu

Builder

08-07-2014
10:16 PM

That's a really interesting bug. It doesn't show in preview mode.


martin_mueller

SplunkTrust

08-07-2014
02:17 PM

`like this`, see those eventstats0, eventstats1, etc. bits near the end.


martin_mueller

SplunkTrust

07-29-2014
12:14 AM


martin_mueller

SplunkTrust

08-06-2014
11:33 AM


yuanliu

Builder

08-06-2014
11:19 AM

Not familiar with cost of streamstats, but excellent work on a straight-Splunk interpolation. You may want to give an answer in http://answers.splunk.com/answers/79513/. I made a nuanced analysis there.

For my use case, I need to make sure missing data are treated as 0. @somesoni2 offered an inexpensive way to do this in http://answers.splunk.com/answer_link/149598/.


martin_mueller

SplunkTrust

07-29-2014
05:28 PM

First line grabs data and builds a `timechart` with data gaps in it.

Second line prepares lots of data to fill in the gaps: previous value, next value, time of previous value, and time of next value.

Last line calculates the naïve linearly interpolated value.

Some results:

```
_time                ev   interpolated_ev
2014-07-30 00:55:00  99
2014-07-30 00:55:10       98.000000
2014-07-30 00:55:20       97.000000
2014-07-30 00:55:30  96
2014-07-30 00:55:40       101.000000
2014-07-30 00:55:50       106.000000
2014-07-30 00:56:00  111


martin_mueller

SplunkTrust

07-29-2014
05:24 PM

Here's a run-anywhere example using `_internal` data coming in every 30s, interpolated to 10s:

```
index=_internal eps="*" group=per_host_thruput | head 10 | timechart fixedrange=f span=10s avg(ev) as ev
| eval value_time = case(isnotnull(ev), _time) | streamstats last(ev) as last_ev last(value_time) as last_time | reverse | streamstats last(ev) as next_ev last(value_time) as next_time | reverse
| eval interpolated_ev = last_ev + ((_time - last_time) / (next_time - last_time)) * (next_ev - last_ev)
```
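The same previous-value and next-value logic can be sketched outside Splunk. This is a rough Python equivalent, not a translation of the search: `interpolate_gaps` is a hypothetical helper, times are seconds from the first bucket, and the values are the ones from the sample results in this thread.

```python
def interpolate_gaps(times, values):
    """Fill None entries by linear interpolation between the nearest known neighbours."""
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    filled = []
    for t, v in zip(times, values):
        if v is not None:
            filled.append(float(v))  # known points are kept as they are
            continue
        prev = [(kt, kv) for kt, kv in known if kt < t]
        nxt = [(kt, kv) for kt, kv in known if kt > t]
        if not prev or not nxt:
            filled.append(None)  # no neighbour on one side: leave the gap
            continue
        (t0, v0), (t1, v1) = prev[-1], nxt[0]
        filled.append(v0 + (t - t0) / (t1 - t0) * (v1 - v0))
    return filled


# One known value every 30 s, interpolated onto a 10 s grid.
times = [0, 10, 20, 30, 40, 50, 60]
values = [99, None, None, 96, None, None, 111]
filled = interpolate_gaps(times, values)
```

Unlike the search, which only populates `interpolated_ev` in the gap rows, this sketch returns a single series with known and interpolated points together.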


martin_mueller

SplunkTrust

07-29-2014
05:22 PM

If you have more data points than you need, you can make them equally spaced using `timechart`.

If you have too few data points, you can do the same and throw some `streamstats` shenanigans in the mix... won't be fast for a large data set though.


yuanliu

Builder

07-29-2014
01:25 PM


yuanliu

Builder

07-29-2014
11:03 AM


martin_mueller

SplunkTrust

07-29-2014
10:56 AM

You could probably buy a dedicated R-manual-Kindle for the price of printing that 😄


yuanliu

Builder

07-29-2014
10:50 AM
