I have data like this:
`a----b----c----d`
`10----12----30----5`
`50----34----46----55`
`22----23----98----56`
`32----78----33----7`
Now my requirement is to get the stdevp of the first value of each row vs all the values on that row.
which means the output should be:
`stda----stdb----stdc----stdd`
`std(10,10)----std(10,12)----std(10,30)----std(10,5)`
`std(50,50)----std(50,34)----std(50,46)----std(50,55)`
`and so on...`
For this, what I have done is:
`base search`
`|table the fields`
`| transpose //to get all the similar values in one column`
`|rename 'row *' to 'r*' // just to avoid the space`
`|eventstats first(r1) as current_val`
`|eval x=mvzip(current_val, r1)| mvexpand x`
`|rex max_match=0 field=x "(?\d+)"`
`|streamstats window=1 stdevp(numbers) as stdDeviation`
`|fields r* current_val stdDeviation`
Here, this does the standard deviation correctly but I cannot move from r1 to r2 to r3. So far, I'm doing it manually. 😞
Foreach would not help since eventstats is present so can I use map to do the looping through the columns? if yes, how? I cannot handle the wildcard (r*)
The foreach
command is probably your best friend here. I put your data above into a makeresults
for you to see if it's the desired output you're looking for. if so, you can use it from the transpose
, but you might need to tweak a few things, perhaps. Also, i'm sure you know that stdevp
is the population standard deviation and stdev
is the sample standard deviation, so just make sure you're calculating the one you want.
http://docs.splunk.com/Documentation/Splunk/7.0.3/SearchReference/Foreach
|makeresults|eval data="a=10,b=12,c=30,d=5 a=50,b=34,c=46,d=55 a=22,b=23,c=98,d=56 a=32,b=78,c=33,d=7"|makemv data|mvexpand data|rename data as _raw|kv|table a b c d|transpose |rename "row *" as r*|streamstats count|foreach r* [eval std_first_<<FIELD>>=if(count=1,'<<FIELD>>',null())]|filldown|foreach r* [|eval temp_<<FIELD>>=mvzip('std_first_<<FIELD>>', '<<FIELD>>')|rex max_match=0 field=temp_<<FIELD>> "(?<numbers_<<FIELD>>>\d+)"]|stats values(r*) as r* values(std_first*) as std_first* stdevp(numbers*) as stdDeviation* by column
The foreach
command is probably your best friend here. I put your data above into a makeresults
for you to see if it's the desired output you're looking for. if so, you can use it from the transpose
, but you might need to tweak a few things, perhaps. Also, i'm sure you know that stdevp
is the population standard deviation and stdev
is the sample standard deviation, so just make sure you're calculating the one you want.
http://docs.splunk.com/Documentation/Splunk/7.0.3/SearchReference/Foreach
|makeresults|eval data="a=10,b=12,c=30,d=5 a=50,b=34,c=46,d=55 a=22,b=23,c=98,d=56 a=32,b=78,c=33,d=7"|makemv data|mvexpand data|rename data as _raw|kv|table a b c d|transpose |rename "row *" as r*|streamstats count|foreach r* [eval std_first_<<FIELD>>=if(count=1,'<<FIELD>>',null())]|filldown|foreach r* [|eval temp_<<FIELD>>=mvzip('std_first_<<FIELD>>', '<<FIELD>>')|rex max_match=0 field=temp_<<FIELD>> "(?<numbers_<<FIELD>>>\d+)"]|stats values(r*) as r* values(std_first*) as std_first* stdevp(numbers*) as stdDeviation* by column
Great. I see you broke it down to four steps. table & transpose | find today's value| combine| stdev
I guess, I messed up during the combining part. Thanks a lot for help in that.
And yes, I need the std dev for the population, not the sample. that's the requirement.