bmorgan

Explorer

06-22-2010
09:26 PM

I need to take already-summarized data in the logs, aggregate it from a large group of servers, and build an si-type index. Looking at si-generated data from sistats fields, I have deduced the following meanings but need further clarification:

psrsvd_ct_FIELDNAME = count

psrsvd_nc_FIELDNAME = Also Count?

psrsvd_sm_FIELDNAME = sum

psrsvd_ss_FIELDNAME = sum of squares

psrsvd_vt_cnt = ?? some kind of variance ??

So is ct = count, what is nc really for, what formula do you use for SS (does it include std-dev or is it a simple sum of squares), and what is vt?

Thanks,

Blaine

steveyz

Splunk Employee

06-23-2010
02:33 AM

nc = numerical count (number of values for this field that can be interpreted as numbers), e.g. computing the average would be sum/nc

ss is just simple sum of squares, i.e. X1*X1 + X2*X2 + X3*X3

vt originally stood for "valuetype", but what we store in it is just the maximum precision of the numerical values of the field. This is so that if you average together a bunch of values like 9.5, 10.5, 11.5, you get 10.5 and not 10.500000 or 11.
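To make the arithmetic concrete, here is a minimal Python sketch of how per-server partials could be combined and turned back into a mean and standard deviation. The dictionary keys and function names are hypothetical stand-ins for the psrsvd_* fields, and the sample (n-1) standard deviation is an assumption; this is not Splunk's actual implementation.

```python
import math

def merge_partials(partials):
    """Add up per-server partial statistics field by field."""
    total = {"ct": 0, "nc": 0, "sm": 0.0, "ss": 0.0}
    for p in partials:
        for key in total:
            total[key] += p[key]
    return total

def mean_and_stdev(total):
    """Recover the mean and sample standard deviation from nc, sm and ss."""
    nc = total["nc"]
    if nc == 0:
        return None, None
    mean = total["sm"] / nc  # average = sum / numerical count
    if nc < 2:
        return mean, None
    # Assumption: sample variance, (ss - sm^2 / nc) / (nc - 1)
    variance = (total["ss"] - total["sm"] ** 2 / nc) / (nc - 1)
    return mean, math.sqrt(max(variance, 0.0))

# Hypothetical partials: one server saw 9.5 and 10.5, the other saw 11.5
server_a = {"ct": 2, "nc": 2, "sm": 20.0, "ss": 9.5 ** 2 + 10.5 ** 2}
server_b = {"ct": 1, "nc": 1, "sm": 11.5, "ss": 11.5 ** 2}

total = merge_partials([server_a, server_b])
mean, stdev = mean_and_stdev(total)
```

Because ct, nc, sm and ss are all plain sums, they can be added across servers in any order before the final statistics are derived.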

joy76

Path Finder

05-06-2013
07:32 PM

What does "psrsvd", as in psrsvd_*, stand for here?

steveyz

Splunk Employee

06-29-2010
10:52 PM

nc is the numerical count, i.e. the number of values that can be interpreted entirely as numbers, whereas ct is the total count.

I.e. if your field had the following values:

0,foo,1,bar

ct=4 and nc=2
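The same distinction can be sketched in Python. The function name is hypothetical, and using float() as the test for "can be interpreted as a number" is an assumption; Splunk's own rule may differ (for instance, float() also accepts strings such as "nan" and "inf").

```python
def count_values(values):
    """Return (ct, nc): total count and count of values that parse as numbers."""
    ct = 0
    nc = 0
    for v in values:
        ct += 1
        try:
            float(v)  # assumption: float() stands in for the numeric check
        except ValueError:
            continue
        nc += 1
    return ct, nc

ct, nc = count_values(["0", "foo", "1", "bar"])
```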

bmorgan

Explorer

06-24-2010
04:35 PM

Is there a difference between ct and nc?