Archive

How to get _time at median value?

Explorer

Hello everyone,

Now, I encountered hard problem that I can't solve for long times. I was also google on many hours but not result. The problem has following:

I wanna get median a value on search. I had that value by:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | stats median(el) as abc by URL

And I received a result following:

URL                      abc
/ShippingOrder/Import   29250

Yes, I got median value = 29250. But I wanna have add more a table that show _time at event happen has abc=29250. I searched following:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | stats median(el) as abc by URL | table URL abc _time

But table _time is blank.

I found that time by:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | table URL _time el

Result:

URL                 el                 _time
/ShippingOrder/Import   29016   2017-09-10 18:08:58
/ShippingOrder/Import   6657    2017-09-10 16:47:58
/ShippingOrder/Import   11656   2017-09-10 16:11:35
/ShippingOrder/Import   23906   2017-09-10 14:46:58
/ShippingOrder/Import   46719   2017-09-10 11:03:56
/ShippingOrder/Import   15016   2017-09-10 16:54:22
/ShippingOrder/Import   29250   2017-09-10 16:46:22
/ShippingOrder/Import   51188   2017-09-10 14:58:22
/ShippingOrder/Import   44000   2017-09-10 14:51:22
/ShippingOrder/Import   12046   2017-09-10 14:42:22
/ShippingOrder/Import   50984   2017-09-10 14:41:22
/ShippingOrder/Import   39735   2017-09-10 14:25:22

And at the time has abc is median value 29250 is 2017-09-10 16:46:22

So, how to I get result following by some search?

URL                     abc               Time
/ShippingOrder/Import   29250   2017-09-10 16:46:22

I appreciate your reply

Tags (2)
1 Solution

Super Champion

try this to see if it works for you:

 source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import"  | eventstats median(el) as abc by URL|eval medTime=if(abc=el,_time,null())|stats values(medTime) as _time values(abc) as abc by URL

View solution in original post

SplunkTrust
SplunkTrust

CORRECTION - In Splunk, the calculation of the aggregate function median() does not match the mathematical definition, so the simpler code provided by @cmerriman is a complete solution. Below is the code for how you would do the real median() if that were correctly calculated.


Mathematically, median is not necessarily an actual number present in the dataset. There could be only one, or there could be more than one, or there could be none. Here's some examples to show why...

 1 2 3 6 9        median is 3, which is present in the data
 1 2 3 3 3 6 9    median is 3, three present in the data
 1 2 2 4 6 9     mathematical median is 3, which is not present in the data.  However, splunk picks 4 instead.

This following code is redundant under current implementation of median() in splunk. Just use @cmerriman's.

That mathematical definition would make it a little funky to get the _time, but it could be done. What we would do, is calculate the DIFFERENCE from the median, and then pass any records that have the lowest difference.

source=check_request app="test1" url="/ShippingOrder/Import" 
| rename url as URL 
| eventstats median(el) as UrlMedianEl by URL
| eval DeltaToMedian= abs(UrlMedianEl - el)
| eventstats min(DeltatoMedian) as minDeltaToMedian by URL
| where DeltaToMedia = minDeltaToMedian 
| table _time URL el DeltaToMedian 

See, wasn't that tricky?


Examples altered - changed 4 and 5 to 6 and 9 respectively to help distinguish median - the "middlest" value - from average.

Also, the whole thing has become moot, because splunk's median() just picks the higher of the two.

Super Champion

just for definition sake, median will grab the middle value of all present variables. not the middle value between min and max.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Aggregatefunctions#median.28X...

   |makeresults|eval data="1,2,3,5,6,7"| makemv data delim=","| mvexpand data | rename data as _raw | kv|stats median(_raw)

using this, you get a median of 5, not 4, because Splunk should pick the higher of the two middle values

technically, if it is an even number, you're supposed to average the two middle numbers.
http://reference.wolfram.com/language/ref/Median.html

0 Karma

SplunkTrust
SplunkTrust

@cmerriman - Thanks! I didn't expect to have to look up how splunk implemented a standard mathematical term. Post has been updated and annotated, and I'm leaving my redundant brilliance in place for posterity, and as a warning to others.

0 Karma

Super Champion

try this to see if it works for you:

 source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import"  | eventstats median(el) as abc by URL|eval medTime=if(abc=el,_time,null())|stats values(medTime) as _time values(abc) as abc by URL

View solution in original post

Explorer

That's great! It worked. Thanks so much cmerriman!

0 Karma

SplunkTrust
SplunkTrust

@luanvn, @cmerriman - you need a little more, because of the definition of median when there are even numbers of events. You are only covering two of the three cases, and the third one should occur in roughly 50% of the searches.

0 Karma

Explorer

@DalJeanis Thank for considering about that.

0 Karma