Splunk Search

stats min/max not returning the correct result?

rododoodles
New Member

Hi! I'm trying to figure out what I'm doing wrong with this stats query:

"Advanced checkpoint" | rex "shardId-0+(?<shardId>\d+) to (?<checkpoint>\d+)" | convert num(shardId) num(checkpoint) | stats min(checkpoint) max(checkpoint) count(checkpoint) by shardId

I get the following results:

shardId min(checkpoint) max(checkpoint) count(checkpoint)
0   49560932439872968484629418486593580327004169175954358272    49560932439872968484629418486593580327004169175954358272    13
1   49560932452026874617828608098730546785597526196713684992    49560932452026874617828608098730546785597526196713684992    13
2   49560932427429152663849330773616649530866383455617286144    49560932427429152663849330773616649530866383455617286144    13
6   49568412890294036113805838273969294693600630993783357440    49568412890294036113805838273969294693600630993783357440    61
7   49568412890316336859004368897110830411873279355289337856    49568412890316336859004368897110830411873279355289337856    60
8   49568412901399807222674088598454082393379515023761604608    49568412901399807222674088598454082393379515023761604608    60
9   49568412901422107967872619221595618111652163385267585024    49568412901422107967872619221595618111652163385267585024    60

The problem is that for some reason I'm not getting the actual min and the max values for each shard— I'm getting the same value for both min and max.

It is not the case that all of the checkpoint values are the same for a given shardId. For instance, the two most recent events for shardId=0 are as follows:

 _time                    shardId checkpoint
3/16/17 9:12:45.085 PM    0       49560932439872968484643752785443264706111991494612090882
3/16/17 8:54:44.860 PM    0       49560932439872968484643752274065225157484546774917971970

Thoughts? Any help is greatly appreciated 🙂


niketn
Legend

@rododoodles... To me this looks like an issue with Splunk processing very long numbers (precision is lost after roughly 16 significant digits or so). One option you have is to keep the value as a string instead of converting it to a number.

To replicate the issue, the following is a run-anywhere example:

| makeresults
| eval num1="49560932439872968484629418486593580327004169175954358272"
| eval num2="49560932439872968484643752274065225157484546774917971970"
| eval test=if(num2>num1,"Greater","Equal?")
| table num1 num2 test

Following is the output; even though the numbers are different, they compare as equal:

49560932439872968484629418486593580327004169175954358272 49560932439872968484643752274065225157484546774917971970 Equal?

However, if working with the number stored as a string within double quotes is acceptable for your scenario, then the following should work:

| eval checkpoint="\"".tostring(checkpoint)."\""
| stats min(checkpoint) max(checkpoint) count(checkpoint) by shardId

Following is a run-anywhere example for testing the fix:

| makeresults
| eval num1="49560932439872968484629418486593580327004169175954358272"
| eval num2="49560932439872968484643752274065225157484546774917971970"
| eval num1="\"".tostring(num1)."\""
| eval num2="\"".tostring(num2)."\""
| eval test=if(num2>num1,"Greater","Equal?")
| table num1 num2 test

Following is the output; with the numbers inside double quotes, the comparison correctly identifies num2 > num1:

"49560932439872968484629418486593580327004169175954358272" "49560932439872968484643752274065225157484546774917971970" Greater
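Worth noting why the string comparison is safe for this particular data set: all the checkpoint values are the same length (56 digits), and for equal-length digit strings lexicographic order matches numeric order. A quick check outside Splunk, in Python (illustrative only):

```python
# Actual checkpoint values for shardId=0 from the question, kept as strings.
lo = "49560932439872968484629418486593580327004169175954358272"
hi = "49560932439872968484643752274065225157484546774917971970"

# For equal-length digit strings, string min/max agrees with numeric min/max.
print(min(lo, hi) == lo)  # True
print(max(lo, hi) == hi)  # True
```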

I do feel this is a bug and should be reported to Splunk.

On a different note, you should also consider saving your rex-based field extractions as knowledge objects, either through the Extract New Fields UI or directly by editing props.conf.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

rododoodles
New Member

Thank you @niketnilay! I suspected something like that was going on. Splunk probably isn't using BigIntegers to do the calculation. I'll report it as a bug.

The issue with using strings is that the results may be incorrect if the numbers are of different lengths (lexicographical ordering != natural number ordering). But I suppose I could left-pad zeroes as a workaround.
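The left-padding workaround can be sketched quickly; a minimal Python illustration of both the caveat and the fix (the width of 56 matches the checkpoint values in this thread):

```python
# Digit strings of different lengths can sort wrongly as text...
print("9" > "10")  # True lexicographically, numerically backwards

# ...but left-padding every value to a fixed width restores numeric order.
width = 56  # length of the checkpoint values above
vals = ["9", "10"]
padded = sorted(v.zfill(width) for v in vals)
print([p.lstrip("0") for p in padded])  # ['9', '10']
```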

Thanks also for the tip on saving my regexes.


DalJeanis
SplunkTrust

Probably not a bug; this is beyond the precision level of CPU floating-point arithmetic, I believe. (A double only has 52 mantissa bits, for about 16 decimal digits of precision, although it can approximately represent numbers as large as 10^308.)

Like you said, best to treat as string.
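That precision limit is easy to confirm outside Splunk; a minimal Python check using the two checkpoint values from the question (illustrative only):

```python
# Two checkpoint values for shardId=0 from the question, as strings.
a = "49560932439872968484629418486593580327004169175954358272"
b = "49560932439872968484643752274065225157484546774917971970"

# As arbitrary-precision integers the values clearly differ...
print(int(a) < int(b))       # True

# ...but converted to 64-bit doubles (~15-17 significant decimal digits)
# they collapse to the same value, which is why min and max come back equal.
print(float(a) == float(b))  # True
```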


niketn
Legend

@rododoodles... Sorry, I got too fixated on the two numbers not matching. Yes, you should try left-padding with zeroes. Let us know if this works for you.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

cmerriman
Super Champion

have you tried |eval shardId=tonumber(shardId)|eval checkpoint=tonumber(checkpoint) instead of convert to see if that works?
