Solved: Re: Search certain ratio of minimum data

KongJian · ‎06-16-2021

the Scenario like this:

I want to pick up 5% minimum value from thousands of data,

Example:

1,2,3 ,4 5,6,7,8,9,10 I want pickup minimum 30%, i.e (1,2,3) will be listed.

can any support for the SPL

bowesmana · ‎06-17-2021

@KongJian

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

View solution in original post

bowesmana · ‎06-16-2021

Can you provide an example of your data

KongJian · ‎06-17-2021

Example data：the value like following csv

11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20

we wanna to pick the 30% portion of min value from total

1. sort the data in inverted order

2. take out 30% min value of total

3. result should be 1,2,3,4,5,6

Hope you can understand the explanation

bowesmana · ‎06-17-2021

@KongJian

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

Search certain ratio of minimum data

stats

Splunk Mobile: Your Brand-New Home Screen

Introducing Value Insights (Beta): Understand the Business Impact your organization ...

Enterprise Security (ES) Essentials 8.3 is Now GA — Smarter Detections, Faster ...

Are you a member of the Splunk Community?

Search certain ratio of minimum data

stats

Splunk Mobile: Your Brand-New Home Screen

Introducing Value Insights (Beta): Understand the Business Impact your organization ...

Enterprise Security (ES) Essentials 8.3 is Now GA — Smarter Detections, Faster ...