Solved: Search certain ratio of minimum data

KongJian · ‎06-16-2021

The scenario is like this:

I want to pick the lowest 5% of values from thousands of data points.

Example:

Given 1,2,3,4,5,6,7,8,9,10, I want to pick the minimum 30%, i.e. (1,2,3) should be listed.

Is there any way to do this in SPL?

bowesmana · ‎06-16-2021

Can you provide an example of your data?

KongJian · ‎06-17-2021

Example data: the values look like the following CSV

11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20

We want to pick the 30% of the values that are the smallest:

1. Sort the data in inverted order

2. Take out the smallest 30% of the total

3. The result should be 1,2,3,4,5,6

I hope the explanation is understandable.

bowesmana · ‎06-17-2021

@KongJian

I think I understand: you want to get 30% of the events, and that 30% should be the events with the lowest values.

Here are some examples using your data and random data to show how you can use eventstats and streamstats to generate the total event count and a running count, which you can then test against each other.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

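This example sets up your data and then gives you the results. In your own search you would use the part from the sort command onwards, since the running count only means something once the events are in ascending order of x.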
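
As a rough sketch of how that could look against real events, using the 5% from your original question. The index, sourcetype and field names (your_index, your_sourcetype, your_field) are placeholders I made up, so substitute your own:

index=your_index sourcetype=your_sourcetype
| sort 0 your_field
| eventstats count as events
| streamstats count
| where count/events<=0.05

I have used sort 0 here because sort returns at most 10,000 results by default, which would silently cut off larger data sets before the count is taken.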

Here's another example, where x doesn't start at 1:

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%
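
The query for that example seems to be missing here. A sketch of what it would presumably look like, reusing the random data generation from the range example further down and the count filter from the examples above (so treat it as a reconstruction, not the original query):

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort 0 x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

With 1000 generated events this should leave the 300 events with the smallest values of x.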

And for completeness, here is another example where the 30% instead means that the value being tested falls within the bottom 30% of the range between the smallest and largest values.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)
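
The two readings of 30% can give quite different results when the data is skewed. A small sketch to illustrate, using four made-up values (1,2,3,100) that are not from your data, and tonumber to make sure x is treated as a number after the split:

| makeresults
| fields - _time
| eval x=split("1,2,3,100", ",")
| mvexpand x
| eval x=tonumber(x)
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Here the range is 99, so anything less than 29.7 above the smallest value passes, which should keep 1, 2 and 3. The count based test from the earlier examples (count<=(events*.3), with 4 events) would keep only the single smallest value, 1.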

Hope this helps
