Splunk Search

Search certain ratio of minimum data

KongJian
Engager

the Scenario like this:

 I want to pick up 5% minimum  value from thousands of data,

 Example:

1,2,3 ,4 5,6,7,8,9,10   I want pickup minimum 30%, i.e (1,2,3) will be listed.

 can any support for the SPL

 

Labels (1)
0 Karma
1 Solution

bowesmana
Champion

@KongJian 

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

 

View solution in original post

0 Karma

bowesmana
Champion

Can you provide an example of your data

0 Karma

KongJian
Engager

Example data:the value like following csv

11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20

 we wanna to pick the 30% portion of min value from total

1. sort the data in inverted order 

2. take out 30% min value of total

3. result should be 1,2,3,4,5,6 

Hope you can understand the explanation

0 Karma

bowesmana
Champion

@KongJian 

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

 

View solution in original post

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.