Splunk Search

Search certain ratio of minimum data

KongJian
Engager

the Scenario like this:

 I want to pick up 5% minimum  value from thousands of data,

 Example:

1,2,3 ,4 5,6,7,8,9,10   I want pickup minimum 30%, i.e (1,2,3) will be listed.

 can any support for the SPL

 

Labels (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

@KongJian 

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

 

View solution in original post

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Can you provide an example of your data

0 Karma

KongJian
Engager

Example data:the value like following csv

11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20

 we wanna to pick the 30% portion of min value from total

1. sort the data in inverted order 

2. take out 30% min value of total

3. result should be 1,2,3,4,5,6 

Hope you can understand the explanation

0 Karma

bowesmana
SplunkTrust
SplunkTrust

@KongJian 

I think I understand that you want to get 30% of events and that 30% should represent the lowest values.

Here are some examples using your data and random data to show how you can use eventstats to generate the data you need to test.

| makeresults
| fields - _time
| eval x=split("11,2,14,4,5,6,7,8,18,10,1,12,13,3,15,16,17,9,19,20", ",")
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count/events<=0.3

This example sets up your data and then gives you the results - you would use from the eventstats command onwards.

Here's another example where x doesn't start at 1

| makeresults
| fields - _time
| eval x=mvrange(41,444)
| mvexpand x
| sort x
| eventstats count as events
| streamstats count
| where count<=(events*.3)

Here's an example where the value is generated as a random number from 0-19999 and then it shows the smallest 30%

And for completeness, here is another example where the 30% refers to the value being tested as being within the bottom 30% of the range between smallest and largest value.

| makeresults count=1000
| fields - _time
| eval x=random() % 20000
| sort x
| eventstats max(x) as largest min(x) as smallest
| eval range=largest-smallest
| where x-smallest<(range*.3)

Hope this helps

 

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...