Sample command limitations

cosminstefanmar
Explorer

I noticed that the sample command in Splunk limits which parameters can be used at the same time:
https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Customsearchcommands#sample

I am interested in replicating the functionality of numpy.random.choice in Python. Here is an example of its output:

>>> import numpy as np
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], dtype='<U11')

So basically I would like to sample using both "proportional" and "count" at the same time. Has anyone come across this issue before, and how did you work around it in SPL? Thank you.

DavidHourani
Super Champion

Hi @cosminstefanmarin,

In your case, if you want to use both proportional and count, you can chain the two commands, starting with proportional, which fits what you are trying to achieve:

... | sample proportional="some_field" | sample count=20

Since count is random and proportional isn't, starting with proportional then adding count should do the trick.

Let me know what you think.

Cheers,
David

cosminstefanmar
Explorer

I tried that already, and it doesn't provide the expected output.
Here's an example:
| sample proportional="some_field" returns a random number of events, say 5,
which means the subsequent | sample count=20 can't pull 20 events. It is limited to only 5!
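
The effect described above can be reproduced outside Splunk. The following is only a rough sketch under stated assumptions: it assumes that proportional keeps each event independently with the probability stored in the field, and that count returns at most the requested number of events. The function names sample_proportional and sample_count are hypothetical stand-ins, not the actual implementation of the command.

```python
import random


def sample_proportional(events, field, rng):
    """Keep each event independently with probability events[field] (assumed behaviour)."""
    return [e for e in events if rng.random() < e[field]]


def sample_count(events, count, rng):
    """Return at most `count` events, drawn uniformly without replacement (assumed behaviour)."""
    return rng.sample(events, min(count, len(events)))


rng = random.Random(0)  # fixed seed so the run is repeatable

# 100 hypothetical events, each with a 5% chance of being kept.
events = [{"id": i, "some_field": 0.05} for i in range(100)]

kept = sample_proportional(events, "some_field", rng)
final = sample_count(kept, 20, rng)

# The second stage can never return more than the first stage let through.
print(len(kept), len(final))
```

With these numbers the first stage keeps about 5 events on average, so asking for 20 afterwards cannot be satisfied, which matches the behaviour reported in this post.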

DavidHourani
Super Champion

Yeah you're right, and if you do it the other way around then it doesn't make sense at all...

The only way it would work is if your count is smaller than the total number returned by proportional. But that makes sense, doesn't it? If you get 5 that match with proportional, then that's all you were going to get, even if you had a count of 20 mixed with it.

Unless what you're trying to do is force proportional to give more results than it ought to, in which case I'm not sure what the point of proportional would be in the first place. Do you see my point?

cosminstefanmar
Explorer

The reason for using proportional is to give different probabilities to certain items, based on a baseline built over a longer period of time. At the same time, I need count in order to sample different sizes for each value of the "by" field.
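
To make the requirement concrete, here is a minimal Python sketch of what is being asked for: a weighted draw of a fixed size within each group. It uses random.choices from the standard library, which samples with replacement as numpy.random.choice does by default. The field names (host, item, baseline), the counts mapping, and the function sample_by_group are all made up for illustration.

```python
import random
from collections import defaultdict


def sample_by_group(events, by, weight, counts, rng=None):
    """Draw counts[group] events per group, weighted by the `weight` field.

    Sampling is with replacement; groups missing from `counts` yield no events.
    """
    rng = rng or random.Random()
    groups = defaultdict(list)
    for event in events:
        groups[event[by]].append(event)

    result = {}
    for key, members in groups.items():
        k = counts.get(key, 0)
        weights = [m[weight] for m in members]
        result[key] = rng.choices(members, weights=weights, k=k) if k else []
    return result


# Hypothetical events: `baseline` holds how often each item was seen historically.
events = [
    {"host": "web", "item": "pooh", "baseline": 50},
    {"host": "web", "item": "rabbit", "baseline": 10},
    {"host": "web", "item": "piglet", "baseline": 10},
    {"host": "db", "item": "Christopher", "baseline": 30},
    {"host": "db", "item": "pooh", "baseline": 5},
]
counts = {"web": 3, "db": 2}  # a different sample size per "by" value

result = sample_by_group(events, by="host", weight="baseline", counts=counts)
for host, picked in result.items():
    print(host, [e["item"] for e in picked])
```

The weights do not need to sum to 1 here, because random.choices normalizes them internally.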

VatsalJagani
SplunkTrust

Hello @cosminstefanmarin,

I'm not entirely sure about this, but with MLApp you can try the following:

| sample count=<value of count> proportional=<name of numeric field>

As you can see, proportional needs the name of a field that specifies the probability of each event. This should give you a random selection of count events, with each event's probability of being selected taken from that field. The Splunk events play the role of the Python array.

Hope this helps!!!

cosminstefanmar
Explorer

I am afraid the command itself does not allow count and proportional to be used at the same time, as I mentioned in the question. In my opinion this is a weakness of the command, and Splunk should address it as a feature enhancement.

VatsalJagani
SplunkTrust

I don't know when Splunk will implement this, but until then you can create your own custom command in Python and use the Python function you mentioned in the question. (You can put Python libraries in the bin directory of your app.)
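
As a starting point, the sampling core of such a command could look roughly like the sketch below. It is not wired into Splunk's custom search command interface, and it swaps numpy.random.choice for random.choices from the standard library so that nothing extra has to be bundled with the app. The function name weighted_choice and its rng parameter are hypothetical.

```python
import math
import random


def weighted_choice(values, size, p, rng=None):
    """Rough stand-in for np.random.choice(values, size, p=p), sampling with replacement."""
    if len(values) != len(p):
        raise ValueError("values and p must have the same length")
    if any(w < 0 for w in p):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(p), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    rng = rng or random.Random()
    return rng.choices(values, weights=p, k=size)


aa_milne_arr = ["pooh", "rabbit", "piglet", "Christopher"]

# Same call shape as the NumPy example in the question; the output is random.
picks = weighted_choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
print(picks)
```

Inside a real command, values would be the incoming events and p would come from the field that holds each event's probability.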

cosminstefanmar
Explorer

I thought about that as well and will explore it in more detail. Another alternative would be to modify sample.py directly and add the missing functionality to the Splunk command itself. This could be a direct contribution to the community.

VatsalJagani
SplunkTrust

Yeah, I like your idea, that's great. You can add more arguments to the sample.py file and change the command logic accordingly.
