<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Sample command limitations in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428869#M170650</link>
    <description>&lt;P&gt;I noticed that the &lt;STRONG&gt;sample&lt;/STRONG&gt; command in Splunk limits which parameters can be used at the same time:&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Customsearchcommands#sample" target="_blank"&gt;https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Customsearchcommands#sample&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;I am interested in replicating the functionality of &lt;EM&gt;numpy.random.choice&lt;/EM&gt; in Python. Here is an example of its output:&lt;/P&gt;

&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;aa_milne_arr&amp;nbsp;=&amp;nbsp;['pooh',&amp;nbsp;'rabbit',&amp;nbsp;'piglet',&amp;nbsp;'Christopher']&lt;BR /&gt;
&amp;gt;&amp;gt;&amp;gt;np.random.choice(aa_milne_arr,&amp;nbsp;5,&amp;nbsp;p=[0.5,&amp;nbsp;0.1,&amp;nbsp;0.1,&amp;nbsp;0.3])&lt;/P&gt;

&lt;P&gt;array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'])&lt;/P&gt;

&lt;P&gt;So basically I would like to sample using both "&lt;EM&gt;proportional&lt;/EM&gt;" and "&lt;EM&gt;count&lt;/EM&gt;" at the same time. Has anyone come across this issue before, and how did you work around it in SPL? Thank you.&lt;/P&gt;</description>
    <pubDate>Wed, 30 Sep 2020 00:57:57 GMT</pubDate>
    <dc:creator>cosminstefanmar</dc:creator>
    <dc:date>2020-09-30T00:57:57Z</dc:date>
    <item>
      <title>Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428869#M170650</link>
      <description>&lt;P&gt;I noticed that the &lt;STRONG&gt;sample&lt;/STRONG&gt; command in Splunk limits which parameters can be used at the same time:&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Customsearchcommands#sample" target="_blank"&gt;https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Customsearchcommands#sample&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;I am interested in replicating the functionality of &lt;EM&gt;numpy.random.choice&lt;/EM&gt; in Python. Here is an example of its output:&lt;/P&gt;

&lt;P&gt;&amp;gt;&amp;gt;&amp;gt;aa_milne_arr&amp;nbsp;=&amp;nbsp;['pooh',&amp;nbsp;'rabbit',&amp;nbsp;'piglet',&amp;nbsp;'Christopher']&lt;BR /&gt;
&amp;gt;&amp;gt;&amp;gt;np.random.choice(aa_milne_arr,&amp;nbsp;5,&amp;nbsp;p=[0.5,&amp;nbsp;0.1,&amp;nbsp;0.1,&amp;nbsp;0.3])&lt;/P&gt;

&lt;P&gt;array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'])&lt;/P&gt;

&lt;P&gt;So basically I would like to sample using both "&lt;EM&gt;proportional&lt;/EM&gt;" and "&lt;EM&gt;count&lt;/EM&gt;" at the same time. Has anyone come across this issue before, and how did you work around it in SPL? Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 00:57:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428869#M170650</guid>
      <dc:creator>cosminstefanmar</dc:creator>
      <dc:date>2020-09-30T00:57:57Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428870#M170651</link>
      <description>&lt;P&gt;Hello @cosminstefanmarin,&lt;/P&gt;

&lt;P&gt;I'm not entirely sure about this, but with the MLApp you can try the following:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| sample count=&amp;lt;value of count&amp;gt; proportional=&amp;lt;name of numeric field&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;But as you can see, for proportional you need to give a field name that specifies the probability of each event. This gives you a random sample of count events, and the probability of each event being selected is taken from the given field. The Python array corresponds to the Splunk events.&lt;/P&gt;
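
&lt;P&gt;To make the comparison with the Python array concrete, here is a minimal sketch of what combining the two options would mean: draw exactly &lt;CODE&gt;count&lt;/CODE&gt; events, each weighted by a numeric field. It uses only the standard library's &lt;CODE&gt;random.choices&lt;/CODE&gt;, which samples with replacement like the default of &lt;EM&gt;numpy.random.choice&lt;/EM&gt;. The event list, the &lt;CODE&gt;weight&lt;/CODE&gt; field and the &lt;CODE&gt;weighted_sample&lt;/CODE&gt; helper are made up for illustration and are not part of the MLApp.&lt;/P&gt;

```python
import random

# Hypothetical events; "weight" stands in for the numeric field
# that would be passed to proportional.
events = [
    {"name": "pooh", "weight": 0.5},
    {"name": "rabbit", "weight": 0.1},
    {"name": "piglet", "weight": 0.1},
    {"name": "Christopher", "weight": 0.3},
]


def weighted_sample(rows, count, field, seed=None):
    """Draw exactly `count` rows, with replacement, weighted by row[field]."""
    rng = random.Random(seed)
    weights = [row[field] for row in rows]
    return rng.choices(rows, weights=weights, k=count)


picked = weighted_sample(events, 5, "weight", seed=42)
print([row["name"] for row in picked])
```

&lt;P&gt;The result always has exactly &lt;CODE&gt;count&lt;/CODE&gt; entries, and entries with a larger weight tend to appear more often.&lt;/P&gt;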

&lt;P&gt;Hope this helps!!!&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 14:29:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428870#M170651</guid>
      <dc:creator>VatsalJagani</dc:creator>
      <dc:date>2019-06-19T14:29:23Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428871#M170652</link>
      <description>&lt;P&gt;I am afraid that using &lt;EM&gt;count&lt;/EM&gt; and &lt;EM&gt;proportional&lt;/EM&gt; at the same time is not allowed by the command itself. I already mentioned this in the description. In my opinion this is a weakness of the command, and Splunk should address it as a feature enhancement.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 14:49:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428871#M170652</guid>
      <dc:creator>cosminstefanmar</dc:creator>
      <dc:date>2019-06-19T14:49:49Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428872#M170653</link>
      <description>&lt;P&gt;Hi @cosminstefanmarin,&lt;/P&gt;

&lt;P&gt;In your case, if you want to use both &lt;CODE&gt;proportional&lt;/CODE&gt; and &lt;CODE&gt;count&lt;/CODE&gt;, you can chain the two commands, starting with proportional, which fits what you are trying to achieve.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | sample proportional="some_field" | sample count=20
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Since count is random and proportional isn't, starting with proportional then adding count should do the trick.&lt;/P&gt;

&lt;P&gt;Let me know what you think.&lt;/P&gt;

&lt;P&gt;Cheers,&lt;BR /&gt;
David&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 15:58:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428872#M170653</guid>
      <dc:creator>DavidHourani</dc:creator>
      <dc:date>2019-06-19T15:58:56Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428873#M170654</link>
      <description>&lt;P&gt;I tried that already; it doesn't produce the expected output.&lt;BR /&gt;
I'll give you an example:&lt;BR /&gt;
&lt;CODE&gt;| sample proportional="some_field"&lt;/CODE&gt; generates a random number of events, say 5,&lt;BR /&gt;
which means the &lt;CODE&gt;| sample count=20&lt;/CODE&gt; that immediately follows cannot pull 20 events. It will be limited to only 5!&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 16:11:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428873#M170654</guid>
      <dc:creator>cosminstefanmar</dc:creator>
      <dc:date>2019-06-19T16:11:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428874#M170655</link>
      <description>&lt;P&gt;Yeah, you're right, and if you do it the other way around it doesn't make sense at all...&lt;/P&gt;

&lt;P&gt;The only way it would work is if your count is smaller than the total number returned by proportional. But that makes sense, doesn't it? If you get 5 that match with proportional, then that's all you were going to get, even with a count of 20 mixed in.&lt;/P&gt;
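
&lt;P&gt;A rough simulation of the chained search shows the effect. This is only a sketch under assumptions: it treats &lt;CODE&gt;proportional&lt;/CODE&gt; as keeping each event independently with the probability stored in the field, and &lt;CODE&gt;count&lt;/CODE&gt; as returning at most that many events. The helper names and the field &lt;CODE&gt;p&lt;/CODE&gt; are hypothetical and are not taken from the MLApp code.&lt;/P&gt;

```python
import random


def sample_proportional(rows, field, rng):
    """Keep each row independently with probability row[field].

    Assumed semantics; row[field] is expected to lie between 0 and 1.
    """
    kept = []
    for row in rows:
        p = row[field]
        # Weighted coin flip: True with probability p, False otherwise.
        if rng.choices([True, False], weights=[p, 1 - p])[0]:
            kept.append(row)
    return kept


def sample_count(rows, count, rng):
    """Return `count` random rows, or every row if fewer are available."""
    return rng.sample(rows, min(count, len(rows)))


rng = random.Random(7)
# Hypothetical input: 100 events, each with a 5% chance of being kept.
events = [{"id": i, "p": 0.05} for i in range(100)]

kept = sample_proportional(events, "p", rng)   # stage 1: proportional
final = sample_count(kept, 20, rng)            # stage 2: count=20
print(len(kept), len(final))
```

&lt;P&gt;With a small probability per event, the first stage usually keeps far fewer than 20 events, so the second stage has nothing more to draw from.&lt;/P&gt;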

&lt;P&gt;Unless what you're trying to do is force proportional to give more results than it ought to, in which case I'm not sure what the point of proportional would be in the first place. Do you see my point?&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 16:17:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428874#M170655</guid>
      <dc:creator>DavidHourani</dc:creator>
      <dc:date>2019-06-19T16:17:18Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428875#M170656</link>
      <description>&lt;P&gt;I don't know when Splunk will implement this, but until then you can create your own custom command in Python and use the Python function you mentioned in the question. (You can put Python libraries in the bin directory of your app.)&lt;/P&gt;</description>
      <pubDate>Wed, 19 Jun 2019 16:25:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428875#M170656</guid>
      <dc:creator>VatsalJagani</dc:creator>
      <dc:date>2019-06-19T16:25:24Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428876#M170657</link>
      <description>&lt;P&gt;I thought about that as well and will explore it in more detail. Another alternative would be to modify &lt;STRONG&gt;sample.py&lt;/STRONG&gt; directly and add the missing functionality to the Splunk command itself. That could be a direct contribution to the community.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jun 2019 09:41:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428876#M170657</guid>
      <dc:creator>cosminstefanmar</dc:creator>
      <dc:date>2019-06-20T09:41:55Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428877#M170658</link>
      <description>&lt;P&gt;Yeah, I like your idea, that's great. You can add more arguments to the sample.py file and change the command logic accordingly.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jun 2019 16:01:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428877#M170658</guid>
      <dc:creator>VatsalJagani</dc:creator>
      <dc:date>2019-06-20T16:01:38Z</dc:date>
    </item>
    <item>
      <title>Re: Sample command limitations</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428878#M170659</link>
      <description>&lt;P&gt;The reason for using proportional is to be able to give different probabilities to certain items, based on a baseline built over a longer period of time. At the same time, I need count in order to sample different sizes based on the "by" field clause.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jun 2019 13:43:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Sample-command-limitations/m-p/428878#M170659</guid>
      <dc:creator>cosminstefanmar</dc:creator>
      <dc:date>2019-06-25T13:43:25Z</dc:date>
    </item>
  </channel>
</rss>

