<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Machine Learning tool kit v3.4 model not returning result in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453680#M55784</link>
    <description>&lt;P&gt;I just upgraded the MLTK from v2.2 to v3.4, along with the latest Python SA. Since the upgrade, my Random Forest model returns an empty result for some rows. (I apply the model to a few thousand rows each time.) At first I thought it was an input data problem. But when I took a row that had previously returned an empty result and ran it individually (i.e. with | head 1), the model returned a result. Then I thought the cause might be that the model was built in v2.2, so I rebuilt it (fit it again) in v3.4. It still returned empty results for some rows, but for a different subset of rows this time.&lt;/P&gt;

&lt;P&gt;Has anyone seen the same issue? Should I revert to the old version?&lt;/P&gt;

&lt;P&gt;I don't see anything helpful in search.log, but I always see this:&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr:   File "/opt/splunk/etc/apps/Splunk_ML_Toolkit/bin/util/search_util.py", line 114, in add_distributed_search_info&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr:     raise RuntimeError('Failed to load model "%s": ' % (process_options['model_name']))&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr: KeyError: 'model_name'&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - Error in 'apply' command: (KeyError) 'model_name'&lt;/P&gt;

&lt;P&gt;Is it trying to distribute the apply command to the indexers? Can I run it locally on the search head, since all my input data (CSV and KV store) is on the search head?&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 21:13:13 GMT</pubDate>
    <dc:creator>teresachila</dc:creator>
    <dc:date>2020-09-29T21:13:13Z</dc:date>
    <item>
      <title>Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453680#M55784</link>
      <description>&lt;P&gt;I just upgraded the MLTK from v2.2 to v3.4, along with the latest Python SA. Since the upgrade, my Random Forest model returns an empty result for some rows. (I apply the model to a few thousand rows each time.) At first I thought it was an input data problem. But when I took a row that had previously returned an empty result and ran it individually (i.e. with | head 1), the model returned a result. Then I thought the cause might be that the model was built in v2.2, so I rebuilt it (fit it again) in v3.4. It still returned empty results for some rows, but for a different subset of rows this time.&lt;/P&gt;

&lt;P&gt;Has anyone seen the same issue? Should I revert to the old version?&lt;/P&gt;

&lt;P&gt;I don't see anything helpful in search.log, but I always see this:&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr:   File "/opt/splunk/etc/apps/Splunk_ML_Toolkit/bin/util/search_util.py", line 114, in add_distributed_search_info&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr:     raise RuntimeError('Failed to load model "%s": ' % (process_options['model_name']))&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - stderr: KeyError: 'model_name'&lt;BR /&gt;
09-10-2018 22:08:54.625 ERROR ChunkedExternProcessor - Error in 'apply' command: (KeyError) 'model_name'&lt;/P&gt;

&lt;P&gt;Is it trying to distribute the apply command to the indexers? Can I run it locally on the search head, since all my input data (CSV and KV store) is on the search head?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 21:13:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453680#M55784</guid>
      <dc:creator>teresachila</dc:creator>
      <dc:date>2020-09-29T21:13:13Z</dc:date>
    </item>
    <item>
      <title>Re: Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453681#M55785</link>
      <description>&lt;P&gt;Is it a distributed or search head cluster setup? Are you using streaming apply on all the indexers? If so, did you upgrade PSC on all the indexers? You need to recreate your model after upgrading the PSC version.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Sep 2018 17:15:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453681#M55785</guid>
      <dc:creator>grana_splunk</dc:creator>
      <dc:date>2018-09-11T17:15:40Z</dc:date>
    </item>
    <item>
      <title>Re: Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453682#M55786</link>
      <description>&lt;P&gt;It is set up for distributed search to multiple indexers. Not a search head cluster. How do I know if I'm using streaming apply? I only upgraded PSC on the search head, not the indexers.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Sep 2018 21:55:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453682#M55786</guid>
      <dc:creator>teresachila</dc:creator>
      <dc:date>2018-09-11T21:55:06Z</dc:date>
    </item>
    <item>
      <title>Re: Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453683#M55787</link>
      <description>&lt;P&gt;Open the mlspl.conf file at $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/default/mlspl.conf or $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/mlspl.conf and check whether streaming apply has been set to true.&lt;/P&gt;
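
&lt;P&gt;As a rough sketch of what to look for, the relevant entry might resemble the fragment below. The setting name streaming_apply and the [default] stanza are assumptions about the mlspl.conf layout, not something quoted from this install, so confirm them against the mlspl.conf.spec shipped with your MLTK version. A value in local/mlspl.conf takes precedence over the one in default/mlspl.conf.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# mlspl.conf (assumed stanza and setting name; check mlspl.conf.spec)
[default]
# false: apply is expected to run on the search head only
# true: apply may be streamed to the indexers, which then also need PSC
streaming_apply = false
&lt;/CODE&gt;&lt;/PRE&gt;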

&lt;P&gt;Also, if you have upgraded the setup and streaming apply is true, please upgrade PSC on all your indexers.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 21:14:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453683#M55787</guid>
      <dc:creator>grana_splunk</dc:creator>
      <dc:date>2020-09-29T21:14:07Z</dc:date>
    </item>
    <item>
      <title>Re: Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453684#M55788</link>
      <description>&lt;P&gt;I think I found the issue. For some reason, the new version does not like null, or anything close to null, being passed to the model. It does not like empty strings (i.e. "", or len=0), and it does not like the string values "NA", "N/A", or "null" either. (The "NA" was returned by an external API.)&lt;/P&gt;

&lt;P&gt;So far I have observed three different symptoms: 1) the model returns an empty prediction value, with no other messages in the log; 2) the model fails with an error message about null values being passed; 3) the model returns a prediction, but with a warning message in search.log about a null value in the model. Which symptom appears depends on how many rows are being processed. If I apply the model to 1 row, it usually returns a prediction value. If I apply it to thousands of rows, it usually returns an empty value.&lt;/P&gt;

&lt;P&gt;To remediate, I added this code:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| fillnull value="NoValue"
| foreach prefix_*  [eval &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;=if(len(&amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;)=0 OR &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;="N/A" OR &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;="NA" OR &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;="null","NoValue",&amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;)]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 18 Sep 2018 13:27:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453684#M55788</guid>
      <dc:creator>teresachila</dc:creator>
      <dc:date>2018-09-18T13:27:53Z</dc:date>
    </item>
    <item>
      <title>Re: Machine Learning tool kit v3.4 model not returning result</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453685#M55789</link>
      <description>&lt;P&gt;Thanks! stream_apply is set to false.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 13:56:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Machine-Learning-tool-kit-v3-4-model-not-returning-result/m-p/453685#M55789</guid>
      <dc:creator>teresachila</dc:creator>
      <dc:date>2018-09-18T13:56:56Z</dc:date>
    </item>
  </channel>
</rss>

