<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391252#M113930</link>
    <description>&lt;P&gt;I like and need &lt;CODE&gt;mvexpand&lt;/CODE&gt; to work with some of my data.  &lt;/P&gt;

&lt;P&gt;Sometimes, our input events contain information about multiple underlying events (esp. rich JSON data sources).  I understand that &lt;CODE&gt;mvexpand&lt;/CODE&gt; can, in certain situations, lead to scaling challenges with SPL.  I generally think of these problematic cases as ones where each individual input event expands into lots (hundreds, thousands or more) of new events.  I can imagine this being especially tricky when the arity of the expansion varies greatly from input event to input event.&lt;/P&gt;

&lt;P&gt;I want to believe that cases where &lt;CODE&gt;mvexpand&lt;/CODE&gt; causes the event count to be doubled should be safe.  It seems that these cases could be implemented to be fully streamable (at the indexers) and that the SPL should scale out embarrassingly easily.  Here's an example query:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000 | streamstats count | eval count=1000*round((count-1)/1000-0.5,0)
| eval mcount=mvrange(0,99,10) | mvexpand mcount | fields count mcount | fields - _raw
| eval ucount=mvrange(0,49,10) | mvexpand ucount | fields count mcount ucount | fields - _raw
| eventstats count as total by count | eventstats count as mtotal by mcount | eventstats count as utotal by ucount
| stats count, values(eval(count." (".total.")")) as cvalues,
               values(eval(mcount." (".mtotal.")")) as mvalues,
               values(eval(ucount." (".utotal.")")) as uvalues
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This SPL makes 10,000 events and then &lt;CODE&gt;mvexpand&lt;/CODE&gt;s twice, once by 10x and once by 5x.  The result is 500,000 events as expected.  By tweaking the &lt;CODE&gt;makeresults&lt;/CODE&gt; and &lt;CODE&gt;mvrange&lt;/CODE&gt; commands, we can test different limits of the &lt;CODE&gt;mvexpand&lt;/CODE&gt; command.&lt;/P&gt;

&lt;P&gt;Adjusting the &lt;CODE&gt;ucount&lt;/CODE&gt; to &lt;CODE&gt;mvrange(0,99,10)&lt;/CODE&gt; produces the expected 1,000,000 events.  This, however, is the highest number that works as I expected.  Once the total number of events exceeds 1,000,000 at any &lt;CODE&gt;mvexpand&lt;/CODE&gt;, some (undesirable) caps begin to be applied.&lt;/P&gt;

&lt;P&gt;In my case, I need to use &lt;CODE&gt;mvexpand&lt;/CODE&gt; with a case where the base search itself produces many tens or hundreds of millions of events.  The "expansion factor", if you will, is a small, constant number (&amp;lt;100, likely less than 10 and can be constrained).&lt;/P&gt;

&lt;P&gt;Here is an example where the final expansion merely doubles the event count (in a completely local way) that I believe should work...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000 | streamstats count | eval count=1000*round((count-1)/1000-0.5,0)
| eval mcount=mvrange(0,99,1) | mvexpand mcount | fields count mcount | fields - _raw
| eval ucount=mvrange(0,49,25) | mvexpand ucount | fields count mcount ucount | fields - _raw
| eventstats count as total by count | eventstats count as mtotal by mcount | eventstats count as utotal by ucount
| stats count, values(eval(count." (".total.")")) as cvalues,
               values(eval(mcount." (".mtotal.")")) as mvalues,
               values(eval(ucount." (".utotal.")")) as uvalues
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Instead of roughly 2,000,000 events, I only get 984,200 in my environment.&lt;/P&gt;

&lt;P&gt;I am imagining building my own custom command, but I suspect that others have hit this limit.  It certainly seems that &lt;CODE&gt;mvexpand&lt;/CODE&gt; /could/ be smarter than this.  Any advice?&lt;/P&gt;

&lt;P&gt;(For the record, I have already tried the &lt;CODE&gt;fields - _raw&lt;/CODE&gt; trick shared in other &lt;CODE&gt;mvexpand&lt;/CODE&gt; answers.)&lt;/P&gt;</description>
    <pubDate>Mon, 12 Nov 2018 19:41:44 GMT</pubDate>
    <dc:creator>kulick</dc:creator>
    <dc:date>2018-11-12T19:41:44Z</dc:date>
    <item>
      <title>Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391252#M113930</link>
      <description>&lt;P&gt;I like and need &lt;CODE&gt;mvexpand&lt;/CODE&gt; to work with some of my data.  &lt;/P&gt;

&lt;P&gt;Sometimes, our input events contain information about multiple underlying events (esp. rich JSON data sources).  I understand that &lt;CODE&gt;mvexpand&lt;/CODE&gt; can, in certain situations, lead to scaling challenges with SPL.  I generally think of these problematic cases as ones where each individual input event expands into lots (hundreds, thousands or more) of new events.  I can imagine this being especially tricky when the arity of the expansion varies greatly from input event to input event.&lt;/P&gt;

&lt;P&gt;I want to believe that cases where &lt;CODE&gt;mvexpand&lt;/CODE&gt; causes the event count to be doubled should be safe.  It seems that these cases could be implemented to be fully streamable (at the indexers) and that the SPL should scale out embarrassingly easily.  Here's an example query:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000 | streamstats count | eval count=1000*round((count-1)/1000-0.5,0)
| eval mcount=mvrange(0,99,10) | mvexpand mcount | fields count mcount | fields - _raw
| eval ucount=mvrange(0,49,10) | mvexpand ucount | fields count mcount ucount | fields - _raw
| eventstats count as total by count | eventstats count as mtotal by mcount | eventstats count as utotal by ucount
| stats count, values(eval(count." (".total.")")) as cvalues,
               values(eval(mcount." (".mtotal.")")) as mvalues,
               values(eval(ucount." (".utotal.")")) as uvalues
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This SPL makes 10,000 events and then &lt;CODE&gt;mvexpand&lt;/CODE&gt;s twice, once by 10x and once by 5x.  The result is 500,000 events as expected.  By tweaking the &lt;CODE&gt;makeresults&lt;/CODE&gt; and &lt;CODE&gt;mvrange&lt;/CODE&gt; commands, we can test different limits of the &lt;CODE&gt;mvexpand&lt;/CODE&gt; command.&lt;/P&gt;
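
&lt;P&gt;As a sanity check on that arithmetic, here is a minimal sketch (the field names &lt;CODE&gt;m_values&lt;/CODE&gt;, &lt;CODE&gt;u_values&lt;/CODE&gt; and &lt;CODE&gt;expected_events&lt;/CODE&gt; are made up for illustration) that counts how many values each &lt;CODE&gt;mvrange&lt;/CODE&gt; call produces without expanding anything.  As far as I understand it, &lt;CODE&gt;mvrange&lt;/CODE&gt; excludes its end value, so this should report 10 and 5 values, matching the 10x and 5x factors above:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval mcount=mvrange(0,99,10), ucount=mvrange(0,49,10)
| eval m_values=mvcount(mcount), u_values=mvcount(ucount)
| eval expected_events=10000*m_values*u_values
| table m_values u_values expected_events
&lt;/CODE&gt;&lt;/PRE&gt;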

&lt;P&gt;Adjusting the &lt;CODE&gt;ucount&lt;/CODE&gt; to &lt;CODE&gt;mvrange(0,99,10)&lt;/CODE&gt; produces the expected 1,000,000 events.  This, however, is the highest number that works as I expected.  Once the total number of events exceeds 1,000,000 at any &lt;CODE&gt;mvexpand&lt;/CODE&gt;, some (undesirable) caps begin to be applied.&lt;/P&gt;

&lt;P&gt;In my case, I need to use &lt;CODE&gt;mvexpand&lt;/CODE&gt; with a case where the base search itself produces many tens or hundreds of millions of events.  The "expansion factor", if you will, is a small, constant number (&amp;lt;100, likely less than 10 and can be constrained).&lt;/P&gt;

&lt;P&gt;Here is an example where the final expansion merely doubles the event count (in a completely local way) that I believe should work...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000 | streamstats count | eval count=1000*round((count-1)/1000-0.5,0)
| eval mcount=mvrange(0,99,1) | mvexpand mcount | fields count mcount | fields - _raw
| eval ucount=mvrange(0,49,25) | mvexpand ucount | fields count mcount ucount | fields - _raw
| eventstats count as total by count | eventstats count as mtotal by mcount | eventstats count as utotal by ucount
| stats count, values(eval(count." (".total.")")) as cvalues,
               values(eval(mcount." (".mtotal.")")) as mvalues,
               values(eval(ucount." (".utotal.")")) as uvalues
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Instead of roughly 2,000,000 events, I only get 984,200 in my environment.&lt;/P&gt;
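
&lt;P&gt;One caveat on the arithmetic: since &lt;CODE&gt;mvrange&lt;/CODE&gt; appears to exclude its end value, &lt;CODE&gt;mvrange(0,99,1)&lt;/CODE&gt; yields 99 values (0 through 98), so the untruncated total would be 10,000 * 99 * 2 = 1,980,000 rather than an even 2,000,000.  To see which &lt;CODE&gt;mvexpand&lt;/CODE&gt; is doing the truncating, a rough diagnostic sketch is to count events after the first expansion only (the field name &lt;CODE&gt;after_first_mvexpand&lt;/CODE&gt; is just illustrative) and compare the result with the 990,000 that the arithmetic predicts:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000
| eval mcount=mvrange(0,99,1) | mvexpand mcount | fields mcount | fields - _raw
| stats count as after_first_mvexpand
&lt;/CODE&gt;&lt;/PRE&gt;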

&lt;P&gt;I am imagining building my own custom command, but I suspect that others have hit this limit.  It certainly seems that &lt;CODE&gt;mvexpand&lt;/CODE&gt; /could/ be smarter than this.  Any advice?&lt;/P&gt;

&lt;P&gt;(For the record, I have already tried the &lt;CODE&gt;fields - _raw&lt;/CODE&gt; trick shared in other &lt;CODE&gt;mvexpand&lt;/CODE&gt; answers.)&lt;/P&gt;</description>
      <pubDate>Mon, 12 Nov 2018 19:41:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391252#M113930</guid>
      <dc:creator>kulick</dc:creator>
      <dc:date>2018-11-12T19:41:44Z</dc:date>
    </item>
    <item>
      <title>Re: Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391253#M113931</link>
      <description>&lt;P&gt;Yes, &lt;CODE&gt;mvexpand&lt;/CODE&gt; is very inefficient. For example, in some Splunk instances you can trigger the default 500MB memory limit with &lt;CODE&gt;| makeresults | eval foo = mvrange(0,10000) | mvexpand foo&lt;/CODE&gt;. 10,000 simple values shouldn't need 5MB, let alone 500.&lt;/P&gt;
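
&lt;P&gt;For reference, that limit is the &lt;CODE&gt;max_mem_usage_mb&lt;/CODE&gt; setting in the &lt;CODE&gt;[mvexpand]&lt;/CODE&gt; stanza of &lt;CODE&gt;limits.conf&lt;/CODE&gt;. A sketch of raising it, assuming you are able to edit &lt;CODE&gt;limits.conf&lt;/CODE&gt; and have memory to spare (the value 2000 is an arbitrary illustration, not a recommendation):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[mvexpand]
# Default is 500 (MB); 2000 is an assumed example value
max_mem_usage_mb = 2000
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This only moves the ceiling; it doesn't make &lt;CODE&gt;mvexpand&lt;/CODE&gt; any more efficient.&lt;/P&gt;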

&lt;P&gt;Since there's no actual question in your question, I'll provide advice instead of an answer: file an ER (enhancement request) with support.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Nov 2018 21:36:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391253#M113931</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2018-11-12T21:36:25Z</dc:date>
    </item>
    <item>
      <title>Re: Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391254#M113932</link>
      <description>&lt;P&gt;Unfortunate, this behavior, but if it is the current state of the art, then an ER seems the best path forward.  Thanks.&lt;/P&gt;</description>
      <pubDate>Sat, 17 Nov 2018 06:23:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391254#M113932</guid>
      <dc:creator>kulick</dc:creator>
      <dc:date>2018-11-17T06:23:31Z</dc:date>
    </item>
    <item>
      <title>Re: Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391255#M113933</link>
      <description>&lt;P&gt;Well, in many cases you can write searches in a way that doesn't need &lt;CODE&gt;mvexpand&lt;/CODE&gt;. Whether that's possible in your case or not depends on your case.&lt;/P&gt;</description>
      <pubDate>Sat, 17 Nov 2018 06:54:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391255#M113933</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2018-11-17T06:54:58Z</dc:date>
    </item>
    <item>
      <title>Re: Can you help me with the following issue involving the mvexpand Limit (Total Output Limit)?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391256#M113934</link>
      <description>&lt;P&gt;And in fact, Martin taught me a great trick to avoid needing &lt;CODE&gt;mvexpand&lt;/CODE&gt;.  The trick covers cases where you would ultimately just be using the field in question in a group by clause of a subsequent &lt;CODE&gt;stats&lt;/CODE&gt; command.  In this case, you can simply leave the multi-valued field multi-valued and things will "just work".  Cool trick!  Thanks for showing me that one, Martin!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Aug 2019 00:59:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-you-help-me-with-the-following-issue-involving-the-mvexpand/m-p/391256#M113934</guid>
      <dc:creator>kulick</dc:creator>
      <dc:date>2019-08-08T00:59:02Z</dc:date>
    </item>
  </channel>
</rss>

