Hello, everyone!
I ran into a weird problem. I have the following search:
| tstats `summariesonly` count by source, host, index, sourcetype | table source, host, index, sourcetype | stats values(host) as src_host dc(host) as count by source, index | sort + count
| mvexpand src_host
| outputlookup sourcescheck
Long story short, I got this warning:
warn : command.mvexpand: output will be truncated at 5200 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.
and my truncated sourcescheck.csv was less than 1 MB in size (sic!).
OK, the docs (https://docs.splunk.com/Documentation/Splunk/7.3.1/SearchReference/Mvexpand) say:
The total necessary memory is the average result size multiplied by the number of results in the chunk multiplied by the average size of the multivalue field being expanded.
My question: is it really the case that processing less than 1 MB of data with the mvexpand command needs more than 500 MB of memory? Are there any workarounds or more optimized approaches?
Thank you for any answers!
The file is probably smaller than 1 MB because the search output was truncated, so it only ran partially. Can you please try the search below? Hopefully it will give you all the required rows, with better performance.
| tstats `summariesonly` count by source, host, index, sourcetype
| stats values(host) as src_host dc(host) as count by source, index | eval n=1 | accum n
| sort + count
| stats values(*) as * by n,src_host | table source index src_host count
| outputlookup sourcescheck
Thanks
Hi, I had a similar mvexpand truncation warning and used the lines below:
| eval n=1 | accum n
| stats values(*) as * by n, (all the fields that were mentioned in mvexpand)
I am able to get the results without any warnings or errors, but can you please explain what is happening in the lines above?
`mvexpand` has its own limitation (a memory limit). In most cases `mvexpand` works like a charm, but with a huge dataset or result set it hits this limit and truncates its output.
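To get a rough idea of how much expansion `mvexpand` is being asked to do, one option is to count the values in the multivalue field before expanding it. This is only a sketch built on the search from the question; the field names `hosts_per_row`, `rows_before`, `rows_after` and `largest_mv` are made up for illustration.

```spl
| tstats `summariesonly` count by source, host, index, sourcetype
| stats values(host) as src_host by source, index
| eval hosts_per_row=mvcount(src_host)
| stats count as rows_before sum(hosts_per_row) as rows_after max(hosts_per_row) as largest_mv
```

Here `rows_after` is the number of rows `mvexpand src_host` would have to produce, and `largest_mv` is the biggest single `src_host` list. It does not report memory use directly, but it shows whether a few very large multivalue fields are driving the expansion.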
On the other hand, the `stats` command handles large datasets with very good performance.
So I used `stats` instead of `mvexpand`.
The challenge was that the output had to be the same as the result with `mvexpand`.
The SPL below is the magic line that helped me achieve that.
| eval n=1 | accum n
This numbers the results from 1 to N (the total number of results before mvexpand/stats). The stats command below then does the work we wanted from mvexpand: because the multivalue field is one of the by fields, stats produces a separate row for each of its values, which gives you the same output.
| stats values(*) as * by n, (all the fields that were mentioned in mvexpand)
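A small self-contained way to see the effect, assuming nothing about your data: the sketch below builds two fake results with `makeresults`, where the first carries three hosts and the second carries one. The host and source names are invented for the demo.

```spl
| makeresults count=2
| eval n=1 | accum n
| eval source="demo_source_".n
| eval src_host=if(n=1, split("hostA,hostB,hostC", ","), "hostD")
| fields - _time
| stats values(*) as * by n, src_host
```

The result should contain one row per host value, each still tied to its original result through `n`, which is what `mvexpand src_host` would have returned. Without `n` in the by clause, rows from different original results that share a host would be merged into one.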
I hope this will help you.
Thanks
KV
If any of my replies helps you solve the problem or gain knowledge, an upvote would be appreciated.
Do not use table and then stats. Just remove the table line in the middle and go straight to stats. The table command there breaks the map-reduce efficiencies of the search.
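Along the same lines, a different approach (not the one from the accepted answer) is to never build the multivalue field at all. Since tstats already returns one row per host, the distinct host count can be attached to each row with eventstats. This is an untested sketch that assumes the same `summariesonly` macro and lookup name as the question; note that eventstats has its own memory limit in limits.conf, so it is not guaranteed to scale better on every dataset.

```spl
| tstats `summariesonly` count by source, host, index
| eventstats dc(host) as count by source, index
| rename host as src_host
| table source, index, src_host, count
| sort + count
| outputlookup sourcescheck
```

Here `count` from tstats is overwritten by the distinct host count per source and index, matching the `dc(host) as count` column of the original search.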
It works, although I was left disappointed by the mvexpand command.
Thanks a lot!
Yes, makes sense. Thanks, @starcher 🙂