Splunk Search

Search query using spath and mvexpand on multi-value nested JSON doesn't scale

sharad06
Explorer

Hi Splunk Experts,

I am sending events to Splunk Enterprise in the following nested JSON format:

{
     compliance:     Compliance Unknown,    
     ctupdate:   hostinfo,  
     host_properties:   [
            {
             name:   _times,    
              since:     1508907165,    
             value:  last_onsite    
            },  
            {
              name:  compliance_state,  
               since:    1508268020,    
               value:    N/A    
            },  
            {
              name:  engine_seen_packet,    
               since:    1508907165,    
               value:    Yes    
            },  
            {
               name:     guest_corporate_state, 
               since:    1508268020,    
               value:    N/A    
            },  
            {
             name:   linux_operating_system,    
             since:  1508907165,    
             value:  2.0.2  
            },  
            {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue1
            },  
                {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue2
            },
                {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue3
            },
            {
             name:   ssh_open_port, 
             since:  1508959259,    
             value:  0  
            },
            {
             name:   va_netfunc,    
             since:  1508959247,    
             value:  Linux Desktop/Server   
            }   
    ]   
     ip:     192.168.1.17,  
     tenant_id:  acd1034578ef
}

I want to create a dashboard that would go over all such events and display a pie-chart with online values seen so far. To achieve this, I've been using the following query:

`ct_hostinfo` `get_sourcetypes`
| spath output=prop_name path=host_properties{}.name
| spath output=prop_val path=host_properties{}.value
| eval prop_key_val=mvzip(prop_name, prop_val, "---")
| mvexpand prop_key_val
| eval prop_key_val=split(prop_key_val, "---")
| eval prop_name=mvindex(prop_key_val, 0)
| eval prop_val=mvindex(prop_key_val, 1)
| search prop_name=online
| stats count by prop_val

The above query does its work but it doesn't scale. If I let my dashboard collect data for over ~5k events, I start seeing the following error:

command.mvexpand: output will be truncated at 1300 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.

I would like to know if I can somehow scale the above search query so that it can handle greater number of events. Would I need to defined field extractions etc. in transforms.conf/props.conf? I don't know much about field extraction configs.

I would be very grateful, if someone can suggest me a better query or field extractions.

Thanks.

0 Karma

DalJeanis
Legend

You are carrying along a ton of fields when you mvexpand your records. That calculates to over 300K per record. You only need prop_key_val at that point, right?

Two fields commands, immediately before the mvexpand, should put you up to millions of records possible.

 | eval prop_key_val=mvzip(prop_name, prop_val, "---")
 | fields prop_key_val
 | fields - _*
 | mvexpand prop_key_val

Looking deeper, you only want ONE of the values for prop_name=online, so you don't need the mvexpand at all. You need mvfilter.

Try this against your existing search for a small time number of records. It should get the same results. If not, then adjust the mvfilter test until it does. (Adjust case sensitivity, like() vs match(), and so on)

`ct_hostinfo` `get_sourcetypes`
 | spath output=prop_name path=host_properties{}.name
 | spath output=prop_val path=host_properties{}.value
 | eval prop_key_val=mvzip(prop_name, prop_val, "---")
 | eval prop_key_val=mvfilter( like(prop_key_val,"online%") 
 | mvexpand prop_key_val
 | eval prop_val=mvindex(split(prop_key_val, "---"), 1)
 | stats count by prop_val

sharad06
Explorer

Hi Dal,

I tried your suggestion to remove internal fields (fields - _*). But it doesn't work 😞

It's always returning 'No results found'.

0 Karma
Get Updates on the Splunk Community!

Brains, Bytes, and Boston: Learn from the Best at .conf25

When you think of Boston, you might picture colonial charm, world-class universities, or even the crack of a ...

Splunk AppDynamics Agents Webinar Series

Mark your calendars! On June 24th at 12PM PST, we’re going live with the second session of our Splunk ...

SplunkTrust Application Period is Officially OPEN!

It's that time, folks! The application/nomination period for the 2025 SplunkTrust is officially open! If you ...