Splunk Search

Search query using spath and mvexpand on multi-value nested JSON doesn't scale

sharad06
Explorer

Hi Splunk Experts,

I am sending events to Splunk Enterprise in the following nested JSON format:

{
     compliance:     Compliance Unknown,    
     ctupdate:   hostinfo,  
     host_properties:   [
            {
             name:   _times,    
              since:     1508907165,    
             value:  last_onsite    
            },  
            {
              name:  compliance_state,  
               since:    1508268020,    
               value:    N/A    
            },  
            {
              name:  engine_seen_packet,    
               since:    1508907165,    
               value:    Yes    
            },  
            {
               name:     guest_corporate_state, 
               since:    1508268020,    
               value:    N/A    
            },  
            {
             name:   linux_operating_system,    
             since:  1508907165,    
             value:  2.0.2  
            },  
            {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue1
            },  
                {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue2
            },
                {
             name:   online,    
             since:  1508907165,    
             value:  online_multivalue3
            },
            {
             name:   ssh_open_port, 
             since:  1508959259,    
             value:  0  
            },
            {
             name:   va_netfunc,    
             since:  1508959247,    
             value:  Linux Desktop/Server   
            }   
    ]   
     ip:     192.168.1.17,  
     tenant_id:  acd1034578ef
}

I want to create a dashboard that would go over all such events and display a pie-chart with online values seen so far. To achieve this, I've been using the following query:

`ct_hostinfo` `get_sourcetypes`
| spath output=prop_name path=host_properties{}.name
| spath output=prop_val path=host_properties{}.value
| eval prop_key_val=mvzip(prop_name, prop_val, "---")
| mvexpand prop_key_val
| eval prop_key_val=split(prop_key_val, "---")
| eval prop_name=mvindex(prop_key_val, 0)
| eval prop_val=mvindex(prop_key_val, 1)
| search prop_name=online
| stats count by prop_val

The above query does its work but it doesn't scale. If I let my dashboard collect data for over ~5k events, I start seeing the following error:

command.mvexpand: output will be truncated at 1300 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.

I would like to know if I can somehow scale the above search query so that it can handle greater number of events. Would I need to defined field extractions etc. in transforms.conf/props.conf? I don't know much about field extraction configs.

I would be very grateful, if someone can suggest me a better query or field extractions.

Thanks.

0 Karma

DalJeanis
Legend

You are carrying along a ton of fields when you mvexpand your records. That calculates to over 300K per record. You only need prop_key_val at that point, right?

Two fields commands, immediately before the mvexpand, should put you up to millions of records possible.

 | eval prop_key_val=mvzip(prop_name, prop_val, "---")
 | fields prop_key_val
 | fields - _*
 | mvexpand prop_key_val

Looking deeper, you only want ONE of the values for prop_name=online, so you don't need the mvexpand at all. You need mvfilter.

Try this against your existing search for a small time number of records. It should get the same results. If not, then adjust the mvfilter test until it does. (Adjust case sensitivity, like() vs match(), and so on)

`ct_hostinfo` `get_sourcetypes`
 | spath output=prop_name path=host_properties{}.name
 | spath output=prop_val path=host_properties{}.value
 | eval prop_key_val=mvzip(prop_name, prop_val, "---")
 | eval prop_key_val=mvfilter( like(prop_key_val,"online%") 
 | mvexpand prop_key_val
 | eval prop_val=mvindex(split(prop_key_val, "---"), 1)
 | stats count by prop_val

sharad06
Explorer

Hi Dal,

I tried your suggestion to remove internal fields (fields - _*). But it doesn't work 😞

It's always returning 'No results found'.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Index This | What travels the world but is also stuck in place?

April 2026 Edition  Hayyy Splunk Education Enthusiasts and the Eternally Curious!   We’re back with this ...

Discover New Use Cases: Unlock Greater Value from Your Existing Splunk Data

Realizing the full potential of your Splunk investment requires more than just understanding current usage; it ...

Continue Your Journey: Join Session 2 of the Data Management and Federation Bootcamp ...

As data volumes continue to grow and environments become more distributed, managing and optimizing data ...