Need help choosing most performant data input format


I've got a system based on an XML API that will be emitting a fair amount of data (hundreds of thousands of events an hour?) in XML format. We'll be using scripted inputs to retrieve the data, so the format can be changed before it is indexed. My question is this: how much work, if any, should go into munging the data, given the impact that can have on search and index performance? Is there a benefit to converting it to JSON, for example, or flattening it into tables or KV pairs? Or should I not bother and just do that work at search time?


I don't yet know how much baggage the XML will come with for a given event type. Obviously, if a single event is 50% larger in one format, that has to be part of the equation. For the sake of argument, let's assume the sizes are roughly similar.



Hal, it depends on how deeply nested the XML is. I don't think there is much difference between parsing XML and JSON. I have seen with other customers, though, that XML events running to several thousand lines severely impact search performance. I would write it out as key-value pairs if it were up to me, but if the events are small it shouldn't cause too much trouble. My $0.02.
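For what it's worth, here is a rough sketch of what "write it out as key-value pairs" could look like inside a Python scripted input. It is only an illustration under some assumptions: the function names (`flatten`, `to_kv_line`) and the sample event are made up, the XML is assumed to have no namespaces, and nested tag names are joined with underscores to build field names. Repeated sibling tags would produce repeated keys, and text mixed in alongside child elements is ignored, so treat it as a starting point rather than a finished converter.

```python
import xml.etree.ElementTree as ET


def flatten(elem, prefix=""):
    """Yield (key, value) pairs for every attribute and text leaf under elem."""
    # Build the field name from the chain of tag names, e.g. event_status.
    path = f"{prefix}_{elem.tag}" if prefix else elem.tag
    for name, value in elem.attrib.items():
        yield f"{path}_{name}", value
    children = list(elem)
    if not children:
        # Leaf element: emit its text if there is any.
        text = (elem.text or "").strip()
        if text:
            yield path, text
    for child in children:
        yield from flatten(child, path)


def to_kv_line(xml_text):
    """Turn one XML event into a single line of key="value" pairs."""
    root = ET.fromstring(xml_text)
    parts = []
    for key, value in flatten(root):
        # Escape backslashes and double quotes so the quoting stays balanced.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical event, not taken from the real API.
    sample = '<event id="42"><host>web01</host><status code="200">OK</status></event>'
    print(to_kv_line(sample))
```

A scripted input would print one such line per event to stdout. Whether this actually beats indexing the raw XML depends on event size and nesting depth, so it is worth measuring on a sample of real data before committing to it.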
