Dashboards & Visualizations

Need help choosing the most performant data input format


I've got a system based on an XML API that will be spitting out a good amount of data (100k's of events an hour?) in XML format. We'll be using scripted inputs to retrieve the data, so the format can be changed prior to indexing. My question is this: how much work, if any, should be spent on munging the data, in terms of the impact that can have on search and index performance? Is there a benefit to converting it to JSON, for example, or flattening it into tables or KV pairs? Or should I not bother and just do that work at search time?


I don't yet know how much baggage the XML will come with for a given event type. Obviously if a single event is 50% larger that's got to be a part of the equation. Let's assume for the sake of argument that the sizes are roughly similar.



Hal, it depends on how deeply nested the XML is. I don't think there is much difference in terms of parsing XML versus JSON. That said, I have seen with other customers that XML events running to several thousand lines severely impact search performance. I would write it out to key-value pairs if it were up to me, but if the events are small it shouldn't cause too much trouble. My $.02.
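If it helps, here's a minimal sketch of what that flattening step could look like in a scripted input, using only the Python standard library. The event shape, element names, and dot-joined key convention are illustrative assumptions, not anything from your API:

```python
# Sketch: flatten a hypothetical XML event into a key="value" line
# before indexing, as a scripted input might do. All element and
# attribute names below are made up for illustration.
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Recursively walk an element, returning a dict of key/value pairs.
    Nested element names are joined with dots (an arbitrary choice)."""
    pairs = {}
    for key, val in elem.attrib.items():
        pairs[f"{prefix}{key}"] = val
    text = (elem.text or "").strip()
    if text and not list(elem):  # leaf element with text content
        pairs[prefix.rstrip(".") or elem.tag] = text
    for child in elem:
        pairs.update(flatten(child, f"{prefix}{child.tag}."))
    return pairs

def to_kv_line(xml_event):
    """Render one XML event as a single key="value" line."""
    root = ET.fromstring(xml_event)
    pairs = flatten(root)
    return " ".join(f'{k}="{v}"' for k, v in sorted(pairs.items()))

event = '<event ts="2024-01-01T00:00:00Z"><host>web01</host><status>200</status></event>'
print(to_kv_line(event))
# → host="web01" status="200" ts="2024-01-01T00:00:00Z"
```

One line per event with quoted key=value pairs is something Splunk's automatic field extraction handles without any props/transforms work, which is part of why flat KV tends to search faster than deeply nested markup.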
