Dashboards & Visualizations

Need help choosing most performant data input format

halr9000
Motivator

I've got a system based on an XML API that will be spitting out a good amount of data (100k's of events an hour?) in XML format. We'll be using scripted inputs to retrieve the data, so the format can be changed prior to indexing. My question is this: how much work, if any, should be spent munging the data, given the impact that can have on search and indexing performance? Is there a benefit to converting it to JSON, for example, or flattening it into tables or KV pairs? Or should I not bother and just do that work at search time?

halr9000
Motivator

I don't yet know how much baggage the XML will come with for a given event type. Obviously, if a single event is 50% larger, that has to be part of the equation. Let's assume for the sake of argument that the sizes are roughly similar.


RicoSuave
Builder

Hal, it depends on how deeply nested the XML is. I don't think there is much difference in terms of parsing XML or JSON. However, I have seen with other customers that XML events running to several thousand lines severely impact search performance. I would write it out to key-value pairs if it were up to me, but if the events are small it shouldn't cause too much trouble. My .02
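A minimal sketch of the kind of flattening described above, done inside the scripted input before the data reaches the indexer. This assumes Python and `xml.etree.ElementTree` from the standard library; the event structure and field names (`host`, `status`, `code`) are hypothetical stand-ins for whatever the XML API actually returns:

```python
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Recursively flatten an XML element into dotted key/value pairs."""
    pairs = {}
    for child in elem:
        key = f"{prefix}{child.tag}"
        if len(child):  # nested element: recurse with a dotted prefix
            pairs.update(flatten(child, prefix=f"{key}."))
        elif child.text and child.text.strip():
            pairs[key] = child.text.strip()
    return pairs

def to_kv_line(xml_event):
    """Render one XML event as a single line of key="value" pairs."""
    root = ET.fromstring(xml_event)
    return " ".join(f'{k}="{v}"' for k, v in flatten(root).items())

# Hypothetical event shape; real field names depend on the XML API.
event = "<event><host>web01</host><status><code>200</code></status></event>"
print(to_kv_line(event))  # host="web01" status.code="200"
```

The payoff is that each event lands as a single short line of `key="value"` pairs, which Splunk's automatic field extraction picks up without any search-time XML parsing, and which avoids the multi-thousand-line events mentioned above.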
