About adamcohen

adamcohen · ‎04-02-2018

Thanks for the response, @starcher. However, I'm not trying to solve this problem for JSON-formatted logs: I already know how to do that, and it works well. The problem is how to solve it for key-value formatted logs, since my organization wants a clear comparison of JSON-formatted logs versus key-value. That is why I'm trying to figure out the best way to store a nested data structure in key-value format, so I can run the same queries against both JSON and key-value formatted data, see what the differences are between the two formats, and summarise the advantages and disadvantages of each approach.

For example, say I want to return all restaurants that have more than 15 categories. I can use the following query on JSON-formatted data:

    source="business.json" | spath categories{} | where mvcount('categories{}') > 15

The above query requires spath, which can be slow. To compare this with key-value, I first need to understand how to store the nested data (including the categories array) in key-value format, so that I can then construct an equivalent query.
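To have a known-good answer to check both Splunk queries against, the "more than 15 categories" filter can also be computed outside Splunk. The following is only a minimal sketch: it assumes the source file holds one JSON object per line, and the function name `records_with_many_categories` and the sample records are hypothetical, not taken from the real data.

```python
import json


def records_with_many_categories(lines, threshold=15):
    """Yield parsed records whose `categories` array has more than `threshold` items.

    Assumes each non-empty line is a standalone JSON object (an assumption
    about the file layout, not something confirmed in the post).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Treat a missing or null `categories` field as an empty array.
        categories = record.get("categories") or []
        if len(categories) > threshold:
            yield record


# Hypothetical sample data: one record with 16 categories, one with 3.
SAMPLE_LINES = [
    json.dumps({"name": "Many", "categories": [f"c{i}" for i in range(16)]}),
    json.dumps({"name": "Few", "categories": ["a", "b", "c"]}),
]

if __name__ == "__main__":
    for rec in records_with_many_categories(SAMPLE_LINES):
        print(rec["name"])
```

The number of records this yields is the figure both the JSON query and any key-value query should reproduce.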

adamcohen · ‎03-26-2018

The Splunk best practices document recommends:

    Use clear key-value pairs
    key1=value1, key2=value2, key3=value3 . . .

This makes sense for simple data that can be represented in key-value format, but what about nested data structures? For example, what's the best way of representing the following log data using key-value format?

    {
      "categories": [
        "Restaurants",
        "American (New)",
        "Southern"
      ],
      "attributes": {
        "BusinessParking": {
          "street": false,
          "garage": true
        },
        "WheelchairAccessible": true,
        "GoodForKids": false
      },
      "stars": 4.5,
      "city": "Las Vegas",
      "name": "Yardbird Southern Table & Bar"
    }

I can represent the attributes and top-level keys using dotted notation:

    attributes.BusinessParking.street="false", attributes.BusinessParking.garage="true", attributes.WheelchairAccessible="true", attributes.GoodForKids="false", stars="4.5", city="Las Vegas", name="Yardbird Southern Table & Bar"

I'm not sure this is optimal, though. My main question, however, is: how should I represent the categories array? I need to be able to search the above data and return all records that have more than N categories, so how should my data be structured to make such a query as efficient as possible?

The reason I'm asking is that we currently store our logs in JSON format, and I can indeed perform the above query on JSON data with spath. However, some people in my organization believe that spath is very slow and that key-value is much faster, and they want to change our logging format from JSON to key-value. I'd like to compare both log structures, JSON and key-value, to understand which format is more efficient for querying (if, in fact, there is any difference at all). At the moment, I can't even figure out how best to structure the key-value logs so that I can query array data.
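The dotted-notation idea above can be sketched as a small flattening routine. This is only an illustration under stated assumptions, not a Splunk-recommended format: arrays are written as repeated keys plus a hypothetical `<key>_count` pair, and the helper names `flatten`, `format_value` and `to_kv_line` are made up for this example. Whether Splunk extracts repeated keys as a multivalue field depends on its extraction configuration, which is not verified here.

```python
import json


def flatten(record, prefix=""):
    """Return a list of (key, value) pairs for a nested dict.

    Nested objects become dotted keys. Lists become a `<key>_count` pair
    followed by one pair per item (an assumed convention, see lead-in).
    """
    pairs = []
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            pairs.extend(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            pairs.append((f"{name}_count", len(value)))
            pairs.extend((name, item) for item in value)
        else:
            pairs.append((name, value))
    return pairs


def format_value(value):
    """Quote a scalar; json.dumps renders booleans as true/false."""
    text = value if isinstance(value, str) else json.dumps(value)
    # Escape backslashes and embedded double quotes inside the quoted value.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_kv_line(record):
    """Join all flattened pairs into one key="value" log line."""
    return ", ".join(f"{k}={format_value(v)}" for k, v in flatten(record))


SAMPLE = {
    "categories": ["Restaurants", "American (New)", "Southern"],
    "attributes": {
        "BusinessParking": {"street": False, "garage": True},
        "WheelchairAccessible": True,
        "GoodForKids": False,
    },
    "stars": 4.5,
    "city": "Las Vegas",
    "name": "Yardbird Southern Table & Bar",
}

if __name__ == "__main__":
    print(to_kv_line(SAMPLE))
```

With an explicit count field in each event, the "more than N categories" search would not need to count array items at search time; a filter along the lines of `| where categories_count > 15` is the kind of query this layout is meant to allow, assuming the field is extracted as written.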

Posts	2
Solutions	0
Karma Given	0
Karma Received	0
Member Since	‎03-26-2018

Date Last Visited	‎06-05-2020 02:04 AM

How to format nested data using key-value structur...

Re: How to format nested data using key-value stru...
