Before I rant, thank you for sharing valid mock data in text. That said, this is the second time in as many days that I have felt like screaming at lazy developers who make terrible use of JSON arrays. (The developer might be you. But the rant stands.) Your data would have had much cleaner, self-evident semantics had the developer simply used this:

[
{
"attributes": {"host.name":{"stringValue":"myname1"},"telemetry.sdk.name":{"stringValue":"my_sdk"}},
"metrics": {"hw.host.energy":{"dataPoints":[{"timeUnixNano":"1712951030986039000","asDouble":359}]},"hw.host.power":{"dataPoints":[{"timeUnixNano":"1712951030986039000","asDouble":26}]}}
},
{
"attributes": {"host.name":{"stringValue":"myname2"},"telemetry.sdk.name":{"stringValue":"my_sdk"}},
"metrics": {"hw.host.energy":{"dataPoints":[{"timeUnixNano":"1712951030987780000","asDouble":211}]}}
}
]

In other words, only two JSON arrays in the original data are used correctly. resourceMetrics.resource.attributes[] and resourceMetrics.scopeMetrics.metrics[] are a total abomination of the intent of JSON arrays. Speak to your developers to see if they could change the data structure, not just for Splunk, but for future maintainers of their own code and any other downstream teams as well.

Now that this is off my chest, I understand that it will take more than one day for developers to change code even if you convince them on day one. Here is the SPL that I use to tabulate your data like the following:

| host.name.stringValue | hw.host.energy{}.asDouble | hw.host.energy{}.timeUnixNano | hw.host.power{}.asDouble | hw.host.power{}.timeUnixNano | sdk.name.stringValue |
|---|---|---|---|---|---|
| myname1 | 359 | 1712951030986039000 | 26 | 1712951030986039000 | my_sdk |
| myname2 | 211 | 1712951030987780000 | | | my_sdk |

In this form, I have assumed that dataPoints[] is the only node of interest under resourceMetrics[].scopeMetrics[].metrics.gauge.

| spath path=resourceMetrics{}
| fields - _* resourceMetrics{}.*
| mvexpand resourceMetrics{}
| spath input=resourceMetrics{} path=resource.attributes{}
| spath input=resourceMetrics{} path=scopeMetrics{}
| spath input=scopeMetrics{} path=metrics{}
| fields - resourceMetrics{} scopeMetrics{}
| foreach resource.attributes{} mode=multivalue
[eval key = mvappend(key, json_extract(<<ITEM>>, "key"))]
| eval idx = mvrange(0, mvcount(key))
| eval attributes_good = json_object()
| foreach idx mode=multivalue
[eval attribute = mvindex('resource.attributes{}', <<ITEM>>),
attributes_good = json_set_exact(attributes_good, json_extract(attribute, "key"), json_extract(attribute, "value"))]
| fields - key attribute resource.attributes{}
| foreach metrics{} mode=multivalue
[eval name = mvappend(name, json_extract(<<ITEM>>, "name"))]
| eval name = if(isnull(name), json_extract('metrics{}', "name"), name)
| eval idx = mvrange(0, mvcount(name))
| eval metrics_good = json_object()
| foreach idx mode=multivalue
[eval metric = mvindex('metrics{}', <<ITEM>>),
metrics_good = json_set_exact(metrics_good, json_extract(metric, "name"), json_extract(metric, "gauge.dataPoints"))]
``` the above assumes that gauge.dataPoints is the only subnode of interest ```
| fields - idx name metric metrics{}
``` the above transforms array-laden JSON into easily understandable JSON ```
| spath input=attributes_good
| spath input=metrics_good
| fields - *_good
``` the following is only needed if dataPoints[] actually contains multiple values. This is the only code requiring prior knowledge about data fields ```
| mvexpand hw.host.energy{}.timeUnixNano
| mvexpand hw.host.power{}.timeUnixNano

(The fields - xxx commands are not essential; they just declutter the view.) Hope this helps.

This is an emulation you can play with and compare with real data:

| makeresults
| eval _raw = "{
\"resourceMetrics\": [
{
\"resource\": {
\"attributes\": [
{
\"key\": \"host.name\",
\"value\": {
\"stringValue\": \"myname1\"
}
},
{
\"key\": \"telemetry.sdk.name\",
\"value\": {
\"stringValue\": \"my_sdk\"
}
}
]
},
\"scopeMetrics\": [
{
\"metrics\": [
{
\"name\": \"hw.host.energy\",
\"gauge\": {
\"dataPoints\": [
{
\"timeUnixNano\": \"1712951030986039000\",
\"asDouble\": 359
}
]
}
},
{
\"name\": \"hw.host.power\",
\"gauge\": {
\"dataPoints\": [
{
\"timeUnixNano\": \"1712951030986039000\",
\"asDouble\": 26
}
]
}
}
]
}
]
},
{
\"resource\": {
\"attributes\": [
{
\"key\": \"host.name\",
\"value\": {
\"stringValue\": \"myname2\"
}
},
{
\"key\": \"telemetry.sdk.name\",
\"value\": {
\"stringValue\": \"my_sdk\"
}
}
]
},
\"scopeMetrics\": [
{
\"metrics\": [
{
\"name\": \"hw.host.energy\",
\"gauge\": {
\"dataPoints\": [
{
\"timeUnixNano\": \"1712951030987780000\",
\"asDouble\": 211
}
]
}
}
]
}
]
}
]
}"
| spath
``` data emulation above ```

Final thoughts about data structures with self-evident semantics: if my speculation about dataPoints[] being the only node of interest under resourceMetrics[].scopeMetrics[].metrics.gauge stands, good data could be further simplified to

[
{
"attributes": {"host.name":{"stringValue":"myname1"},"telemetry.sdk.name":{"stringValue":"my_sdk"}},
"metrics": {"hw.host.energy":[{"timeUnixNano":"1712951030986039000","asDouble":359}],"hw.host.power":[{"timeUnixNano":"1712951030986039000","asDouble":26}]}
},
{
"attributes": {"host.name":{"stringValue":"myname2"},"telemetry.sdk.name":{"stringValue":"my_sdk"}},
"metrics": {"hw.host.energy":[{"timeUnixNano":"1712951030987780000","asDouble":211}]}
}
]

I do understand that listing hw.host.energy and hw.host.power as coexisting columns is different from your illustrated output and may not suit your needs. But presentation can easily be adapted. Bad data structure remains bad.
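If it helps make the case to the developers, here is a minimal Python sketch (my own illustration, not part of the SPL above) of the restructuring I am asking for: key attributes by "key" and metrics by "name" instead of burying the names inside array elements. It assumes, as above, that gauge.dataPoints is the only node of interest under each metric.

```python
import json

# An abridged copy of the original payload (one resource), with the
# array-of-{key, value} and array-of-{name, ...} anti-patterns intact.
raw = json.loads("""
{
  "resourceMetrics": [
    {
      "resource": {
        "attributes": [
          {"key": "host.name", "value": {"stringValue": "myname1"}},
          {"key": "telemetry.sdk.name", "value": {"stringValue": "my_sdk"}}
        ]
      },
      "scopeMetrics": [
        {
          "metrics": [
            {"name": "hw.host.energy",
             "gauge": {"dataPoints": [
               {"timeUnixNano": "1712951030986039000", "asDouble": 359}]}},
            {"name": "hw.host.power",
             "gauge": {"dataPoints": [
               {"timeUnixNano": "1712951030986039000", "asDouble": 26}]}}
          ]
        }
      ]
    }
  ]
}
""")

def restructure(payload):
    """Turn name-carrying array elements into keyed objects:
    attributes keyed by "key", metrics keyed by "name"."""
    good = []
    for rm in payload["resourceMetrics"]:
        attributes = {a["key"]: a["value"]
                      for a in rm["resource"]["attributes"]}
        metrics = {m["name"]: m["gauge"]["dataPoints"]
                   for sm in rm["scopeMetrics"]
                   for m in sm["metrics"]}
        good.append({"attributes": attributes, "metrics": metrics})
    return good

print(json.dumps(restructure(raw), indent=1))
```

The output is the "good data" shape shown above, which needs none of the mvexpand/foreach gymnastics to tabulate.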