Hi
I am looking to extract some key-value pairs, for each event.
I have data that always has resourceSpans{}.scopeSpans{}.spans{}.attributes{}.key
but it might have resourceSpans{}.scopeSpans{}.spans{}.attributes{}.value.doubleValue or resourceSpans{}.scopeSpans{}.spans{}.attributes{}.value.stringValue.
I want to use them to run stats commands on them. So I was looking to extract each
Key | doubleValue or stringValue
and then use them
This is some of the data I have. We can see that doubleValue and stringValue are mixed and can pop up anytime.
I have tried the following. But there is an issue
source="trace_Marketing_Bench_31032016_17_cff762901d1eff01766119738a9218e2.jsonl" host="TEST1" index="murex_logs" sourcetype="Market_Risk_DT" "**strategy**" 920e1021406277a9
| spath "resourceSpans{}.scopeSpans{}.spans{}.attributes{}.value.stringValue"
| spath "resourceSpans{}.scopeSpans{}.spans{}.attributes{}.value.doubleValue"
| spath "resourceSpans{}.scopeSpans{}.spans{}.attributes{}.key"
| eval output=mvzip('resourceSpans{}.scopeSpans{}.spans{}.attributes{}.value.stringValue','resourceSpans{}.scopeSpans{}.spans{}.attributes{}.key') | table output
The order is not coming out correctly. In red, we can see that
WARNING is with mr_batch_status, not mr_batch_compute_cpu_time - That is because they are both extracting independently and not synced to each other. How do I get them to extract the same?
SOme raw data
{"resourceSpans":[{"resource":{"attributes":[{"key":"telemetry.sdk.version","value":{"stringValue":"1.12.0"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}},{"key":"service.instance.id","value":{"stringValue":"00vptl2h"}},{"key":"service.namespace","value":{"stringValue":"MXMARKETRISK.SERVICE"}},{"key":"service.name","value":{"stringValue":"MXMARKETRISK.ENGINE.MX"}}]},"scopeSpans":[{"scope":{"name":"murex::tracing_backend::otel","version":"v1"},"spans":[{"traceId":"cff762901d1eff01766119738a9218e2","spanId":"71d94e8ebb30a3d5","parentSpanId":"920e1021406277a9","name":"fullreval_task","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1716379123221825454","endTimeUnixNano":"1716379155367858727","attributes":[{"key":"market_risk_span","value":{"stringValue":"true"}},{"key":"mr_batchId","value":{"stringValue":"440"}},{"key":"mr_batchType","value":{"stringValue":"Full Revaluation"}},{"key":"mr_bucketName","value":{"stringValue":"imccBucket#ALL_10_Reduced"}},{"key":"mr_jobDomain","value":{"stringValue":"Market Risk"}},{"key":"mr_jobId","value":{"stringValue":"Marketing_Bench | 31/03/2016 | 17"}},{"key":"mr_strategy","value":{"stringValue":"typo_Bond"}},{"key":"mr_uuid","value":{"stringValue":"b1ed4d3a-0e4d-4afa-ad39-7cf6a07c36a9"}},{"key":"mrb_batch_affinity","value":{"stringValue":"Marketing_Bench_run_Batch|Marketing_Bench|2016/03/31|17_FullReval0_00029"}},{"key":"mr_batch_compute_cpu_time","value":{"doubleValue":31.586568}},{"key":"mr_batch_compute_time","value":{"doubleValue":31.777}},{"key":"mr_batch_load_cpu_time","value":{"doubleValue":0.0}},{"key":"mr_batch_load_time","value":{"doubleValue":0.0}},{"key":"mr_batch_status","value":{"stringValue":"WARNING"}},{"key":"mr_batch_total_cpu_time","value":{"doubleValue":31.912966}},{"key":"mr_batch_total_time","value":{"doubleValue":32.14}}],"status":{}}]}]}]}
Try something like this
| spath resourceSpans{}.scopeSpans{}.spans{}.attributes{} output=attributes
| mvexpand attributes
| spath input=attributes
| eval X_{key}=coalesce('value.doubleValue', 'value.stringValue')
| stats values(X_*) as * by _raw
Try something like this
| spath resourceSpans{}.scopeSpans{}.spans{}.attributes{} output=attributes
| mvexpand attributes
| spath input=attributes
| eval X_{key}=coalesce('value.doubleValue', 'value.stringValue')
| stats values(X_*) as * by _raw
Hi
In the end, I used this however it was not clear to me why I did not need to reference the newly created X_Mr. I could go straight to MR
source="trace_Marketing_Bench_31032016_17_cff762901d1eff01766119738a9218e2*.jsonl" host="TEST2" index="murex_logs" sourcetype="Market_Risk_DT" "**strategy**" 920e1021406277a9
| spath resourceSpans{}.scopeSpans{}.spans{}.attributes{} output=attributes
| mvexpand attributes
| spath input=attributes
| eval X_{key}=coalesce('value.doubleValue', 'value.stringValue')
| stats values(X_*) as * by _time
| stats sum(mr_batch_load_cpu_time) as batch_load_cpu_time sum(mr_batch_load_time) as batch_load_time sum(mr_batch_compute_time) as mr_batch_compute_time sum(mr_batch_compute_cpu_time) as mr_batch_compute_cpu_time by mr_strategyThis created the below table that I was looking to do
What I don't understand is at this point I can only see the new fields X_mr
I added in a "| stats values(X_*) as * by _time" and we are back to the original - I don't get that.
| stats values(X_*) as * by _timeThis line removes all the other fields. You would need to add more fields to this if you want more fields to be kept.
Hi @robertlynch2020
Is this what you are after? Ive loaded in your sample event to start with but you can replace this with the search for your events!
| makeresults
| eval _raw="{\"resourceSpans\":[{\"resource\":{\"attributes\":[{\"key\":\"telemetry.sdk.version\",\"value\":{\"stringValue\":\"1.12.0\"}},{\"key\":\"telemetry.sdk.name\",\"value\":{\"stringValue\":\"opentelemetry\"}},{\"key\":\"telemetry.sdk.language\",\"value\":{\"stringValue\":\"cpp\"}},{\"key\":\"service.instance.id\",\"value\":{\"stringValue\":\"00vptl2h\"}},{\"key\":\"service.namespace\",\"value\":{\"stringValue\":\"MXMARKETRISK.SERVICE\"}},{\"key\":\"service.name\",\"value\":{\"stringValue\":\"MXMARKETRISK.ENGINE.MX\"}}]},\"scopeSpans\":[{\"scope\":{\"name\":\"murex::tracing_backend::otel\",\"version\":\"v1\"},\"spans\":[{\"traceId\":\"cff762901d1eff01766119738a9218e2\",\"spanId\":\"71d94e8ebb30a3d5\",\"parentSpanId\":\"920e1021406277a9\",\"name\":\"fullreval_task\",\"kind\":\"SPAN_KIND_INTERNAL\",\"startTimeUnixNano\":\"1716379123221825454\",\"endTimeUnixNano\":\"1716379155367858727\",\"attributes\":[{\"key\":\"market_risk_span\",\"value\":{\"stringValue\":\"true\"}},{\"key\":\"mr_batchId\",\"value\":{\"stringValue\":\"440\"}},{\"key\":\"mr_batchType\",\"value\":{\"stringValue\":\"Full Revaluation\"}},{\"key\":\"mr_bucketName\",\"value\":{\"stringValue\":\"imccBucket#ALL_10_Reduced\"}},{\"key\":\"mr_jobDomain\",\"value\":{\"stringValue\":\"Market Risk\"}},{\"key\":\"mr_jobId\",\"value\":{\"stringValue\":\"Marketing_Bench | 31/03/2016 | 17\"}},{\"key\":\"mr_strategy\",\"value\":{\"stringValue\":\"typo_Bond\"}},{\"key\":\"mr_uuid\",\"value\":{\"stringValue\":\"b1ed4d3a-0e4d-4afa-ad39-7cf6a07c36a9\"}},{\"key\":\"mrb_batch_affinity\",\"value\":{\"stringValue\":\"Marketing_Bench_run_Batch|Marketing_Bench|2016/03/31|17_FullReval0_00029\"}},{\"key\":\"mr_batch_compute_cpu_time\",\"value\":{\"doubleValue\":31.586568}},{\"key\":\"mr_batch_compute_time\",\"value\":{\"doubleValue\":31.777}},{\"key\":\"mr_batch_load_cpu_time\",\"value\":{\"doubleValue\":0.0}},{\"key\":\"mr_batch_load_time\",\"value\":{\"doubleValue\":0.0}},{\"key\":\"mr_batch_status\",\"value\":{\"stringValue\":\"WARNING\"}},{\"key\":\"mr_batch_total_cpu_time\",\"value\":{\"doubleValue\":31.912966}},{\"key\":\"mr_batch_total_time\",\"value\":{\"doubleValue\":32.14}}],\"status\":{}}]}]}]}"
| eval eventKey=md5(_raw)
| eval attributes=json_array_to_mv(json_extract(_raw,"resourceSpans{}.scopeSpans{}.spans{}.attributes"))
| mvexpand attributes
| eval attribute_key=json_extract(attributes,"key")
| eval attribute_val=coalesce(json_extract(json_extract(attributes,"value"),"stringValue"),json_extract(json_extract(attributes,"value"),"doubleValue"))
| eval extracted_{attribute_key}=attribute_val
| stats values(extracted_*) as * by eventKey
Basically you're doing
| eval eventKey=md5(_raw)
| eval attributes=json_array_to_mv(json_extract(_raw,"resourceSpans{}.scopeSpans{}.spans{}.attributes"))
| mvexpand attributes
| eval attribute_key=json_extract(attributes,"key")
| eval attribute_val=coalesce(json_extract(json_extract(attributes,"value"),"stringValue"),json_extract(json_extract(attributes,"value"),"doubleValue"))
| eval extracted_{attribute_key}=attribute_val
| stats values(extracted_*) as * by eventKey
Please let me know how you get on and consider accepting this answer or adding karma this answer if it has helped.
Regards
Will