We have a web API that orchestrates calls to other services. For example, an incoming call to `/api` may fan out into 3 calls to a system of record (SOR). These executions end up in Splunk as a single event (mostly for ease of investigation in other avenues), so we have events that look like the following:

```
SOR.Executions.0.Operation: "GetInfo",
SOR.Executions.0.TimeInMs: 321,
SOR.Executions.1.Operation: "UpdateRecord",
SOR.Executions.1.TimeInMs: 234,
SOR.Executions.2.Operation: "DoSomethingElse",
SOR.Executions.2.TimeInMs: 532,
```
I've been able to successfully extract singular values from these via a search such as:

```
index="docker"
| fields SOR.Executions.*.Operation
| foreach SOR.Executions.*.Operation [eval Operation=mvappend(Operation, '<<FIELD>>')]
| mvexpand Operation
| fields Operation
| stats count by Operation
| sort count desc
```

This works well for retrieving singular values without correlations (and allows for nice pie charts of how often specific operations are performed), but now I want percentile timings for each individual operation. For example, I want to run statistics on how long a "GetInfo" operation takes vs how long a "DoSomethingElse" operation takes.

The problem is that the number of SOR calls varies depending on the input to the API call. We may end up with 3 SOR calls if 1 customer ID is passed in, or 8 calls if 5 customer IDs are passed in.

My initial thought was to do an eval to grab the field name and put it in a multi-value field, then do a `rex` to pull out the execution digit (the dynamic part), tokenize that, and then do a subsearch based on it. However, that's getting into territory I can't find documentation for (pulling out field names, for example) and is turning into a complex rabbit hole. Is there an easier avenue that I'm missing?