I am attempting to correlate network latency fields from different indices. Basically, I would like to end up with a table with 2 columns, with each column being populated with every recorded value for network latency in each of the indices.
I have tried using subsearches like so:
index=indexA | eval "latencyFromIndexB"=[search index=indexB | return <count> $latencyB]+0 | fields "latencyFromIndexA" "latencyFromIndexB"
But, I have been unable to get multiple values to return from the subsearch in a useful way. Depending on the exact method, latencyFromIndexB either lists a single value or the same value over and over, presumably from the first event it finds.
End result will hopefully look something like this:
latencyA | latencyB
12312 | 545
324 | 2123
etc...
Thank you in advance if you are able to assist me!
@martin_mueller Sorry for the delayed reply! Index A logs front-end events, Index B logs events from our APIs. The latency reported by each should be different but directly related, with their difference being the latency caused by the request traversing the internet.
In other words, each front-end event that reports network latency should have a correlating event in the API index.
How does one correlate an event from A with an event from B on a conceptual level? Ignore Splunk/SPL for now.
Sample events would really help here.
So, let's say the application front-end reports latency of 500ms for a given event. An API (with events being logged in B) takes that front-end request and processes it; Splunk logs this as an event and reports the API response_time=50ms. I believe this means that the request spent 450ms (the difference between the two values) traveling through proxies, gateways, and whatnot. I am trying to run a linear regression as a proof of concept against these two values to show they are directly related.
Sample events (only showing relevant fields):
Index A - Front End
Index B - API
So here, these events were produced via the same request. From this, I can tell that the API is not responsible for any of the latency experienced on the front-end (via response_time: 0). Thus, the latency is likely due to the request routing.
How can you tell they were generated by the same request? I see no request ID or something like that to link the two together.
I currently do not have a request ID field linking the two events together. My thinking was that if I am exclusively plotting the values of each over time, it would show their relationship just as well as creating some sort of link. That being said, it is possible I do not fully understand the value of adding a request ID field.
How would a request ID field assist in getting the output I am looking for?
Without an ID you can chart general trends, e.g. this:
index=A OR index=B | timechart avg(latency) avg(response_time)
With an ID you could link which slow requests over here are related to which requests over there, potentially providing lots of value for troubleshooting.
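As a rough sketch of what that could look like: assuming both indices carried a shared field, here hypothetically named request_id (it does not exist in the events yet), and assuming latency and response_time are plain numeric values in milliseconds, something along these lines should produce one row per request with both values side by side. The output names latencyA, latencyB, and network_latency are made up for illustration.

```spl
index=A OR index=B request_id=*
| stats max(latency) as latencyA max(response_time) as latencyB by request_id
| where isnotnull(latencyA) AND isnotnull(latencyB)
| eval network_latency = latencyA - latencyB
| table request_id latencyA latencyB network_latency
```

Using stats ... by request_id instead of a subsearch avoids the single-value problem described earlier, since every ID gets its own row. max() is only there to collapse to one number per ID in case an ID shows up more than once in an index; if each request produces exactly one event per index, any single-value aggregation would do. The where clause drops requests that appear in only one of the two indices.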
Spot on. I will add in an ID to make the search more effective. Thank you very much for your time and assistance, @martin_mueller!