I have a simple web service with a request and response called DeliverySchedule. The request and response have a unique identifier called a GUID which are in pairs. I'm trying to find the duration (response time) between the response and request.
Here's my search which is returning some correct durations but also listing 0 for more than 400 events . I'm also missing a lot of events
index=unleashed "deliveryschedule" | transaction GUID startswith="Request" endswith="Reply" | timechart avg(duration)
GUID = "33efb817-1948-4c8f-bdf4-4111aa1941cc" in this case. We will only have 2 matching GUID's which are attached to the req/resp. They will be different for each req/resp. I also did a field extraction called 'GUID' to capture the GUID between the pipes..
REQUEST
DEBUG 2015-05-13 15:31:40,590
fterReceiveRequest - Request Record : |33efb817-1948-4c8f-bdf4-4111aa1941cc |
<GetDeliverySchedule xmlns="http://tempuri.org/">
<request>
<DeliveryType /DotCom_Delivery">A</DeliveryType>
<EndDate/DotCom_Delivery">2015-05-15</EndDate>
<Region_StoreNo/DotCom_Delivery">970</Region_StoreNo>
<Region_zip/DotCom_Delivery">11111</Region_zip>
<StartDate/DotCom_Delivery">2015-05-13</StartDate>
</request>
</GetDeliverySchedule>
RESPONSE
DEBUG 2015-05-13 15:31:41,276 BeforeSendReply - Response Record : |33efb817-1948-4c8f-bdf4-4111aa1941cc |
<GetDeliveryScheduleResponse xmlns="http://tempuri.org/">
<GetDeliveryScheduleResult xmlns:a="http://schemas.datacontract.org/2004/07/DotCom_Delivery" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<a:DeliveryCalendar>
<a:Ranges>
<a:DeliveryRange i:type="a:DeliveryDate">
<a:IsAvailableForDelivery>false</a:IsAvailableForDelivery>
<a:Date>2015-05-13T00:00:00-04:00</a:Date>
</a:DeliveryRange>
<a:DeliveryRange i:type="a:DeliveryDate">
<a:IsAvailableForDelivery>false</a:IsAvailableForDelivery>
<a:Date>2015-05-14T00:00:00-04:00</a:Date>
</a:DeliveryRange>
</a:Ranges>
<a:TypeOfCalendar>DayType</a:TypeOfCalendar>
<a:ZoneId>110</a:ZoneId>
</a:DeliveryCalendar>
<a:StatusCode>200</a:StatusCode>
<a:StatusMessage>OK</a:StatusMessage>
</GetDeliveryScheduleResult>
</GetDeliveryScheduleResponse>
Transaction does a ton of things that you don't appear to need. Have you tried something like this:
index=unleashed "deliveryschedule" | stats earliest(_time) AS firstTime latest(_time) AS lastTime by GUID | eval duration=lastTime-firstTime | timechart avg(duration)
I used your suggestion and pasted a snippet of the results below. Notice the difference between the first and last time, why do some have the same times? I looked at some of the events that had 0.000 for the duration and there was a noticeable difference in time.
Here's a request and response pair of timestamp from looking at the events from GUID 023b0868-8fff-4f52-8374-61a305f96ef4
Request = 2015-05-14 02:27:33,405 80361983ms
Response = 2015-05-14 02:27:33,810 80362388ms
RTG_GUID firstTime lastTime duration
| 021c4152-e86a-42d5-ad30-b0ace69ea807 | 1431597859.119 1431597860.070 0.951
| 023b0868-8fff-4f52-8374-61a305f96ef4 | 1431584853.405 1431584853.405 0.000
| 027e273f-3e7b-423e-833f-5062ddeafcae | 1431604857.327 1431604857.327 0.000
| 05ba2e35-0275-457a-93c6-cca482bc7c63 | 1431610171.550 1431610171.550 0.000
| 06f6b992-c42c-4409-9dd2-09ea62c99ebe | 1431581886.482 1431581886.841 0.359
| 0a3046db-42fb-4fdc-9845-374a9765b540 | 1431603377.069 1431603377.069 0.000
| 0a48e95c-6a4d-4b45-9334-6698d87f8d9e | 1431580396.203 1431580396.203 0.000
| 0a972800-4681-44e9-805c-915d5b5deeb0 | 1431590652.840 1431590653.339 0.499
| 0c279929-c087-4fce-a685-7a129af473fe | 1431591504.562 1431591504.562 0.000
| 1084be43-ec11-42fa-a240-0605c7921126 | 1431582171.221 1431582171.564 0.343
I suspect that your timestamping
may not be working as you are expecting it to. In other words, the timestamp inside your event is not actually the timetamp that Splunk has given to the event. Try this search and see if the _time
value matches what you see in each event (I expect that it won't):
... | table _raw _time
Looking at the events without |transaction
, does the timestamp to the left of the raw event appear correctly including subseconds?
Yes it does. All the timestamps are correct
Both _time
fields are extracted correctly?
Not sure. Can you elaborate on that?
So... Request and Response are linked together correctly by the transaction
, but the duration field is zero instead of zero point something?
Yes correct. Transaction is working as expected by grouping the request and response into the same event which have a matching guid. The duration for most is absolute 0. For the on one's that return a valid duration, they are equal to like 0.336 and 0.386.. we rarely have response times over a second. After i fix this issue ill change it from seconds to ms
Shouldn't it be endswith="Response"
?
Thanks for the response. I tried a lot of different strings for startswith="" and endswith="" with no luck.
The one I'm currently trying is startswith="fterReceiveRequest" endswith="BeforeSendReply"
and still returning 400+ values with duration = 0..
It's working correctly for some events but for 400+ of them, its returning duration=0. I even went to some of the events which have duration=0 and took the difference between the response and request and it's typically in the 200-300ms range, no events have duration = 0 from what I've seen.
The example I posted above is an event which has duration = 0