First the business case: We want a dashboard with a bar graph that shows the time a transaction spends at each step in the communication. e.g. 0.1 second at web server; 0.4 seconds at JVM; 1.2 seconds at application; 0.2 seconds back at JVM; 0.3 seconds at web server
Currently I can have extracted fields that contain a UID that is consistent through all logging, so I can easily | transaction UID
to get total duration. However, I need this split, not just by server, but by each step in the server.
Second the example data (the host=a or host=b isn't in the data, but is included for the illustration):
host=a 08/19/2015-16:18:57.532 - <snip> UID: 1440015537532502136 - <snip> - WebService Method Call Received for <snip>
host=a 08/19/2015-16:18:57.535 - <snip> UID: 1440015537532502136 - <snip> - Calling Siebel rfhValidation web service
host=a 08/19/2015-16:18:57.536 - <snip>: UID: 1440015537532502136 - Initializing <snip> Web Service
host=a 08/19/2015-16:18:57.538 - <snip> UID: 1440015537532502136 - Initializing <snip> Web Service <snip>
host=a 08/19/2015-16:18:57.644 - <snip> UID: 1440015537532502136 - <snip> Web Service initilized <snip>
host=b 8/19/2015 16:18:58:335 UID: 1440015537532502136 started <snip>
host=b 8/19/2015 16:18:58:507 UID: 1440015537532502136 <snip> finished <snip>
host=a 08/19/2015-16:18:58.849 - <snip> UID: 1440015537532502136 - <snip> - <snip> web service call finished
host=a 08/19/2015-16:18:58.853 - <snip> UID: 1440015537532502136 - <snip> - WebService Method Call Finished
08/25/2015-10:33:47.287 - WebContainer : 2 - DEBUG [com.cgi.mas.provider.services.SiebelService : Line 112]: UID: 1440513226275396914 - RequestForHearingValidation - Siebel rfhValidation web service call finished
08/25/2015-10:33:47.291 - WebContainer : 2 - DEBUG [com.cgi.mas.provider.services.Level3ServiceProvider : Line 112]: UID: 1440513226275396914 - RequestForHearingValidation - WebService Method Call Finished
So I'd like the result of 0.112 seconds for the first section of host a, 0.172 seconds for the section from host b, and then 0.004 seconds for the conclusion from host a. We will get more logs from more servers eventually, but I think if I can get a solid start on this, that will give me everything I need.
I'm thinking I can grab the relevant strings (e.g. Web Service initilized
or WebService Method Call Received
) and subtract the time difference, but I'm hoping there's something more elegant (and quicker).
Thank you!
Like this:
... | stats earliest(_time) AS StageStartTime latest(_time) AS StageEndTime BY host UID | eval StageDuration = StageEndTime - StageStartTime | eventstats first(StageEndTime) AS DelayStartTime last(StageStartTime) AS DelayEndTime BY UID | eval InterStageDelay = DelayEndTime - DelayStartTime | eval StageDuration = host . ": " . StageDuration | stats values(StageDuration) AS StageDurations first(InterStageDelay) AS InterStageDelay BY UID
NOTE: This is optimized for the 2-host case and will not work for more than 2 hosts.
I'm getting 0 for everything when that clearly isn't the case (e.g. duration is 2.488, but each Stage is 0). Also, I seem to be losing lines where the UID is there on both hosts (which is almost all of them). Working step-by-step, when I do the first BY host UID
I think is where I see the issue.
Regardless, I think you for your answer - that was along the lines of what I was hoping not to have to do, so I'll just plan to troubleshoot through it on my own and will likely open new questions for more specific issues if I run across them.
I have double-checked and the only thing that I think could be an issue is the order of the events headed into eventstats
so you might have to swap the words first
and last
. Other than that, it looks rock-solid.
This solution also presumes you have your inputs/fields correct so that _time
, host
, and UID
are proper for all events.