Splunk Enterprise

Time average on Dashboard - Two scenarios

Engager

Hi,
I am new to Splunk and am trying to find the average processing time for the two scenarios below.

Scenario 1 - Employer Request / Response data without ID.
In my log file, I have logs for requests received and responses sent, without any ID to tie a response back to its request.
E.g. Sample data

16-02-2017 17:00:32:329 Employer Request Received
16-02-2017 17:00:33:734 Employer Request Received
16-02-2017 17:00:34:532 Employer Response Sent
16-02-2017 17:00:37:454 Employer Response Sent

In the scenario above, the first request was received at 32 seconds past 5 pm and its response was sent at 34 seconds, so that transaction took 2 seconds to process. The second request was received at 33 seconds and its response was sent at 37 seconds, so it took 4 seconds.

I want a DASHBOARD panel showing how many requests were received and the average time, i.e. (2 + 4) / 2 = 3 seconds in this scenario.
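Just to make the arithmetic concrete, here is a small Python check of the expected average using the sample timestamps above (this is only an illustration, not part of any Splunk search):

```python
from datetime import datetime

# Timestamps copied from the sample data above
fmt = "%d-%m-%Y %H:%M:%S:%f"
requests = [datetime.strptime(s, fmt) for s in
            ("16-02-2017 17:00:32:329", "16-02-2017 17:00:33:734")]
responses = [datetime.strptime(s, fmt) for s in
             ("16-02-2017 17:00:34:532", "16-02-2017 17:00:37:454")]

# Pair the i-th request with the i-th response (first in, first out)
durations = [(rsp - req).total_seconds() for req, rsp in zip(requests, responses)]
avg = sum(durations) / len(durations)
print(round(avg, 2))
```

The 2 and 4 second figures above use whole seconds; with milliseconds the average is about 2.96, which still rounds to 3.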

Scenario 2 - Customer Request / Response data with ID.
In my log file, I have logs for requests received and responses sent, each with an ID.
E.g. Sample data

16-02-2017 17:00:39:329 Customer Request Received. ID - 1234
16-02-2017 17:00:42:734 Customer Request Received. ID - 2345
16-02-2017 17:00:45:532 Customer Response Sent. ID - 1234
16-02-2017 17:00:52:454 Customer Response Sent. ID - 2345

Similar to the above, I need the average on a DASHBOARD, as well as a REPORT showing how long each transaction took.

Report
Transaction id Request Time Response Time Total Time (in secs)
1234 16-02-2017 17:00:39:329 16-02-2017 17:00:45:532 6
2345 16-02-2017 17:00:42:734 16-02-2017 17:00:52:454 10

Dashboard
Employer transactions: number of requests received and the average time, which is 3 seconds here.
Customer transactions: number of requests received and the average time, which is 8 seconds here, i.e. (6 + 10) / 2.

Please help.


SplunkTrust

The good news is, the AVERAGE transaction time can be calculated without knowing which event goes with which other event.

| rex "Employer (?<ReqType>Request Received|Response Sent)"
| eval ReqTime=if(ReqType=="Request Received",_time,null())
| eval RespTime=if(ReqType=="Response Sent",_time,null())
| stats min(ReqTime) as minReqTime, min(RespTime) as minRespTime, 
    max(ReqTime) as maxReqTime, max(RespTime) as maxRespTime, 
    avg(ReqTime) as AvgReqTime, avg(RespTime) as AvgRespTime
| eval avgLagTime = AvgRespTime - AvgReqTime

There are a couple of problems with this approach. The primary one is that you need to make sure that the first event is a request. The secondary one is that you need to make sure that there are the same number of requests and responses.
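As a sanity check outside Splunk, the identity behind this search (the average lag equals avg(RespTime) minus avg(ReqTime) whenever requests and responses match one-to-one) can be verified in a few lines of Python. The offsets below are taken from the sample data; the variable names are mine:

```python
# avg(resp) - avg(req) equals the mean of the per-pair durations for ANY
# one-to-one pairing, because both reduce to (sum(resp) - sum(req)) / n.
req = [32.329, 33.734]   # request times, seconds past 17:00:00
rsp = [34.532, 37.454]   # response times

pairwise = sum(b - a for a, b in zip(req, rsp)) / len(req)
by_means = sum(rsp) / len(rsp) - sum(req) / len(req)

assert abs(pairwise - by_means) < 1e-9
print(round(pairwise, 2))
```

This is why no per-event pairing is needed for the average alone.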

Let me work on that a bit.


Okay, so this makes some random test data. The timing of the transactions will be swapped around at each execution, so that you can test various kinds of results.

| makeresults 
| eval mydata= mvappend("Employer Response Sent","Employer Request Received","Employer Request Received","Employer Response Sent","Employer Response Sent","Employer Request Received","Employer Response Sent","Employer Request Received","Employer Request Received","Employer Request Received","Employer Response Sent","Employer Response Sent") 
| mvexpand mydata 
| eval _raw=mydata 
| streamstats count as count1 
| eval myrand=random() 
| eval nextval= count1 + tonumber(substr(tostring(myrand),1,2)) 
| eval _time=_time + nextval 
| eval _raw = strftime(_time,"%Y-%m-%d %H:%M:%S  ")._raw
| table _time _raw 
| rex "Employer (?<ReqType>Request Received|Response Sent)"
| sort _time -ReqType

This part breaks the records out by type. We assume that we always need to receive a request before we send a response, so we'll end up having to scan all the data, see the maximum number of "extra" responses we have at any given time, and then kill that many responses off the front of the results.

If two items have the exact same time, we need to process the response first: since we cannot provide instant responses, that has to be a response to an earlier request. Requests count as negative, responses as positive, so the highest positive cumoffset is the number of responses to strip off the front. There MIGHT be more extras than that, but we can't detect that at this level.

| rex "Employer (?<ReqType>Request Received|Response Sent)"
| eval offset = if(ReqType=="Request Received",-1,1)
| accum offset as cumoffset
| streamstats count as count2 by ReqType
| eventstats max(cumoffset) as killsent
| where (count2>killsent) OR (ReqType=="Request Received")
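For anyone following along, the accum/eventstats trick above can be mirrored in plain Python (a sketch only; the event strings and variable names are simplified from the SPL):

```python
events = ["Response Sent", "Request Received", "Request Received",
          "Response Sent", "Response Sent", "Request Received"]

# Requests count -1, responses +1; the running maximum of the cumulative
# sum (cumoffset) is how many responses have no earlier matching request.
cum, killsent = 0, 0
for e in events:
    cum += 1 if e == "Response Sent" else -1
    killsent = max(killsent, cum)

# Drop the first `killsent` responses, keep everything else
kept, dropped = [], 0
for e in events:
    if e == "Response Sent" and dropped < killsent:
        dropped += 1
        continue
    kept.append(e)
print(killsent, kept)
```

Here the leading orphan response is dropped and every later event survives.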

Now we've killed the excess responses off the front, so the requests and responses can be paired correctly. In _time order, we assign a transaction number to each event, partitioned by ReqType.

| accum offset as cumoffset2
| streamstats count as tranNo by ReqType
| sort _time tranNo ReqType

For your test purposes, when you've coded the above to use your live data, you should look at the output at this point and see if the match-up is reasonable. Now we get to the juicy part.

| transaction tranNo
| where eventcount=2
| eventstats count as numberoftrans, avg(duration) as avgduration

That gets you your average, but you can still review how the transactions were lined up. Once you like how the results look for your actual data, change that last line to stats instead of eventstats, and the transaction-level details will disappear.
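If it helps to see the pairing step outside SPL, here is a minimal Python equivalent of the transaction/eventcount filtering. The epoch offsets are illustrative and the variable names are my own:

```python
# After the cleanup above, the Nth request pairs with the Nth response
# (tranNo in the SPL). Offsets are seconds past some epoch, for illustration.
events = [(32.329, "req"), (33.734, "req"),
          (34.532, "rsp"), (37.454, "rsp"),
          (40.000, "req")]            # one unmatched trailing request

reqs = [t for t, kind in sorted(events) if kind == "req"]
rsps = [t for t, kind in sorted(events) if kind == "rsp"]

# zip keeps only complete pairs, like the SPL's `where eventcount=2`
pairs = list(zip(reqs, rsps))
durations = [rsp - req for req, rsp in pairs]
avg_duration = sum(durations) / len(durations)
print(len(pairs), round(avg_duration, 2))
```

The unmatched third request is silently dropped, just as the `where eventcount=2` line drops incomplete transactions.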


Legend

For the employer log, there appears to be no way to correlate the request received with the response sent, so there is no way for Splunk to calculate the transaction times.

For the customer log, you can do this:

yoursearchhere
| rex "ID - (?<transaction>\d+)"
| stats earliest(_time) as request_time latest(_time) as response_time by transaction
| eval total_time = response_time - request_time
| eval request_time = strftime(request_time,"%x %X")
| eval response_time = strftime(response_time,"%x %X")
| appendpipe [ stats count sum(total_time) as overall_total 
      | eval avg_time = round(overall_total/count,1) 
      | eval transaction="Number of transactions = "  .  count
      | eval response_time = "Avg Time = "  .  avg_time  . " seconds"
      | table transaction response_time ]

I created the field extraction for the transaction id, but you might not need to do that if the field has already been created.
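The same earliest/latest-by-ID logic can be sketched in Python for anyone who wants to check the expected numbers before building the search (only an illustration; the regex matches the sample log format above):

```python
import re
from datetime import datetime

logs = [
    "16-02-2017 17:00:39:329 Customer Request Received. ID - 1234",
    "16-02-2017 17:00:42:734 Customer Request Received. ID - 2345",
    "16-02-2017 17:00:45:532 Customer Response Sent. ID - 1234",
    "16-02-2017 17:00:52:454 Customer Response Sent. ID - 2345",
]

fmt = "%d-%m-%Y %H:%M:%S:%f"
times = {}  # id -> (earliest, latest), like stats earliest()/latest() by transaction
for line in logs:
    ts = datetime.strptime(line[:23], fmt)
    tid = re.search(r"ID - (\d+)", line).group(1)
    lo, hi = times.get(tid, (ts, ts))
    times[tid] = (min(lo, ts), max(hi, ts))

for tid, (req, rsp) in sorted(times.items()):
    print(tid, round((rsp - req).total_seconds(), 1))  # per-transaction seconds
```

This prints 6.2 seconds for ID 1234 and 9.7 for ID 2345, matching the report above (which rounds to whole seconds).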


Champion

The community can most efficiently help you if you share information about the logs that shows how fields are extracted. Do you have a simple search example that shows the field names for your logs?


Engager

Thanks @rjthibod for the response. The sample data I have given is exactly how the data is printed in the log file, i.e. a datetime followed by a message (viz. Request Received and Response Sent).

There are of course many other log lines around these, but I have included only the lines that matter for the report and dashboard.

I am new to Splunk and learning the basics right now. As of now I just get the records with the following search:
E.g. sourcetype="server.log" AND ("Employer Request Received" OR "Employer Response Sent")

Not sure what next to get the required count and average time on Dashboard.
