Transactions per Second and Percentiles from a Traefik Access Log

shashinandan
Explorer

Hi,

I have a Traefik access log in JSON format with the values below.

--- Value Samples ---

Duration: 109249593 ==> [The total time taken to process the response, in nanoseconds]

time: 2021-03-20T09:30:01-07:00 ==> [The request time]

RequestAddr: example.domain.com ==> [The HTTP Host header]

------

I am trying to use these 3 key/value pairs to calculate TPS and also percentiles (using Duration as the response time metric).

This is the query I currently use for TPS.

---

index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval count=1 | timechart per_second(count) as TPS by RequestAddr

----

I am assuming the above query will not give the actual TPS, since the logs are buffered before being written to file and are then pushed by the Splunk forwarder.

I am still a noob figuring out Splunk.

Please let me know if you have any ideas for calculating TPS and percentiles from the value samples.

Thank you!


scelikok
SplunkTrust

Hi @shashinandan,

I meant that the monitor input that reads these JSON files should use the time field as the event timestamp. Your events will then have correct timestamps, and you will not need to convert time to _time with eval.

Your first query should work. Could you please post a few events so we can diagnose the problem?

If this reply helps you, an upvote and "Accept as Solution" is appreciated.
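As a rough illustration of what using the time field as the timestamp can look like, here is a minimal props.conf sketch. The stanza name is taken from the sourcetype used in this thread; the TIME_PREFIX, TIME_FORMAT, and MAX_TIMESTAMP_LOOKAHEAD values are assumptions based on the sample time value 2021-03-20T09:30:01-07:00, not settings confirmed by anyone here, and they should be checked against your own events.

```
# Sketch only: values are assumptions derived from the sample event.
[access_log]
# Start timestamp recognition right after the "time" key, not at StartLocal or the Date headers
TIME_PREFIX = "time":"
# Year-month-day, T, hour:minute:second, then a UTC offset written with a colon (-07:00)
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%:z
# The sample timestamp is 25 characters long
MAX_TIMESTAMP_LOOKAHEAD = 25
```

Timestamp settings like these only affect data indexed after the change, so existing events keep whatever _time they were indexed with.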

shashinandan
Explorer

Hello @scelikok,

Thank you for responding!

Please find below an event sample.

{"BackendAddr":"10.0.19.45:8080","BackendName":"backend-<servicename>","BackendURL":{"Scheme":"http","Opaque":"","User":null,"Host":"10.0.19.45:8080","Path":"","RawPath":"","ForceQuery":false,"RawQuery":"","Fragment":""},"ClientAddr":"10.255.0.38:47741","ClientHost":"10.255.0.38","ClientPort":"47741","ClientUsername":"-","DownstreamContentSize":100574,"DownstreamStatus":200,"DownstreamStatusLine":"200 OK","Duration":95292255,"FrontendName":"PathPrefix-<servicename>","OriginContentSize":100574,"OriginDuration":95016703,"OriginStatus":200,"OriginStatusLine":"200 OK","Overhead":275552,"RequestAddr":"example.domain.com","RequestContentSize":0,"RequestCount":266290620,"RequestHost":"example.domain.com","RequestLine":"GET /<Request> HTTP/1.1","RequestMethod":"GET","RequestPath":"/<Request>","RequestPort":"-","RequestProtocol":"HTTP/1.1","RetryAttempts":0,"StartLocal":"2021-03-23T19:19:51.901899656-07:00","downstream_Content-Type":"application/json","downstream_Date":"Wed, 24 Mar 2021 02:19:51 GMT","level":"info","msg":"","origin_Content-Type":"application/json","origin_Date":"Wed, 24 Mar 2021 02:19:51 GMT","request_Content-Type":"application/json","request_X-Device-Type":"Handheld with SIM","request_X-Transaction-Id":"OAbYkrBVYFip0fc8azAQg","time":"2021-03-23T19:19:51-07:00"}

 

I have verified that "_time" and "time" have the same value by running the query below. The monitor input is reading the "time" field as the timestamp.

"index=myindex sourcetype=access_log  | table time _time StartLocal"

Also, as you said, I was able to get a result using the query below.

"index=myindex sourcetype=access_log RequestAddr=*.domain.com | timechart span=1s count as TPS by RequestAddr"

I hope this is an accurate way to see how many transactions the application is handling per second.

Thanks!


scelikok
SplunkTrust

Hi @shashinandan,

If the time field in your log file shows the actual time of the request (that is, it is not affected by buffering) and this field is used as _time at index time, timechart per_second(count) should calculate correctly. You don't need eval count=1.

If your time field reflects the buffered write time rather than the request time, there is no way to calculate the actual TPS.
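For the percentile part of the original question, a minimal sketch follows. It assumes Duration is in nanoseconds, as described in the first post, and converts it to milliseconds. The field name duration_ms, the output names p50_ms, p95_ms, and p99_ms, and the 1-minute span are illustrative choices, not something taken from this thread.

```
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval duration_ms = Duration / 1000000
| timechart span=1m perc50(duration_ms) as p50_ms perc95(duration_ms) as p95_ms perc99(duration_ms) as p99_ms
```

For the sample event with Duration 95292255, duration_ms comes out at roughly 95.3. Adding "by RequestAddr" to the timechart should split each percentile per host, at the cost of more series on the chart.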


shashinandan
Explorer

Hello @scelikok 

Thank you for your inputs. I might be doing something wrong. Below are the 2 queries that I tried based on your inputs; neither returned any results. Can you please let me know what you meant by "using this field as _time while indexing"?

And yes, the "time" is not buffered, it is the actual request time.

--- Queries ---

1.

index=myindex sourcetype=access_log RequestAddr=*.domain.com | eval _time=strptime(time,"%FT%T%z") | timechart per_second(count) by RequestAddr

2.

index=myindex sourcetype=access_log RequestAddr=*.domain.com | timechart per_second(count) by RequestAddr

---
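One possible explanation for the empty results, offered as an assumption rather than something confirmed in this thread: per_second() works on the values of a numeric field, and these events have no field named count unless an earlier eval creates it. A sketch that keeps the eval from the original query (the 1-minute span is an illustrative choice):

```
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval count=1
| timechart span=1m per_second(count) as TPS by RequestAddr
```

With span=1m, each point is the average rate over that minute. Using "timechart span=1s count" instead gives the raw number of events in each individual second.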

Thanks
