Hi,
I have Traefik access logs in JSON format with the values below.
--- Value Samples ---
Duration: 109249593 ==> [The total time taken (in nanoseconds) to process the response]
time: 2021-03-20T09:30:01-07:00 ==> [The request time]
RequestAddr: example.domain.com ==> [The HTTP Host header]
------
I am trying to use these 3 key/value pairs to calculate TPS and also percentiles (using Duration as the response time metric).
So far I have the query below for TPS.
---
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval count=1 | timechart per_second(count) as TPS by RequestAddr
---
I am assuming the above query will not give the actual TPS, since the logs are buffered before being written to file and are then pushed by the Splunk forwarder.
Still a noob figuring out Splunk.
Please let me know if you have any ideas on how to calculate TPS and percentiles from these sample values.
Thank you!
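On the buffering concern: timechart buckets events by _time, not by when they were indexed, so forwarder delay should only distort TPS if _time is not taken from the log's own time field. A minimal sketch for measuring that delay, assuming the index and sourcetype from the query above; lag_s and the *_lag_s names are hypothetical fields introduced here for illustration:

```spl
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval lag_s = _indextime - _time
| stats min(lag_s) AS min_lag_s avg(lag_s) AS avg_lag_s max(lag_s) AS max_lag_s by RequestAddr
```

A nonzero lag here only means events arrive late. Per-second counts for older time ranges should be unaffected, although the most recent few seconds of a search may undercount until the late events are indexed.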
Hi @shashinandan,
I meant that the monitor input that reads these JSON files should use the time field as the timestamp. Your events will then have correct timestamps, and you will not need to convert time to _time with eval.
Your first query should work. Could you please post a few events so we can work out the problem?
Hello @scelikok,
Thank you for responding!
Please find below an event sample.
{"BackendAddr":"10.0.19.45:8080","BackendName":"backend-<servicename>","BackendURL":{"Scheme":"http","Opaque":"","User":null,"Host":"10.0.19.45:8080","Path":"","RawPath":"","ForceQuery":false,"RawQuery":"","Fragment":""},"ClientAddr":"10.255.0.38:47741","ClientHost":"10.255.0.38","ClientPort":"47741","ClientUsername":"-","DownstreamContentSize":100574,"DownstreamStatus":200,"DownstreamStatusLine":"200 OK","Duration":95292255,"FrontendName":"PathPrefix-<servicename>","OriginContentSize":100574,"OriginDuration":95016703,"OriginStatus":200,"OriginStatusLine":"200 OK","Overhead":275552,"RequestAddr":"example.domain.com","RequestContentSize":0,"RequestCount":266290620,"RequestHost":"example.domain.com","RequestLine":"GET /<Request> HTTP/1.1","RequestMethod":"GET","RequestPath":"/<Request>","RequestPort":"-","RequestProtocol":"HTTP/1.1","RetryAttempts":0,"StartLocal":"2021-03-23T19:19:51.901899656-07:00","downstream_Content-Type":"application/json","downstream_Date":"Wed, 24 Mar 2021 02:19:51 GMT","level":"info","msg":"","origin_Content-Type":"application/json","origin_Date":"Wed, 24 Mar 2021 02:19:51 GMT","request_Content-Type":"application/json","request_X-Device-Type":"Handheld with SIM","request_X-Transaction-Id":"OAbYkrBVYFip0fc8azAQg","time":"2021-03-23T19:19:51-07:00"}
I have verified that "_time" and "time" have the same value by running the query below. The monitor input is reading the "time" field as the timestamp.
"index=myindex sourcetype=access_log | table time _time StartLocal"
Also, as you said, I was able to get a result using the query below.
"index=myindex sourcetype=access_log RequestAddr=*.domain.com | timechart span=1s count as TPS by RequestAddr"
Hoping this is an accurate way to know how many transactions the application is handling per second.
Thanks!
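For the percentile half of the question, a sketch along the same lines, assuming Duration is in nanoseconds as described in the original post. The field name duration_ms, the 1-minute span, and the choice of the 50th/95th/99th percentiles are illustrative assumptions, not something confirmed in this thread:

```spl
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval duration_ms = Duration / 1000000
| timechart span=1m perc50(duration_ms) AS p50_ms perc95(duration_ms) AS p95_ms perc99(duration_ms) AS p99_ms by RequestAddr
```

With the sample event above, Duration=95292255 becomes roughly 95.3 ms. The perc functions can return approximate values on large result sets; exactperc95 and its siblings are the exact variants, at a higher search cost.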
Hi @shashinandan,
If the time field in your log file shows the actual time of the request (not affected by buffering) and you are using this field as _time while indexing, timechart per_second(count) should calculate correctly. You don't need eval count=1.
If your time field is affected by buffering, there is no way to calculate it.
Hello @scelikok
Thank you for your inputs. I might be doing something wrong. Below are the 2 queries I tried based on your inputs; both returned no results. Can you please let me know what you meant by "using this field as _time while indexing"?
And yes, "time" is not buffered; it is the actual request time.
--- Queries ---
1.
index=myindex sourcetype=access_log RequestAddr=*.domain.com | eval _time=strptime(time,"%FT%T%z") | timechart per_second(count) by RequestAddr
2.
index=myindex sourcetype=access_log RequestAddr=*.domain.com | timechart per_second(count) by RequestAddr
---
Thanks
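Two things may explain the empty results; these are guesses rather than a confirmed diagnosis. per_second() aggregates an existing numeric field, so without eval count=1 there is no count field for query 2 to work on. In query 1, %z may not match an offset written with a colon (-07:00); Splunk's %:z variant is intended for that form, and a failed strptime leaves _time null. A sketch under those assumptions:

```spl
index=myindex sourcetype=access_log RequestAddr=*.domain.com
| eval _time = strptime(time, "%Y-%m-%dT%H:%M:%S%:z")
| eval count = 1
| timechart span=1s per_second(count) AS TPS by RequestAddr
```

The _time override is only needed if the input is not already using time as the timestamp. At a 1-second span, timechart span=1s count AS TPS by RequestAddr should give the same numbers without the helper field.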