Splunk Search

How to group the data by API name, date, and various response times?

kuul13
Explorer

I have tried to write a query that outputs the transaction counts and response times, but I'm not sure how to group it by API and date. Here is what I have written so far:

index=my_app sourcetype=my_logs:hec (source=my_Logger) msgsource="*" msgtype="*MyClient*" host=*
    [ | inputlookup My_Application_Mapping.csv | search Client="SomeBank" | table appl ]
| rex field=elapsed "^(?<minutes>\d+):(?<seconds>\d+)\.(?<milliseconds>\d+)"
| eval total_milliseconds = (tonumber(minutes) * 60 * 1000) + (tonumber(seconds) * 1000) + tonumber(milliseconds)
| timechart span=1m cont=f usenull=f useother=f count(total_milliseconds) as AllTransactions
    avg(total_milliseconds) as AvgDuration
    count(eval(total_milliseconds<=1000)) as "TXN_1000"
    count(eval(total_milliseconds>1000 AND total_milliseconds<=2000)) as "1sec-2sec"
    count(eval(total_milliseconds>2000 AND total_milliseconds<=5000)) as "2sec-5sec"
    count(eval(total_milliseconds>5000)) as "5sec+"
| timechart span=1d sum(AllTransactions) as "Total" avg(AvgDuration) as AvgDur
    sum("TXN_1000")
    sum("1sec-2sec")
    sum("2sec-5sec")
    sum("5sec+")


`msgsource` has my API name. The output of the above query is:

_time | Total | AvgDur | sum(TXN_1000) | sum(1sec-2sec) | sum(2sec-5sec) | sum(5sec+)
2025-07-10 | 10000 | 162.12312322 | 1000 | 122 | 1

I want the final output to be:

_time | API | Total | AvgDur | sum(TXN_1000) | sum(1sec-2sec) | sum(2sec-5sec) | sum(5sec+)
2025-07-10 | RetrievePay2 | 10000 | 162.12312322 | 1000 | 122 | 1
2025-07-10 | RetrievePay5 | 2000 | 62.12131244 | 333 | 56 | 2
2025-07-09 | RetrievePay2 | 10000 | 162.12312322 | 1000 | 122 | 1
2025-07-09 | RetrievePay5 | 2000 | 62.12131244 | 333 | 56 | 2


Any help is appreciated. Thanks!


yuanliu
SplunkTrust

I know you accepted an answer in "Is there a way to group the data by time and another field?", but there is a simpler, perhaps more intuitive approach.

index=my_app sourcetype=my_logs:hec (source=my_Logger) msgsource="*" msgtype="*MyClient*" host=*
    [ | inputlookup My_Application_Mapping.csv | search Client="SomeBank" | table appl ]
| eval total_milliseconds = 1000 * (strptime("00:" . elapsed, "%T.%N") - relative_time(now(), "-0d@d"))
| eval timebucket = case(total_milliseconds <= 1000, "TXN_1000", total_milliseconds <= 2000, "1sec-2sec",
  total_milliseconds <= 5000, "2sec-5sec", true(), "5sec+")
| rename msgsource as API
| bucket _time span=1d
| eventstats avg(total_milliseconds) as AvgDur by _time API
| stats count by AvgDur _time API timebucket
| tojson output_field=api_time _time API AvgDur
| chart values(count) over api_time by timebucket
| addtotals
| spath input=api_time
| rename time as _time
| fields - api_time

Main ideas:

  • Organizing the time buckets with the case function is easier to maintain.
  • Use the chart command to transpose over one composite field (api_time).
  • Use tojson to pack all the information that is needed after the transpose (see the sketch just below this list).
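
To make the transpose concrete, here is a minimal, self-contained sketch of the tojson/chart/spath round trip on mock data (the field and bucket names here are illustrative, not from your search):

| makeresults format=csv data="API,day,timebucket,count
retrieveAPI,2025-07-31,TXN_1000,3
retrieveAPI,2025-07-31,5sec+,1
createAPI,2025-07-31,TXN_1000,2"
``` pack the two group-by fields into one composite JSON field ```
| tojson output_field=api_time API day
``` pivot: one row per api_time, one column per timebucket ```
| chart values(count) over api_time by timebucket
| addtotals
``` unpack API and day back into separate columns ```
| spath input=api_time
| fields - api_time

The chart command can only pivot over a single field, so packing the group-by fields into one JSON string and unpacking them afterwards is what preserves both API and date through the transpose.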

In addition, using strptime to calculate total_milliseconds is more maintainable than extracting the pieces with rex. (It would be even simpler if Splunk didn't have a bug near epoch zero.)
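
For illustration, a minimal sketch of that conversion on a single elapsed value (00:06.256 should come out as 6256):

| makeresults
| eval elapsed = "00:06.256"
``` prefix "00:" so MM:SS.mmm becomes HH:MM:SS.mmm, parse it as a time of day, then subtract today's midnight to leave just the elapsed seconds ```
| eval total_milliseconds = 1000 * (strptime("00:" . elapsed, "%T.%N") - relative_time(now(), "-0d@d"))

strptime fills in today's date when given a time-only value, which is why subtracting relative_time(now(), "-0d@d") is needed before scaling to milliseconds.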


kuul13
Explorer

Thanks for the response. 

I was trying to check the accuracy of both queries, the one you provided and the one from the accepted answer, because I see a difference in the counts. I edited the queries to use some mock data. I am not able to use the mock data with the query from the accepted answer; I would appreciate it if you could help me fix that so that I can compare the results.

| makeresults format=csv data="interactionid,_time,elapsed,msgsource
1,2025-07-31,00:00.756,retrieveAPI
2,2025-07-31,00:00.556,createAPI
3,2025-07-31,00:00.156,createAPI
4,2025-07-31,00:00.256,updateAPI
5,2025-07-31,00:00.356,retrieveAPI
6,2025-07-31,00:00.156,retrieveAPI
7,2025-07-31,00:01.056,createAPI
8,2025-07-31,00:00.256,retrieveAPI
9,2025-07-31,00:06.256,updateAPI
10,2025-07-31,00:10.256,createAPI"
| rex field=elapsed "^(?<minutes>\\d+):(?<seconds>\\d+)\\.(?<milliseconds>\\d+)"
| eval TimeMilliseconds = (tonumber(minutes) * 60 * 1000) + (tonumber(seconds) * 1000) + (tonumber(milliseconds))
| timechart span=1d count as AllTransactions,  
avg(TimeMilliseconds) as AvgDuration
count(eval(TimeMilliseconds<=1000)) as "TXN_1000",
count(eval(TimeMilliseconds>1000 AND TimeMilliseconds<=2000)) as "1sec-2sec"
count(eval(TimeMilliseconds>2000 AND TimeMilliseconds<=5000)) as "2sec-5sec" by msgsource
| untable _time msgsource count
| eval group=mvindex(split(msgsource,": "),0)
| eval msgsource=mvindex(split(msgsource,": "),1)
| eval _time=_time.":".msgsource
| xyseries _time group count
| eval msgsource=mvindex(split(_time,":"),1)
| eval _time=mvindex(split(_time,":"),0)
| table _time msgsource AllTransactions AvgDuration TXN_1000 "1sec-2sec" "2sec-5sec"

This query creates the table, but the counts are all 0s.

And here is the edited version of the query you shared, which shows results:

| makeresults format=csv data="interactionid,_time,elapsed,msgsource
1,2025-07-31,00:00.756,retrieveAPI
2,2025-07-31,00:00.556,createAPI
3,2025-07-31,00:00.156,createAPI
4,2025-07-31,00:00.256,updateAPI
5,2025-07-31,00:00.356,retrieveAPI
6,2025-07-31,00:00.156,retrieveAPI
7,2025-07-31,00:01.056,createAPI
8,2025-07-31,00:00.256,retrieveAPI
9,2025-07-31,00:06.256,updateAPI
10,2025-07-31,00:10.256,createAPI"
| eval total_milliseconds = 1000 * (strptime("00:" . elapsed, "%T.%N") - relative_time(now(), "-0d@d"))
| eval timebucket = case(total_milliseconds <= 1000, "TXN_1000", total_milliseconds <= 2000, "1sec-2sec",
  total_milliseconds <= 5000, "2sec-5sec", true(), "5sec+")
| rename msgsource as API
| bucket _time span=1d
| eventstats avg(total_milliseconds) as AvgDur by _time API
| stats count by AvgDur _time API timebucket
| tojson output_field=api_time _time API AvgDur
| chart values(count) over api_time by timebucket
| addtotals
| spath input=api_time
| rename time as _time
| fields - api_time

Your query shows the correct result, but the fields are not in the order I want to display. Any help fixing both queries would be appreciated.


yuanliu
SplunkTrust

Your query shows the correct result, but the fields are not in the order I want to display.

The easiest way to get the display order you want is to use the table command, like

| table _time API Total AvgDur TXN_1000 "1sec-2sec" "2sec-5sec" "5sec+"

@ITWhisperer already pointed out the mistake in your data emulation, which is fixed by strptime. The two methods give exactly the same result. The only difference is in the field names, which can easily be coordinated.
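
For example, if you want my search to use the field names from your original search, a rename at the end would do it (this mapping is my assumption based on the output fields of the two queries above):

``` align my output field names with the other query's names ```
| rename API as msgsource, Total as AllTransactions, AvgDur as AvgDuration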


ITWhisperer
SplunkTrust

The first search does not have a valid value for _time; you need to parse the time value from the dummy data:

| makeresults format=csv data="interactionid,_time,elapsed,msgsource
1,2025-07-31,00:00.756,retrieveAPI
2,2025-07-31,00:00.556,createAPI
3,2025-07-31,00:00.156,createAPI
4,2025-07-31,00:00.256,updateAPI
5,2025-07-31,00:00.356,retrieveAPI
6,2025-07-31,00:00.156,retrieveAPI
7,2025-07-31,00:01.056,createAPI
8,2025-07-31,00:00.256,retrieveAPI
9,2025-07-31,00:06.256,updateAPI
10,2025-07-31,00:10.256,createAPI"
| eval _time=strptime(_time,"%F")
| rex field=elapsed "^(?<minutes>\d+):(?<seconds>\d+)\.(?<milliseconds>\d+)"
| eval TimeMilliseconds = (tonumber(minutes) * 60 * 1000) + (tonumber(seconds) * 1000) + (tonumber(milliseconds))
| timechart span=1d count as AllTransactions,  
avg(TimeMilliseconds) as AvgDuration
count(eval(TimeMilliseconds<=1000)) as "TXN_1000",
count(eval(TimeMilliseconds>1000 AND TimeMilliseconds<=2000)) as "1sec-2sec"
count(eval(TimeMilliseconds>2000 AND TimeMilliseconds<=5000)) as "2sec-5sec" by msgsource
| untable _time msgsource count
| eval group=mvindex(split(msgsource,": "),0)
| eval msgsource=mvindex(split(msgsource,": "),1)
| eval _time=_time.":".msgsource
| xyseries _time group count
| eval msgsource=mvindex(split(_time,":"),1)
| eval _time=mvindex(split(_time,":"),0)
| table _time msgsource AllTransactions AvgDuration TXN_1000 "1sec-2sec" "2sec-5sec"