I want to group similar URLs as one and get the count and average times

pandeyrohit51
Explorer

My query is:

index=stuff
| search "kubernetes.labels.app"="some_stuff" "log.msg"="Response" "log.level"=30 "log.response.statusCode"=200
| spath "log.request.path"
| rename "log.request.path" as url
| convert timeformat="%Y/%m/%d" ctime(_time) as date
| stats min("log.context.duration") as RT_fastest max("log.context.duration") as RT_slowest p95("log.context.duration") as RT_p95 p99("log.context.duration") as RT_p99 avg("log.context.duration") as RT_avg count(url) as Total_Req by url

This gives me the result shown in the attached screenshot. I want to group all similar APIs together, for example treat every /getFile/* call as one API, and get the average time for the group.


Screenshot 2024-10-08 at 4.55.14 PM.png

ITWhisperer
SplunkTrust

Essentially you need to extract from the url field the part that you want. For example, is it always the first two parts, or fewer, or only applied to particular urls? Please describe your requirement in more detail.

pandeyrohit51
Explorer

I want all the APIs matching /reports/getFile/* grouped as one, and then take the p95, p99, average, count, etc. over the group. I don't want them as separate entries, since the endpoint is the same and only the id differs.

/getFile/1
/getFile/2
/getFile/3

These should be grouped as one entry, /getFile, with the p95, p99, and count calculated across all three.

/getFile: count = 3 (one each for /1, /2, /3), p95 = the p95 over all three calls treated as one API (/getFile/*), and so on.

ITWhisperer
SplunkTrust
| eval url=if(mvindex(split(url,"/"),1)="getFile","/getFile",url)
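
To see why index 1 is the right one here, this is a small self-contained sketch that can be pasted into a search bar without touching real data. It assumes urls of the form /getFile/<number>; the sample values and the intermediate field names segments and second are made up for illustration. Because the url starts with "/", split() returns an empty first element, so "getFile" sits at index 1.

```spl
| makeresults count=3
| streamstats count as id
| eval url="/getFile/".id
| eval segments=split(url,"/")
| eval second=mvindex(segments,1)
| eval url=if(second="getFile","/getFile",url)
| stats count as Total_Req by url
```

In the original query, the eval goes after the rename and before the stats, so that the statistics are computed over the normalised url.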

pandeyrohit51
Explorer

Thanks a lot, this really helps with the current problem. But is there a generic way to do this? Otherwise we would have to write this condition for every API that follows this pattern.

e.g.

| eval url=if(mvindex(split(url,"/"),1)="getFile","/getFile",url)
| eval url=if(mvindex(split(url,"/"),1)="import","/import",url)

 

ITWhisperer
SplunkTrust

If you can explain the algorithm for determining how the url is to be changed, we might be able to help you - currently your requirement is too vague.

pandeyrohit51
Explorer

@ITWhisperer 
Based on your response, I changed my query to the following:

index=stuff "kubernetes.labels.app"="some-stuff"
| search "log.msg"="Response" "log.level"=30 "log.response.statusCode"=200
| spath "log.request.path"
| rename "log.request.path" as url
| eval url=if(mvindex(split(url,"/"),4)="namespace","/attribute/namespace/{id}",url)
| eval url=if(mvindex(split(url,"/"),2)="schema","/spec-api/schema/{id}",url)
| convert timeformat="%Y/%m/%d" ctime(_time) as date
| stats min("log.context.duration") as RT_fastest max("log.context.duration") as RT_slowest p95("log.context.duration") as RT_p95 p99("log.context.duration") as RT_p99 avg("log.context.duration") as RT_avg count(url) as Total_Req by url
| sort Total_Req desc

 

As you can see, I had to write the eval twice for two different endpoints. As my application grows, more APIs (endpoints) with the same pattern may appear, and I would have to write an eval for each one of them.

So I was wondering whether there is a more generic way to group these kinds of APIs into one, rather than writing the eval again and again. I looked into the cluster command, but was not able to get anything useful out of it.

ITWhisperer
SplunkTrust

Another possibility is to use the sed mode of the rex command to replace the id part with a fixed value. This relies on the id having an identifiable format. You may need to work with your application designers to ensure that all ids follow a particular pattern or set of patterns; otherwise you may end up needing more rex commands to handle the different id formats.
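
As a rough sketch of the sed-mode approach, assuming the ids are purely numeric path segments (that is an assumption, not something confirmed in this thread): the sample urls and the {id} placeholder below are made up for illustration. The expression replaces any all-digit segment that is followed by "/" or the end of the string with a fixed token. The four sample urls should then collapse to /getFile/{id}, /import/{id}/status, and /health.

```spl
| makeresults
| eval url=split("/getFile/1,/getFile/2,/import/42/status,/health", ",")
| mvexpand url
| rex field=url mode=sed "s/\/\d+(?=\/|$)/\/{id}/g"
| stats count as Total_Req by url
```

In the real search, the single rex line would go after the rename and before the stats, in place of the per-endpoint eval commands. If some ids are UUIDs or other non-numeric tokens, each such format would likely need its own pattern or an additional rex.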
