Splunk Search

Optimizing Splunk Search Query without Using Joins and Subquery due to Record Limitation

Arpit-Dwivedi
New Member

 

Hello Community,


I'm seeking some guidance with optimizing a Splunk search query that involves multiple table searches and joins. The primary issue I'm encountering is the limitation imposed by subqueries, restricting the total records to 50,000.
Here's the current query structure I'm working with:


index="sample" "message.process"="*app-name1" "message.flowName"="*| *"
| rex field=message.correlationId "(?<UUID>^[0-9a-z-]{0,36})"
| rename "message.flowName" as sapi-outbound-call
| stats count by sapi-outbound-call UUID
| join type=inner UUID
[search index="sample" "message.process"="*app-name2" "message.flowName"="*| *"
| rex field=message.correlationId "(?<UUID>^[0-9a-z-]{0,36})"
| rename "message.flowName" as exp-inbound-call]
| stats count by exp-inbound-call sapi-outbound-call
| join left=L right=R where L.exp-inbound-call = R.exp-inbound-call
[search index="sample" "message.process"="*app-name2" "message.flowName"="*| *"
| rename "message.flowName" as exp-inbound-call
| stats count by exp-inbound-call]
| stats list(*) AS * by R.exp-inbound-call R.count
| table R.exp-inbound-call R.count L.sapi-outbound-call L.count

The intention behind this query is to generate statistics based on two query searches or tables while filtering out data based on a common UUID. However, the usage of multiple joins within subqueries is causing limitations due to the 50,000 record cap.


I'm looking for alternative approaches or optimizations to achieve the same result without relying heavily on joins within subqueries. Any insights, suggestions, or examples would be incredibly valuable.
Thank you in advance for your help and expertise!


Regards

Labels (5)
0 Karma

tscroggins
Influencer

 

 

Hi @Arpit-Dwivedi,

A description of your call graph may better illustrate the problem, but given your SPL, I would expect the following output, where [] represents a list or array:

 

R.exp-inbound-call R.count L.sapi-outbound-call[] L.count[]

 

You can often reduce joins to one or more stats commands. This may produce the expected output:

 

index=sample message.process IN (*app-name1 *app-name2) message.flowName="*| *"
| rex field=message.correlationId "(?<UUID>^[0-9a-z-]{0,36})"
| eval call-type=case(like('message.process', "%app-name1"), "sapi-outbound-call", like('message.process', "%app-name2"), "exp-inbound-call")
| eval {call-type}='message.flowName'
| stats list(exp-inbound-call) as exp-inbound-call list(sapi-outbound-call) as sapi-outbound-call by UUID
| stats count by exp-inbound-call sapi-outbound-call
| stats max(count) as exp-inbound-call-count list(sapi-outbound-call) as sapi-outbound-call list(count) as sapi-outbound-call-count by exp-inbound-call

 

 

0 Karma
Get Updates on the Splunk Community!

Database Performance Sidebar Panel Now on APM Database Query Performance & Service ...

We’ve streamlined the troubleshooting experience for database-related service issues by adding a database ...

IM Landing Page Filter - Now Available

We’ve added the capability for you to filter across the summary details on the main Infrastructure Monitoring ...

Dynamic Links from Alerts to IM Navigators - New in Observability Cloud

Splunk continues to improve the troubleshooting experience in Observability Cloud with this latest enhancement ...