Splunk Search

How to increase subsearch limit?

Vivekmishra01
Explorer

I am trying to run a query like the one below, but I am limited to 10000 subsearch results. Is there a way to make this query run with more than 10000 subsearch results?

search index="sample_index" "Kubernetes.namespace"="ABC" "Two String" [search index="sample_index" "Kubernetes.namespace"="ABC" "Success work done" | fields demo_id ] | stats count as Result by marksObtained


I saw that someone has already asked a similar question here, and I tried implementing it the same way, but it's not working for me. Below is the query I wrote, but the results are not as expected.

index="sample_index" "Kubernetes.namespace"="ABC" ("Two String"  OR "Success work done") | stats count as Result by marksObtained




gcusello
SplunkTrust

Hi @Vivekmishra01,

you could configure a different limit for subsearches (by default 10,000 results, the maxout setting in the [subsearch] stanza of limits.conf), but it isn't a best practice. Anyway, you could filter your results using the common field, something like this:

search index="sample_index" "Kubernetes.namespace"="ABC" ("Two String" OR "Success work done")
| eval kind=if(searchmatch("Two String"),"Two String","Success work done")
| stats dc(kind) AS kind_count values(marksObtained) AS marksObtained BY demo_id
| where kind_count=2
| mvexpand marksObtained
| stats count AS Result BY marksObtained
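
For reference, the subsearch cap is set in limits.conf. The fragment below is only a sketch of where the setting lives, assuming a self-managed Splunk Enterprise instance where you can edit a local limits.conf (the path in the comment is an assumption about your install). As far as I recall from the limits.conf spec, maxout cannot be set to 10500 or higher, so raising it gives very little headroom, which is one more reason to prefer the stats approach above.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf (path assumes a self-managed install)
[subsearch]
# Maximum number of results a subsearch returns; the default is 10000.
maxout = 10000
# Maximum number of seconds a subsearch runs before it is finalized; the default is 60.
maxtime = 60
```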

Ciao.

Giuseppe

Vivekmishra01
Explorer

@gcusello It worked for me for up to the last 48 hours, but as I increase the time range I see some inconsistencies in the data. I believe Splunk logs are being dropped or something like that. Can you explain why you wrote the part below the way you did?

stats dc(kind) AS kind_count values(marksObtained) AS marksObtained BY  demo_id
| where kind_count=2

 


gcusello
SplunkTrust

Hi @Vivekmishra01,

with the eval before the stats, I labeled each event with its type (the kind field),

then in the stats I counted, for each demo_id, how many distinct types it has and collected its marksObtained values.

Using the where condition, I keep only the demo_id values that have both types of event.

Maybe there is some inconsistency because one of the two events of a pair falls outside the time period, but those cases should be very few.
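
To see what the dc(kind) and where kind_count=2 steps do, here is a self-contained sketch that can be pasted into a search bar. It uses makeresults to fabricate six dummy events. The demo_id values id1 to id4 and the helper field n are made up for illustration, and kind is assigned directly rather than derived from the raw text. Only id1 and id4 have both kinds of event, so only their marksObtained values are counted:

```spl
| makeresults count=6
| streamstats count AS n
| eval demo_id=case(n<=2,"id1", n==3,"id2", n==4,"id3", n>=5,"id4")
| eval kind=if(n%2==1,"Two String","Success work done")
| eval marksObtained=case(n==1,"A", n==3,"B", n==5,"A")
| stats dc(kind) AS kind_count values(marksObtained) AS marksObtained BY demo_id
| where kind_count=2
| mvexpand marksObtained
| stats count AS Result BY marksObtained
```

This should return a single row, marksObtained=A with Result=2: id2 has no "Success work done" event and id3 has no "Two String" event, so both are dropped by the where. Changing the condition to kind_count=1 and stopping after the where lists those unpaired demo_id values instead, which may help when chasing inconsistencies over longer time ranges.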

Ciao.

Giuseppe


yeahnah
Motivator

Hi @Vivekmishra01 

The marksObtained field must be present in both events for the stats command's BY grouping to work.

If you provide examples of both types of event data ("Two String" OR "Success work done"), then we might be able to assist in getting this working for you.

Please obfuscate any sensitive data.


Vivekmishra01
Explorer

@yeahnah The inner subquery doesn't have "marksObtained", but both queries have the common field demo_id.


Vivekmishra01
Explorer

@yeahnah 
The outer query result looks like the event below, where demo_id="64236fa4c43595ajj4eudhjjsh344,0ohf430765235178":

 

{"log":"2023-03-28 22:52:20.504  INFO [my-application-web,64236fa4c43595ajj4eudhjjsh344,0ohf430765235178] 1 --- [nio-1892-exec-4] j.c.o.m.t.c.NotificationEventsController : Two Strings  marksObtained=A","Kubernetes.node":"sample-node","Kubernetes.pod":"sample-pod","Kubernetes.namespace":"ABC","hostname":"demo_name"}

 


Inner query Result

 

{"log":"2023-03-28 22:50:14.534  INFO [my-application-web,64236fa4c43595ajj4eudhjjsh344,0ohf430765235178] 1 --- [nio-1892-exec-4] c.j.c.o.m.t.s.AlertsKafkaProducer        : Success work done","Kubernetes.node":"sample-node","Kubernetes.pod":"sample-pod","Kubernetes.namespace":"ABC","hostname":"demo_name"}

 

marksObtained will only ever have one of three values: "A", "B" or "C".


yeahnah
Motivator

Hi @Vivekmishra01 

OK, based on your sample data this should work...

index=dummy
| append [| makeresults
| eval data="{\"log\":\"2023-03-28 22:52:20.504  INFO [my-application-web,64236fa4c43595ajj4eudhjjsh344,0ohf430765235178] 1 --- [nio-1892-exec-4] j.c.o.m.t.c.NotificationEventsController : Two Strings  marksObtained=A\",\"Kubernetes.node\":\"sample-node\",\"Kubernetes.pod\":\"sample-pod\",\"Kubernetes.namespace\":\"ABC\",\"hostname\":\"demo_name\"}|{\"log\":\"2023-03-28 22:50:14.534  INFO [my-application-web,64236fa4c43595ajj4eudhjjsh344,0ohf430765235178] 1 --- [nio-1892-exec-4] c.j.c.o.m.t.s.AlertsKafkaProducer        : Success work done\",\"Kubernetes.node\":\"sample-node\",\"Kubernetes.pod\":\"sample-pod\",\"Kubernetes.namespace\":\"ABC\",\"hostname\":\"demo_name\"}"
| makemv data delim="|"
| mvexpand data ]
| rename data AS _raw
| tojson
| spath
``` ignore above, just used to create dummy events ```
| rex field=log ",(?<demo_id>[^\]]+)(.*=(?<marksObtained>\w+))*"   ``` may not need this rex if field values already extracted ```
| stats count AS Result values(marksObtained) AS marksObtained BY demo_id

 


yeahnah
Motivator

OK, if both events have the demo_id field that ties them together, then that is what you should use as the group-by "key". So, something like this should work...

 

index="sample_index" "Kubernetes.namespace"="ABC" ("Two String"  OR "Success work done")
| stats count AS Result max(marksObtained) BY demo_id

 


Note, max(marksObtained) assumes the value is a number, not a string. Use values(marksObtained) if it is a string value.

Hope that helps 


Vivekmishra01
Explorer

I am trying to count the number of "A", "B" and "C" values, so I think it must be BY marksObtained. There will be more than 10000 demo_id values.


yeahnah
Motivator

You will not hit the 10000 limit because you do not need to use the inefficient and limited subsearch to get your result.
 
And, to find the distinct count (dc) of "A", "B" and "C" just add this to the end of the query provided above

|  stats dc(marksObtained) AS tally_marksObtained
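
Note that dc(marksObtained) returns how many different values exist (at most 3 here), not how many times each value occurs. A small sketch on dummy data, with the five events and the helper field n made up for illustration, shows the per-value tally that the original question asks for:

```spl
| makeresults count=5
| streamstats count AS n
| eval marksObtained=case(n<=2,"A", n<=4,"B", true(),"C")
| stats count AS Result BY marksObtained
```

This should give A=2, B=2 and C=1. Replacing the last line with | stats dc(marksObtained) AS tally_marksObtained gives the single value 3.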

 
