How to increase the subsearch limit?

vitorvmiguel
Explorer

Hello,

I'm trying to do a subsearch like this one:

 index = raw_internet_cartonista programa = ILCL [ search index = raw_internet_cartonista programa = WNHC tipo = E | fields codigoAcesso ] | stats count by info10

But I receive the message:

[subsearch]: Subsearch produced 12632 results, truncating to maxout 10000.

How can I configure my search to expand this limit?

I've consulted the documentation and there are some parameters to set:

[subsearch]
maxout = <integer>
• Maximum number of results to return from a subsearch.
• This number cannot be greater than or equal to 10500.
• Defaults to 100.
maxtime = <integer>
• Maximum number of seconds to run a subsearch before finalizing.
• Defaults to 60.
ttl = <integer>
• Time to cache a given subsearch's results.
• Defaults to 300.

Are these the correct parameters? Where do I have to place them? Which values are recommended?

Regards,
Vitor

javiergn
Super Champion

Short answer: do not use subsearches for this type of query.

Detailed answer: subsearches are expensive in terms of performance and there's a limit for a reason. Do not increase this. You can normally find much better alternatives. Keep in mind your subsearch above is basically returning "codigoAcesso = value1 OR codigoAcesso = value2 OR .... OR codigoAcesso = value10000".
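To make the expansion concrete, here is a minimal Python sketch (not Splunk code) of what happens conceptually: the subsearch results are cut off at maxout and the remaining values are turned into one long OR filter for the outer search. The function name `expand_subsearch` and the generated access codes are made up for illustration, and the exact string Splunk builds internally is formatted differently.

```python
def expand_subsearch(field, values, maxout=10000):
    """Mimic, in simplified form, how subsearch results become an OR filter."""
    kept = values[:maxout]               # results beyond maxout are dropped
    dropped = len(values) - len(kept)    # how many values never reach the outer search
    expression = " OR ".join(f'{field}="{v}"' for v in kept)
    return f"( {expression} )", dropped


# 12632 hypothetical access codes, matching the count in the warning message
codes = [f"code{i}" for i in range(12632)]
expr, dropped = expand_subsearch("codigoAcesso", codes)

print(dropped)                 # 2632 values are lost to truncation
print(expr.count(" OR ") + 1)  # 10000 terms remain in the outer filter
```

Even at the default limit the outer search has to evaluate a filter with 10000 terms, which is part of why raising the limit is rarely the right fix.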

First of all, what are you trying to achieve? I'm not 100% sure based on the search you are performing.

If you just want both types of events, do this:

index = raw_internet_cartonista (programa = ILCL OR (programa = WNHC tipo = E))
| stats count by info10

If you just want to display those matching both types of "programas" then you can try this:

index = raw_internet_cartonista (programa = ILCL OR (programa = WNHC tipo = E))
| stats count, dc(programa) as distinct_count by info10
| where distinct_count > 1
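As a rough illustration of what the stats and where steps do, the following Python sketch groups a handful of made-up events by info10, counts them, and keeps only the groups in which more than one distinct programa appears. The sample events and the function name `both_programs` are assumptions for the example, not data from the thread.

```python
from collections import defaultdict

# Hypothetical events; only the fields used by the search are shown
events = [
    {"programa": "ILCL", "info10": "A"},
    {"programa": "WNHC", "tipo": "E", "info10": "A"},
    {"programa": "ILCL", "info10": "B"},
    {"programa": "ILCL", "info10": "B"},
]


def both_programs(events):
    counts = defaultdict(int)     # stats count ... by info10
    programs = defaultdict(set)   # stats dc(programa) ... by info10
    for e in events:
        counts[e["info10"]] += 1
        programs[e["info10"]].add(e["programa"])
    # where distinct_count > 1
    return {k: counts[k] for k in counts if len(programs[k]) > 1}


result = both_programs(events)
print(result)  # {'A': 2}
```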

Hope that helps

BernardEAI
Communicator

Hi javiergn

Can you maybe give some technical detail on why subsearches are expensive in terms of performance? Is the performance cost simply equal to doing that search on its own?

vitorvmiguel
Explorer

Thank you javiergn.

I've seen the recommendation not to change the limits throughout the Splunk documentation, and obviously there's a reason for that.

My problem is to correlate events like:

Event A: {time=10:01:000, program=ABC, logLevel=I, userAgent=iPhone, userID=00001}
Event B: {time=10:02:000, program=DEF, logLevel=E, userAgent=, userID=00001}

Imagine that I want to find who has errors on program=DEF and uses an iPhone. Do I have to correlate these two events with a subsearch, or is there a better way of doing that? The userAgent information in this example appears only in a single identification event.

index=raw program=ABC AND logLevel=I [search index=raw program=DEF AND logLevel=E | fields userID ] | stats count by userAgent

Thank you for helping me.
Rgs.,

javiergn
Super Champion

Try this instead:

index=raw (program=ABC AND logLevel=I) OR (program=DEF AND logLevel=E)
| stats values(logLevel) as logLevel, values(program) as program, values(userAgent) as userAgent by userID
| search program=DEF userAgent=iPhone
| table userID
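
For readers who want to see the idea outside of Splunk, here is a minimal Python sketch of the same single-pass correlation: keep both event types, collect the values seen per userID, and then keep the user IDs that have both a DEF error and an iPhone user agent, which is the goal stated in the question. The third sample event and the function name are made up for illustration.

```python
from collections import defaultdict

events = [
    {"program": "ABC", "logLevel": "I", "userAgent": "iPhone", "userID": "00001"},
    {"program": "DEF", "logLevel": "E", "userAgent": "", "userID": "00001"},
    # Hypothetical extra user: iPhone, but no error in DEF
    {"program": "ABC", "logLevel": "I", "userAgent": "iPhone", "userID": "00002"},
]


def users_with_def_errors_on_iphone(events):
    wanted = {("ABC", "I"), ("DEF", "E")}   # the two event types in the base search
    by_user = defaultdict(lambda: {"program": set(), "userAgent": set()})
    for e in events:
        if (e["program"], e["logLevel"]) not in wanted:
            continue
        group = by_user[e["userID"]]        # stats values(...) by userID
        group["program"].add(e["program"])
        if e["userAgent"]:                  # empty values are not collected
            group["userAgent"].add(e["userAgent"])
    return sorted(
        user for user, g in by_user.items()
        if "DEF" in g["program"] and "iPhone" in g["userAgent"]
    )


matches = users_with_def_errors_on_iphone(events)
print(matches)  # ['00001']
```

Every event is read once and grouped, so nothing has to be expanded into a long OR list and no subsearch limit applies.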

vitorvmiguel
Explorer

It works!! Thank you very much javiergn!!

One last question: what if the events are in different indexes? How should I do it?

Event A: {index=raw_1, time=10:01:000, program=ABC, logLevel=I, userAgent=iPhone, userID=00001}
Event B: {index=raw_2, time=10:02:000, program=DEF, logLevel=E, userAgent=, userID=00001}

javiergn
Super Champion

Hi, apologies for the late reply.

If the events are in different indexes you can still apply the same logic:

(index=index1 program=ABC logLevel=I) OR (index=index2 logLevel=E)

woodcock
Esteemed Legend

Like this:

index = raw_internet_cartonista programa = ILCL [ search index = raw_internet_cartonista programa = WNHC tipo = E | stats values(codigoAcesso) AS codigoAcesso ] | stats count by info10

datamine
Loves-to-Learn Lots

But what if our subsearch returns more than 50000 results and we need those as well? Since Splunk subsearches have a maxout of 50000, what is the best way to optimize them: increase the limit in limits.conf, or is there a better way to optimize the query itself so that it allows more than 50000 results?

Thanks,
Dave
