Splunk Search

How to overcome the subsearch limit (only 10k records)?

ksolanki88
Explorer

... | dedup Order_Number | search NOT [| inputlookup Order_Details_Lookup.csv | fields Order_Number] | table Order_Number

I need to compare my results against a lookup file and show all the records that are not in the lookup. The query works, but the lookup file has more than 10,000 records, so the subsearch is truncated: only the first 10k records are compared and the rest are ignored. I want to compare all 20k records.
Note: the lookup file can hold more records, for example 50k or 1 lakh (100,000).

Is there any way to increase the subsearch limit?


romedome
Path Finder

I just came across this gem via a co-worker. Do this:

... | dedup Order_Number
| search NOT [| inputlookup Order_Details_Lookup.csv | stats values(Order_Number) AS Order_Number]
| table Order_Number

That makes the subsearch return a single row with one multivalue field containing all of the order numbers, so the row count stays at 1 and the subsearch row limit is never reached. The individual values still get passed along correctly into the base search.
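To see what the subsearch actually hands to the outer search, one option is to run it on its own and append `format`, which is the command a subsearch applies implicitly. This is only a sketch for inspection, reusing the lookup and field names from the question:

```spl
| inputlookup Order_Details_Lookup.csv
| stats values(Order_Number) AS Order_Number
| format
```

The result should be a single row whose `search` field holds one OR-ed expression over Order_Number, which is what `NOT [...]` then negates in the base search.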

balmeida
Explorer

This is great. I was stuck using lookup, which is very slow, and this solves my problem as long as the values() limit is not too small. I've done some testing and so far I haven't managed to hit the limit.

This doesn't hit the maxout limit:


| makeresults | eval a=1 | append [| inputlookup large_lookup.csv | head 150000 | stats values(clientip) AS clientip] | stats dc(clientip)

dc(clientip)
150000

This hits the maxout limit:


| makeresults | eval a=1 | append [| inputlookup large_lookup.csv | head 150000 | fields clientip] | stats dc(clientip)

dc(clientip)
50000

[subsearch]: Search Processor: Subsearch produced 150000 results, truncating to maxout 50000.


mwk1000
Path Finder

Except that values() also has a limit.


romedome
Path Finder

What limit is that?


MuS
SplunkTrust

Reading the limits.conf docs (http://docs.splunk.com/Documentation/Splunk/latest/Admin/Limitsconf), it looks like there is no limit on values() by default:

maxvalues = <integer>
* Maximum number of values for any field to keep track of.
* When set to “0”: Specifies an unlimited number of values.
* Default: 0

sideview
SplunkTrust

A better answer may be to use the lookup as a lookup rather than just as a mechanism to exclude events with a subsearch.

... | dedup Order_Number | search NOT [| inputlookup Order_Details_Lookup.csv | fields Order_Number] | table Order_Number

Assuming that
1) the lookup has some other field besides Order_Number, and
2) at least one of those other fields is present on all rows of the lookup.

Then let's call that field "otherLookupField" and then we can instead do:

... | dedup Order_Number | lookup Order_Details_Lookup.csv Order_Number OUTPUT otherLookupField | search NOT otherLookupField=*

You'll end up filtering out all of the rows whose Order_Number is present in the lookup, and of course there are no longer any limits on number of rows or search execution time. No subsearch stuff to worry about at all.

romedome
Path Finder

The problem with this is that it's very slow, because Splunk still has to pull all of the events from disk and then compare them against the lookup. If you use the subsearch, Splunk can use the bloom filters to determine whether a bucket is in or out.


ksolanki88
Explorer

Thanks for your quick reply. I have gone through the provided links, but my problem is still not solved. The documentation itself says that the subsearch limit cannot be 10500 or more, whereas I have to compare 20k records. Could you be more specific about which limit I need to set in limits.conf?

[subsearch]
* This stanza controls subsearch results.
* NOTE: This stanza DOES NOT control subsearch results when a subsearch is called by commands such as join, append, or appendcols.
* Read more about subsearches in the online documentation: http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches

maxout = <integer>
* Maximum number of results to return from a subsearch.
* This value cannot be greater than or equal to 10500.
* Defaults to 10000.


MuS
SplunkTrust

Why are you using a subsearch at all? You could use inputlookup with append=t anywhere in the search pipeline and compare that to your events.
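A rough sketch of that idea, reusing the lookup and field names from the question. The `origin` field is a made-up marker added only for this illustration, and the sketch assumes every event of interest carries an Order_Number:

```spl
... | dedup Order_Number
| eval origin="event"
| inputlookup append=t Order_Details_Lookup.csv
| eval origin=coalesce(origin, "lookup")
| stats values(origin) AS origin BY Order_Number
| where mvcount(origin)=1 AND origin="event"
| table Order_Number
```

Events are tagged first, the lookup rows are appended untagged and then tagged as "lookup", and only order numbers seen exclusively in events survive the final filter. Because inputlookup runs here as an ordinary pipeline command rather than inside square brackets, the [subsearch] maxout setting should not apply to it.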


MuS
SplunkTrust

Hi ksolanki88,

Take a look at limits.conf (http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Limitsconf). In this file you can change the subsearch limits.

cheers, MuS
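For reference, a local override would look roughly like the fragment below. The file location is the usual system-level local directory and is an assumption about the install; the value is simply the largest one allowed by the "cannot be greater than or equal to 10500" rule in the spec quoted in this thread.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf (assumed location)
[subsearch]
# Must stay below 10500 according to the limits.conf spec quoted in this thread
maxout = 10499
```

With that ceiling in place, raising maxout alone cannot cover a 20k-row lookup, so one of the other approaches in this thread is still needed.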


kml_uvce
Builder

althomas
Communicator

maxout = <integer>
* Maximum number of results to return from a subsearch.
* This value cannot be greater than or equal to 10500.
* Defaults to 10000.