Subsearch produced 144180 results, truncating to maxout 10000.

sumitkathpal
Explorer

Hi All,

I am running a tstats command and matching it against a large lookup file, but I am getting the error "[subsearch]: Subsearch produced 144180 results, truncating to maxout 10000."

| tstats `summariesonly` count from datamodel=Email by All_Email.src_user All_Email.subject | `drop_dm_object_name("All_Email")` | search [| inputlookup  local_domain_intel.csv| rename domain as src_user |fields src_user ] 

My question is: does the search first match against the lookup file and then truncate to 10000 results, or does it first truncate the lookup file to 10000 rows and then compare it with my search? As a test, I added a known domain to the last row of my lookup and saved it; the search didn't produce any results. If I move the same known domain to the first row, the search matches and produces results.

Also, how can I remove this error?

Lookup file row count: 144180

1 Solution

niketn
Legend

@sumitkathpal, if your use-case is to show the tstats count only for the sources/domains present in the lookup, then instead of using | inputlookup (which runs as a subsearch), you can run the lookup command to identify the domains that are present in the lookup and filter out those that are not. Please try the following query.

| tstats `summariesonly` count from datamodel=Email by All_Email.src_user All_Email.subject 
| `drop_dm_object_name("All_Email")` 
| lookup local_domain_intel.csv domain as src_user outputnew domain as domainFromLookup
| search domainFromLookup!=""
| fields - domainFromLookup
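
PS: if you prefer, the null-check variant below does the same filtering as search domainFromLookup!="" (the assumption being that domainFromLookup stays null when the lookup finds no match):

| lookup local_domain_intel.csv domain as src_user outputnew domain as domainFromLookup
| where isnotnull(domainFromLookup)
| fields - domainFromLookup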

Following is the run-anywhere sample approach that I used to test:

1) Created 0.15M (150,000) events for the lookup. Used streamstats to create unique source names as domain. PS: datasource="lookup" is created as an identifier just for demo purposes.

| makeresults count=150000 
| fields - _time
| streamstats count as domain 
| eval domain="src".printf("%06d",domain), datasource="lookup"

2) Piped outputlookup onto the above result to save it as localtestdata.csv:

| outputlookup localtestdata.csv

3) Used a new query to generate a stats count by various sources. PS: the count below can be changed to any number you want to test; I tested with 1.5M as well. Following is 15K for the demo example. The eval functions random() and substr() are used to generate a random count. PS: datasource="tstats" is just for demo purposes.

| makeresults count=15000 
| fields - _time 
| streamstats count as src_user
| eval src_user=if(src_user<=100,"0",src_user )
| eval src_user="src".printf("%06d",src_user), count=substr("".random(),4), datasource="tstats"

PS: the | eval src_user=if(src_user<=100,"0",src_user ) step has been added to rename the first 100 events to src000000, so that not all events from the search match data in the lookup.

4) Once the lookup file from Steps 1 and 2 is created and you have run a new search with the query from Step 3 to generate your sample events, you can match the src_user field in the raw events with the domain field in the lookup and filter only the matched domains using the following commands:

| lookup localtestdata.csv domain as src_user outputnew domain as domainFromLookup
| search domainFromLookup!=""
| fields - domainFromLookup
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"


lakshman239
SplunkTrust
SplunkTrust

You can change the search to use the lookup command, something like this (change fields as needed):

| tstats `summariesonly` count from datamodel=Email by All_Email.src_user All_Email.subject 
| `drop_dm_object_name("All_Email")` 
| lookup local_domain_intel.csv domain AS src_user OUTPUT domain 
| where isnotnull(domain)

This will match the src_user in the tstats results against the lookup contents and keep only the matched results.


richgalloway
SplunkTrust
SplunkTrust

The subsearch executes first and is limited to returning 10,000 results. The limit is governed by the maxout setting in limits.conf, but it must always be less than 10500, so raising this value will not help you. You will have to redesign your query.
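
For reference, the relevant stanza in limits.conf looks like this (the values shown are the usual defaults; treat the exact numbers as version-dependent and check the limits.conf reference for your release):

[subsearch]
# Maximum number of results a subsearch returns to the outer search
maxout = 10000
# Maximum number of seconds a subsearch is allowed to run
maxtime = 60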

---
If this reply helps you, Karma would be appreciated.