I am trying to find MAC addresses that are in a .csv lookup file but NOT in Splunk's indexed data. My issue is that the subsearch stops (and returns results to the main search) after it processes 10,000 records, so I can only cover roughly the last 4 hours. I would like to search the last 24 hours, but I don't want to raise the subsearch limit in the configuration because I was told that is bad practice.
I have this search query:
| inputlookup macaddress.csv
| eval macaddress=upper(macaddress)
| search NOT [ search sourcetype=DhcpSrvLog index=dhcp source="C:\Windows\System32\DHCP\DhcpSrvLog*.log" (host="AE-VENOM" OR host="AE-CARNAGE") macaddress=* | fields macaddress ]
I get this message from the job inspector:
[subsearch]: Subsearch produced 10000 results, truncating to maxout 10000.
I was advised to try using a data model to improve performance, so I tried the Network Session > DHCP data model, but I still get over 10k records. This makes sense to me: the data model speeds up the search by reading only the extracted fields rather than whole events, but the number of records returned is still determined by the filter, so the subsearch hits the same 10k cap either way.
So my main question is: is there a way to make the subsearch process all of the records without configuration changes that might be detrimental to other searches? I know I can raise the limit in the configuration, but is that a good idea?
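For reference, one subsearch-free pattern I've seen suggested is to reverse the direction: run the event search as the main search and append the (small) lookup via a subsearch, then keep only MAC addresses that never appeared in events. A sketch using my field and file names (I haven't verified this is more efficient in my environment):

sourcetype=DhcpSrvLog index=dhcp source="C:\Windows\System32\DHCP\DhcpSrvLog*.log" (host="AE-VENOM" OR host="AE-CARNAGE") macaddress=*
| eval macaddress=upper(macaddress)
| stats count by macaddress
| append [| inputlookup macaddress.csv | eval macaddress=upper(macaddress) | eval count=0]
| stats sum(count) as total by macaddress
| where total=0

Here the only subsearch is over the lookup file itself, so the 10k event cap no longer applies; the event volume flows through stats instead. Would this be a reasonable alternative, or does it have pitfalls of its own?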