------ Start of Edit -------------
EDIT 1: Use Case
- The production server sends analytics events to Splunk as tagged log entries. I have a Python script that runs every 5 minutes, searches for specific analytics tags within the logs, and ingests the matches into our data warehouse. This incremental search is now hitting the 50k limit.
EDIT 2: Python Code
def search(self, text, options):
    # Assumes splunklib: `from splunklib import results` and an
    # authenticated `self.service` (splunklib.client.Service).
    search_string = 'search ' + text
    kwargs_oneshot = {"earliest_time": options['start_time'],
                      "latest_time": options['end_time']}
    oneshotsearch_results = self.service.jobs.oneshot(search_string, **kwargs_oneshot)
    lazy_results = results.ResultsReader(oneshotsearch_results)
    return [row for row in lazy_results]
-------- End of Edit -------------
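One common way around a per-call result ceiling is to page through the results in chunks instead of pulling everything in a single oneshot call. The sketch below shows the paging loop in isolation; `fetch_page` is a hypothetical stand-in for whatever returns a slice of results at a given offset (for example, a blocking search job queried page by page), so the name and signature are assumptions, not splunklib API.

```python
PAGE_SIZE = 10000  # stay well under the 50k per-request ceiling

def paged_search(fetch_page, page_size=PAGE_SIZE):
    """Collect every result by advancing an offset until a short page.

    `fetch_page(offset, count)` must return at most `count` rows
    starting at `offset`; an empty or short page signals the end.
    """
    all_rows = []
    offset = 0
    while True:
        page = fetch_page(offset=offset, count=page_size)
        all_rows.extend(page)
        if len(page) < page_size:  # last (possibly empty) page
            break
        offset += page_size
    return all_rows
```

Because the loop stops on the first short page, it never depends on knowing the total row count up front, which fits an incremental export where the volume varies from run to run.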
Hi,
In my use case, I can't lose data during a search. I'm currently hitting the maximum limit of 50,000 result rows. I've been reducing the time interval for my searches, but our log data keeps scaling up. The documentation says the following about maxresultrows:
maxresultrows =
* This limit should not exceed 50000. Setting this limit higher than 50000
causes instability.
You should definitely post the SPL syntax for the search that is hitting the limit. Most of the time, folks in the community can suggest a different way to get the same end result that isn't subject to the limits at all.
Fair enough, I'm interested in seeing what he's trying to accomplish
I had this same problem. After 50k rows it truncates the results on a FIFO basis. You can increase this value in limits.conf, and you will retain all results up to that value.
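For reference, the setting lives under the `[searchresults]` stanza of limits.conf; a minimal sketch (the 500,000 figure is just the example value from this thread, not a recommendation):

```ini
[searchresults]
maxresultrows = 500000
```

Note the docs warn that values above 50,000 can cause instability, so any increase is worth pairing with memory monitoring.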
thanks @skoelpin, so you found the documentation to be wrong about the instability after 50k with your testing?
I set the value to 500,000 and have not noticed a difference on my setup. The instability the docs are referring to is memory: if the search process exceeds both max_mem_usage_mb and maxresultrows, data will spill over to disk. I would try increasing the value, then monitor your memory usage for a while to verify it's good.