A query that counts tag=pci entries by eventtype (and that happens to be part of the application):
tag=pci | stats count by eventtype | sort -count
...results in this error message:
Error in 'UnifiedSearch': unable to parse search 'The specified search is too large. Please try to simplify your search.'
I have about 156m global indexed events on a single indexer and every event is tag=pci. I'm hoping to understand where an upper limit is defined and what I may be doing wrong.
The problem is not the number of search results with the tag, but the size of the search query. I'm assuming that you have items tagged pci. These essentially expand out into a long OR sequence of every field/value combination with that tag (and eventtypes are themselves expanded). So the search query is too large. I believe the upper limit is something on the order of a few hundred terms, but I don't know what it is or if it can be raised.
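To make the expansion concrete, here is a rough Python sketch of how a single tag=pci term could grow into a long OR sequence. This is only an illustration of the idea described above, not how Splunk implements it: the TAGS and EVENTTYPES tables, the field values, and the helper names are all made up, and MAX_TERMS just reuses the 420 figure quoted elsewhere in this thread.

```python
# Hypothetical tag table: tag name -> (field, value) pairs carrying that tag.
# All names and values here are invented for illustration.
TAGS = {
    "pci": [
        ("eventtype", "failed_login"),
        ("eventtype", "card_swipe"),
        ("host", "pos-01"),
    ],
}

# Hypothetical eventtype definitions: each eventtype is itself a search string.
EVENTTYPES = {
    "failed_login": "sourcetype=auth action=failure",
    "card_swipe": "sourcetype=pos action=swipe",
}

MAX_TERMS = 420  # figure quoted in this thread, used only as an example


def expand_tag(tag):
    """Return the list of OR'ed clauses that tag=<tag> stands for."""
    clauses = []
    for field, value in TAGS[tag]:
        if field == "eventtype" and value in EVENTTYPES:
            # Eventtypes are expanded again into their underlying search.
            clauses.append("(" + EVENTTYPES[value] + ")")
        else:
            clauses.append(f"{field}={value}")
    return clauses


def count_terms(clauses):
    """Count the individual search terms across all clauses."""
    return sum(len(c.strip("()").split()) for c in clauses)


clauses = expand_tag("pci")
expanded = " OR ".join(clauses)
n_terms = count_terms(clauses)
too_large = n_terms > MAX_TERMS

print(expanded)
print(n_terms, "terms; too large:", too_large)
```

With three tagged field/value pairs the expanded query already has five terms. If a tag is attached to hundreds of field/value pairs or eventtypes, the same expansion passes the limit even though the search you typed is a single term.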
What is the reason for tagging every event in the system with 'pci'? A simpler way to achieve this is to run: * | top limit=100 eventtype
Your tag=pci is expanding to more than the number of search terms allowed, which is currently 420. You need to reduce the number of terms with a more restricted tag, perhaps filtering out those results with another search. Future versions of Splunk should allow a massive increase of this size, but for now it's fixed at 420.
For this case, Johnvey's suggestion will be vastly more efficient.
If you can come up with a variety of situations where you run into this ceiling currently, that will help us design the product to better serve those types of needs.
The problem you're bumping into first is that our query parser handles disjunctions with a recursive approach. Unbounded recursion on the C stack leads to crashes, so we cut it off at a fixed depth before things break. While our query parser could be redesigned to use iteration instead of recursion, or to apply successive simplifications or other patterns, it's a lot of work, and it won't necessarily make these searches perform well. I sort of think the best channel for this is support, where you provide the set of cases where you bump into this ceiling. Those can either be broken into alternative queries, or possibly turned into Enhancement Requests that lead to a better design.
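For anyone curious what a recursive disjunction parser with a cutoff looks like, here is a minimal Python sketch. It is not the actual Splunk parser (which is native code); the function names, the tuple representation, and the use of 420 as the cap are assumptions made purely for illustration.

```python
MAX_TERMS = 420  # figure quoted earlier in this thread, used as an example cap


class SearchTooLarge(Exception):
    """Raised when the disjunction nests deeper than the cap allows."""


def parse_or(tokens, pos=0, depth=0):
    """Parse 'a OR b OR c ...' right-recursively into nested ('OR', left, right) tuples."""
    if depth >= MAX_TERMS:
        # Stop before the call stack grows without bound.
        raise SearchTooLarge("The specified search is too large.")
    left = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == "OR":
        # Each additional OR'ed term costs one more stack frame.
        right = parse_or(tokens, pos + 2, depth + 1)
        return ("OR", left, right)
    return left


def make_query(n):
    """Build a token list for a hypothetical query with n OR'ed terms."""
    return " OR ".join(f"t{i}" for i in range(n)).split()


print(parse_or("a OR b OR c".split()))

try:
    parse_or(make_query(MAX_TERMS + 1))
except SearchTooLarge as err:
    print("rejected:", err)
```

Because every OR'ed term adds a stack frame, the depth of the recursion tracks the number of terms in the expanded search, which is why a cap on depth shows up to the user as a cap on search size.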
Didn't see johnvey's comment. This is a query in the Splunk PCI app. We happen to be using Splunk exclusively for PCI. Based on the comment, perhaps a fix is in order for the upcoming release of the latest PCI app.