How many tags can be created before Splunk's performance is adversely affected? And what specifically is adversely affected when too many tags are defined: indexing performance, search performance, or both?
Per Steve Z, one of Splunk's rocket scientists:
The tag approach is definitely not scalable beyond a few thousand. Tags were designed to handle expanding to tens or hundreds of values, not tens of thousands or more. Also, note that tagging is designed to tag specific values of a single field, rather than events as a whole.
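To illustrate Steve's point about tags applying to a single field's values, here is a minimal sketch of a tag definition in tags.conf; the host value and tag name below are made up:

# tags.conf -- tag one specific value of the host field
[host=webserver01]
webfarm = enabled

You could then search with tag=webfarm, and Splunk expands the tag back into its underlying field/value pairs at search time; presumably that expansion is where very large tag sets start to hurt.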
I wouldn't recommend more than 1,000 tags; I have seen degradation in applications that load tag sets beyond the 1,000 mark. Tags have no impact on indexing, but search can be given better direction by organizing your categories through eventtypes, which can also be associated with hosts or sources, as opposed to matching a pattern within an event.
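For example, an eventtype is defined as a saved search expression, and matching events receive the eventtype label at search time; the eventtype can itself be tagged. All names here are hypothetical:

# eventtypes.conf -- a sketch; the eventtype name and search string are assumptions
[failed_login]
search = sourcetype=linux_secure "Failed password"

# tags.conf -- an eventtype can be tagged like any other field value
[eventtype=failed_login]
authentication = enabled

A search for eventtype=failed_login or tag=authentication then categorizes whole events, rather than one value of one field.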
Tags have no effect on indexing or indexing performance, so any effect would only be realized at search time.
To extend Steve's point, the better bet is to use field lookups, which scale easily to millions of items. Another technique is eventtypes, which allow for "tagging" of events that match generic search expressions.
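As a rough sketch of the lookup approach, assuming a CSV lookup file named hosts.csv with host and category columns (all stanza, file, and field names here are hypothetical):

# transforms.conf -- register the CSV as a lookup table
[host_categories]
filename = hosts.csv

# props.conf -- apply the lookup automatically at search time for one sourcetype
[linux_secure]
LOOKUP-category = host_categories host OUTPUT category

The same table can also be invoked ad hoc in a search with | lookup host_categories host OUTPUT category. Because the CSV is consulted at search time, it scales far better than defining thousands of individual tags.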