Splunk Search

What is the maximum number of events that can be timestamped with a particular second? Millisecond?

Splunk Employee

If that limit is breached, what will stop working?

Is there a way to raise the limit?

Merged question:

I'm running v4.0.9. My "All Indexed Data" dashboard shows there are about 20M events from our Splunk server. I'd like to be able to review these events. However, when I search for the host (i.e., host=hostname), I get the following error: "Search failed: more than 200000 events found at time 1251742014". Searching before or after this timestamp returns 0 results, so I'm assuming that all of these events were somehow indexed with the same timestamp. Is there a way to increase this limit? Or at least a way I can find out what these events look like by going to the raw data?

1 Solution

Splunk Employee

The maximum number of consecutively indexed events from a single source with timestamps identical to the second is 100,000 (per unique combination of indexer/index/host/source/sourcetype). At that point, Splunk will increment the seconds clock. You cannot raise that limit. It is possible to index more events than this by indexing other events from the same source with a different timestamp in between each batch of 100,000 or fewer. However, if more than 200,000 events are indexed within the same second, those events will not be searchable or returnable in results (at least not in any reliable way), so you should not attempt to do this.

You can work around this limit by modifying any one of host, source, or sourcetype to keep the number of events with an identical combination below 200,000. Doing this allows the events to share the same second while remaining searchable.

Having different millisecond (or finer) counters for events does not affect these limits, i.e., even if you have different subseconds for every one of the 200,000 events, you will still have this limit.
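
As a rough illustration of the counting rule described above, here is a minimal Python sketch (not Splunk code). It drops the subsecond part of each timestamp, counts events per (host, source, sourcetype, second), and reports the combinations that exceed a threshold. The 200,000 figure is the one quoted in this answer; the function name, the tuple layout, the sample host and path, and the idea of suffixing the source value are all assumptions made for the example.

```python
from collections import Counter

SEARCH_LIMIT = 200_000  # per-second figure quoted in the answer above


def over_limit(events, limit=SEARCH_LIMIT):
    """Return {(host, source, sourcetype, second): count} for keys above limit.

    `events` is an iterable of (epoch_time, host, source, sourcetype) tuples.
    This is a hypothetical helper for illustration, not a Splunk API.
    """
    counts = Counter()
    for epoch_time, host, source, sourcetype in events:
        # Subseconds are dropped: only the whole second counts toward the limit.
        counts[(host, source, sourcetype, int(epoch_time))] += 1
    return {key: n for key, n in counts.items() if n > limit}


# Five events in the same second with distinct subseconds; tiny limit for the demo.
same_key = [(1251742014 + i / 10, "web01", "/var/log/app.log", "app") for i in range(5)]
print(over_limit(same_key, limit=3))

# Spreading the same events across two source values keeps each key under the limit.
split = [(t, h, f"{s}.{i % 2}", st) for i, (t, h, s, st) in enumerate(same_key)]
print(over_limit(split, limit=3))
```

The first call flags the single combination even though every event has a different subsecond; the second call flags nothing, because no single combination holds more than three events in that second.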

New Member

Now that Splunk is at version 7, does the same limitation still apply? I got the link to this answer from the Splunk Monitor application. Thanks.

Super Champion

BTW, this can be a problem with summary indexing too. I know that in 4.x "source" is set to the name of the saved search, which helps with this problem, but it seems like people are still running into it. For example, if you're doing a per-day summary, the first second of the day can be given a very large number of events.

Splunk Employee

Artificially generated log, via logger-emulator script. This is for Splunk-internal performance testing. We push the envelope, so it doesn't rip when you lick it!

Path Finder

I am curious, what kind of log source are you using to generate such a high number of events? If it's ok to share..

Splunk Employee

The core issue here, if you're curious, is that Splunk stores the values by second, but needs to return the data to the UI, and other clients, in sub-second order. This means that there can be a search-time requirement to sort the events thousands at a time.

Usually, this is not a big deal, because an individual host, source, and sourcetype combination does not normally produce hundreds of thousands of events per second. If you're seeing this sort of case frequently, you need to chat with engineering about designing the core problem away.

New Member

This is a follow-up question: is it possible to work with these events using a particular search string? E.g., in my case:

search="/opt/splunk/etc/apps/unix/bin/lsof.sh"
Error in 'IndexScopedSearch': The search failed. More than 3125000 events found at time 1284355500.

source="/opt/splunk/etc/apps/unix/bin/lsof.sh" _time="1284355500"
Error in 'IndexScopedSearch': The search failed. More than 1562500 events found at time 1284355500.

source="/opt/splunk/etc/apps/unix/bin/lsof.sh" _time="1284355500" | sort 1000 host
Error in 'IndexScopedSearch': The search failed. More than 1562500 events found at time 1284355500.

Splunk Employee

This issue is a search time problem, so it depends upon the search.

Splunk has to return all the events in time order, so if a given search returns a very large number of events from the same second, Splunk has to sort all of those events by subsecond. If the value of seconds differs, no large-scale, memory-intensive sort is required, because the events are already ordered as they are retrieved. This behavior occurs per indexer in distributed search, so distributed search divides the problem by the number of nodes.
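
To make the cost concrete, here is a minimal Python sketch of the idea, not Splunk's actual implementation. It assumes events come back already grouped by whole second and only need ordering inside each second; the function name and the sample data are made up for the example, and ascending order is used purely for simplicity.

```python
from itertools import groupby


def order_by_subsecond(events):
    """Yield events in full time order, given events already ordered by whole second.

    `events` is an iterable of (epoch_time, raw) tuples whose int(epoch_time)
    values are non-decreasing. Hypothetical helper, not Splunk internals.
    """
    for _second, group in groupby(events, key=lambda e: int(e[0])):
        # The whole group must be held in memory to sort it; this is the
        # step that grows expensive when one second holds very many events.
        yield from sorted(group, key=lambda e: e[0])


stored = [
    (100.7, "c"), (100.2, "a"), (100.5, "b"),  # one crowded second, unordered
    (101.1, "d"),                              # next second, nothing to sort
]
print([raw for _t, raw in order_by_subsecond(stored)])
```

Seconds with few events cost almost nothing here, while a second holding hundreds of thousands of events forces one large in-memory sort, which matches the per-second nature of the limit discussed in this thread.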

Super Champion

I'm seconding Dan's question... So is this basically a per-bucket limit? In other words, if you have spikes of 30,0000 messages per second for a single source, sourcetype, host combo, will load balancing that across 2 or more indexers "fix" the problem?

Splunk Employee

Does this limitation exist on the search head during distributed search? In other words, could I solve this problem by load-balancing the data stream among a farm of indexers?

Splunk Employee

Wasn't there also some search limitation, in terms of how many events belonging to a particular second the webUI/CLI can deal with?

Path Finder

How do we index a data file that contains aggregated data for a given day? It does not contain a timestamp. If an analyst is looking at a yearly chart with a span of days (not hours, minutes, or seconds), Splunk fails here, doesn't it?

Imagine you are looking at a stock price on a day scale for 6 months. The data file in this case may contain the ticker price for the day. Please suggest how to index this using Splunk.
Thanks.

SplunkTrust

I'm curious, since it's been so long, whether any of this was improved in 4.1 or 4.2... Commenting so as to bump the question up again.

Splunk Employee

Basically, if your indexed search (not including any search-time field extractions, post-processing, or internal filtering) returns more than 200,000 events within the same second, you will not get any results. If you are able to provide an indexed term that limits the results to fewer than 200,000, then you will get results.

Splunk Employee

Thank you -- answer accepted. Although I would still like to know the exact search limitations, in this respect. (N events per second, where N is...?)
