I ran the exact same search (`index=<index_name> | head 2000000 | stats count`) on the same indexer against THREE different indexes: fictionaldata, main, and udp_syslog.
The results were:
fictionaldata: 3.444 seconds
main: 70.491 seconds
udp_syslog: 3.852 seconds
What is going on with main? How can I troubleshoot this performance difference, when the main variable is the target index?
It is probably worth disclosing that main is larger than the other two indexes (8GB vs. 500MB), but all three have more than 2,000,000 events.
I should also disclose that main has many more field extractions defined... but a 20x performance difference is still shocking!
Splunk Enterprise Server 6.5.0
Linux, 12 GB RAM, 6 CPU Cores
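One way to narrow down where the time goes (a sketch, assuming a standard 6.x setup) is to compare the raw search against `tstats`, which counts events straight from the index's tsidx metadata without decompressing any raw data. If the `tstats` version is fast on main but the raw search is slow, the cost is in opening and decompressing bucket files, not in the index lookup itself:

```
| tstats count where index=main
```

The Job Inspector (Job > Inspect Job) on the slow search will also break the elapsed time down by component, e.g. how much was spent in rawdata decompression versus search-time field extraction.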
index=main is the default dumping ground for events that don't specify an index; if your admin is doing their job correctly, it should have VERY few events coming into it. This means you are going to have to dig into FAR more buckets and files to gather 2,000,000 events. It looks like the events/bucket ratio is about 20x less dense in main than in the other indexes, which makes sense to me. If you have to open 20 files to get 2,000,000 events from index=udp_syslog (and each of those files requires decompression), but many times that number of files to get the same 2,000,000 events from index=main, it will obviously take far longer. That does not even consider the fact that the older events in index=main are probably on slower storage, too.
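The bucket-density theory above can be checked directly with `dbinspect`, which reports per-bucket metadata. A minimal sketch (the `events_per_bucket` field name here is just illustrative); run it once per index and compare the ratios:

```
| dbinspect index=main
| stats sum(eventCount) AS events, count AS buckets
| eval events_per_bucket = round(events / buckets, 0)
```

If main's events_per_bucket comes out roughly 20x lower than the other two indexes, that matches the 20x search-time gap.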
I think there are a few factors you might look at that may have performance impacts:
- The number of sourcetypes and related search-time transformations: almost certainly the default index contains far more types of data requiring search-time operations than the other two indexes do
- Storage location: Are the index paths of the three indexes all located on the same storage?
- Time spans: Check the earliest event returned by the search in each index; recent events are stored in hot and warm buckets, but older ones are stored in cold buckets, which may live on slower storage
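The first and third points above can be checked in one metadata-only search (a sketch, assuming Splunk 6.x; `tstats` never touches raw data, so it is cheap to run):

```
| tstats count min(_time) AS earliest WHERE index=main BY sourcetype
| eval earliest = strftime(earliest, "%Y-%m-%d %H:%M:%S")
| sort - count
```

A long sourcetype list suggests heavy search-time extraction overhead, and very old earliest timestamps suggest the search is reaching back into cold buckets. For the second point, the storage paths are in indexes.conf (homePath, coldPath), viewable with `splunk btool indexes list main` on the indexer.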
Hope it helps. Thanks!