Running a simple search `index=test` over 10 million events gives me a browsing speed of around 5,000 events per second, an extremely slow timeline build, and 100% CPU load. The same search in Fast mode gives 20k events per second, which in my opinion is still too slow. The server has a modern RAID array with high IOPS, a Xeon CPU with lots of cores, and 64 GB of RAM. It's faster on my laptop with a similar dataset.
How to investigate?
If it's a Linux server, did you disable THP (transparent huge pages)? It's one of the main causes of searches slowing down over time, although it shouldn't result in high CPU (AFAIK).
Ref: https://answers.splunk.com/answers/188875/how-do-i-disable-transparent-huge-pages-thp-and-co.html
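As a quick check, something like the following prints the current THP setting; the sysfs paths are the usual locations on stock RHEL/CentOS kernels, but they may be absent or relocated on other distributions:

```shell
# Inspect the current transparent-huge-pages setting (the value in brackets is active).
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$f")"
    else
        printf '%s: not present on this kernel\n' "$f"
    fi
done
```

Splunk recommends disabling THP (`never`); the linked answer covers how to make the change persist across reboots.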
What type of data is in the index? How large are the events? How saturated is your Splunk installation?
The best place to start is by analyzing the search job inspector. Check that there aren't any lookups or field extractions that are slowing you down. Is this a distributed installation? If so, look at how long it took to stream the data back (dispatch.stream.remote) and identify any slow search peers.
Use the Distributed Management Console to check the health of Splunk.
Also use OS-level tools (vmstat, iostat, top, lsof) to troubleshoot system performance: look for processes hogging CPU or memory, or for high iowait on your disk array.
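A minimal triage pass with those tools might look like this; the flag choices are one reasonable set rather than the only one, and vmstat/iostat come from the procps/sysstat packages:

```shell
# Is the box CPU-bound or IO-bound? Sample a few seconds of system counters.
command -v vmstat >/dev/null && vmstat 1 3     # high "wa" = iowait; high "us"/"sy" = CPU-bound
command -v iostat >/dev/null && iostat -x 1 3  # %util near 100 on the index volume = disk saturated
top -b -n 1 | head -n 15                       # batch mode: which process is pinning a core
```

Run these while the slow search is executing, otherwise the counters won't reflect the problem.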
Beyond that, searching `index=test` on its own is a terrible way to test search performance, because it has to bring back every event in the index for the given timeframe.
Type of data: mostly Cisco ASA logs, no more than 300 bytes per event.
Distributed setup: two indexers, one search head. But all the data is on one indexer, and I am temporarily searching directly on it to benchmark.
`index=test` is there to test the worst-case scenario.
I've done all the Linux optimizations I am aware of. The bottleneck is CPU.
I am surprised that a Fast-mode search, which (as I understand it) skips field extraction, performs so poorly. 200 bytes multiplied by 10k events is only 2 MB per second of raw data read; I would expect something near my IO performance of 500 MB/s.
Am I missing something?
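The back-of-envelope numbers from the post above, written out (integer MB, numbers taken from the post):

```shell
# 200-byte events at 10,000 events/s: implied raw read rate in MB/s
event_bytes=200
events_per_sec=10000
echo "$(( event_bytes * events_per_sec / 1000000 )) MB/s"   # prints "2 MB/s"
```

So the search is clearly not limited by raw-data read bandwidth on a 500 MB/s array.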
"The bottleneck is CPU": where did you get this information from? Did you check the job inspector as previously suggested? If so, what is using the most resources or time in the list?
cheers, MuS
I presume CPU because iostat shows the drives are not saturated, and top shows one process at 100% usage in userspace.
Job inspector output for a short time range: circa 200,000 events in circa 20 seconds.
Duration (seconds)   Component                                    Invocations   Input count   Output count
 0.02                command.fields                               26            197,038       197,038
19.78                command.search                               26            -             197,038
 0.38                command.search.calcfields                    25            197,038       197,038
 0.15                command.search.fieldalias                    25            197,038       197,038
 0.08                command.search.index                         26            -             -
 0.00                command.search.index.usec_1_8                6             -             -
 0.00                command.search.index.usec_8_64               36            -             -
 9.75                command.search.kv                            25            -             -
 6.51                command.search.typer                         25            197,038       197,038
 1.41                command.search.rawdata                       25            -             -
 0.83                command.search.lookups                       25            197,038       197,038
 0.32                command.search.tags                          25            197,038       197,038
 0.00                command.search.summary                       26            -             -
 0.00                dispatch.check_disk_usage                    2             -             -
 0.00                dispatch.createdSearchResultInfrastructure   1             -             -
 0.10                dispatch.evaluate                            1             -             -
 0.10                dispatch.evaluate.search                     1             -             -
 9.24                dispatch.fetch                               27            -             -
19.77                dispatch.localSearch                         1             -             -
 0.02                dispatch.preview                             16            -             -
 0.04                dispatch.readEventsInResults                 1             -             -
19.78                dispatch.stream.local                        26            -             -
 8.14                dispatch.timeline                            27            -             -
 0.17                dispatch.writeStatus                         38            -             -
 0.02                startup.configuration                        1             -             -
 0.06                startup.handoff                              1             -             -