
Search degradation on previous days' results

jbouch03
Path Finder

I have a strange situation that I haven't seen before. Two days ago I could complete my core search within 10 seconds (including over All Time). For the last two days, just 24 hours' worth of data has taken 30 minutes to complete. There haven't been any changes to the search head or the indexers, and the amount of data hasn't increased. I've tried investigating through SOS but I can't find anything that would be causing this.

On my dashboards and apps everything runs as normal; it's only in Search and Reporting that I'm having an issue. I even removed all my searches from the dashboards and made sure that all concurrent searches were stopped. No change in performance. I can't think of anything else to check.
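As a side note, one way to double-check that nothing is still running is to list the search jobs over the REST API rather than the Jobs page. This is a minimal sketch, assuming a default management port of 8089 and placeholder credentials rather than settings from this thread:

import requests

BASE = "https://localhost:8089"   # search head management port (assumed)
AUTH = ("admin", "changeme")      # placeholder credentials

def unfinished_jobs():
    """Print the SID and dispatch state of every search job that is not done yet."""
    resp = requests.get(
        f"{BASE}/services/search/jobs",
        params={"output_mode": "json", "count": 0},
        auth=AUTH,
        verify=False,             # the management port often uses a self-signed cert
    )
    resp.raise_for_status()
    for entry in resp.json()["entry"]:
        content = entry["content"]
        if content.get("dispatchState") != "DONE":
            print(content.get("sid"), content.get("dispatchState"))

if __name__ == "__main__":
    unfinished_jobs()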

1 Solution

jbouch03
Path Finder

The two searches are over the same logs, just ingested with different sourcetypes (ps is the production server); this was done as a test. I found the issue, though, and it's a stupid mistake. I, or one of the other admins, switched the search mode from Fast to Verbose. With the amount of data being processed, Verbose mode was taking way too long.
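For reference, when dispatching over the REST API you can pin the mode explicitly with adhoc_search_level instead of relying on the UI's Fast/Smart/Verbose toggle, which makes side-by-side timing comparisons immune to a forgotten UI setting and would also explain why the dashboards were unaffected. A minimal sketch, assuming a placeholder host, credentials, and example query that are not from this thread:

import requests

BASE = "https://localhost:8089"   # search head management port (assumed)
AUTH = ("admin", "changeme")      # placeholder credentials

def dispatch(query, level):
    """Create a search job at the given search level ("fast", "smart", or "verbose") and return its SID."""
    resp = requests.post(
        f"{BASE}/services/search/jobs",
        params={"output_mode": "json"},
        data={
            "search": query,
            "adhoc_search_level": level,   # REST counterpart of the Fast/Smart/Verbose selector
            "earliest_time": "-24h",
            "latest_time": "now",
        },
        auth=AUTH,
        verify=False,
    )
    resp.raise_for_status()
    return resp.json()["sid"]

if __name__ == "__main__":
    # Same query, two modes, so the timings can be compared fairly.
    print("fast sid:   ", dispatch("search sourcetype=iis", "fast"))
    print("verbose sid:", dispatch("search sourcetype=iis", "verbose"))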


martin_mueller
SplunkTrust

I have marked this as a separate answer; the essence is Verbose mode vs. Fast mode.


jbouch03
Path Finder

Added as an answer to fit all the information (a sketch for pulling these numbers via the REST API follows the second table).

Execution costs for the degraded search

Duration (seconds) | Component | Invocations | Input count | Output count
0.03 | command.fields | 28 | 238,967 | 238,967
15.91 | command.remotetl | 28 | 238,967 | -
50.73 | command.search | 28 | - | 238,967
0.33 | command.search.index | 48 | - | -
0.18 | command.search.filter | 28 | - | -
0.11 | command.search.fieldalias | 28 | 238,967 | 238,967
0.04 | command.search.calcfields | 28 | 238,967 | 238,967
0.00 | command.search.index.usec_1_8 | 1,280 | - | -
0.00 | command.search.index.usec_64_512 | 12 | - | -
0.00 | command.search.index.usec_8_64 | 249 | - | -
37.94 | command.search.typer | 28 | 238,967 | 238,967
5.48 | command.search.tags | 28 | 238,967 | 238,967
3.38 | command.search.kv | 28 | - | -
2.92 | command.search.rawdata | 28 | - | -
0.07 | command.search.lookups | 28 | 238,967 | 238,967
0.02 | command.search.summary | 28 | - | -
0.00 | dispatch.check_disk_usage | 4 | - | -
0.01 | dispatch.createdSearchResultInfrastructure | 1 | - | -
0.15 | dispatch.evaluate | 1 | - | -
0.15 | dispatch.evaluate.search | 1 | - | -
37.47 | dispatch.fetch | 28 | - | -
0.03 | dispatch.parserThread | 28 | - | -
0.01 | dispatch.preview | 13 | - | -
1.15 | dispatch.process_remote_timeline | 27 | 12,928,718 | -
0.02 | dispatch.remote_timeline_fullevents | 3 | 411,073 | 200
66.63 | dispatch.stream.remote | 28 | - | 13,482,133
37.45 | dispatch.stream.remote.index1 | 15 | - | 5,762,677
29.17 | dispatch.stream.remote.index2 | 13 | - | 7,719,456
0.05 | dispatch.timeline | 28 | - | -
0.06 | dispatch.writeStatus | 28 | - | -
0.08 | startup.configuration | 3 | - | -
493.61 | startup.handoff | 3 | - | -


Execution costs for the non-degraded search

Duration (seconds) | Component | Invocations | Input count | Output count
0.00 | command.addinfo | 3 | 10,744 | 10,744
0.01 | command.bin | 3 | 10,744 | 10,744
0.26 | command.eval | 15 | 56,600 | 56,600
0.00 | command.fields | 3 | 10,744 | 10,744
0.42 | command.multikv | 3 | 2,880 | 13,624
0.10 | command.prestats | 3 | 10,744 | 2,873
0.07 | command.rex | 9 | 32,232 | 32,232
0.17 | command.search | 6 | 13,624 | 13,624
0.08 | command.search.kv | 1 | - | -
0.05 | command.search.rawdata | 1 | - | -
0.03 | command.search.filter | 4 | - | -
0.00 | command.search.calcfields | 1 | 2,880 | 2,880
0.00 | command.search.fieldalias | 1 | 2,880 | 2,880
0.00 | command.search.lookups | 1 | 2,880 | 2,880
0.00 | command.search.summary | 3 | - | -
0.00 | command.search.tags | 1 | 2,880 | 2,880
0.00 | command.search.typer | 1 | 2,880 | 2,880
0.00 | command.search.index.usec_1_8 | 171 | - | -
0.00 | command.search.index.usec_8_64 | 3 | - | -
0.02 | command.stats.execute_input | 4 | - | -
0.00 | command.stats.execute_output | 1 | - | -
0.02 | command.timechart | 1 | 2,873 | 49
0.00 | command.timechart.execute_output.reduce_and_emit | 1 | - | -
0.00 | command.timechart.execute_output.series_filter | 1 | - | -
0.00 | command.timechart.execute_output.getIntermediateResults() | 1 | - | -
0.00 | dispatch.check_disk_usage | 1 | - | -
0.01 | dispatch.createdSearchResultInfrastructure | 1 | - | -
0.09 | dispatch.evaluate | 1 | - | -
0.09 | dispatch.evaluate.search | 2 | - | -
0.00 | dispatch.evaluate.eval | 5 | - | -
0.00 | dispatch.evaluate.rex | 3 | - | -
0.00 | dispatch.evaluate.bin | 1 | - | -
0.00 | dispatch.evaluate.multikv | 1 | - | -
0.00 | dispatch.evaluate.stats | 1 | - | -
0.00 | dispatch.evaluate.timechart | 1 | - | -
1.21 | dispatch.fetch | 4 | - | -
0.01 | dispatch.parserThread | 3 | - | -
1.00 | dispatch.stream.remote | 3 | - | 370,779
0.99 | dispatch.stream.remote.indexer1 | 2 | - | 366,197
0.00 | dispatch.stream.remote.indexer2 | 1 | - | 4,582
0.02 | dispatch.writeStatus | 7 | - | -
0.08 | startup.configuration | 3 | - | -
493.45 | startup.handoff | 3 | - | -
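To compare two runs like these without copying the tables by hand, the search job endpoint exposes the same per-component breakdown. A minimal sketch, assuming a placeholder host, credentials, and SIDs; the nested field names are patterned on what the job inspector shows and should be verified against one raw response first:

import requests

BASE = "https://localhost:8089"   # search head management port (assumed)
AUTH = ("admin", "changeme")      # placeholder credentials

def execution_costs(sid):
    """Return {component: duration in seconds} for a finished search job."""
    resp = requests.get(
        f"{BASE}/services/search/jobs/{sid}",
        params={"output_mode": "json"},
        auth=AUTH,
        verify=False,
    )
    resp.raise_for_status()
    perf = resp.json()["entry"][0]["content"].get("performance", {})
    # "duration_secs" mirrors the job inspector's Duration column; treat the key name as an assumption.
    return {name: stats.get("duration_secs", 0) for name, stats in perf.items()}

if __name__ == "__main__":
    slow = execution_costs("1425000000.100")   # hypothetical SID of the degraded job
    fast = execution_costs("1425000000.101")   # hypothetical SID of the non-degraded job
    for component in sorted(set(slow) | set(fast)):
        print(f"{component:<55} {slow.get(component, 0):>10} {fast.get(component, 0):>10}")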


martin_mueller
SplunkTrust
SplunkTrust

One searches the sourcetype iis, the other searches the sourcetype ps?

No wonder the performance is different; apparently the two searches have nothing in common.


martin_mueller
SplunkTrust

The non-degraded search is loading 370,779 events off your indexers (dispatch.stream.remote) while the degraded search is loading roughly 36 times as much, 13,482,133 events - no wonder it's taking a bit longer.

To further investigate, search for "lispy" in the search.log linked from the job inspector, or (if you're on pre-6.2) just copy the lispy from the top of the job inspector, for both jobs. I'm sure one has a good filter while the other doesn't... probably caused by side effects from some field configuration.
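A minimal sketch of the search.log approach, assuming a default Linux dispatch directory and a hypothetical SID rather than values from this thread:

from pathlib import Path

# Default dispatch directory on a Linux install; an assumption, adjust for your $SPLUNK_HOME.
DISPATCH = Path("/opt/splunk/var/run/splunk/dispatch")

def lispy_lines(sid):
    """Return every line mentioning 'lispy' from a job's search.log."""
    log = DISPATCH / sid / "search.log"
    with log.open(errors="replace") as fh:
        return [line.rstrip() for line in fh if "lispy" in line.lower()]

if __name__ == "__main__":
    for line in lispy_lines("1425000000.100"):   # hypothetical SID from the job inspector
        print(line)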


jbouch03
Path Finder

The number of events is a lot different, but I'm still not seeing why the degradation has only been occurring for the last couple of days, with the same number of events.

lispy for degraded is [AND sourcetype::iis]
lispy for non-degraded is [AND sourcetype::ps]


martin_mueller
SplunkTrust

Do post the top part of the job inspector, once for a job that exhibits the degradation and once for a job that doesn't.
