How do I find sources/source types/hosts/indexes causing typing queue blockage?
Steps
1) Set the following under the [default] stanza in limits.conf:
regex_cpu_profiling = true
From limits.conf.spec:
regex_cpu_profiling = <boolean>
* Enable CPU time metrics for RegexProcessor. Output will be in the
metrics.log file.
Entries in metrics.log will appear per_host_regex_cpu, per_source_regex_cpu,
per_sourcetype_regex_cpu, per_index_regex_cpu.
* Default: false
2) Set the following under the [metrics] stanza in limits.conf:
maxseries = 50
From limits.conf.spec:
maxseries = <integer>
* The number of series to include in the per_x_thruput reports in metrics.log.
* Default: 10
3) Restart Splunk.
4) Wait for the typing queue to block.
5) Go to the Splunk UI. The following queries will be helpful (fill in host= with the instance you are investigating):
Which source type is taking the most CPU time:
index=_internal host= source=*metrics.log group=per_sourcetype_regex_cpu | timechart max(cpu) by series
Which source type is taking the most CPU time per event:
index=_internal host= source=*metrics.log group=per_sourcetype_regex_cpu | timechart max(cpupe) by series
Repeat the queries for per_host_regex_cpu, per_source_regex_cpu, and per_index_regex_cpu (if needed).
Field meanings:
cpu: total CPU time for a given series
cpupe: CPU time per event for a given series
bytes: total bytes processed for a given series
ev: total events for a given series
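If you only have file access to metrics.log and cannot run the searches above, a rough offline equivalent is to sum cpu and ev per series yourself. The following is a minimal Python sketch, not anything shipped with Splunk. It assumes metrics.log lines carry comma-separated key=value pairs (group=..., series="...", cpu=..., ev=...). The function names and the sample lines are made up for illustration, so check them against your own log before relying on the output.

```python
import re
from collections import defaultdict

# Matches key=value pairs such as group=per_sourcetype_regex_cpu or series="syslog".
# ASSUMPTION: metrics.log lines use a comma-separated key=value layout.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_metrics_line(line):
    """Return the key=value pairs of one metrics.log line as a dict of strings."""
    return {key: value.strip('"') for key, value in PAIR_RE.findall(line)}


def rank_series_by_cpu(lines, group="per_sourcetype_regex_cpu"):
    """Sum cpu and ev per series for one group, highest total cpu first."""
    totals = defaultdict(lambda: {"cpu": 0.0, "ev": 0})
    for line in lines:
        fields = parse_metrics_line(line)
        # Skip lines from other metrics groups or without a series name.
        if fields.get("group") != group or "series" not in fields:
            continue
        entry = totals[fields["series"]]
        entry["cpu"] += float(fields.get("cpu", 0))
        entry["ev"] += int(float(fields.get("ev", 0)))
    return sorted(totals.items(), key=lambda item: item[1]["cpu"], reverse=True)


# HYPOTHETICAL sample lines, invented for illustration only.
sample = [
    '01-01-2020 12:00:00.000 +0000 INFO Metrics - group=per_sourcetype_regex_cpu, series="syslog", cpu=120.5, cpupe=0.5, bytes=4096, ev=241',
    '01-01-2020 12:00:00.000 +0000 INFO Metrics - group=per_sourcetype_regex_cpu, series="access_combined", cpu=300.0, cpupe=3.0, bytes=8192, ev=100',
    '01-01-2020 12:00:30.000 +0000 INFO Metrics - group=per_sourcetype_regex_cpu, series="syslog", cpu=30.25, cpupe=0.5, bytes=1024, ev=60',
    '01-01-2020 12:00:30.000 +0000 INFO Metrics - group=per_host_regex_cpu, series="web01", cpu=999.0, cpupe=999.0, bytes=10, ev=1',
]

for series, total in rank_series_by_cpu(sample):
    print(series, total["cpu"], total["ev"])
```

To rank hosts, sources, or indexes instead, pass group="per_host_regex_cpu", "per_source_regex_cpu", or "per_index_regex_cpu". For real use, feed it the lines of your metrics.log rather than the sample list.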
This is a fantastic post. The only thing I would add is that regex_cpu_profiling
was added in 6.6. Thanks!
It's also integrated with the DMC starting in 7.x. However, enabling regex_cpu_profiling is required.
Didn't know that. This post is solid gold!