Getting Data In

Heavy forwarder - Could not send data to output queue (parsingQueue)

nareshinsvu
Builder

Below is my use-case (Heavy Forwarders -> Indexers). Need expert assessment.

1) I have very large log files.
2) So I have used heavy forwarders to cut down the data at the source using REGEX transforms.
3) During a peak load, Splunk stopped ingesting any data.
splunkd.log on forwarders - Could not send data to output queue (parsingQueue)
metrics.log on forwarders - Metrics - group=queue, name=aggqueue, blocked=true, max_size_kb=1024, current_size_kb=1023, current_size=2324, largest_size=2336, smallest_size=2169
4) Restarting Splunk and deleting metrics.log* didn't help.

Are there any specific config changes that can fix this without changing my architecture? Otherwise, I will have to move to Universal Forwarder -> Heavy Forwarders -> Indexers. Are there any issues you foresee with that architecture?

Might changing the interval in inputs.conf help? Currently it is set to 10. I suspect the log is filling very fast and Splunk can't keep up. Should I increase it to, say, 50 and see?
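For reference, this is the kind of inputs.conf change I mean (the script path below is just a placeholder for my actual input):

[script://./bin/collect_logs.sh]
interval = 50

Note that interval only applies to scripted/modular inputs; plain [monitor://] stanzas tail files continuously and have no polling interval.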

1 Solution

nareshinsvu
Builder

My logfiles are complex, with mixed data (JSON without timestamps and non-JSON). Splunk was parsing the timestamp for each JSON line and spending extra time there.

Performance got better only after removing the custom time format and using DATETIME_CONFIG = CURRENT, and separating the JSON data using INDEXED_EXTRACTIONS.
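For anyone else hitting this, here is a sketch of the props.conf settings that worked for me (the sourcetype name is just a placeholder):

[my_json_sourcetype]
INDEXED_EXTRACTIONS = json
DATETIME_CONFIG = CURRENT

INDEXED_EXTRACTIONS has to be set on the first full Splunk instance that parses the data (the HF in my setup), and DATETIME_CONFIG = CURRENT stamps events with the current index time instead of trying to parse a timestamp out of every line.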

Thanks for your effort in this.



harsmarvania57
SplunkTrust

Hi,

Can you please let us know how many queues are blocked? If the indexing queue blocks, then due to back pressure the typing, aggregation, and parsing queues will also block. Is the queue blocking happening on the Heavy Forwarders only, or on the Indexers as well?


nareshinsvu
Builder

Hi,

It's blocking only on the HF. How can I find the number of queues getting blocked? I can see these continuous blocked messages in metrics.log.


harsmarvania57
SplunkTrust

When you check the blocked messages in metrics.log on the HF, the same event shows the name of the queue that has been blocked. If more than one queue was blocked during the same period, you'll see multiple events with blocked=true, each with its own queue name.

Alternatively, you can check the Monitoring Console if you have the MC present and configured in your environment.
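If you don't have the MC handy, a quick search over the HF's internal logs can show which queues are blocking (run wherever the HF's _internal data lands):

index=_internal source=*metrics.log* group=queue blocked=true
| stats count by host, name

Each name value in the result is a blocked queue (e.g. parsingqueue, aggqueue, typingqueue, indexqueue).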


nareshinsvu
Builder

Only aggqueue or parsingqueue at any given instant.

Strangely, this is happening only in the clustered environment (where I modified parsingQueue to 30MB and aggQueue to 10MB).

When I load the same large file onto a single-node Splunk instance (with the default parsingQueue of 6MB and aggQueue of 1MB), it is faster.

Really blowing my mind, and I'm unsure how to fix this in my clustered setup. Any environment variables to tweak?
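For context, the queue sizes I mentioned are what I set in server.conf on the HF:

[queue=parsingQueue]
maxSize = 30MB

[queue=aggQueue]
maxSize = 10MB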


harsmarvania57
SplunkTrust

How large are your files? Do you know the average event size in those files?

You mentioned that only the parsing and aggregation queues are blocked and NOT any others. Based on https://wiki.splunk.com/Community:HowIndexingWorks, those queues handle line breaking, line merging, and timestamp extraction and assignment. In your question you mentioned that you are using REGEX to cut down data; that work happens in the typing queue. Can you please double-check whether the typing queue is blocked in your clustered environment? If your REGEX is not well written and requires many steps, it will really impact Splunk performance in large environments or with big data sources.
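As an aside, the typical shape of that kind of REGEX filtering on a HF looks like this (the stanza and pattern names here are made up for illustration):

props.conf:

[my_sourcetype]
TRANSFORMS-filter = drop_debug_lines

transforms.conf:

[drop_debug_lines]
REGEX = ^DEBUG
DEST_KEY = queue
FORMAT = nullQueue

The REGEX runs against every event, so an expensive pattern (heavy backtracking, many alternations) multiplies across your whole data volume.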

To learn more about regex performance, you can enable regex profiling on the Heavy Forwarder:

limits.conf

regex_cpu_profiling = <boolean>
* Enable CPU time metrics for RegexProcessor. Output will be in the
  metrics.log file.
* Entries in metrics.log will appear as per_host_regex_cpu, per_source_regex_cpu,
  per_sourcetype_regex_cpu, per_index_regex_cpu.
* Default: false
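For example, on the HF this would look like the following (I believe this setting belongs in the [default] stanza of limits.conf):

[default]
regex_cpu_profiling = true

Then restart splunkd and watch metrics.log for the per_*_regex_cpu entries to see which host, source, sourcetype, or index is burning CPU in regex processing.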

nareshinsvu
Builder

I'm not able to catch the root cause. One page suggests increasing parsingQueue to 10MB, but that will impact memory. Has anyone tried that solution?

Currently Splunk is falling way behind and not capturing my log content, with the same messages in the forwarder's splunkd.log.


nareshinsvu
Builder

I increased the parsingQueue size and I am getting the data now. BUT the HF's CPU spiked to 30%. Could someone confirm the best solution for my use case?

[queue=parsingQueue]
maxSize = 10MB