As data travels through Splunk it passes through the following queues and pipelines (among others): aggQueue::Merging -> typingQueue::Typing -> indexQueue::indexerPipe. If the aggregation queue (aggQueue) is the bottleneck, that means any of the processor pipelines that come after it MAY be under heavy load and therefore spending too much time processing the data.
The pipelines listed above are responsible for the following tasks; tuning any of the attributes listed alongside each task may help increase performance:
Merging
Responsible for:
Line Merging: SHOULD_LINEMERGE, BREAK_ONLY_BEFORE, MUST_BREAK_AFTER
Timestamp Extraction: TIME_PREFIX, TIME_FORMAT, DATETIME_CONFIG, MAX_TIMESTAMP_LOOKAHEAD, MAX_DAYS_AGO, MAX_DAYS_HENCE
Typing
Responsible for:
Regex Replacement: TRANSFORMS-<class>, SEDCMD-<class> (in props.conf); SOURCE_KEY, DEST_KEY, REGEX, FORMAT (in transforms.conf) - see the example after this list
indexerPipe
Responsible for: TCP/SYSLOG output, Block Signing, Writing to disk
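To illustrate what the Typing pipeline does, here is a minimal sketch of an index-time regex replacement. The sourcetype name (my_sourcetype), the transform class names (mask-ip, strip_ws), and the regexes are hypothetical placeholders, not settings from your environment:

props.conf:
[my_sourcetype]
# Apply the named transform from transforms.conf at index time
TRANSFORMS-mask = mask-ip
# Inline sed-style replacement: strip trailing whitespace from each event
SEDCMD-strip_ws = s/\s+$//

transforms.conf:
[mask-ip]
# Read from and write back to the raw event
SOURCE_KEY = _raw
DEST_KEY = _raw
# Capture the first three octets, replace the last one
REGEX = (\d{1,3}\.\d{1,3}\.\d{1,3}\.)\d{1,3}
FORMAT = $1xxx

Keep in mind that every TRANSFORMS/SEDCMD entry runs its regex against each event in the typing queue, so fewer and cheaper regexes translate directly into less time spent in this pipeline.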
Given that you indicate a problem with the aggregation queue, I would start by investigating Line Merging and Timestamp Extraction. For example, a few settings that will usually give you a performance boost are: SHOULD_LINEMERGE=false (so that Splunk does not merge lines into multiline events), LINE_BREAKER=<regex> (tell Splunk exactly where to break instead), TIME_PREFIX=<regex> (tell Splunk where to find the timestamp), MAX_TIMESTAMP_LOOKAHEAD=<number> (tell Splunk how long the timestamp is), and TIME_FORMAT=<strptime> (tell Splunk what the timestamp looks like).
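Putting those together, a props.conf stanza for a single-line log with a leading syslog-style timestamp might look like the sketch below; the sourcetype name, the line-breaking regex, and the time format are illustrative assumptions and must be adapted to your actual data:

props.conf:
[my_single_line_sourcetype]
# Do not attempt multiline merging
SHOULD_LINEMERGE = false
# Break events on newlines; the capture group is consumed as the event boundary
LINE_BREAKER = ([\r\n]+)
# Timestamp starts at the beginning of each event
TIME_PREFIX = ^
# Only scan the first 20 characters for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 20
# e.g. "Jan  1 00:00:00"
TIME_FORMAT = %b %d %H:%M:%S

With explicit line breaking and an explicit timestamp location/format, Splunk no longer has to guess either one, which is where most of the aggregation-queue time typically goes.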
Hope this helps.
> please upvote and accept answer if you find it useful - thanks!