multireport vs. appendpipe

actionabledata
Path Finder

Code Architecture:

  • Common code generates the initial results and a unique key for each grouping
  • multireport or appendpipe
  •      stanza 1 with its own set of stats, evals, etc.; uses the key from the common code
  •      stanza 2 with its own set of stats, evals, etc.; uses the key from the common code
  • Final aggregation combines the common, stanza 1, and stanza 2 results using the key

Issue Statement

  • common code + stanza 1 takes about 1 min to execute
  • common code + stanza 2 takes about 1 min to execute
  • common code + stanza 1 + stanza 2, using either multireport or appendpipe, takes about 17 min

[Q] Does this huge execution time difference make sense?

I have attached a few images to show how I think multireport and appendpipe work.

[Q] Is my understanding accurate?

 

[Image: actionabledata_0-1625861746744.png, diagram of the multireport data flow with numbered lines]

The pre-multireport SPL reads the data from the index, filters it, creates some initial fields using streamstats and eventstats, and creates a key that is unique to each of the overall groupings correlated within this code.

Lines 1 and 2 are identical and originate from the pre-multireport SPL. These results are presented to the stanza1 and stanza2 SPL.

Lines 3 and 4 are independent results from stanza1 and stanza2, respectively.

stanza1 and stanza2 execute independently of one another.

The sort and stats clauses within stanza1 and stanza2 are quite different, but neither affects the other.

The final aggregation SPL ties all the data together based on a common key.
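
As a rough illustration of the multireport flow described above, here is a minimal sketch on generated data. The makeresults rows stand in for the real pre-multireport SPL, and the names key, duration, event_count, and avg_duration are placeholders, not taken from the original search. multireport is, as far as I know, not covered in the official Search Reference, so treat the bracketed-pipeline syntax as an assumption based on how it is commonly shown in community posts.

```spl
| makeresults count=6
| streamstats count AS n
| eval key=if(n%2==0, "A", "B"), duration=n*10
| multireport
    [ stats count AS event_count BY key ]
    [ stats avg(duration) AS avg_duration BY key ]
| stats values(event_count) AS event_count values(avg_duration) AS avg_duration BY key
```

Each bracketed pipeline plays the role of one stanza and receives the same incoming rows; the final stats stands in for the aggregation step that joins the stanza outputs on the common key.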

 

[Image: actionabledata_1-1625861773920.png, diagram of the appendpipe data flow with numbered lines]

The pre-appendpipe SPL reads the data from the index, filters it, creates some initial fields using streamstats and eventstats, and creates a key that is unique to each of the overall groupings correlated within this code.

Lines 1, 2, and 3 are identical and originate from the pre-appendpipe SPL. These results are presented to the stanza1 and stanza2 SPL.

Lines 3 and 4 CAN be removed if I filter the input data with a where clause on the flag I called "which", which is associated with each set of data.

Lines 5 and 6 are independent results from stanza1 and stanza2, respectively.

stanza1 and stanza2 execute independently of one another.

The stats clauses within stanza1 and stanza2 are quite different, but neither affects the other.

The final aggregation SPL ties all the data together based on a common key.
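
The appendpipe flow, including the "which" flag filter mentioned above, might look roughly like the sketch below. It runs on generated data (makeresults stands in for the real pre-appendpipe SPL), and every field name and flag value other than "which" is a placeholder rather than something from the original search.

```spl
| makeresults count=6
| streamstats count AS n
| eval key=if(n%2==0, "A", "B"), duration=n*10, which="common"
| appendpipe
    [ where which="common"
    | stats count AS event_count BY key
    | eval which="stanza1" ]
| appendpipe
    [ where which="common"
    | stats avg(duration) AS avg_duration BY key
    | eval which="stanza2" ]
| stats values(event_count) AS event_count values(avg_duration) AS avg_duration BY key
```

Because appendpipe appends its subpipeline output to the rows already in the pipeline, the second appendpipe would otherwise also see the rows added by the first. The where clause on the flag keeps each stanza working only on the common rows.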

 


actionabledata
Path Finder

Update to the appendpipe version of code

I eliminated stanza2 and the final aggregation SPL, reducing the overall code to just the pre-appendpipe SPL and stanza 1, but left the appendpipe syntax in place.

Total execution time = 486 sec

Then, for this exact same search, I eliminated the appendpipe syntax. Everything is the same except for the | appendpipe and [ ] syntax.

Total execution time = 77 sec
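
For readers who want to try a similar comparison, the two variants can be sketched on generated data as follows. makeresults stands in for the real pre-appendpipe SPL, and key and event_count are placeholder names, so timings on this toy data will not resemble the figures above. With the appendpipe wrapper:

```spl
| makeresults count=6
| streamstats count AS n
| eval key=if(n%2==0, "A", "B")
| appendpipe
    [ stats count AS event_count BY key ]
```

And with only the | appendpipe and [ ] removed:

```spl
| makeresults count=6
| streamstats count AS n
| eval key=if(n%2==0, "A", "B")
| stats count AS event_count BY key
```

Note that the two do not return the same result set: the appendpipe form returns the incoming rows plus the appended stats rows, while the bare form returns only the stats rows.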

The overhead of using appendpipe is HUGE.

I suspect the same is true for using multireport.
