<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Combining And Analyzing Stats Across Events by Other Fields in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594722#M206997</link>
    <description>&lt;P&gt;The source system produces messages that contain a field "transaction_id", which is a UUID; each message contains data about some unknown number of accounts (the account data itself is not relevant here, so I will exclude it from further discussion).&lt;/P&gt;&lt;P&gt;Our service reads messages from a producer and is optimized to multithread the processing of these larger messages in blocks of 100 accounts. So, any inbound message is "split" into blocks, each of which generates log messages containing three major pieces of data:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The source message's "transaction_id" value (extracted via regex to a field called "transaction_id")&lt;UL&gt;&lt;LI&gt;There will be at least one event per transaction_id, but there are often more (there can be thousands of accounts in especially large messages)&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The number of accounts represented by the event, as expressed in the message body&amp;nbsp;(again, extracted via regex to a field "message_accounts")&lt;/LI&gt;&lt;LI&gt;How long the block of accounts took to process (again, extracted via regex to a field "message_processing")&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I can get this working, and it gives me a table like the following:&lt;/P&gt;&lt;P&gt;Side note - the important commands are:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| bin message_accounts span=30
| stats avg(message_processing) by message_accounts&lt;/LI-CODE&gt;&lt;TABLE border="1" width="44.443129208754215%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;message_accounts&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;avg(message_processing)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;0-30&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;184&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;30-60&lt;/TD&gt;&lt;TD height="25px"&gt;966&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;60-90&lt;/TD&gt;&lt;TD height="25px"&gt;1610&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;90-120&lt;/TD&gt;&lt;TD height="25px"&gt;2096&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, because we split any large message into chunks of at most 100 accounts, and nothing currently aggregates these chunk-level stats by their shared "transaction_id" values, the chart only analyzes the individual chunks. Instead, I want to sum the "message_accounts" and "message_processing" values across all events that share a common "transaction_id", reconstructing the total accounts and total processing time per transaction before binning and averaging.&lt;/P&gt;</description>
    <pubDate>Thu, 21 Apr 2022 18:51:45 GMT</pubDate>
    <dc:creator>duggym122</dc:creator>
    <dc:date>2022-04-21T18:51:45Z</dc:date>
    <item>
      <title>Combining And Analyzing Stats Across Events by Other Fields</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594704#M206988</link>
      <description>&lt;P&gt;tl;dr: I want to take a list of events and separately sum the fields "message_accounts" (accounts processed in the event) and "message_processing" (time taken to process them) by "transaction_id" - in essence, two composite values per transaction_id across however many chunks it was split into - so that I can bucket/bin the summed message_accounts and take the corresponding average of the summed message_processing values across each of these families of events.&lt;BR /&gt;&lt;BR /&gt;I have messages that show sub-totals of processing time for split-off chunks of a larger message, identified by a field called "transaction_id".&lt;/P&gt;&lt;P&gt;For example, our service accepts consolidated messages from another service (from one unit to thousands of combined message units) and splits them into chunks no larger than 100. Each chunk retains the "transaction_id" of the source message, so that value is unique to the original message, which we split into more manageable pieces to be processed in parallel.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Apr 2022 16:30:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594704#M206988</guid>
      <dc:creator>duggym122</dc:creator>
      <dc:date>2022-04-21T16:30:55Z</dc:date>
    </item>
    <item>
      <title>Re: Combining And Analyzing Stats Across Events by Other Fields</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594709#M206991</link>
      <description>&lt;P&gt;This use case is not clear.&amp;nbsp; Please share some sample (sanitized) data, the SPL you've tried, the actual results, and the desired results.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Apr 2022 17:18:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594709#M206991</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2022-04-21T17:18:03Z</dc:date>
    </item>
    <item>
      <title>Re: Combining And Analyzing Stats Across Events by Other Fields</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594722#M206997</link>
      <description>&lt;P&gt;The source system produces messages that contain a field "transaction_id", which is a UUID; each message contains data about some unknown number of accounts (the account data itself is not relevant here, so I will exclude it from further discussion).&lt;/P&gt;&lt;P&gt;Our service reads messages from a producer and is optimized to multithread the processing of these larger messages in blocks of 100 accounts. So, any inbound message is "split" into blocks, each of which generates log messages containing three major pieces of data:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The source message's "transaction_id" value (extracted via regex to a field called "transaction_id")&lt;UL&gt;&lt;LI&gt;There will be at least one event per transaction_id, but there are often more (there can be thousands of accounts in especially large messages)&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The number of accounts represented by the event, as expressed in the message body&amp;nbsp;(again, extracted via regex to a field "message_accounts")&lt;/LI&gt;&lt;LI&gt;How long the block of accounts took to process (again, extracted via regex to a field "message_processing")&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I can get this working, and it gives me a table like the following:&lt;/P&gt;&lt;P&gt;Side note - the important commands are:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| bin message_accounts span=30
| stats avg(message_processing) by message_accounts&lt;/LI-CODE&gt;&lt;TABLE border="1" width="44.443129208754215%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;message_accounts&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;avg(message_processing)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;0-30&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;184&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;30-60&lt;/TD&gt;&lt;TD height="25px"&gt;966&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;60-90&lt;/TD&gt;&lt;TD height="25px"&gt;1610&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD height="25px"&gt;90-120&lt;/TD&gt;&lt;TD height="25px"&gt;2096&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, because we split any large message into chunks of at most 100 accounts, and nothing currently aggregates these chunk-level stats by their shared "transaction_id" values, the chart only analyzes the individual chunks. Instead, I want to sum the "message_accounts" and "message_processing" values across all events that share a common "transaction_id", reconstructing the total accounts and total processing time per transaction before binning and averaging.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Apr 2022 18:51:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-And-Analyzing-Stats-Across-Events-by-Other-Fields/m-p/594722#M206997</guid>
      <dc:creator>duggym122</dc:creator>
      <dc:date>2022-04-21T18:51:45Z</dc:date>
    </item>
  </channel>
</rss>