<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Aggregation search for different fields help in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648862#M224441</link>
    <description>&lt;P&gt;I'm glad that transaction solves the problem at hand. &amp;nbsp;People hesitate to use this command for a good reason. &amp;nbsp;As you gain more experience with stats, eventstats, and some other efficient ways to aggregate data, you may one day find a better method than transaction for this. &amp;nbsp;There are some good examples that you can find in this community as well.&lt;/P&gt;</description>
    <pubDate>Sat, 01 Jul 2023 07:27:13 GMT</pubDate>
    <dc:creator>yuanliu</dc:creator>
    <dc:date>2023-07-01T07:27:13Z</dc:date>
    <item>
      <title>Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648668#M224400</link>
      <description>&lt;P&gt;Hello everyone! I have Splunk events in the following format:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;activity_time:&amp;nbsp;2023-06-29T12:45:06Z&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;event_time:&amp;nbsp;2023-06-29T14:49:42.787Z&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;shipment_status:&amp;nbsp;delivered&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;timestamp:&amp;nbsp;2023-06-29T14:49:51.069Z&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;tracking_number:95AAEC4900000&lt;SPAN class=""&gt;&lt;SPAN class=""&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;And of course, every event has a time value on top of already provided values, &lt;EM&gt;shipment_status&lt;/EM&gt; can contain values like 'delivered' and 'in_process' and some other values.&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;I need to find the percentage of events with the same 'tracking_number' values, for which events with the shipment_status: &lt;STRONG&gt;in_process&lt;/STRONG&gt; values came first, before events with the&amp;nbsp;shipment_status: &lt;STRONG&gt;delivered&lt;/STRONG&gt; status values.&amp;nbsp;&lt;/EM&gt;&lt;BR /&gt;It means that I need to filter events based on their tracking_number, find among those events only those with shipment_status:&amp;nbsp;&lt;STRONG&gt;in_process&lt;/STRONG&gt; or &lt;STRONG&gt;delivered&lt;/STRONG&gt;, and then compare the time of both events and count them to the overall percentage of the filtered events if &lt;STRONG&gt;in_process&lt;/STRONG&gt; events were first, for example, 1 hour before the &lt;STRONG&gt;delivered&lt;/STRONG&gt; event.&lt;BR /&gt;&lt;BR /&gt;I'm very confused with the operators that Splunk uses for the filtering and calculating logic, could someone please help me with the composition of the query?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jun 2023 15:04:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648668#M224400</guid>
      <dc:creator>heorhii12412</dc:creator>
      <dc:date>2023-06-29T15:04:11Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648745#M224409</link>
      <description>&lt;P&gt;I know that feeling when I couldn't see how the question I wanted answered can be expressed in SPL. &amp;nbsp;In this case, though, you also need some command that people hesitate to use, namely&amp;nbsp;&lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction" target="_blank" rel="noopener"&gt;transaction&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;But first, is there more vertical knowledge that volunteers will need? &amp;nbsp;For example, should the three timestamps, or any relationship therein be considered? &amp;nbsp;When calculating which event comes first, which time value should be used? &amp;nbsp;In the following, I will only consider _time that Splunk already has.&lt;/P&gt;&lt;P&gt;Another example. &amp;nbsp;For each tracking_number in consideration, how many events can there be in each shipment_status? &amp;nbsp;I will guess that only one tracking_number-shipment_status combination is possible. &amp;nbsp;If there are more, you will need to specify how you are going to group them.&lt;/P&gt;&lt;P&gt;So, let's look at the question at hand: You want to group events with 'tracking_number' - in_process -&amp;gt; delivered, and see how many are grouped and how many are not grouped.&lt;/P&gt;&lt;P&gt;An easy way to think about this is to consider the transition from in_process to delivered as a transaction. &amp;nbsp;SPL's transaction command produces two useful fields, closed_txn tells you whether the transaction is closed, i.e., whether the group satisfies all requirements you specified, and eventcount that tell you how many events are in that group.&lt;/P&gt;&lt;P&gt;Once you have the transaction, you calculate how many events are in closed group and unclosed group, then compare the count in closed group to the total. &amp;nbsp;Like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| transaction tracking_number startswith=shipment_status=in_process endswith=shipment_status=delivered keepevicted=true
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1 ``` remaining eventcount only includes complete transactions ```
| eval percentage = eventcount * 100 / totalcount&lt;/LI-CODE&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;&lt;P&gt;The following is a data emulation that you can play with and compare with real data&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults
| eval data = mvappend("  activity_time: \"2023-06-29T12:45:06Z\"
   event_time: \"2023-06-29T14:49:42.787Z\"
   shipment_status: delivered
   timestamp: \"2023-06-29T14:49:51.069Z\"
   tracking_number:95AAEC4900000", "  activity_time: \"2023-06-29T12:25:06Z\"
   event_time: \"2023-06-29T14:49:22.787Z\"
   shipment_status: in_process
   timestamp: \"2023-06-29T14:29:51.069Z\"
   tracking_number:95AAEC4900000", "  activity_time: \"2023-06-29T11:45:06Z\"
   event_time: \"2023-06-29T13:49:42.787Z\"
   shipment_status: delivered
   timestamp: \"2023-06-29T13:49:51.069Z\"
   tracking_number:95AAEC4900000")
| mvexpand data
| rename data as _raw
| extract kvdelim=":" pairdelim="
"
| eval _time = strptime(timestamp, "%FT%H:%M:%S.%3N%Z")
``` data emulation above ```&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 30 Jun 2023 06:44:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648745#M224409</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2023-06-30T06:44:45Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648795#M224421</link>
      <description>&lt;P&gt;Thanks a lot, you really just saved me! I haven't even watched on the &lt;STRONG&gt;transaction&amp;nbsp;&lt;/STRONG&gt;command side, tried to do it just by &lt;STRONG&gt;eval&lt;/STRONG&gt; and &lt;STRONG&gt;eventstats &lt;/STRONG&gt;commands.&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jun 2023 11:17:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648795#M224421</guid>
      <dc:creator>heorhii12412</dc:creator>
      <dc:date>2023-06-30T11:17:20Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648862#M224441</link>
      <description>&lt;P&gt;I'm glad that transaction solves the problem at hand. &amp;nbsp;People hesitate to use this command for a good reason. &amp;nbsp;As you gain more experience with stats, eventstats, and some other efficient ways to aggregate data, you may one day find a better method than transaction for this. &amp;nbsp;There are some good examples that you can find in this community as well.&lt;/P&gt;</description>
      <pubDate>Sat, 01 Jul 2023 07:27:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648862#M224441</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2023-07-01T07:27:13Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648867#M224442</link>
      <description>&lt;P&gt;+1 to that.&lt;/P&gt;&lt;P&gt;transaction is a relatively straightforward command to use but it has its limitations (both in terms of functionality as well as performace) so it's often better to use other means to get the desired results. But sometimes transaction can be the only reasonable way to do something - in such case it's good to be able to filter your data as much as possible before feeding it to the transaction command.&lt;/P&gt;</description>
      <pubDate>Sat, 01 Jul 2023 08:05:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/648867#M224442</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2023-07-01T08:05:09Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/649020#M224450</link>
      <description>&lt;P&gt;I also have a question related to the time frames, for the query you provided earlier&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| transaction tracking_number startswith=shipment_status=in_transit endswith=shipment_status=delivered keepevicted=true
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1``` remaining eventcount only includes complete transactions ```
| eval percentage = eventcount * 100 / totalcount&lt;/LI-CODE&gt;&lt;P&gt;Is it possible to add a clause, that only transactions, for which events span more than 1 hour between them, could be included in the&amp;nbsp;&lt;STRONG&gt;eventcount&amp;nbsp;&lt;/STRONG&gt;? using the &lt;STRONG&gt;_time&lt;/STRONG&gt; field for example?&amp;nbsp;&lt;BR /&gt;Transaction's&amp;nbsp;&lt;STRONG&gt;maxspan &lt;/STRONG&gt;and&lt;STRONG&gt;&amp;nbsp;&lt;SPAN&gt;maxpause&amp;nbsp;&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;SPAN&gt;do not suit these calculations, and I couldn't write something using &lt;STRONG&gt;mvindex&lt;/STRONG&gt; or &lt;STRONG&gt;strptime&amp;nbsp;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN&gt;to make it work.&lt;BR /&gt;For example,&amp;nbsp;&amp;nbsp;&lt;EM&gt;in_transit&lt;/EM&gt; event was received at&amp;nbsp;7/3/23 9:07:00.000 AM and &lt;EM&gt;delivered&lt;/EM&gt; events came after that at&amp;nbsp;7/3/23 11:10:00.000 AM, in this case, this transaction should be considered as completed and counted to&amp;nbsp;&lt;STRONG&gt;eventcount&lt;/STRONG&gt;.&lt;BR /&gt;I would be very grateful for your help!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 03 Jul 2023 14:30:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/649020#M224450</guid>
      <dc:creator>heorhii12412</dc:creator>
      <dc:date>2023-07-03T14:30:20Z</dc:date>
    </item>
    <item>
      <title>Re: Aggregation search for different fields help</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/649040#M224455</link>
      <description>&lt;P&gt;To confirm, even if a shipment is delivered, the percentage count should only include "long" transactions. &amp;nbsp;Is this correct?&lt;/P&gt;&lt;P&gt;transaction command produces a "duration" field for closed transactions. &amp;nbsp;So, it is easy to filter by duration in addition to closed state.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| transaction tracking_number startswith=shipment_status=in_transit endswith=shipment_status=delivered keepevicted=true
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1 AND duration &amp;gt; 3600
``` remaining eventcount only includes complete transactions that lasted longer than 1 hour ```
| eval percentage = eventcount * 100 / totalcount&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 03 Jul 2023 21:03:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Aggregation-search-for-different-fields-help/m-p/649040#M224455</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2023-07-03T21:03:19Z</dc:date>
    </item>
  </channel>
</rss>

