<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic transaction vs stats commands in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9224#M48</link>
    <description>&lt;P&gt;When should I use the transaction command and when should I use stats?&lt;/P&gt;

&lt;P&gt;I could use a recap...&lt;/P&gt;</description>
    <pubDate>Fri, 15 Jan 2010 09:11:00 GMT</pubDate>
    <dc:creator>cfrln</dc:creator>
    <dc:date>2010-01-15T09:11:00Z</dc:date>
    <item>
      <title>transaction vs stats commands</title>
      <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9224#M48</link>
      <description>&lt;P&gt;When should I use the transaction command and when should I use stats?&lt;/P&gt;

&lt;P&gt;I could use a recap...&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2010 09:11:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9224#M48</guid>
      <dc:creator>cfrln</dc:creator>
      <dc:date>2010-01-15T09:11:00Z</dc:date>
    </item>
    <item>
      <title>Re: transaction vs stats commands</title>
      <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9225#M49</link>
      <description>&lt;P&gt;Transaction marks a series of events as interrelated, based on a shared piece of common information, for example the flow of a packet based on a clientIP address, or a purchase based on a user_ID.&lt;/P&gt;

&lt;P&gt;Stats produces statistical information by looking at a group of events. It is primarily used when the field(s) in question have numeric values and you want to do a statistical calculation, for example computing the average time to complete a transaction based on the average of all latencies, or finding retry attempts that exceed the session timeout by more than 2 standard deviations.&lt;/P&gt;
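
&lt;P&gt;As a rough sketch of that kind of calculation, assuming hypothetical fields named &lt;CODE&gt;latency&lt;/CODE&gt; (numeric) and &lt;CODE&gt;user_id&lt;/CODE&gt; that are not taken from any real data set, a search along these lines would return one summary row per user rather than the events themselves:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | stats count avg(latency) as avg_latency stdev(latency) as stdev_latency by user_id
&lt;/CODE&gt;&lt;/PRE&gt;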

&lt;P&gt;Both combine events. However, transaction creates relationships based on metadata you provide, while stats calculates statistical relationships based on values or relationships already defined (by you, or by Splunk).&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jan 2010 01:29:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9225#M49</guid>
      <dc:creator>cervelli</dc:creator>
      <dc:date>2010-01-16T01:29:37Z</dc:date>
    </item>
    <item>
      <title>Re: transaction vs stats commands</title>
      <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9226#M50</link>
      <description>&lt;P&gt;Both are similar in that they allow you to aggregate individual events/lines together. &lt;/P&gt;

&lt;P&gt;However, &lt;CODE&gt;stats&lt;/CODE&gt; is meant to calculate statistical values on events grouped by the value of fields, and discards the events.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;transaction&lt;/CODE&gt; can also group events based on the same field values, but it does not compute statistics over the grouped events (other than the duration between the oldest and newest), and it retains the raw events and other field values from the original events. &lt;CODE&gt;transaction&lt;/CODE&gt; can also group events using much more complex criteria, such as limiting the grouping by time span or delays, or requiring terms to define the start or the end of a group.&lt;/P&gt;
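
&lt;P&gt;A minimal sketch of such criteria, assuming a hypothetical &lt;CODE&gt;clientip&lt;/CODE&gt; field and events that contain the illustrative terms "login" and "logout": the group is capped at 30 minutes overall, is split wherever events are more than 5 minutes apart, and is delimited by the start and end terms.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | transaction clientip maxspan=30m maxpause=5m startswith="login" endswith="logout"
&lt;/CODE&gt;&lt;/PRE&gt;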

&lt;P&gt;There is a small set of use cases that can be solved with either one, primarily through clever use of &lt;CODE&gt;stats&lt;/CODE&gt;. Mostly these use some variation of &lt;CODE&gt;stats max(_time),min(_time) by grouping_field&lt;/CODE&gt; to compute the duration in lieu of using &lt;CODE&gt;transaction&lt;/CODE&gt; to compute the duration of a group.&lt;/P&gt;
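
&lt;P&gt;One possible shape of that variation is sketched below. The names &lt;CODE&gt;first_seen&lt;/CODE&gt; and &lt;CODE&gt;last_seen&lt;/CODE&gt; are made up for illustration, and &lt;CODE&gt;grouping_field&lt;/CODE&gt; stands for whatever field identifies the group:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | stats min(_time) as first_seen max(_time) as last_seen by grouping_field | eval duration = last_seen - first_seen
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This only matches what &lt;CODE&gt;transaction&lt;/CODE&gt; would report when each value of the grouping field corresponds to exactly one group.&lt;/P&gt;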

&lt;P&gt;In some cases &lt;CODE&gt;stats&lt;/CODE&gt; may be less resource-intensive than &lt;CODE&gt;transaction&lt;/CODE&gt;, though in those cases where either command can be used, any difference is likely to be small.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jan 2010 03:51:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9226#M50</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2010-01-16T03:51:25Z</dc:date>
    </item>
    <item>
      <title>Re: transaction vs stats commands</title>
      <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9227#M51</link>
      <description>&lt;P&gt;The transaction command is most useful in two specific cases:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;&lt;P&gt;The unique id (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the case when the identifier is reused, for example web sessions identified by cookie/client IP. In this case, time span or pauses are also used to segment the data into transactions. In other cases when an identifier is reused, say in DHCP logs, a particular message may identify the beginning or end of a transaction.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;It is desirable to see the raw text of the events combined, rather than an analysis of the constituent fields of the events.&lt;/P&gt;&lt;/LI&gt;
&lt;/OL&gt;
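
&lt;P&gt;For the second case, a sketch of what this might look like, using a &lt;CODE&gt;trade_id&lt;/CODE&gt; field purely as an illustrative identifier: each result row carries the combined raw text of the grouped events, along with the &lt;CODE&gt;duration&lt;/CODE&gt; and &lt;CODE&gt;eventcount&lt;/CODE&gt; fields that &lt;CODE&gt;transaction&lt;/CODE&gt; adds.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | transaction trade_id | table _time duration eventcount _raw
&lt;/CODE&gt;&lt;/PRE&gt;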

&lt;P&gt;In other cases, it's usually better to use stats as the performance is higher, especially in a distributed search environment. Often there is a unique id and stats can be used.&lt;/P&gt;

&lt;P&gt;For example, to compute statistics on the duration of trades identified by the unique id "trade_id" the following searches will yield the same answer:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | transaction trade_id | chart count by duration span=log2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | stats range(_time) as duration by trade_id | chart count by duration span=log2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The second search is more efficient.&lt;/P&gt;

&lt;P&gt;If, however, trade_ids are reused but each trade ends with some text "END" the only viable solution is:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | transaction trade_id endswith=END | chart count by duration span=log2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If, instead, trade_ids are not reused within 10 minutes, the solution is:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | transaction trade_id maxpause=10m | chart count by duration span=log2
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Jan 2010 06:04:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9227#M51</guid>
      <dc:creator>Stephen_Sorkin</dc:creator>
      <dc:date>2010-01-16T06:04:26Z</dc:date>
    </item>
    <item>
      <title>Re: transaction vs stats commands</title>
      <link>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9228#M52</link>
      <description>&lt;P&gt;One other surprising and wonderful thing about the &lt;CODE&gt;transaction&lt;/CODE&gt; command is that it recognizes transitive relationships.  If some events have &lt;CODE&gt;userID&lt;/CODE&gt; &amp;amp; &lt;CODE&gt;src_IP&lt;/CODE&gt; and others have &lt;CODE&gt;sessionID&lt;/CODE&gt; &amp;amp; &lt;CODE&gt;src_IP&lt;/CODE&gt; and still others have &lt;CODE&gt;sessionID&lt;/CODE&gt; &amp;amp; &lt;CODE&gt;userID&lt;/CODE&gt;, the &lt;CODE&gt;transaction&lt;/CODE&gt; command will be able to recognize the transitive relationships and bundle them all together with a single command; this is not the case for &lt;CODE&gt;stats&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jun 2015 21:48:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/transaction-vs-stats-commands/m-p/9228#M52</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-06-12T21:48:29Z</dc:date>
    </item>
  </channel>
</rss>

