<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Transaction includes non-relevant events? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485653#M135876</link>
    <description>&lt;P&gt;Why does &lt;CODE&gt;transaction&lt;/CODE&gt; group irrelevant events together with relevant ones? What am I doing wrong?&lt;/P&gt;

&lt;P&gt;Sample Postfix log event:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Jan 20 14:04:57 smtp2 postfix/local[14880]: 0EA1B1027961: to=&amp;lt;REDACTED&amp;gt;, orig_to=&amp;lt;root&amp;gt;, relay=local, delay=0.04, delays=0.03/0.01/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;...Search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="postfix_syslog" host="*-smtp*" 
| rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:" 
| transaction threadId host maxspan=24h 
| search threadId=0EA1B1027961
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;... returns this result:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Jan 20 13:14:36 smtp2 postfix/smtpd[8618]: connect from &amp;lt;somehost&amp;gt;
Jan 20 13:14:36 smtp2 postfix/smtpd[8618]: disconnect from &amp;lt;somehost&amp;gt;
Jan 20 13:19:36 smtp2 postfix/smtpd[9238]: connect from &amp;lt;somehost&amp;gt;
Jan 20 13:19:36 smtp2 postfix/smtpd[9238]: disconnect from &amp;lt;somehost&amp;gt;
... &amp;lt;repeated ~ 20 times&amp;gt;
Jan 20 13:24:36 smtp2 postfix/smtpd[9857]: disconnect from &amp;lt;somehost&amp;gt;
Jan 20 14:04:57 smtp2 postfix/pickup[6583]: 0EA1B1027961: uid=0 from=&amp;lt;root&amp;gt;
Jan 20 14:04:57 smtp2 postfix/cleanup[14874]: 0EA1B1027961: message-id=&amp;lt;REDACTED&amp;gt;
Jan 20 14:04:57 smtp2 postfix/qmgr[1436]: 0EA1B1027961: from=&amp;lt;REDACTED&amp;gt;, size=1343, nrcpt=1 (queue active)
Jan 20 14:04:57 smtp2 postfix/local[14880]: 0EA1B1027961: to=&amp;lt;REDACTED&amp;gt;, orig_to=&amp;lt;root&amp;gt;, relay=local, delay=0.04, delays=0.03/0.01/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Jan 20 14:04:57 smtp2 postfix/qmgr[1436]: 0EA1B1027961: removed
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Only the bottom few events include the search keyword &lt;CODE&gt;0EA1B1027961&lt;/CODE&gt; - the rest don't.&lt;/P&gt;

&lt;P&gt;The above in turn is the result of my attempt to use &lt;CODE&gt;transaction&lt;/CODE&gt; on a standard Postfix log on our smtp relays that is fairly simple: each email passing through (or deferred or rejected) results in several log entries that have a common session ID. I'd like to group them together and run various stats commands on them. Yet my results are way off.&lt;/P&gt;

&lt;P&gt;Original field extraction and transaction statement displaying results in a table:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; sourcetype="postfix_syslog" host="*-smtp*"
 | rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:"
 | transaction threadId host maxspan=24h
 | search threadId=*
 | table threadId status duration eventcount
 | sort -duration
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Results:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;threadId    status  duration    eventcount
0EA1B1027961    sent    3300    29
AB0711005181    sent    233 11
7BF211005181    sent    202 11
5CE8B1005181    sent    178 11
244971005181    sent    173 10
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Clicking on an event and then "view events" brings up a "transaction" event above, that includes irrelevant events.&lt;/P&gt;

&lt;P&gt;Appreciate any help and apologies if my question is not clear.&lt;/P&gt;

&lt;P&gt;Incidentals:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Splunk Enterprise 7.1.2&lt;/LI&gt;
&lt;LI&gt;fairly new to Splunk&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;P.S. Excluding the &lt;CODE&gt;host&lt;/CODE&gt; field from the &lt;CODE&gt;transaction&lt;/CODE&gt; command returns better results (but not perfect if I do want to isolate the results to a single host).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| transaction threadId maxspan=24h 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;So how do I ensure that including the &lt;CODE&gt;host&lt;/CODE&gt; field in the transaction statement does not result in including irrelevant events? In other words, that only events with both the relevant &lt;CODE&gt;threadId&lt;/CODE&gt; and &lt;CODE&gt;host&lt;/CODE&gt; values are grouped by the &lt;CODE&gt;transaction&lt;/CODE&gt;?&lt;/P&gt;

&lt;P&gt;P.P.S. This &lt;EM&gt;may&lt;/EM&gt; be related (from "&lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.1/SearchReference/Transaction"&gt;Search Reference - Transaction&lt;/A&gt;"):&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;The events are grouped into transactions based on the values of this field. If a quoted list of fields is specified, events are grouped together if they have the same value for each of the fields.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;What is "a quoted list of fields"?&lt;/P&gt;</description>
    <pubDate>Tue, 21 Jan 2020 00:27:40 GMT</pubDate>
    <dc:creator>mitag</dc:creator>
    <dc:date>2020-01-21T00:27:40Z</dc:date>
    <item>
      <title>Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485653#M135876</link>
      <description>&lt;P&gt;Why does &lt;CODE&gt;transaction&lt;/CODE&gt; group irrelevant events together with relevant ones? What am I doing wrong?&lt;/P&gt;

&lt;P&gt;Sample Postfix log event:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Jan 20 14:04:57 smtp2 postfix/local[14880]: 0EA1B1027961: to=&amp;lt;REDACTED&amp;gt;, orig_to=&amp;lt;root&amp;gt;, relay=local, delay=0.04, delays=0.03/0.01/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;...Search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="postfix_syslog" host="*-smtp*" 
| rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:" 
| transaction threadId host maxspan=24h 
| search threadId=0EA1B1027961
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;... returns this result:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Jan 20 13:14:36 smtp2 postfix/smtpd[8618]: connect from &amp;lt;somehost&amp;gt;
Jan 20 13:14:36 smtp2 postfix/smtpd[8618]: disconnect from &amp;lt;somehost&amp;gt;
Jan 20 13:19:36 smtp2 postfix/smtpd[9238]: connect from &amp;lt;somehost&amp;gt;
Jan 20 13:19:36 smtp2 postfix/smtpd[9238]: disconnect from &amp;lt;somehost&amp;gt;
... &amp;lt;repeated ~ 20 times&amp;gt;
Jan 20 13:24:36 smtp2 postfix/smtpd[9857]: disconnect from &amp;lt;somehost&amp;gt;
Jan 20 14:04:57 smtp2 postfix/pickup[6583]: 0EA1B1027961: uid=0 from=&amp;lt;root&amp;gt;
Jan 20 14:04:57 smtp2 postfix/cleanup[14874]: 0EA1B1027961: message-id=&amp;lt;REDACTED&amp;gt;
Jan 20 14:04:57 smtp2 postfix/qmgr[1436]: 0EA1B1027961: from=&amp;lt;REDACTED&amp;gt;, size=1343, nrcpt=1 (queue active)
Jan 20 14:04:57 smtp2 postfix/local[14880]: 0EA1B1027961: to=&amp;lt;REDACTED&amp;gt;, orig_to=&amp;lt;root&amp;gt;, relay=local, delay=0.04, delays=0.03/0.01/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Jan 20 14:04:57 smtp2 postfix/qmgr[1436]: 0EA1B1027961: removed
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Only the bottom few events include the search keyword &lt;CODE&gt;0EA1B1027961&lt;/CODE&gt; - the rest don't.&lt;/P&gt;

&lt;P&gt;The above in turn is the result of my attempt to use &lt;CODE&gt;transaction&lt;/CODE&gt; on a standard Postfix log on our smtp relays that is fairly simple: each email passing through (or deferred or rejected) results in several log entries that have a common session ID. I'd like to group them together and run various stats commands on them. Yet my results are way off.&lt;/P&gt;

&lt;P&gt;Original field extraction and transaction statement displaying results in a table:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; sourcetype="postfix_syslog" host="*-smtp*"
 | rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:"
 | transaction threadId host maxspan=24h
 | search threadId=*
 | table threadId status duration eventcount
 | sort -duration
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Results:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;threadId    status  duration    eventcount
0EA1B1027961    sent    3300    29
AB0711005181    sent    233 11
7BF211005181    sent    202 11
5CE8B1005181    sent    178 11
244971005181    sent    173 10
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Clicking on an event and then "view events" brings up a "transaction" event above, that includes irrelevant events.&lt;/P&gt;

&lt;P&gt;Appreciate any help and apologies if my question is not clear.&lt;/P&gt;

&lt;P&gt;Incidentals:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Splunk Enterprise 7.1.2&lt;/LI&gt;
&lt;LI&gt;fairly new to Splunk&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;P.S. Excluding the &lt;CODE&gt;host&lt;/CODE&gt; field from the &lt;CODE&gt;transaction&lt;/CODE&gt; command returns better results (but not perfect if I do want to isolate the results to a single host).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| transaction threadId maxspan=24h 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;So how do I ensure that including the &lt;CODE&gt;host&lt;/CODE&gt; field in the transaction statement does not result in including irrelevant events? In other words, that only events with both the relevant &lt;CODE&gt;threadId&lt;/CODE&gt; and &lt;CODE&gt;host&lt;/CODE&gt; values are grouped by the &lt;CODE&gt;transaction&lt;/CODE&gt;?&lt;/P&gt;

&lt;P&gt;P.P.S. This &lt;EM&gt;may&lt;/EM&gt; be related (from "&lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.1/SearchReference/Transaction"&gt;Search Reference - Transaction&lt;/A&gt;"):&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;The events are grouped into transactions based on the values of this field. If a quoted list of fields is specified, events are grouped together if they have the same value for each of the fields.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;What is "a quoted list of fields"?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 00:27:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485653#M135876</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T00:27:40Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485654#M135877</link>
      <description>&lt;P&gt;The quoted list of fields looks like &lt;CODE&gt;transaction "threadId host" maxspan=24h&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 02:39:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485654#M135877</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2020-01-21T02:39:55Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485655#M135878</link>
      <description>&lt;P&gt;Running transaction with maxspan=24h is going to bring your search head to its knees for any appreciable volume of search results.  At first glance, transaction might appear to be the function you're looking for, but it really isn't.  Stats is what you want.&lt;/P&gt;

&lt;P&gt;I mocked-up your data with just the meaningful fields like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=50
| eval raw=split("0EA1B1027961,2EB1F149F9CD,1D7620F38E4C,0EA1B1027961,DFC2597BE73E",",")
| eval statuschoices=split("sent, ",",")
| eval threadId=mvindex(raw,random()%5)
| eval alphabet=split("abcdefg","")
| eval status=mvindex(statuschoices,random()%2)
| eval host=mvindex(alphabet,random()%7)
`comment("Mocked-up sample data above with credit to to4kawa")`
`comment("The stats line immediately below is important since it does all the heavy lifting for you")`
| stats count,values(status) as Status,earliest(_time) as Beginning,latest(_time) as Ending by host,threadId
`comment("You need to compute the duration below by subtracting Beginning time from Ending time, both of which are in epoch format")`
`comment("I added some random value to Ending just to give some variation in duration")`
| eval Ending = Ending+random()%100000
| eval Duration = Ending - Beginning
`comment("Neither the Beginning nor Ending fields are important at this point, so remove them from the results")`
| fields - Beginning,Ending
| sort -Duration
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It gives what I think you're looking for.&lt;/P&gt;

&lt;P&gt;Adapting your search to this approach looks something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="postfix_syslog" host="*-smtp*"
| rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:"
| stats count,values(status) as Status,earliest(_time) as Beginning,latest(_time) as Ending by host,threadId
| search threadId=*
| eval Duration = Ending - Beginning
| fields - Beginning,Ending
| sort -Duration
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Hope that helps!&lt;/P&gt;

&lt;P&gt;rmmiller&lt;/P&gt;

&lt;P&gt;References:&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Stats"&gt;https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Stats&lt;/A&gt;&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Timefunctions"&gt;https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Timefunctions&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 03:16:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485655#M135878</guid>
      <dc:creator>rmmiller</dc:creator>
      <dc:date>2020-01-21T03:16:31Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485656#M135879</link>
      <description>&lt;P&gt;Thanks - that's what I thought - but it must be something other than an "AND" between fields in transactions: no results when I use it. Would you know what Splunk means when they say "If a quoted list of fields is specified, events are grouped together if they have &lt;STRONG&gt;the same value for each of the fields&lt;/STRONG&gt;."?&lt;/P&gt;

&lt;P&gt;"the same value for each of the fields", i.e. events are grouped into a transaction based on &lt;CODE&gt;threadId&lt;/CODE&gt;==&lt;CODE&gt;host&lt;/CODE&gt;?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 15:27:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485656#M135879</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T15:27:41Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485657#M135880</link>
      <description>&lt;P&gt;No, "the same value for each of the fields" should be interpreted as (threadId is the same in event A and event B) AND (host is the same in event A and event B) --&amp;gt; merge event A and event B into the smae transaction.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 16:05:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485657#M135880</guid>
      <dc:creator>rmmiller</dc:creator>
      <dc:date>2020-01-21T16:05:00Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485658#M135881</link>
      <description>&lt;P&gt;Thank you!&lt;/P&gt;

&lt;P&gt;Possible to modify your search to include additional fields in the table? ("to", "from", "relay", "client", etc.)&lt;/P&gt;

&lt;P&gt;Also, to do something similar to &lt;CODE&gt;maxpause&lt;/CODE&gt; and &lt;CODE&gt;maxspan&lt;/CODE&gt; clauses in &lt;CODE&gt;transaction&lt;/CODE&gt; in your search, i.e. if a group of events with the same &lt;CODE&gt;threadId&lt;/CODE&gt; spans more than 1h, or there's a pause longer than 10min, break them up into separate groups (logical events) for further analysis?&lt;/P&gt;

&lt;P&gt;Overall - &lt;STRONG&gt;thank you!&lt;/STRONG&gt; What you did is awesome and hopefully one day I'll wrap my mind around this and be able to craft similar searches. Regretfully I am not there yet. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;(Back to &lt;CODE&gt;transaction&lt;/CODE&gt; for a moment: would you know if it's possible to force an "AND" relationship between fields in &lt;CODE&gt;transaction&lt;/CODE&gt;, to prevent the behavior I described in my OP?)&lt;/P&gt;

&lt;P&gt;Regarding the cost of &lt;CODE&gt;transaction&lt;/CODE&gt;: I think we can manage it, at least with respect to Postfix logs.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;Running transaction with maxspan=24h is going to bring your search head to its knees for any appreciable volume of search results.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Running my original search with &lt;CODE&gt;transaction&lt;/CODE&gt; over 30 days:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;This search has completed and has returned 36,025 results by scanning 394,663 events in 12.567 seconds&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;This works for me and it's unlikely I'll do it over 30 days often. A similar search over 24h: "968 results by scanning 12,090 events in 0.494 seconds". Past year: "6,087 results by scanning 2,957,109 events in 245.953 seconds".&lt;/P&gt;

&lt;P&gt;Postfix is the simplest of the transactional logs we have; some others (Aspera, tomcat) may run for hours using the same thread or session ID, and contain dozens of fields we need to run stats on. The key is to group events with the same session ID into a single transaction (per host) and then treat it as a single event for further analysis - even if it's expensive. I'm far from sure I'll be able to use &lt;CODE&gt;stats&lt;/CODE&gt; for the purpose.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 16:26:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485658#M135881</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T16:26:10Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485659#M135882</link>
      <description>&lt;P&gt;Is it possible that my Splunk instance has a bug or something else is wrong with it? Else I don't understand why I'd get no results on this - when it's clear there should be:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="postfix_syslog" host="*-smtp*" 
 | rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:" 
 | transaction "threadId host" maxspan=24h 
 | search threadId=0EA1B1027961
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 Jan 2020 16:30:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485659#M135882</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T16:30:17Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485660#M135883</link>
      <description>&lt;P&gt;I don't think you should have quotes around those 2 fields unless it was actually a single field named "threadId host".  In this particular case, you have 2 distinct fields, i.e., one named "threadId" and another named "host".  Neither one really needs quotes around it in order to function with transaction.&lt;/P&gt;

&lt;P&gt;When using multiple fields with transaction, you'll get your expected grouping if each event includes both fields you specify.  If you have some events that are missing "threadId", they will probably not end up in the same transaction as those with the same values in both the "threadId" and "host" fields.&lt;/P&gt;
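
&lt;P&gt;If events without a "threadId" do end up in your transactions anyway, one thing to try is dropping them before &lt;CODE&gt;transaction&lt;/CODE&gt; ever sees them. A minimal sketch, assuming the same &lt;CODE&gt;rex&lt;/CODE&gt; extraction as in the original search and that the connect/disconnect events are not needed in the grouping (untested against this data):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="postfix_syslog" host="*-smtp*"
| rex field=_raw "\:\s(?&amp;lt;threadId&amp;gt;[0-9A-F]{6,24})\:"
| where isnotnull(threadId)
| transaction threadId host maxspan=24h
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With that filter in place, every event reaching &lt;CODE&gt;transaction&lt;/CODE&gt; carries both fields, so there should be no host-only events left to attach to a group.&lt;/P&gt;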

&lt;P&gt;Transaction is also really picky about the order of events.  Events must be in descending time order, otherwise it can get confused and include some events that shouldn't be part of a transaction.&lt;/P&gt;

&lt;P&gt;See &lt;A href="https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Transaction#Usage"&gt;https://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Transaction#Usage&lt;/A&gt; for further explanations of both.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 17:46:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485660#M135883</guid>
      <dc:creator>rmmiller</dc:creator>
      <dc:date>2020-01-21T17:46:08Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485661#M135884</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;Possible to modify your search to include additional fields in the table? ("to", "from", "relay", "client", etc.)&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Yes, just add additional items to the by clause of stats like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats count,values(status) as Status,earliest(_time) as Beginning,latest(_time) as Ending by host,threadId,relay,client
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;See my latest comment under your OP regarding what you're seeing with transaction.&lt;/P&gt;

&lt;P&gt;One thing to keep in mind, even if transaction's performance is acceptable for you: there are memory limits (in limits.conf) that could affect the accuracy of your searches using transaction.  Stats lets you work around this limitation and is faster.&lt;/P&gt;

&lt;P&gt;There's a fantastic presentation from one of the previous Splunk conferences that explains the benefits of using stats.&lt;BR /&gt;
&lt;A href="https://conf.splunk.com/files/2016/slides/let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf" target="_blank"&gt;https://conf.splunk.com/files/2016/slides/let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Hope that helps!&lt;BR /&gt;
rmmiller&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:47:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485661#M135884</guid>
      <dc:creator>rmmiller</dc:creator>
      <dc:date>2020-09-30T03:47:02Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485662#M135885</link>
      <description>&lt;P&gt;Thank you. Perhaps &lt;CODE&gt;transaction&lt;/CODE&gt; gets confused by the presence of several events with the same exact time stamp and takes it out on me by including unrelated events? &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;This though may be the answer to my OP:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;You might see the following events grouped into a transaction:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; event=1 host=a
 event=2 host=a cookie=b
 event=3 cookie=b
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;This effectively means there's no way to force the fields in &lt;CODE&gt;transaction&lt;/CODE&gt; to behave in an &lt;CODE&gt;AND&lt;/CODE&gt; fashion - which in turn means we can't use it for our purposes and it'll be bound to include unrelated events.&lt;/P&gt;

&lt;P&gt;Unless that mysterious "quoted list of fields" Splunk refers to in their documentation (but gives no example of) is still a possibility?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2020 18:02:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485662#M135885</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T18:02:07Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485663#M135886</link>
      <description>&lt;P&gt;That produces no results... I'll dig into it...&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;Yes, just add additional items to the by clause of stats like this:&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats count,values(status) as Status,earliest(_time) as Beginning,latest(_time) as Ending by host,threadId,relay,client
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 Jan 2020 18:12:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485663#M135886</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-21T18:12:16Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485664#M135887</link>
      <description>&lt;P&gt;Hi @mitag &lt;BR /&gt;
My apologies.  I was in a rush.  You are looking for the values of "relay" and "client" rather than splitting them out into different groups for stats, which is what my erroneous command would give you.&lt;/P&gt;

&lt;P&gt;I was actually thinking this instead:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats count, values(status) AS Status, values(relay) AS Relay, values(client) AS client, earliest(_time) AS Beginning, latest(_time) AS Ending by host,threadId
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 22 Jan 2020 14:38:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485664#M135887</guid>
      <dc:creator>rmmiller</dc:creator>
      <dc:date>2020-01-22T14:38:07Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction includes non-relevant events?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485665#M135888</link>
      <description>&lt;P&gt;From Splunk doc team:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;Turns out that the text refers to an old way that the “fields” argument worked. You do not specify fields in a quoted list.  You simply list the fields immediately after the “transaction” command.  Several of the Extended Examples show this.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Wed, 22 Jan 2020 23:47:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-includes-non-relevant-events/m-p/485665#M135888</guid>
      <dc:creator>mitag</dc:creator>
      <dc:date>2020-01-22T23:47:37Z</dc:date>
    </item>
  </channel>
</rss>

