<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Transaction command over a large dataset in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502333#M139828</link>
    <description>&lt;P&gt;Thanks for your feedback, but I need 4 multivalue fields and I don't see how stats could do that. Ideally, I would also like to keep all the single-value fields for each unique id_number, which are 20 more fields&lt;/P&gt;</description>
    <pubDate>Sat, 23 May 2020 15:53:48 GMT</pubDate>
    <dc:creator>brabagaza</dc:creator>
    <dc:date>2020-05-23T15:53:48Z</dc:date>
    <item>
      <title>Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502328#M139823</link>
      <description>&lt;P&gt;Hi all, &lt;/P&gt;

&lt;P&gt;Hoping someone can give some pointers how to solve this problem:&lt;/P&gt;

&lt;P&gt;I run a transaction command over the last two weeks, which gives about 20,000 events, and for about 85 percent of them the transaction command combines the events perfectly. &lt;BR /&gt;
However, for the remaining 15% there are still duplicates, meaning that the transaction command has not combined them properly. &lt;/P&gt;

&lt;P&gt;I think this is due to memory limits in limits.conf, and these could be increased, but it seems there should be smarter options. &lt;BR /&gt;
For example, appending new events to an existing lookup with a transaction command, if that is possible. &lt;BR /&gt;
Or perhaps there is a better way of combining the information without using transaction at all.&lt;/P&gt;
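&lt;P&gt;Roughly what I have in mind for the lookup idea (a sketch only - &lt;EM&gt;id_number&lt;/EM&gt; as the transaction key, &lt;EM&gt;mytransactions.csv&lt;/EM&gt; as a hypothetical lookup name, and I'm not sure the lookup round-trip behaves well with multivalue fields):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index= sourcetype= earliest=-1d@d
| inputlookup append=true mytransactions.csv
| transaction id_number keeporphans=True keepevicted=True
| outputlookup mytransactions.csv
&lt;/CODE&gt;&lt;/PRE&gt;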

&lt;P&gt;The downside of this dataset is that transactions can span the entire two weeks, which means I cannot filter on maxspan; filtering on maxevents doesn't improve performance either, since the transactions vary a lot. &lt;/P&gt;

&lt;P&gt;Cheers,&lt;BR /&gt;
Roelof&lt;/P&gt;

&lt;P&gt;the minimal search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index= sourcetype= earliest=@d-14d
| fields ...
| transaction  keeporphans=True keepevicted=True
| outputlookup .csv
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is the full minimal search ^&lt;/P&gt;

&lt;P&gt;Two example snippets from the correct dataset would be:&lt;BR /&gt;
   &lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8934iFC1306DD6F4AD7BF/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;   &lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8935i1FA85BCBCA5BEA30/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;BR /&gt;
(the id number is deleted, but it is just an integer on which the transaction is performed)&lt;BR /&gt;
SYSMODTIME is a multivalue field, and there are a couple more mv fields in the complete dataset&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 16:39:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502328#M139823</guid>
      <dc:creator>brabagaza</dc:creator>
      <dc:date>2020-05-22T16:39:00Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502329#M139824</link>
      <description>&lt;P&gt;Your minimal search unfortunately lost a lot of code when posted.&lt;/P&gt;

&lt;P&gt;In general, &lt;CODE&gt;transaction&lt;/CODE&gt; is not the best tool for most of the jobs it is used for.  If you explained the use case, what the underlying data looks like, and what you are getting as your result from the &lt;CODE&gt;transaction&lt;/CODE&gt;, we might be able to give you a code model that works 100% of the time and uses less machine time as well.&lt;/P&gt;

&lt;P&gt;The &lt;CODE&gt;splunk soup&lt;/CODE&gt; model is the way to aim for here, but without even pseudocode for your search, we can't narrow it down much.  &lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 18:08:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502329#M139824</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2020-05-22T18:08:10Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502330#M139825</link>
      <description>&lt;P&gt;I've edited the initial question to include some sample data, hope this provides enough information&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 21:25:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502330#M139825</guid>
      <dc:creator>brabagaza</dc:creator>
      <dc:date>2020-05-22T21:25:01Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502331#M139826</link>
      <description>&lt;P&gt;Hi @DalJeanis&lt;BR /&gt;
what's the &lt;CODE&gt;splunk soup&lt;/CODE&gt; model?&lt;BR /&gt;
I searched Google, but I can't find anything related.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://answers.splunk.com/answers/561130/how-to-join-two-tables-where-the-key-is-named-diff.html"&gt;https://answers.splunk.com/answers/561130/how-to-join-two-tables-where-the-key-is-named-diff.html&lt;/A&gt;&lt;BR /&gt;
Is it this?&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 22:21:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502331#M139826</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-22T22:21:15Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502332#M139827</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;index= sourcetype= earliest=@d-14d
| stats count as eventcount values(SYSMODTIME) as SYSMODTIME by ID_NR
| outputlookup .csv
&lt;/CODE&gt;&lt;/PRE&gt;
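&lt;P&gt;(An illustration of the difference, using &lt;CODE&gt;makeresults&lt;/CODE&gt; to fake three events: &lt;CODE&gt;values()&lt;/CODE&gt; returns the unique values in lexicographical order, while &lt;CODE&gt;list()&lt;/CODE&gt; returns every value in event order.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=3
| streamstats count
| eval ID_NR=1, SYSMODTIME=case(count=1, "2020-05-03", count=2, "2020-05-01", count=3, "2020-05-01")
| stats values(SYSMODTIME) as uniq_sorted list(SYSMODTIME) as in_event_order by ID_NR
&lt;/CODE&gt;&lt;/PRE&gt;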

&lt;P&gt;If your &lt;EM&gt;SYSMODTIME&lt;/EM&gt; values sort correctly as strings, you can use &lt;CODE&gt;values()&lt;/CODE&gt; instead of &lt;CODE&gt;list()&lt;/CODE&gt;. &lt;BR /&gt;
You still have to worry about limits.conf.&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 22:25:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502332#M139827</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-22T22:25:24Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502333#M139828</link>
      <description>&lt;P&gt;Thanks for your feedback, but I need 4 multivalue fields and I don't see how stats could do that. Ideally, I would also like to keep all the single-value fields for each unique id_number, which are 20 more fields&lt;/P&gt;</description>
      <pubDate>Sat, 23 May 2020 15:53:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502333#M139828</guid>
      <dc:creator>brabagaza</dc:creator>
      <dc:date>2020-05-23T15:53:48Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502334#M139829</link>
      <description>&lt;P&gt;&lt;CODE&gt;stats&lt;/CODE&gt; can aggregate many fields. &lt;BR /&gt;
Sort the events first, prepend a &lt;EM&gt;sorter&lt;/EM&gt; prefix to each value with &lt;CODE&gt;eval&lt;/CODE&gt;, and then use &lt;CODE&gt;stats values()&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 23 May 2020 20:15:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502334#M139829</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-05-23T20:15:20Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction command over a large dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502335#M139830</link>
      <description>&lt;P&gt;This seems to work well, to4kawa, thanks. An example of what the search looks like now:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats count values(SYSMODTIME) as SYSMODTIME values(mvfield2) as mvfield2 values(mv3) as etc values(singlevaluefield) by id_number
| eval test=mvindex(SYSMODTIME,1)
&lt;/CODE&gt;&lt;/PRE&gt;
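&lt;P&gt;(A sketch of the &lt;EM&gt;sorter&lt;/EM&gt; trick for fields whose values don't sort chronologically on their own - &lt;EM&gt;mvfield2&lt;/EM&gt; here is just an example field, and &lt;CODE&gt;mvmap&lt;/CODE&gt; needs Splunk 8.0 or later: prefix each value with a zero-padded counter so &lt;CODE&gt;values()&lt;/CODE&gt; keeps event order, then strip the prefix again.)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| sort 0 _time
| streamstats count as sorter by id_number
| eval mvfield2=printf("%06d|%s", sorter, mvfield2)
| stats values(mvfield2) as mvfield2 by id_number
| eval mvfield2=mvmap(mvfield2, mvindex(split(mvfield2, "|"), 1))
&lt;/CODE&gt;&lt;/PRE&gt;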

&lt;P&gt;The stats search produces a table with one row per unique id_number, containing both the multivalue fields and the single-value fields&lt;/P&gt;</description>
      <pubDate>Sat, 23 May 2020 20:42:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-command-over-a-large-dataset/m-p/502335#M139830</guid>
      <dc:creator>brabagaza</dc:creator>
      <dc:date>2020-05-23T20:42:47Z</dc:date>
    </item>
  </channel>
</rss>

