<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Is appendcols or join better for search performance? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217206#M63785</link>
    <description>&lt;P&gt;I would go on other route to completely avoid the subsearches (in turn join/appendcols). Give this a try&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; index=abc data=billing scoring_type=decisioned latest=now earliest=-30d@d 
| eval period=if(_time&amp;lt;relative_time(now(),"@d"),"Last 30 Day","Today") | top field1, field2 by period field3 
| eval "Last 30 Day Total"=if(period="Last 30 day",count,null()) | eval "Last 30 Day Percent"=if(period="Last 30 day",percent,null())
| eval "Today Total"=if(period="Today",count,null()) | eval "Today Percent"=if(period="Today",percent,null())
| fields - period count percent | stats values(*) as * by field1 field2 field3 
| table field1 field2 field3 "Last 30 Day Total" "Last 30 Days %"  "Today Total" "Today %"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Update#1&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;There were many typos than I expected in above query. Try this another variation, which works and performs better (tested with _internal data) in my machine&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=abc data=billing scoring_type=decisioned latest=now earliest=-30d@d
 | eval period=if(_time&amp;lt;relative_time(now(),"@d"),"Last 30 Day","Today") 
| stats count by period field3 field1 field2  | eventstats sum(count) as Total by period, field3 | eval percent=count*100/Total | sort period -count | dedup 10 period | eval "Last 30 Day Total"=if(period="Last 30 Day",count,null()) | eval "Last 30 Day %"=if(period="Last 30 Day",percent,null())
 | eval "Today Total"=if(period="Today",count,null()) | eval "Today %"=if(period="Today",percent,null())
 | fields - period count percent Total| stats values(*) as * by field3 field1 field2 | table field3 field1 field2 "Last 30 Day Total" "Last 30 Days %"  "Today Total" "Today %"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 09 Aug 2016 18:44:20 GMT</pubDate>
    <dc:creator>somesoni2</dc:creator>
    <dc:date>2016-08-09T18:44:20Z</dc:date>
    <item>
      <title>Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217203#M63782</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;

&lt;P&gt;I am trying to decide which Splunk command I should use to give better long-term performance on the search and the search head and am looking for advice. The functions are &lt;CODE&gt;join type=left&lt;/CODE&gt; OR &lt;CODE&gt;appendcols override=true&lt;/CODE&gt;. The goal is to see information that may or may not be in both searches (saw page hits in the last 30 days but no hits to that page today)&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Example result&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;field1  field 2  field3  Last 30 Day Total   Last 30 Days %   Today %   Today Total
website1  URL1   Page1   40618             8.605436      7.814981   1083
website1  URL2   Page2   6                 0.001271     
website2  URL1   Page1   95650             12.050409        
website2  URL2   Page2   43862             5.525928      8.103728   50
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;I have a search with the left join&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=abc data=billing scoring_type=decisioned latest=@d earliest=-30d@d| top field1, field2 by field3 countfield="Last 30 Day Total" percentfield="Last 30 Days %" | join type=left [search index=abc data=billing latest=now earliest=@d| topfield1, field2 by field3  countfield="Today Total" percentfield="Today %"]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;And then the same search using &lt;CODE&gt;appendcols override=true&lt;/CODE&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=abc latest=@d earliest=-30d@d| top field1, field2 by field3 countfield="Last 30 Day Total" percentfield="Last 30 Days %" |appendcols override=true [search index=abc  latest=now earliest=@d| topfield1, field2 by field3  countfield="Today Total" percentfield="Today %"]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Both searches give me the results I am looking for, but they are both long running and I want to know performance-wise which is the better method. &lt;/P&gt;

&lt;P&gt;Thanks for the responses&lt;BR /&gt;
msmapper&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 17:36:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217203#M63782</guid>
      <dc:creator>msmapper</dc:creator>
      <dc:date>2016-08-09T17:36:07Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217204#M63783</link>
      <description>&lt;P&gt;One thing you need to be aware of about join is that it has some significant limits OOB.  You can alter them in limits.conf if needed:&lt;/P&gt;

&lt;P&gt;[join]&lt;BR /&gt;
subsearch_maxout = integer&lt;BR /&gt;
* Maximum result rows in output from subsearch to join against.&lt;BR /&gt;
* Defaults to 50000&lt;/P&gt;

&lt;P&gt;subsearch_maxtime = integer&lt;BR /&gt;
* Maximum search time (in seconds) before auto-finalization of subsearch.&lt;BR /&gt;
* Defaults to 60&lt;/P&gt;

&lt;P&gt;subsearch_timeout = integer&lt;BR /&gt;
* Maximum time to wait for subsearch to fully finish (in seconds).&lt;BR /&gt;
* Defaults to 120&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 17:40:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217204#M63783</guid>
      <dc:creator>sjohnson_splunk</dc:creator>
      <dc:date>2016-08-09T17:40:32Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217205#M63784</link>
      <description>&lt;P&gt;Hi sjohnson,&lt;/P&gt;

&lt;P&gt;Great reminder of the timeout and maxtime for subsearches. Do you know if it goes, 2 min for search time out and then 1 minute for maxtime, for a total of 3 min? &lt;/P&gt;

&lt;P&gt;Right now, the query takes about 30-45 secs to fully display but I am concerned about how it will perform once more pages go live.  I want to make sure I start off with the optimal query as well. I do also plan on turning on acceleration for the query once it is finalized to help performance and results.&lt;/P&gt;

&lt;P&gt;Regards&lt;BR /&gt;
msmapper&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 17:54:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217205#M63784</guid>
      <dc:creator>msmapper</dc:creator>
      <dc:date>2016-08-09T17:54:06Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217206#M63785</link>
      <description>&lt;P&gt;I would go on other route to completely avoid the subsearches (in turn join/appendcols). Give this a try&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; index=abc data=billing scoring_type=decisioned latest=now earliest=-30d@d 
| eval period=if(_time&amp;lt;relative_time(now(),"@d"),"Last 30 Day","Today") | top field1, field2 by period field3 
| eval "Last 30 Day Total"=if(period="Last 30 day",count,null()) | eval "Last 30 Day Percent"=if(period="Last 30 day",percent,null())
| eval "Today Total"=if(period="Today",count,null()) | eval "Today Percent"=if(period="Today",percent,null())
| fields - period count percent | stats values(*) as * by field1 field2 field3 
| table field1 field2 field3 "Last 30 Day Total" "Last 30 Days %"  "Today Total" "Today %"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Update#1&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;There were many typos than I expected in above query. Try this another variation, which works and performs better (tested with _internal data) in my machine&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=abc data=billing scoring_type=decisioned latest=now earliest=-30d@d
 | eval period=if(_time&amp;lt;relative_time(now(),"@d"),"Last 30 Day","Today") 
| stats count by period field3 field1 field2  | eventstats sum(count) as Total by period, field3 | eval percent=count*100/Total | sort period -count | dedup 10 period | eval "Last 30 Day Total"=if(period="Last 30 Day",count,null()) | eval "Last 30 Day %"=if(period="Last 30 Day",percent,null())
 | eval "Today Total"=if(period="Today",count,null()) | eval "Today %"=if(period="Today",percent,null())
 | fields - period count percent Total| stats values(*) as * by field3 field1 field2 | table field3 field1 field2 "Last 30 Day Total" "Last 30 Days %"  "Today Total" "Today %"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 09 Aug 2016 18:44:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217206#M63785</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2016-08-09T18:44:20Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217207#M63786</link>
      <description>&lt;P&gt;Hi somesoni2,&lt;/P&gt;

&lt;P&gt;While the query works really well, it doesn't put the results from Last 30 Days and Today on the same row. I get 2 rows per Field, one for Last 30 Day and one for Today stats. This doesn't seem to give any performance boost over the subsearch or the join or appendcols, in fact it actually seems a bit slower to run. The subsearch runs in less than 1 min and your query takes 2 min, this is over 1.8 million rows.&lt;/P&gt;

&lt;P&gt;Am I doing something wrong with your query?&lt;/P&gt;

&lt;P&gt;Also in your example as is the eval statements for the Total and Percent fields aren't populating. There are a few typo's that I want to correct for others to use later&lt;/P&gt;

&lt;P&gt;| eval "Last 30 Day Total"=if(period=="Last 30 Day",count,null())| eval "Last 30 Days %"=if(period="Last 30 Day",percent,null())| eval "Today Total"=if(period="Today",count,null()) | eval "Today %"=if(period="Today",percent,null())|stats values(*) as * by field1, field2, field3 ) |table field1, field2, field3  "Last 30 Day Total" "Last 30 Days %"  "Today Total" "Today %"&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 20:30:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217207#M63786</guid>
      <dc:creator>msmapper</dc:creator>
      <dc:date>2016-08-09T20:30:42Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217208#M63787</link>
      <description>&lt;P&gt;Thanks for pointing out the typos. Try the updated answer.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 21:35:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217208#M63787</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2016-08-09T21:35:34Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217209#M63788</link>
      <description>&lt;P&gt;Hi somesoni2,&lt;/P&gt;

&lt;P&gt;The query performance is a bit better but I am still experiencing the same issue with the results not being on the same line together. Getting the data from each timeframe to show on one line with the best performance is my goal. Right now 30 Day and Today numbers are completely separate rows in the table. I have only every seen join and appendcols put data like that in the same row. &lt;/P&gt;

&lt;P&gt;Regards&lt;BR /&gt;
msmapper&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 22:57:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217209#M63788</guid>
      <dc:creator>msmapper</dc:creator>
      <dc:date>2016-08-09T22:57:16Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217210#M63789</link>
      <description>&lt;P&gt;Why using dedup over stats?&lt;BR /&gt;
stats list( eval (period&amp;lt;11)) by period field3 field1 field2 &lt;/P&gt;

&lt;P&gt;| dedup 10 period | eval "Last 30 Day Total"=if(period="Last 30 Day",count,null()) | eval "Last 30 Day %"=if(period="Last 30 Day",percent,null())&lt;BR /&gt;
  | eval "Today Total"=if(period="Today",count,null()) | eval "Today %"=if(period="Today",percent,null())&lt;/P&gt;</description>
      <pubDate>Sun, 28 Oct 2018 15:40:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217210#M63789</guid>
      <dc:creator>valiquet</dc:creator>
      <dc:date>2018-10-28T15:40:35Z</dc:date>
    </item>
    <item>
      <title>Re: Is appendcols or join better for search performance?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217211#M63790</link>
      <description>&lt;P&gt;Hi msmapper,&lt;/P&gt;

&lt;P&gt;Like @somesoni2 already said, avoid all types of sub searches - because soon then later you will hit limits/problems without even knowing. &lt;/P&gt;

&lt;P&gt;Regarding your use case: &lt;/P&gt;

&lt;P&gt;I find &lt;CODE&gt;multisearch&lt;/CODE&gt; being really useful to compare different time ranges, see an example here &lt;A href="https://answers.splunk.com/answers/663294/timewrap-compare-last-24-hours-to-the-same-day-ove.html"&gt;https://answers.splunk.com/answers/663294/timewrap-compare-last-24-hours-to-the-same-day-ove.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Hope this helps ...&lt;/P&gt;

&lt;P&gt;cheers, MuS&lt;/P&gt;</description>
      <pubDate>Sun, 28 Oct 2018 19:14:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Is-appendcols-or-join-better-for-search-performance/m-p/217211#M63790</guid>
      <dc:creator>MuS</dc:creator>
      <dc:date>2018-10-28T19:14:20Z</dc:date>
    </item>
  </channel>
</rss>

