<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to optimize Splunk query by avoiding join or anything else? in Splunk Enterprise</title>
    <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618522#M14327</link>
    <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/190794"&gt;@johnhuang&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;appendpipe - nice way to do the data copy - better than using mv and mvexpand. You can also do renames, instead of the evals in my solution as you have the subsearch.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 27 Oct 2022 04:43:13 GMT</pubDate>
    <dc:creator>bowesmana</dc:creator>
    <dc:date>2022-10-27T04:43:13Z</dc:date>
    <item>
      <title>How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618337#M14315</link>
      <description>&lt;P&gt;Hi Everyone,&lt;/P&gt;
&lt;P&gt;I have below query by which i am extracting manager name,email etc. by applying join on managerno to all employee records since manager will also be a part of it.This query is taking 120 sec to execute but i need to minimize and keep it under 20 seconds.Any help will be appreciated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;index="myid_transac_idx" sourcetype="myID_Identity" earliest=-1d@d latest=now()
|fields employeeno display_name loginid email status managerno Termination_process_date
|where status="Terminated" and Termination_process_date &amp;gt; "2022-10-01 00:00:00.00" |join type=LEFT managerno [ search index="myid_transac_idx" sourcetype="myID_Identity" earliest=-1d@d latest=now()
|fields employeeno display_name loginid email status |rename employeeno as managerno
|rename display_name as manager_name |rename loginid as managerloginid
|rename email as manageremail
|rename status as managerstatus]|fields employeeno display_name loginid email status managerno manager_name managerloginid manageremail managerstatus
| table employeeno display_name loginid email status managerno manager_name managerloginid manageremail managerstatus&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Oct 2022 14:06:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618337#M14315</guid>
      <dc:creator>Splunk4</dc:creator>
      <dc:date>2022-10-26T14:06:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618506#M14323</link>
      <description>&lt;P&gt;Review this conf talk as a starting place. &amp;nbsp;&lt;A href="https://conf.splunk.com/files/2019/slides/FNC2751.pdf" target="_blank"&gt;https://conf.splunk.com/files/2019/slides/FNC2751.pdf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Oct 2022 23:43:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618506#M14323</guid>
      <dc:creator>starcher</dc:creator>
      <dc:date>2022-10-26T23:43:40Z</dc:date>
    </item>
    <item>
      <title>Re: How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618511#M14324</link>
      <description>&lt;P&gt;Although a very useful doc, joining on self where the data set is the same is not something covered in that doc and stats will not really solve the problem in a simple way.&lt;/P&gt;&lt;P&gt;The simplest way is to create a lookup of managers with their managerno and then lookup on that rather than try to join on self.&lt;/P&gt;&lt;P&gt;It is technically possible to join on self by duplicating the data and then using stats/eventstats and some eval/if combinations.&lt;/P&gt;&lt;P&gt;Here's a psuedo example that creates 26 employees and assigns them to random managers and then populates the manager name to the terminated employee&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults
| fields - _time
``` Create 26 employees A-Z ```
| eval ee=split("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "")
| mvexpand ee
``` Create their manager - one of 25 - not self ```
| eval mg=split("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "")
| eval mgs=mvmap(mg, if(mg=ee, null(), mg))
| eval mg=mvindex(mgs, random() % 25)
| fields - mgs
``` Now create their name ```
| eval ee_name="User ".ee
``` Now duplicate the entire data set ```
| eval n=mvrange(1,3,1)
| mvexpand n
``` So now we have duplicated data and n=1 are the employees and we will make n=2 potential managers ```
``` So, now let's make some of the n=1 employees terminated ```
| eval status=case(n=1 AND random() % 100 &amp;lt; 20, "Terminated", n=1, "Employed", n=2, null())
``` and filter out employeed employees - keep terminated and managers ```
| where n=2 OR status="Terminated"
``` Now fudge the manager data by setting manager name to employee name ```
| eval mgr_name=if(n=2, ee_name, null())
| eval ee_name=if(n=2, null(), ee_name)
``` and set the manager to self ```
| eval mg=if(n=2, ee, mg)
``` and this will not populate the manager to the terminated employee ```
| eventstats values(mgr_name) as mgr_name by mg
``` Now get rid of managers ```
| where n=1
| fields - n&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;You will need to work that somewhat to your data set and depending on the volume of your data, the mvexpand may not work if your data set is large.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2022 01:16:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618511#M14324</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2022-10-27T01:16:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618519#M14326</link>
      <description>&lt;P&gt;Like&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/6367"&gt;@bowesmana&lt;/a&gt;&amp;nbsp;said, the ideal solution is to generate a lookup. But if you want to do it without, this is how you can simplify:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;index="myid_transac_idx" sourcetype="myID_Identity" earliest=-1d@d
| fields employeeno display_name loginid email status managerno Termination_process_date
| dedup employeeno
| appendpipe [| rename employeeno AS managerno display_name AS manager_name loginid AS managerloginid email AS manageremail status AS managerstatus | fields - Term*]
| eventstats max(manager*) AS manager* BY managerno
| where LEN(display_name)&amp;gt;1 AND status="Terminated" AND Termination_process_date&amp;gt;"2022-10-01 00:00:00.00"
| table employeeno display_name loginid email status managerno manager_name managerloginid manageremail managerstatus&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2022 03:39:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618519#M14326</guid>
      <dc:creator>johnhuang</dc:creator>
      <dc:date>2022-10-27T03:39:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618522#M14327</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/190794"&gt;@johnhuang&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;appendpipe - nice way to do the data copy - better than using mv and mvexpand. You can also do renames, instead of the evals in my solution as you have the subsearch.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2022 04:43:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618522#M14327</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2022-10-27T04:43:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to optimize Splunk query by avoiding join or anything else?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618546#M14328</link>
      <description>&lt;P&gt;Thanks,,will try through lookup option since the data is not small&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2022 08:10:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/How-to-optimize-Splunk-query-by-avoiding-join-or-anything-else/m-p/618546#M14328</guid>
      <dc:creator>Splunk4</dc:creator>
      <dc:date>2022-10-27T08:10:37Z</dc:date>
    </item>
  </channel>
</rss>

