<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with self join in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696236#M236637</link>
    <description>&lt;P&gt;Where does 2-a2 come from?&lt;/P&gt;&lt;P&gt;Assuming values in fieldB are unique, you could try something like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults format=csv data="ID,fieldA,fieldB
1,1-a1,
1,1-a2,
1,,1-b1
2,2-a1,
2,,2-b1
2,,2-b2
3,,1-b1"
``` The lines above emulate the data you shared ```
| stats latest(fieldA) as fieldA list(fieldB) as fieldB by ID
| mvexpand fieldB&lt;/LI-CODE&gt;</description>
    <pubDate>Wed, 14 Aug 2024 07:55:05 GMT</pubDate>
    <dc:creator>ITWhisperer</dc:creator>
    <dc:date>2024-08-14T07:55:05Z</dc:date>
    <item>
      <title>Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696207#M236621</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I have a single search that produces the following table where fieldA and fieldB are arbitrary strings that may be duplicate.&lt;BR /&gt;&lt;BR /&gt;This is an exact representation of each event where each event may have a key "fieldA" or a key "fieldB" but not both but they always have an ID and Timestamp&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;TABLE border="1" width="100.00000000000001%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;Timestamp&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;ID&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;fieldA&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;fieldB&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;11115&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"z"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;11245&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"a"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;11378&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"b"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;11768&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"d"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;11879&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;"d"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;12550&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;2&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"c"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;13580&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;2&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"e"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;15703&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;2&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"f"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;18690&lt;/TD&gt;&lt;TD width="16.666666666666668%" height="25px"&gt;3&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"g"&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;BR /&gt;and I need help to transform the data as follows.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;TABLE border="1" width="101.93050193050193%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;ID&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;fieldA&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;fieldB&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;1&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"b"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"d"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;"b"&lt;/TD&gt;&lt;TD&gt;"d"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;2&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"c"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"e"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;2&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"c"&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="25px"&gt;"f"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="33.333333333333336%" height="74px"&gt;3&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="74px"&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD width="33.333333333333336%" height="74px"&gt;"g"&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;BR /&gt;Thanks to suggestion below, I have tried `stats latest(fieldA) list(fieldB)` but I would prefer to not have any multivalued fields&lt;BR /&gt;&lt;BR /&gt;For every distinct value for "fieldA", the latest record with that value would be kept and any records with that ID occuring before that record would be discard.&lt;BR /&gt;&lt;BR /&gt;There is no requirement to have 2 searches. Hope that makes it more clear and easier.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 Aug 2024 10:35:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696207#M236621</guid>
      <dc:creator>tly22</dc:creator>
      <dc:date>2024-08-14T10:35:00Z</dc:date>
    </item>
    <item>
      <title>Re: Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696208#M236622</link>
      <description>&lt;P&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Aug 2024 09:49:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696208#M236622</guid>
      <dc:creator>tly22</dc:creator>
      <dc:date>2024-08-14T09:49:03Z</dc:date>
    </item>
    <item>
      <title>Re: Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696214#M236625</link>
      <description>&lt;P&gt;Just as you say, Splunk is not SQL. &amp;nbsp;So, forget join. &amp;nbsp;Please let us know what is the nature of the two searches, how close are they? &amp;nbsp;What are their search periods? &amp;nbsp;Most of the time, you shouldn't run two separate searches, but instead, combine the two into one search, then try to get the result you need from that one search.&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;Criteria being if there are duplicate values in fieldA, only the row with the latest value is kept&lt;BR /&gt;and each row with fieldB joined to fieldA on same ID. or if there are no values for fieldA, just join with null/blank value&lt;BR /&gt;&lt;BR /&gt;Ideally, we can also throw away all rows with col fieldB that have a timestamp earlier than fieldA but not a hard requirement if that adds too much complexity to the query&lt;BR /&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;Here, you talk about "latest" and "earlier". &amp;nbsp;But your mock data illustration contains no time information. &amp;nbsp;How are volunteers supposed to help?&lt;/P&gt;&lt;P&gt;Now, if you MUST run the two searches separately, yes, there are ways to produce right join output in SPL without using join command which most Splunkers advise against. &amp;nbsp;But let's start at the ABCs of asking answerable questions in a data analytics forum. (That's right, this is not a SQL forum.) &amp;nbsp;Here are four golden rules that I call Four Commandments:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Illustrate data input (in raw text, anonymize as needed), whether they are raw events or output from a search that volunteers here do not have to look at.&lt;/LI&gt;&lt;LI&gt;Illustrate the desired output from illustrated data.&lt;/LI&gt;&lt;LI&gt;Explain the logic between illustrated data and desired output&amp;nbsp;&lt;EM&gt;without&lt;/EM&gt;&amp;nbsp;SPL.&lt;/LI&gt;&lt;LI&gt;If you also illustrate attempted SPL, illustrate actual output and compare with desired output, explain why they look different&amp;nbsp;&lt;U&gt;to you&lt;/U&gt;&amp;nbsp;if that is not painfully obvious.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Start from here.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Aug 2024 03:50:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696214#M236625</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2024-08-14T03:50:01Z</dc:date>
    </item>
    <item>
      <title>Re: Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696236#M236637</link>
      <description>&lt;P&gt;Where does 2-a2 come from?&lt;/P&gt;&lt;P&gt;Assuming values in fieldB are unique, you could try something like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults format=csv data="ID,fieldA,fieldB
1,1-a1,
1,1-a2,
1,,1-b1
2,2-a1,
2,,2-b1
2,,2-b2
3,,1-b1"
``` The lines above emulate the data you shared ```
| stats latest(fieldA) as fieldA list(fieldB) as fieldB by ID
| mvexpand fieldB&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 14 Aug 2024 07:55:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696236#M236637</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2024-08-14T07:55:05Z</dc:date>
    </item>
    <item>
      <title>Re: Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696253#M236640</link>
      <description>&lt;P&gt;Thank you for the feedback, I have updated the post&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Aug 2024 09:30:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696253#M236640</guid>
      <dc:creator>tly22</dc:creator>
      <dc:date>2024-08-14T09:30:31Z</dc:date>
    </item>
    <item>
      <title>Re: Help with self join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696256#M236641</link>
      <description>&lt;P&gt;Taking on your changes, please explain your logic for excluding "z" in your expected results. Also, your example does not have any duplicates so it is unclear, from the expected results, how you want duplicates treated. Having an accurate representation of your data might help clarify this.&lt;/P&gt;&lt;P&gt;Assuming "z" was supposed to be in the results, then my previous solution still works - the mvexpand expands the multivalue field created by list()&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults format=csv data="Timestamp,ID,fieldA,fieldB
11115,1,,z
11245,1,a,
11378,1,b,
11768,1,,d
12550,2,c,
13580,2,,e
15703,2,,f
18690,3,,g"
| stats latest(fieldA) as fieldA list(fieldB) as fieldB by ID
| mvexpand fieldB&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Aug 2024 10:09:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-self-join/m-p/696256#M236641</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2024-08-14T10:09:33Z</dc:date>
    </item>
  </channel>
</rss>

