<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Lookup Tables - Dedup in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493165#M137540</link>
    <description>&lt;P&gt;Hello,&lt;BR /&gt;
I Googled and checked several answer posts, but perhaps I am not wording it correctly in the search engines.&lt;/P&gt;

&lt;P&gt;I have a lookup table and I want to remove duplicates from the table itself, not just when the table is being used.&lt;BR /&gt;
There are 3 fields: &lt;STRONG&gt;ACCT&lt;/STRONG&gt;, &lt;STRONG&gt;AUID&lt;/STRONG&gt;, &lt;STRONG&gt;ADDR&lt;/STRONG&gt;.&lt;BR /&gt;
It is quite possible that a user may log in from another PC, so I need to keep entries where the &lt;STRONG&gt;ACCT&lt;/STRONG&gt; and &lt;STRONG&gt;AUID&lt;/STRONG&gt; are the same but the &lt;STRONG&gt;ADDR&lt;/STRONG&gt; is different. I am using &lt;EM&gt;append=true&lt;/EM&gt; in my &lt;EM&gt;outputlookup&lt;/EM&gt; command to add new entries. The issue is that all entries are being added to the lookup, including those that duplicate the values of all 3 fields.&lt;BR /&gt;
Here is my SPL (which is running in a dashboard).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="linuxevents" AND host=rub.us AND source="/var/log/audit/audit.log" 
    AND acct="$userId_tok$"
| stats count by acct, auid, addr 
| fields acct, auid, addr 
| head limit=0 
| table acct, auid, addr
| rename acct AS ACCT, auid AS AUID, addr AS ADDR 
| table ACCT, AUID, ADDR 
| outputlookup myAAAlookup.csv append=true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I am aware that I can run this to remove duplicates at search time.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputlookup myAAAlookup.csv 
| dedup ACCT,AUID,ADDR
| outputlookup myAAAlookup.csv append=true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, I want to remove all duplicate entries from the lookup table itself. The table should contain only 5 rows at this stage of testing. Instead, there are over 300 duplicate rows, and the number grows each time the dashboard is run.&lt;/P&gt;

&lt;P&gt;Thanks and God bless,&lt;BR /&gt;
Genesius&lt;/P&gt;</description>
    <pubDate>Fri, 04 Oct 2019 21:00:11 GMT</pubDate>
    <dc:creator>genesiusj</dc:creator>
    <dc:date>2019-10-04T21:00:11Z</dc:date>
    <item>
      <title>Lookup Tables - Dedup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493165#M137540</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;
I Googled and checked several answer posts, but perhaps I am not wording it correctly in the search engines.&lt;/P&gt;

&lt;P&gt;I have a lookup table and I want to remove duplicates from the table itself, not just when the table is being used.&lt;BR /&gt;
There are 3 fields: &lt;STRONG&gt;ACCT&lt;/STRONG&gt;, &lt;STRONG&gt;AUID&lt;/STRONG&gt;, &lt;STRONG&gt;ADDR&lt;/STRONG&gt;.&lt;BR /&gt;
It is quite possible that a user may log in from another PC, so I need to keep entries where the &lt;STRONG&gt;ACCT&lt;/STRONG&gt; and &lt;STRONG&gt;AUID&lt;/STRONG&gt; are the same but the &lt;STRONG&gt;ADDR&lt;/STRONG&gt; is different. I am using &lt;EM&gt;append=true&lt;/EM&gt; in my &lt;EM&gt;outputlookup&lt;/EM&gt; command to add new entries. The issue is that all entries are being added to the lookup, including those that duplicate the values of all 3 fields.&lt;BR /&gt;
Here is my SPL (which is running in a dashboard).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="linuxevents" AND host=rub.us AND source="/var/log/audit/audit.log" 
    AND acct="$userId_tok$"
| stats count by acct, auid, addr 
| fields acct, auid, addr 
| head limit=0 
| table acct, auid, addr
| rename acct AS ACCT, auid AS AUID, addr AS ADDR 
| table ACCT, AUID, ADDR 
| outputlookup myAAAlookup.csv append=true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I am aware that I can run this to remove duplicates at search time.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputlookup myAAAlookup.csv 
| dedup ACCT,AUID,ADDR
| outputlookup myAAAlookup.csv append=true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, I want to remove all duplicate entries from the lookup table itself. The table should contain only 5 rows at this stage of testing. Instead, there are over 300 duplicate rows, and the number grows each time the dashboard is run.&lt;/P&gt;

&lt;P&gt;Thanks and God bless,&lt;BR /&gt;
Genesius&lt;/P&gt;</description>
      <pubDate>Fri, 04 Oct 2019 21:00:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493165#M137540</guid>
      <dc:creator>genesiusj</dc:creator>
      <dc:date>2019-10-04T21:00:11Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup Tables - Dedup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493166#M137541</link>
      <description>&lt;P&gt;Add the inputlookup command to your saved search to dedup before you output. &lt;BR /&gt;
Run it without the outputlookup command first for testing purposes. &lt;/P&gt;
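
&lt;P&gt;As a rough sketch of that test run (not necessarily the exact search intended here), the search can stop at the dedup step so the merged rows can be inspected before anything is written. It assumes the same index, token, and lookup file as in the question, and it places append=true before the file name, which is the order given in the documented inputlookup syntax.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="linuxevents" AND host=rub.us AND source="/var/log/audit/audit.log"
    AND acct="$userId_tok$"
| stats count by acct, auid, addr
| fields acct, auid, addr
| rename acct AS ACCT, auid AS AUID, addr AS ADDR
| inputlookup append=true myAAAlookup.csv
| dedup ACCT AUID ADDR
| table ACCT, AUID, ADDR
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If the result shows one row per ACCT, AUID, ADDR combination, the output step can be added back. Note that outputlookup replaces the file contents unless append=true is set, so writing this merged, deduplicated result without append should also clear out duplicates already stored in the file.&lt;/P&gt;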

&lt;PRE&gt;&lt;CODE&gt;index="linuxevents" AND host=rub.us AND source="/var/log/audit/audit.log" 
    AND acct="$userId_tok$"
| stats count as _count by acct, auid, addr
| rename acct AS ACCT, auid AS AUID, addr AS ADDR
| inputlookup myAAAlookup.csv append=true
| dedup ACCT AUID ADDR
| outputlookup myAAAlookup.csv append=true
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 04 Oct 2019 21:41:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493166#M137541</guid>
      <dc:creator>cmerriman</dc:creator>
      <dc:date>2019-10-04T21:41:47Z</dc:date>
    </item>
    <item>
      <title>Re: Lookup Tables - Dedup</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493167#M137542</link>
      <description>&lt;P&gt;@cmerriman &lt;BR /&gt;
Thank you for your reply.&lt;BR /&gt;
I have a couple of questions.&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;What is &lt;STRONG&gt;_count&lt;/STRONG&gt;?&lt;/LI&gt;
&lt;LI&gt;I understand "&lt;STRONG&gt;append=true&lt;/STRONG&gt;" for &lt;STRONG&gt;inputlookup&lt;/STRONG&gt;. Why is it used on the &lt;STRONG&gt;outputlookup&lt;/STRONG&gt;?&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Thanks and God bless,&lt;BR /&gt;
Genesius&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2019 19:14:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Lookup-Tables-Dedup/m-p/493167#M137542</guid>
      <dc:creator>genesiusj</dc:creator>
      <dc:date>2019-10-08T19:14:52Z</dc:date>
    </item>
  </channel>
</rss>

