<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: List difference between two csv files in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41270#M7661</link>
    <description>&lt;P&gt;Have you tried the following for the second search? &lt;BR /&gt;
| inputlookup InactiveCustomers.csv &lt;BR /&gt;
| search NOT &lt;BR /&gt;
[| inputlookup SynchedCustomers.csv ]&lt;/P&gt;</description>
    <pubDate>Tue, 21 Aug 2012 21:47:21 GMT</pubDate>
    <dc:creator>chris</dc:creator>
    <dc:date>2012-08-21T21:47:21Z</dc:date>
    <item>
      <title>List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41268#M7659</link>
      <description>&lt;P&gt;I have three CSV files. One is a list of all customers that have logged into my system in the past 24 hours. The second is the master list of all of my customers. I want to list the difference between the two CSV files or, if you prefer, I want to list the customers that have not logged in in the past 24 hours. &lt;/P&gt;

&lt;P&gt;CSV file 1: AllCustomers.csv, a static list containing more fields than CSV file 2.&lt;/P&gt;

&lt;P&gt;CSV file 2: InactiveCustomers.csv, a static list of all customers and reasons why they might be inactive. This file has two columns: cs_username and Reason.&lt;/P&gt;

&lt;P&gt;CSV file 3: SynchedCustomers.csv, a list of customers who have logged in in the past 24 hours. This file has one column, cs_username.&lt;/P&gt;

&lt;P&gt;The first search writes an accurate list of all active customers to SynchedCustomers.csv:&lt;/P&gt;

&lt;PRE&gt;
sourcetype="iis" cs_uri_stem=*configs.xml
| lookup AllCustomers.csv cs_username
| dedup cs_username
| fields cs_username
| table cs_username
| outputlookup SynchedCustomers.csv
&lt;/PRE&gt;

&lt;P&gt;With a second search, all I want is to list both fields from InactiveCustomers.csv for the customers who are &lt;STRONG&gt;NOT&lt;/STRONG&gt; found in SynchedCustomers.csv.&lt;/P&gt;

&lt;P&gt;This search returns more than the inactive customers:&lt;/P&gt;

&lt;PRE&gt;
| inputlookup InactiveCustomers.csv 
| search NOT 
[search SynchedCustomers.csv | fields cs_username]
&lt;/PRE&gt;

&lt;P&gt;What am I doing wrong?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 12:18:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41268#M7659</guid>
      <dc:creator>kmattern</dc:creator>
      <dc:date>2020-09-28T12:18:18Z</dc:date>
    </item>
    <item>
      <title>Re: List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41269#M7660</link>
      <description>&lt;P&gt;It returns MORE than the inactive customers? In that case by definition InactiveCustomers.csv does not in fact only contain inactive customers, as that's what the initial results are loaded from. Is that correct and expected for some reason?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Aug 2012 21:14:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41269#M7660</guid>
      <dc:creator>Ayn</dc:creator>
      <dc:date>2012-08-21T21:14:07Z</dc:date>
    </item>
    <item>
      <title>Re: List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41270#M7661</link>
      <description>&lt;P&gt;Have you tried the following for the second search? &lt;BR /&gt;
| inputlookup InactiveCustomers.csv &lt;BR /&gt;
| search NOT &lt;BR /&gt;
[| inputlookup SynchedCustomers.csv ]&lt;/P&gt;</description>
      <pubDate>Tue, 21 Aug 2012 21:47:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41270#M7661</guid>
      <dc:creator>chris</dc:creator>
      <dc:date>2012-08-21T21:47:21Z</dc:date>
    </item>
    <item>
      <title>Re: List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41271#M7662</link>
      <description>&lt;P&gt;It looks like you want the set command:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| set diff [| inputlookup InactiveCustomers.csv | fields cs_username] [| inputlookup SynchedCustomers.csv | fields cs_username]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This returns the list of usernames. Then look up each username in InactiveCustomers.csv to add the inactive reasons back in:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | lookup InactiveCustomers.csv cs_username OUTPUT Reason | table cs_username Reason
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The full search is then:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| set diff [| inputlookup InactiveCustomers.csv | fields cs_username] [| inputlookup SynchedCustomers.csv | fields cs_username] | lookup InactiveCustomers.csv cs_username OUTPUT Reason | table cs_username Reason
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Then you can get fancy and start running stats like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[above search] | stats count values(cs_username) by Reason
&lt;/CODE&gt;&lt;/PRE&gt;
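
&lt;P&gt;As an alternative sketch (not taken from the posts in this thread), the same list can probably be built without set by looking up each inactive customer in SynchedCustomers.csv and keeping only the rows with no match. This assumes cs_username is the key column in both files; the field name synched is made up for illustration and does not exist in either file:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| inputlookup InactiveCustomers.csv
| lookup SynchedCustomers.csv cs_username OUTPUT cs_username AS synched
| where isnull(synched)
| table cs_username Reason
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Because this starts from InactiveCustomers.csv, usernames that appear only in SynchedCustomers.csv never enter the results.&lt;/P&gt;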

&lt;P&gt;More on the set command:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Set"&gt;http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Set&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Aug 2012 17:46:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41271#M7662</guid>
      <dc:creator>tfletcher_splun</dc:creator>
      <dc:date>2012-08-22T17:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41272#M7663</link>
      <description>&lt;P&gt;Sorry, I forgot that you need to filter out the usernames in Synched that are not in Inactive. Use this search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| set diff [| inputlookup InactiveCustomers.csv | fields cs_username] [| inputlookup SynchedCustomers.csv | fields cs_username] | lookup InactiveCustomers.csv cs_username OUTPUT Reason | search Reason=* | table cs_username Reason
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 22 Aug 2012 17:48:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41272#M7663</guid>
      <dc:creator>tfletcher_splun</dc:creator>
      <dc:date>2012-08-22T17:48:41Z</dc:date>
    </item>
    <item>
      <title>Re: List difference between two csv files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41273#M7664</link>
      <description>&lt;P&gt;Thanks, that worked. I had to run a lot of simple searches to verify it but it is the best solution yet.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2012 13:41:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/List-difference-between-two-csv-files/m-p/41273#M7664</guid>
      <dc:creator>kmattern</dc:creator>
      <dc:date>2012-08-23T13:41:30Z</dc:date>
    </item>
  </channel>
</rss>

