<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165429#M33560</link>
    <description>&lt;P&gt;Ok.  Slight change in the results but still sorting the same way as before.  I think the explanation might be better served in another way.  The file numbers may not correspond to previous work.&lt;/P&gt;

&lt;P&gt;File 1 - Accounts.csv - Key fields: "Account Name" "Sales Representative"&lt;BR /&gt;
File 2 - Contacts.csv - Key fields:  "Email" &amp;amp; "Company Name"&lt;BR /&gt;
File 3 - Attendees.csv - Key fields: "E-mail", "Full Name"&lt;/P&gt;

&lt;P&gt;Links:  "Account Name" = "Company Name" and "Email" = "Email"&lt;/P&gt;

&lt;P&gt;Objectives: &lt;/P&gt;

&lt;P&gt;1) Assign "Sales Representative" from Accounts.csv to each attendee "Full Name"  in Attendees.csv&lt;BR /&gt;
2) Count how many attendees are participating in event by sales rep.  e.g.  Joe Smith (Rep1) has 15 attendees, Jane Laird (Rep2) has 20 attendees), etc.&lt;/P&gt;

&lt;P&gt;It is a different way of processing it.  Does this help?&lt;/P&gt;

&lt;P&gt;Sandy&lt;/P&gt;</description>
    <pubDate>Mon, 27 Apr 2015 14:39:12 GMT</pubDate>
    <dc:creator>sandyelrick</dc:creator>
    <dc:date>2015-04-27T14:39:12Z</dc:date>
    <item>
      <title>How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165423#M33554</link>
      <description>&lt;P&gt;Hello Everyone,&lt;/P&gt;

&lt;P&gt;Here is the scenario.  I have three source CSV files with joining fields: &lt;/P&gt;

&lt;P&gt;file1 field1 = file2 field2&lt;BR /&gt;
file2 field3 = file3 field4&lt;/P&gt;

&lt;P&gt;I need to get a count of entries by association... a sale rep is listed in file1 and there are one or more customers in file3.  File2 is the key to bring everything together and I cannot figure out how to match and count how many customers there are per sales rep.&lt;/P&gt;

&lt;P&gt;Any thoughts?&lt;/P&gt;

&lt;P&gt;Sandy&lt;/P&gt;</description>
      <pubDate>Fri, 24 Apr 2015 15:27:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165423#M33554</guid>
      <dc:creator>sandyelrick</dc:creator>
      <dc:date>2015-04-24T15:27:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165424#M33555</link>
      <description>&lt;P&gt;Try this&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=file2
| stats count by field2
| join type=outer field2 [ search source=file1 | eval field2=field1 ]
| stats sum(count) as NumCustomers by SalesRep
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;assuming that file1 contains a field named "SalesRep" that identifies the sales rep. If there is one entry in file2 for each customer and sales rep combination, you don't actually need file3 if all you want is a count.&lt;/P&gt;</description>
      <pubDate>Fri, 24 Apr 2015 15:52:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165424#M33555</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-04-24T15:52:48Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165425#M33556</link>
      <description>&lt;P&gt;Hello Lisa,&lt;/P&gt;

&lt;P&gt;I can see where you are going but this only references two of the three sources, so I don't see how this can complete the  count.  Perhaps the correlation is too complex?&lt;/P&gt;</description>
      <pubDate>Fri, 24 Apr 2015 16:15:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165425#M33556</guid>
      <dc:creator>sandyelrick</dc:creator>
      <dc:date>2015-04-24T16:15:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165426#M33557</link>
      <description>&lt;P&gt;It is definitely not too complex for Splunk; we can search  as many sources as we want. But let me explain with data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;file1
SalesRep,field1
Jo Smith,1
Kerry Lee,2
S Rick,3

file2
field2,field3
1,100
1,101
2,102
3,103
3,104
3,105

file3
Customer,field4
ABC Co,100
XYZ Co,101
abc industries,102
xyz industries,103
myCo Inc,104
yourCo Inc,105
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I've written these out as CSV files, but this will work exactly the same no matter what the file format as long as the fields exist. In &lt;CODE&gt;file2&lt;/CODE&gt;, there is clearly a mapping between the sales rep and the customer. And, if all we want is the count of customers, we can count them up in file2 - there is no need to actually get the customer names from file3.&lt;/P&gt;

&lt;P&gt;If I simply count the number of records for each sales rep (field2) in file2, I have the counts. All I need to do is to join file1 to the counts so that I can identify the sales rep.&lt;/P&gt;

&lt;P&gt;But if you &lt;EM&gt;must&lt;/EM&gt; join in file3, you can do this&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=file2
| join type=outer field2 [ search source=file1 | eval field2=field1 ]
| join type=outer field3 [ search source=file3 | eval field3=field4 ]
| stats count by SalesRep
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You will get the same answer, but the search will be a lot less efficient.&lt;/P&gt;</description>
      <pubDate>Fri, 24 Apr 2015 19:25:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165426#M33557</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-04-24T19:25:35Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165427#M33558</link>
      <description>&lt;P&gt;Hello Lisa,&lt;/P&gt;

&lt;P&gt;I think I finally got this to work.  I think my logic has been incorrect as the result just reflects a count of all customers of file1 for each salesrep as opposed to only customers in file3 for each salesrep.&lt;/P&gt;

&lt;P&gt;Hmmm!&lt;/P&gt;</description>
      <pubDate>Fri, 24 Apr 2015 20:35:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165427#M33558</guid>
      <dc:creator>sandyelrick</dc:creator>
      <dc:date>2015-04-24T20:35:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165428#M33559</link>
      <description>&lt;P&gt;Ah, so there could be entries in file2 that don't actually correspond to anything in file3!&lt;/P&gt;

&lt;P&gt;Okay, then change to this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; source=file2
 | join type=outer field2 [ search source=file1 | eval field2=field1 ]
 | join type=outer field3 [ search source=file3 | eval field3=field4 ]
 | stats dc(CustNo) by SalesRep
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Assuming that file1 contains a field called SalesRep and file3 contains a field called CustNo&lt;/P&gt;</description>
      <pubDate>Sun, 26 Apr 2015 20:27:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165428#M33559</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-04-26T20:27:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165429#M33560</link>
      <description>&lt;P&gt;Ok.  Slight change in the results but still sorting the same way as before.  I think the explanation might be better served in another way.  The file numbers may not correspond to previous work.&lt;/P&gt;

&lt;P&gt;File 1 - Accounts.csv - Key fields: "Account Name" "Sales Representative"&lt;BR /&gt;
File 2 - Contacts.csv - Key fields:  "Email" &amp;amp; "Company Name"&lt;BR /&gt;
File 3 - Attendees.csv - Key fields: "E-mail", "Full Name"&lt;/P&gt;

&lt;P&gt;Links:  "Account Name" = "Company Name" and "Email" = "Email"&lt;/P&gt;

&lt;P&gt;Objectives: &lt;/P&gt;

&lt;P&gt;1) Assign "Sales Representative" from Accounts.csv to each attendee "Full Name"  in Attendees.csv&lt;BR /&gt;
2) Count how many attendees are participating in event by sales rep.  e.g.  Joe Smith (Rep1) has 15 attendees, Jane Laird (Rep2) has 20 attendees), etc.&lt;/P&gt;

&lt;P&gt;It is a different way of processing it.  Does this help?&lt;/P&gt;

&lt;P&gt;Sandy&lt;/P&gt;</description>
      <pubDate>Mon, 27 Apr 2015 14:39:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165429#M33560</guid>
      <dc:creator>sandyelrick</dc:creator>
      <dc:date>2015-04-27T14:39:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to correlate data from three CSV file sources with joining fields and run a stats count on the final result?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165430#M33561</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;  source=Contacts.csv
  | join "Company Name" [ search source=Accounts.csv 
                            | rename "Account Name" as "Company Name" ]
  | join  Email [ search source=Attendees.csv  ]
  | stats dc("Full Name") as "Attendee Count" by "Sales Representative"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 27 Apr 2015 19:51:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-correlate-data-from-three-CSV-file-sources-with-joining/m-p/165430#M33561</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-04-27T19:51:59Z</dc:date>
    </item>
  </channel>
</rss>

