Solved: how to join 3 different data sources

Laya123 · ‎03-16-2016

Hi,

I have 3 different sources. I want to merge splunk search data with another data of 2 different csv files using inputlookup.

From all 3 different sources p_name is the common field. But csv files having more number of records (this file is having duplicate project names) than IVW logs. I want to lookup with csv file for latest values.

Example:

from IVW logs, I got output like

p_name; respondents; host
A 20 X
B 25 Y
C 5 Z
D 12 W
E 6 V

From A-N.csv, I have fileds like

p_code; O_Status; O_Server; Contact; SVer DateTime
A; Pass; AB1; abc@f ; 2; 23/01/2016 06:30:20

A; Pass; AB1; abc@f ; 2; 22/02/2016 05:10:20
A; fail; AB1; abc@f ; 2; 13/03/2016 08:30:20
A; Pass; AB1; abc@f ; 2; 05/03/2016 07:20:20
A; Pass; AB1; abc@f ; 2; 06/03/2016 05:30:20
A; Pass; AB1; abc@f ; 2; 11/03/2016 05:30:20
C; fail; NA Not found NA 15/01/2016 05:30:20
C; Pass; AB1; abc@f ; 2; 05/02/2016 05:30:20
C; Pass; AB1; abc@f ; 2; 15/02/2016 05:30:20
C; fail; AB1; abc@f ; 2; 02/03/2016 05:30:20
C; fail; AB1; abc@f ; 2; 13/03/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 12/01/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 11/02/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 10/02/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 25/02/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 02/03/2016 05:30:20
B; Pass; AB1; abc@f ; 2; 07/03/2016 05:30:20
B; pass; AB1; abc@f ; 2; 12/03/2016 05:30:20
H; Pass; AB1; abc@f ; 2; 10/02/2016 05:30:20
H; Pass; AB1; abc@f ; 2; 25/02/2016 05:30:20
H; Pass; AB1; abc@f ; 2; 02/03/2016 05:30:20
H; Pass; AB1; abc@f ; 2; 07/03/2016 05:30:20
M; fail; AB1; abc@f ; 2; 12/03/2016 05:30:20
M; Pass; AB1; abc@f ; 2; 02/03/2016 05:30:20
M; Pass; AB1; abc@f ; 2; 07/03/2016 05:30:20
M; fail; AB1; abc@f ; 2; 12/03/2016 05:30:20
M; Pass; AB1; abc@f ; 2; 10/02/2016 05:30:20

From D1.csv, I have fields like

Project; CCoun; CO1; Contacts

A; ABC ; AB ; AB@g
B; XYZ ; CD ; BC@g

C; LMN ; EF ; CD@g
D; PQR ; GH ; EF@g
E; STU ; YZ ; GH@g

I want my output like

p_name; respondents; host; CCoun; CO1; Contancts; O_status O_Server SVer

A ; 20 ; X; ABC ; AB ; AB@g ; Pass ; AB1 2
B ; 25 ; Y XYZ; CD ; BC@g ; pass ; AB1 2

C ; 5 ; Z LMN ; EF ; CD@g fail ; AB1 2
D ; 12 ; W PQR ; GH; EF@g

E ; 6 ; V STU ; YZ ; GH@g

Below query is giving results but in some places there is data in A-N.csv but not showing in output. Can any one help me to do this.

there is duplicate project names in A-N.csv, but when I am mapping with ivw data it has to show latest values based on the date in A-N.csv

index=ibm sourcetype=ivw host=ABC |eval p_name=upper(p_name) |chart distinct_count(respondentID) as respondents, values(host) as "host", count as Entries, |join max=0 type=outer "p_name" [|inputlookup A-N.csv |rename PCode as p_name|values(O_Server) as O_Server values(O_Status) as O_Status values(SVer) as SVer by p_name] |join max=0 type=outer "p_name" [|inputlookup D1.csv |rename project as p_name|stats values(CCoun) as CCoun values(CO1) as CO1 values(Contacts) as Contacts by p_name ]

Thanks in advance

somesoni2 · ‎03-16-2016

Looks like you've csv fiels ingested as lookup table files, so you don't need join at all. It could be as simple as this

index=ibm sourcetype=ivw host=ABC |eval p_name=upper(p_name) |chart distinct_count(respondentID) as respondents, values(host) as "host", count as Entries, 
| lookup A-N.csv PCode as p_name OUTPUT O_Server OStatus Sver 
| lookup D1.csv project as p_name OUTPUT CCount CO1 Contacts

Update

Give this a try

index=ibm sourcetype=ivw host=ABC |eval p_name=upper(p_name) |chart distinct_count(respondentID) as respondents, values(host) as "host", count as Entries by p_name
| lookup D1.csv project as p_name OUTPUT CCount CO1 Contacts
| append [| inputlookup A-N.csv | rename PCode as p_name | sort 0 p_name -DateTime | dedup p_name | table p_name O_Server OStatus Sver ]
| stats values(*) as * by p_name

View solution in original post

somesoni2 · ‎03-16-2016

Looks like you've csv fiels ingested as lookup table files, so you don't need join at all. It could be as simple as this

index=ibm sourcetype=ivw host=ABC |eval p_name=upper(p_name) |chart distinct_count(respondentID) as respondents, values(host) as "host", count as Entries, 
| lookup A-N.csv PCode as p_name OUTPUT O_Server OStatus Sver 
| lookup D1.csv project as p_name OUTPUT CCount CO1 Contacts

Update

Give this a try

index=ibm sourcetype=ivw host=ABC |eval p_name=upper(p_name) |chart distinct_count(respondentID) as respondents, values(host) as "host", count as Entries by p_name
| lookup D1.csv project as p_name OUTPUT CCount CO1 Contacts
| append [| inputlookup A-N.csv | rename PCode as p_name | sort 0 p_name -DateTime | dedup p_name | table p_name O_Server OStatus Sver ]
| stats values(*) as * by p_name

Laya123 · ‎03-16-2016

Hi,

When I used this query I am getting all values from A-N.csv but I don't want all values from A-N.csv for one project I want to display only latest value from the A-N.csv.

Means from my sample data If Project A is having more than one OStatus but I want only OStatus of latest. If Ostatus for A is having fail on 15/Feb/2016 but on 16/Mar/2016 the project status was Pass I want to display only this pass in my output not all values from A-N.csv

Thanks

Laya123 · ‎03-16-2016

Thank you so much the update one worked for me

Thanks

asimagu · ‎03-16-2016

this may point you in the right direction

https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...

Laya123 · ‎03-16-2016

Thank you for your immediate response.

I tried this but no luck. Here I am mapping values with csv

Thanks & Regards

how to join 3 different data sources

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Automating Threat Operations and Threat Hunting with Recorded Future

Join the Conversation