Splunk Search
Highlighted

How to combine data from two CSV files using a common field where this field has a different name in each file?

Motivator

Hi,

I have 2 CSV files that have a unique key of sorts.... The of sorts is, in one CSV file the unique key is named PREMISE, but in the other file it is named PREMISE_ID. What would be the best way to "join" the two files together using Splunk?

0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Legend

Not sure what you mean by join. You could try something like this.

source=file1.csv OR source=file2.csv | eval PREMISE=coalsce(PREMISE, PREMISE_ID) | stats count by PREMISE

This will give you a count of event grouped by PREMISE across both files.

Now, if you want to do a JOIN like a DB JOIN, then you could do something like this

source=file1.csv | JOIN PREMISE [ search source=file2.csv | rename PREMISE_ID as PREMISE | table PREMISE_ID ] 
0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Motivator

Hi Sundareshr,

I think you are on to something. Let me clarify a bit

file1.csv has fields:

PREMISE
TICKETNUMBER
REASON
CODE
blah

file2.csv has fields
PREMISEID
LAST
NAME
FIRST_NAME
blah

I'm trying to join/link/reference/whatever you want to call it, the two files based off of PREMISE/PREMISEID. The two fields have slightly different names but the data is the same. Meaning PREMISE=1 in file1.csv equals PREMISEID=1 in file2.csv

0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Legend

Do any of the fields repeat in both files? Try this

source=1.csv OR source=2.csv | eval premise=coalesce(premise, premise_id) | stats list(lastname) as lname list(firstname) as fname list(ticketnumber) tnbr etc by premise

If this doesn't give you what you're looking for, you could do the join

source=file1.csv | JOIN PREMISE [ search source=file2.csv | rename PREMISE_ID as PREMISE | table PREMISE_ID LAST_NAME blah ] | table TICKET_NUMBER REASON_CODE blah
0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Motivator

Hmmmm, I was certain that it would work but oddly the query isn't returning any results

Here is what I have

index=top10 source="/home/oracle/workdir/account_log.csv" |join PREMISE [search source="/home/oracle/workdir/overalllocationdata.csv" | rename PREMISE_ID as PREMISE|table PREMISE LAST_NAME]|table TICKET_CODE,CHANNEL

Where did I go wrong?

0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Legend

Do you get results when you run the queries separately? Try:

index=top10 source="/home/oracle/workdir/account_log.csv"  | table TICKET_CODE,CHANNEL

Do you get results you would expect? Then try

source="/home/oracle/workdir/overalllocationdata.csv" | rename PREMISE_ID as PREMISE|table PREMISE LAST_NAME]

Do you get results you would expect?

Are the field names correct (case sensitive)? Try adding a PREMISE=* before the | join

0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Motivator

Ok, scratching head a bit,

If I do this it works

index=top10 source="/home/oracle/workdir/account_log.csv" | table TICKET_CODE,CHANNEL

and if I do this it works

index=top10 source="/home/oracle/workdir/overalllocationdata.csv" | rename PREMISE_ID as PREMISE|table PREMISE LAST_NAME

aside for the starting index=top10 the commands seem to work independently (nice trick by the way)

but combining them with the join and I get 0 results even though I've hacked at it trying several different combinations. So it is something I'm doing wrong with the join.

No joy with this one either

index=top10 source="/home/oracle/workdir/account_log.csv" PREMISE="*" |join PREMISE [search source="/home/oracle/workdir/overalllocationdata.csv" | rename PREMISE_ID as PREMISE|table PREMISE LAST_NAME]|table TICKET_CODE,CHANNEL

I double checked and all the field names are in UPPER case.

I'm at the point where my head hurts and I'm wondering why there is a bloody spot on the wall......

0 Karma
Highlighted

Re: How to combine data from two CSV files using a common field where this field has a different name in each file?

Motivator

also just FYI here is the complete tables and fields schema

AccountLog.csv
CHANNEL
CID
CPE
CONVERTED
CREATIONDATE
DIFF
OBJECT
CPEID
OBJECT
TYPE
OPERATIONTYPE
PREMISE
PREMISE
RANK
TICKET_CODE

OverallLocationData.csv
PREMISEID
ADDRESS1
ADDRESS2
CITY
COUNTRY
FIRST
NAME
LASTNAME
LATITUDE
LAT
LNGTYPE
LONGITUDE
POSTAL
CODE
PROVINCE

0 Karma