Splunk Search

How to combine two sources using common field.

rfernandez2010
New Member

Hi everyone,

I just start using splunk and hit a road block.

Using two sources (Loaninfo and Loanapp), my end goal is to extract the "APPLICATION_DATE" found in Loanapp using the "ACCOUNT_NUMBER" and "APPLICATION_ID" from Loaninfo as references.

index=[test_index] source="*LOANINFO*"  REFERENCE="SPRING SALE 16" OR REFERENCE="SPRING SALE 17"
|dedup ACCOUNT_NUMBER, LOAN_ID
|table ACCOUNT_NUMBER, LOAN_ID, ORIGINAL_BALANCE, LOAN_BALANCE,BUREAU_SCORE_EQUIFAX,PAYMENT_COUNT,PAYMENT_FREQUENCY,PAYMENT_SKIPS,LOAN_PURPOSE,OPEN_DATE,CLOSE_DATE,APPLICATION_DATE,APPLICATION_ID,REFERENCE
|rename LOAN_BALANCE as "CURRENT_BALANCE"
|sort REFERENCE, ACCOUNT_NUMBER

Depending on the manager's response, we might have to search a third source (TRACKINGINFO) to pick up specific detail on the loans like make and model of the car.

My problem right now is not knowing what the syntax to reference two (possibly three) sources using the information found in another source.

Thank you in advance.

"Help Me Splunk Answers, You're my only hope."

0 Karma
1 Solution

DalJeanis
Legend

If you come from a SQL background, then read this one first

https://answers.splunk.com/answers/561130/sql-analogy-for-my-log-search-join-two-tables-wher.html

After that, read these...

https://answers.splunk.com/answers/524250/how-to-search-for-matches-in-two-different-searche.html
https://answers.splunk.com/answers/593621/combine-different-fileds-from-different-events.html#answer...
https://answers.splunk.com/answers/527046/join-data-from-two-indexes.html

Each of those describes very well what you are trying to do, with minor differences in the use case. You should have a really good idea of the correct approach after reading any two of those.

View solution in original post

0 Karma

DalJeanis
Legend

If you come from a SQL background, then read this one first

https://answers.splunk.com/answers/561130/sql-analogy-for-my-log-search-join-two-tables-wher.html

After that, read these...

https://answers.splunk.com/answers/524250/how-to-search-for-matches-in-two-different-searche.html
https://answers.splunk.com/answers/593621/combine-different-fileds-from-different-events.html#answer...
https://answers.splunk.com/answers/527046/join-data-from-two-indexes.html

Each of those describes very well what you are trying to do, with minor differences in the use case. You should have a really good idea of the correct approach after reading any two of those.

0 Karma

rfernandez2010
New Member

Thank you DalJeanis, this helped a lot.

0 Karma

rfernandez2010
New Member

Maybe this can help:

index=[test_index] source="*LOANINFO*"  REFERENCE="SPRING SALE 16" OR REFERENCE="SPRING SALE 17"
    |eval APP_ID = APPLICATION_ID, ACCTNUM = ACCOUNT_NUMBER
    |join APPLICATION_DATE search [source="*LOANAPP*" APPLICATION_ID=APP_ID  ACCOUNT_NUMBER=ACCTNUM ]
    |dedup ACCOUNT_NUMBER, LOAN_ID
    |table ACCOUNT_NUMBER, LOAN_ID, ORIGINAL_BALANCE, LOAN_BALANCE,BUREAU_SCORE_EQUIFAX,PAYMENT_COUNT,PAYMENT_FREQUENCY,PAYMENT_SKIPS,LOAN_PURPOSE,OPEN_DATE,CLOSE_DATE,APPLICATION_DATE,APPLICATION_ID,REFERENCE
    |rename LOAN_BALANCE as "CURRENT_BALANCE"
    |sort REFERENCE, ACCOUNT_NUMBER

I think the problem lies in how I'm using join or the second search.

0 Karma

mayurr98
Super Champion

You said in comments that application_date is not a common field between two sources then how can you join using application_date?
Also join is a very expensive command when it comes to performance instead use stats if you can !

0 Karma

493669
Super Champion

Not able to understand what you want to achieve here .....

0 Karma

mayurr98
Super Champion

Hey you can try something like this

index=<your_index>  (source=loaninfo OR source=loanapp) | stats values(APPLICATION_DATE) as APPLICATION_DATE values(ACCOUNT_NUMBER) as ACCOUNT_NUMBER values(LOAN_ID) as LOAN_ID values(ORIGINAL_BALANCE) as ORIGINAL_BALANCE values(LOAN_BALANCE) as CURRENT_BALANCE values(BUREAU_SCORE_EQUIFAX) as BUREAU_SCORE_EQUIFAX values(PAYMENT_COUNT) as PAYMENT_COUNT values(PAYMENT_FREQUENCY) as PAYMENT_FREQUENCY values(PAYMENT_SKIPS) as PAYMENT_SKIPS values(LOAN_PURPOSE) as LOAN_PURPOSE values(OPEN_DATE) as OPEN_DATE values(CLOSE_DATE) as CLOSE_DATE values(REFERENCE) as REFERENCE by APPLICATION_NUMBER APPLICATION_ID  |sort REFERENCE, ACCOUNT_NUMBER

You need to tune this query a bit according to your need

let me know if this helps!

0 Karma

rfernandez2010
New Member

Sorry, I should give a bit more detail. What I'm attempting to do is create a report for our lending department which list all the instances where a customer purchased a car during our spring auto sale promotion for 2016 and 2017. They are requesting account numbers, balances, loan ids, application dates, and more. All our loan information is stored in a source called loaninfo. but information pertaining to the application including the day the application was started is stored in another source called loanapp. What I need to do is using the application id and account number found in source=loaninfo, and match it to the corresponding application in source=loanapp to include the application date in my report.

0 Karma

mayurr98
Super Champion

so can you tell me what is common in both sources app id or account number or both?it would be easy for me to build a query if you give me a sample input from both the sources

0 Karma

rfernandez2010
New Member

Sure the common field would be ACCOUNT_NUMBER and APPLICATION_ID. For lack of better words, these would be the primary keys which both sources share. We need to use both fields since accounts can have multiple applications assigned to them but each APPLICATION_ID is unique.

0 Karma

mayurr98
Super Champion

i have changed my query pls check..query would not work if you do not have APPLICATION_NUMBER and APPLICATION_ID fields in both sources. Also one more things field names should be common as well as in if source=loaninfo contains APPLICATION_NUMBER then source=loanapp must also contain APPLICATION_NUMBER exactly same name

0 Karma

rfernandez2010
New Member

Unfortunately, the query you wrote did not work. I might just be having a hard time trying to explain exactly what I need.

Maybe this can help:

index=symitar_test2 source="*LOANINFO*"  REFERENCE="SPRING SALE 16" OR REFERENCE="SPRING SALE 17"
    |eval APP_ID = APPLICATION_ID, ACCTNUM = ACCOUNT_NUMBER
    |join APPLICATION_DATE search [source="*LOANAPP*" APPLICATION_ID=APP_ID  ACCOUNT_NUMBER=ACCTNUM ]
    |dedup ACCOUNT_NUMBER, LOAN_ID
    |table ACCOUNT_NUMBER, LOAN_ID, ORIGINAL_BALANCE, LOAN_BALANCE,BUREAU_SCORE_EQUIFAX,PAYMENT_COUNT,PAYMENT_FREQUENCY,PAYMENT_SKIPS,LOAN_PURPOSE,OPEN_DATE,CLOSE_DATE,APPLICATION_DATE,APPLICATION_ID,REFERENCE
    |rename LOAN_BALANCE as "CURRENT_BALANCE"
    |sort REFERENCE, ACCOUNT_NUMBER

I think the problem lies in how I'm using join or the second search.

0 Karma
Get Updates on the Splunk Community!

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Splunk Education Goes to Washington | Splunk GovSummit 2024

If you’re in the Washington, D.C. area, this is your opportunity to take your career and Splunk skills to the ...