So, I have one source (transactions) with userNumber and another source (users) with number. I want to join both of them. In each source, they have different field names. I want my table to have the employees name, which in in source users, which I get in my 2nd query in the join separately. Below is my SPL as of now:
index=* sourcetype=transaction
| stats dc(PARENT_ACCOUNT) as transactionMade by POSTDATE, USERNUMBER
| join left=L right=R where L.USERNUMBER=R.NUMBER [search sourcetype=users | stats values(NAME) as Employee by NUMBER]
| table USERNUMBER Employee PARENT_ACCOUNT POSTDATE transactionMade
What is it that I am doing wrong?
I was not getting any data together, that was what was wrong... sorry for the miscommunication. I implemented your idea, and I am getting data now! Thank you!!!
However, I am not getting the Employee column filled. It might because of the data issue. But, I wanted to know if we can label the fields by source. For example. I have UserNumber in both sources that mean different things and name in both sources that mean different things... How can I help Splunk differentiate them? Is there any resources would you suggest?
Thank you so much!
Try something like this
index=* sourcetype=transaction OR sourcetype=users
| eval CommonNumber=if(sourcetype="transaction", USERNUMBER, NUMBER)
| eventstats values(NAME) as Employee by CommonNumber
| stats dc(PARENT_ACCOUNT) as transactionMade values(Employee) as Employee values(USERNUMBER) as USERNUMBER by POSTDATE, CommonNumber
| table CommonNumber USERNUMBER Employee PARENT_ACCOUNT POSTDATE transactionMade
It depends what it is you are trying to do, and what you think is wrong. As it stands, PARENT_ACCOUNT is not a field beyond the stats command (since it isn't listed as an output field - dc just counts the distinct values of the field without listing them).
For the "join", you don't need a join (and they usually should be avoided if possible as they are slow and have limitations). Try something like this:
index=* sourcetype=transaction OR sourcetype=users
| eval USERNUMBER=coalesce(USERNUMBER, NUMBER)
| eventstats values(NAME) as Employee by USERNUMBER
| stats dc(PARENT_ACCOUNT) as transactionMade values(Employee) as Employee by POSTDATE, USERNUMBER
| table USERNUMBER Employee PARENT_ACCOUNT POSTDATE transactionMade