Splunk Search

Join in Splunk

cx233alvin
Explorer

Hi there!

Just want to ask if possible to execute a non-equijoin in Splunk? A non-equijoin (in SQL) joins two table using non-equal operator such as <, >, BETWEEN-AND etc.

I have two tables:

Employees
Staff ID | Log-in | Log-out | Location
1001 | 12-30-2017 11:30:00 | 12-30-2017 23:30:00 | LDP1
1002 | 12-30-2017 12:30:00 | 12-30-2017 20:30:00 | LDP3

Sales
Txn Date Time | Location | Txn Type
12-30-2017 11:48:17 | LDP1 | Sale

I want to count the number of transactions done by each staff based on their location, log-in and log-out time

Example:
Staff ID | Date |Location | Count of Txns
1 | 12-30-2017 | LDP 1 | 345
2 | 12-30-2017 | LDP 2 | 416

Thanks in advance.

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

Try this

(index="sales" AND Txn-Type=Sale) OR (index="employee")
| eval timestamp =strptime( if(index="employee",'Log-in','Txn-Date-Time'),"%d-%m-%Y %H:%M:%S")
| sort Location "Staff-ID" timestamp index
| streamstats current=f reset_on_change=t 
        last("Staff-ID") as employee last("Log-in") as login last("Log-out") as logout by location "Staff-ID"
| eval login=strptime(login,"%d-%m-%Y %H:%M:%S")
| eval logout=strptime(logout,"%d-%m-%Y %H:%M:%S")
| where index="sales" and timestamp >= login and timestamp <= logout
| eval Date=strftime(timestamp,"%x"
| stats count as "Count of Txns" by employee date Location

You may need to play around with streamstats command to get this to work as you want. I didn't have any test data to verify that this will work.

View solution in original post

cx233alvin
Explorer

Thanks @niketnilay and @lguinn for your replies. I was able to do it based on your suggestions.

Similar to @lguinn, I just sorted the rows by location and loginTime/ txnDateTime. Then, I used streamstats to get the last login, logout and employeeID. I used eval to check whether the sale transaction is within the login and logout time (set 1 if yes, otherwise 0). Finally, I used sum to get the total number of transactions per teller.

lguinn2
Legend

Try this

(index="sales" AND Txn-Type=Sale) OR (index="employee")
| eval timestamp =strptime( if(index="employee",'Log-in','Txn-Date-Time'),"%d-%m-%Y %H:%M:%S")
| sort Location "Staff-ID" timestamp index
| streamstats current=f reset_on_change=t 
        last("Staff-ID") as employee last("Log-in") as login last("Log-out") as logout by location "Staff-ID"
| eval login=strptime(login,"%d-%m-%Y %H:%M:%S")
| eval logout=strptime(logout,"%d-%m-%Y %H:%M:%S")
| where index="sales" and timestamp >= login and timestamp <= logout
| eval Date=strftime(timestamp,"%x"
| stats count as "Count of Txns" by employee date Location

You may need to play around with streamstats command to get this to work as you want. I didn't have any test data to verify that this will work.

niketn
Legend

@lguinn [Splunk], just FYI, following is the query I used to mock up data for my SPL above in comment. I have not posted mine as answer because it will work for first 100 match per location.

| makeresults
| eval employee="1001 | 12-30-2017 11:30:00 | 12-30-2017 23:30:00 | LDP1;1002 | 12-30-2017 12:30:00 | 12-30-2017 13:30:00 | LDP1;1003 | 12-30-2017 22:30:00 | 12-30-2017 23:30:00 | LDP2"
| makemv employee delim=";"
| mvexpand employee
| eval employee=replace(employee," ","")
| makemv employee delim="|"
| eval Staff-ID=mvindex(employee,0),Log-in=mvindex(employee,1),Log-out=mvindex(employee,2),Location=mvindex(employee,3)
| fields Staff-ID, Log-in,Log-out,Location
| eval _time=strptime('Log-in',"%m-%d-%Y%H:%M:%S")
| eval index="employee"
| append [| makeresults
| eval sales="12-30-2017 11:48:17 | LDP1 | Sale;12-30-2017 13:48:17 | LDP1 | Sale"
| makemv sales delim=";"
| mvexpand sales
| eval sales=replace(sales," ","")
| makemv sales delim="|"
| eval Txn-Date-Time=mvindex(sales,0),Location=mvindex(sales,1),Txn-Type=mvindex(sales,2)
| fields Txn-Date-Time,Location,Txn-Type
| eval _time=strptime('Txn-Date-Time',"%m-%d-%Y%H:%M:%S")
| eval index="sales"]
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

lguinn2
Legend

How do you know which employee made which sale? Is it possible to have 2 employees at the same location at the same time?

0 Karma

cx233alvin
Explorer

You can determine which employee performed the sale if:
a. the date and timestamp of sale is within the log-in and logout of employee; AND
b. the location of sale is equal to location of employee.

No. there is only one employee assigned at specific time and location. It is not possible to attribute 1 sale to multiple employees.

0 Karma

niketn
Legend

@cx233alvin, since Location is the correlating field between employee and sale (because Staff-ID is missing in sale). If per location you can find up to 100 StaffID who made the sale, you can try the following:

 (index="sales" AND Txn-Type=Sale) OR (index="employee")
| stats list(Staff-ID) as Staff-ID list(Log-in) as Log-in list(Log-out) as Log-out list(Txn-Date-Time) as Txn-Date-Time values(index) as index by Location
| search index="employee" index="sales"
| eval data=mvzip('Staff-ID',mvzip('Log-in',mvzip('Log-out','Txn-Date-Time')))
| fields - "Staff-ID" "Log-in" "Log-out" "Txn-Date-Time"
| mvexpand data
| eval data=split(data,",")
| eval "Staff-ID"=mvindex(data,0), "Log-in"=mvindex(data,1), "Log-out"=mvindex(data,2), "Txn-Date-Time"=mvindex(data,3)
| fields - data
| eval Log-in=strptime('Log-in',"%m-%d-%Y%H:%M:%S"), Log-out=strptime('Log-out',"%m-%d-%Y%H:%M:%S"), Txn-Date-Time=strptime('Txn-Date-Time',"%m-%d-%Y%H:%M:%S")
| where 'Log-in'<='Txn-Date-Time' AND 'Txn-Date-Time'<='Log-out'
| fieldformat "Log-in"=strftime('Log-in',"%m-%d-%Y %H:%M:%S")
| fieldformat "Log-out"=strftime('Log-out',"%m-%d-%Y %H:%M:%S")
| fieldformat "Txn-Date-Time"=strftime('Txn-Date-Time',"%m-%d-%Y %H:%M:%S")
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

bangalorep
Communicator

Hello! Is all the data in one index? Or are they spread across multiple?

0 Karma

cx233alvin
Explorer

They are from 2 different indices.

0 Karma

bangalorep
Communicator
0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...