Hi there!
Just want to ask: is it possible to execute a non-equijoin in Splunk? A non-equijoin (in SQL) joins two tables using non-equality operators such as <, >, BETWEEN ... AND, etc.
I have two tables:
Employees
Staff ID | Log-in | Log-out | Location
1001 | 12-30-2017 11:30:00 | 12-30-2017 23:30:00 | LDP1
1002 | 12-30-2017 12:30:00 | 12-30-2017 20:30:00 | LDP3
Sales
Txn Date Time | Location | Txn Type
12-30-2017 11:48:17 | LDP1 | Sale
I want to count the number of transactions done by each staff member, based on their location and their log-in and log-out times.
Example:
Staff ID | Date | Location | Count of Txns
1 | 12-30-2017 | LDP1 | 345
2 | 12-30-2017 | LDP2 | 416
Thanks in advance.

Try this:
(index="sales" AND Txn-Type=Sale) OR (index="employee")
| eval timestamp=strptime(if(index="employee",'Log-in','Txn-Date-Time'),"%m-%d-%Y %H:%M:%S")
| sort Location timestamp index
| streamstats current=f last("Staff-ID") as employee last("Log-in") as login last("Log-out") as logout by Location
| eval login=strptime(login,"%m-%d-%Y %H:%M:%S")
| eval logout=strptime(logout,"%m-%d-%Y %H:%M:%S")
| where index="sales" AND timestamp>=login AND timestamp<=logout
| eval Date=strftime(timestamp,"%x")
| stats count as "Count of Txns" by employee Date Location
You may need to play around with the streamstats command to get this to work as you want. I didn't have any test data to verify that this will work.
Thanks @niketnilay and @lguinn for your replies. I was able to do it based on your suggestions.
Similar to @lguinn's approach, I just sorted the rows by location and loginTime/txnDateTime. Then I used streamstats to get the last login, logout, and employeeID. I used eval to check whether each sale transaction falls within the login and logout times (set 1 if yes, otherwise 0). Finally, I used sum to get the total number of transactions per teller, as sketched below.
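For reference, here is a minimal sketch of that approach; it assumes the field names from the sample tables above (Staff-ID, Log-in, Log-out, Txn-Date-Time, Location) and the two indexes mentioned later in this thread, so treat it as an illustration rather than the exact search I ran:
(index="employee") OR (index="sales" Txn-Type=Sale)
| eval eventTime=strptime(coalesce('Log-in','Txn-Date-Time'),"%m-%d-%Y %H:%M:%S")
| sort Location eventTime
| streamstats last("Staff-ID") as staff last("Log-in") as login last("Log-out") as logout by Location
| eval login=strptime(login,"%m-%d-%Y %H:%M:%S"), logout=strptime(logout,"%m-%d-%Y %H:%M:%S")
| eval isTxn=if(index="sales" AND eventTime>=login AND eventTime<=logout,1,0)
| eval Date=strftime(eventTime,"%m-%d-%Y")
| stats sum(isTxn) as "Count of Txns" by staff Date Location
The streamstats step carries the most recent Staff-ID/Log-in/Log-out forward onto each sale row within the same Location, which is what replaces the non-equijoin: the inequality comparison then happens in a plain eval/where instead of a join.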
@lguinn [Splunk], just FYI, the following is the query I used to mock up data for my SPL above in the comments. I have not posted mine as an answer because it will only work for the first 100 matches per location.
| makeresults
| eval employee="1001 | 12-30-2017 11:30:00 | 12-30-2017 23:30:00 | LDP1;1002 | 12-30-2017 12:30:00 | 12-30-2017 13:30:00 | LDP1;1003 | 12-30-2017 22:30:00 | 12-30-2017 23:30:00 | LDP2"
| makemv employee delim=";"
| mvexpand employee
| eval employee=replace(employee," ","")
| makemv employee delim="|"
| eval "Staff-ID"=mvindex(employee,0), "Log-in"=mvindex(employee,1), "Log-out"=mvindex(employee,2), Location=mvindex(employee,3)
| fields "Staff-ID", "Log-in", "Log-out", Location
| eval _time=strptime('Log-in',"%m-%d-%Y%H:%M:%S")
| eval index="employee"
| append [| makeresults
| eval sales="12-30-2017 11:48:17 | LDP1 | Sale;12-30-2017 13:48:17 | LDP1 | Sale"
| makemv sales delim=";"
| mvexpand sales
| eval sales=replace(sales," ","")
| makemv sales delim="|"
| eval "Txn-Date-Time"=mvindex(sales,0), Location=mvindex(sales,1), "Txn-Type"=mvindex(sales,2)
| fields "Txn-Date-Time", Location, "Txn-Type"
| eval _time=strptime('Txn-Date-Time',"%m-%d-%Y%H:%M:%S")
| eval index="sales"]
| makeresults | eval message= "Happy Splunking!!!"

How do you know which employee made which sale? Is it possible to have 2 employees at the same location at the same time?
You can determine which employee performed the sale if:
a. the date and time of the sale falls within the employee's log-in and log-out times; AND
b. the location of the sale is the same as the location of the employee (see the sketch below).
No, there is only one employee assigned to a specific location at a specific time, so it is not possible to attribute one sale to multiple employees.
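As a quick SPL illustration of those two rules: assuming each sale event already carries the matching employee's log-in and log-out values (for example via streamstats as in the searches above), and with loginTime, logoutTime and txnTime as placeholder field names holding epoch times (they are not fields in the actual data), condition (a) is just a where clause:
| eval txnTime=strptime('Txn-Date-Time',"%m-%d-%Y %H:%M:%S")
| where txnTime>=loginTime AND txnTime<=logoutTime
Condition (b) does not need an explicit test when the streamstats carry-forward is grouped by Location, since login/logout values then never cross locations.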

@cx233alvin, Location is the correlating field between the employee and sales events (because Staff-ID is missing from the sales data). If each location has at most 100 possible matches (the list() stats function keeps only the first 100 values), you can try the following:
(index="sales" AND Txn-Type=Sale) OR (index="employee")
| stats list(Staff-ID) as Staff-ID list(Log-in) as Log-in list(Log-out) as Log-out list(Txn-Date-Time) as Txn-Date-Time values(index) as index by Location
| search index="employee" index="sales"
| eval data=mvzip('Staff-ID',mvzip('Log-in',mvzip('Log-out','Txn-Date-Time')))
| fields - "Staff-ID" "Log-in" "Log-out" "Txn-Date-Time"
| mvexpand data
| eval data=split(data,",")
| eval "Staff-ID"=mvindex(data,0), "Log-in"=mvindex(data,1), "Log-out"=mvindex(data,2), "Txn-Date-Time"=mvindex(data,3)
| fields - data
| eval "Log-in"=strptime('Log-in',"%m-%d-%Y%H:%M:%S"), "Log-out"=strptime('Log-out',"%m-%d-%Y%H:%M:%S"), "Txn-Date-Time"=strptime('Txn-Date-Time',"%m-%d-%Y%H:%M:%S")
| where 'Log-in'<='Txn-Date-Time' AND 'Txn-Date-Time'<='Log-out'
| fieldformat "Log-in"=strftime('Log-in',"%m-%d-%Y %H:%M:%S")
| fieldformat "Log-out"=strftime('Log-out',"%m-%d-%Y %H:%M:%S")
| fieldformat "Txn-Date-Time"=strftime('Txn-Date-Time',"%m-%d-%Y %H:%M:%S")
| makeresults | eval message= "Happy Splunking!!!"
Hello! Is all the data in one index, or is it spread across multiple indexes?
They are from 2 different indices.
You can use append to combine both tables:
https://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Append
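For instance, a minimal sketch of that idea, assuming the index and field names used earlier in this thread (employee and sales):
index="employee"
| fields "Staff-ID", "Log-in", "Log-out", Location
| append [ search index="sales" Txn-Type=Sale | fields "Txn-Date-Time", Location, "Txn-Type" ]
| sort Location
After the append, you would still apply the streamstats/where correlation shown above to attribute each sale to an employee. Note that the (index="sales") OR (index="employee") base search used in the earlier answers achieves the same combination without a subsearch, which avoids append's subsearch result limits.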
