[updated the question based on feedback]
I am trying to join events from these 2 log entries
Events of Type 1 dateTime policyNo timeTook 2018-06-18 10:43:43,316 85107204 3.327 2018-06-18 10:39:08,253 85107204 3.887 2018-06-18 10:43:05,993 85107204 3.698 Events of Type 2 dateTime policyNo timeTook 2018-06-18 10:43:43,307 85107204 3.3 2018-06-18 10:43:05,986 85107204 3.657 2018-06-18 10:39:08,246 85107204 3.864
using the join
index="xyz" RuleModule="BizRule" | rename timeTook as "Rule Response Time"| table dateTime, policyNo, "Rule Response Time" | join type=left usetime=true earlier=true max=1 policyNo [search index="xyz" client.xyz | fields policyNo customerNumber timeTook | rename timeTook as procTime] | fields dateTime, policyNo, "Rule Response Time", procTime
I get 3.3 for all the 3 events - whereas I am expecting the join to be unique. I also tried setting usetime, max settings for join (based on comments) but didn't help.
Which log source is unique "Type1" or "Type2" ? Try to keep unique log source with policyNo in the subsearch and join the duplicate events source with the unique events source.
If both the sources have duplicates, It's difficult to achieve the JOIN functionality.
What exact output are you expecting?
I updated the question - please note that I can join only using the policyNumber as the events are logged at different time. I thought of rounding dateTime field to a minute and join but would be inaccurate.
Like @sandeeprachuri says, it is difficult to understand what you need.
Have you read the join doc - http://docs.splunk.com/Documentation/Splunk/7.1.1/SearchReference/Join
Extract from the same, are you looking for one-many joins or perhaps a combination of usetime and earlier?
Use the join command when the results of the subsearch are relatively small, for example 50,000 rows or less. To minimize the impact of this command on performance and resource consumption, Splunk software imposes some default limitations on the subsearch. See the subsearch section in the syntax for more information about these limitations.
One-to-many and many-to-many relationships
To return matches for one-to-many, many-to-one, or many-to-many relationships, include the max argument in your join syntax and set the value to 0. By default max=1, which means that the subsearch returns only the first result from the subsearch. Setting the value to a higher number or to 0, which is unlimited, returns multiple results from the subsearch.
There is really no reason to use a join, the table function will join the data on the matching field:
(index="xyz" RuleModule="BizRule") OR (index="xyz" client.xyz ) | rename timeTook as "Rule Response Time"| rename timeTook as procTime | table dateTime, policyNo, "Rule Response Time", procTime
This assumes there is a one-to-one match based on policy number. If you expect there to be multiple events per policy number you can use stats:
(index="xyz" RuleModule="BizRule") OR (index="xyz" client.xyz ) | rename timeTook as "Rule Response Time"| rename timeTook as procTime | stats max(dateTime), avg("Rule Response Time"), avg(procTime) by policyNo