Splunk Search

Join on one to many with max=0 still only returns 1 row per match

drpog
New Member

This is the query:

source=Audit earliest=-2d
[search source=Audit | stats count by persistent_id | where count > 20 and isnotnull(persistent_id) | fields persistent_id]
| JOIN max=0 Audit_Id
[ SEARCH source= Applicant ]
| table Audit_Id , FirstName, LastName

It first selects any rows from the audit table that have a not null persistent_id that occurs in the table more than 20 times. Then if any rows that have that persistent_id have turned up in the last 2 days it joins them to the Applicant table and returns a table result with the audit id and the names

The problem is that the join only returns the first match even though the max=0 setting is set.

I am trying to translate this sql query:

SELECT Audit_Id, FirstName, LastName
FROM Audit JOIN Applicant ON Audit_Id
WHERE persistent_id IN
(
SELECT persistent_id from Audit group by persistent_id having count(*)>20 and persistent_id is not null
)
and timestamp > DATEADD(hh, -48, GETDATE());

The sql returns

Audit_Id FirstName LastName
414 joe strummer

422 johny rotten
422 sid vicious

Whereas the splunk search returns

Audit_Id FirstName LastName
414 joe strummer

422 johny rotten

Tags (2)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

Joins are slow and usually not necessary. Try this query instead.

(source=Audit earliest=-2d 
[ search source=Audit 
| stats count by persistent_id 
| where count > 20 and isnotnull(persistent_id) 
| fields persistent_id] ) OR
source= Applicant 
| stats values(FirstName) values(LastName) by Audit_Id
---
If this reply helps you, Karma would be appreciated.

drpog
New Member

Hi
the first bit of 2 day search (lines 1-5) returns 4 rows from the Audit table and really what I want to do is add the columns (firstname and lastname) from the other table (Applicant) to those 4 rows using the column Audit_Id which is common to both tables.

The change suggested above using OR gives me a result of 113000 rows

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Thanks for the clarification. I've modified my answer. If that doesn't give the results you want, we can try using another subsearch.

---
If this reply helps you, Karma would be appreciated.
0 Karma

drpog
New Member

Thanks rich, I appreciate the help, but unfortunately the result still goes from 4 rows to 113,000 rows

0 Karma

richgalloway
SplunkTrust
SplunkTrust

OK. Here's a different search to try. Completely untested. The idea is to use each nested search to build the main search into something akin to source=Applicant (FirstName=<fname> LastName=<lname>...). If this goes nowhere then it's back to looking at why JOIN isn't showing all results.

source=Applicant [source=Audit earliest=-2d 
[ search source=Audit 
| stats count by persistent_id 
| where count > 20 and isnotnull(persistent_id) 
| fields persistent_id]  
| fields FirstName LastName]
| table Audit_Id FirstName LastName
---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

Splunk is officially part of Cisco

Revolutionizing how our customers build resilience across their entire digital footprint.   Splunk ...

Splunk APM & RUM | Planned Maintenance March 26 - March 28, 2024

There will be planned maintenance for Splunk APM and RUM between March 26, 2024 and March 28, 2024 as ...