Splunk Search

Join Search Hitting Row Limit - Stats Alternative

IRHM73
Motivator

HI, I wonder whether someone may be able to help me please.

I'm trying to put together a query which looks for two specific patterns of auditSource and auditType for given "SessionID".

I've put together the query below but it is slow to run and suffers from the data row limit, so is not returning all of the events.

index=main auditSource=auth-frontend auditType=Login
    | rename tags.X-Session-ID as sessionID
    | join sessionID [search index=main auditSource=per-frontend auditType=RequestReceived
    | rename tags.X-Session-ID as sessionID
    | table sessionID]
    | stats dc(detail.User)

In know from help already received and form research I've undertaken that you can use the 'Stats' command in the place of Join, but despite spending quite sometime on this now, I'm unable to get this to work.

I just wondered whether someone may be able to look at this please and offer some guidance on how I may overcome this.

Many thanks and kind regards

Chris

Tags (2)
0 Karma
1 Solution

javiergn
Super Champion

If you want to maintain the structure and keep using join, then simply apply some field filtering and see how that helps:

index=main auditSource=auth-frontend auditType=Login
| fields _time, tags.X-Session-ID, detail.User
| rename tags.X-Session-ID as sessionID
| join sessionID [
    search index=main auditSource=per-frontend auditType=RequestReceived
    | fields tags.X-Session-ID
    | rename tags.X-Session-ID as sessionID
    | table sessionID
]
| stats dc(detail.User)

Alternatively you should be able to replace the join with stats and that should help too. See the following posts for help. If you can't manage to work it out by yourself let me know and I'll try to help:

https://answers.splunk.com/answers/34107/joining-two-searches-with-and-without-stats-command.html
https://answers.splunk.com/answers/822/simulating-a-sql-join-in-splunk.html

Thanks,
Javier

View solution in original post

0 Karma

IRHM73
Motivator

Hi thank you for coming back to me with this but unfortunately it's not working.

I think the issue is that it is not extracting those records which have both sets of parameters. In addition it is listing the 'Session ID' in the table rather than the count of the detail.IdaUser

0 Karma

javiergn
Super Champion

Hi, could you post an example of your data source 1 (auth-frontend), data source 2 (per-frontend) and then the outcome you are expecting after joining both?
Otherwise I'm going to keep trying without too much success I'm afraid.

Based on your last comment, if you want to both group by Session-ID and count the distinct users, I would try this instead, but it would really help if you could post some examples as I mentioned above:

index=main ((auditSource=auth-frontend auditType=Login) OR (auditSource=per-frontend auditType=RequestReceived))
 | stats count by tags.X-Session-ID, detail.IdaUser

Thanks,
J

0 Karma

IRHM73
Motivator

Hi @javeirgn, thank you for coming back to me with this and my apologies for not coming back to you earlier. To be honest I was thinking of a way by which I could illustrate my data, so here goes!

auth-frontend

tags.X-Session-ID

session-00620efe-41e6-4179-a1f8-a91db4a94447

detail.IdaUser
06988f29727fd77bba7e49d0b2ff6ba631c9ff2050f4a8f1667a5c6368617f7d

per-frontend

tags.X-Session-ID

session-00620efe-41e6-4179-a1f8-a91db4a94447

Results

Stats DC (detail.IdaUser)

1

Please note I want to count the detail.IdaUser from events which have both the "aurth-frontend" and the "per-frontend"
I hope this helps.

Many thanks and kind regards

0 Karma

javiergn
Super Champion

Based on your example below I came up with the following:

 index=main ((auditSource=auth-frontend auditType=Login) OR (auditSource=per-frontend auditType=RequestReceived))
| stats list(detail.IdaUser) as detail.IdaUser, dc(auditSource) as auditSources by tags.X-Session-ID
| where auditSources > 1
| stats dc(detail.IdaUser) as count by tags.X-Session-ID

The examples I used were:

auth-frontend

tags.X-Session-ID, detail.IdaUser 
session-1, user1
session-1, user2
session-2, user1

per-frontend

 tags.X-Session-ID 
 session-1

And the result was:

tags.X-Session-ID   count
session-1   2 

I understand this is what you were looking for so please let me know otherwise.

Thanks,
J

0 Karma

IRHM73
Motivator

Hi J, if it's ok to call you that?

Thank you for coming back to me with this and I'm pleased to say the query works great, thank you. It's a huge relief.

May I just, because I'm keen to learn from this if you could write a brief narrative of what the query is doing.

Many thanks and kind regards

Chris

0 Karma

javiergn
Super Champion

Sure thing.

index=main ((auditSource=auth-frontend auditType=Login) OR (auditSource=per-frontend auditType=RequestReceived))

Bring me the logs where index = main and are either coming from auditSource auth-frontend and auditType login, or auditSource per-frontend and auditType RequestReceived.

 | stats list(detail.IdaUser) as detail.IdaUser, dc(auditSource) as auditSources by tags.X-Session-ID

Group by Session-ID and return all the IdaUsers and count the different auditSources

 | where auditSources > 1

Filter by those where the number of auditSources is greater than 1, that is, events found in both per-frontend and auth-frontend (your distinct auditSources)

 | stats dc(detail.IdaUser) as count by tags.X-Session-ID

Then only for those found in both auditSources, return the distinct count of IdaUsers group by Session-ID

Also as a piece of advise for future queries, if you post an example like the one your posted above, it'll always be faster for us to help. As you can see it only took me one attempt after you did that.

Thanks,
J

0 Karma

IRHM73
Motivator

Hi thank you for this and point taken 🙂

All the best and kind regards

Chris

0 Karma

IRHM73
Motivator

Hi thank you for coming back to me with this.

I appreciate your comment in respect of the data, but unfortunately by data is restricted.

I've looked back at my original query and perhaps it wasn't as explicit as it should have been, so I'll give it another shot.

  1. For each tags.X-Session-ID which contain: auditSource=per-frontend and auditType=RequestReceived then auditSource=auth-frontend and auditType=Login
  2. Count distinct values from the detail.IdaUser field.

I'm not sure whether this helps.

Many thanks and kind regards

Chris

0 Karma

javiergn
Super Champion

Hi Chris,

I understand data might be restricted but it's probably worth obfuscating it in order for us to better understand what you are trying to achieve.

Based on your comments 1 and 2 above, I'm pretty sure you still need to group by "tags.X-Session-ID" first (comment 1) and then either use dc or count.

Anyway, if you could post three tables that represent your data as follows, it would definitely help:

auth-frontend

tags.X-Session-ID, detail.IdaUser
100, User1
100, User2
101, User2

per-frontend

tags.X-Session-ID
100
101

result

whatever you are trying to achieve
0 Karma
Get Updates on the Splunk Community!

Accelerating Observability as Code with the Splunk AI Assistant

We’ve seen in previous posts what Observability as Code (OaC) is and how it’s now essential for managing ...

Integrating Splunk Search API and Quarto to Create Reproducible Investigation ...

 Splunk is More Than Just the Web Console For Digital Forensics and Incident Response (DFIR) practitioners, ...

Congratulations to the 2025-2026 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...