Splunk Search

Issue with performance on join command?

jsoohoo
New Member

Hello there,
Sorry for asking a noob question! But I'm struggling to determine why my join isn't working across all data. My main search which queries the total number of accesses by a student or staff (in a separate query) and then the inner search tries to match the username to find the faculty. Then I try to dedup and combine the faculties if they are studying more than one degree.

I can query the summary index directly and retrieve a single student's faculty information successfully however if I query the entire summary index I do not get many faculty results returned. The summary index has 3.8mill events. Could it be performance limitation affecting the results? Honestly I dont know the details of the limit, as the Sys Admins set this enterprise version.

Is it correct to assume there is capacity issues or is it my query?

index="123" source="/path/to/logs.log"
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)"
| table username, url
| join type=left username [search index="student-summary-index" source="SummaryIndex"
| table username, FACULTY]

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@jsoohoo

If you want to go with join then I suggest restricting events by adding a field filter. Like ( uri=* and username=* FACULTY=*) This will fetch only the required events.
Check this:

index="123" source="/path/to/logs.log" uri=*
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)" 
| table username, url 
| join type=left username 
    [ search index="student-summary-index" source="SummaryIndex" username=* FACULTY=* 
    | table username, FACULTY]

I would like you to try below search also. Here I have used append for achieving the same.

index="123" source="/path/to/logs.log" uri=*
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)" 
| table username, url 
| append 
    [ search index="student-summary-index" source="SummaryIndex" username=* FACULTY=* 
    | table username, FACULTY] 
    | stats values(url) as url values(FACULTY) as FACULTY by username

Note: Make sure about uri and url fields. 🙂
Thanks

0 Karma

lakshman239
Influencer

If you have 3.8m events in the summary index, do you want to check for all users in the index or only active users or users logged/accessed in the last 7 days or so. Perhaps limiting the results from summary index can return fewer results and thereby produce faster search times. One thing you could also do is change the sub-search to something like search index="student-summary-index" source="SummaryIndex"
| stats count by username, FACULTY | fields username, FACULTY
, this will return fewer records. Include additional filtering before stats, if you have [ e.g. enabled users, last 7 days etc.]

The other option is to run a scheduled search against your summary index to produce a lookup [ again try to limit the fields/contents going to lookup] and use that lookup in place of sub-search or directly alone with your main search to OUTPUT FACULTY using username.

0 Karma
Get Updates on the Splunk Community!

Index This | I’m short for "configuration file.” What am I?

May 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with a Special ...

New Articles from Academic Learning Partners, Help Expand Lantern’s Use Case Library, ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Your Guide to SPL2 at .conf24!

So, you’re headed to .conf24? You’re in for a good time. Las Vegas weather is just *chef’s kiss* beautiful in ...