Splunk Search

Issue with performance on join command?

New Member

Hello there,
Sorry for asking a noob question! But I'm struggling to determine why my join isn't working across all data. My main search which queries the total number of accesses by a student or staff (in a separate query) and then the inner search tries to match the username to find the faculty. Then I try to dedup and combine the faculties if they are studying more than one degree.

I can query the summary index directly and retrieve a single student's faculty information successfully however if I query the entire summary index I do not get many faculty results returned. The summary index has 3.8mill events. Could it be performance limitation affecting the results? Honestly I dont know the details of the limit, as the Sys Admins set this enterprise version.

Is it correct to assume there is capacity issues or is it my query?

index="123" source="/path/to/logs.log"
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)"
| table username, url
| join type=left username [search index="student-summary-index" source="SummaryIndex"
| table username, FACULTY]

0 Karma

SplunkTrust
SplunkTrust

@jsoohoo

If you want to go with join then I suggest restricting events by adding a field filter. Like ( uri=* and username=* FACULTY=*) This will fetch only the required events.
Check this:

index="123" source="/path/to/logs.log" uri=*
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)" 
| table username, url 
| join type=left username 
    [ search index="student-summary-index" source="SummaryIndex" username=* FACULTY=* 
    | table username, FACULTY]

I would like you to try below search also. Here I have used append for achieving the same.

index="123" source="/path/to/logs.log" uri=*
| rex field=uri "(?[^.]+).(com|org|de|co|net|gov)" 
| table username, url 
| append 
    [ search index="student-summary-index" source="SummaryIndex" username=* FACULTY=* 
    | table username, FACULTY] 
    | stats values(url) as url values(FACULTY) as FACULTY by username

Note: Make sure about uri and url fields. 🙂
Thanks

0 Karma

SplunkTrust
SplunkTrust

If you have 3.8m events in the summary index, do you want to check for all users in the index or only active users or users logged/accessed in the last 7 days or so. Perhaps limiting the results from summary index can return fewer results and thereby produce faster search times. One thing you could also do is change the sub-search to something like search index="student-summary-index" source="SummaryIndex"
| stats count by username, FACULTY | fields username, FACULTY
, this will return fewer records. Include additional filtering before stats, if you have [ e.g. enabled users, last 7 days etc.]

The other option is to run a scheduled search against your summary index to produce a lookup [ again try to limit the fields/contents going to lookup] and use that lookup in place of sub-search or directly alone with your main search to OUTPUT FACULTY using username.

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!