Splunk Search

Issue with parsing large dataset using Join

Hello,
I am using the following search against two indexes because I want to combine their results on the common field "email". I am running it on my local Splunk instance, and both indexes were populated from uploaded CSV files. I have already adjusted limits.conf to handle the large dataset. When the subsearch inside the join uses the OR operator (event=delivered OR event=click), the values for Clickedlinks and EmailDelivered differ from what I get when I search for only one of the two events, and the single-event results are the correct ones. Am I missing something here? Why does the OR operator trim the results? I see 0 for a number of table entries that are normally populated with some number.

index=IndexA
| join type=inner email [ search index=IndexB ( event=delivered OR event=click ) | dedup email event | fields email, event ]
| stats count(eval('event'="delivered")) as EmailDelivered
count(eval('event'="click")) as Clickedlinks
by Region, Division, Country, Location
| table Region, Division, Country, Location, "EmailDelivered" , Clickedlinks

Re: Issue with parsing large dataset using Join

Hi kiranpatil1985,
By default, a subsearch used with join is limited to 50,000 results. For that reason, and because the join command is very slow, I suggest approaching this problem a different way, using the stats command:

index=IndexA OR index=IndexB ( event=delivered OR event=click ) 
| dedup email event | fields email, event
| stats count(eval('event'="delivered")) as Email_Delivered count(eval('event'="click")) as Clicked_links BY email Region, Division, Country, Location
| table Region, Division, Country, Location, "Email_Delivered" , Clicked_links
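
A possible reason the join version loses counts: by default, join pairs each main-search result with only the first matching subsearch row (max=1), so an email that still has both a delivered row and a click row after dedup keeps just one of them. Adding max=0 to the join removes that cap. For the stats approach, the sketch below is one way to carry the location fields across: it first collapses both indexes to one row per email and then counts per location. It assumes that Region, Division, Country and Location exist only on IndexA events, that event exists only on IndexB events, and that each email belongs to a single Region/Division/Country/Location combination. The helper field names delivered and clicked are made up for illustration, and the search has not been run against the original data.

```
(index=IndexA) OR (index=IndexB (event=delivered OR event=click))
| stats values(Region) as Region, values(Division) as Division, values(Country) as Country, values(Location) as Location, count(eval(event="delivered")) as delivered, count(eval(event="click")) as clicked by email
| where isnotnull(Region) AND (delivered>0 OR clicked>0)
| stats count(eval(delivered>0)) as Email_Delivered, count(eval(clicked>0)) as Clicked_links by Region, Division, Country, Location
| table Region, Division, Country, Location, Email_Delivered, Clicked_links
```

The first stats takes the place of dedup: however many delivered or click events an email has, it contributes at most 1 to each total. The where clause keeps only emails that appear in both indexes, which mirrors the inner join. Note that this counts distinct emails, whereas the join counts IndexA rows, so the totals can differ if IndexA holds more than one row per email.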

Bye.
Giuseppe
