Why is Join not returning all events?

New Member
[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] returns 245,546 events

[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] | eval manager="uid="+uid+",ou=users,dc=cardinalhealth,dc=com" | rename employeeType AS managerEmpType | fields manager  | table uid managerEmpType returns 245,546 events

But when I join them thusly:

[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] | join manager [search [|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] | eval manager="uid="+uid+",ou=users,dc=cardinalhealth,dc=com" | rename employeeType AS managerEmpType | fields manager managerEmpType] | table uid managerEmpType

I only get 43,440 matches. In theory, the subsearch should return a match for every event in the primary.

Am I missing something obvious here?

New Member

All I am trying to accomplish is this: take data set A and modify it so it contains a new attribute, managerEmpType, whose value can be 1 or 2. This is data set B.

Take original data set A, which contains an attribute called manager. Then build a query that says show me every record where the manager attribute in data set A has a managerEmpType value=1 in data set B.

SplunkTrust

@Harold9000, it seems you are hitting the subsearch limit, which is 50,000 results by default for join. Anything the subsearch returns beyond that limit is dropped, so those rows never match. Refer to the documentation: http://docs.splunk.com/Documentation/Splunk/7.0.3/Search/Aboutsubsearches#Output_settings_for_subsea...
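
For reference, the join limit is controlled in limits.conf. The fragment below is only a sketch of the relevant setting. The stanza name and default value are given as commonly documented and are an assumption here; verify them against the limits.conf spec for your Splunk version before changing anything.

```ini
# limits.conf (fragment): shown for illustration, not as a recommended change.
# Assumption: default value as commonly documented; check the spec for your version.
[join]
# Maximum number of rows the join subsearch may return;
# rows beyond this are dropped and can never match.
subsearch_maxout = 50000
```

Raising the value is possible, but a 245,546-row subsearch may also run into time limits, so restructuring the search is often the safer route.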

Since you seem to be running a kind of self-join, would it be possible to provide some sample data (mocking or anonymizing any sensitive information) and more details of what you expect?
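
One way to sidestep the join limit for a self-join like this is to write the manager-to-type mapping to a lookup file and then enrich the original events with the lookup command. This is a rough sketch under assumptions, not a tested solution: it reuses the base search and field names from the question, and the lookup file name hypothetical_manager_emp_type.csv is invented for illustration. First, build the mapping:

```spl
[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source]
| eval manager="uid="+uid+",ou=users,dc=cardinalhealth,dc=com"
| rename employeeType AS managerEmpType
| table manager managerEmpType
| outputlookup hypothetical_manager_emp_type.csv
```

Then, in a second search, look up each event's manager and keep the records whose manager has managerEmpType 1:

```spl
[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source]
| lookup hypothetical_manager_emp_type.csv manager OUTPUT managerEmpType
| search managerEmpType=1
| table uid manager managerEmpType
```

This assumes every event already carries a manager field holding the full DN, as described in the question.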

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

SplunkTrust

@Harold9000, please use the code button (101010) in the Splunk Answers comment box so that special characters in the SPL do not get escaped. Try editing your question: highlight the SPL query and press the code button to re-post the code.

New Member

Thank you. It appears the moderator was kind enough to do it for me.

Path Finder

Hey Harold9000, joins can be troublesome. I see that you have F:\FTPROOT\Splunk Inputs\IDM*.csv with the IDM* in the first two searches but not in your join search. Was that a typo?

Thanks,
JodyFSU

New Member

Yes. It was most definitely a typo. Another commenter suggested I "declare" the manager in the first query.

All I am trying to accomplish is to create data set B with a column managerEmpType, whose value can be 1 or 2.

Take original data set A, which contains an attribute called manager. Then build a query that says show me every record where the manager attribute in data set A has a managerEmpType value=1 in data set B.

Make sense?

Path Finder

And of course the * was omitted in my comment. Here it is again:
Hey Harold9000, joins can be troublesome. I see that you have F:\FTPROOT\Splunk Inputs\IDM*.csv with the IDM* in the first two searches but not in your join search. Was that a typo?

Motivator

Yes, the manager field is not declared in the first query.

Could you please try the query below:

[|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] | eval manager="uid="+uid+",ou=users,dc=cardinalhealth,dc=com" | table manager | join manager [search [|tstats latest(source) as source where source="F:\\FTPROOT\\Splunk Inputs\\IDM_*.csv" | fields source] | eval manager="uid="+uid+",ou=users,dc=cardinalhealth,dc=com" | rename employeeType AS managerEmpType | fields manager managerEmpType uid] | table uid managerEmpType manager
