Splunk Search

Why is my join query pulling results correctly sometimes while other times, it's not?

New Member

Join query return weird result. Sometime its pull correct result & if I execute the same query after 2 mins. Some of the values are not correct

I have two Indexes. The scenario is to filter out the data that is common in both indexes basis of "ticketnumber" & pull out only those tickets where MoveIndate is greater than createddate

IndexA is having these columns ticketnumber, createddate, status
IndexB is having ticketnumber,MoveIndate,assignmentgroup.

Output of the query is not consistent all the time. Sometimes, all data fetched in the attributes & sometime time - createddate & status are coming as blank.

While these attributes are coming as blank in the statistics (table), I checked the values in the "Events" & and found that data is always available in the "Events"

I am not sure - how to troubleshoot this. Kindly help me to resolve this issue as I am stuck on this. Let me know, if the join is not correct.

My Query:

index=IndexA  
| dedup ticketnumber
| join type=inner ticketnumber 
    [ search index=IndexB ]
| eval Move_Indate= strptime(MoveIndate , "%Y-%m-%d") 
| eval created_date_= strptime(createddate, "%Y-%m-%d")
| where (MoveIndate >createddate) 
| table ticketnumber,createddate,status,MoveIndate,assignmentgroup
Tags (2)
0 Karma

New Member

The issue got resolved. I have explicitly mentioned the column name in each join .. here is my the updated query

index=IndexA 
| dedup ticketnumber 
| fields ticketnumber,createddate,status 
| join type=inner ticketnumber 
[ search index=IndexB 
|fields ticketnumber,MoveIndate,assignmentgroup]
| eval Move_Indate= strptime(MoveIndate , "%Y-%m-%d") 
| eval created_date_= strptime(createddate, "%Y-%m-%d")
| where (MoveIndate >createddate) 
| table ticketnumber,createddate,status,MoveIndate,assignmentgroup
0 Karma

Contributor

Hey,

you might run into problems since your are using indexes like data bases, it seems. Each event in an index will be assigned a timestamp (field _time). In your context this might be one of the fields, it might be something different. This might make your use case hard to handle, since your have to do a alltime search in order to catch every event on each side for your join. Moreover, you use dedup ... which is costly on large data sets.

Is there a way you can "transform" your use case into a timely context? Like looking just on data from last week? This of course will only work if you have a meaningful _time field.

Cheerz,
Björn

0 Karma

SplunkTrust
SplunkTrust

That should not work, your inner search doesn't produce a ticketnumber field.

For a generic, scalable approach check out https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...

0 Karma

New Member

Thanks Björn.

The issue got resolved. I have explicitly mentioned the column name in each join .. here is my the updated query

index=IndexA
| dedup ticketnumber
| fields ticketnumber,createddate,status
| join type=inner ticketnumber
[ search index=IndexB
|fields MoveIndate,assignmentgroup]
| eval MoveIndate= strptime(MoveIndate , "%Y-%m-%d")
| eval created
date_= strptime(createddate, "%Y-%m-%d")
| where (MoveIndate >createddate)
| table ticketnumber,createddate,status,MoveIndate,assignmentgroup

0 Karma