Friday, March 30, 2012

Merge Join Unexpected Results

I noticed a possible bug with the merge join in the June CTP and wanted to see if anyone else is experiencing these issues.

I have two sorted oledb sources with a sort order on columns TicketNumber (1), SequenceNumber (2). Later in my transformation I want to apply a merge join only on TicketNumber (1). The metadata still shows my sort order of TicketNumber, SequenceNumber but my join condition only uses the TicketNumber column.

Upon execution, the first sequence of the ticket on left input joins to every ticket sequence on the right. This is the expected behavior. However, once the sequence number on the left changes and the ticket number remains the same, the left joins to nothing on the right.

In order to get my expected result I had to resort the data only on the TicketNumber (1) column to clear the metadata of the sort order. Below is an example of my results and expected results.

Just wondering what others are experiencing and if this is a potential bug or not. The reduntant sort is killing my performance, but I need the two column sort for work done in previous steps.

Thanks,

Adam

Example.

Left Table
Ticket,Seq,Data
1,1,A
1,2,B
1,3,A

Right Table
Ticket, Seq, Data
1,1,A
1,2,B
1,3,A

Joined Result
L.Ticket, L.Seq, L.Data, R.Ticket, R.Seq, R.Data
1, 1, A, 1, 1, A
1, 1, A, 1, 2, B
1, 1, A, 1, 3, A
1, 2, B, NULL, NULL, NULL
1, 3, A, NULL, NULL, NULL

Expected Result
L.Ticket, L.Seq, L.Data, R.Ticket, R.Seq, R.Data
1, 1, A, 1, 1, A
1, 1, A, 1, 2, B
1, 1, A, 1, 3, A
1, 2, B, 1, 1, A
1, 2, B, 1, 2, B
1, 2, B, 1, 3, A
1, 3, A, 1, 1, A
1, 3, A, 1, 2, B
1, 3, A, 1, 3, A

I have seen this problem before. If I remember correctly, you have to make sure only rows that you want to join on are sorted. It seems as though the merge-join pays more attention to the rows that are sorted than to the "Join Key" checkboxes.
I filed a bug on this and this is the reply:
You can either uncheck the joinkey column in the custom editor or you can change the NumKeyColumns property in the properties grid or advanced editor
I haven't had time to test the NumKeyColumns property, let me know if it works.
-Evan Black|||Evan is right.

Adam, what's the value for NumKeyColumns (a property on Merge Join transform)? If you set to 1, you will get the expected results. That worked for me.|||Thanks guys that did the trick.

No comments:

Post a Comment