Friday, March 30, 2012

Merge Join question

Hi, All,

In the data flow of Intergration Service, my understanding is that the "Merge Join" only does the "join", there is no way to input the "where" clause. If I need to input the the "Where" clause or selection criteria, how should I do it in Integration Service?

TIA,

Maybe put in a conditional split transformation before you go into the merge... Filter your rows there.|||

john zhang wrote:

Hi, All,

In the data flow of Intergration Service, my understanding is that the "Merge Join" only does the "join", there is no way to input the "where" clause. If I need to input the the "Where" clause or selection criteria, how should I do it in Integration Service?

TIA,

Depending in the source you are using; I would use a SQL statement right in the surce components to 'pull' only required rows; otherwise I would use conditional split transforms within the data flow.

Rafael Salas

|||Thanks Phil.|||

Thanks Rafael for the input. Now I have "conditional split" worked for me even it's inconvenient

|||

john zhang wrote:

Thanks Rafael for the input. Now I have "conditional split" worked for me even it's inconvenient

Why is it inconvenient?

-Jamie

|||

With other popular ETL tool, in one transformation, you can do both "join" and "where". But in SQL Server 2005 Integration service, you have to use two transformation to finish the task.

|||

john zhang wrote:

With other popular ETL tool, in one transformation, you can do both "join" and "where". But in SQL Server 2005 Integration service, you have to use two transformation to finish the task.

Why is that a bad thing? Is having 2 boxes on the design surface really such a bad thing?

SSIS components are deliberately "atomic". By that I mean they do one single operation and thus the aim is to build data-flows which are greater than the sum of their parts. Just because the "join" and "where" is in the same box then it doesn't mean its doing any less work. In fact it might actually be doing more because you are using a CPU to do two operations whereas in SSIS this may not be the case.

Please don't think I'm trying to say that you're wrong - this is just my opinion and in a way I'm playing devil's advocate. I think its a useful discussion to have.

-Jamie

|||

here's a discussion, the merge join would be even better if it could join more than 2 tables at a time.

I have many packages that require a left join to 5 or 6 other tables and the only way to do this is to use a merge join for the first two tables, then get the data from the 3rd table, add another merge join transform, then join the output of the first merge join to the data from the third table to get a 3 table join and so on...

It works ok though..

|||

bobbins wrote:

here's a discussion, the merge join would be even better if it could join more than 2 tables at a time.

I have many packages that require a left join to 5 or 6 other tables and the only way to do this is to use a merge join for the first two tables, then get the data from the 3rd table, add another merge join transform, then join the output of the first merge join to the data from the third table to get a 3 table join and so on...

It works ok though..

Nice idea. Submit it at http://connect.microsoft.com

-Jamie

sql

No comments:

Post a Comment