Showing posts with label transformation. Show all posts
Showing posts with label transformation. Show all posts

Friday, March 30, 2012

Merge Join transformation hangs

I have serached through this forum, and I could not find the solution.

My SSIS data flow is a simple one. Extracting two tables from Oracle using OLE DB Source , join them together using merge join and loading into a table in SQL server by SQL Server Destination.

I haven't gone through this simple procedure because Merge join always hangs there. Down to further investigation, I found one input (randomly one of two inputs) is always stuck. Sometimes the input is empty, sometimes is about half way.

Is there workround to see what is happening there and to fix this problem?

TIA

As far as I can see, Mergejoin is not the culprit, it is by design for MergeJoin to wait until it gets buffers from both inputs before it outputs. Looks to me your input(s) did not get through at your Oracle server. I suggest you to use an external tool (rowsetviewer?) to perform the same queries you put on the two OLEDBSrc inputs, to see whether it works there.

thanks

wenyang

|||

Also make sure the 2 inputs of the Merge Join are actually sorted.

Rafael Salas

|||

Hi, Wenyang & Rafael,

Thanks very much for your input.

The 2 inputs of the Merge Join are sorted by "order by" and the Issorted = true. I found the killer is the "data viewer" of the path, which shows the input data. Once I removed the "data viewer", the problem's gone. I cannot explain why.

BTW, what's the external tool (rowsetviewer)? Where can I get it?

Merge Join (Full outer join) never finishes.

I have a merge join (full outer join) task in a data flow. The left input comes from a flat file source and then a script transformation which does some custom grouping. The right input comes from an oledb source. The script transformation output is asynchronous (SynchronousInputID=0). The left input has many more rows (200,000+) than the right input (2,500). I run it from VS 2005 by right-click/execute on the data flow task. The merge join remains yellow and the task never finishes. I do see a row count above the flat file destination that reaches a certain number and seems to get stuck there. When I test with a smaller file on the left it works OK. Any suggestions?

Try this - replace your destination with a RowCount component - it does nothing except count rows into a variable. if the package succeeds, then you have narrowed the problem down to an issue with your destination.

If there is an issue with your destination, you should start to look into disk and memory usage.

hth - I'll look forward to hearing more.

Donald

|||The way I worked around the problem was to separate it into 2 dataflows, the first one just taking the left branch of the input to the merge join and landing it to a raw destination. Then in the second data flow (the merge join) I simply used the raw file as the left input and it worked fine. Does that eliminate the possibility of it being an issue with the destination? If not then I will recreate the original scenario and replace the destination with a RowCount component as you suggested. Thanks.|||

That's a tricky one. I wonder if the combined memory usage of your script component and the merge join was stressing the box.

Donald

Merge Join

All,

I need to use Merge Join transformation to join two sources. One is from a PIVOT transformation and one of the output columns is ISSORTED, the other is from an OLE DB Source using a query. The Merge Join transformation requires both input source have to be sorted. I cannot find the ISSORTED property on the OLE DB Source!!

I tried to use Derived/ copy transformations but cannot find the property also. How can set the OLE query sorted in order to use the MergeJoin?

Thanks a lot

Jessie,

You have 2 options.

1) Sort the data in your query in the OLE DB Source using an ORDER BY clause. If you do this you will have to manually set IsSorted=TRUE - you do this in the Advanced Editor.

2) Use a SORT component to sort the data.

Note that setting IsSorted=TRUE does NOT sort the data for you. It only tells SSIS that the data is sorted. If you set IsSorted on a data-path that is not sorted then you will get the wrong results coming out of the MERGE JOIN.

-Jamie

|||Thank you so much Jamie|||

Hi, jamie,

This might be a dummy question.

I got the same problem as Jessie's, but when I went into the Advanced Editor of OLE DB Source, I could not find where to set up the Issorted=True. Could someone help me to find out?

TIA

|||

john zhang wrote:

Hi, jamie,

This might be a dummy question.

I got the same problem as Jessie's, but when I went into the Advanced Editor of OLE DB Source, I could not find where to set up the Issorted=True. Could someone help me to find out?

TIA

It is a property of the output. OLE DB Source only has one output (plus the error output) so its not too difficult to find.

-Jamie

Merge Join

All,

I need to use Merge Join transformation to join two sources. One is from a PIVOT transformation and one of the output columns is ISSORTED, the other is from an OLE DB Source using a query. The Merge Join transformation requires both input source have to be sorted. I cannot find the ISSORTED property on the OLE DB Source!!

I tried to use Derived/ copy transformations but cannot find the property also. How can set the OLE query sorted in order to use the MergeJoin?

Thanks a lot

Jessie,

You have 2 options.

1) Sort the data in your query in the OLE DB Source using an ORDER BY clause. If you do this you will have to manually set IsSorted=TRUE - you do this in the Advanced Editor.

2) Use a SORT component to sort the data.

Note that setting IsSorted=TRUE does NOT sort the data for you. It only tells SSIS that the data is sorted. If you set IsSorted on a data-path that is not sorted then you will get the wrong results coming out of the MERGE JOIN.

-Jamie

|||Thank you so much Jamie|||

Hi, jamie,

This might be a dummy question.

I got the same problem as Jessie's, but when I went into the Advanced Editor of OLE DB Source, I could not find where to set up the Issorted=True. Could someone help me to find out?

TIA

|||

john zhang wrote:

Hi, jamie,

This might be a dummy question.

I got the same problem as Jessie's, but when I went into the Advanced Editor of OLE DB Source, I could not find where to set up the Issorted=True. Could someone help me to find out?

TIA

It is a property of the output. OLE DB Source only has one output (plus the error output) so its not too difficult to find.

-Jamie