开发者

SSIS to copy data from one table to another, where not in destination table

开发者 https://www.devze.com 2022-12-23 17:11 出处:网络
I\'m in the process of creating an SSIS package on a server (server1) that looks at the data in a sql db on another site (server2) and copies relevant rows across.

I'm in the process of creating an SSIS package on a server (server1) that looks at the data in a sql db on another site (server2) and copies relevant rows across.

The SQL statement required is:

SELECT *
FROM server2.ordersTable
WHERE
OrderID Not In (SELECT OrderID FROM server1.ordersTable

This selects data from server1 which isn't in the tabl开发者_如何转开发e on server2 (based on order id) I then need to insert the result into a table on server1

How would I approach this? What components do I need etc...?


Presuming you can use OPENQUERY or a direct linked-server join on server2 to server1 (as in your example or the answer from @schrodinger's code), that would be the best solution to minimize the data over the wire.

But if that isn't available, you have two other options:

1) You can use the Lookup data-flow transformation in SSIS to check the record against the existing OrderIDs and only push the new records using the conditional split transformation.

2) You can transfer the entire table from Server2 into a temporary table on Server1, then compare on Server1 using a variant of the code you posted.


Realizing this is an OLD thread...

If you want to do this in SSIS, I would suggest a data flow with:

  • Two data sources.
  • Source A selects all of the data you want to move
  • Source B selects the key from the destination
  • Run both sources into a merge join configured as a left join
  • Run join results into a conditional split, define an output where the key from source B is null
  • Run the appropriate output of the split to a destination that inserts the desired rows in the appropriate table.

There are a number of enhancements that can be made to support additional functionality such as handling updates and deletes. There are also some performance tuning tricks that can be applied as required by data volume, one of the simplest being checking the modified date of the source data against the last time the package ran to reduce the number of records selected by Source A.


you can create a linked server for one of the servers first say LinkedServer1 for server1 and then you can use OPENQUERY like:

SELECT * FROM server2.OrdersTable WHERE OrderID NOT IN ( SELECT OrderID FROM OPENQUERY(LinkedServer, 'SELECT OrderID FROM LinkedServer1.OrdersTable) )

0

精彩评论

暂无评论...
验证码 换一张
取 消