开发者

sql server get only updated record

开发者 https://www.devze.com 2022-12-18 03:41 出处:网络
I am using sql server 2000. I need to get only updated records from remote server and need to insert that record in my local server on daily basis. Bu开发者_StackOverflowt that table did not have crea

I am using sql server 2000. I need to get only updated records from remote server and need to insert that record in my local server on daily basis. Bu开发者_StackOverflowt that table did not have created date or modified date field.


Use Transactional Replication.

Update

If you cannot do administrative operations on the source then you'll going to have to read all the data every day. Since you cannot detect changes (and keep in mind that even if you'd have a timestamp you still wouldn't be able to detect changes because there is no way to detect deletes with a timestamp) then you have to read every row every time you sync. And if you read every row, then the simplest solution is to just replace all the data you have with the new snapshot.


You need one of the following

  • a column in the table which flag new or updated records in a fashion or other (lastupdate_timestamp, incremental update counter...)
  • some trigger on Insert and Update, on the table, which produces some side-effect such as adding the corresponding row id into a separate table

You can also compare row-by-row the data from the remote server against that of the production server to get the list of new or updated rows... Such a differential update can also be produced by comparing some hash value, one per row, computed from the values of all columns for the row.

Barring one the above, and barring some MS-SQL built-in replication setup, the only other possibility I can think of is [not pretty]:

  • parsing the SQL Log to identify updates and addition to the table. This requires specialized software; I'm not even sure if the Log file format is published/documented, though I have seen this types of tools. Frankly this approach is more one for forensic-type situations...


If you can't change the remote server's database, your best option may be to come up with some sort of hash function on the values of a given row, compare the old and new tables, and pull only the ones where function(oldrow) != function(newrow).

You can also just do a direct comparison of the columns in question, and copy that record over when not all the columns in question are the same between old and new.

This means that you cannot modify values in the new table, or they'll get overwritten daily from the old. If this is an issue, you'll need another table in which to cache the old table's values from the day before; then you'll be able to tell whether old, new, or both were modified in the interim.


I solved this by using tablediff utility which will compare the data in two tables for non-convergence, and is particularly useful for troubleshooting non-convergence in a replication topology.

See the link.

tablediff utility


TO sum up:

  1. You have an older remote db server that you can't modify anything in (such as tables, triggers, etc).
  2. You can't use replication.
  3. The data itself has no indication of date/time it was last modified.
  4. You don't want to pull the entire table down each time.

That leaves us with an impossible situation.

You're only option if the first 3 items above are true is to pull the entire table. Even if they did have a modified date/time column, you wouldn't detect deletes. Which leaves us back at square one.

Go talk to your boss and ask for better requirements. Maybe something that can be done this time.

0

精彩评论

暂无评论...
验证码 换一张
取 消