How do you make your application aware that a database record was changed or created by an external application rather than the application itself?
Do you add a special check sum to the database record or what do you do to stop external changes 开发者_JAVA技巧to the database? (in case it was hacked and the hacker decides to add a new record or change an existing database record)
There are three questions being asked:
How do you determine a row was changed?
You have a handful of choices:
- Requery for the data and do a comparison. This will tell you exactly what columns changed (but not necessarily by whom).
- Use a
rowversion
column (called atimestamp
in older versions of SQL Server) which will update itself every time the row is changed. This will only tell you that something changed but not what column was changed or who changed it.
How do you determine that the row was created by an external application rather than your own app
This one is a bit trickier. Technically, you might think that simply writing the name of the app into a column in the row might suffice but that technique is not safe. An external application that has rights to save to this column could easily do the same. The safer way would be to use an external monitoring program that tracks every change made to the database and the user that made it. In SQL Server 2008, there is a feature called "Change Tracking" which revolves around similar functionality which also might suffice.
How do you stop external changes to the database?
The obvious choice is to restrict access to the database. Only let the application account have access to the database and/or access to the tables and stored procedures. In addition, you need to lock down access by administrators and developers to the database so that only a select few can access the database. Combined with good logging, that should prevent mystery rows from appearing.
What you can do is to create a LastUpdated field, which cn be updated using a trigger.
Then the application can make use of this filed to check if data has been out dated.
This is a small example
CREATE TABLE Vals(
ID INT,
Val VARCHAR(10),
DateCreated DATETIME DEFAULT GetDate(),
DateUpdated DATETIME DEFAULT GetDate()
)
GO
CREATE TRIGGER Upd ON Vals
AFTER UPDATE
AS
UPDATE Vals
SET DateUpdated = GetDate()
FROM Vals INNER JOIN
inserted ON Vals.ID = inserted.ID
Go
INSERT INTO Vals (ID, Val) SELECT 1, 'A'
SELECT *
FROM Vals
GO
UPDATE Vals SET Val = 'B'
SELECT *
FROM Vals
GO
DROP TABLE Vals
GO
Most applications use the latest write wins
tactic. While unappealing to the architects, this is easy to code, and nearly always good enough.
If you're in the 1% of applications that really need concurrency:
- Add a
UpdateDt
andUpdatedBy
column. If theUpdateDt
has changed since you read the record, give a warning - Store the values for the row you read, and use a copy to store any modifications. Before you write to the database, check if the original values still match. If they don't, give a warning. (This is the strategy used by the .NET
DataSet
.) - Add a locking table. When a user opens a record, set a lock for it. When another user tries to open the row for modification, issue an error message. This is called
pessimistic locking
, as opposed to the other methods, because it warns even before a potential conflict arises.
Via an ORM Some/all ORMs will generate SQL to match on every value that your object has, and run the update where all those conditions match. Important in this scenario is that some field be consistent as the 'change indicator'.
--Customer--
ID int
FirstName varchar(10)
LastName varchar(10)
ChangedOn datetime
UPDATE Customer
SET FirstName = 'SomeChange'
WHERE FirstName = 'Oldvalue'
andLastName = 'OldValue'
and ChangedOn = SomeDate
Home Baked Aside from an ORM, consider implementing an update stored proc for your table that will include an update to the ChangedOn field with a new timestamp whenever a record is updated. That will allow you to manage the updating yourself. Define a property for this timestamp in your class, and populate it in your objects when reading from its datastore. When saving back to the datastore, perform a check from the timestamp's value vs. what's currently stored in the database.
This can be done in SQL Server without the use of Triggers, sprocs, or enforcing timestamps throughout your code - see the rowversion column type.
The normal 'enterprisey' way to do this is to use message queues, by using message queues which are posted to by a trigger your app can get messages whenever somebody writes to a particular table. It also has the big advantage that your app doesn't need to be running, but can read the queue at start or on a schedule if you want.
see message queue task
When any application/user has access to tables (or updatable views on a table), triggers are the only reliable way to catch all changes and take appropriate actions.
Because triggers are not a great application design tool for business rules or general database integrity, in most database-centric applications where the database really has its own responsibilities and has its own perimeter and service definitions, you handle this in layers - no one has access to tables, you have non-updatable views and most access is required to go through stored procedures.
Even in applications where the database has an open perimeter, business rules are enforced in a data layer in the application and all access should go through that. Unfortunately in these cases, an open perimeter database really cannot guarantee much about the higher-order integrity or expectations about the data unless the accounts are really locked down and only access through the data access layer in code has somehow been assured.
If you absolutely have to let people update the tables directly, and you need to enforce the same rules in those exceptional cases, then triggers are really your only option. At the very least you can have relatively simple audit triggers which audit all the changes.
Detecting changes that occurred while the client was offline
This is the job of Change Tracking and Change Data Capture. They're both technologies derived from Replication (one from Merge, one from Transactional) and are used by the Sync Framework. With CT and/or CDC an application can connect to the database and ask 'what data changes occured since last time I was connected?'. These scenarios address Occasionally Connected Systems like laptops, mobile devices and the like.
Detecting changes while the client is online
This is the job of Query Notifications. The technology is based on something similar to how online indexes are maintained, and is capable of actively notifying the client when data has changed. The main scenario addressed by QN is cache invalidation.
Application techniques
Other techniques are application specific like triggers, rowversions, timestamps, code hooks in ORM and so on and so forth. None of them are very reliable, as the engine does not understand them as change tracking and certain operations that shouldn't be allowed are allowed (think at all the operations prohibited when replication is enabled, like spiting and merging partitions). Some are even worse than such extreme cases, the typical example being rowversion and timestamp columns, which are hopelessly helpless at detecting rows that were deleted.
The answers given so far, though sensible, might be answering a different question than intended.
I think he's saying: assume there's a database that's difficult to secure. How can he detect hacking when it happens, ideally before it overwrites the good data?
If that's what's meant, how about this:
- Keep a hash column in each row. It should be made from the columns in the row plus a private 'salt'.
- When your application changes data, have it update the hash. An outsider who doesn't know the 'salt' can't do that.
- Put an
UPDATE
trigger on the table to check the hash. If it doesn't check, something is wrong and you can rollback theUPDATE
before the bad data is committed.
This is hardly hackproof, but it discourages the casual vandal.
精彩评论