I'm using Rails and it does not provide support for database specific actions like triggers, stored procedures and various constraints (not all of them).
I wonder if I should put the database logic in the application itself instead.
Because then you can make more complex logic than the database provides and it is also database independent (I can move from mysql to postgresql and vice versa), and it won't be if you are putting these things in the data开发者_开发知识库base.
Is this the right way to go?
Thanks
(Note: This is a Postgres centric post. I have several years experience with MySQL, and a few Postgres and Oracle. I prefer Postgres by a country mile, but that's not the point of this post.)
This is really a question about two schools of thought: should the database be a simply a data store or should it contain application logic? There are cases for both.
IMHO the answer used to be a lot simpler before several DBs (esp Postgres) got some really, really good procedural languages in the DB itself. Before that point, it made sense to kick all the all the logic you could in the application because it had to be rather kludgey to get some things done in basic SQL.
Dave Markle in another answer makes a good point about triggers, they tend to be "magic" to the developer. Changing input on the way in can be terribly confusing. If I say UPDATE foo set X=3;
but then I go check foo
and x
is really 4
because some trigger intercepted it, it can be confusing. Like Dave, I tend to promote triggers for auditing functions and the like. The other issue with triggers is performance, they can just suck the life right out of a good batch operation.
Stored procedures - I hold in much higher regard. The developer is explicitly invoking them and they're named, so he should not be seeing them as a black box. A good stored procedure can greatly increase the speed by reducing the number of rows that need to be "sent over the wire". For a DB such as Postgres with a strong set of PL langauges, there's really nothing that can't be in an SP (that doesnt mean it SHOULD be done, but can be). Take a look at SimplyCity for an example of how stored procedures can be your friend. Additionally, you could share application logic classes between your application code and PL/python, PL/Perl, or PL/Php, PL/Ruby, or PL/Java. I believe Oracle can do something similar with Java - it's just not something the company I currently do Oracle work for is working with.
If you plan on staying database agnostic, you're going to be sacrificing a lot of features, a lot of speed, and a lot of your time. ORMs can make this easier, but ultimately there is fundamental difference in most databases engines that just can't be fully abstracted away.
Overall, you need to test, test, test (with real data) and make the decision that's best for your app. It's often a balancing act between performance, future maintenance, cost, and the resources you have to work with. Often times (not every time), moving the logic out of the database and into the application greatly reduces the performance of the application.
Even if you don't decide to put application logic in the DB - take the time to really learn about your DB of choice. It will make you a far better application developer in the long run.
I will give an answer probably the opposite of what everyone else will say.
Database logic belongs as close to the data as you can possibly implement it, and that means in the database. Doing anything else will, at best, require you to repeat yourself in different applications and have a big update headache. At worst (and this is pretty likely) you'll have inconsistent database logic implemented in different applications.
You need to separate out what's "database logic" and what's "application logic". My normal example is defining what your organization means by "FullName" for a customer. Imagine that you write code in your application that combines the first and last names into a full name field. Later, you add a middle initial (or decide to include an already existing middle initial column) as part of the full name.
If your full name logic is implemented in a view or stored procedure, you need only make a single change to enforce the new definition across all applications (including third-party applications for which you don't have source code).
The problem is somewhat eased in situations (like Ruby) where all data access is done through a shared ORM layer where you can put logic that will be used by all applications written in the same language. In my situation, however, I rarely work in situations in which only a single language or product will be used for programming against the corporate database.
If you're already using Ruby on Rails then you've committed to using opinionated software. Rightly or wrongly, Rails' opinion on this matter is that the logic goes within the Rails application. It even takes it as far as enforcing referential integrity itself through ActiveRecord and its associations, although you can of course enforce this in the database if you wish by adding constraints to your tables. A lot of Rails developers don't bother, either through ignorance or choice.
- The productivity gains Rails provides tend to diminish when you start to stray from its opinions and conventions
- If you're writing the sort of application that manipulates large quantities of data and would benefit from taking advantage of specific RDBMS features, then this is a bad fit for Rails anyway because it's optimized for creating new CRUD web applications
Because then you can make more complex logic than the database provides and it is also database independent (I can move from mysql to postgresql and vice versa), and it won't be if you are putting these things in the database.
The logic depends on the database - MySQL for example doesn't have recursive query support, nor analytic functions. PostgreSQL only got analytics recently - SQL Server's had both since v2005; Oracle since 9i.
Database logic in the application ltypically means more trips between the application and database - this is time/performance you will never recoup. Database trips are expensive, and be kept to as few as possible - which is why stored procedures & functions are your best tools for a highly scalable, performant persistence layer.
ORM has long been known for being great for queries that aren't complex, and I'm glad to see they've evolved to support using stored procedures/etc. ...Which utterly defeats the purpose of using ORM - now you're back to writing database specific code for sake of performance. It's like the joke about solving something with a regex - now you've got two problems...
Finally, databases - the tables and data types - also do not port to other vendors thought there are conversion kits. Even changes within the database are a pain, which is why database design/modeling development does not work well in agile/interative development processes.
Conclusion
Database logic in the application is great if you have super simple applications, with small datasets. But it will pay off in spades to learn database development - your clients will benefit from a more scalable, faster application.
I'm solidly in the "application logic" camp myself.
For me the most important issue has to do with scaling the persistence layer as your application grows. By relying heavily upon the database to implement your logic, you're signing your application up for database sharding or replication as your scaling solution for your persistence layer as you grow. I've found much more success with the plethora of distributed cache solutions at the business object layer.
Of secondary performance is the transparency of implementation. Modern object oriented design has a lot of benefits in pulling your data and business logic together, but it largely decays into a mess once you're relying upon an external and hidden layer of logic.
I have researched about where trigger is should be(application side or database side). Both of design decision have pros and cons.
Database Trigger
pros:
- Simplifies application logic
- Better performance
cons:
- Maintaince is harder
- Database dependency
- Batch operation performance problems
- If you don't have access to database for prod envoriment, It can be harder to manage
Application Logic:
pros:
- Maintaince can be done in one place. It is good thing.
- Database independent if you use ORM tools or frameworks.
cons:
- Performance may be problem
- Add some complexity to the application logic
One an view of point about this subject is following:
"If you have just one application which has write access to your database (or at least to the relevant tables here), and your CRUD operations are in one place of that application, then there is normally no need for using triggers for keeping your data consistent. Triggers are a good thing when you don't know beforehand who and what processes/applications will access your db in the future, and you want to keep a lot of consistency rules in one place, mostly in your DB."
enter link description here
Generally, yes, mostly for the reasons you just mentioned.
Triggers are especially bad because it's pretty opaque to the developer about what's actually going on behind the scenes when an operation is performed. I tend to limit my use of triggers to auditing functions, or functions that are incredibly important to protecting the data integrity of the system. I almost entirely avoid using triggers for application logic.
Same goes with stored procedures. In this day and age with ORMs being so widespread, I use stored procs for big, set-based data modifications whose performance would be undesirably bad when using an ORM such as Rails, which requires you to read the data from the database and marshal it back to the database layer once you're done processing it.
It's my opinion RDBMSs should be dumb and flexible, used strictly for persistence, and disparate enterprise applications should communicate through techniques of SOA.
I create my data structures in a normalized, flexible manner, typically using only datatypes and referential integrity constraints. The data are managed by an application (a "mother") that implements any rules necessary for a cohesive domain. If disparate (or 3rd party) apps need access to these data, they should not undermine the mother and go directly to the data store, but use techniques of SOA to communicate with her, asking her to to take actions and responding to the results (including failures) that she returns to them.
I think about it this way. If I want to access my company's email data, I don't access them directly where they reside on disk. That would be painful as I'd have to understand the data structures before I could do anything. Instead, I communicate with my email server application (Exchange Server, in my case) through techniques of SOA (WebDAV, CDO, LDAP, etc.). This app provides an abstraction layer to query the email, calendar, or contact data, to send emails or create contacts or appointments, or manage calendars etc.
If these other apps (created by you) are not fundamentally different from the mother app (also created by you), then perhaps you have too many one-off applications and a composition of these applications should be developed as a common and intuitive enterprise system. Then each of these applications turned use-cases (or modules, as I call them) can access the data directly leveraging a common data access layer that implements the same rules of cohesion defined centrally in the composite application. Of course, then the composite application should provide access to the data through techniques of SOA so that apps that are truly disparate can communicate with her.
There is data logic and there is application logic. The database has to know everything about the data logic, the application about the application logic. Data logic can be stored inside the database in stored procedures, views, rules, etc. Oracle, PostgreSQL, SQL Server, etc. have very powerfull tools to support this.
Many applications can use the same database and the same data, and you have to be sure the data is correct. Every application (in any language) can do different things with this data, that's up to the application. Building database logic inside your application, is like reinventing the wheel and asking for trouble. RDBMSs have about 40 years of development history, you're not going to get a better result in a couple of weeks or months.
For PostgreSQL, take a look at the different PLs: PL/pgSQL is nice, PL/perl and PL/ruby might be more interesting for perl and ruby developers.
精彩评论