Why is it useful to have a MetaData object which is not bind to an engine in SQLAlchemy?_问答_开发者

I am trying to understand the behavior of MySQL regarding the MetaData object and the engine object. This SO-Answe开发者_Go百科r describes MetaData as

a collection of table definitions

and engine as

the dialect and connection details of a specific database

So far so good. But when is it useful to have these two separated? Aren't table definitions linked to a specific database?

SQLAlchemy 0.1 didn't have a "metadata" object - the Engine was bound directly to each Table. That idea got old very quick, partially because folks wanted to declare their Table object before they connected, and "bound metadata" came along. Then, a long period of mass confusion occurred. People were in the super hard habit (since I told them to do it this way) of saying things like:

table.insert().execute()

result = table.select().execute()

I.e., no transaction, using a new connection each time. Then we'd say, "oh well you should be using a transaction, you should be more efficient about connections", and then we'd basically tell them they had to rewrite their app, and the "sqlalchemy has too many ways" meme blew up like a balloon.

At the same time, Pylons and other early WSGI frameworks were pushing hard to have multiple "apps" running at once - in some cases, different "users" would have their own set of tables each in different databases, kind of thing. More common are horizontal scaling approaches where the same tables are in many databases. I have an app here that has a "replication" system built in where records are copied from the "primary" database to the "history" database periodically, and the table metadata there is shared too.

The point is for all those use cases, users would come to SQLA and their understanding of things began with "bound metadata". Blogs and tutorials all over the place used it. And a good chunk of these users would need to break out of that system, and become completely confused that there was this whole "other way" of working. So it became clear that the "bound metadata" system was just too rigid as a default. Ideally I wish that I never implemented it at all, and I never use it myself. This is why the docs for it are pushed into just one section now, new users who only skim the docs, who by their nature add a huge support burden to the mailing list, don't find it and don't get confused. The section itself has plenty of bullets explaining exactly when and why it is confusing. I'm assuming you read it, at http://www.sqlalchemy.org/docs/core/schema.html#binding-metadata-to-an-engine-or-connection .

In fact, table definitions aren't linked to a specific database. The sqlalchemy metadata object, and all the attached objects (tables, columns, indexes, etc) define a completely abstract schema, without reference to any particular database. The engine is essentially three things: the connection info, the dialect, and a connection pool. The dialect is the important part here, it defines how to connect to a specific database, but more importantly with regards to your question, it also defines how to translate sqla's abstract schema objects (metadata et al) into sql commands specific to that database and driver.

This separation has a number of different uses:

Applications can define their schema at the module level, creating tables and metadata when the module is imported... but not bother defining an engine til the app actually is up and running (since knowing the full connection url & dialect usually requires reading a config file or getting user input first).
You can have the situation of multiple engines tied to the same set of metadata, such as a webapp where each user/password connects to the database with it's own credentials (Not that this is efficient, but sometimes needed for security purposes). Since metadata is a global object, it wouldn't make sense to ever bind it to a particular engine in this case.
It's rare, but you can also have the reverse case, multiple metadata instances for a single engine. This case can happen when multiple subcomponents are sharing the same database, but using different table names. Also, it can happen when trying to compare the app's current schema in one metadata instance against another which was reflected from the server (such as for the purpose of schema migration). This wouldn't strictly prevent you from binding each of them to the engine, but does help demonstrate how there can usefully be multiple metadata or engine instances.

There's probably some more use-cases I just can't think of right now, but that should give you the general idea of why they are conceptually separate from eachother.