开发者

Framework to Avoid Writting Repetitive Joins in SQL/Select Statements?

开发者 https://www.devze.com 2023-03-11 04:50 出处:网络
Say that each record in table TA points to exactly one record in table TB which in turn points to exactly one record in table TC. If I am to write a select statement to retrieve some results from tabl

Say that each record in table TA points to exactly one record in table TB which in turn points to exactly one record in table TC. If I am to write a select statement to retrieve some results from table TA and want to include a condition that uses a remote attribute defined on a record coming from table TC, then I have to write a bunch of join statements and also remember the names of the primary and foreign keys that link those tables together. Well, this is very easy to do, however it is also very boring, especially when you are writing similar joins to solve similar problems in different queries...

It would be much easier to use an object notation to access child records/properties, or something like this:

"TA.TB.TC.attribute"

I'd even preffer to use path notation:

"TA/TB/TC/attribute"

Of course, for something like that to work properly there should be a way of pre-defining tables' relationships, somewhere, or something like this (pseudo-code):

table TA(
    has one TB as "B"
    on (TA.fkey = TB.pkey)
)

table TB(
    has one TC as "C"
    on (TB.fkey = TC.pkey)
)

So we could write a select instruction like this:

select * from TA where B/C/attribute = "something"

And this would be coverted by the interpreter into:

select * from TA 
join TB as B on (TA.fkey = TB.pkey)
join TC as C on (TB.fkey = TC.pkey)
where C.attribute = "something"

Aditionally, there could even be a way of creating "shortcuts" to what I am here calling "remote attributes". Example:

   table TA(
    has one TB as "B"
    on (TA.fkey = TB.pkey);
   )

# define a "shortcut"
TA.C_attribute = TA/B/C/attribute;

Then C_attribute would exist as a column for TA but only virtually and not fisically; nonetheless it would become visible and accessible in queries:

select * from TA where C_attribute = "something"

This would force th开发者_C百科e parser/interpreter to go through all specifications pre-defined by the user and then rewrite the SQL properly with all those joins.

So, is there a tool/framework that does that in PHP? I don't mean Doctrines -- it doesn't cover all these requirements i've gone through here.


Well, if you want to get rid of joins without resorting to ORM, you have to get rid of your relational database engine. After all, the name is self-explanatory. It's relational, so you must join records together to create relationships.

What you may be looking is probably an object database. Document-oriented databases, which can also be built on top of object databases, may fit your needs as well.

I use MongoDB. It has a PHP extension, and has loads of useful features (most notably scalability and sharding).

In Mongo, you store documents, which are, basically, JSON objects. Considering a "mydb" database and a collection "mycoll" (collections are similar to tables if you like):

> x = {"foo":"bar", "hello":"world", "some_list":["a","b","c"]}
> db.mycoll.save(x)                             // save the object 'x'
> db.mycoll.find()                              // search the collection
{ "_id" : ObjectId("4df0190b01fdc88b103d3354"), "foo" : "bar", "hello" : "world", "some_list" : [ "a", "b", "c" ] }

As you can see, it supports nested objects and arrays as well. Also, it supports references to other objects:

> y = {"some_ref" : new DBRef("mycoll", x._id)} 
> db.mycoll.save(y);                            // save the object 'y'
> x.foo = "something has changed!"              // change the object 'x'
> y.some_ref.fetch().foo                        // query 'y' for object 'x'
something has changed!

And yes, it's basically a JavaScript shell, which means you can also create functions:

> myfunc = function(foo) {
... foo.foo = "something else has changed!";
... }
> myfunc(x)
> x
{
    "foo" : "something else has changed!",
    "hello" : "world",
    "some_list" : [
        "a",
        "b",
        "c"
    ],
    "_id" : ObjectId("4df0190b01fdc88b103d3354")
}

Sure, it doesn't replace relational databases, but for a lot of projects I'm working on recently, it fits perfectly. Especially those with custom fields and metadata, because it is schema-less.


The database issue is a little subtle, and you might want to think through this again.

Say that each record in table TA points to exactly one record in table TB which in turn points to exactly one record in table TC.

create table TA (
  id integer primary key,
  results char(1) not null
);

create table TB (
  id integer primary key references TA (id)
);

create table TC (
  id integer primary key references TB (id),
  remote_attr char(1)
);

insert into TA values
(1,'A'),(2,'B'),(3,'C');

insert into TB values
(1),(2),(3);

insert into TC values
(1,'a'),(2,'a'),(3,'b');

If I am to write a select statement to retrieve some results from table TA and want to include a condition that uses a remote attribute defined on a record coming from table TC, then I have to write a bunch of join statements and also remember the names of the primary and foreign keys that link those tables together.

Yes, you need to be able to look up the names of the keys, but you only need one join.

select TA.id, TA.results
from TA
inner join TC using (id)
where remote_attr = 'b';

id   results
--
3    C

Older technologies required application code to chase pointers through all the tables. The relational model (and SQL) lets you join directly on common values.

I actually don't mind writing the more explicit version.

select TA.id, TA.results
from TA
inner join TC on (TC.id = TA.id and TC.remote_attr = 'b')

You don't actually need to remember the names of the keys. At work, my operational database has over a thousand tables. I don't try to remember the names of all the keys; I just look them up. (I don't try to remember the names of all the tables and views either.)

The subtle database issue has to do with this syntax: "TA.TB.TC.attribute". That syntax will give you only the result of an equi-join, and then only through the equivalent of chasing pointers. SQL join syntax is infinitely more expressive than just simple equi-joins.

If you can only express equi-joins--not the full range of join conditions--with that kind of syntax, why use it at all?

0

精彩评论

暂无评论...
验证码 换一张
取 消