I am developing and maintaining a database-abstraction library called jOOQ, which aims to "internalise" SQL, an external DSL, into Java. The goal is to allow type-safe construction and execution of all SQL syntax elements of the most popular RDBMS. jOOQ's internal DSL is becoming more and more complex, and I'd like to get a formal hold of it. Ideally, I would have some sort of formal definition of SQL as input, e.g.
select ::= subquery [ for-update-clause ]
subquery ::= SELECT [ { ALL | DISTINCT | UNIQUE } ] select-list
            [ FROM table-reference ] ..
select-list ::= expression [ [ AS ] alias ] [, expression ... ]
expression ::= ...
alias ::= ...
table-reference ::= ...
The input could also be defined in XML or any other descriptive meta-language. Once I have that input, I'd like to generate from it a set of Java interfaces that model the defined syntax. Example interfaces would be:
// The first "step" of query creation is modelled with this interface
interface Select0 {
    // The various SELECT keywords are modelled with methods
    // returning the subsequent generated syntax-element
    Select1 select(Expression... expressions);
    Select1 selectAll(Expression... expressions);
    Select1 selectDistinct(Expression... expressions);
    Select1 selectUnique(Expression... expressions);
}
// The second "step" of query creation is optional, hence it
// inherits from the third "step"
interface Select1 extends Select2 {
    // Here a FROM clause may be added optionally
    Select2 from(TableReference... tables);
}
// To keep it simple, the third "step" is the last for this example
interface Select2 extends SelectEnd {
    // WHERE, CONNECT BY, PIVOT, UNPIVOT, GROUP BY, HAVING, ORDER BY, etc...
}
With the above interfaces, it will be possible to construct SQL queries in Java, as jOOQ already allows today:
create.select(ONE, TWO).from(TABLE)...
create.selectDistinct(ONE, TWO).from(TABLE)...
// etc...
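To illustrate how such step interfaces constrain the call chain, here is a minimal, self-contained sketch. It is not jOOQ's actual implementation: `SelectBuilder`, `toSQL()`, and the use of plain `String` arguments in place of `Expression` and `TableReference` are assumptions made only to keep the example short.

```java
// Hypothetical terminal step: renders the SQL built so far.
interface SelectEnd {
    String toSQL();
}

// Third step; further clauses (WHERE, GROUP BY, ...) are omitted here.
interface Select2 extends SelectEnd {
}

// Second step is optional, so it inherits everything from the third step.
interface Select1 extends Select2 {
    Select2 from(String... tables);
}

// First step: the SELECT keyword variants.
interface Select0 {
    Select1 select(String... expressions);
    Select1 selectDistinct(String... expressions);
}

// One mutable builder implements every step. The declared return types,
// not the runtime class, restrict what a caller may chain next.
final class SelectBuilder implements Select0, Select1 {
    private final StringBuilder sql = new StringBuilder();

    @Override
    public Select1 select(String... expressions) {
        sql.append("SELECT ").append(String.join(", ", expressions));
        return this;
    }

    @Override
    public Select1 selectDistinct(String... expressions) {
        sql.append("SELECT DISTINCT ").append(String.join(", ", expressions));
        return this;
    }

    @Override
    public Select2 from(String... tables) {
        sql.append(" FROM ").append(String.join(", ", tables));
        return this;
    }

    @Override
    public String toSQL() {
        return sql.toString();
    }
}
```

Because `from` returns `Select2`, a second `.from(...)` in the same chain does not compile, while skipping `from` altogether is allowed since `Select1` is itself a `Select2`.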
Also, I'd like to exclude some syntax elements from specific builds. E.g. when I build jOOQ for exclusive use with MySQL, there is no need to support the SQL MERGE statement.
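A rough sketch of what such a generator might look like, assuming the grammar has already been reduced to a list of keyword methods per step. Everything here (`StepInterfaceGenerator`, `Keyword`, the `feature` tag used for per-build exclusion, the parameter name `args`) is hypothetical and only meant to make the idea concrete:

```java
import java.util.List;
import java.util.Set;

final class StepInterfaceGenerator {

    // Hypothetical description of one keyword method on a step interface.
    static final class Keyword {
        final String method;    // e.g. "from"
        final String argType;   // e.g. "TableReference"
        final String nextStep;  // interface returned by the method, e.g. "Select2"
        final String feature;   // syntax feature it belongs to, e.g. "SELECT" or "MERGE"

        Keyword(String method, String argType, String nextStep, String feature) {
            this.method = method;
            this.argType = argType;
            this.nextStep = nextStep;
            this.feature = feature;
        }
    }

    // Renders one step as Java source text. An optional step passes the
    // following step as parent (null for none). Keywords whose feature is
    // excluded for this build are skipped.
    static String render(String name, String parent, List<Keyword> keywords,
                         Set<String> excludedFeatures) {
        StringBuilder src = new StringBuilder("interface ").append(name);
        if (parent != null) {
            src.append(" extends ").append(parent);
        }
        src.append(" {\n");
        for (Keyword k : keywords) {
            if (excludedFeatures.contains(k.feature)) {
                continue;
            }
            src.append("    ").append(k.nextStep).append(' ').append(k.method)
               .append('(').append(k.argType).append("... args);\n");
        }
        return src.append("}\n").toString();
    }
}
```

For a MySQL-only build, passing a set containing "MERGE" as `excludedFeatures` would drop any keyword tagged with that feature from the generated entry-point interface.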
Is there any existing library implementing such a general approach in order to formally internalise an external DSL into Java? Or should I roll my own?
What you are really trying to do is to translate generic SQL into calls on your internal APIs. Seems reasonable.
To do that, you need a parser for "generic SQL", and a means to generate code from that parser. Typically you need the parser to build an abstract syntax tree, and you pretty much need a symbol table (so that you know which names are table names, which are column names, and whether those column names come from table A or table B). So somewhere you need access to the SQL DDL that defines the data model... which requires you to parse SQL again :).
With the AST and the symbol table, you can generate code in a lot of ways, but a simple method is to walk the AST and translate constructs as you encounter them. This won't allow for building optimized queries; that requires more complex code generation, but I'd expect it to be adequate if supported by appropriate API functions you supply.
Actual code generation could be done by just printing Java text; if you go the ANTLR route you'll have to do something like this. An alternative is to literally transform the SQL code fragments (as ASTs) into Java code fragments (as ASTs). The latter scheme gives you more control (you can actually do transformations on Java code fragments if they are ASTs) and if done right, can be checked by a code generation tool.
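The simpler of the two options, walking the AST and printing Java text, might look roughly like the sketch below. The `SelectNode` class is a hypothetical, hand-built stand-in for whatever a real parser would produce, and the identifiers are assumed to have been resolved against the symbol table already; only the emitted `create.select(...)` call shape is taken from the question.

```java
import java.util.List;

final class SelectToJava {

    // Hypothetical AST node for a tiny subset of SELECT.
    static final class SelectNode {
        final boolean distinct;
        final List<String> columns; // already resolved to generated constants
        final String table;         // null when there is no FROM clause

        SelectNode(boolean distinct, List<String> columns, String table) {
            this.distinct = distinct;
            this.columns = columns;
            this.table = table;
        }
    }

    // Visits the node's parts in syntax order and prints the matching API call
    // for each construct as it is encountered.
    static String translate(SelectNode node) {
        StringBuilder out = new StringBuilder("create.");
        out.append(node.distinct ? "selectDistinct" : "select");
        out.append('(').append(String.join(", ", node.columns)).append(')');
        if (node.table != null) {
            out.append(".from(").append(node.table).append(')');
        }
        return out.toString();
    }
}
```

A production translator would recurse over nested expressions and subqueries in the same way, one emitter per construct.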
Our DMS Software Reengineering Toolkit would be a good foundation for this. DMS provides an ecosystem for building translation tools, including robust parser machinery, symbol table support, and surface-syntax pattern-directed translation rules. It has a tested parser available for generic SQL (SQL 2011, the standard), as well as for Java, to be used on the code generation side.