I am writing some Java unit tests and I need to compare two sql strings where the sql statements are semantically equivalent but can be syntactically different. I can't do stri开发者_StackOverflow中文版ng comparison since an order of the from clause and where clause can be different yet the two queries might be equivalent.
Is there anyway to do this in java without having to write my own Oracle SQL Parser ? :)
P.S. The query can be very complicated !
Thank you !
The general answer is NO, because you can always call some kind of stored procedure which hides a Turing machine. The fact that you can do arithmetic in a SQL statement I think pushes you over the Turing cliff, too.
Of course, theorists always tell us everything is impossible, so we should all roll over and die.
Nah.
So what can you do? Well, a "simple" possibility is to normalize the SQL queries, much as you simplify algebraic equations. If you could somehow, for a SQL statement, "normalize" (convert) it into the absolutely shortest equivalent SQL that did the same thing, then you could normalize both SQL statements and compare the result statements; if the they are equal modulo identifier renaming, then they have the same "semantics". For every operator in SQL, there is some semantics behind it, and some set of equivalent operations, just as there is in algebra. So, if you can determine the set of algebraic equivalences for each SQL operator, you can replace each algebraic computation by the shortest algebriac equivalent which does the same thing.
To do this, you have to be able to parse SQL, and apply SQL rewrites to the parsed SQL, which means you need a program transformation engine. (You can see an analog of this at Parsing and Rewriting Algebra)
This doesn't work in all cases. First, there may be several same-length "shortest" SQL statements that are equivalent (2+X is the same as X+2 but it isn't obvious to a a tool). Now you've got a theorem proving problem (for our X+2 example, use the commutative law to prove they are equal), back to stuck in theory. Second, you may not know how to generate the shortest possible sequence using your rewrites; even math equations sometimes have to swell up before they can get small again. Technically you have to search all possible algebraic equivalences to find the shortest, and that's impossibly large.
So, hard to do in practice, too. So, NO.
Not a direct solution for your problem, but you may want to look into JSqlParser which may already cover part of what you need.
精彩评论