I try to insert 3.6 million records in MySQL 5.1 using JPA in Glassfish 3.0.1. I do it in one single EJB business method call, so I beleive it is done in one single SQL transaction. Am I mad? :-)
Since the same EntityManager instance is used, the business method must call em.flush() and em.clear() every 3000 records, or the em will clog up. (I tried various values, this seems to be optimum for me).
This works pretty nicely on my test&develop platform (WinXP on a quad-core box). Each 100000 em.persist() takes 24-28 s, and the whole operation takes 15 minutes.
But on our production box, with Ubuntu 10 on a 开发者_开发知识库virtualized x86_64 box, each 100000 em.persist() gets progressively slower. The first takes 40 s, then 70, 77, 89, 121, 130, 126, 163, 201, 247 s. The server application ends up with a total hang.
Mysql: 5.1.47-community MySQL Community Server (GPL) (Windows), 5.1.41-3ubuntu12.6 (Ubuntu)
I can't figure out why the (almost) same software behaves so fundamentally different. Any ideas?
15 minutes, that is really slow. Are you using transactions? If not you should be. If you are using transactions, how often are you committing, you should commit fairly regularly, but not too regularly.
Don't try to commit 3.6 million records in a single transaction. I think you're taking the load off your JVM by doing the periodic em.flush() and em.clear(), but that doesn't help the database on the other end who has to manage the rollback data for 3.6 million new records until you finally commit them.
Are you hosting the database on the same server as the Java application? Maybe the database is the bottleneck and not the Java code. Try inspecting the memory and CPU usage for the various processes on your two environments. It may be obvious that either Java or the database server is using up all the memory or CPU.
EJB/JPA doesn't (afaik) give me a choice; The EJB container handles the transaction and commits it when the business method returns. Perhaps there's a way to tell the EJB to do a commit each time the business method has created 10000 entities...
(I could do this using JDBC directly, of course, but that would be pulling the rug under JPA).
The strange thing, though, is the difference in behavious on the two machines.
精彩评论