I am using MySQL and Java to SELECT about 50000 records. The strange thing is that when I use a ResultSet and its next() method to read the data, the RAM usage of my Java application increases during fetching. It starts at 255 MB and rises to 379 MB! The code I am using is here:
try {
    Class.forName("com.mysql.jdbc.Driver");
    Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/#mysql50#crawler - used in report?" + "user=root&password=&useUnicode=true&characterEncoding=UTF-8");
    Statement st = conn.createStatement();
    ResultSet rsDBReader = st.executeQuery("SELECT Id, Content FROM DocsArchive");
    while (rsDBReader.next()) {
        int docId = rsDBReader.getInt(1);
        String content = rsDBReader.getString(2);
        . . .
    }
    rsDBReader.close();
    st.close();
    conn.close();
} catch (Exception e) {
    System.out.println("Exception in reading data: " + e);
}
I am sure that the memory usage is due to the ResultSet, not other parts of the program. In this program I don't need to update records, so I want to discard every record once I have finished working with it. My guess is that the records which have already been read are not removed and the program doesn't free their memory. So I have tried some tricks to avoid this, such as the following code:
Statement st = conn.createStatement( ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
st.setFetchSize(500);
rsDBReader.setFetchSize(500);
but they didn't change anything. :(
So I need some method that releases the memory of rows that have already been read.
Another interesting point is that even after the function finishes, the ResultSet, Statement and Connection are closed, and the program moves on to other work, its memory usage still doesn't decrease! Thanks
Use Statement.setFetchSize() to give the driver a hint that it should stream the ResultSet, fetching a certain number of rows at a time, instead of loading it all at once. As far as I know, the MySQL Connector/J driver does understand the hint and streams ResultSets (but this is restricted to one row at a time in the case of MySQL).
The default value of 0 makes the Connector/J driver fetch the complete ResultSet without streaming it. That's why you need to provide an explicit value, and in the case of MySQL that value is Integer.MIN_VALUE; a value such as 500 does not switch on streaming by itself.
The statement:
Statement st = conn.createStatement( ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
does not result in streaming the ResultSet (at least not of its own accord). It merely ensures that the ResultSet is not "scrollable" (i.e. it can be traversed only in the forward direction) and not "updatable", and that the underlying cursor is closed when the transaction commits.
As noted in the JDBC implementation notes of MySQL, the above statement (without the ResultSet.CLOSE_CURSORS_AT_COMMIT parameter) has to be used together with a call to Statement.setFetchSize(Integer.MIN_VALUE) for row-by-row streaming to occur. The caveats associated with this approach are documented there as well.
Note that the holdability of the cursor is not specified in the example given in the MySQL documentation. If you need a value different from the one returned by Connection.getHoldability(), then this advice might not apply.
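The streaming setup described above can be sketched roughly as follows. This is only an illustration under assumptions: the table and column names are taken from the question, the Connection is assumed to come from MySQL Connector/J, and the class, method and constant names (StreamingRead, processAll, MYSQL_STREAMING_FETCH_SIZE) are made up for this example.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingRead {
    // Connector/J treats this particular fetch size as "stream row by row".
    static final int MYSQL_STREAMING_FETCH_SIZE = Integer.MIN_VALUE;

    // Hypothetical helper: reads every row and returns how many were seen.
    static int processAll(Connection conn) throws SQLException {
        int rows = 0;
        // Forward-only, read-only statement; holdability is left at the default.
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(MYSQL_STREAMING_FETCH_SIZE);
            try (ResultSet rs = st.executeQuery("SELECT Id, Content FROM DocsArchive")) {
                while (rs.next()) {
                    int docId = rs.getInt(1);
                    String content = rs.getString(2);
                    // ... process docId and content here; do not keep references
                    // to them if the rows should become garbage quickly.
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

One of the documented caveats is that, while a streaming ResultSet is open, all of its rows have to be read (or the ResultSet closed) before another query can be issued on the same connection.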
I'd suggest limiting the number of rows you retrieve in your query. 50000 is a lot, so why not have a loop that fetches, let's say, 1000 rows at a time?
You can achieve this using the LIMIT clause, as described here. It's always best to be pragmatic about the amount of data you're processing. Your current SELECT might return 50000 rows today, but what if it grows to one million tomorrow? Your application will choke. So do your processing step by step.
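A loop of that kind might look like the sketch below. It uses LIMIT together with a "WHERE Id > last seen Id" condition rather than a growing OFFSET, which is a swapped-in variation on the plain LIMIT idea. It assumes that Id is a unique, indexed integer column; the names PagedRead, PAGE_SQL and processInPages are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PagedRead {
    // Assumes Id is a unique, indexed integer column (not stated in the question).
    static final String PAGE_SQL =
        "SELECT Id, Content FROM DocsArchive WHERE Id > ? ORDER BY Id LIMIT ?";

    // Hypothetical helper: reads the table page by page, returns the row count.
    static int processInPages(Connection conn, int pageSize) throws SQLException {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int total = 0;
        int lastId = Integer.MIN_VALUE; // a row with exactly this Id would be skipped
        try (PreparedStatement ps = conn.prepareStatement(PAGE_SQL)) {
            while (true) {
                ps.setInt(1, lastId);
                ps.setInt(2, pageSize);
                int rowsInPage = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getInt(1); // remember where this page ended
                        String content = rs.getString(2);
                        // ... process lastId and content here ...
                        rowsInPage++;
                    }
                }
                total += rowsInPage;
                if (rowsInPage < pageSize) {
                    break; // a short page means there is nothing left
                }
            }
        }
        return total;
    }
}
```

With a page size of 1000, only about 1000 rows are held by the driver at any one time, however large the table grows.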
Note that there is a similar issue with the latest releases of Postgres. In order to get cursor-based processing* you need to disable auto-commit on the connection with connection.setAutoCommit(false) and use a single statement in your SQL (i.e. one query, not several statements strung together with semicolons). It worked for me.
Postgres JDBC documentation
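For Postgres, the steps above could be sketched like this. As far as I recall, the driver documentation additionally asks for a forward-only statement and a non-zero fetch size, so both are included here as an assumption. The table is the one from the question, and the names PgCursorRead, FETCH_SIZE and processAll are made up for the example.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PgCursorRead {
    // Illustrative batch size: rows fetched from the server per round trip.
    static final int FETCH_SIZE = 500;

    // Hypothetical helper: reads every row and returns how many were seen.
    // Side effect: leaves auto-commit switched off on the connection.
    static int processAll(Connection conn) throws SQLException {
        conn.setAutoCommit(false); // cursor-based fetching needs a transaction
        int rows = 0;
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(FETCH_SIZE);
            // A single statement, with no further statements after a semicolon.
            try (ResultSet rs = st.executeQuery("SELECT Id, Content FROM DocsArchive")) {
                while (rs.next()) {
                    int docId = rs.getInt(1);
                    String content = rs.getString(2);
                    // ... process docId and content here ...
                    rows++;
                }
            }
        }
        conn.commit(); // end the read-only transaction
        return rows;
    }
}
```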
What you see is actually expected behaviour and does not necessarily indicate a memory leak. In Java, object instances are not garbage collected immediately after they become unreachable, and most Java VMs are very reluctant to return memory to the operating system once it has been allocated.
If you are using a recent version of Oracle's Java VM and really need a more aggressive garbage collector, you can try the G1GC implementation by adding the following arguments to the java command:
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC
The G1 garbage collector usually reclaims objects faster than the default garbage collector, and unused memory is also released by the process.
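To tell heap that is really in use apart from memory the VM is merely holding on to, you can look at the numbers from inside the program. The following is a minimal sketch; HeapProbe and usedHeapBytes are made-up names, the 32 MB allocation is arbitrary, and System.gc() is only a hint that the VM is free to ignore.

```java
public class HeapProbe {
    // Heap currently occupied by objects (live or not yet collected), in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Allocate roughly 32 MB, standing in for rows read from a ResultSet.
        byte[][] blocks = new byte[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        long during = usedHeapBytes();

        blocks = null; // make the data unreachable
        System.gc();   // a request only; the VM decides whether to collect
        long after = usedHeapBytes();

        System.out.printf("used before: %d KB%n", before / 1024);
        System.out.printf("used during: %d KB%n", during / 1024);
        System.out.printf("used after:  %d KB%n", after / 1024);
        System.out.printf("heap held by the VM: %d KB%n",
                Runtime.getRuntime().totalMemory() / 1024);
    }
}
```

If the "used after" figure drops while the process size reported by the operating system stays high, the rows were freed and the VM is simply keeping the heap for later use.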