Streaming large result sets with MySQL

Developer https://www.devze.com 2022-12-23 03:04 Source: Internet
I'm developing a Spring application that uses large MySQL tables. When loading large tables, I get an OutOfMemoryException, since the driver tries to load the entire table into application memory.

I tried using

statement.setFetchSize(Integer.MIN_VALUE);

but then every ResultSet I open hangs on close(). Looking online, I found that this happens because the driver tries to load any unread rows before closing the ResultSet, but that cannot be the cause here, since I read every row:

ResultSet existingRecords = getTableData(tablename);
try {
    while (existingRecords.next()) {
        // ...
    }
} finally {
    existingRecords.close(); // this line is hanging, and there was no exception in the try clause
}

The hangs happen for small tables (3 rows) as well, and if I don't close the ResultSet (which happened in one method), then connection.close() hangs instead.


Stack trace of the hang:

SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) line: not available [native method]
SocketInputStream.read(byte[], int, int) line: 129
ReadAheadInputStream.fill(int) line: 113
ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) line: 160
ReadAheadInputStream.read(byte[], int, int) line: 188
MysqlIO.readFully(InputStream, byte[], int, int) line: 2428
MysqlIO.reuseAndReadPacket(Buffer, int) line: 2882
MysqlIO.reuseAndReadPacket(Buffer) line: 2871
MysqlIO.checkErrorPacket(int) line: 3414
MysqlIO.checkErrorPacket() line: 910
MysqlIO.nextRow(Field[], int, boolean, int, boolean, boolean, boolean, Buffer) line: 1405
RowDataDynamic.nextRecord() line: 413
RowDataDynamic.next() line: 392
RowDataDynamic.close() line: 170
JDBC4ResultSet(ResultSetImpl).realClose(boolean) line: 7473
JDBC4ResultSet(ResultSetImpl).close() line: 881
DelegatingResultSet.close() line: 152
DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163
(This is my class) Database.close() line: 84


Only setting the fetch size is not the correct approach. The javadoc of Statement#setFetchSize() already states the following:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database

The driver is actually free to apply or ignore the hint. Some drivers ignore it, some drivers apply it directly, some drivers need more parameters. The MySQL JDBC driver falls in the last category. If you check the MySQL JDBC driver documentation, you'll see the following information (scroll about 2/3 down until header ResultSet):

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

Please read the entire section of the document; it describes the caveats of this approach as well. Here's a relevant quote:

There are some caveats with this approach. You will have to read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

(...)

If the statement is within scope of a transaction, then locks are released when the transaction completes (which implies that the statement needs to complete first). As with most other databases, statements are not complete until all the results pending on the statement are read or the active result set for the statement is closed.

If that doesn't fix the OutOfMemoryError (not Exception), then the problem is likely that you're storing all the data in Java's memory instead of processing each row as soon as it comes in. Fixing that would require more changes in your code, maybe a complete rewrite. I've answered a similar question before here.
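
A minimal sketch of the advice above, assuming plain JDBC with MySQL Connector/J. The class StreamingQuery, the RowHandler callback, and the method forEachRow are illustrative names, not part of any library. It creates the forward-only, read-only statement, sets the fetch size to Integer.MIN_VALUE, and passes each row to the callback as it arrives instead of collecting rows in memory:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: stream a result set row by row instead of buffering it. */
final class StreamingQuery {

    /** Hypothetical callback type: handles one row at a time. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /**
     * Runs the query with the settings the Connector/J documentation asks for
     * and returns the number of rows handled. The connection should not be
     * used for other queries until this method returns.
     */
    static long forEachRow(Connection conn, String sql, RowHandler handler) throws SQLException {
        // try-with-resources closes the ResultSet first, then the Statement, once each.
        try (Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                   ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(Integer.MIN_VALUE); // the value Connector/J treats as "stream rows"
            try (ResultSet rs = stmt.executeQuery(sql)) {
                long count = 0;
                while (rs.next()) {
                    handler.handle(rs); // process the row now; do not add it to a list
                    count++;
                }
                return count;
            }
        }
    }
}
```

A call could look like StreamingQuery.forEachRow(conn, "SELECT id FROM big_table", row -> process(row.getLong(1))), where big_table and process are placeholders. Because the loop reads every row before the resources are closed, close() has nothing left to drain.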


Don't close your ResultSets twice.

Apparently, closing a Statement also attempts to close the corresponding ResultSet, as you can see in these two lines from the stack trace:

DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163

I had thought the hang was in ResultSet.close() but it was actually in Statement.close() which calls ResultSet.close(). Since the ResultSet was already closed, it just hung.

We've replaced all ResultSet.close() with results.getStatement().close() and removed all Statement.close()s, and the problem is now solved.
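
The replacement described above could be wrapped in a small helper. This is only a sketch: JdbcCleanup and closeWithStatement are hypothetical names. It relies on two documented JDBC behaviors: Statement.close() also closes the statement's current ResultSet, and ResultSet.getStatement() may return null for result sets that were not produced by a statement.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: release a result set through its owning statement, in a single call. */
final class JdbcCleanup {

    static void closeWithStatement(ResultSet rs) throws SQLException {
        if (rs == null) {
            return; // nothing was opened
        }
        Statement owner = rs.getStatement(); // may be null, e.g. for metadata result sets
        if (owner != null) {
            owner.close(); // also closes the statement's current ResultSet
        } else {
            rs.close();
        }
    }
}
```

With this helper, calling code only ever issues one close per query, so the ResultSet is never closed once directly and then again through its Statement.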


In case someone has the same problem, I resolved it by using the LIMIT clause in my query.

This issue was reported to MySQL as a bug (find it here: http://bugs.mysql.com/bug.php?id=42929), which now has a status of "not a bug". The most pertinent part is:

There's no way currently to close a result set "midstream"

Since you have to read ALL rows, you will have to limit your query results using a clause like WHERE or LIMIT. Alternatively, try the following:

ResultSet rs = ...
while (rs.next()) {
   ...
   if (bailOut) { break; }
}

while (rs.next()); // drain the remaining rows from the stream so close() has nothing left to read

rs.close();

It may not be ideal, but at least it gets you past the hang on close.
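
For the LIMIT route mentioned above, one possible shape is to read the table in bounded pages, so that no single result set is ever large. The sketch below is an assumption-heavy illustration: the table big_table, its columns id and payload, and the names PagedReader, RowHandler, and readInPages are all hypothetical, and id is assumed to be a unique, indexed, positive key (as with AUTO_INCREMENT).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch: read a large table in LIMIT-sized pages, resuming after the last id seen. */
final class PagedReader {

    /** Hypothetical callback receiving the key and one column of each row. */
    interface RowHandler {
        void handle(long id, String payload) throws SQLException;
    }

    static long readInPages(Connection conn, int pageSize, RowHandler handler) throws SQLException {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Hypothetical table and column names.
        String sql = "SELECT id, payload FROM big_table WHERE id > ? ORDER BY id LIMIT ?";
        long lastId = 0; // assumes all ids are greater than zero
        long total = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            while (true) {
                ps.setLong(1, lastId);
                ps.setInt(2, pageSize);
                int rowsInPage = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) { // every page is read to the end before it is closed
                        lastId = rs.getLong(1);
                        handler.handle(lastId, rs.getString(2));
                        rowsInPage++;
                    }
                }
                total += rowsInPage;
                if (rowsInPage < pageSize) {
                    return total; // a short page means no rows are left
                }
            }
        }
    }
}
```

Each page holds at most pageSize rows in memory, and bailing out early only requires finishing the current page rather than the whole table.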


If you are using Spring JDBC, then you need to use a PreparedStatementCreator in conjunction with SimpleJdbcTemplate to set the fetch size to Integer.MIN_VALUE. It's described here: http://neopatel.blogspot.com/2012/02/mysql-jdbc-driver-and-streaming-large.html


It hangs because the query keeps running even after you stop reading from it. In order to close the ResultSet and Statement in the right order, try calling statement.cancel() first:

public void close() {
    try {
        if (statement != null)
            statement.cancel(); // ask the driver to cancel the running query before closing
        if (resultSet != null)
            resultSet.close();
    } catch (SQLException e) {
        // ignore errors on closing
    } finally {
        try {
            if (statement != null)
                statement.close();
        } catch (SQLException e) {
            // ignore errors on closing
        } finally {
            resultSet = null;
            statement = null;
        }
    }
}


Scrollable ResultSets ignore fetchSize and fetch all the rows at once, causing an out-of-memory error.

For me, it worked properly after setting useCursors=true. Otherwise the scrollable ResultSet ignores the fetch size entirely: in my case it was set to 5000, but the scrollable ResultSet fetched millions of records at once, causing excessive memory usage. The underlying DB is MS SQL Server, accessed through the jTDS driver:

jdbc:jtds:sqlserver://localhost:1433/ACS;TDS=8.0;useCursors=true
