I want to load large .DBF (Visual FoxPro) files into a DataTable. For smaller files (< 300 MB) a plain Fill works fine and runs fast, but for larger files I run out of memory and need to load them in smaller parts (rows 0...1000, then 1001...2000, and so on).
Based on some code I found on the internet I wrote the method below; the start parameter is the row to start reading from, and max is the number of rows I want to read.
The problem is that even if I only want to read 5 rows, it takes around 30-60 seconds on my machine, because the call to Command.ExecuteReader itself is very slow.
public DataTable LoadTable2(string folder, string table, int start, int max)
{
    string connectionString = "Provider=vfpoledb.1;Data Source=" + folder + "\\" + table;
    using (OleDbConnection connection = new OleDbConnection(connectionString))
    {
        connection.Open();
        string dataString = String.Format("Select * from {0}", table);
        using (OleDbCommand command = new OleDbCommand(dataString, connection))
        // Takes a very long time on large files.
        using (OleDbDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
        {
            DataTable dt = new DataTable(table);
            // Add the table columns.
            for (int i = 0; i < reader.FieldCount; i++)
                dt.Columns.Add(reader.GetName(i), reader.GetFieldType(i));

            int intIdx = 0;
            int cnt = 0;
            while (reader.Read())
            {
                // Skip rows before the requested window.
                if (intIdx >= start)
                {
                    DataRow r = dt.NewRow();
                    // Assign DataReader values to the DataRow.
                    for (int i = 0; i < reader.FieldCount; i++)
                        r[i] = reader[i];
                    dt.Rows.Add(r);
                    cnt++;
                }
                if (cnt >= max)
                    break;
                intIdx++;
            }
            return dt;
        }
    }
}
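For reference, this is roughly how I drive it to page through a whole file. The loop below is just an illustrative sketch; chunkSize and ProcessChunk are assumed names, not part of the method above:

// Illustrative paging loop: load the file in chunks of 1000 rows
// until a chunk comes back short.
int chunkSize = 1000;
int start = 0;
while (true)
{
    // Load the next slice of rows from the DBF.
    DataTable chunk = LoadTable2(folder, table, start, chunkSize);
    ProcessChunk(chunk);              // placeholder for whatever consumes the rows
    if (chunk.Rows.Count < chunkSize)
        break;                        // a short chunk means we reached the end
    start += chunkSize;
}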
I have tested with both OLE DB and ODBC connections, with no big difference. The files are all on a local disk.
Does anyone have a good idea for how to make this much faster?
Best regards, Anders
I believe that with that driver (VFPOLEDB) you can change your query to specify the record numbers of interest. That way there is no need to read through, and skip over, a bunch of records to get to the starting point; you just read the entire requested result set. The query might look like this:
SELECT * from thetable where recno() >= 5000 and recno() <= 5500
I realized that I have this driver installed and just tested it: it does work. However, I don't think it "optimizes" that statement. In theory it could compute the record offsets directly from the record numbers, but (based on a simple observation of a query against a larger .dbf) it appears to do a full table scan. With FoxPro, however, you could create an index on recno(), and then the query would be optimized.
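To make the idea concrete, here is a minimal sketch of a paged loader built around that recno() predicate. This is my illustration, not the poster's code: LoadTablePaged is an assumed name, I'm assuming the provider accepts positional ? parameters, and recno() is 1-based, so row index 0 maps to record 1:

// Hypothetical sketch: page through a VFP table by bounding recno(),
// so the reader only returns the requested slice.
public DataTable LoadTablePaged(string folder, string table, int start, int max)
{
    string connectionString = "Provider=vfpoledb.1;Data Source=" + folder + "\\" + table;
    using (OleDbConnection connection = new OleDbConnection(connectionString))
    using (OleDbCommand command = new OleDbCommand(
        String.Format("select * from {0} where recno() >= ? and recno() <= ?", table),
        connection))
    {
        // recno() is 1-based, so row index 0 corresponds to record 1.
        command.Parameters.AddWithValue("p1", start + 1);
        command.Parameters.AddWithValue("p2", start + max);
        connection.Open();
        DataTable dt = new DataTable(table);
        using (OleDbDataReader reader = command.ExecuteReader())
        {
            dt.Load(reader);   // builds columns and rows from the reader
        }
        return dt;
    }
}

As noted above, without an index the provider may still scan the table to evaluate recno(), but only the requested slice is materialized into the DataTable. If you have FoxPro available, the index could be created with something like INDEX ON RECNO() TAG rec_no (again, an assumption on my part).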