
Slow INSERTs from SQL Server to MySQL Linked Server

devze.com — https://www.devze.com — 2023-02-11 00:10 — Source: web

We're doing a sort of roll-your-own replication from a SQL Server 2008 R2 database to a MySQL 5.1 database. This isn't true replication, as the MySQL box is acting as permanent storage and the SQL Server is what we're forced to use as a data aggregator.

So, I've written a stored procedure that checks the latest row inserted on my table in question on the MySQL database, populates a temporary table on the SQL Server end with a new batch of rows to insert back on the MySQL box, and does the inserts. The problem is, this is painfully slow. I have a backlog of >24 million rows to send from the SQL Server box to MySQL, and at the rate I'm getting <2 rows/second, it's going to take nearly 6 months to complete and will never be able to keep up with new data as it hits the SQL Server database.

The boxes are not on the same network -- the MySQL box sits pretty darn near the country's connection to the Internet backbone, but the SQL Server box is (for reasons beyond our control) only connected through a business DSL line (I'm not positive of the upstream speed). If I connect through the command line (as opposed to querying through the linked server in SQL Server), I'm able to insert rows into the MySQL database at an average of ~0.03 seconds per row. This leads me to believe that either my stored procedure is incredibly inefficient, or OPENQUERY/linked servers are just inherently slow. Here is the stored procedure:

DECLARE @LastSensorLogDateFormatted DATETIME
DECLARE @LastSensorLogDate VARCHAR(30)
DECLARE @LastSensorLogMillis INT
DECLARE @LastSensorLogEibaddress VARCHAR(30)
DECLARE @SensorLogReplicated BIGINT
DECLARE @counter INT

-- Create a temporary table to store last write 
-- to MySQL database on CASALA-DB01
CREATE TABLE #SensorLogRecord 
  (LastSensorLogDate VARCHAR(30)
  , LastSensorLogMillis INT
  , LastSensorLogEibaddress VARCHAR(30))            

-- Dump result of query on CASALA-DB01 into our temporary table
INSERT INTO #SensorLogRecord
  (LastSensorLogDate, LastSensorLogMillis, LastSensorLogEibaddress )
  (SELECT  date, date_millis, eib_address
  FROM 
    OPENQUERY(MYSQL4, 'SELECT date, date_millis, eib_address 
    FROM cabie.sensors_log_redux ORDER BY date desc LIMIT 1'))

-- Store the last sensor log date and EIB address 
-- from our temporary table into local vars
SELECT 
  @LastSensorLogDate = LastSensorLogDate, 
  @LastSensorLogMillis = LastSensorLogMillis, 
  @LastSensorLogEibaddress = LastSensorLogEibaddress 
FROM #SensorLogRecord

SET @LastSensorLogDateFormatted = 
    CAST((LEFT(@LastSensorLogDate, 20) + 
    CAST(@LastSensorLogMillis as VARCHAR)) AS DATETIME)

SET @counter = 0            
WHILE (1=1)
  BEGIN
    CREATE TABLE #RecordHolder (Id BIGINT)

    INSERT INTO #RecordHolder (Id)
      SELECT TOP 1000 Sensor_Id FROM dbo.Sensors_Log
      WHERE Sensor_Id NOT IN (SELECT Row_Id FROM dbo.Sensors_Archivals)
      AND dbo.Sensors_Log.Date <= GETDATE()
      AND dbo.Sensors_Log.EibAddress <> @LastSensorLogEibaddress
      AND dbo.Sensors_Log.Date <> @LastSensorLogDateFormatted

  INSERT OPENQUERY(MYSQL4, 
    'SELECT date, eib_address, ip_address, value, application, phys_address 
    FROM sensors_log_redux_holding')        
  SELECT 
    CONVERT(VARCHAR, GNH.dbo.Sensors_Log.Date,121),
    GNH.dbo.Sensors_Log.EibAddress,
    GNH.dbo.Sensors_Log.Ip_Address,
    GNH.dbo.Sensors_Log.Value,
    GNH.dbo.Sensors_Log.Application,
    GNH.dbo.Sensors_Log.Phys_Address 
  FROM GNH.dbo.Sensors_Log 
  JOIN #RecordHolder  
    ON (#RecordHolder.Id = GNH.dbo.Sensors_Log.Sensor_Id)

  INSERT INTO dbo.Sensors_Archivals (Row_Id) SELECT Id FROM #RecordHolder

  -- Advance the counter by the number of rows archived in this batch
  -- (without this, the loop's exit condition can never be reached)
  SET @counter = @counter + @@ROWCOUNT

  DROP TABLE #RecordHolder  

  IF (@counter >= 1000000)
   BREAK
END

I know that's messy... we've been trying different approaches to get this to work, so there are probably unused variables, etc. For what it's worth, just running that first SELECT query over the linked server takes upwards of 40 seconds.

EDIT: Here's my linked server setup:

  • Provider: Microsoft OLE DB Provider for ODBC Drivers
  • Product Name: MySQL 5.1
  • Provider String: DRIVER={MySQL ODBC 5.1 Driver};SERVER=192.168.17.5;DATABASE=cabie;USERNAME=ourUsername;PASSWORD=ourPassword
  • Advanced Options:
    • Collation Compatible: FALSE
    • Data Access: TRUE
    • RPC: FALSE
    • RPC Out: FALSE
    • Use Remote Collation: TRUE
    • Collation Name: (Blank)
    • Connection Timeout: 0
    • Query Timeout: 0
    • Distributor: FALSE
    • Publisher: FALSE
    • Subscriber: FALSE
    • Lazy Schema Validation: FALSE
    • Enable Promotion of Distributed Transactions: TRUE


Here's part of the problem...

# Provider: Microsoft OLE DB Provider for ODBC Drivers
# Provider String: DRIVER={MySQL ODBC 5.1 Driver};...

ODBC is famously slow and you are using it on both ends.

An alternative approach would be to write a bridge application in Delphi or C# that connects to each database using its native client DLLs.

I know that Delphi with the open-source ZEOS database components, using the client DLLs freely available from Microsoft and MySQL, will transfer data many times faster than ODBC does.

C#, or any modern programming language that uses the native client DLLs, will also do the job. If you use an in-between program to bridge the two databases, make sure it does not fall back to ODBC: many (mainly older) programming environments do this out of laziness or cheapness, with the same poor transfer speeds you're getting now.

Hope this helps...
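For completeness, if the linked-server route has to stay: `INSERT OPENQUERY` tends to issue one remote round trip per row, so shipping each batch as a single statement with `EXEC ... AT` can cut the per-row latency dramatically. A minimal sketch, assuming the `MYSQL4` linked server and the `#RecordHolder` batch table from the question; the column list and quoting are illustrative and untested against this schema, and note that `EXEC ... AT` requires the linked server's `RPC Out` option, which is `FALSE` in the setup shown above:

```sql
-- Hedged sketch (not from the original post): enable outbound RPC once,
-- then ship each batch as ONE statement instead of one round trip per row.
EXEC sp_serveroption 'MYSQL4', 'rpc out', 'true';

-- Build a single multi-row INSERT for the current #RecordHolder batch.
-- Real code must escape any quotes inside the data values.
DECLARE @sql NVARCHAR(MAX);
SELECT @sql = N'INSERT INTO cabie.sensors_log_redux_holding
  (date, eib_address, ip_address, value, application, phys_address) VALUES ' +
  STUFF((SELECT ',(''' + CONVERT(VARCHAR, SL.Date, 121) + ''',''' +
                SL.EibAddress + ''',''' + SL.Ip_Address + ''',''' +
                CAST(SL.Value AS VARCHAR(30)) + ''',''' +
                SL.Application + ''',''' + SL.Phys_Address + ''')'
         FROM GNH.dbo.Sensors_Log SL
         JOIN #RecordHolder RH ON RH.Id = SL.Sensor_Id
         FOR XML PATH('')), 1, 1, '');

-- One network round trip for the whole 1000-row batch.
EXEC (@sql) AT MYSQL4;
```

Even over a slow DSL uplink, collapsing 1000 single-row round trips into one statement usually changes throughput by orders of magnitude; the same batching idea applies if the bridge-application route is taken instead.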
