
SqlBulkCopy is slow, doesn't utilize full network speed


For the past couple of weeks I have been creating a generic script that is able to copy databases. The goal is to be able to specify any database on some server and copy it to some other location, copying only the specified content. The exact content to be copied over is specified in a configuration file. This script is going to be used on some 10 different databases and run weekly. In the end we are copying only about 3%-20% of each database, and the databases are as large as 500GB. I have been using the SMO assemblies to achieve this. This is my first time working with SMO and it took a while to create a generic way to copy the schema objects, filegroups, etc. (it actually helped find some bad stored procs).

Overall I have a working script which is lacking in performance (and at times times out), and I was hoping you would be able to help. When executing the WriteToServer command to copy a large amount of data (> 6GB), it reaches my timeout period of 1 hour. Here is the core code for copying table data. The script is written in PowerShell.

$query = ("SELECT * FROM $selectedTable " + $global:selectiveTables.Get_Item($selectedTable)).Trim()
Write-LogOutput "Copying $selectedTable : '$query'"            
$cmd = New-Object Data.SqlClient.SqlCommand -argumentList $query, $source
$cmd.CommandTimeout = 120;
$bulkData = ([Data.SqlClient.SqlBulkCopy]$destination)
$bulkData.DestinationTableName = $selectedTable;
$bulkData.BulkCopyTimeout = $global:tableCopyDataTimeout # = 3600
$reader = $cmd.ExecuteReader();
$bulkData.WriteToServer($reader); # Takes forever here on large tables

The source and target databases are located on different servers, so I kept track of the network speed as well. The network utilization never went over 1%, which was quite surprising to me. But when I just transfer some large files between the servers, the network utilization spikes up to 10%. I have tried setting $bulkData.BatchSize to 5000 but nothing really changed. Increasing the BulkCopyTimeout to an even greater amount would only solve the timeout. I really would like to know why the network is not being used fully.
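For reference, this is roughly how the copy loop looks with the extra SqlBulkCopy knobs exposed (a sketch only; the variable names follow the script above, EnableStreaming requires .NET 4.5, and the numeric values are placeholders to experiment with):

$cmd = New-Object Data.SqlClient.SqlCommand -argumentList $query, $source
$cmd.CommandTimeout = 120
$bulkData = New-Object Data.SqlClient.SqlBulkCopy($destination)
$bulkData.DestinationTableName = $selectedTable
$bulkData.BulkCopyTimeout = $global:tableCopyDataTimeout
$bulkData.BatchSize = 10000              # rows per batch; worth trying different values per table
$bulkData.EnableStreaming = $true        # stream rows from the reader instead of buffering (.NET 4.5+)
$bulkData.NotifyAfter = 50000            # raise SqlRowsCopied every 50,000 rows to see where time goes
$bulkData.add_SqlRowsCopied({ param($s, $e) Write-LogOutput "$selectedTable : $($e.RowsCopied) rows copied" })
$reader = $cmd.ExecuteReader()
try { $bulkData.WriteToServer($reader) } finally { $reader.Close(); $bulkData.Close() }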

Has anyone else had this problem? Any suggestions on networking or bulk copy would be appreciated. And please let me know if you need more information.

Thanks.

UPDATE

I have tweaked several options that increase the performance of SqlBulkCopy, such as switching the target database to the simple recovery model (to reduce transaction logging) and having SqlBulkCopy take a table lock instead of the default row locks. Also, some tables are better optimized for certain batch sizes. Overall, the duration of the copy was decreased by some 15%. What we will also do is execute the copy of each database simultaneously on different servers. But I am still having a timeout issue when copying one of the databases.
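For completeness, this is roughly what those two tweaks look like in code (a sketch only; $targetDatabase, $destConn and $global:batchSizes are placeholder names, and the recovery model change needs appropriate permissions on the target server):

# Put the target database into the SIMPLE recovery model so the bulk load is logged minimally
$destConn = New-Object Data.SqlClient.SqlConnection($destination)
$destConn.Open()
$alter = New-Object Data.SqlClient.SqlCommand("ALTER DATABASE [$targetDatabase] SET RECOVERY SIMPLE", $destConn)
[void]$alter.ExecuteNonQuery()

# TableLock takes a bulk-update lock on the whole table instead of the default row locks
$options = [Data.SqlClient.SqlBulkCopyOptions]::TableLock
$bulkData = New-Object Data.SqlClient.SqlBulkCopy($destConn, $options, $null)
$bulkData.DestinationTableName = $selectedTable
$bulkData.BatchSize = $global:batchSizes.Get_Item($selectedTable)   # per-table batch size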

When copying one of the larger databases, there is a table for which I consistently get the following exception:

System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. 

It is thrown about 16 minutes after it starts copying the table, which is nowhere near my BulkCopyTimeout. Even though I get the exception, that table is fully copied in the end. Also, if I truncate that table and restart my process for that table only, the table is copied over without any issues. But going through the process of copying that entire database always fails for that one table.

I have tried executing the entire process and resetting the connection before copying that faulty table, but it still errored out. My SqlBulkCopy and Reader are closed after each table. Any suggestions as to what else could be causing the script to fail at that point each time?
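For what it's worth, the per-table isolation I am describing looks roughly like this (a sketch only; $tablesToCopy and the connection-string variables are placeholders): a fresh connection per table, everything disposed explicitly, and both timeouts set to 0 (unlimited) so it is clear which side is actually giving up.

foreach ($selectedTable in $tablesToCopy) {
    $srcConn = New-Object Data.SqlClient.SqlConnection($sourceConnectionString)
    $srcConn.Open()
    $cmd = New-Object Data.SqlClient.SqlCommand("SELECT * FROM $selectedTable", $srcConn)
    $cmd.CommandTimeout = 0                                       # 0 = no limit on the read side
    $bulkData = New-Object Data.SqlClient.SqlBulkCopy($destinationConnectionString)
    $bulkData.DestinationTableName = $selectedTable
    $bulkData.BulkCopyTimeout = 0                                 # 0 = no limit on the write side
    $reader = $cmd.ExecuteReader()
    try     { $bulkData.WriteToServer($reader) }
    finally { $reader.Dispose(); $bulkData.Close(); $cmd.Dispose(); $srcConn.Dispose() }
}

Here is the definition of the problem table: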

CREATE TABLE [dbo].[badTable](
[someGUID] [uniqueidentifier] NOT NULL,
[xxx] [uniqueidentifier] NULL,
[xxx] [int] NULL,
[xxx] [tinyint] NOT NULL,
[xxx] [datetime] NOT NULL,
[xxx] [datetime] NOT NULL,
[xxx] [datetime] NOT NULL,
[xxx] [datetime] NULL,
[xxx] [uniqueidentifier] NOT NULL,
[xxx] [uniqueidentifier] NULL,
CONSTRAINT [PK_badTable] PRIMARY KEY NONCLUSTERED 
(
[someGUID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

No indexes exist for this table on the target DB.


Have you considered removing indexes, doing the insert, and then reindexing?
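If it helps, something along these lines would do that around the bulk copy (a sketch; the connection variable is a placeholder, and note that disabling an index that backs a PRIMARY KEY also disables the constraint until the rebuild):

$destConn = New-Object Data.SqlClient.SqlConnection($destinationConnectionString)
$destConn.Open()
$disable = New-Object Data.SqlClient.SqlCommand("ALTER INDEX ALL ON [dbo].[badTable] DISABLE", $destConn)
[void]$disable.ExecuteNonQuery()       # disable indexes before the load

# ... run the SqlBulkCopy for the table here ...

$rebuild = New-Object Data.SqlClient.SqlCommand("ALTER INDEX ALL ON [dbo].[badTable] REBUILD", $destConn)
[void]$rebuild.ExecuteNonQuery()       # rebuild indexes after the load
$destConn.Dispose()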


I've used a dataset and wonder if this would be faster:

$ds = New-Object System.Data.DataSet
$da = New-Object System.Data.SqlClient.SqlDataAdapter($cmd)
[void]$da.Fill($ds)                        # buffers the entire result set in memory
$bulkData.WriteToServer($ds.Tables[0])     # write the buffered table to the destination


SqlBulkCopy is by far the fastest way of copying data into SQL tables.
You should be getting speeds in excess of 10,000 rows per second.
In order to test the bulk copy functionality, try DBSourceTools (http://dbsourcetools.codeplex.com).
This utility is designed to script databases to disk, and then re-create them on a target server.
When copying data, DBSourceTools will first export all data to a local .xml file, and then do a bulk copy to the target database.
This will help to further identify where your bottleneck is, by breaking the process up into two passes: one for reading and one for writing.
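The same two-pass idea can be sketched directly in PowerShell (this is not what DBSourceTools actually does internally; the file path and variable names are placeholders, and note that it buffers the whole table in memory):

# Pass 1: read from the source and persist to disk
$da = New-Object Data.SqlClient.SqlDataAdapter("SELECT * FROM $selectedTable", $sourceConnectionString)
$dt = New-Object Data.DataTable
[void]$da.Fill($dt)
$dt.TableName = $selectedTable
$dt.WriteXml("C:\temp\$selectedTable.xml", [Data.XmlWriteMode]::WriteSchema)

# Pass 2: load the file back and bulk copy it to the target
$dt2 = New-Object Data.DataTable
[void]$dt2.ReadXml("C:\temp\$selectedTable.xml")
$bulkData = New-Object Data.SqlClient.SqlBulkCopy($destinationConnectionString)
$bulkData.DestinationTableName = $selectedTable
$bulkData.WriteToServer($dt2)

Timing each pass separately will tell you whether the slow part is reading from the source or writing to the target.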
