What's a good alternative to firing a stored procedure 368 times to update the database?

I'm working on a .NET component that gets a set of data from the database, performs some business logic on that set of data, and then updates single records in the database via a stored procedure that looks something like spUpdateOrderDetailDiscountedItem.

For small sets of data, this isn't a problem, but when I had a very large set of data that required an iteration of 368 stored proc calls to update the records in the database, I realized I had a problem. A senior dev looked at my stored proc code and said it looked fine, but now I'd like to explore a better method for sending "batch" data to the database.

What options do I have for updating the database in batch? Is this possible with stored procs? What other options do I have?

I won't have the option of installing a full-fledged ORM, but any advice is appreciated.


Additional Background Info:

Our current data access model was built 5 years ago and all calls to the db currently get executed via modular/static functions with names like ExecQuery and GetDataTable. I'm not certain that I'm required to stay within that model, but I'd have to provide a very good justification for going outside of our current DAL to get to the DB.

Also worth noting, I'm fairly new when it comes to CRUD operations and the database. I much prefer to play/work in the .NET side of code, but the data has to be stored somewhere, right?


Stored Proc contents:

ALTER PROCEDURE [dbo].[spUpdateOrderDetailDiscountedItem]
    -- Add the parameters for the stored procedure here
    @OrderDetailID decimal = 0,
    @Discount money = 0,
    @ExtPrice money = 0,
    @LineDiscountTypeID int = 0,
    @OrdersID decimal = 0,
    @QuantityDiscounted money = 0,
    @UpdateOrderHeader int = 0,
    @PromoCode varchar(6) = '',
    @TotalDiscount money = 0

AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Update the single detail record
    Update OrderDetail
    Set Discount = @Discount, ExtPrice = @ExtPrice, LineDiscountTypeID = @LineDiscountTypeID, LineDiscountPercent = @QuantityDiscounted
    From OrderDetail with (nolock)
    Where OrderDetailID = @OrderDetailID

    if @UpdateOrderHeader = -1
      Begin
        -- This should get called the last time this query is executed, but only then.
        exec spUpdateOrdersHeaderForSkuGroupSourceCode @OrdersID, 7, 0, @PromoCode, @TotalDiscount
      End
END


If you are using SQL 2008, then you can use a table-valued parameter to push all of the updates in one stored proc call.

Update: Incidentally, we are using this in combination with the MERGE statement, so SQL Server takes care of figuring out whether we are inserting new records or updating existing ones. This mechanism is used at several major locations in our web app and handles hundreds of changes at a time. Under regular load we see this proc called around 50 times a second, and it is MUCH faster than any other way we've found... and certainly a LOT cheaper than buying bigger DB servers.
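
For reference, a minimal sketch of the TVP approach (SQL Server 2008+ only; the table type dbo.OrderDetailDiscountUpdate and the batch proc name below are hypothetical, modeled on the OP's columns):

// Hypothetical server-side setup (SQL Server 2008+):
//
//   CREATE TYPE dbo.OrderDetailDiscountUpdate AS TABLE
//     (OrderDetailID decimal PRIMARY KEY, Discount money,
//      ExtPrice money, LineDiscountTypeID int, QuantityDiscounted money);
//
//   CREATE PROCEDURE dbo.spUpdateOrderDetailDiscountedItemBatch
//     @Updates dbo.OrderDetailDiscountUpdate READONLY
//   AS
//     MERGE OrderDetail AS t
//     USING @Updates AS s ON t.OrderDetailID = s.OrderDetailID
//     WHEN MATCHED THEN UPDATE SET
//       t.Discount = s.Discount, t.ExtPrice = s.ExtPrice,
//       t.LineDiscountTypeID = s.LineDiscountTypeID,
//       t.LineDiscountPercent = s.QuantityDiscounted;

using System.Data;
using System.Data.SqlClient;

static void UpdateDiscountsBatch(string connectionString, DataTable updates)
{
    // 'updates' columns must match the table type, in the same order.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.spUpdateOrderDetailDiscountedItemBatch", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        SqlParameter p = cmd.Parameters.AddWithValue("@Updates", updates);
        p.SqlDbType = SqlDbType.Structured;            // pass the DataTable as a TVP
        p.TypeName = "dbo.OrderDetailDiscountUpdate";  // hypothetical type name
        conn.Open();
        cmd.ExecuteNonQuery();                         // one round trip for all rows
    }
}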


An easy alternative I've seen in use is to build a single SQL string consisting of EXEC calls to the sproc, with the parameter values inlined in the string. I'm not sure whether this is advisable, but from the .NET perspective you are only populating one SqlCommand and calling ExecuteNonQuery once...

Note: if you choose this, then please, please use a StringBuilder! :-)
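
A rough sketch of that approach (hedged: detailRows and connectionString are assumed to already exist in your DAL, and inlining values is only safe when they are known to be numeric):

using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.Text;

// Build one text batch of EXEC statements instead of 368 round trips.
// Only inline values known to be numeric; string values would need
// escaping to avoid SQL injection.
var sql = new StringBuilder();
foreach (DataRow row in detailRows.Rows)   // detailRows: your in-memory set
{
    sql.AppendFormat(CultureInfo.InvariantCulture,
        "EXEC spUpdateOrderDetailDiscountedItem @OrderDetailID = {0}, @Discount = {1}, @ExtPrice = {2};",
        row["OrderDetailID"], row["Discount"], row["ExtPrice"]);
    sql.AppendLine();
}

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sql.ToString(), conn))
{
    cmd.CommandType = CommandType.Text;
    conn.Open();
    cmd.ExecuteNonQuery();   // one round trip for the whole batch
}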

Update: I much prefer Chris Lively's answer; I didn't know about table-valued parameters until now... unfortunately, the OP is on SQL Server 2005.


You can send the full set of data as XML input to the stored procedure, then perform set-based operations inside the proc to modify the database. Set-based will beat RBAR (row-by-agonizing-row) on performance almost every single time.
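
A hedged sketch of the XML approach (SQL Server 2005's xml type; the proc name, XML shape, and detailRows/connectionString are invented for the example):

using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.Text;

// Serialize the whole set of updates to XML and hand it to one proc call.
var xml = new StringBuilder("<Updates>");
foreach (DataRow row in detailRows.Rows)   // detailRows: your in-memory set
    xml.AppendFormat(CultureInfo.InvariantCulture,
        "<U id=\"{0}\" discount=\"{1}\" extPrice=\"{2}\" />",
        row["OrderDetailID"], row["Discount"], row["ExtPrice"]);
xml.Append("</Updates>");

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.spUpdateOrderDetailDiscountedItemXml", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@Updates", SqlDbType.Xml).Value = xml.ToString();
    conn.Open();
    cmd.ExecuteNonQuery();
}

// The hypothetical proc shreds the XML with one set-based UPDATE:
//
//   UPDATE od
//   SET od.Discount = x.u.value('@discount', 'money'),
//       od.ExtPrice = x.u.value('@extPrice', 'money')
//   FROM OrderDetail od
//   JOIN @Updates.nodes('/Updates/U') AS x(u)
//     ON od.OrderDetailID = x.u.value('@id', 'decimal(18,0)')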


If you are using a version of SQL Server prior to 2008, you can move your code entirely into the stored procedure itself.

There are good and "bad" things about this.
Good

  • No need to pull the data across a network wire.
  • Faster if your logic is set-based
  • Scales up

Bad

  • If you have rules against any logic in the database, this would break your design.
  • If the logic cannot be expressed set-based, then you might end up with a different set of performance problems
  • If you have outside dependencies, this might increase difficulty.

Without details on exactly what operations you are performing on the data it's hard to give a solid recommendation.

UPDATE
Ben asked what I meant in one of my comments about the CLR and SQL Server. Read Using CLR Integration in SQL Server 2005. The basic idea is that you can write .NET code to do your data manipulation and have that code live inside SQL Server itself. This saves you from having to read all of the data across the network and send the updates back the same way.

The code is callable from your existing procs and gives you the entire power of .NET, so you don't have to resort to things like cursors. The SQL stays set-based while the .NET code can operate on individual records.

Incidentally, this is how things like hierarchyid were implemented in SQL 2008.

The only real downside is that some DBAs don't like to introduce developer code like this into the database server. So depending on your environment, this may not be an option. However, if it is, then it is a very powerful way to take care of your problem while leaving the data and processing within your database server.
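
For concreteness, a minimal SQLCLR sketch (the class, method, and the UPDATE inside are invented for illustration; the [SqlProcedure] attribute and the "context connection" are the real SQLCLR mechanisms):

using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public static class DiscountProcs
{
    // Deployed via CREATE ASSEMBLY and CREATE PROCEDURE ... EXTERNAL NAME,
    // after which it is callable from T-SQL like any other proc.
    [SqlProcedure]
    public static void ApplyDiscounts()
    {
        // The context connection runs in-process on the server itself:
        // no network hop, and it joins the caller's transaction.
        using (var conn = new SqlConnection("context connection=true"))
        {
            conn.Open();
            // Arbitrary .NET logic can run here per record; this sketch
            // just issues one set-based statement for illustration.
            var cmd = new SqlCommand(
                "UPDATE OrderDetail SET Discount = 0 WHERE ExtPrice = 0", conn);
            cmd.ExecuteNonQuery();
        }
    }
}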


Could you create a batched statement with 368 calls to your proc? Then at least you will not have 368 round trips. Pseudo-code:

var lotsOfCommands = "EXEC spUpdateOrderDetailDiscountedItem 1; EXEC spUpdateOrderDetailDiscountedItem 2; EXEC spUpdateOrderDetailDiscountedItem ... 368";

var command = new SqlCommand(lotsOfCommands, connection);
command.CommandType = CommandType.Text;

// execute the whole batch in one round trip
command.ExecuteNonQuery();


I had issues when trying to do the same thing (via inserts, updates, whatever). While using an OleDbCommand with parameters, it took a bunch of time to re-create the object and its parameters each time I called it. So I made the command a member of my class and added the appropriate parameters to it once. Then, when I needed to actually execute it, I would loop through each parameter, set it to whatever I needed, and execute the command. This created a SIGNIFICANT performance improvement. Pseudo-code of my operation:

using System.Data.OleDb;

protected OleDbCommand oSQLInsert = new OleDbCommand();

// The "?" are positional place-holders; OleDb binds parameters by position,
// so the names below are for readability only.
oSQLInsert.CommandText = "insert into MyTable ( fld1, fld2, fld3 ) values ( ?, ?, ? )";

// Add the parameters once, up front.
oSQLInsert.Parameters.Add(new OleDbParameter("parmFld1", 0));
oSQLInsert.Parameters.Add(new OleDbParameter("parmFld2", "something"));
oSQLInsert.Parameters.Add(new OleDbParameter("parmFld3", 0));

Now the SQL command and the place-holders for the call are all ready to go. When I'm ready to actually call it, I do something like:

oSQLInsert.Parameters[0].Value = 123;
oSQLInsert.Parameters[1].Value = "New Value";
oSQLInsert.Parameters[2].Value = 3;

Then just execute it. Re-creating your commands and parameters over and over across hundreds of calls is what kills your time.

good luck.


Is this a one-time action (like "just import those 368 new customers once") or do you regularly have to do 368 sproc calls?

If it's a one-time action, just go with the 368 calls.
(if the sproc does much more than just updates and is likely to drag down the performance, run it in the evening or at night or whenever no one's working).

IMO, premature optimization of database calls for one-time actions is not worth the time you spend on it.


Bulk CSV Import

Build the data output as CSV via a StringBuilder, then do a bulk CSV import (BULK INSERT):

http://msdn.microsoft.com/en-us/library/ms188365.aspx
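
A hedged sketch of that (the file path, staging table dbo.DiscountStaging, columns, and detailRows/connectionString are all invented; note BULK INSERT reads the file from the SQL Server machine, so the path must be reachable by the server):

using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.IO;
using System.Text;

// 1) Write the updates out as CSV via a StringBuilder.
var csv = new StringBuilder();
foreach (DataRow row in detailRows.Rows)   // detailRows: your in-memory set
    csv.AppendLine(string.Format(CultureInfo.InvariantCulture, "{0},{1},{2}",
        row["OrderDetailID"], row["Discount"], row["ExtPrice"]));
File.WriteAllText(@"\\server\share\discounts.csv", csv.ToString());

// 2) Bulk-load it into a staging table, then update set-based.
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(@"
    BULK INSERT dbo.DiscountStaging
    FROM '\\server\share\discounts.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

    UPDATE od
    SET od.Discount = s.Discount, od.ExtPrice = s.ExtPrice
    FROM OrderDetail od
    JOIN dbo.DiscountStaging s ON s.OrderDetailID = od.OrderDetailID;", conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();
}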


Table-valued parameters would be best, but since you're on SQL Server 2005, you can use the SqlBulkCopy class to insert batches of records. In my experience, this is very fast.
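
A minimal sketch (the staging table dbo.DiscountStaging and detailRows/connectionString are invented; SqlBulkCopy only inserts, so you still finish with one set-based UPDATE or proc call against the staging rows):

using System.Data;
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();

    // 1) Stream the modified rows into a staging table in one shot.
    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.DiscountStaging"; // hypothetical
        bulk.WriteToServer(detailRows);                    // your DataTable
    }

    // 2) One set-based UPDATE joins staging onto the real table.
    using (var cmd = new SqlCommand(@"
        UPDATE od
        SET od.Discount = s.Discount, od.ExtPrice = s.ExtPrice
        FROM OrderDetail od
        JOIN dbo.DiscountStaging s ON s.OrderDetailID = od.OrderDetailID;", conn))
    {
        cmd.ExecuteNonQuery();
    }
}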
