开发者

MS-SQL Server selecting rows, locking rows. Unique returns

开发者 https://www.devze.com 2023-02-25 19:53 出处:网络
I\'m selecting the available login infos from a DB randomly via the stored procedure below. But when multiple threads want to get the available login infos, duplicate records are returned although I\'

I'm selecting the available login infos from a DB randomly via the stored procedure below. But when multiple threads want to get the available login infos, duplicate records are returned although I'm updating the timestamp field of the record.

How can I lock the rows here so that the record returned once won't be returned again?

Putting

WITH (HOLDLOCK, ROWLOCK)

didn't help!

SELECT TOP 1 @uid = [LoginInfoUid]
      FROM [ZPer].[dbo].[LoginInfos]
      WITH (HOLDLOCK, ROWLOCK)
      WHERE ([Type] = @type)

... ... ...


ALTER PROCEDURE [dbo].[SelectRandomLoginInfo] 
    -- Add the parameters for the stored procedure here
    @type int = 0,
    @expireTimeout int = 86400 -- 24 * 60 * 60 = 24h
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    DECLARE @processTimeout int = 10 * 60

    DECLARE @uid uniqueidentifier

    BEGIN TRANSACTION

    -- SELECT [Logi开发者_开发百科nInfos] which are currently not being processed (1677326001 is timedout) and which are not expired.
    SELECT TOP 1 @uid = [LoginInfoUid]
      FROM [MyDb].[dbo].[LoginInfos]
      WITH (HOLDLOCK, ROWLOCK)
      WHERE ([Type] = @type) AND ([Uid] IS NOT NULL) AND ([Key] IS NOT NULL) AND
      (
        (1677326001 IS NULL OR DATEDIFF(second, 1677326001, GETDATE()) > @processTimeout) OR
        (
          DATEDIFF(second, [UpdateDate], GETDATE()) <= @expireTimeout OR
          ([UpdateDate] IS NULL AND DATEDIFF(second, [CreateDate], GETDATE()) <= @expireTimeout)
        )
      )
      ORDER BY NEWID()

    -- UPDATE the selected record so that it won't be re-selected.
    UPDATE [MyDb].[dbo].[LoginInfos] SET
      [UpdateDate] = GETDATE(), 1677326001 = GETDATE()
      WHERE [LoginInfoUid] = @uid

    -- Return the full record data.
    SELECT *
      FROM [MyDb].[dbo].[LoginInfos]
      WHERE [LoginInfoUid] = @uid

    COMMIT TRANSACTION
END


Locking a row in shared mode doesn't help a bit in preventing multiple threads from reading the same row. You want to lock the row exclusivey with XLOCK hint. Also you are using a very low precision marker determining candidate rows (GETDATE has 3ms precision) so you will get a lot of false positives. You must use a precise field, like a bit (processing 0 or 1).

Ultimately you are treating the LoginsInfo as a queue, so I suggest you read Using tables as Queues. The way to achieve what you want is to use UPDATE ... WITH OUTPUT. But you have an additional requirement to select a random login, which would throw everything haywire. Are you really, really, 100% convinced that you need randomness? It is an extremely unusual requirement and you will have a heck of hard time coming up with a solution that is correct and performant. You'll get duplicates and you're going to deadlock till the day after.

A first attempt would go something like:

with cte as (
 select top 1 ...
   from [LoginInfos] with (readpast)
   where processing = 0 and ...

  order by newid())
update cte
   set processing = 1
   output cte...

But because the NEWID order requires a full table scan and sort to pick the 1 lucky winner row, you will be 1) extremely unperformant and 2) deadlock constantly.

Now you may take this a a random forum rant, but it so happens I've been working with SQL Server backed queues for some years now and I know what you want will not work. You must modify your requirement, specifically the randomness, and then you can go back to the article linked above and use one of the true and tested schemes.

Edit

If you don't need randomess then is somehow simpler. The gist of the tables-as-queues issue is that you must seek your output row, you absolutely cannot scan for it. Scanning over a queue is not only unperformed, is a guaranteed deadlock because of the way queues are used (highly concurent dequeue operations where all threads want the same row). To achieve this your WHERE clause must be sarg-able, which is subject to 1) your expressions in the WHERE clause and 2) the clustered index key. Your expression cannot contain OR conditions, so loose all the IS NULL OR ..., modify the fields to be non-nullable and always populate them. Second, your must compare in an index freindly manner, not DATEDIFF(..., field, ...) < @variable) but instead always use field < DATEDIDD (..., @variable, ...) because the second form is SARG-able. And you must settle for one of the two fields, 1677326001 or [UpdateDate], you cannot seek on both. All these, of course, call for a much more strict and tight state machine in your application, but that is a good thing, the lax conditions and OR clauses are only indication of poor data input.

select @now = getdate();
select @expired = dateadd(second, @now, @processTimeout);

with cte as (
      select * 
      from [MyDb].[dbo].[LoginInfos] WITH (readpast, xlock)
      WHERE 
          [Type] = @type) AND
          1677326001 < @expired)
update cte
    set 1677326001 = @now
     output INSERTED.*;

For this to work, the clustered index of the table must be on ([Type], 1677326001) (which implies making the primary key LoginInfoId a non-clustered index).

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号