I am working in SQL Server 2008 R2 with a priority ordered set of content that must be assigned to a set of buckets to achieve a content specified value. Each item in the content list is related to nodes within a ragged tree hierarchy (the buckets). Each bucket has a value assigned to it and can hold a fixed quantity of content.
I am trying to allocate content in priority order to the buckets that they relate to (or any parent/grandparent up the tree from related content). I must start with the highest bucket value (with empty spaces) and stop only when the bucket values match or exceed my content value.
Hopefully my crude example will help. Assuming the B’s are buckets that can each hold 2 pieces of content and C’s are content. The bracketed numbers are the bucket value and required content value.
C1 would result in being allocated to B1 (highest value in B1’s tree) and B4 to give it a total value of 7. Both B1 an B4 now only have one slot remaining.
C2 would be allocated B1 and B5 leaving no slots in B1 and 1 slot in B2.
C3 would not be able to use B1 as there are no slots available, so would result in B2, B5 and B9 leaving no slots in B5 and one slot in B2 / B5.
And so on...
I can see how to achieve this iteratively by creating a list of all buckets and their relationship with all child / grand child buckets. Looping though content one at a time, assigning its' buckets and reducing the remaining bucket spaces. The reason I feel that it needs to be a loop is due to the unknown number of spaces remaining in each bucket based on processing all higher priority content.
But looping through content one at a time feels intrinsically wrong and there must be a more efficient way to solve this allocation problem – ideally in one pass…
Example SQL Server code (to match the above diagram)
--core table/fields
CREATE TABLE Bucket
(
Id int,
Name varchar(3),
BucketValue int,
SlotRemaining int --only required for my solution to hold number of slots left to fill
)
CREATE TABLE BucketParent
(
ChildBucketId int,
ParentBucketId int
)
CREATE TABLE Content
(
Id int,
Name varchar(3),
ContentValue int,
AllocationState int, --only required for my solution to identify content that still needs processing
--1=unprocessed, 2=Complete
Priority int --order to work through content 1=most imnportant
)
CREATE TABLE ContentBucket
(
ContentId int,
BucketId int
)
Go
CREATE TABLE ContentPriorityBucket -- table to record my allocation of content to the most valuable bucket
(
ContentId int,
BucketId int
)
Go
--test data to match example (wish id made it smaller now :)
INSERT INTO Bucket Values (1,'B1', 4, null)
INSERT INTO Bucket Values (2,'B2', 5, null)
INSERT INTO Bucket Values (3,'B3', 4, null)
INSERT INTO Bucket Values (4,'B4', 3, null)
INSERT INTO Bucket Values (5,'B5', 3, null)
INSERT INTO Bucket Values (6,'B6', 3, null)
INSERT INTO Bucket Values (7,'B7', 4, null)
INSERT INTO Bucket Values (8,'B8', 2, null)
INSERT INTO Bucket Values (9,'B9', 1, null)
INSERT INTO Bucket Values (10,'B10', 2, null)
INSERT INTO Bucket Values (11,'B11', 1, null)
INSERT INTO BucketParent Values (8, 4)
INSERT INTO BucketParent Values (4, 1)
INSERT INTO BucketParent Values (9, 5)
INSERT INTO BucketParent Values (5, 1)
INSERT INTO BucketParent Values (5, 2)
INSERT INTO BucketParent Values (10, 5)
INSERT INTO BucketParent Values (10, 6)
INSERT INTO BucketParent Values (6, 2)
INSERT INTO BucketParent Values (6, 3)
INSERT INTO BucketParent Values (11, 6)
INSERT INTO BucketParent Values (11, 7)
INSERT INTO BucketParent Values (7, 3)
INSERT INTO Content Values (1,'C1', 5, null, 1)
INSERT INTO Content Values (2,'C2', 8, null, 2)
INSERT INTO Content Values (3,'C3', 9, null, 3)
INSERT INTO Content Values (4,'C4', 10, null, 4)
INSERT INTO ContentBucket Values (1,8)
INSERT INTO ContentBucket Values (1,4)
INSERT INTO ContentBucket Values (2,9)
INSERT INTO ContentBucket Values (3,9)
INSERT INTO ContentBucket Values (4,10)
INSERT INTO ContentBucket Values (4,7)
GO
--Iterative solution that I am trying to improve on
UPDATE Bucket
SET SlotRemaining = 2 --clear previous run and allocate maximum bucket size
UPDATE Content
SET AllocationState = 1 --set state to unprocessed
--Clear last run
TRUNCATE Table ContentPriorityBucket
GO
DECLARE @ContentToProcess int = 0
DECLARE @CurrentContent int
DECLARE @CurrentContentValue int
SELECT @ContentToProcess = COUNT(id) FROM Content WHERE AllocationState =1
WHILE (@ContentToProcess > 0)
BEGIN
-- get next content to process
SELECT Top(1) @CurrentContent = ID,
@CurrentContentValue = ContentValue
FROM Content
WHERE AllocationState =1
ORDER BY Priority;
WITH BucketList (Id, BucketValue, SlotRemaining)
as
(
-- list buckets related to content
SELECT b.Id
,b.BucketValue
,b.SlotRemaining
FROM ContentBucket cb
INNER JOIN Bucket b on cb.BucketId = b.Id
WHERE cb.ContentId = @CurrentContent
-- need to pull back all buckets (even those that are full as they may have empty parents)
UNION ALL
SELECT b.Id
,b.BucketValue
,b.SlotRemaining
FROM BucketList bl
INNER JOIN BucketParent bp on bl.Id = bp.ChildBucketId
INNER JOIN Bucket b on bp.ParentBucketId = b.Id
),
DistinctBucketList (Id, BucketValue, SlotRemaining)
as
(
--dedupe buckets
SELECT distinct Id
, BucketValue
, SlotRemaining
FROM BucketList
),
BucketListOrdered (Id, BucketValue, RowOrder)
as
(
--order buckets
SELECT Id
,BucketValue
,ROW_NUMBER() OVER (ORDER BY BucketValue desc, Id)-- added id to get consistant result if two buckets have same value
FROM DistinctBucketList
WHERE SlotRemaining >0
),
CulmativeBucketListWithinRequiredValue (Id, RowOrder, CulmativeBucketValue, RequiredBucket)
as
(
-- this will mark all buckets up to the bucket value, but will be 1 bucket short
SELECT blo.Id
,blo.RowOrder
,SUM(blc.BucketValue) CulmativeBucketValue
,CASE
WHEN SUM(blc.BucketValue) <=@CurrentContentValue THEN 1
ELSE 0
END RequiredBucket
FROM BucketListOrdered blo
LEFT JOIN BucketListOrdered blc ON blc.RowOrder <= blo.RowOrder
GROUP BY blo.Id, blo.RowOrder
)
-- this will identify all buckets required to top content value
INSERT INTO Co开发者_StackOverflow社区ntentPriorityBucket
SELECT @CurrentContent
,b.Id
FROM CulmativeBucketListWithinRequiredValue b
WHERE b.RowOrder <= (SELECT Max(RowOrder) + 1 FROM CulmativeBucketListWithinRequiredValue WHERE RequiredBucket =1)
--reduce all used bucket sizes by 1 (could alternatively determine this from ContentPriorityBucket)
UPDATE Bucket
SET SlotRemaining = SlotRemaining -1
WHERE id in (SELECT BucketId FROM ContentPriorityBucket WHERE ContentId = @CurrentContent)
-- update processed bucket
UPDATE Content
SET AllocationState = 2
WHERE @CurrentContent = Id
SELECT @ContentToProcess = COUNT(id) FROM Content WHERE AllocationState =1
END
SELECT ContentId, BucketId FROM ContentPriorityBucket
/*
DROP TABLE Bucket
DROP TABLE BucketParent
DROP TABLE Content
DROP TABLE ContentBucket
DROP TABLE ContentPriorityBucket
*/
There are a couple points to make about this problem.
First, generalized bin-packing is a NP-Complete problem, and therefore cannot be solved in general in a single pass. This specific bin-packing, since it is an ordered packing, may be different, but the issue of the complexity of the problem remains; it's certainly not O(1), so it may need a loop no matter what.
1-pass non-looping solutions for this seem like they should not be possible; it looks like a problem that isn't made for set-based solutions. You could create a table-valued CLR function, which could find the bucket that each item fits into. Otherwise, keeping the looping solution would be fine. (If you post the code, it might be easier to see if there are improvements possible.)
精彩评论