In a system with multiple concurrent tasks operating on data, I want to order the tasks such that 开发者_StackOverflow中文版there is a minimum amount of waiting time involved. Each task in the system uses a certain set of resources, tasks are issued in a certain order (this order is what I want to calculate), a task will not start until it can get a lock on all required resources. Tasks are issued sequentially, so the 3rd task will not start until the second one has acquired all locks, and so on.
Task 1, Resources [A, B]
Task 2, Resources [B, C]
Task 3, Resources [C, D]
Task 4, Resources [E]
Best Solution
Task 1, [A, B]
Task 3, [C, D] //No waiting is possibly required
Task 4, [E] //Put this before task 3, to maximise the distance between uses of the same resource (minimise chances of lock contention)
Task 2, [B, C] //Some waiting *might* be required here
What algorithm could be used to calculate the optimal ordering such that there is the maximum gap between a resource being used, and then used again?
Nb. This is language agnostic, but bonus points for implementations in C#
Edit: The objective function given is non-linear, as pointed out by Moron in commmets. Hence this answer can not be used.
One possible approach would be to solve it using linear programming. Here is what I had in mind. Introduce a decision variable T_i_j that is set to 1 if we start task i at time j (I will be counting the tasks from 0 to 3). Furthermore we want to "punish" scheduling tasks close to each other, if they need the same resources. In the example given we want to punish T0_m and T1_n based on the difference between m and n, 3. We can then model the problem as follows
Minimize:
3 * T0_0 * T1_1 + 2 * T0_0 * T1_2 + 1 * T0_0 * T1_3
+ 3 * T0_1 * T1_2 + 2 * T0_1 * T1_3
+ 3 * T0_2 * T1_3
+ 3 * T1_0 * T2_1 + 2 * T1_0 * T2_2 + 1 * T1_0 * T2_3
+ 3 * T1_1 * T2_2 + 2 * T1_1 * T2_3
+ 3 * T1_2 * T2_3
Subject to
// We start a task exactly once.
T0_0 + T0_1 + T0_2 + T0_3 = 1
T1_0 + T1_1 + T1_2 + T1_3 = 1
T2_0 + T2_1 + T2_2 + T2_3 = 1
T3_0 + T3_1 + T3_2 + T3_3 = 1
// We can only start a single task at a given time.
T0_0 + T1_0 + T2_0 + T3_0 = 1
T0_1 + T1_1 + T2_1 + T3_1 = 1
T0_2 + T1_2 + T2_2 + T3_2 = 1
T0_3 + T1_3 + T2_3 + T3_3 = 1
We can then use an integer programming solver to find the best combinations to launch the jobs in.
The model above was generated with this (quite awful, but should give you the idea) code
class Program
{
private static string[][] s_tasks = new string[][]
{
new string[] { "A", "B"},
new string[] { "B", "C"},
new string[] { "C", "D"},
new string[] { "E" }
};
static void Main(string[] args)
{
string filePath = Path.Combine(Environment.GetEnvironmentVariable("USERPROFILE"), @"Desktop\lin_prog.txt");
using (TextWriter writer = new StreamWriter(filePath, false))
{
Console.SetOut(writer);
Console.WriteLine("Given tasks");
PrintTasks();
Console.WriteLine();
Console.WriteLine("Minimize:");
PrintObjectiveFunction();
Console.WriteLine();
Console.WriteLine("Subject to");
PrintConstraints();
}
}
static void PrintTasks()
{
for (int taskCounter = 0; taskCounter < s_tasks.Length; taskCounter++)
{
Console.WriteLine("Task {0}: [ {1} ]", taskCounter, String.Join(", ", s_tasks[taskCounter]));
}
}
static void PrintConstraints()
{
Console.WriteLine("// We start a task exactly once.");
for (int taskCounter = 0; taskCounter < s_tasks.Length; taskCounter++)
for (int timeCounter = 0; timeCounter < s_tasks.Length; timeCounter++)
{
Console.Write("T{0}_{1}", taskCounter, timeCounter);
if (timeCounter != s_tasks.Length - 1)
{
Console.Write(" + ");
}
else
{
Console.WriteLine(" = 1");
}
}
Console.WriteLine();
Console.WriteLine("// We can only start a single task at a given time.");
for (int timeCounter = 0; timeCounter < s_tasks.Length; timeCounter++)
for (int taskCounter = 0; taskCounter < s_tasks.Length; taskCounter++)
{
Console.Write("T{0}_{1}", taskCounter, timeCounter);
if (taskCounter != s_tasks.Length - 1)
{
Console.Write(" + ");
}
else
{
Console.WriteLine(" = 1");
}
}
}
static void PrintObjectiveFunction()
{
StringBuilder objective = new StringBuilder();
for (int currentTaskCounter = 0; currentTaskCounter < s_tasks.Length; currentTaskCounter++)
{
string[] currentTask = s_tasks[currentTaskCounter];
for (int otherTaskCounter = currentTaskCounter + 1; otherTaskCounter < s_tasks.Length; otherTaskCounter++)
{
string[] otherTask = s_tasks[otherTaskCounter];
if (ShouldPunish(currentTask, otherTask))
{
int maxPunishment = s_tasks.Length;
for (int currentTimeCounter = 0; currentTimeCounter < s_tasks.Length; currentTimeCounter++)
{
string currentTaskDecisionVar = String.Format("T{0}_{1}", currentTaskCounter, currentTimeCounter);
for (int otherTimeCounter = currentTimeCounter + 1; otherTimeCounter < s_tasks.Length; otherTimeCounter++)
{
string otherTaskDecisionVar = String.Format("T{0}_{1}", otherTaskCounter, otherTimeCounter);
// Punish tasks more in objective function if they are close in time when launched.
int punishment = maxPunishment - (otherTimeCounter - currentTimeCounter);
if (0 != objective.Length)
{
objective.Append(" + ");
}
objective.AppendFormat
(
"{0} * {1} * {2}",
punishment, currentTaskDecisionVar, otherTaskDecisionVar
);
}
objective.AppendLine();
}
}
}
}
// Nasty hack to align things.
Console.Write(" " + objective.ToString());
}
static bool ShouldPunish(string[] taskOne, string[] taskTwo)
{
bool shouldPunish = false;
foreach (string task in taskOne)
{
// We punish tasks iff. they need some of the same resources.
if(taskTwo.Contains(task))
{
shouldPunish = true;
break;
}
}
return shouldPunish;
}
}
A couple of things to note
- The code above runs in something like O(n^5) where n is the number of tasks. That is just to generate the model; integer programming is NP-complete.
- I'm by no means an OR specialist. I just gave it a try for fun.
- The solution outlined above uses no inherent constraints that the problem might contain. I could easily imagine a specialized job scheduling algorithm would perform much better (though I still think the problem is NP-complete)
- If I'm right in the fact, that the problem is NP-complete, then you'll probably be much better off using a cheap heuristic and launch the tasks quickly (unless you can precalculate the solution and use many times).
I think that if I had a box that solved arbitrary instances of your problem I could feed it disguised graph colouring problems (http://en.wikipedia.org/wiki/Graph_coloring) and have it solve them. I would translate each link into a resource shared by the nodes on either side of the link. All the nodes that are scheduled at the same time can then be coloured the same. So if your problem is easy then Graph Colouring is easy, but Graph Colouring is NP-complete so you're stuffed - well, almost.
OTOH problems like register allocation are reduced to Graph Colouring and approximately solved in practice, so one of the schemes used for Graph Colouring may work for you too. See e.g. http://en.wikipedia.org/wiki/Register_allocation.
Unless you have clear hierarchy it's going to be difficult to programmatically enforce. For example, you usually have to hold the resources in order to get the next one. IOW to get "B" you must first hold "A". To get "C" you must hold both "A" and "B" and so on. If this is NOT the case then I think the best you can do is to write a common routine that always requests your resources in the same order, A then B then C etc. and route all of your Tasks through this routine. I think ideally you would allocate the resources prior to queue the tasks.
If the resources are all the same, you can use a semaphore with a count of 5. For example a pool of DB connections. This does not seem to be your case though.
精彩评论