Please allow me to ask this question with an example: Suppose we have the following 3 lists (omitted double quotes for clarity):
L1: (a, c, b, d, f, j)
L2: (b, e, j, k)
L3: (a, d, e, g, h, j, i)
The output list can look like any of the following (there are more solutions)
Lanswer1: (a, c, b, d, e, f, g, h, j, i, k)
Lanswer2: (a, c, b, d, f, e, g, h, j,开发者_StackOverflow社区 i, k)
Lanswer3: (a, c, b, d, e, f, g, h, j, k, i)
In summary, the resulting ordered set
- Contains the union of elements from all the lists
- Order of the elements in all the original lists are preserved.
A 4th list, L4: (b, c, d), when added to the input, should throw an exception (since c comes before b in L1)
I came up with the answers by inspection. Can anybody suggest an algorithm to do this? Thank you, - M.S.
This can be done using topological sorting.
First build the directed graph from the lists. The elements of the list become the nodes. The edges go from the first element to the second, from the second to the third, and so on.
Depending on how you implement the algorithm you can get all possible solutions, or just one. Also, If the graph contains a cycle then the algorithm will stop with an error.
This is how the graph from your lists would look like:
Source:
digraph {
{
edge [color = "red"]
a -> c
c -> b
b -> d
d -> f
f -> j
}
{
edge [color = "blue"]
b -> e
e -> j
j -> k
}
{
edge [color = "green"]
a -> d
d -> e
e -> g
g -> h
h -> j
j -> i
}
}
Here's an attempt. Usual disclaimers apply.
Main logic
Essentially, it's a 2 step process.
1.Convert the 3 lists into one doubly linked list with each node pointing to the next element and also to any siblings (like e/f or i/k) using the second pointer. After conversion, it should look something like
a->c->b->d->f->g->h->j->k
| |
e i
2.write a function that takes two elements and the above linked list and returns a flag that indicates if the first element occurs before, after or at same position as the second element. One possible signature of the method could be
int getRelativePositionOf(char e1, char e2)
//returns -1 if e1 exists before e2, 0 if e1 and e2 are siblings and 1 if e1 exists after e2.
This will help you validate the fourth list and throw exception if the relative position of any two elements differs from the original data.
Linked list creation
As you can imagine, the first step i.e., creating the doubly linked list is the most challenging part. I haven't come up with the most elegant way to create it but here's one.
Put the longest list, L1 into a doubly linked list. So, in your example this is the 3rd list and you'll now have
a->d->e->g->h->j->i
Now iterate through the remaining elements and for each element, check if it exists in the linked list. If it does, ignore.
If it doesn't exist, then the element needs to be inserted into the linked list. So now we need to find out the position where this element should be added. Let's call this element 'X'.
Traverse through the length of the linked list and for each element say 'Y', check in L2 and L3 whether a positional relation exists between X & Y. So for example, if we are trying to insert 'c' into the above linked list, I'd start with the first element of linked list i.e. 'a' and see in the lists if a relationship exists between 'a' and 'c'.
A relationship is defined to exist if we can determine the relative position of the two elements in the raw lists. From the first list, I know 'c' comes after 'a'. So 'c' should be inserted somewhere after 'a'. So we move forward in the linked list and now compare 'd' with 'c'. Again from the first raw list, we know 'c' comes before 'd'. Bingo. Insert 'c' between 'a' and 'd'. So the linked list now becomes
a->c->d->->e->g->h->j->i
If a relationship can not be ascertained between 2 elements, they are siblings. So if you try to insert 'f' in the above list, you'd know that it should come after 'd' and before 'g' but no information is available about it's relative position wrt 'e'. So store 'f' as a sibling of 'e'.
Do this till all elements have been inserted.
It is obvious that step 5 is the ugliest and trickiest part of the above algorithm. But I think it can be optimized to make it simpler and faster. For example, you could create hashmaps of char->position out of your raw lists for faster lookups while determining the relative positions. Further you could also maintain a set of all elements that already exist in the linked list so as to avoid traversal of linked list if the element exists already. I am sure there are a number of ways this can be further optimized.
精彩评论