If I have two arrays.For example,
One array is int[] one={1,2,4,6,54,3,34};
the other is int[] two={12,1,2,4,7,8,54,3,34,5};
The problem is how can I get "the same parts" between one and two.
"The same parts" in the example are [1,2,4] and [54,3,34].
P.S.You can use pseudo language ,c,c#,java,php or other language.
P.S. Now I make clear th开发者_StackOverflow中文版e same parts.the same parts elements have the lists.
P.S.I have change the example,and value of each item in the array isn't equal (You can see my example.)
- at least two items match
- the index of the match item in the two arrays not necessary to match ,but the same parts must be continuous.
You can build a suffix tree for the two arrays (seen as 'strings') and compares the two trees.
In particular, you can choose one of the two trees (the one associated with the smaller array, say) (call it A) and start traversing it, mimicking the moves on the other tree (call it B).
If you are in a node u of tree A and you can't replicate any "move" from this node to the corresponding one of tree B, then you've found a "maximal match" (the one spelled from root to u) and you can prune the subtree of tree A rooted on u.
This is just an idea, you must build up on it; note that you can build a suffix tree in O(n) and this kind of "bisimilarity" is O(n) too, so it looks like optimal.
This is probably the longest common subsequence problem.
Nearly a brute force with some optimizations. Worst case O(n^4). n is size of shorter array.
one=[1,2,4,6,54,3,34]
two=[12,2,4,3,54,3,5]
one_pos_list_map = {} # map of value -> position list
one_pos_map = {} # map of value -> map of position -> 1
for pos in xrange(len(one)):
val = one[pos]
if val not in one_pos_map:
one_pos_map[val] = {}
one_pos_map[val][pos] = 1
if val not in one_pos_list_map:
one_pos_list_map[val] = []
one_pos_list_map[val].append(pos)
checked = [False for i in xrange(len(two)*len(two))]
def print_maximal_matches(start, end):
idx = start * len(two) + end - 1
if (checked[idx] or end - start < 2):
return
checked[idx] = True
match_pos_list = one_pos_list_map.get(two[start], [])
for match_pos in match_pos_list:
found = True
for i in xrange(start + 1, end):
if not one_pos_map.get(two[i], {}).get(match_pos + i - start, None):
found = False
break
if found:
print two[start:end]
return
print_maximal_matches(start + 1, end)
print_maximal_matches(start, end - 1)
print_maximal_matches(0, len(two))
精彩评论