I had 2 integer arrays, one original and one modified w.r.t the original array. Elements can be added or removed from the original to convert it to the modified. My problem is, in the modified, I need to find out which elements are new, which elements are same, and which elements are not there w.r.t to the original array.
Given data:
arr1 = 3,2,1 //Original array
arr2 = 1,4,5 //Modified array by adding and/or removing elements
I need something like:
same = 1
removed = 2,3
added = 4,5
Obviously, I can write several nested for loops and find it out, but this would be too inefficient. I was wondering if there was be a better or efficient way to do it.. I am using Java. This page kind of address a similar problem, but not sure开发者_如何学编程 if I can use it to solve my problem.
Any help would be appreciated.
If memory is not a constraint, I'd recommend using Set
for these kind of operations. Finding the things you require would be just a matter of calling the appropriate methods on the two Set
objects. Of course, this assumes you'd have unique elements in your elements and if not, you are interested in only unique elements when it comes to reporting stuff. For e.g.
public static void testSet() {
final Set<Integer> first = new HashSet<Integer>(Arrays.asList(1, 2, 3));
final Set<Integer> second = new HashSet<Integer>(Arrays.asList(1, 4, 5));
Set<Integer> result = new HashSet<Integer>(first);
result.retainAll(second);
System.out.println("Similar: " + result);
result = new HashSet<Integer>(first);
result.removeAll(second);
System.out.println("Removed: " + result);
result = new HashSet<Integer>(second);
result.removeAll(first);
System.out.println("Newly added: " + result);
}
/*
OUTPUT:
Similar: [1]
Removed: [2, 3]
Newly added: [4, 5]
*/
You could go through each array once and obviously use a removal save iteration like a loop from the back of the array.
int[] same
for (int i = arr1.length; i >= 0; i--)
{
if(arr2.contains(i))
same.add(i)
arr1.remove(i)
}
for (int i = arr2.length; i >= 0; i--)
{
if(same.contains(i))
arr2.remove(i)
}
Then arr1
will be the list of removed, arr2
will be added and same
will be the same.
If there are no duplicates, and the maximum integer is constrained, and the members are moderately dense (say, 1% density or better), make them into BitSets. The "and" of the two sets are "same", A.andNot(B) are the ones in A only, and B.andNot(A) are the ones in B only. If the integers are moderately dense, this is very fast.
If the integers are sparse, sort each array and walk up them in tandem.
You are trying to calculate the Levenshtein distance between two arrays.
There is a simple dynamic programming solution to calculate this (taken from Wikipedia):
int LevenshteinDistance(char s[1..m], char t[1..n])
{
// for all i and j, d[i,j] will hold the Levenshtein distance between
// the first i characters of s and the first j characters of t;
// note that d has (m+1)x(n+1) values
declare int d[0..m, 0..n]
for i from 0 to m
d[i, 0] := i // the distance of any first string to an empty second string
for j from 0 to n
d[0, j] := j // the distance of any second string to an empty first string
for j from 1 to n
{
for i from 1 to m
{
if s[i] = t[j] then
d[i, j] := d[i-1, j-1] // no operation required
else
d[i, j] := minimum
(
d[i-1, j] + 1, // a deletion
d[i, j-1] + 1, // an insertion
d[i-1, j-1] + 1 // a substitution
)
}
}
return d[m,n]
}
You can easily change this code to tell you what the individual operations are.
精彩评论