Given a large set of points in 3-dimensional space (x, y, z coordinates), I need to find the ten points nearest to the origin. Are there any existing Java standard libraries that can help? I would also appreciate your opinion on the best data structures and sorting algorithm for implementing the solution cost-effectively with respect to both time and space. Thanks in advance.
Just work out sqrt(x^2 + y^2 + z^2) for each point, sort in ascending order, and take the first ten. Here I've cached the distances so that each one is computed only once.
// generate
Set<Point3D> points = new HashSet<Point3D>();
for (int i = 0; i < 20; i++) {
points.add(new Point3D(-5d + 10d * Math.random(), -5d + 10d
* Math.random(), -5d + 10d * Math.random()));
}
// distances
final Map<Point3D, Double> distanceCache = new IdentityHashMap<Point3D, Double>();
for (Point3D point : points) {
distanceCache.put(
point,
Math.sqrt(point.getX() * point.getX() + point.getY()
* point.getY() + point.getZ() * point.getZ()));
}
// sort
List<Point3D> tmp = new ArrayList<Point3D>(points);
Collections.sort(tmp, new Comparator<Point3D>() {
@Override
public int compare(Point3D o1, Point3D o2) {
// ascending order: the point nearest to the origin comes first
return Double.compare(distanceCache.get(o1),
distanceCache.get(o2));
}
});
// print results
System.out.println(tmp.subList(0, 10));
for (Point3D point : tmp.subList(0, 10)) {
System.out.printf("%.2f,", distanceCache.get(point));
}
This assumes some kind of 3D point class:
private static class Point3D {
private double x;
private double y;
private double z;
public Point3D(double x, double y, double z) {
super();
this.x = x;
this.y = y;
this.z = z;
}
public double getX() {
return x;
}
public double getY() {
return y;
}
public double getZ() {
return z;
}
@Override
public String toString() {
return String.format("%.2f,%.2f,%.2f", x, y, z);
}
}
Adam has already provided a good and robust general solution. There are however a few possible optimizations that you could make use of in your particular case. It does however heavily depend on what you define as a large set of points. If we are talking many thousands of points, then do read on.
First of all, when working with Euclidean distances, keep in mind that comparing squared distances gives the same order as comparing the actual distances, because the square root is monotonically increasing for non-negative values. As a result you do not have to perform the relatively expensive square root operation. Just compare x*x+y*y+z*z directly.
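As a minimal sketch of that idea, the comparator below orders points by squared distance only. The class name SquaredDistanceSort and the representation of a point as a double[] of length 3 are assumptions made for illustration, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SquaredDistanceSort {
    // Squared Euclidean distance from the origin; p is assumed to be {x, y, z}.
    static double squaredDistance(double[] p) {
        return p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
    }

    // Returns a new list sorted nearest-first, without ever calling Math.sqrt.
    static List<double[]> sortByDistance(List<double[]> points) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(SquaredDistanceSort::squaredDistance));
        return sorted;
    }
}
```

The ten nearest points would then be the first ten elements of the returned list.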
Secondly, the best general-purpose sorting algorithms run in O(n * log n) time. This applies to, for example, merge sort, heap sort and (on average) quick sort. However, if you have a large set of points and you just want to find the first k points, where k is small, then it is sometimes viable to choose a different algorithm, specifically one where you can abort the sorting once you have found those k elements. For example, even doing a linear O(n) search to find each of the elements, and performing that k times, would yield a complexity of O(k * n). If k is less than log n, this method would be more efficient. Note that this is just a simple selection sort stopped after k passes. Heap sort may also be a reasonable choice for finding the first k elements.
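One heap-based way to avoid a full sort, sketched here with java.util.PriorityQueue, is to keep a max-heap of at most k candidates and evict the farthest one whenever a closer point turns up. This is a bounded-heap selection rather than heap sort proper, and it runs in roughly O(n * log k). The class name TopKNearest and the double[] point representation are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopKNearest {
    // Squared Euclidean distance from the origin; p is assumed to be {x, y, z}.
    static double squaredDistance(double[] p) {
        return p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
    }

    // Returns up to k points nearest to the origin, nearest first.
    static List<double[]> nearest(Iterable<double[]> points, int k) {
        List<double[]> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        Comparator<double[]> byDistance =
                Comparator.comparingDouble(TopKNearest::squaredDistance);
        // Reversed comparator turns the queue into a max-heap:
        // the farthest of the currently kept points sits at the head.
        PriorityQueue<double[]> heap = new PriorityQueue<>(k, byDistance.reversed());
        for (double[] p : points) {
            if (heap.size() < k) {
                heap.add(p);
            } else if (byDistance.compare(p, heap.peek()) < 0) {
                heap.poll();   // drop the farthest kept point
                heap.add(p);   // keep the closer one instead
            }
        }
        result.addAll(heap);
        result.sort(byDistance); // only k elements, so this is cheap
        return result;
    }
}
```

Calling TopKNearest.nearest(points, 10) would give the ten nearest points while holding only ten candidates in the heap at any time, which also keeps the extra space at O(k).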
You should also consider whether you will build the set of points once and then run the algorithm many times, adding, removing or moving only a few points between runs. In that case it may be more efficient to maintain a data structure that remains ordered and simply take the first k elements from it. This could be as simple as a TreeSet that is sorted on the squared distances (the comparator needs a tie-breaker, otherwise two distinct points at the same distance are treated as duplicates and one of them is dropped). Alternatively, it could be worth maintaining the points in an octree or some other 3D space-partitioning data structure. That way you would only need to look through the nodes of the octree that are closest to the origin.
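A rough sketch of the TreeSet variant follows. The class name OrderedPoints, its methods and the double[] point representation are hypothetical and only meant to illustrate the idea. The comparator breaks ties on the coordinates so that distinct points at the same distance are both kept.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class OrderedPoints {
    // Squared Euclidean distance from the origin; p is assumed to be {x, y, z}.
    static double squaredDistance(double[] p) {
        return p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
    }

    // Order by squared distance, then by x, y, z as a tie-breaker.
    static final Comparator<double[]> ORDER =
            Comparator.<double[]>comparingDouble(OrderedPoints::squaredDistance)
                    .thenComparingDouble(p -> p[0])
                    .thenComparingDouble(p -> p[1])
                    .thenComparingDouble(p -> p[2]);

    private final TreeSet<double[]> points = new TreeSet<>(ORDER);

    // O(log n) insertion; the set stays sorted nearest-first.
    void add(double[] p) {
        points.add(p);
    }

    // O(log n) removal; matches by coordinates through the comparator.
    void remove(double[] p) {
        points.remove(p);
    }

    // Returns up to k points nearest to the origin, nearest first.
    List<double[]> nearest(int k) {
        List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            if (result.size() >= k) {
                break;
            }
            result.add(p);
        }
        return result;
    }
}
```

Because the ordering depends on the coordinates, moving a point should be done by removing it, changing it and adding it again. Mutating a double[] while it is still in the set would leave the tree in an inconsistent order.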