开发者

Minimising distance: distance formula

开发者 https://www.devze.com 2022-12-13 21:18 出处:网络
I am writing a program in C. I want to find a solution by minimizing expression D1+D2+......+Dn where Di\'s are distances calculated by distance formula between 2 points. The above expression is in

I am writing a program in C. I want to find a solution by minimizing expression

D1+D2+......+Dn

where Di's are distances calculated by distance formula between 2 points. The above expression is in x & y variables

Now I will differentiate this expression and find the solution. My doubt is:

since in the above expression, all Di's will occur as square roots which will be difficult to solve. So instead we 开发者_C百科can solve for this expression:

D1^2 + D2^2 + ......+ Dn^2

Will the answer produced by the above expression will be same as that would have been produced by solving the original one?

I have checked for simple test cases such as n=2. It produces the correct answer. Is it true in general?

If not, how this problem can be solved?


Even for 2d distances, it is not in general true that the minimum of a^2 + b^2 is at the same location as the minimum of a + b. It may be true for some very specific limited set of problems, of course. If you're trying to find a counterexample, note that the squares overpenalize long distances; if you construct an example with a minimum containing at least one long distance, you've a good chance that the sum of squares has a different minimum.

What is the problem you are trying to solve? It's quite possible that for your problem the distinction doesn't matter, of course; or that the minimum of the sum of squares is a cheaper problem and an easier first approximation to a final solution.

It may be obvious, but if the various distances are unrelated, then for each individual distance the square is minimal when the distance is and thus the sum of unrelated distances is minimal where the sum of the squares is.

Edit Post update: You're trying to find a centroid with the limitation that it lies on a particular line. In general outlines then: you've only one degree of freedom, and you can do plain differentiation. However, the result will have a sum of fractions with sqrt in the denominator; solving that algebraically in the general case isn't possible (AFAIK). I'm not 100% positive, but I think you're in luck in that your distance-sum has no local minimum except the global one; in that case newton's method will converge reliably and fast.

So, if you can verify the assumption that there is only one local minimum, you're home free, and even if you can, you can achieve a pretty good result fairly reliably and detect when it goes wrong simply by comparing your newton-method computed minimum with a few reality check points (say, the orthogonal projection of each point onto the line).


You have not tested enough. Minimizing D1 + D2 is not the same as minimizing D1^2 + D2^2 in general, although it may be for some particular D1 and D2.

EDIT after you reminded me that you are only interested in distances in the plane:

In the case where D1 and D2 are distances in the geometric plane, the point in the plane that minimizes D1^2 + D2^2 also minimizes D1 + D2, but it breaks down with three points.

Try it with the three points (0,0), (1,0) and (10, 0): Minimizing |x|+|x-1|+|x-10| is not the same as minimizing x^2+(x-1)^2+(x-10)^2


You are a bit unclear about the origin of your D1, D2, .. Dn but I assume you have a set of points P1, P2, ..., Pn in the x-y plane, and you want to find the point p0=(x0,y0) that minimises the sum of the distances between each point P1.. Pn and p0.

So your D1.. Dn are actually:

D1 = sqrt((x0-x1)^2 + (y0-y1)^2)
D2 = sqrt((x0-x2)^2 + (y0-y2)^2)
..
Dn = sqrt((x0-xn)^2 + (y0-yn)^2)

where x1 .. xn and y1 ... yn are known and x0, y0 are unknown. And you want to minimise D0:

D0 = D1+D2+......+Dn

If this is correct then you want to find the Geometric median. The wikipedia article should help you produce a solution.

Update: you state in a comment that point P0 should lie on a given line (please add this to your original problem statement). This means you can rewrite y0 as a function of x0:

y0 = a*x0 + b

where a and b are given. This reduces the complexity of your distance functions and makes derivation a serious possibility.

D1 = sqrt((x0-x1)^2 + (ax0+b-y1)^2)
D2 = sqrt((x0-x2)^2 + (ax0+b-y2)^2)
..
Dn = sqrt((x0-xn)^2 + (ax0+b-yn)^2)

But if the amount of points n is not too large I would just do a brute-force search* in the area of the line where x is close to the mean of x1 .. xn to find the point x-, y0 that minimises D0.


Counterexample:

d1=1 & d2=10 (sum=11 & sumOfSquares=101)

d1=6 & d2=6 (sum=12 & sumOfSquares=72)

Sum increased but sum of squares decreased.


Your problem is the minimization of an objective function given by some norm of your distances. The distances are Euclidean, and as such represent the Euclidean norm between two vectors. In order to understand the difference between minimizing sum(ai) over sum(ai^2) I recommend you read the Wikipedia entry on Norms; bottom line, notice the following:

||x||2       <= ||x||1 <= sqrt(n)||x||2
||x||_\infty <= ||x||2 <= sqrt(n)||x||_\infty
||x||_\infty <= ||x||1 <=       n||x||_\infty

Where ||x||2 is the Euclidean norm, ||x||1 is sum(abs(x1)+abs(x2)+...+abs(xn)), and ||x||_\infty is max(abs(x1),abs(x2),...,abs(xn)). In your case all the numbers are positive (you already have the Euclidean norm, so you can seee the difference.

It may also be helpful (though much more difficult to fully grasp) to read the fantastic book by Golub and Van Loan, Matrix Computations.

0

精彩评论

暂无评论...
验证码 换一张
取 消