Why does average damping magically speed up the convergence of fixed-point calculators?_问答_开发者

Why does average damping magically speed up the convergence of fixed-point calculators?

开发者 https://www.devze.com 2023-01-18 14:22 出处：网络

I\'m reading through SICP, and the authors brush over the technique of average damping in computing the fixed points of functions. I understand that it\'s necessary in certain cases, ie square roots i

I'm reading through SICP, and the authors brush over the technique of average damping in computing the fixed points of functions. I understand that it's necessary in certain cases, ie square roots in order to damp out the oscillation of the function y = x/y however, I don't understand why it magically aids the convergence of the fixed point calculating function. Help?

edit

Obviou开发者_如何学Csly, I've thought this through somewhat. I can't seem to wrap my head around why averaging a function with itself would speed up convergence when applied repeatedly.

It only speeds up those functions whose repeated applications "hop around" the fixpoint. Intuitively, it's like adding a brake to a pendulum - it'll stop sooner with the brake.

But not every function has this property. Consider f(x)=x/2. This function will converge sooner without the average damping (log base 2 steps vs log base (4/3) steps), because it approaches the fixpoint from one side.

While I can't answer your question on a mathematical basis, I'll try on an intuitive one: fixpoint techniques need a "flat" function graph around their ..well.. fixpoint. This means: if you picture your fixpoint function on an X-Y chart, you'll see that the function crosses the diagonal (+x,+y) exactly at the true result. In one step of your fixpoint algorithm you are guessing an X value which needs to be within the interval around the intersection point where the first derivative is between (-1..+1) and take the Y value. The Y that you took will be closer to the intersection point because starting from the intersection it is reachable by following a path which has a smaller slope than +/-1 , in contrast to the previous X value that you utilized, which has in this sense, the exact slope -1. It is immediately clear now that the smaller the slope, the more way you make towards the intersection point (the true function value) when using the Y as new X. The best interpolation function is trivially a constant, which has slope 0, giving you the true value in the first step.

Sorry to all mathematicians.