I asked a question about porting Excel's BetaInv function to .NET: BetaInv function in SQL Server
Now I have managed to write that function in pure, dependency-less C# code, and I get the same results as MS Excel up to 6 or 7 digits after the decimal point, which is fine for us. The problem is that this code is embedded in a SQL CLR function and gets called thousands of times from a stored procedure, and it makes the whole procedure about 50% slower: execution goes from 30 seconds up to a minute depending on whether or not I use that function.
Here is some of the code. I am not asking for a deep analysis, but does anybody see any major performance issue in the way I am doing these calculations? For example, should I be using other data types instead of doubles, or anything like that?
private static double betacf(double a, double b, double x)
{
    int m, m2;
    double aa, c, d, del, h, qab, qam, qap;
    qab = a + b;
    qap = a + 1.0;
    qam = a - 1.0;
    c = 1.0; // First step of Lentz's method.
    d = 1.0 - qab * x / qap;
    if (System.Math.Abs(d) < FPMIN)
    {
        d = FPMIN;
    }
    d = 1.0 / d;
    h = d;
    for (m = 1; m <= MAXIT; ++m)
    {
        m2 = 2 * m;
        aa = m * (b - m) * x / ((qam + m2) * (a + m2));
        d = 1.0 + aa * d; // One step (the even one) of the recurrence.
        if (System.Math.Abs(d) < FPMIN)
        {
            d = FPMIN;
        }
        c = 1.0 + aa / c;
        if (System.Math.Abs(c) < FPMIN)
        {
            c = FPMIN;
        }
        d = 1.0 / d;
        h *= d * c;
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2));
        d = 1.0 + aa * d; // Next step of the recurrence (the odd one).
        if (System.Math.Abs(d) < FPMIN)
        {
            d = FPMIN;
        }
        c = 1.0 + aa / c;
        if (System.Math.Abs(c) < FPMIN)
        {
            c = FPMIN;
        }
        d = 1.0 / d;
        del = d * c;
        h *= del;
        if (System.Math.Abs(del - 1.0) < EPS)
        {
            // Are we done?
            break;
        }
    }
    if (m > MAXIT)
    {
        return 0;
    }
    else
    {
        return h;
    }
}
private static double gammln(double xx)
{
    double x, y, tmp, ser;
    double[] cof = new double[]
    {
        76.180091729471457, -86.505320329416776, 24.014098240830911,
        -1.231739572450155, 0.001208650973866179, -0.000005395239384953
    };
    y = xx;
    x = xx;
    tmp = x + 5.5;
    tmp -= (x + 0.5) * System.Math.Log(tmp);
    ser = 1.0000000001900149;
    for (int j = 0; j <= 5; ++j)
    {
        y += 1;
        ser += cof[j] / y;
    }
    return -tmp + System.Math.Log(2.5066282746310007 * ser / x);
}
The only thing that stands out for me, and is usually a performance hit, is memory allocation. I don't know how often gammln is called, but you might want to move the double[] cof = new double[] {} to a static, one-time-only allocation.
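For illustration, here is a minimal sketch of that change; the field name GammlnCof is my own choice, and the rest of the method is unchanged from the original.
// Hoist the coefficient array into a static readonly field so it is allocated
// once instead of on every call to gammln.
private static readonly double[] GammlnCof = new double[]
{
    76.180091729471457, -86.505320329416776, 24.014098240830911,
    -1.231739572450155, 0.001208650973866179, -0.000005395239384953
};

private static double gammln(double xx)
{
    double y = xx;
    double x = xx;
    double tmp = x + 5.5;
    tmp -= (x + 0.5) * System.Math.Log(tmp);
    double ser = 1.0000000001900149;
    for (int j = 0; j <= 5; ++j)
    {
        y += 1;
        ser += GammlnCof[j] / y; // read from the shared, pre-allocated array
    }
    return -tmp + System.Math.Log(2.5066282746310007 * ser / x);
}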
double is usually the best type, especially since the functions in Math take doubles. Unfortunately, I see no obvious improvements to make to your code.
It might be possible to use look-up tables to get a better first estimate to iterate from, but since I don't know the math behind what you're doing, I can't say whether that's possible in this specific case.
Obviously a larger epsilon will improve performance, so choose it as large as possible while still meeting your accuracy requirements.
If the function gets called repeatedly with the same parameters, you might be able to cache results.
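For example, here is a rough sketch of memoizing betacf results, assuming the same (a, b, x) triples actually recur; the CachedBetacf name and the string key are my own choices. A writable static cache inside a SQLCLR assembly may run into permission-set and thread-safety restrictions, so treat this only as a starting point.
// Sketch: memoize betacf results keyed on the exact input triple.
// Caveats: the Dictionary below is not thread-safe, and SQLCLR places
// restrictions on static state, so verify this is acceptable in your setup.
private static readonly System.Collections.Generic.Dictionary<string, double> BetacfCache =
    new System.Collections.Generic.Dictionary<string, double>();

private static double CachedBetacf(double a, double b, double x)
{
    // "R" round-trips the doubles exactly, so equal inputs map to equal keys.
    string key = a.ToString("R") + "|" + b.ToString("R") + "|" + x.ToString("R");
    double result;
    if (!BetacfCache.TryGetValue(key, out result))
    {
        result = betacf(a, b, x);
        BetacfCache[key] = result;
    }
    return result;
}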
One thing that looks odd is the way you force small values of c, d, ... to FPMIN. My instinct is that this might lead to suboptimal step sizes.
All I've got is unrolling the j loop in gammln, but it'll make at most a tiny difference.
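For what it's worth, this is roughly what the unrolled version would look like (gammlnUnrolled is just a name I picked; the coefficients are the same ones from the array, inlined):
// gammln with the j loop unrolled and the coefficient array eliminated.
// The divisors xx + 1.0 .. xx + 6.0 mirror the successive y += 1 steps.
private static double gammlnUnrolled(double xx)
{
    double x = xx;
    double tmp = x + 5.5;
    tmp -= (x + 0.5) * System.Math.Log(tmp);
    double ser = 1.0000000001900149
        + 76.180091729471457 / (xx + 1.0)
        - 86.505320329416776 / (xx + 2.0)
        + 24.014098240830911 / (xx + 3.0)
        - 1.231739572450155 / (xx + 4.0)
        + 0.001208650973866179 / (xx + 5.0)
        - 0.000005395239384953 / (xx + 6.0);
    return -tmp + System.Math.Log(2.5066282746310007 * ser / x);
}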
A more radical thought would be to rewrite it in pure T-SQL, since everything you use is available there: + - * / abs and log.