
Minimal linear regression program

Developers (https://www.devze.com), 2022-12-19 00:05, source: web

I am running some calculations on an external machine, and at the end I get X, Y pairs. I want to apply linear regression and obtain A, B, and R2. I cannot install anything on this machine (it runs Linux); it only has the basics installed: Python, bash (of course), etc.

I wonder what the best approach would be: a script (Python, bash, etc.) or a program (I can compile C and C++) that gives me the linear regression coefficients without needing external libraries (NumPy, etc.).


For a single, simple, known function (in your case, a line), it is not hard to code a basic least-squares routine from scratch, though it does require some attention to detail. It is a very common assignment in introductory numerical analysis classes.

So look up least squares on Wikipedia or MathWorld, or in a textbook, and go to town.
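
For reference, these are the standard closed-form expressions for a least-squares line through n points. The question does not say which of A and B is the slope, so the block assumes the line is written y = A + B x (A the intercept, B the slope); swap the names if your convention differs.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
B &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
A &= \bar{y} - B\,\bar{x} \\
R^2 &= \frac{\left[\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})\right]^2}
            {\sum_{i=1}^{n} (x_i - \bar{x})^2 \,\sum_{i=1}^{n} (y_i - \bar{y})^2}
\end{align*}
```

The slope is undefined when all x values are equal, because the denominator of B is then zero; that is one of the details a from-scratch routine has to check.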


How about extracting the values into a file, copying it to another machine, and then using Excel, MATLAB, or whatever other program does this for you?


Hi there. This is my solution, adapted from the Wikipedia article on the best-fit line.

#include <cstddef>   // size_t
#include <iostream>
#include <utility>   // std::pair
#include <vector>

// Returns true if linear fit was calculated. False otherwise.
// Algorithm adapted from:
// https://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line
template <typename PairIterator>
bool GetLinearFit(PairIterator begin_it,
                  PairIterator end_it,
                  double* out_slope,
                  double* out_yintercept) {

    if (begin_it == end_it) {
        return false;
    }

    size_t n = 0;
    double x_avg = 0;
    double y_avg = 0;

    for (PairIterator it = begin_it; it != end_it; ++it) {
        x_avg += it->first;
        y_avg += it->second;
        n++;
    }

    x_avg /= n;
    y_avg /= n;

    double numerator = 0;    // sum of (x - x_avg) * (y - y_avg)
    double denominator = 0;  // sum of (x - x_avg)^2

    for (PairIterator it = begin_it; it != end_it; ++it) {
        double x_deviation = it->first - x_avg;
        double y_deviation = it->second - y_avg;
        numerator += (x_deviation * y_deviation);
        denominator += (x_deviation * x_deviation);
    }

    // All x values are equal (or there is only one point): slope is undefined.
    if (denominator == 0) {
        return false;
    }

    double slope = numerator / denominator;
    double yintercept = y_avg - slope * x_avg;

    *out_slope = slope;
    *out_yintercept = yintercept;

    return true;
}

// Tests the output of GetLinearFit(...).
int main() {
    std::vector<std::pair<int, int> > data;
    // Sample points lie exactly on y = 2x - 2.
    for (int i = 0; i < 10; ++i) {
        data.push_back(std::pair<int, int>(i + 1, 2 * i));
    }

    double slope = 0;
    double y_intercept = 0;
    if (!GetLinearFit(data.begin(), data.end(), &slope, &y_intercept)) {
        std::cerr << "linear fit failed\n";
        return 1;
    }

    std::cout << "slope: " << slope << "\n";
    std::cout << "y_intercept: " << y_intercept << "\n";

    return 0;
}
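
The question also asks for R2, which the program above does not report. Below is a minimal sketch of one way to get it in the same pass, using only the standard library. The names `LinearFit` and `FitLineWithR2` are hypothetical and not part of the answer's code, and returning R2 = 1 when all y values are equal is a convention assumed here, not something the question specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch.
struct LinearFit {
    double slope;      // change in y per unit x
    double intercept;  // value of y at x = 0
    double r_squared;  // coefficient of determination
};

// Fits y = intercept + slope * x by ordinary least squares.
// Returns false when fewer than two points are given or when all x values
// are equal, since the slope is undefined in both cases.
bool FitLineWithR2(const std::vector<std::pair<double, double> >& points,
                   LinearFit* out) {
    assert(out != NULL);

    const std::size_t n = points.size();
    if (n < 2) {
        return false;
    }

    // First pass: means of x and y.
    double x_mean = 0;
    double y_mean = 0;
    for (std::size_t i = 0; i < n; ++i) {
        x_mean += points[i].first;
        y_mean += points[i].second;
    }
    x_mean /= static_cast<double>(n);
    y_mean /= static_cast<double>(n);

    // Second pass: sums of squared deviations and of cross products.
    double sxx = 0;
    double syy = 0;
    double sxy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = points[i].first - x_mean;
        const double dy = points[i].second - y_mean;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }

    if (sxx == 0) {
        return false;  // vertical line: no finite slope
    }

    out->slope = sxy / sxx;
    out->intercept = y_mean - out->slope * x_mean;
    // Assumed convention: constant y is fitted exactly, so report 1.
    out->r_squared = (syy == 0) ? 1.0 : (sxy * sxy) / (sxx * syy);
    return true;
}
```

On the answer's test data (x = i + 1, y = 2i for i from 0 to 9) this should give a slope of 2, an intercept of -2, and an R2 of 1, since those points lie exactly on y = 2x - 2.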