I am running some calculations on an external machine, and at the end I get X, Y pairs. I want to apply linear regression and obtain A, B, and R2. I cannot install anything on this machine (it runs Linux); it only has the basics installed: python, bash (of course), etc.
I wonder what the best approach would be: a script (python, bash, etc.) or a program (I can compile C and C++) that gives me the linear regression coefficients without needing external libraries (numpy, etc.).
For a single, simple, known function (in your case, a line) it is not hard to code a basic least-squares routine from scratch, though it does require some attention to detail. It is a very common assignment in introductory numerical analysis classes.
So, look up least squares on Wikipedia or MathWorld or in a textbook and go to town.
How about writing the data to a file, copying it to another machine, and then using Excel, Matlab, or whatever other program does this for you?
Hi there, this is my solution, which I adapted from the Wikipedia article on fitting a best-fit line.
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

// Returns true if linear fit was calculated. False otherwise
// (empty input, or all x values identical so the slope is undefined).
// Algorithm adapted from:
// https://en.wikipedia.org/wiki/Simple_linear_regression#Fitting_the_regression_line
template <typename PairIterator>
bool GetLinearFit(PairIterator begin_it,
                  PairIterator end_it,
                  double* out_slope,
                  double* out_yintercept) {
  if (begin_it == end_it) {
    return false;
  }

  // First pass: compute the means of x and y.
  std::size_t n = 0;
  double x_avg = 0;
  double y_avg = 0;
  for (PairIterator it = begin_it; it != end_it; ++it) {
    x_avg += it->first;
    y_avg += it->second;
    n++;
  }
  x_avg /= n;
  y_avg /= n;

  // Second pass: accumulate the sums of deviations from the means.
  double numerator = 0;
  double denominator = 0;
  for (PairIterator it = begin_it; it != end_it; ++it) {
    double x_dev = it->first - x_avg;
    double y_dev = it->second - y_avg;
    numerator += (x_dev * y_dev);
    denominator += (x_dev * x_dev);
  }
  if (denominator == 0) {
    return false;  // All x values are equal: no finite slope exists.
  }

  double slope = numerator / denominator;
  double yintercept = y_avg - slope * x_avg;
  *out_slope = slope;
  *out_yintercept = yintercept;
  return true;
}

// Tests the output of GetLinearFit(...).
int main() {
  std::vector<std::pair<int, int> > data;
  for (int i = 0; i < 10; ++i) {
    data.push_back(std::pair<int, int>(i + 1, 2 * i));  // Points on y = 2x - 2.
  }

  double slope = 0;
  double y_intercept = 0;
  if (!GetLinearFit(data.begin(), data.end(), &slope, &y_intercept)) {
    std::cerr << "linear fit failed\n";
    return 1;
  }
  std::cout << "slope: " << slope << "\n";              // prints "slope: 2"
  std::cout << "y_intercept: " << y_intercept << "\n";  // prints "y_intercept: -2"
  return 0;
}