
Calculate error, MSE and MAPE? [closed]

Developer https://www.devze.com 2023-02-22 10:50 Source: web
It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 11 years ago.

I wrote this program to estimate the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE). Is everything all right with it? pune is a .csv file with 22 data points.

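# Read the data (semicolon-separated, decimal comma) and convert to a numeric matrix
pune <- read.csv("C:/Users/ervis/Desktop/Te dhenat e konsum energji/pune.csv", header = TRUE, dec = ",", sep = ";")
pune <- data.matrix(pune, rownames.force = NA)
# Grid of 10,000 values of m1 at which both error measures are evaluated
m1 <- seq(from = 14274.19, to = 14458.17, length.out = 10000)
# MSE of the data around each m1[i]
MSE1 <- numeric(length = 10000)
for (i in seq_along(MSE1)) {
  MSE1[i] <- 1 / length(pune) * sum((pune - m1[i]) ^ 2)
}
# MAPE of the data around each m1[i] (as a fraction, not multiplied by 100)
MAPE1 <- numeric(length = 10000)
for (i in seq_along(MAPE1)) {
  MAPE1[i] <- 1 / length(pune) * sum(abs((pune - m1[i]) / pune))
}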

Am I right?


Mean squared error seems to have different meanings in different contexts.

For a random sample taken from a population, the MSE of the sample mean is just the variance divided by the number of samples, i.e.,

# x is the sample itself, not its mean
mse <- function(x) var(x) / length(x)
mse(pune)
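As a minimal sketch of this first definition, the snippet below uses a made-up numeric vector (the values and the names x, mse_of_mean and se_of_mean are hypothetical, not from the question) and checks that the estimated MSE of the sample mean is the square of the usual standard error of the mean.

```r
# Hypothetical sample; the values are invented for illustration only.
x <- c(12.1, 11.8, 12.6, 12.3, 11.9, 12.4)

# Estimated MSE of the sample mean: sample variance divided by sample size.
mse_of_mean <- var(x) / length(x)

# Standard error of the mean, computed the usual way.
se_of_mean <- sd(x) / sqrt(length(x))

# The two quantities should agree up to floating-point tolerance.
print(isTRUE(all.equal(mse_of_mean, se_of_mean^2)))
```

This works on a plain numeric vector; for a one-column matrix such as the result of data.matrix(), var() returns a 1 x 1 matrix, so converting with as.vector() first may be more convenient.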

For regressions, MSE means the sum of squared residuals divided by the degrees of freedom of those residuals.

mse.lm <- function(lm_model) sum(residuals(lm_model) ^ 2) / lm_model$df.residual
#or
mse.lm <- function(lm_model) summary(lm_model)$sigma ^ 2
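To see that the two regression versions agree, here is a small sketch using the cars data set that ships with base R. The data set and the model formula are stand-ins chosen only for illustration; they are not related to the pune data.

```r
# 'cars' comes with base R (datasets package); it is only a stand-in here.
fit <- lm(dist ~ speed, data = cars)

# Residual sum of squares divided by the residual degrees of freedom.
mse_from_residuals <- sum(residuals(fit)^2) / fit$df.residual

# The same quantity, taken from the residual standard error in the summary.
mse_from_summary <- summary(fit)$sigma^2

# Both routes should give the same number up to floating-point tolerance.
print(isTRUE(all.equal(mse_from_residuals, mse_from_summary)))
```

Note that mean(residuals(fit)^2) divides by the number of observations rather than the residual degrees of freedom, so it gives a slightly smaller value.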


Seems like a lot of code for a simple calculation. Here is how I would do it for a data vector a:

a <- 1:10
mse_a <- sum((a - mean(a)) ^ 2) / length(a)

From what I can see, your formula for MSE is correct, but it should produce a single value for the whole dataset, not one value per loop iteration.

If your data contains only 22 points, I can't see why you need a 10,000-element vector, whether or not you use loops.
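A rough sketch of both readings, using invented numbers (y, m and candidates are hypothetical and not taken from pune.csv): a single MSE and MAPE for one constant forecast, and, assuming the 10,000 values in the question were meant as candidate forecasts to compare, a loop-free way to score a whole grid of them.

```r
# Hypothetical observations and a hypothetical constant forecast.
y <- c(100, 110, 90, 105, 95)
m <- 100

# One MSE and one MAPE (as a fraction) for the whole data set.
mse  <- mean((y - m)^2)
mape <- mean(abs((y - m) / y))

# Assumption: if many candidate forecasts are to be compared,
# sapply() scores each of them without an explicit for loop.
candidates <- seq(from = 90, to = 110, length.out = 5)
mse_grid  <- sapply(candidates, function(m) mean((y - m)^2))
mape_grid <- sapply(candidates, function(m) mean(abs((y - m) / y)))

# Candidate with the smallest MSE; for a constant forecast this is
# the grid point closest to mean(y).
best <- candidates[which.min(mse_grid)]
print(c(mse = mse, mape = mape, best = best))
```

MAPE divides by the observed values, so it is undefined when any observation is zero.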
