Function to calculate R2 (R-squared) in R

I have a data frame with observed and modelled data, and I would like to calculate the R2 value. I expected there to be a function I could call for this, but I can't find one. I know I can write my own and apply it, but am I missing something obvious? I want something like

obs <- 1:5
mod <- c(0.8,2.4,2,3,4.8)
df <- data.frame(obs, mod)

R2 <- rsq(df)
# 0.85
+11
5 answers

You need a little statistical knowledge to see this. R squared between two vectors is simply the square of their correlation. So you can define your function as:

rsq <- function (x, y) cor(x, y) ^ 2

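Sandipan's answer returns exactly the same result (see the proof below), but as it stands it may be more readable (because of the explicit $r.squared).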
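
As a quick check of that claim, here is a minimal sketch that applies both definitions to the data from the question. It uses only base R (cor(), lm(), summary()); the names rsq_cor and rsq_lm are made up for this illustration.

```r
obs <- 1:5
mod <- c(0.8, 2.4, 2, 3, 4.8)

# Squared correlation between the two vectors
rsq_cor <- function(x, y) cor(x, y)^2

# R squared reported by a simple linear regression of y on x
rsq_lm <- function(x, y) summary(lm(y ~ x))$r.squared

rsq_cor(obs, mod)
rsq_lm(obs, mod)

# The two values agree up to floating-point error
all.equal(rsq_cor(obs, mod), rsq_lm(obs, mod))
```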


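Basically, we fit a linear regression of y on x and compute the ratio of the regression sum of squares to the total sum of squares.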

Lemma 1: a regression y ~ x is equivalent to y - mean(y) ~ x - mean(x)

Lemma 2: beta = cov(x, y) / var(x)

Lemma 3: R.square = cor(x, y) ^ 2
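
The three lemmas can be checked numerically. The sketch below is only an illustration on simulated data (the seed, sample size and coefficients are arbitrary assumptions, not part of the original answer). The idea behind lemma 3 is that, after centring, the fitted values are beta * (x - mean(x)), so the regression sum of squares is beta^2 * sum((x - mean(x))^2); dividing by the total sum of squares and substituting lemma 2 gives cov(x, y)^2 / (var(x) * var(y)), which is cor(x, y)^2.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)       # made-up linear relationship plus noise

fit <- lm(y ~ x)

# Lemma 1: centring both variables leaves the slope unchanged
fit_centred <- lm(I(y - mean(y)) ~ I(x - mean(x)))
slope <- unname(coef(fit)[2])
slope_centred <- unname(coef(fit_centred)[2])
all.equal(slope, slope_centred)

# Lemma 2: the slope equals cov(x, y) / var(x)
all.equal(slope, cov(x, y) / var(x))

# Lemma 3: regression SS / total SS equals cor(x, y)^2
regss <- sum((fitted(fit) - mean(y))^2)
tss <- sum((y - mean(y))^2)
all.equal(regss / tss, cor(x, y)^2)
```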


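R squared between two arbitrary vectors x and y (of the same length) is just a measure of how good their linear relationship is. Think twice!! R squared between x + a and y + b is identical for any constant shifts a and b. So it is a weak or even useless measure of "goodness of prediction". Use MSE or RMSE instead.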
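
To make the shift invariance concrete, here is a small sketch on the data from the question. The helper names mse and rmse and the shift of 100 are assumptions chosen for illustration.

```r
obs <- 1:5
mod <- c(0.8, 2.4, 2, 3, 4.8)

rsq  <- function(x, y) cor(x, y)^2
mse  <- function(pred, actual) mean((pred - actual)^2)
rmse <- function(pred, actual) sqrt(mse(pred, actual))

# Shifting the "predictions" by a constant leaves R squared unchanged ...
shifted <- mod + 100
all.equal(rsq(obs, mod), rsq(obs, shifted))

# ... while RMSE registers how far off the shifted values are
rmse(mod, obs)
rmse(shifted, obs)
```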

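I agree with 42-'s comment:

R squared is reported by summary functions associated with regression functions. But only when such an estimate is statistically justified.

R squared can be a (but not the best) measure of "goodness of fit". But there is no justification that it can measure the goodness of out-of-sample prediction. If you split your data into training and test parts and fit a regression model on the training part, you get a valid R squared on the training part, but you can't legitimately compute an R squared on the test part. Some people do this, but I don't agree with it.

Here is a very extreme example: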

preds <- 1:4/4
actual <- 1:4

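The R squared between these two vectors is 1. Yes, of course, one is just a linear rescaling of the other, so they have a perfect linear relationship. But do you really think that preds is a good prediction of actual?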
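
Running the numbers for this example (a sketch using base R only) shows the mismatch: the squared correlation is 1, while the RMSE is roughly 2, which is large for values that range from 1 to 4.

```r
preds <- 1:4 / 4
actual <- 1:4

cor(preds, actual)^2               # squared correlation
sqrt(mean((preds - actual)^2))     # RMSE, in the units of actual
```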


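Thanks for your comments 1, 2 and for your detailed answer.

You have probably misunderstood the procedure. Given two vectors x and y, we first fit a regression line y ~ x, and then compute the regression sum of squares and the total sum of squares. It looks like you skip the regression step and go straight to the sum-of-squares computation. That is wrong, because the partition of the sum of squares does not hold then, and you can't compute R squared in a consistent way.

As you demonstrated, this is just one way of computing R squared: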

preds <- c(1, 2, 3)
actual <- c(2, 2, 4)
rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
rsq <- 1 - rss/tss
#[1] 0.25

But there is another:

regss <- sum((preds - mean(preds)) ^ 2) ## regression sum of squares
regss / tss
#[1] 0.75

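Also, your formula can give a negative value (the proper value should be 1, as mentioned in the warning above).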

preds <- 1:4 / 4
actual <- 1:4
rss <- sum((preds - actual) ^ 2)  ## residual sum of squares
tss <- sum((actual - mean(actual)) ^ 2)  ## total sum of squares
rsq <- 1 - rss/tss
#[1] -2.375
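
For comparison, here is a sketch of the procedure with the regression step included, applied to the three-point example above. Once the sums of squares are computed from the fitted values of lm(actual ~ preds) rather than from preds directly, the partition holds and the different formulas agree (all three give 0.75 here). The variable name fitted_vals is an assumption for illustration.

```r
preds <- c(1, 2, 3)
actual <- c(2, 2, 4)

# Regress actual on preds and work with the fitted values
fit <- lm(actual ~ preds)
fitted_vals <- fitted(fit)

tss   <- sum((actual - mean(actual))^2)       # total sum of squares
rss   <- sum((actual - fitted_vals)^2)        # residual sum of squares
regss <- sum((fitted_vals - mean(actual))^2)  # regression sum of squares

all.equal(tss, rss + regss)                   # the partition holds
c(1 - rss / tss, regss / tss, cor(preds, actual)^2)
```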

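I never expected this answer to become so long when I posted the initial version 2 years ago. However, given the number of views this thread gets, I feel obliged to add more statistical detail and discussion. I hope this material helps you understand when R squared is appropriate and when it is not.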

+16

Why not this:

rsq <- function(x, y) summary(lm(y~x))$r.squared
rsq(obs, mod)
#[1] 0.8560185
+9

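Not something obvious, but the caret package has a function postResample() that, according to the documentation, calculates "a vector of performance estimates". The "performance estimates" are

  • RMSE
  • Rsquared
  • mean absolute error (MAE)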

library(caret)
vect1 <- c(1, 2, 3)
vect2 <- c(3, 2, 2)
res <- caret::postResample(vect1, vect2)
rsq <- res[2]

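However, this r-squared is the squared-correlation version mentioned in another answer. It is not the conventional 1-SSE/SST.

The conventional coefficient of determination can be computed like this: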

preds <- c(1, 2, 3)
actual <- c(2, 2, 4)
rss <- sum((preds - actual) ^ 2)
tss <- sum((actual - mean(actual)) ^ 2)
rsq <- 1 - rss/tss

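Not too hard to code by hand, of course, but why isn't there a function for it in a language made primarily for statistics? I think I must be missing an implementation of R^2 somewhere.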

+4

You can also take it from the summary of a linear model:

summary(lm(obs ~ mod, data=df))$r.squared 
+2

Here is the simplest solution, based on https://en.wikipedia.org/wiki/Coefficient_of_determination:

# 1. 'Actual' and 'Predicted' data
df <- data.frame(
  y_actual = c(1:5),
  y_predicted  = c(0.8, 2.4, 2, 3, 4.8))

# 2. R2 Score components

# 2.1. Average of actual data
avr_y_actual <- mean(df$y_actual)

# 2.2. Total sum of squares
ss_total <- sum((df$y_actual - avr_y_actual)^2)

# 2.3. Regression sum of squares
ss_regression <- sum((df$y_predicted - avr_y_actual)^2)

# 2.4. Residual sum of squares
ss_residuals <- sum((df$y_actual - df$y_predicted)^2)

# 3. R2 Score
r2 <- 1 - ss_residuals / ss_total
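
The same steps can be wrapped in a small function, closer to what the question asks for. This is only a sketch: r2_score is a hypothetical name, not a function from base R or any package. Note that for the question's data it gives 0.776, which differs from the squared correlation (about 0.856), because no regression is fitted first.

```r
# Hypothetical helper: conventional R squared, 1 - SS_res / SS_tot
r2_score <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  ss_res <- sum((actual - predicted)^2)
  ss_tot <- sum((actual - mean(actual))^2)
  1 - ss_res / ss_tot
}

obs <- 1:5
mod <- c(0.8, 2.4, 2, 3, 4.8)
df <- data.frame(obs, mod)

r2_score(df$obs, df$mod)   # 1 - 2.24 / 10
cor(df$obs, df$mod)^2      # squared correlation, for comparison
```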
0

Source: https://habr.com/ru/post/1662477/

