Fast pairwise simple linear regression between variables in a data frame

Some statistical results / background (link in the picture: Function to calculate R2 (R-squared) in R). Computational details: the computation involved here is essentially that of the variance-covariance matrix. Once we have it, the results for all pairwise regressions are just element-wise matrix arithmetic. The variance-covariance matrix can be obtained with the R function cov, but functions …
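As a rough illustration of the idea above, the following sketch derives slopes, intercepts and R-squared for every pair of columns from cov() and the column means alone. The data frame dat and its column names are made up for the example, and this is only one way to arrange the arithmetic, not necessarily the one used in the full answer.

```r
# Toy data frame; the column names are hypothetical.
set.seed(1)
dat <- data.frame(x = rnorm(50), y = rnorm(50), z = rnorm(50))

V <- cov(dat)        # variance-covariance matrix
v <- diag(V)         # variance of each variable
m <- colMeans(dat)   # mean of each variable

# slope[i, j] is the slope of lm(column j ~ column i): cov(i, j) / var(i).
# Recycling v down the columns divides row i of V by v[i].
slope <- V / v

# intercept[i, j] = mean(j) - slope[i, j] * mean(i)
intercept <- matrix(m, nrow(V), ncol(V), byrow = TRUE) - slope * m
dimnames(intercept) <- dimnames(V)

# R-squared of a simple regression is the squared correlation.
r2 <- V^2 / outer(v, v)

# Compare one pair against lm().
fit <- lm(y ~ x, data = dat)
all.equal(unname(coef(fit)), c(intercept["x", "y"], slope["x", "y"]))
all.equal(summary(fit)$r.squared, r2["x", "y"])
```

The diagonal entries (a variable regressed on itself) are not meaningful and can be ignored.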

Extract regression coefficient values

A summary.lm object stores these values in a matrix called 'coefficients'. So the value you are after can be accessed with: a2Pval <- summary(mg)$coefficients[2, 4]. Or, more generally and readably: coef(summary(mg))["a2", "Pr(>|t|)"]. See here for why this method is preferred.
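A small self-contained sketch of both access styles follows. The excerpt's model mg and term a2 are not available here, so the built-in mtcars data and the term wt stand in for them.

```r
# mtcars and the term "wt" are stand-ins for the excerpt's mg and a2.
fit <- lm(mpg ~ wt, data = mtcars)

ctab <- coef(summary(fit))   # matrix: Estimate, Std. Error, t value, Pr(>|t|)

p_by_position <- ctab[2, 4]               # second row, fourth column
p_by_name     <- ctab["wt", "Pr(>|t|)"]   # same cell, addressed by name

identical(p_by_position, p_by_name)
```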

Pass a vector of variables into lm() formula

You're almost there. You just have to paste the entire formula together, something like this: paste("roll_pct ~ ", b, sep = ""). Then coerce it to an actual formula using as.formula and pass that to lm. Technically, I think lm may coerce a character string itself, but coercing it yourself is generally safer. (Some functions that expect …
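The sketch below shows the paste-then-coerce pattern end to end. It assumes b is a character vector of predictor names, which is why the names are first collapsed with " + ". The response mpg and the mtcars data are stand-ins for roll_pct and the asker's data.

```r
# Assumption: b holds predictor names; mpg/mtcars stand in for roll_pct and the real data.
b <- c("wt", "hp")

rhs <- paste(b, collapse = " + ")                 # "wt + hp"
fml <- as.formula(paste("mpg ~ ", rhs, sep = ""))

fit <- lm(fml, data = mtcars)
names(coef(fit))
```

reformulate(b, response = "mpg") from the stats package builds the same kind of formula without manual pasting, which may be preferable when the variable names come from user input.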

How does predict.lm() compute confidence interval and prediction interval?

When the interval and level arguments are specified, predict.lm can return a confidence interval (CI) or a prediction interval (PI). This answer shows how to obtain the CI and PI without setting these arguments. There are two ways: use the intermediate results from predict.lm, or do everything from scratch. Knowing how to work with both gives you a thorough understanding of …
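As a sketch of the first route (reusing intermediate results), the code below builds both intervals from the se.fit, df and residual.scale components that predict.lm returns when se.fit = TRUE. The model, the mtcars data and the new wt values are illustrative assumptions, not taken from the answer.

```r
# Illustrative model and new data (assumptions for this example).
fit <- lm(mpg ~ wt, data = mtcars)
new <- data.frame(wt = c(2.5, 3.5))

p <- predict(fit, new, se.fit = TRUE)   # list: fit, se.fit, df, residual.scale

level <- 0.95
tq <- qt(1 - (1 - level) / 2, df = p$df)   # two-sided t quantile

# Confidence interval: uncertainty in the estimated mean response.
ci <- cbind(fit = p$fit,
            lwr = p$fit - tq * p$se.fit,
            upr = p$fit + tq * p$se.fit)

# Prediction interval: adds the residual variance of a new observation.
se_pred <- sqrt(p$se.fit^2 + p$residual.scale^2)
pint <- cbind(fit = p$fit,
              lwr = p$fit - tq * se_pred,
              upr = p$fit + tq * se_pred)

# Both should agree with the built-in interval argument.
all.equal(ci, predict(fit, new, interval = "confidence", level = level),
          check.attributes = FALSE)
all.equal(pint, predict(fit, new, interval = "prediction", level = level),
          check.attributes = FALSE)
```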

Predict() – Maybe I’m not understanding it

First, you want to use model <- lm(Total ~ Coupon, data = df), not model <- lm(df$Total ~ df$Coupon, data = df). Second, by saying lm(Total ~ Coupon), you are fitting a model that uses Total as the response variable, with Coupon as the predictor. That is, your model is of the form Total = a + b*Coupon, with a …
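A short sketch of why the formula style matters for prediction: when the formula refers to plain column names, predict() can look up Coupon in newdata. The numbers in df are invented for the example.

```r
# Invented data; only the column names Total and Coupon come from the excerpt.
df <- data.frame(Coupon = c(1, 2, 3, 4, 5),
                 Total  = c(12, 15, 19, 22, 27))

model <- lm(Total ~ Coupon, data = df)

# newdata needs a column named exactly like the predictor in the formula.
new_coupons <- data.frame(Coupon = c(6, 7))
pred <- predict(model, newdata = new_coupons)

# The same values computed by hand from Total = a + b * Coupon.
by_hand <- coef(model)[1] + coef(model)[2] * new_coupons$Coupon
all.equal(unname(pred), unname(by_hand))
```

With the df$Total ~ df$Coupon form, the model's terms are named df$Coupon, so predict() cannot match them to the Coupon column in newdata and falls back to the original data.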

Linear Regression and group by in R

Since 2009, dplyr has been released, which provides a very nice way to do this kind of grouping, closely resembling what SAS does:

    library(dplyr)
    d <- data.frame(state = rep(c('NY', 'CA'), c(10, 10)),
                    year = rep(1:10, 2),
                    response = c(rnorm(10), rnorm(10)))
    fitted_models = d %>% group_by(state) %>% do(model = lm(response ~ year, data = .))
    # Source: local data frame [2 …
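For comparison, here is a base R sketch of the same per-group fit using split() and lapply(). This is a swapped-in alternative that avoids the dplyr dependency, not the approach from the answer, and the seed is set only to make the toy data reproducible.

```r
# Same toy layout as the excerpt; set.seed is an addition for reproducibility.
set.seed(42)
d <- data.frame(state = rep(c("NY", "CA"), c(10, 10)),
                year = rep(1:10, 2),
                response = rnorm(20))

# One data frame per state, then one lm fit per piece.
models <- lapply(split(d, d$state), function(g) lm(response ~ year, data = g))

# One row of coefficients per state.
coefs <- t(sapply(models, coef))
coefs
```

Each element of models is an ordinary lm object, so summary(models$NY) or predict(models$CA, ...) work as usual.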
