regression
Show confidence limits and prediction limits in scatter plot
Here’s what I put together. I tried to closely emulate your screenshot.

Given

    import numpy as np
    import scipy as sp
    import scipy.stats as stats
    import matplotlib.pyplot as plt
    %matplotlib inline

    # Raw Data
    heights = np.array([50,52,53,54,58,60,62,64,66,67,68,70,72,74,76,55,50,45,65])
    weights = np.array([25,50,55,75,80,85,50,65,85,55,45,45,50,75,95,65,50,40,45])

Two detailed options to plot confidence intervals:

    def plot_ci_manual(t, s_err, n, x, x2, y2, ax=None): …
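The excerpt is cut off before the body of the plotting function, so the following is not the answer's code. It is a minimal sketch of the textbook confidence and prediction half-widths for a simple linear fit, using the data from the excerpt. The helper name `regression_bands` and its `level` parameter are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

heights = np.array([50, 52, 53, 54, 58, 60, 62, 64, 66, 67, 68, 70, 72, 74, 76, 55, 50, 45, 65])
weights = np.array([25, 50, 55, 75, 80, 85, 50, 65, 85, 55, 45, 45, 50, 75, 95, 65, 50, 40, 45])


def regression_bands(x, y, x_new, level=0.95):
    """Hypothetical helper: fitted line plus confidence/prediction half-widths."""
    n = x.size
    slope, intercept = np.polyfit(x, y, 1)       # straight-line least-squares fit
    residuals = y - (slope * x + intercept)
    dof = n - 2                                  # two fitted parameters
    s_err = np.sqrt(np.sum(residuals ** 2) / dof)  # residual standard error
    t = stats.t.ppf(0.5 + level / 2, dof)        # two-sided Student-t quantile
    x_mean = x.mean()
    sxx = np.sum((x - x_mean) ** 2)
    leverage = 1 / n + (x_new - x_mean) ** 2 / sxx
    y_new = slope * x_new + intercept
    ci = t * s_err * np.sqrt(leverage)           # band for the mean response
    pi = t * s_err * np.sqrt(1 + leverage)       # band for a new observation
    return y_new, ci, pi


x_new = np.linspace(heights.min(), heights.max(), 100)
y_new, ci, pi = regression_bands(heights, weights, x_new)
```

Both bands can then be drawn over the scatter plot, for example with matplotlib's `ax.fill_between(x_new, y_new - ci, y_new + ci, alpha=0.3)` for the confidence band and dashed lines at `y_new ± pi` for the prediction limits.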
fitting data with numpy
Unfortunately, np.polynomial.polynomial.polyfit returns the coefficients in the opposite order from np.polyfit and np.polyval (or, as you used, np.poly1d). To illustrate:

    In [40]: np.polynomial.polynomial.polyfit(x, y, 4)
    Out[40]: array([ 84.29340848, -100.53595376, 44.83281408, -8.85931101, 0.65459882])

    In [41]: np.polyfit(x, y, 4)
    Out[41]: array([ 0.65459882, -8.859311 , 44.83281407, -100.53595375, 84.29340846])

In general: np.polynomial.polynomial.polyfit returns coefficients [A, B, C] …
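The ordering difference can be checked on a small example. This is a sketch with made-up data (an exact quadratic, chosen here so the expected coefficients are obvious), not the data from the original question.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Assumed sample data: y = 1 + 2x + 3x^2, so the coefficients are known.
x = np.linspace(0.0, 4.0, 20)
y = 1.0 + 2.0 * x + 3.0 * x ** 2

low_first = P.polyfit(x, y, 2)    # lowest degree first: about [1, 2, 3]
high_first = np.polyfit(x, y, 2)  # highest degree first: about [3, 2, 1]

# Reversing one array gives the other.
same_coefficients = np.allclose(low_first, high_first[::-1])

# Each evaluator expects its own ordering (and its own argument order).
same_values = np.allclose(P.polyval(x, low_first), np.polyval(high_first, x))
```

Mixing the two families, for instance passing `low_first` to `np.polyval` or `np.poly1d`, evaluates a different polynomial unless the array is reversed first.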
scikit-learn cross validation, negative values with mean squared error
Trying to close this out, so I am providing the answer that David and larsmans have eloquently described in the comments section: Yes, this is supposed to happen. The actual MSE is simply the positive version of the number you’re getting. The unified scoring API always maximizes the score, so scores that need to be minimized …
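The "always maximize" convention can be illustrated without scikit-learn itself. The sketch below is a hand-rolled stand-in, not the library's implementation: `neg_mse` and the two candidate prediction arrays are assumptions made up for the example. In scikit-learn the corresponding scoring string is `neg_mean_squared_error`.

```python
import numpy as np


def neg_mse(y_true, y_pred):
    """Hypothetical scorer: negated MSE, so that larger means better."""
    return -np.mean((y_true - y_pred) ** 2)


y_true = np.array([1.0, 2.0, 3.0, 4.0])
candidates = {
    "good": np.array([1.1, 1.9, 3.2, 3.8]),
    "bad": np.array([2.0, 0.0, 5.0, 1.0]),
}

# Every score is <= 0; the model with the smallest error has the largest score.
scores = {name: neg_mse(y_true, pred) for name, pred in candidates.items()}
best = max(scores, key=scores.get)

# Flip the sign to recover the ordinary (positive) MSE.
mse = {name: -score for name, score in scores.items()}
```

Because the selection step uses `max`, the same code would work unchanged for a score such as accuracy, which is the point of negating loss-style metrics.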
predict.lm() with an unknown factor level in test data
You have to remove the extra levels before any calculation, like:

    > id <- which(!(foo.new$predictor %in% levels(foo$predictor)))
    > foo.new$predictor[id] <- NA
    > predict(model,newdata=foo.new)
             1          2          3          4
    -0.1676941 -0.6454521  0.4524391         NA

This is a more general way of doing it; it will set all levels that do not occur in the original data to NA. …
What does the capital letter “I” in R linear regression formula mean?
I isolates or insulates the contents of I( … ) from the gaze of R’s formula parsing code. It allows the standard R operators to work as they would if you used them outside of a formula, rather than being treated as special formula operators. For example: y ~ x + x^2 would, to R, …