Show confidence limits and prediction limits in scatter plot

Here’s what I put together. I tried to closely emulate your screenshot. Given

    import numpy as np
    import scipy as sp
    import scipy.stats as stats
    import matplotlib.pyplot as plt
    %matplotlib inline

    # Raw Data
    heights = np.array([50,52,53,54,58,60,62,64,66,67,68,70,72,74,76,55,50,45,65])
    weights = np.array([25,50,55,75,80,85,50,65,85,55,45,45,50,75,95,65,50,40,45])

Two detailed options to plot confidence intervals:

    def plot_ci_manual(t, s_err, n, x, x2, y2, ax=None):

… Read more
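
The plotting helpers are cut off above, so here is a minimal, self-contained sketch of the general recipe they implement: fit a straight line, then draw a 95% confidence band for the mean response and 95% prediction limits for new observations. The formulas below are the textbook ones for simple linear regression and are my assumption about what the truncated plot_ci_manual computes, not a copy of it:

    import numpy as np
    import scipy.stats as stats
    import matplotlib.pyplot as plt

    # Raw data from the question
    heights = np.array([50,52,53,54,58,60,62,64,66,67,68,70,72,74,76,55,50,45,65])
    weights = np.array([25,50,55,75,80,85,50,65,85,55,45,45,50,75,95,65,50,40,45])

    x, y = heights, weights
    n = x.size
    m = 2                          # fitted parameters: slope and intercept
    dof = n - m                    # degrees of freedom
    t = stats.t.ppf(0.975, dof)    # two-sided 95% critical value

    # Ordinary least-squares line and residual standard error
    p = np.polyfit(x, y, 1)
    resid = y - np.polyval(p, x)
    s_err = np.sqrt(np.sum(resid**2) / dof)

    # Smooth grid on which to draw the bands
    x2 = np.linspace(x.min(), x.max(), 100)
    y2 = np.polyval(p, x2)

    # Half-widths: confidence band for the mean, prediction band for new points
    spread = (x2 - x.mean())**2 / np.sum((x - x.mean())**2)
    ci = t * s_err * np.sqrt(1/n + spread)
    pi = t * s_err * np.sqrt(1 + 1/n + spread)

    fig, ax = plt.subplots()
    ax.plot(x, y, "o", label="data")
    ax.plot(x2, y2, "-", label="fit")
    ax.fill_between(x2, y2 - ci, y2 + ci, alpha=0.3, label="95% confidence band")
    ax.plot(x2, y2 - pi, "--", label="95% prediction limits")
    ax.plot(x2, y2 + pi, "--")
    ax.set_xlabel("height")
    ax.set_ylabel("weight")
    ax.legend()
    plt.show()

The prediction limits are wider than the confidence band because they also account for the residual scatter of a single new observation, not just the uncertainty of the fitted mean.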

fitting data with numpy

Unfortunately, np.polynomial.polynomial.polyfit returns the coefficients in the opposite order to that of np.polyfit and np.polyval (or, as you used, np.poly1d). To illustrate:

    In [40]: np.polynomial.polynomial.polyfit(x, y, 4)
    Out[40]: array([  84.29340848, -100.53595376,   44.83281408,   -8.85931101,    0.65459882])

    In [41]: np.polyfit(x, y, 4)
    Out[41]: array([   0.65459882,   -8.859311  ,   44.83281407, -100.53595375,   84.29340846])

In general: np.polynomial.polynomial.polyfit returns coefficients [A, B, C] … Read more
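
To make the order difference concrete, here is a small self-contained check; the data below are made up for illustration and are not the x and y from the question:

    import numpy as np

    # Hypothetical sample data, just to expose the coefficient-order difference
    x = np.linspace(0, 2, 20)
    y = 1.0 + 2.0*x - 3.0*x**2 + 0.5*x**3

    c_new = np.polynomial.polynomial.polyfit(x, y, 3)   # lowest degree first: [c0, c1, c2, c3]
    c_old = np.polyfit(x, y, 3)                         # highest degree first: [c3, c2, c1, c0]

    print(np.allclose(c_new, c_old[::-1]))              # True: same coefficients, reversed order

    # Each convention has its own evaluator; mixing them silently gives wrong values
    y_new = np.polynomial.polynomial.polyval(x, c_new)  # expects lowest-degree-first
    y_old = np.polyval(c_old, x)                        # expects highest-degree-first
    print(np.allclose(y_new, y), np.allclose(y_old, y)) # True True

The practical rule is to pair each fitting routine with the evaluator from the same family (or reverse the coefficient array) rather than passing one family's coefficients to the other's polyval.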

scikit-learn cross validation, negative values with mean squared error

Trying to close this out, so I am providing the answer that David and larsmans have eloquently described in the comments section: Yes, this is supposed to happen. The actual MSE is simply the positive version of the number you’re getting. The unified scoring API always maximizes the score, so scores which need to be minimized … Read more
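
In other words, losses are negated so that "greater is better" holds for every metric, and you simply flip the sign to recover the usual MSE. A minimal sketch with current scikit-learn, where the scorer is now spelled neg_mean_squared_error (in the older versions the question refers to, the name mean_squared_error already returned negated values); the toy dataset below is made up for illustration:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    # Toy regression problem, purely for illustration
    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

    # The scorer returns *negated* MSE so that higher scores are always better
    scores = cross_val_score(LinearRegression(), X, y,
                             scoring="neg_mean_squared_error", cv=5)

    print(scores)          # every value is <= 0
    print(-scores.mean())  # flip the sign to report the usual, positive MSE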

predict.lm() with an unknown factor level in test data

You have to remove the extra levels before any calculation, like:

    > id <- which(!(foo.new$predictor %in% levels(foo$predictor)))
    > foo.new$predictor[id] <- NA
    > predict(model, newdata = foo.new)
             1          2          3          4
    -0.1676941 -0.6454521  0.4524391         NA

This is a more general way of doing it; it will set all levels that do not occur in the original data to NA. … Read more