Select k random elements from a list whose elements have weights

If the sampling is with replacement, you can use this algorithm (implemented here in Python):

    import random

    items = [(10, "low"), (100, "mid"), (890, "large")]

    def weighted_sample(items, n):
        total = float(sum(w for w, v in items))
        i = 0
        w, v = items[0]
        while n:
            x = total * (1 - random.random() ** (1.0 / …
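Since the excerpt is cut off, here is a separate minimal sketch of weighted sampling with replacement. It is not the answer's algorithm: it swaps in the simpler cumulative-weight lookup with binary search. The function name `weighted_sample_with_replacement` and the `rng` parameter are hypothetical, and items are assumed to be `(weight, value)` pairs with non-negative weights and a positive total, as in the excerpt.

```python
import bisect
import itertools
import random


def weighted_sample_with_replacement(items, k, rng=random):
    """Draw k values with replacement from (weight, value) pairs.

    Hypothetical helper: assumes non-negative weights with a positive sum.
    """
    weights = [w for w, _ in items]
    values = [v for _, v in items]
    # Running totals, e.g. [10, 110, 1000] for weights 10, 100, 890.
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    picks = []
    for _ in range(k):
        # Uniform point in [0, total); the interval it lands in picks the item.
        x = rng.random() * total
        idx = bisect.bisect_right(cumulative, x)
        # Guard against floating-point rounding pushing x up to total.
        idx = min(idx, len(values) - 1)
        picks.append(values[idx])
    return picks


if __name__ == "__main__":
    items = [(10, "low"), (100, "mid"), (890, "large")]
    print(weighted_sample_with_replacement(items, 5))
```

Each draw costs one binary search, so k draws take O(n + k log n) time. On Python 3.6 and later, the standard library's `random.choices(values, weights=weights, k=k)` should give the same kind of result in one call.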

How to find the statistical mode?

One more solution, which works for both numeric & character/factor data:

    Mode <- function(x) {
      ux <- unique(x)
      ux[which.max(tabulate(match(x, ux)))]
    }

On my dinky little machine, that can generate & find the mode of a 10M-integer vector in about half a second. If your data set might have multiple modes, the above solution takes the …
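For comparison, the same count-the-unique-values idea can be sketched in Python. This is an illustration rather than a port of the answer: the function name `modes` is hypothetical, and unlike the R function, which returns a single value because `which.max` stops at the first maximum, this version returns every value tied for the highest count.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, in order of first appearance.

    Hypothetical helper: works for any hashable items (numbers, strings).
    """
    counts = Counter(values)
    if not counts:
        return []  # empty input has no mode
    top = max(counts.values())
    # Counter keeps insertion order, so ties come out in first-seen order.
    return [value for value, count in counts.items() if count == top]


if __name__ == "__main__":
    print(modes([1, 2, 2, 3, 3]))
    print(modes(["a", "b", "a"]))
```

Taking the first element of the returned list gives the single-mode behaviour. On Python 3.8 and later, `statistics.multimode` in the standard library covers the same case.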

Fitting empirical distribution to theoretical ones with Scipy (Python)?

Distribution Fitting with Sum of Square Error (SSE)

This is an update and modification to Saullo’s answer that uses the full list of the current scipy.stats distributions and returns the distribution with the least SSE between the distribution’s histogram and the data’s histogram.

Example Fitting

Using the El Niño dataset from statsmodels, the distributions are …
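The SSE comparison described here can be sketched as follows. This is a reduced illustration under stated assumptions, not the answer's code: it tries only three candidate distributions instead of the full scipy.stats list, it uses synthetic normal data in place of the El Niño dataset, and the function name `best_fit_by_sse` and its parameters are hypothetical.

```python
import numpy as np
from scipy import stats


def best_fit_by_sse(data, candidates=("norm", "expon", "uniform"), bins=50):
    """Fit each candidate distribution and rank by histogram SSE.

    Hypothetical helper; returns (sse, name, params) tuples, best first.
    """
    # Empirical density, evaluated at the centre of each histogram bin.
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2.0
    results = []
    for name in candidates:
        dist = getattr(stats, name)
        # Maximum-likelihood fit; params are (shapes..., loc, scale).
        params = dist.fit(data)
        pdf = dist.pdf(centers, *params)
        sse = float(np.sum((density - pdf) ** 2))
        results.append((sse, name, params))
    results.sort(key=lambda r: r[0])
    return results


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=5.0, scale=2.0, size=2000)  # synthetic data
    for sse, name, params in best_fit_by_sse(sample):
        print(name, sse, params)
```

The SSE depends on the number of bins, so rankings between closely matched distributions may change with the `bins` setting.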
