T-test in Pandas

It depends on what sort of t-test you want to do (one-sided or two-sided, dependent or independent), but it should be as simple as:

    from scipy.stats import ttest_ind

    cat1 = my_data[my_data['Category'] == 'cat1']
    cat2 = my_data[my_data['Category'] == 'cat2']

    ttest_ind(cat1['values'], cat2['values'])
    >>> (1.4927289925706944, 0.16970867501294376)

It returns a tuple with the t-statistic and the p-value. See here for other t-tests …
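As a rough sketch of how the other variants mentioned above (one-sided, dependent) might look, the example below builds a small made-up DataFrame in place of my_data and runs several SciPy tests on it. The numbers are invented for illustration only, the paired test assumes the two groups are matched row by row, and the alternative= keyword assumes SciPy 1.6 or later.

```python
import pandas as pd
from scipy.stats import ttest_ind, ttest_rel

# Hypothetical data standing in for my_data; the values are made up.
my_data = pd.DataFrame({
    "Category": ["cat1"] * 5 + ["cat2"] * 5,
    "values": [5.1, 4.9, 5.6, 5.8, 6.0, 4.2, 4.8, 5.0, 4.4, 4.6],
})

# Select the 'values' column for each category.
cat1 = my_data.loc[my_data["Category"] == "cat1", "values"]
cat2 = my_data.loc[my_data["Category"] == "cat2", "values"]

# Independent samples, two-sided (the default), assuming equal variances.
t_ind, p_ind = ttest_ind(cat1, cat2)

# Independent samples without the equal-variance assumption (Welch's t-test).
t_welch, p_welch = ttest_ind(cat1, cat2, equal_var=False)

# One-sided alternative: mean of cat1 greater than mean of cat2.
# Assumes SciPy >= 1.6, where the 'alternative' keyword is available.
t_greater, p_greater = ttest_ind(cat1, cat2, alternative="greater")

# Dependent (paired) samples: only meaningful if row i of cat1 and
# row i of cat2 really are paired measurements, and lengths match.
t_rel, p_rel = ttest_rel(cat1.to_numpy(), cat2.to_numpy())

print(t_ind, p_ind)
print(t_welch, p_welch)
print(t_greater, p_greater)
print(t_rel, p_rel)
```

Each call returns a result that unpacks into the t-statistic and the p-value, in the same order as the tuple shown above.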

How to incrementally sample without replacement?

If you know in advance that you're going to want multiple samples without overlaps, the easiest approach is to call random.shuffle() on list(range(100)) (Python 3; you can skip the list() in Python 2), then peel off slices as needed.

    import random

    s = list(range(100))
    random.shuffle(s)
    first_sample = s[-10:]
    del s[-10:]
    second_sample = s[-10:]
    del s[-10:]
    # etc

Else …
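The shuffle-and-slice idea above can be wrapped in a small helper. This is only a sketch: the function name draw_samples, its parameters, and the optional seed are hypothetical and not part of the original answer.

```python
import random


def draw_samples(population, sample_size, n_samples, seed=None):
    """Return n_samples non-overlapping samples plus the unused items.

    Hypothetical helper: shuffles a copy of the population once, then
    peels fixed-size slices off the end, as in the answer above.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    pool = list(population)
    if sample_size * n_samples > len(pool):
        raise ValueError("not enough items for that many samples")

    rng = random.Random(seed)  # a private generator keeps runs reproducible
    rng.shuffle(pool)

    samples = []
    for _ in range(n_samples):
        samples.append(pool[-sample_size:])  # take the last sample_size items
        del pool[-sample_size:]              # remove them so they cannot recur
    return samples, pool


samples, remaining = draw_samples(range(100), sample_size=10, n_samples=3, seed=42)
print(samples[0])
print(len(remaining))
```

Because the pool is shuffled once and every slice is deleted after it is taken, no item can appear in more than one sample, and whatever is left over is still available for later draws.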