Remove NaN/NULL columns in a Pandas dataframe?

Yes, dropna. See http://pandas.pydata.org/pandas-docs/stable/missing_data.html and the DataFrame.dropna docstring:

    Definition: DataFrame.dropna(self, axis=0, how='any', thresh=None, subset=None)
    Docstring:
    Return object with labels on given axis omitted where alternately any
    or all of the data are missing

    Parameters
    ----------
    axis : {0, 1}
    how : {'any', 'all'}
        any : if any NA values are present, drop that label
        all …
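As a minimal sketch of how the `axis` and `how` parameters from the docstring apply to columns: the frame below and its column names (`a`, `b`, `c`) are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame: "b" is entirely NaN, "c" has one NaN.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [1.0, np.nan, 3.0],
})

# axis=1 operates on columns; how="all" drops only columns that are entirely NaN.
all_nan_dropped = df.dropna(axis=1, how="all")

# how="any" drops every column that contains at least one NaN.
any_nan_dropped = df.dropna(axis=1, how="any")

print(list(all_nan_dropped.columns))
print(list(any_nan_dropped.columns))
```

Both calls return a new DataFrame and leave `df` unchanged.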

Get weekday/day-of-week for Datetime column of DataFrame

Use the new dt.dayofweek property:

    In [2]: df['weekday'] = df['Timestamp'].dt.dayofweek
            df
    Out[2]:
                Timestamp  Value  weekday
    0 2012-06-01 00:00:00    100        4
    1 2012-06-01 00:15:00    150        4
    2 2012-06-01 00:30:00    120        4
    3 2012-06-01 01:00:00    220        4
    4 2012-06-01 01:15:00     80        4

In the situation where the Timestamp is your index you need to reset the index …
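A self-contained sketch of the same idea, assuming a `Timestamp` column of datetime dtype. The sample rows are illustrative, and the `reset_index` step is only one possible reading of the truncated index remark above.

```python
import pandas as pd

# Illustrative data: two Friday timestamps and one Saturday timestamp.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2012-06-01 00:00:00",
        "2012-06-01 00:15:00",
        "2012-06-02 00:30:00",
    ]),
    "Value": [100, 150, 120],
})

# .dt.dayofweek numbers the days Monday=0 through Sunday=6.
df["weekday"] = df["Timestamp"].dt.dayofweek

# When the timestamps live in the index, one option is to move them
# back into a column first and then use the same accessor.
indexed = df[["Timestamp", "Value"]].set_index("Timestamp")
flat = indexed.reset_index()
flat["weekday"] = flat["Timestamp"].dt.dayofweek

print(df)
```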

How to get number of groups in a groupby object in pandas?

Simple, Fast, and Pandaic: ngroups

Newer versions of the groupby API (pandas >= 0.23) provide this (undocumented) attribute which stores the number of groups in a GroupBy object.

    # setup
    df = pd.DataFrame({'A': list('aabbcccd')})
    dfg = df.groupby('A')

    # call `.ngroups` on the GroupBy object
    dfg.ngroups
    # 4

Note that this is different from GroupBy.groups which …
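For comparison, here is a short sketch that counts the groups three ways on the same data. The extra column `B` and the variable names are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Same key column as in the answer, plus a hypothetical value column.
df = pd.DataFrame({"A": list("aabbcccd"), "B": range(8)})
dfg = df.groupby("A")

# Number of groups, read directly from the GroupBy object.
n_groups = dfg.ngroups

# .groups is a mapping from group key to row labels, so its length is also the group count.
n_from_groups = len(dfg.groups)

# For a single key column without missing values, nunique() gives the same number.
n_unique = df["A"].nunique()

print(n_groups, n_from_groups, n_unique)
```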

Making matplotlib scatter plots from dataframes in Python’s pandas

Try passing columns of the DataFrame directly to matplotlib, as in the examples below, instead of extracting them as numpy arrays.

    df = pd.DataFrame(np.random.randn(10,2), columns=['col1','col2'])
    df['col3'] = np.arange(len(df))**2 * 100 + 100

    In [5]: df
    Out[5]:
           col1      col2  col3
    0 -1.000075 -0.759910   100
    1  0.510382  0.972615   200
    2  1.872067 -0.731010   500
    3  0.131612  1.075142  1000
    …
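A minimal sketch of passing the columns straight to matplotlib. Using `col3` as the marker size is an assumption made here for illustration, since the excerpt is cut off before the plotting call. The seeded generator and the `Agg` backend are also choices made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Seeded random data so the example is reproducible (the original uses np.random.randn).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 2)), columns=["col1", "col2"])
df["col3"] = np.arange(len(df)) ** 2 * 100 + 100

fig, ax = plt.subplots()

# The Series are passed as-is; no explicit conversion to numpy arrays is needed.
# Assumption: col3 drives the marker area through the `s` argument.
points = ax.scatter(df["col1"], df["col2"], s=df["col3"])
ax.set_xlabel("col1")
ax.set_ylabel("col2")
```

Calling `fig.savefig(...)` or, with an interactive backend, `plt.show()` would then render the figure.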
