pandas-groupby – Page 2 – Make Me Engineer

Python Pandas Group by date using datetime data

October 8, 2022 by Tarik

You can group by the calendar date of the Date_Time column using dt.date:

    df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

    df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                       'B': [4, 5, 6]})

    print (df)
       B           Date_Time
    0  4 2001-10-01 10:00:00
    1  5 2001-10-01 20:00:00
    2  6 2001-10-02 06:00:00

    print (df['Date_Time'].dt.date)
    0    2001-10-01
    1    2001-10-01
    2    2001-10-02
    Name: Date_Time, dtype: object

    df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
    print(df)

…
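As a self-contained sketch of the same idea, the snippet below rebuilds the three sample rows and averages B per calendar date. The timestamps are written out explicitly instead of using date_range, and the name daily_mean is only an illustrative choice, not part of the original answer.

```python
import pandas as pd

# Sample rows matching the three timestamps printed in the excerpt.
df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2001-10-01 10:00:00",
        "2001-10-01 20:00:00",
        "2001-10-02 06:00:00",
    ]),
    "B": [4, 5, 6],
})

# Group on the calendar date of each timestamp and average column B.
daily_mean = df.groupby(df["Date_Time"].dt.date)["B"].mean()
print(daily_mean)
```

The two rows on 2001-10-01 fall into one group and the single row on 2001-10-02 into another, so the result has one entry per day.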

How to drop duplicates based on two or more subsets criteria in Pandas data-frame

September 30, 2022 by Tarik

Your syntax is wrong. Here's the correct way:

    df.drop_duplicates(subset=['bio', 'center', 'outcome'])

Or in this specific case, just simply:

    df.drop_duplicates()

Both return the following:

       bio center outcome
    0    1    one       f
    2    1    two       f
    3    4  three       f

Take a look at the df.drop_duplicates documentation for syntax details. subset should be a sequence of column …
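A minimal sketch of the call follows. The input frame is hypothetical: the excerpt does not show the original data, so the rows here were chosen only so that the result has the same index and values as the output printed above.

```python
import pandas as pd

# Hypothetical input (an assumption): row 1 repeats row 0 exactly.
df = pd.DataFrame({
    "bio": [1, 1, 1, 4],
    "center": ["one", "one", "two", "three"],
    "outcome": ["f", "f", "f", "f"],
})

# Rows count as duplicates only when all listed columns match.
deduped = df.drop_duplicates(subset=["bio", "center", "outcome"])
print(deduped)
```

Because the subset here names every column, calling df.drop_duplicates() with no arguments gives the same result for this frame.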

pandas dataframe groupby datetime month

August 6, 2022 by Tarik

Managed to do it:

    b = pd.read_csv('b.dat')
    b.index = pd.to_datetime(b['date'], format="%m/%d/%y %I:%M%p")
    b.groupby(by=[b.index.month, b.index.year])

Or:

    b.groupby(pd.Grouper(freq='M'))  # update for v0.21+
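The sketch below shows the first variant end to end. Since b.dat is not available, it builds a small in-memory frame instead; the value column and its contents are made-up illustration data, and only the date column name and the format string come from the excerpt.

```python
import pandas as pd

# Hypothetical stand-in for the contents of b.dat (an assumption).
b = pd.DataFrame({
    "date": ["01/05/21 09:30AM", "01/20/21 04:15PM", "02/03/21 11:00AM"],
    "value": [10, 20, 30],
})

# Parse the strings with the format from the excerpt and use them as the index.
b.index = pd.to_datetime(b["date"], format="%m/%d/%y %I:%M%p")

# Group by (year, month) taken from the DatetimeIndex, then aggregate.
keys = [b.index.year.rename("year"), b.index.month.rename("month")]
monthly = b.groupby(keys)["value"].sum()
print(monthly)
```

Grouping on year and month together keeps the same month of different years apart, which grouping on the month alone would not.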

Pandas Groupby and Sum Only One Column

July 26, 2022 by Tarik

The only way to do this would be to include C in your groupby (the groupby function can accept a list). Give this a try:

    df.groupby(['A','C'])['B'].sum()

One other thing to note: if you need to work with df after the aggregation, you can also use the as_index=False option to return a dataframe object. This one …
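To illustrate the as_index=False option mentioned above, here is a small sketch. The column names A, B and C come from the excerpt; the values in the frame are invented for the example.

```python
import pandas as pd

# Made-up sample values (an assumption); only the column names are from the excerpt.
df = pd.DataFrame({
    "A": ["x", "x", "y"],
    "C": ["p", "p", "q"],
    "B": [1, 2, 3],
})

# Sum only column B per (A, C) pair; as_index=False keeps A and C as
# ordinary columns, so the result is a flat DataFrame.
summed = df.groupby(["A", "C"], as_index=False)["B"].sum()
print(summed)
```

Without as_index=False, the same call returns a Series indexed by a MultiIndex of (A, C).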

Pandas number rows within group in increasing order

July 26, 2022 by Tarik

Use groupby/cumcount:

    In [25]: df['C'] = df.groupby(['A','B']).cumcount()+1; df
    Out[25]:
       A  B  C
    0  A  a  1
    1  A  a  2
    2  A  b  1
    3  B  a  1
    4  B  a  2
    5  B  a  3
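For a version that runs outside an interactive session, the sketch below rebuilds the A and B columns from the output shown above and applies the same cumcount call.

```python
import pandas as pd

# Columns A and B reconstructed from the printed output.
df = pd.DataFrame({
    "A": ["A", "A", "A", "B", "B", "B"],
    "B": ["a", "a", "b", "a", "a", "a"],
})

# cumcount() numbers the rows of each (A, B) group starting at 0,
# so adding 1 gives a 1-based counter within each group.
df["C"] = df.groupby(["A", "B"]).cumcount() + 1
print(df)
```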

Use pandas.shift() within a group

July 16, 2022 by Tarik

Pandas grouped objects have a shift method (DataFrameGroupBy.shift), which shifts the selected column by n periods within each group, just like the regular DataFrame shift method:

    df['prev_value'] = df.groupby('object')['value'].shift()

For the following example dataframe:

    print(df)
       object  period  value
    0       1       1     24
    1       1       2     67
    2       1       4     89
    3       2       4      5
    4       2 …
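A runnable sketch of the grouped shift follows. It uses only the four complete rows visible in the truncated excerpt, so the second group contains a single row here; the rest of the original data is not shown and is not reconstructed.

```python
import pandas as pd

# Only the four fully visible rows from the excerpt are used.
df = pd.DataFrame({
    "object": [1, 1, 1, 2],
    "period": [1, 2, 4, 4],
    "value": [24, 67, 89, 5],
})

# Shift value down by one row within each object group; the first row
# of every group has no predecessor and becomes NaN.
df["prev_value"] = df.groupby("object")["value"].shift()
print(df)
```

The shift never crosses a group boundary, so the first row of object 2 does not pick up the last value of object 1.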