pandas extract year from datetime: df['year'] = df['date'].year is not working

If you're running a reasonably recent version of pandas, you can use the datetime accessor dt to access the datetime components:

    In [6]: df['date'] = pd.to_datetime(df['date'])
            df['year'], df['month'] = df['date'].dt.year, df['date'].dt.month
            df
    Out[6]:
            date  Count  year  month
    0 2010-06-30    525  2010      6
    1 2010-07-30    136  2010      7
    2 2010-08-31    125  2010      8
    3 2010-09-30     84 …
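To make the difference concrete, here is a minimal sketch that reuses the first three rows of the output above. Accessing .year directly on the column fails because a Series has no such attribute, while the dt accessor exposes the component for every row.

```python
import pandas as pd

# Sample rows copied from the output shown above
df = pd.DataFrame({
    "date": ["2010-06-30", "2010-07-30", "2010-08-31"],
    "Count": [525, 136, 125],
})

# Convert the string column to datetime64 first
df["date"] = pd.to_datetime(df["date"])

# A Series has no .year attribute, so the original attempt raises
try:
    df["date"].year
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# The .dt accessor exposes the datetime components element-wise
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

print(df)
```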

How to stream DataFrame using FastAPI without saving the data to csv file?

Approach 1 (recommended)

As mentioned in this answer, as well as here and here, when all of the data (a DataFrame in your case) is already loaded into memory, there is no need to use StreamingResponse. StreamingResponse makes sense when you want to transfer real-time data and when you don't know the size of your output …

DataFrame object has no attribute append

As of pandas 2.0, append (previously deprecated) was removed. You need to use concat instead (for most applications):

    df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

As noted by @cottontail, it's also possible to use loc, although this only works if the new index is not already present in the DataFrame (typically, this will be the case if …
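A short sketch of both options follows. The column names and row values are made up for illustration; only the concat call itself comes from the answer above.

```python
import pandas as pd

# Hypothetical starting data
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# Option 1: wrap the new row in a one-row DataFrame and concatenate
new_row = {"name": "c", "value": 3}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# Option 2: assign to a label that does not exist yet.
# With a default RangeIndex, len(df) is the next unused label;
# if the label already existed, this would overwrite that row instead.
df.loc[len(df)] = ["d", 4]

print(df)
```

For many rows, collecting them in a list and calling concat once is usually cheaper than growing the DataFrame one row at a time.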

How to calculate mean values grouped on another column

You could groupby on StationID and then take mean() on BiasTemp. To get a DataFrame back, pass as_index=False:

    In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
    Out[4]:
      StationID  BiasTemp
    0        BB       5.0
    1     KEOPS       2.5
    2    SS0279      15.0

Without as_index=False, it returns a Series instead:

    In [5]: df.groupby('StationID')['BiasTemp'].mean()
    Out[5]:
    StationID
    BB         5.0
    KEOPS      2.5
    SS0279    15.0
    Name: BiasTemp, dtype: float64

Read …
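The excerpt does not include the input data, so the sketch below uses invented BiasTemp values chosen to give the same group means as the output above. It shows the two return types side by side.

```python
import pandas as pd

# Invented input; only the resulting means match the output shown above
df = pd.DataFrame({
    "StationID": ["SS0279", "SS0279", "KEOPS", "KEOPS", "BB"],
    "BiasTemp": [10.0, 20.0, 0.0, 5.0, 5.0],
})

# as_index=False keeps StationID as a regular column, giving a DataFrame
means_df = df.groupby("StationID", as_index=False)["BiasTemp"].mean()

# Default behaviour: StationID becomes the index of a Series
means_series = df.groupby("StationID")["BiasTemp"].mean()

print(means_df)
print(means_series)
```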

Remove row with null value from pandas data frame

This should do the job:

    df = df.dropna(how='any', axis=0)

It removes every row (axis=0) that contains at least one null value (how='any').

Example:

    # Recreate a random DataFrame with NaN values
    df = pd.DataFrame(index=pd.date_range('2017-01-01', '2017-01-10', freq='1D'))
    # Average speed in miles per hour
    df['A'] = np.random.randint(low=198, high=205, size=len(df.index))
    df['B'] = np.random.random(size=len(df.index))*2
    # Create dummy NaN value on …
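Since the example above is cut off, here is a smaller self-contained sketch with hand-placed NaN values (the data is invented). It also contrasts how='any' with how='all', which only drops rows where every value is null.

```python
import numpy as np
import pandas as pd

# Invented data: row 1 is missing A, row 2 is missing B
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [0.5, 0.7, np.nan],
})

# Drop rows that contain at least one null value
cleaned_any = df.dropna(how="any", axis=0)

# Drop only rows where every value is null (none here)
cleaned_all = df.dropna(how="all", axis=0)

print(cleaned_any)
print(cleaned_all)
```

Note that dropna returns a new DataFrame, so the result has to be assigned back, as in the one-liner above.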

Adding a column in pandas df using a function

In general, you can use the apply function. If your function requires only one column, you can use:

    df['price'] = df['Symbol'].apply(getquotetoday)

as @EdChum suggested. If your function requires multiple columns, you can use something like:

    df['new_column_name'] = df.apply(lambda x: my_function(x['value_1'], x['value_2']), axis=1)
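A runnable sketch of both cases is shown below. The original getquotetoday is not shown in the excerpt, so lookup_price is a hypothetical stand-in backed by a fixed dictionary, and my_function is an arbitrary two-argument example.

```python
import pandas as pd


def lookup_price(symbol):
    # Hypothetical stand-in for getquotetoday: a fixed lookup table
    prices = {"AAA": 1.5, "BBB": 2.5}
    return prices.get(symbol)


def my_function(value_1, value_2):
    # Arbitrary example of a function that needs two columns
    return value_1 * value_2


# Invented sample data
df = pd.DataFrame({
    "Symbol": ["AAA", "BBB"],
    "value_1": [2, 3],
    "value_2": [10, 20],
})

# One input column: apply the function to each element of the Series
df["price"] = df["Symbol"].apply(lookup_price)

# Several input columns: apply row-wise with axis=1
df["new_column_name"] = df.apply(
    lambda row: my_function(row["value_1"], row["value_2"]), axis=1
)

print(df)
```

Row-wise apply calls the function once per row in Python, so when the operation can be written directly on columns (here, df['value_1'] * df['value_2']), the vectorized form is typically faster.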