pandas extract year from datetime: df['year'] = df['date'].year is not working

If you’re running a recent-ish version of pandas, you can use the datetime accessor dt to access the datetime components:

    In [6]: df['date'] = pd.to_datetime(df['date'])
            df['year'], df['month'] = df['date'].dt.year, df['date'].dt.month
            df
    Out[6]:
            date  Count  year  month
    0 2010-06-30    525  2010      6
    1 2010-07-30    136  2010      7
    2 2010-08-31    125  2010      8
    3 2010-09-30     84

… Read more

How to stream DataFrame using FastAPI without saving the data to csv file?

Approach 1 (recommended)

As mentioned in this answer, as well as here and here, when the entire data (a DataFrame in your case) is already loaded into memory, there is no need to use StreamingResponse. StreamingResponse makes sense when you want to transfer real-time data and when you don’t know the size of your output … Read more
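As a rough illustration of the point above, the sketch below serializes a DataFrame to CSV text entirely in memory, without writing a file. The helper name dataframe_to_csv_payload, the sample columns, and the filename export.csv are hypothetical and not from the original answer. The FastAPI step is only described in a comment, assuming the standard Response class with its content, media_type, and headers parameters.

```python
import io

import pandas as pd


def dataframe_to_csv_payload(df, filename="export.csv"):
    """Return (csv_text, headers) for an in-memory DataFrame.

    Hypothetical helper for illustration; nothing is written to disk.
    """
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)  # serialize into the in-memory buffer
    headers = {"Content-Disposition": f'attachment; filename="{filename}"'}
    return buffer.getvalue(), headers


# Made-up sample data, only to show the shape of the output.
df = pd.DataFrame({"id": [1, 2], "label": ["x", "y"]})
body, headers = dataframe_to_csv_payload(df)

# Inside a FastAPI route (assumption, not executed here), the payload could be
# returned in a single non-streaming response:
#     return Response(content=body, media_type="text/csv", headers=headers)
print(body)
```

Because the full payload already exists as one string, its size is known up front, which is the situation the excerpt describes as not needing StreamingResponse.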

DataFrame object has no attribute append

As of pandas 2.0, append (previously deprecated) was removed. You need to use concat instead (for most applications):

    df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

As noted by @cottontail, it’s also possible to use loc, although this only works if the new index is not already present in the DataFrame (typically, this will be the case if … Read more
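A minimal sketch of both options mentioned above, using made-up sample data. The column names and the contents of new_row are assumptions for illustration. The loc variant relies on len(df) not already being an index label, which holds here because the frame has a default 0..n-1 index.

```python
import pandas as pd

# Hypothetical sample frame and row; not from the original question.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
new_row = {"name": "c", "value": 3}

# Option 1: wrap the row in a one-row DataFrame and concatenate.
appended = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# Option 2: assign to a new index label with loc (modifies the frame in place).
with_loc = df.copy()
with_loc.loc[len(with_loc)] = [new_row["name"], new_row["value"]]

print(appended)
print(with_loc)
```

concat returns a new DataFrame and leaves df untouched, whereas loc enlarges the frame it is called on, so the copy keeps the original intact for comparison.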

Color a scatter plot by Column Values

Imports and Data

    import numpy
    import pandas
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set(style="ticks")
    numpy.random.seed(0)

    N = 37
    _genders = ['Female', 'Male', 'Non-binary', 'No Response']
    df = pandas.DataFrame({
        'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
        'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
        'Gender': numpy.random.choice(_genders, size=N)
    })

Update August 2021

With seaborn 0.11.0, it’s recommended to use new figure … Read more
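The excerpt is cut off before it shows the plotting call, so the following is not the answer's seaborn approach. It is a plain-matplotlib sketch of one common way to color points by a column: group the frame by that column and draw one scatter per group. The data mirrors the sample frame above, but is generated with numpy's default_rng, so the values differ. The Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N = 37
genders = ["Female", "Male", "Non-binary", "No Response"]
df = pd.DataFrame({
    "Height (cm)": rng.uniform(130, 200, size=N),
    "Weight (kg)": rng.uniform(30, 100, size=N),
    "Gender": rng.choice(genders, size=N),
})

fig, ax = plt.subplots()
# One scatter call per category; matplotlib cycles to a new color each time.
for gender, group in df.groupby("Gender"):
    ax.scatter(group["Height (cm)"], group["Weight (kg)"], label=gender)

ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.legend(title="Gender")
```

Each category becomes its own artist, which is why the legend gets one entry per value in the Gender column.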

How to calculate mean values grouped on another column

You could group by StationID and then take mean() of BiasTemp. To get a DataFrame as output, pass as_index=False:

    In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
    Out[4]:
      StationID  BiasTemp
    0        BB       5.0
    1     KEOPS       2.5
    2    SS0279      15.0

Without as_index=False, it returns a Series instead:

    In [5]: df.groupby('StationID')['BiasTemp'].mean()
    Out[5]:
    StationID
    BB         5.0
    KEOPS      2.5
    SS0279    15.0
    Name: BiasTemp, dtype: float64

Read … Read more

plot different color for different categorical levels

Imports and Sample DataFrame

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns  # for sample data
    from matplotlib.lines import Line2D  # for legend handle

    # DataFrame used for all options
    df = sns.load_dataset('diamonds')

       carat    cut color clarity  depth  table  price  x  y  z
    0   0.23  Ideal     E     SI2   61.5   55.0    326

… Read more