pandas – Make Me Engineer

pandas fillna not working

June 26, 2023 by Tarik

fillna returns a new object by default, so you need inplace=True:

    df[1].fillna(0, inplace=True)
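As an alternative to in-place filling, the result can be assigned back to the column. The following is a minimal sketch, assuming a small made-up frame whose column label 1 mirrors the df[1] in the excerpt; newer pandas versions may not propagate an in-place change made on a column selection, so assigning back is the more conservative form.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; the integer column label 1 mirrors df[1] above.
df = pd.DataFrame({0: [1.0, 2.0, 3.0], 1: [np.nan, 5.0, np.nan]})

# fillna returns a new Series; assigning it back updates the frame
# without relying on in-place mutation of a column selection.
df[1] = df[1].fillna(0)

print(df)
```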

converting list of header and row lists into pandas DataFrame

June 24, 2023 by Tarik

Call the pd.DataFrame constructor directly:

    df = pd.DataFrame(table, columns=headers)
    df

       Heading1  Heading2
    0         1         2
    1         3         4
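For a self-contained version of the constructor call, here is a minimal sketch. The contents of headers and table are assumptions chosen to match the output shown in the excerpt.

```python
import pandas as pd

# Assumed inputs: one list of column names and one list of row lists.
headers = ["Heading1", "Heading2"]
table = [[1, 2], [3, 4]]

# Each inner list becomes one row; columns= supplies the header names.
df = pd.DataFrame(table, columns=headers)

print(df)
```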

How to create a scatter plot by category [duplicate]

June 20, 2023 by Tarik

You can use scatter for this, but that requires having numerical values for your key1, and you won’t have a legend, as you noticed. It’s better to just use plot for discrete categories like this. For example:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    np.random.seed(1974)

    # Generate Data
    num = … Read more
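The idea of using plot once per category can be sketched as follows. This is not the original answer's code, which is truncated above: the column names x and y, the category values, and the sample size are all made up for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1974)

# Hypothetical data: two numeric columns and one categorical column, key1.
num = 20
df = pd.DataFrame({
    "x": np.random.rand(num),
    "y": np.random.rand(num),
    "key1": np.random.choice(["A", "B", "C"], num),
})

fig, ax = plt.subplots()
# One plot() call per category: markers only, no connecting line.
# Each call gets its own color from the color cycle and its own legend entry.
for name, group in df.groupby("key1"):
    ax.plot(group["x"], group["y"], marker="o", linestyle="", label=name)
ax.legend()
```

Because each category is drawn by a separate call, the legend falls out of the label argument and no numeric encoding of key1 is needed.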

Bar-Plot with two bars and two y-axis

June 20, 2023 by Tarik

With pandas 0.14.0 or later, the code below will work. To create the two axes, I manually created two matplotlib axes objects (ax and ax2), which serve the two bar plots. When plotting a DataFrame, you can choose the axes object using ax=…. Also in order to prevent the two … Read more
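A minimal sketch of the two-axes approach is shown below. The column names amount and price, the data, and the colors are hypothetical, and the use of the position argument to place the bars side by side is an assumption about one way to keep them from overlapping, not necessarily what the truncated answer does.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: two columns on very different scales.
df = pd.DataFrame(
    {"amount": [10, 20, 30], "price": [1000, 1500, 1200]},
    index=["a", "b", "c"],
)

fig, ax = plt.subplots()
ax2 = ax.twinx()  # second y-axis that shares the same x-axis

width = 0.4
# Assumption: different position values shift each series to one side of the tick.
df["amount"].plot(kind="bar", color="red", ax=ax, width=width, position=1)
df["price"].plot(kind="bar", color="blue", ax=ax2, width=width, position=0)

ax.set_ylabel("amount")
ax2.set_ylabel("price")
```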

pandas extract year from datetime: df['year'] = df['date'].year is not working

June 19, 2023 by Tarik

If you’re running a recent-ish version of pandas, you can use the datetime accessor dt to access the datetime components:

    In [6]: df['date'] = pd.to_datetime(df['date'])
            df['year'], df['month'] = df['date'].dt.year, df['date'].dt.month
            df
    Out[6]:
            date  Count  year  month
    0 2010-06-30    525  2010      6
    1 2010-07-30    136  2010      7
    2 2010-08-31    125  2010      8
    3 2010-09-30     84 … Read more
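The same steps as a self-contained sketch, using only the four rows visible in the excerpt. The dates are assumed to start out as strings, which is why the to_datetime conversion comes first.

```python
import pandas as pd

# Rows taken from the output shown above; dates assumed to start as strings.
df = pd.DataFrame({
    "date": ["2010-06-30", "2010-07-30", "2010-08-31", "2010-09-30"],
    "Count": [525, 136, 125, 84],
})

# The .dt accessor is only available on datetime-like columns, so convert first.
df["date"] = pd.to_datetime(df["date"])
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

print(df)
```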

How to stream DataFrame using FastAPI without saving the data to csv file?

June 18, 2023 by Tarik

Approach 1 (recommended)

As mentioned in this answer, as well as here and here, when the entire data (a DataFrame in your case) is already loaded into memory, there is no need to use StreamingResponse. StreamingResponse makes sense when you want to transfer real-time data and when you don’t know the size of your output … Read more
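Since the DataFrame is already in memory, the CSV body can be built as a string without touching the disk. The sketch below covers only that step with a made-up frame; how the string is returned from a FastAPI route is indicated in a comment as an assumed usage and is not executed here.

```python
import io
import pandas as pd

# Hypothetical in-memory data.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)

# Assumed usage inside a FastAPI route (not run here):
#   return Response(content=csv_text, media_type="text/csv")

# Reading the text back confirms that no file on disk is involved.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
```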

DataFrame object has no attribute append

June 18, 2023 by Tarik

As of pandas 2.0, append (previously deprecated) was removed. You need to use concat instead (for most applications):

    df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

As noted by @cottontail, it’s also possible to use loc, although this only works if the new index is not already present in the DataFrame (typically, this will be the case if … Read more
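Both options can be sketched with a small made-up frame. The column names and the contents of new_row are hypothetical; the loc line assumes a default integer index, so that len(df) is a label not yet present.

```python
import pandas as pd

# Hypothetical starting frame and new row.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
new_row = {"name": "c", "value": 3}

# concat: wrap the dict in a one-row DataFrame, then renumber the index.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# loc: adds a row only because the label len(df) is not already in the index;
# an existing label would be overwritten instead.
df.loc[len(df)] = ["d", 4]

print(df)
```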

Color a scatter plot by Column Values

June 17, 2023 by Tarik

Imports and Data

    import numpy
    import pandas
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set(style="ticks")
    numpy.random.seed(0)

    N = 37
    _genders = ['Female', 'Male', 'Non-binary', 'No Response']
    df = pandas.DataFrame({
        'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
        'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
        'Gender': numpy.random.choice(_genders, size=N)
    })

Update August 2021

With seaborn 0.11.0, it’s recommended to use new figure … Read more

How to calculate mean values grouped on another column

June 17, 2023 by Tarik

You could groupby on StationID and then take mean() on BiasTemp. To output a DataFrame, use as_index=False:

    In [4]: df.groupby('StationID', as_index=False)['BiasTemp'].mean()
    Out[4]:
      StationID  BiasTemp
    0        BB       5.0
    1     KEOPS       2.5
    2    SS0279      15.0

Without as_index=False, it returns a Series instead:

    In [5]: df.groupby('StationID')['BiasTemp'].mean()
    Out[5]:
    StationID
    BB         5.0
    KEOPS      2.5
    SS0279    15.0
    Name: BiasTemp, dtype: float64

Read … Read more
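The two return types can be compared in a short sketch. The input rows are hypothetical, chosen only so that the group means match the values shown in the excerpt.

```python
import pandas as pd

# Hypothetical rows whose group means match the output shown above.
df = pd.DataFrame({
    "StationID": ["BB", "KEOPS", "KEOPS", "SS0279", "SS0279"],
    "BiasTemp": [5.0, 0.0, 5.0, 10.0, 20.0],
})

# as_index=False keeps StationID as a regular column, giving a DataFrame.
as_frame = df.groupby("StationID", as_index=False)["BiasTemp"].mean()

# Without it, StationID becomes the index and the result is a Series.
as_series = df.groupby("StationID")["BiasTemp"].mean()

print(as_frame)
print(as_series)
```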

plot different color for different categorical levels

June 17, 2023 by Tarik

Imports and Sample DataFrame

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns  # for sample data
    from matplotlib.lines import Line2D  # for legend handle

    # DataFrame used for all options
    df = sns.load_dataset('diamonds')

       carat    cut color clarity  depth  table  price  x  y  z
    0   0.23  Ideal     E     SI2   61.5   55.0    326 … Read more