Emulating deprecated seaborn distplots

Since I spent some time on this, I thought I would share it so that others can easily adapt the approach:

from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np

x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]
df = pd.DataFrame({"X": x_list, "Y": range(len(x_list))})

# Two stacked axes with a shared x-axis for a direct comparison
f, (ax_dist, ax_hist) = plt.subplots(2, sharex=True)

# Old API: deprecated, emits a warning in recent seaborn versions
sns.distplot(df["X"], ax=ax_dist)
ax_dist.set_title("old distplot")

# New API: number of bins from the Freedman-Diaconis rule, capped at 50
_, FD_bins = np.histogram(x_list, bins="fd")
bin_nr = min(len(FD_bins) - 1, 50)
sns.histplot(
    data=df, x="X", ax=ax_hist, bins=bin_nr, stat="density",
    alpha=0.4, kde=True, kde_kws={"cut": 3},
)
ax_hist.set_title("new histplot")

plt.show()


Sample output: [image not preserved: two stacked panels sharing the x-axis, titled "old distplot" and "new histplot"]

The main changes are

  • bins=bin_nr – determine the number of histogram bins with the
    Freedman-Diaconis estimator and cap it at 50
  • stat="density" – show density instead of count in the histogram
  • alpha=0.4 – for the same transparency
  • kde=True – add a kernel density plot
  • kde_kws={"cut": 3} – extend the kernel density plot beyond the histogram limits
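If you need this in more than one place, the settings listed above can be bundled into a small helper. This is only a sketch: the function name distplot_like_kwargs and its parameters are my own invention, not part of seaborn, and the defaults simply repeat the values used in this answer.

```python
import numpy as np


def distplot_like_kwargs(values, max_bins=50, cut=3, alpha=0.4):
    """Hypothetical helper: histplot keyword arguments that approximate distplot."""
    # Bin edges from NumPy's Freedman-Diaconis estimator
    edges = np.histogram_bin_edges(values, bins="fd")
    return {
        "bins": min(len(edges) - 1, max_bins),  # cap the number of bins
        "stat": "density",                      # density instead of count
        "alpha": alpha,                         # bar transparency
        "kde": True,                            # overlay a kernel density estimate
        "kde_kws": {"cut": cut},                # extend the KDE past the data range
    }


kwargs = distplot_like_kwargs([1, 2, 3, 4, 6, 7, 9, 9, 9, 10])
# Intended use (requires seaborn):
#   sns.histplot(data=df, x="X", ax=ax_hist, **kwargs)
```

Returning a plain dict keeps the helper independent of seaborn, so individual entries can still be overridden before the call.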

As for the bin estimation with bins="fd": I am not sure that this is actually the method distplot uses. Comments and corrections are more than welcome.
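To see what bins="fd" amounts to, here is a minimal sketch that computes the textbook Freedman-Diaconis bin count by hand and compares it with NumPy. The helper name fd_bin_count is hypothetical, and the fallback for a zero interquartile range is my own assumption; I am not claiming it matches what distplot did internally.

```python
import numpy as np


def fd_bin_count(values, max_bins=50):
    """Hypothetical helper: Freedman-Diaconis bin count, capped at max_bins."""
    a = np.asarray(values, dtype=float)
    q75, q25 = np.percentile(a, [75, 25])
    # Bin width h = 2 * IQR / n^(1/3)
    h = 2 * (q75 - q25) / len(a) ** (1 / 3)
    if h == 0:
        # Assumed fallback for a zero IQR: square-root rule
        return min(max(int(np.sqrt(a.size)), 1), max_bins)
    return min(int(np.ceil((a.max() - a.min()) / h)), max_bins)


x_list = [1, 2, 3, 4, 6, 7, 9, 9, 9, 10]
n_manual = fd_bin_count(x_list)
n_numpy = len(np.histogram_bin_edges(x_list, bins="fd")) - 1
print(n_manual, n_numpy)
```

For the sample data both approaches give the same number of bins, so the manual version is mainly useful if you want to change the fallback or the cap.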

I removed **{"linewidth": 0} because, as @mwaskom pointed out in a comment, distplot draws an edge line around the histogram bars, and matplotlib can set its color to the default facecolor. How you handle the bar edges is therefore up to your style preferences.
