Pandas DataFrame Density Plot.



Table Of Contents:

  1. Syntax of the ‘plot.density()’ method in pandas.
  2. Examples of the ‘plot.density()’ method.

(1) Syntax:

DataFrame.plot.density(bw_method=None, ind=None, **kwargs)

Description:

  • Generates a kernel density estimate (KDE) plot using Gaussian kernels.
  • In statistics, kernel density estimation is a non-parametric way to estimate the probability density function (PDF) of a random variable.
  • The bandwidth is determined automatically unless bw_method is given.
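
To make the description above concrete, here is a minimal NumPy sketch of a one-dimensional Gaussian KDE. It is an illustration only, not the pandas or SciPy implementation. The helper name gaussian_kde_1d is hypothetical, and the default bandwidth assumes the 1-D form of Scott's rule (n**(-1/5) times the sample standard deviation).

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate on `grid` (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Assumption: Scott's rule for 1-D data, h = n**(-1/5) * sample std.
        bandwidth = n ** (-1 / 5) * samples.std(ddof=1)
    # One standard-normal kernel per sample, centred on that sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale by the bandwidth so the result is a density.
    return kernels.sum(axis=1) / (n * bandwidth)


samples = [1, 2, 2.5, 3, 3.5, 4, 5]
grid = np.linspace(-5, 11, 2001)
density = gaussian_kde_1d(samples, grid)

# A density should integrate to roughly 1 over a wide enough grid.
area = np.sum(density) * (grid[1] - grid[0])
print(f"area under the estimated PDF: {area:.4f}")
```

Each sample contributes one small Gaussian bump, and the estimate is their average. The bandwidth controls how wide each bump is.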

Parameters:

  • bw_method : str, scalar or callable, optional – The method used to calculate the estimator bandwidth. This can be ‘scott’, ‘silverman’, a scalar constant or a callable. If None (default), ‘scott’ is used. See scipy.stats.gaussian_kde for more information.
  • ind : NumPy array or int, optional – Evaluation points for the estimated PDF. If None (default), 1000 equally spaced points are used. If ind is a NumPy array, the KDE is evaluated at the points passed. If ind is an integer, that number of equally spaced points is used.
  • **kwargs – Additional keyword arguments are documented in DataFrame.plot().
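
The four kinds of bw_method value can be compared directly with scipy.stats.gaussian_kde, which the parameter description refers to. The sketch below prints the bandwidth factor that SciPy derives for each option. It assumes SciPy is installed, and the half_scott callable is a hypothetical rule made up for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

data = np.array([1, 2, 2.5, 3, 3.5, 4, 5])


def half_scott(kde):
    # Hypothetical custom rule: half of Scott's factor.
    # A callable receives the gaussian_kde instance and returns a scalar.
    return 0.5 * kde.scotts_factor()


factors = {
    "scott": gaussian_kde(data, bw_method="scott").factor,
    "silverman": gaussian_kde(data, bw_method="silverman").factor,
    "scalar": gaussian_kde(data, bw_method=0.3).factor,
    "callable": gaussian_kde(data, bw_method=half_scott).factor,
}

for name, factor in factors.items():
    print(f"{name:>9}: {factor:.4f}")
```

In SciPy the factor is multiplied by the sample standard deviation to obtain the kernel width, so a scalar bw_method is a relative setting rather than a width in data units.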

Returns:

  • matplotlib.axes.Axes or numpy.ndarray of them
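
As a rough illustration of the two return types, the sketch below plots a small DataFrame once on a single Axes and once with subplots=True, a DataFrame.plot() keyword that draws each column on its own Axes. It assumes matplotlib and SciPy are installed, and selects the non-interactive "Agg" backend so it can run without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for script use

import numpy as np
import pandas as pd
from matplotlib.axes import Axes

df = pd.DataFrame({
    "x": [1, 2, 2.5, 3, 3.5, 4, 5],
    "y": [4, 4, 4.5, 5, 5.5, 6, 6],
})

# All columns share one Axes: a single Axes object is returned.
single_ax = df.plot.density()

# One Axes per column: a NumPy array of Axes is returned.
axes = df.plot.density(subplots=True)

print(type(single_ax).__name__)
print(type(axes).__name__, axes.shape)
```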

(2) Examples Of plot.density() Method:

# Given a Series of points randomly sampled from an unknown distribution, estimate its PDF using KDE with automatic bandwidth determination and plot the result, evaluated at 1000 equally spaced points (the default). Note that plot.kde() and plot.density() are the same method under two names:

import pandas as pd

s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])
ax = s.plot.kde()

# A scalar bandwidth can be specified. Using a small bandwidth value can lead to over-fitting, while using a large bandwidth value may result in under-fitting:

ax = s.plot.kde(bw_method=0.3)

ax = s.plot.kde(bw_method=3)
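
The effect of the two bandwidths can also be checked numerically by reading the plotted curves back from the Axes. The sketch below draws both on separate Axes and compares their peak heights. It assumes matplotlib and SciPy are installed and uses the "Agg" backend so no window is needed; the names ax_narrow and ax_wide are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for script use

import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])

# Separate Axes so the two curves do not land on the same plot.
fig, (ax_narrow, ax_wide) = plt.subplots(1, 2, figsize=(8, 3))
s.plot.kde(bw_method=0.3, ax=ax_narrow, title="bw_method=0.3")
s.plot.kde(bw_method=3, ax=ax_wide, title="bw_method=3")

# Read the plotted density values back from each line.
peak_narrow = ax_narrow.lines[0].get_ydata().max()
peak_wide = ax_wide.lines[0].get_ydata().max()

print(f"peak density, bw_method=0.3: {peak_narrow:.3f}")
print(f"peak density, bw_method=3:   {peak_wide:.3f}")
```

The narrow bandwidth produces a taller, bumpier curve that follows individual samples, while the wide bandwidth flattens the estimate into one low, broad hump.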

# Finally, the ind parameter determines the evaluation points for the plot of the estimated PDF:

ax = s.plot.kde(ind=[1, 2, 3, 4, 5])
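
To see what ind changes, the x-values of the plotted line can be inspected for the three cases: the default, an integer, and explicit points. This is a small sketch that assumes matplotlib and SciPy are installed and uses the "Agg" backend; the Axes names are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for script use

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])

fig, (ax_default, ax_int, ax_points) = plt.subplots(1, 3, figsize=(12, 3))
s.plot.kde(ax=ax_default)                               # ind=None
s.plot.kde(ind=50, ax=ax_int)                           # 50 equally spaced points
s.plot.kde(ind=np.array([1, 2, 3, 4, 5]), ax=ax_points)  # explicit points

# The x-data of each line are the evaluation points that were used.
x_default = ax_default.lines[0].get_xdata()
x_int = ax_int.lines[0].get_xdata()
x_points = ax_points.lines[0].get_xdata()

print(len(x_default), len(x_int), list(x_points))
```

With only five evaluation points the curve is drawn as straight segments between them, so a coarse ind gives a visibly angular plot.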

# For DataFrame, it works in the same way:

df = pd.DataFrame({
    'x': [1, 2, 2.5, 3, 3.5, 4, 5],
    'y': [4, 4, 4.5, 5, 5.5, 6, 6],
})
ax = df.plot.kde()
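
For a DataFrame, each numeric column gets its own density curve on the shared Axes. The sketch below confirms this by listing the lines that were drawn. It assumes matplotlib and SciPy are installed and uses the "Agg" backend so it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for script use

import pandas as pd

df = pd.DataFrame({
    "x": [1, 2, 2.5, 3, 3.5, 4, 5],
    "y": [4, 4, 4.5, 5, 5.5, 6, 6],
})

ax = df.plot.kde()

# One line per column, labelled with the column name.
labels = [line.get_label() for line in ax.lines]
print(labels)

# Saving is optional; the file name here is only an example.
ax.figure.savefig("density.png")
```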

# As with a Series, a scalar bandwidth can be specified. A small value can lead to over-fitting, while a large value may result in under-fitting:

ax = df.plot.kde(bw_method=0.3)
ax = df.plot.kde(bw_method=3)

# Finally, the ind parameter determines the evaluation points for the plot of the estimated PDF:

ax = df.plot.kde(ind=[1, 2, 3, 4, 5, 6])
