How To Describe A DataFrame?

Table Of Contents:

  1. Syntax Of The ‘describe()’ Method In Pandas.
  2. Examples Of The ‘describe()’ Method.

(1) Syntax:

DataFrame.describe(percentiles=None, include=None, exclude=None, datetime_is_numeric=False)

Description:

  • Generate descriptive statistics.
  • Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
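As a minimal sketch of the "excluding NaN values" point, the snippet below describes a small Series with one missing entry. The data and the variable names (`scores`, `summary`) are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value among four entries.
scores = pd.Series([10.0, 20.0, np.nan, 30.0], name="score")

summary = scores.describe()

# count covers only the three non-missing values,
# and mean is computed over those same three values.
print(summary["count"])  # 3.0
print(summary["mean"])   # 20.0
```

The missing entry is skipped rather than treated as zero, so the mean stays at 20.0.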

Parameters:

  • percentiles: list-like of numbers, optional
    • The percentiles to include in the output. All should fall between 0 and 1.
    • The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
  • include: ‘all’, list-like of dtypes or None (default), optional
    • A whitelist of data types to include in the result. Ignored for Series. Here are the options:

      • ‘all’ : All columns of the input will be included in the output.

      • A list-like of dtypes : Limits the results to the provided data types. To limit the result to numeric types, submit numpy.number. To limit it instead to object columns, submit the object data type (numpy.object_). Strings can also be used in the style of select_dtypes (e.g. df.describe(include=['O'])). To select pandas categorical columns, use 'category'.

      • None (default) : The result will include all numeric columns.

  • exclude: list-like of dtypes or None (default), optional
    • A blacklist of data types to omit from the result. Ignored for Series. Here are the options:

      • A list-like of dtypes : Excludes the provided data types from the result. To exclude numeric types, submit numpy.number. To exclude object columns, submit the object data type (numpy.object_). Strings can also be used in the style of select_dtypes (e.g. df.describe(exclude=['O'])). To exclude pandas categorical columns, use 'category'.

      • None (default) : The result will exclude nothing.

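  • datetime_is_numeric: bool, default False
    • Whether to treat datetime dtypes as numeric. This affects the statistics calculated for the column. For DataFrame input, this also controls whether datetime columns are included by default. This parameter exists only in pandas 1.1 through 1.5; it was removed in pandas 2.0, where datetime columns are always treated as numeric.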
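The percentiles, include and exclude parameters can be sketched as follows. This is an illustrative example, not the only way to call the method; the frame reuses the mixed-dtype data from the examples in this article, and the result names (`custom`, `numeric_only`, `categorical_only`, `non_numeric`) are assumptions made for the sketch.

```python
import numpy as np
import pandas as pd

# Mixed-dtype frame: one categorical, one numeric, one string column.
df = pd.DataFrame({
    "categorical": pd.Categorical(["d", "e", "f"]),
    "numeric": [1, 2, 3],
    "object": ["a", "b", "c"],
})

# percentiles: request the 10th and 90th instead of the defaults.
custom = df.describe(percentiles=[0.1, 0.9])

# include: keep only numeric columns, or only categorical columns.
numeric_only = df.describe(include=[np.number])
categorical_only = df.describe(include=["category"])

# exclude: drop numeric columns and summarize whatever is left.
non_numeric = df.describe(exclude=[np.number])

print(custom.index.tolist())           # row labels include '10%' and '90%'
print(list(numeric_only.columns))      # ['numeric']
print(list(categorical_only.columns))  # ['categorical']
print(list(non_numeric.columns))       # ['categorical', 'object']
```

Passing a dtype list to include or exclude follows the same selection rules as select_dtypes, so the same strings and NumPy types work in both places.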

Returns:

  • Series or DataFrame – Summary statistics of the Series or DataFrame provided.
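A short sketch of the two return types: describing a DataFrame gives a DataFrame, while describing a single column gives a Series. The names `frame`, `frame_summary` and `series_summary` are hypothetical; the values are the roll numbers and marks used in this article's first example.

```python
import pandas as pd

frame = pd.DataFrame({
    "Roll_No": [100, 101, 102, 103, 104],
    "Mark": [95, 88, 76, 73, 93],
})

# Describing the whole frame returns a DataFrame (one column per input column).
frame_summary = frame.describe()

# Describing a single column returns a Series indexed by statistic name.
series_summary = frame["Mark"].describe()

print(type(frame_summary).__name__)   # DataFrame
print(type(series_summary).__name__)  # Series
print(series_summary["mean"])         # 85.0
```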

(2) Examples Of describe() Method:

Example-1

import pandas as pd
student = {'Name':['Subrat','Abhispa','Arpita','Anuradha','Namita'],
          'Roll_No':[100,101,102,103,104],
          'Subject':['Math','English','Science','History','Commerce'],
          'Mark':[95,88,76,73,93],
          'Gender':['Male','Female','Female','Female','Female']}
student_object = pd.DataFrame(student)
student_object

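Output:

       Name  Roll_No   Subject  Mark  Gender
0    Subrat      100      Math    95    Male
1   Abhispa      101   English    88  Female
2    Arpita      102   Science    76  Female
3  Anuradha      103   History    73  Female
4    Namita      104  Commerce    93  Female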

# Describing Student DataFrame.

student_object.describe()

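Output:

          Roll_No       Mark
count    5.000000   5.000000
mean   102.000000  85.000000
std      1.581139   9.974969
min    100.000000  73.000000
25%    101.000000  76.000000
50%    102.000000  88.000000
75%    103.000000  93.000000
max    104.000000  95.000000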

Example-2

df = pd.DataFrame({'categorical': pd.Categorical(['d','e','f']),
                   'numeric': [1, 2, 3],
                   'object': ['a', 'b', 'c']
                  })
df

Output:

  categorical  numeric object
0           d        1      a
1           e        2      b
2           f        3      c

# Describing DataFrame.

df.describe()

Output:

       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0

# Include All Types Of Value

df.describe(include='all')

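Output: one column per input column. The count, unique, top and freq rows are filled for the categorical and object columns and are NaN for the numeric column; the mean, std, min, percentile and max rows are filled for the numeric column and are NaN elsewhere. Because every value occurs exactly once here, the value reported as top is an arbitrary pick among the ties.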
