How To Remove Missing Values In A DataFrame?


Table Of Contents:

  1. Syntax Of The ‘dropna()’ Method In Pandas.
  2. Examples Of The ‘dropna()’ Method.

(1) Syntax:

DataFrame.dropna(*, axis=0, how=_NoDefault.no_default, thresh=_NoDefault.no_default, subset=None, inplace=False)

Description:

  • Remove the rows or columns of a DataFrame that contain missing values (such as NaN, None, or NaT).
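To make "missing" concrete, here is a minimal sketch of which entries pandas flags as NA. The `samples` frame and its values are hypothetical and made up purely for illustration; the point is that an empty string is an ordinary value, so it is not dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: several kinds of "empty" entries in one column.
samples = pd.DataFrame({
    "value": [np.nan, None, pd.NaT, "", "text"],
})

# isna() flags the entries that dropna() treats as missing.
print(samples["value"].isna().tolist())

# The empty string survives because it is a regular, non-missing value.
cleaned = samples.dropna()
print(cleaned["value"].tolist())
```

If empty strings should also count as missing in your data, they would need to be converted to NA first; dropna() alone will not remove them.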

Parameters:

  • axis : {0 or ‘index’, 1 or ‘columns’}, default 0 –
    • Determine whether rows or columns that contain missing values are removed.
      • 0, or ‘index’ : Drop rows which contain missing values.

      • 1, or ‘columns’ : Drop columns which contain missing values.

  • how : {‘any’, ‘all’}, default ‘any’ –
    • Determine whether a row or column is removed when it has at least one NA value or only when all of its values are NA.
      • ‘any’ : If any NA values are present, drop that row or column.

      • ‘all’ : If all values are NA, drop that row or column.

  • thresh: int, optional – 

    • Minimum number of non-NA values a row or column must have to be kept. Cannot be combined with how.

  • subset: column label or sequence of labels, optional –

    • Labels along the other axis to consider. For example, if you are dropping rows, these would be the columns to check for missing values.

  • inplace: bool, default False – 

    • Whether to modify the DataFrame rather than creating a new one.
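The parameters above can be compared side by side on a small frame. This is a minimal sketch: the `scores` data and all variable names are hypothetical. It also rewrites the `thresh` rule as an explicit count of non-NA values, which may help when reasoning about which rows are kept.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is complete, row 1 has one value, row 2 is all NA.
scores = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
    "c": [3.0, np.nan, np.nan],
})

# how='any' keeps only rows that have no missing values at all.
any_result = scores.dropna(how="any")
# how='all' drops only rows in which every value is missing.
all_result = scores.dropna(how="all")
# thresh=2 keeps rows that have at least 2 non-NA values.
thresh_result = scores.dropna(thresh=2)

print(any_result.index.tolist())
print(all_result.index.tolist())
print(thresh_result.index.tolist())

# The thresh rule written as an explicit boolean mask.
non_na_per_row = scores.notna().sum(axis=1)
print(scores[non_na_per_row >= 2].equals(thresh_result))

# With axis='columns', subset takes row labels: only rows 0 and 1 are checked.
column_result = scores.dropna(axis="columns", subset=[0, 1])
print(column_result.columns.tolist())
```

Here only column "b" has values in both row 0 and row 1, so it is the only column kept by the last call.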

Returns:

  • DataFrame or None: the DataFrame with NA entries dropped, or None if inplace=True.
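The difference between the two return behaviours can be sketched as follows. The `pets` frame is hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame with one missing value.
pets = pd.DataFrame({"name": ["Rex", "Tom"], "age": [3.0, np.nan]})

# Default call: a new DataFrame comes back and `pets` is left untouched.
result = pets.dropna()
print(len(result), len(pets))

# inplace=True: `pets` itself is modified and the call returns None.
returned = pets.dropna(inplace=True)
print(returned is None, len(pets))
```

Because the in-place call returns None, writing `pets = pets.dropna(inplace=True)` would replace the frame with None; use either assignment or inplace=True, not both.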

(2) Examples Of The dropna() Method:

Example-1

import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"),
                            pd.NaT]})
df

Output:

       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Drop the rows where at least one element is missing.

df.dropna()

Output:

     name        toy       born
1  Batman  Batmobile 1940-04-25

# Drop the columns where at least one element is missing.

df.dropna(axis='columns')

Output:

       name
0    Alfred
1    Batman
2  Catwoman

# Drop the rows where all elements are missing.

df.dropna(how='all')

Output:

       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Keep only the rows with at least 2 non-NA values.

df.dropna(thresh=2)

Output:

       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Define in which columns to look for missing values.

df.dropna(subset=['name', 'toy'])

Output:

       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

# Keep the DataFrame with valid entries in the same variable.

df.dropna(inplace=True)
df

Output:

     name        toy       born
1  Batman  Batmobile 1940-04-25
