How To Fill Missing Values In A DataFrame?


Table Of Contents:

  1. Syntax Of The ‘fillna()’ Method In Pandas.
  2. Examples Of The ‘fillna()’ Method.

(1) Syntax:

DataFrame.fillna(value=None, *, method=None, axis=None, inplace=False, limit=None, downcast=None)

Description:

  • Fill NA/NaN values using the specified method.

Parameters:

  • value: scalar, dict, Series, or DataFrame –
    • Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to use for each index (for a Series) or column (for a DataFrame).
    • Values not in the dict/Series/DataFrame will not be filled. This value cannot be a list.
  • method: {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None –
    • Method to use for filling holes in a reindexed Series.
    • pad / ffill: propagate the last valid observation forward to the next valid one.
    • backfill / bfill: use the next valid observation to fill the gap.
  • axis: {0 or ‘index’, 1 or ‘columns’} –
    • Axis along which to fill missing values. For Series this parameter is unused and defaults to 0.
  • inplace: bool, default False –
    • If True, fill in-place. Note: this will modify any other views on this object (e.g., a no-copy slice for a column in a DataFrame).
  • limit: int, default None –
    • If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill.
    • In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled.
    • If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.
  • downcast: dict, default None –
    • A dict of item->dtype of what to downcast if possible, or the string ‘infer’ which will try to downcast to an appropriate equal type (e.g. float64 to int64 if possible).
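
As a minimal sketch of the value parameter taking a Series instead of a scalar, the snippet below fills each column with its own mean. The frame `scores` and the name `col_means` are made up for this illustration and are not part of the article's example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, used only for this illustration.
scores = pd.DataFrame({"A": [1.0, np.nan, 3.0],
                       "B": [np.nan, 4.0, 8.0]})

# DataFrame.mean() skips NaN by default and returns a Series indexed by column name.
col_means = scores.mean()

# Passing a Series as `value` fills each column with the entry under the same label.
filled = scores.fillna(value=col_means)
print(filled)
```

Because the Series is matched by column label, a column with no entry in `col_means` would simply be left unfilled.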

Returns:

  • DataFrame or None
    • Object with missing values filled or None if inplace=True.
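
The difference between the default behaviour and inplace=True can be sketched as follows. The frame `frame` is hypothetical. The sketch deliberately ignores the return value of the in-place call and checks only the state of the objects.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing cells.
frame = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, 2.0]})

# Default (inplace=False): a new object comes back and the original keeps its NaNs.
filled_copy = frame.fillna(0)
original_nans_after_copy = int(frame.isna().sum().sum())
print(original_nans_after_copy)               # NaN count still present in `frame`
print(int(filled_copy.isna().sum().sum()))    # NaN count in the returned copy

# inplace=True: `frame` itself is modified, so the result is not assigned to anything.
frame.fillna(0, inplace=True)
print(int(frame.isna().sum().sum()))          # NaN count in `frame` after the in-place fill
```

Assigning the result (`df = df.fillna(0)`) is usually the easier style to reason about, since it works the same way whether or not other objects share the data.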

(2) Examples Of fillna() Method:

Example-1

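import numpy as np
import pandas as pd

# Build a 4x4 DataFrame with missing values scattered across columns A-D.
df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))
df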

Output:

     A    B   C    D
0  NaN  2.0 NaN  0.0
1  3.0  4.0 NaN  1.0
2  NaN  NaN NaN  NaN
3  NaN  3.0 NaN  4.0

# Replace all NaN elements with 0s.

df.fillna(0)

Output:

     A    B    C    D
0  0.0  2.0  0.0  0.0
1  3.0  4.0  0.0  1.0
2  0.0  0.0  0.0  0.0
3  0.0  3.0  0.0  4.0

# We can also propagate non-null values forward or backward.

df.fillna(method="ffill")

Output:

     A    B   C    D
0  NaN  2.0 NaN  0.0
1  3.0  4.0 NaN  1.0
2  3.0  4.0 NaN  1.0
3  3.0  3.0 NaN  4.0

df.fillna(method="bfill")

Output:

     A    B   C    D
0  3.0  2.0 NaN  0.0
1  3.0  4.0 NaN  1.0
2  NaN  3.0 NaN  4.0
3  NaN  3.0 NaN  4.0
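
Newer pandas releases deprecate the method argument of fillna() (since version 2.1, according to the release notes). A minimal sketch of the same forward and backward fills using the dedicated DataFrame.ffill() and DataFrame.bfill() methods is given below. It rebuilds the frame from Example-1 so that it runs on its own. The names `forward`, `backward` and `forward_once` are illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in Example-1.
df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))

forward = df.ffill()               # propagate the last valid value downwards
backward = df.bfill()              # pull the next valid value upwards
forward_once = df.ffill(limit=1)   # fill at most one consecutive NaN per gap

print(forward, backward, forward_once, sep="\n\n")
```

Leading NaNs have nothing before them to propagate, so a forward fill leaves them in place. The same holds for trailing NaNs under a backward fill.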

# Replace all NaN elements in column ‘A’, ‘B’, ‘C’, and ‘D’, with 0, 1, 2, and 3 respectively.

values = {"A": 0, "B": 1, "C": 2, "D": 3}
df.fillna(value=values)

Output:

     A    B    C    D
0  0.0  2.0  2.0  0.0
1  3.0  4.0  2.0  1.0
2  0.0  1.0  2.0  3.0
3  0.0  3.0  2.0  4.0

# Only replace the first NaN element in each column.

values = {"A": 0, "B": 1, "C": 2, "D": 3}
df.fillna(value=values, limit=1)

Output:

     A    B    C    D
0  0.0  2.0  2.0  0.0
1  3.0  4.0  NaN  1.0
2  NaN  1.0  NaN  3.0
3  NaN  3.0  NaN  4.0
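
One way to see what limit=1 does with a dict of values is to count the missing cells per column before and after the fill. The sketch below rebuilds the Example-1 frame so that it runs on its own. The names `before` and `after` are illustrative.

```python
import numpy as np
import pandas as pd

# Same frame and fill values as in the example above.
df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))
values = {"A": 0, "B": 1, "C": 2, "D": 3}

# Count missing cells per column before and after the limited fill.
before = df.isna().sum()
after = df.fillna(value=values, limit=1).isna().sum()

print(pd.DataFrame({"before": before, "after": after}))
```

Each column loses exactly one NaN, which shows that the limit is applied per column rather than across the whole frame.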

# When filling using a DataFrame, replacement happens along the same column names and same indices. Column D is not affected because it is not present in df2.

df2 = pd.DataFrame(np.zeros((4, 4)), columns=list("ABCE"))
df.fillna(df2)

Output:

     A    B    C    D
0  0.0  2.0  0.0  0.0
1  3.0  4.0  0.0  1.0
2  0.0  0.0  0.0  NaN
3  0.0  3.0  0.0  4.0
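
The label alignment can be checked directly by counting what is still missing after the fill. This is a small self-contained sketch using the same two frames. The names `result` and `remaining` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, np.nan],
                   [np.nan, 3, np.nan, 4]],
                  columns=list("ABCD"))
# The fill frame has columns A, B, C and E, so it has no column D.
df2 = pd.DataFrame(np.zeros((4, 4)), columns=list("ABCE"))

result = df.fillna(df2)

# NaNs left per column: only D keeps its missing cell, since df2 has no column D.
remaining = result.isna().sum()
print(remaining)

# Column E from df2 is not added to the result; only df's own columns are kept.
print(list(result.columns))
```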
