How To Fill Missing Values In A DataFrame?
Table Of Contents:
- Syntax ‘fillna()’ Method In Pandas.
- Examples ‘fillna( )’ Method.
(1) Syntax:
DataFrame.fillna(value=None, *, method=None, axis=None, inplace=False, limit=None, downcast=None)
Description:
- Fill NA/NaN values using the specified method.
Parameters:
- value: scalar, dict, Series, or DataFrame –
- Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to use for each index (for a Series) or column (for a DataFrame).
- Values not in the dict/Series/DataFrame will not be filled. This value cannot be a list.
- method” {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None –
- Method to use for filling holes in reindexed Series pad / ffill: propagate last valid observation forward to next valid backfill / bfill: use next valid observation to fill gap.
- axis { 0 or ‘index’, 1 or ‘columns’} –
- Axis along which to fill missing values. For Series this parameter is unused and defaults to 0.
- inplace: bool, default False –
- If True, fill in-place. Note: this will modify any other views on this object (e.g., a no-copy slice for a column in a DataFrame).
- limit: int, default None –
- If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill.
- In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled.
- If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.
- downcast: dict, default is None –
- A dict of item->dtype of what to downcast if possible, or the string ‘infer’ which will try to downcast to an appropriate equal type (e.g. float64 to int64 if possible).
Returns:
- DataFrame or None
- Object with missing values filled or None if
inplace=True
.
- Object with missing values filled or None if
(2) Examples Of fillna() Method:
Example-1
df = pd.DataFrame([[np.nan, 2, np.nan, 0],
[3, 4, np.nan, 1],
[np.nan, np.nan, np.nan, np.nan],
[np.nan, 3, np.nan, 4]],
columns=list("ABCD"))
df
Output:
# Replace all NaN elements with 0s.
df.fillna(0)
Output:
# We can also propagate non-null values forward or backward.
df.fillna(method="ffill")
Output:
df.fillna(method="bfill")
Output:
# Replace all NaN elements in column ‘A’, ‘B’, ‘C’, and ‘D’, with 0, 1, 2, and 3 respectively.
values = {"A": 0, "B": 1, "C": 2, "D": 3}
df.fillna(value=values)
Output:
# Only replace the first NaN element.
values = {"A": 0, "B": 1, "C": 2, "D": 3}
df.fillna(value=values, limit=1)
Output:
# When filling using a DataFrame, replacement happens along the same column names and same indices
df2 = pd.DataFrame(np.zeros((4, 4)), columns=list("ABCE"))
df2
df2 = pd.DataFrame(np.zeros((4, 4)), columns=list("ABCE"))
df.fillna(df2)