How To Remove Missing Values In A DataFrame?
Table Of Contents:
- Syntax Of The ‘dropna()’ Method In Pandas.
- Examples Of The ‘dropna()’ Method.
(1) Syntax:
DataFrame.dropna(*, axis=0, how=_NoDefault.no_default, thresh=_NoDefault.no_default, subset=None, inplace=False)
Description:
- Remove missing values.
Parameters:
- axis : {0 or ‘index’, 1 or ‘columns’}, default 0 –
- Determine whether rows or columns that contain missing values are removed.
0, or ‘index’ : Drop rows which contain missing values.
1, or ‘columns’ : Drop columns which contain missing values.
- how : {‘any’, ‘all’}, default ‘any’ –
- Determine whether a row or column is removed when it has at least one NA value, or only when all of its values are NA.
‘any’ : If any NA values are present, drop that row or column.
‘all’ : If all values are NA, drop that row or column.
- thresh : int, optional –
- Minimum number of non-NA values a row or column must have to be kept. Cannot be combined with how.
- subset : column label or sequence of labels, optional –
- Labels along the other axis to consider, e.g. if you are dropping rows, these would be a list of columns to include.
- inplace : bool, default False –
- Whether to modify the DataFrame in place rather than creating a new one.
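The parameters above can be compared side by side on one small frame. This is a minimal sketch, not part of the original examples: the `scores` DataFrame and its values are made up for illustration, and the comments state which row labels or columns each call is expected to keep.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for this illustration only.
scores = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [1.0, 2.0, np.nan, np.nan],
    "c": [1.0, 2.0, np.nan, 4.0],
})

# how='any' (the default): drop a row if any value is NA -> keeps row 0.
any_rows = scores.dropna(how="any")

# how='all': drop a row only if every value is NA -> drops row 2.
all_rows = scores.dropna(how="all")

# thresh=2: keep rows with at least 2 non-NA values -> rows 0, 1, 3.
thresh_rows = scores.dropna(thresh=2)

# subset=['a']: only look at column "a" when deciding -> rows 0, 3.
subset_rows = scores.dropna(subset=["a"])

# axis='columns' with thresh=3: keep columns with at least 3 non-NA values.
wide_cols = scores.dropna(axis="columns", thresh=3)

print(list(any_rows.index), list(all_rows.index))
print(list(thresh_rows.index), list(subset_rows.index))
print(list(wide_cols.columns))
```

Note that `how` and `thresh` are alternative ways of stating the same rule, which is why only one of them is passed in each call.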
Returns:
- DataFrame or None : DataFrame with NA entries dropped from it, or None if inplace=True.
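The difference between the two return modes can be sketched as follows. The `frame` DataFrame is a hypothetical one-column example; the point is only that the default call leaves the original untouched, while `inplace=True` modifies it and returns `None`.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame with a single missing value.
frame = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# Default (inplace=False): a new DataFrame is returned.
cleaned = frame.dropna()
rows_before = len(frame)   # the original still has all 3 rows

# inplace=True: the original is modified and the call returns None.
result = frame.dropna(inplace=True)
rows_after = len(frame)    # the original now has 2 rows

print(len(cleaned), rows_before, result, rows_after)
```

A common slip is writing `frame = frame.dropna(inplace=True)`, which rebinds `frame` to `None`.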
(2) Examples Of dropna() Method:
Example-1
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"),
                            pd.NaT]})
df
Output:
# Drop the rows where at least one element is missing.
df.dropna()
Output:
# Drop the columns where at least one element is missing.
df.dropna(axis='columns')
Output:
# Drop the rows where all elements are missing.
df.dropna(how='all')
Output:
# Keep only the rows with at least 2 non-NA values.
df.dropna(thresh=2)
Output:
# Define in which columns to look for missing values.
df.dropna(subset=['name', 'toy'])
Output:
# Keep the DataFrame with valid entries in the same variable.
df.dropna(inplace=True)
df
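As a rough check of Example-1, the sketch below rebuilds the same DataFrame and records which row labels (or columns) each call keeps. The `kept_rows` dictionary and `kept_columns` list are helper names introduced here for illustration; they are not part of pandas.

```python
import numpy as np
import pandas as pd

# Same data as Example-1.
df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Row labels that survive each row-wise call (helper dict, illustration only).
kept_rows = {
    "dropna()": list(df.dropna().index),
    "dropna(how='all')": list(df.dropna(how="all").index),
    "dropna(thresh=2)": list(df.dropna(thresh=2).index),
    "dropna(subset=['name', 'toy'])": list(df.dropna(subset=["name", "toy"]).index),
}

# Columns that survive the column-wise call.
kept_columns = list(df.dropna(axis="columns").columns)

for call, labels in kept_rows.items():
    print(f"{call}: rows {labels}")
print(f"dropna(axis='columns'): columns {kept_columns}")
```

Only Batman's row (label 1) has no missing values, and only the "name" column is complete. The surviving rows keep their original index labels; they are not renumbered.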