How To Update A DataFrame?

Table Of Contents:

  1. Syntax Of The ‘update()’ Method In Pandas.
  2. Examples Of The ‘update()’ Method.

(1) Syntax:

DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore')

Description:

  • Modifies the DataFrame in place using non-NA values from another DataFrame.

  • Aligns on index and column labels. There is no return value.
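To make the alignment behaviour concrete, here is a minimal sketch. The `prices` and `changes` frames and their fruit labels are made up for illustration; they are not part of the pandas documentation.

```python
import pandas as pd

# Hypothetical data: rows are labeled with strings instead of 0, 1, 2.
prices = pd.DataFrame({"price": [10, 20, 30]},
                      index=["apple", "banana", "cherry"])

# The update lists rows in a different order and contains a label
# ("durian") that does not exist in `prices`.
changes = pd.DataFrame({"price": [35, 12, 99]},
                       index=["cherry", "apple", "durian"])

# Values are matched by label, not by position.
prices.update(changes)

# apple -> 12, banana unchanged (20), cherry -> 35; "durian" is not added.
updated_prices = prices["price"].tolist()
labels_after = prices.index.tolist()
```

Because matching is done by label, the order of rows in `changes` does not matter, and labels missing from the original are dropped rather than appended.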

Parameters:

  • other: DataFrame, or object coercible into a DataFrame – Should have at least one matching index/column label with the original DataFrame. If a Series is passed, its name attribute must be set, and that will be used as the column name to align with the original DataFrame.
  • join: {‘left’}, default ‘left’ – Only left join is implemented, keeping the index and columns of the original object.
  • overwrite: bool, default True – How to handle non-NA values for overlapping keys:
    • True: overwrite original DataFrame’s values with values from other.
    • False: only update values that are NA in the original DataFrame.

  • filter_func: callable(1d-array) -> bool 1d-array, optional – Lets you choose which values, other than NA, are replaced. It is called on the original DataFrame’s values and should return True for the values that should be updated.
  • errors: {‘raise’, ‘ignore’}, default ‘ignore’ – If ‘raise’, will raise a ValueError if the DataFrame and other both contain non-NA data in the same place.
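The overwrite and filter_func parameters are easier to follow with a small experiment. The sketch below uses made-up frames (`df_fill`, `df_filter`, `other`) and a threshold of 450 chosen only for illustration.

```python
import numpy as np
import pandas as pd

other = pd.DataFrame({"B": [4.0, 5.0, 6.0]})

# overwrite=False: only the NA slot in the original is filled;
# the existing 400.0 and 600.0 are left alone.
df_fill = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, np.nan, 600.0]})
df_fill.update(other, overwrite=False)
fill_values = df_fill["B"].tolist()

# filter_func: the callable receives the ORIGINAL column values and
# returns True where replacement is allowed (here, values above 450).
df_filter = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, 500.0, 600.0]})
df_filter.update(other, filter_func=lambda values: values > 450)
filter_values = df_filter["B"].tolist()
```

With these inputs, `fill_values` ends up as 400.0, 5.0, 600.0, and `filter_values` as 400.0, 5.0, 6.0: the first row of `df_filter` is not above the threshold, so it keeps its original value.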

Returns:

  • None: method directly changes calling object.
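Since the method returns None, a common slip is to assign its result back to a variable. A short sketch of the pitfall follows, together with one possible way to keep the original untouched. Copying first is a general pattern, not something update() does for you.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [400, 500, 600]})
new_df = pd.DataFrame({"B": [4, 5, 6]})

# Pitfall: the return value is None, not the updated DataFrame.
returned = df.update(new_df)

# `df` itself already holds the new values.
b_after = df["B"].tolist()

# To preserve the original, update a copy instead.
original = pd.DataFrame({"A": [1, 2, 3], "B": [400, 500, 600]})
patched = original.copy()
patched.update(new_df)
```

Writing `df = df.update(new_df)` would therefore replace `df` with None and lose the data.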

Raises:

  • ValueError
    • When errors=‘raise’ and there’s overlapping non-NA data.
    • When errors is neither ‘ignore’ nor ‘raise’.
  • NotImplementedError
    • If join != ‘left’.
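The three error cases can be triggered deliberately, as in the sketch below. The invalid values 'warn' and 'outer' are arbitrary choices for illustration, and the `caught` list is only a helper for recording what happened.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [400, 500, 600]})
new_df = pd.DataFrame({"B": [4, 5, 6]})

caught = []  # helper list recording (case, exception name)

# Case 1: both frames hold non-NA data in column 'B'.
try:
    df.update(new_df, errors="raise")
except ValueError as exc:
    caught.append(("overlap", type(exc).__name__))

# Case 2: an unsupported value for `errors`.
try:
    df.update(new_df, errors="warn")
except ValueError as exc:
    caught.append(("bad errors value", type(exc).__name__))

# Case 3: any join other than 'left'.
try:
    df.update(new_df, join="outer")
except NotImplementedError as exc:
    caught.append(("join", type(exc).__name__))

# None of the failed calls modified df.
b_unchanged = df["B"].tolist()
```

In each case the exception is raised before any value is written, so `df` still holds 400, 500, 600 afterwards.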

(2) Examples Of update() Method:

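Example-1:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [400, 500, 600]})

new_df = pd.DataFrame({'B': [4, 5, 6],
                       'C': [7, 8, 9]})

# Column 'B' is overwritten; column 'C' is ignored because df has no 'C'.
df.update(new_df)
df

Output:

   A  B
0  1  4
1  2  5
2  3  6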

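Example-2:

df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'B': ['x', 'y', 'z']})
new_df = pd.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})

# Rows 3 to 5 of new_df have no matching index in df, so they are ignored.
df.update(new_df)
df

Output:

   A  B
0  a  d
1  b  e
2  c  f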

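Example-3:

# For a Series, its name attribute must be set; here it targets column 'B'.
df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'B': ['x', 'y', 'z']})
new_column = pd.Series(['d', 'e'], name='B', index=[0, 2])

# Only rows 0 and 2 are updated; row 1 keeps 'y'.
df.update(new_column)
df

Output:

   A  B
0  a  d
1  b  y
2  c  e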

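Example-4:

import numpy as np

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [400, 500, 600]})
new_df = pd.DataFrame({'B': [4, np.nan, 6]})

# The NaN in new_df is skipped, so the original 500 is kept.
df.update(new_df)
df

Output (depending on the pandas version, column B may be shown as floats):

   A    B
0  1    4
1  2  500
2  3    6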
