Python Pandas For Your Grandpa | Section 2.4 | Series Overwriting Data

import numpy as np
import pandas as pd

Now that we know how to access data from a series using an index, overwriting data is pretty straightforward.

For example, if you have the series foo with values ['a','b','c','d','e'] and index labels [10,40,50,30,20], and you want to change the 2nd element to 'w', you can use iloc:

foo = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[10, 40, 50, 30, 20])
foo.iloc[1] = 'w'
print(foo)
## 10    a
## 40    w
## 50    c
## 30    d
## 20    e
## dtype: object

If you wanted to switch the 1st, 2nd and 3rd elements to 'x' you could do

foo.iloc[[0, 1, 2]] = 'x'

# or use slicing
foo.iloc[:3] = 'x'

print(foo)
## 10    x
## 40    x
## 50    x
## 30    d
## 20    e
## dtype: object
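
The examples so far assign a single scalar, which pandas repeats across every selected position. As a small sketch (the values 'x', 'y', 'z' are just illustrative), you can also hand iloc a list with one value per selected position:

```python
import pandas as pd

foo = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[10, 40, 50, 30, 20])

# One new value per selected position, assigned in order.
foo.iloc[[0, 1, 2]] = ['x', 'y', 'z']
print(foo.tolist())
# ['x', 'y', 'z', 'd', 'e']
```

The list on the right needs exactly as many values as positions selected on the left.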

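You can do the same operations by index label with foo.loc. Note that a label slice such as 10:50 includes both endpoints and follows the order of the index, so here it covers labels 10, 40 and 50.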

foo.loc[40] = 'w'
foo.loc[[10, 40, 50]] = 'x'
foo.loc[10:50] = 'x'
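
One detail worth seeing side by side: a label slice with loc includes its stop label, while a positional slice with iloc excludes its stop position. A minimal sketch, using a second series named bar (a name introduced here only for comparison):

```python
import pandas as pd

foo = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[10, 40, 50, 30, 20])
bar = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[10, 40, 50, 30, 20])

# Label slice: runs from label 10 through label 50 in index order,
# so it touches labels 10, 40 and 50.
foo.loc[10:50] = 'x'
print(foo.tolist())
# ['x', 'x', 'x', 'd', 'e']

# Positional slice: position 2 is the stop and is NOT included.
bar.iloc[0:2] = 'x'
print(bar.tolist())
# ['x', 'x', 'c', 'd', 'e']
```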

What if you wanted to overwrite the entire series with a new set of values like the ones in this array?

new_vals = np.array([5, 10, 15, 20, 25])

Your first instinct might be to overwrite the entire foo variable like this, but then you’d lose foo's index.

foo = pd.Series(new_vals)
print(foo)
## 0     5
## 1    10
## 2    15
## 3    20
## 4    25
## dtype: int64

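Instead, use slicing to select and overwrite foo's values without overwriting its index. (Depending on your pandas version, the resulting dtype may be reported as object rather than int64, because foo held strings before the assignment.)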

foo.iloc[:] = new_vals
print(foo)
## 10     5
## 40    10
## 50    15
## 30    20
## 20    25
## dtype: int64
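
If you'd rather build a fresh series than overwrite in place, another option is to pass the old index along explicitly. This is a sketch of that alternative rather than something the lesson relies on, and bar is just an illustrative name:

```python
import numpy as np
import pandas as pd

foo = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[10, 40, 50, 30, 20])
new_vals = np.array([5, 10, 15, 20, 25])

# Option 1: overwrite in place through a full slice; the index is untouched.
foo.iloc[:] = new_vals

# Option 2: build a new Series and reuse the existing index.
bar = pd.Series(new_vals, index=foo.index)

print(foo.index.tolist())
# [10, 40, 50, 30, 20]
print(bar.index.tolist())
# [10, 40, 50, 30, 20]
```

Option 2 always takes its dtype from new_vals, which is handy when the old series held a different kind of data.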

Now suppose you have two series of 4 values each, whose indices are different but share some labels (here 0, 2 and 3).

x = pd.Series([10, 20, 30, 40])
y = pd.Series([1, 11, 111, 1111], index=[7,3,2,0])

What do you think the result of doing something like this will be?

x.loc[[0, 1]] = y

This one takes some getting used to, but once you see the result, it's clear what's happening.

print(x)
## 0    1111.0
## 1       NaN
## 2      30.0
## 3      40.0
## dtype: float64

Pandas starts by creating a temporary subset of x with index labels 0 and 1. Then it looks for matching labels in y to use to overwrite x. Since x's label 1 doesn't match any label in y, pandas gives it the value NaN. And since NaN only exists as a floating point value, pandas has to cast the entire series from ints to floats.
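
One way to picture the label matching is with reindex, which looks up a list of labels in a series and returns NaN for any label it can't find. This is only a sketch of the idea, not a claim about how pandas implements the assignment internally. The names aligned and x_float are made up for this example, and casting x to float first is my own choice, so the assignment doesn't depend on an implicit int-to-float upcast (which newer pandas releases may warn about or reject).

```python
import pandas as pd

x = pd.Series([10, 20, 30, 40])
y = pd.Series([1, 11, 111, 1111], index=[7, 3, 2, 0])

# Look up labels 0 and 1 in y. Label 0 exists (value 1111);
# label 1 does not, so it comes back as NaN.
aligned = y.reindex([0, 1])
print(aligned.tolist())
# [1111.0, nan]

# Same assignment as in the lesson, on a float copy of x.
x_float = x.astype('float64')
x_float.loc[[0, 1]] = y
print(x_float.tolist())
# [1111.0, nan, 30.0, 40.0]
```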

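Also note that, in the pandas version these examples were written against, we could do the same thing using slicing. Whether iloc aligns a Series on the right-hand side may differ between pandas releases, so if you need label alignment, a label slice such as x.loc[0:1] = y is the safer choice.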

x.iloc[:2] = y

If we try to do this using a NumPy array on the right-hand side, we'll get an error. A NumPy array has no index labels, so pandas assigns its elements to the Series subset position by position, and here the array has four elements while the subset has only two.

x.loc[[0, 1]] = y.to_numpy()  # error

If the NumPy array on the right-hand side is the same length as the Series subset on the left-hand side, this works.

x.loc[[0, 1]] = y.to_numpy()[:2]
print(x)
## 0     1.0
## 1    11.0
## 2    30.0
## 3    40.0
## dtype: float64
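
To see both NumPy cases in one runnable piece, here is a small sketch. It starts x as floats to match the state x is in at this point of the lesson, and it assumes the length mismatch surfaces as a ValueError, which is what I'd expect from pandas but is worth confirming on your version. The raised flag is just a helper for this example.

```python
import pandas as pd

x = pd.Series([10.0, 20.0, 30.0, 40.0])
y = pd.Series([1, 11, 111, 1111], index=[7, 3, 2, 0])

raised = False
try:
    # Four values for two slots: no labels to align on, so this fails.
    x.loc[[0, 1]] = y.to_numpy()
except ValueError as err:
    raised = True
    print("assignment failed:", err)

# Two values for two slots: assigned by position, labels in y are ignored.
x.loc[[0, 1]] = y.to_numpy()[:2]
print(x.tolist())
# [1.0, 11.0, 30.0, 40.0]
```

Notice that the values 1 and 11 come from y's first two positions (labels 7 and 3), not from labels 0 and 1, because converting to a NumPy array throws the index away.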