
``````
import numpy as np
import pandas as pd
``````

Let’s suppose you have this series, `x`

``````
x = pd.Series(
    data=[2, 3, 5, 7, 11, 13],
    index=[2, 11, 12, 30, 30, 51]
)
``````

and then you set a new variable, `y`, equal to `x`

``````
y = x
``````

Then you modify the first element of `y` to be 999.

``````
y.iloc[0] = 999
``````

Obviously this modifies `y`, but does it also modify `x`?

``````
x
## 2     999
## 11      3
## 12      5
## 30      7
## 30     11
## 51     13
## dtype: int64
``````

You might be surprised to see that even though we only changed `y`, `x` also gets modified. This happens because setting `y` equal to `x` doesn’t copy anything: Python simply makes `y` a second name for the same Series object, so `y` points to the very data stored by `x`. This is known as assignment by reference, and some people loosely call `y` a “view” of `x`.
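To check whether two variables refer to the same object, Python’s `is` operator is a quick test. Here is a minimal sketch, using the hypothetical names `s` and `alias` so it doesn’t interfere with the `x` and `y` above.

```python
import pandas as pd

s = pd.Series([2, 3, 5])
alias = s                  # plain assignment: no data is copied

same_object = alias is s   # both names refer to one Series object

alias.iloc[0] = 999        # a change made through one name...
first_value = s.iloc[0]    # ...is visible through the other
print(same_object, first_value)
```

If `same_object` is true, any change made through one name shows up through the other.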

In order to avoid this type of behavior, when you create `y`, you’ll want to explicitly set it equal to a copy of `x` using something like

``````
# Avoid assignment by reference
y = x.copy()
``````

Now if you change `y`, `x` is unchanged because `y`'s data is stored completely separate from `x`'s data.

``````
y.iloc[1] = -333

# x is unchanged
print(x)
## 2     999
## 11      3
## 12      5
## 30      7
## 30     11
## 51     13
## dtype: int64
``````
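If you want to confirm that a copy really is independent, one option is NumPy’s `np.shares_memory()`, which reports whether two arrays overlap in memory. The sketch below assumes an integer Series, and the names `s`, `alias` and `clone` are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([2, 3, 5, 7])
alias = s            # same object as s
clone = s.copy()     # deep copy by default: a new object with its own data

# Compare the underlying NumPy buffers
alias_shares = np.shares_memory(s.to_numpy(), alias.to_numpy())
clone_shares = np.shares_memory(s.to_numpy(), clone.to_numpy())

clone.iloc[0] = -1   # modify only the copy
print(alias_shares, clone_shares, s.iloc[0])
```

The alias should report shared memory and the copy should not, which is why editing `clone` leaves `s` alone.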

One of the reasons this is so confusing is that pandas sometimes hands you a view and sometimes a copy, and the circumstances aren’t clearly documented or always obvious. For example, if we have the Series

``````
foo = pd.Series(['a', 'b', 'c', 'd'])
``````

and we set `bar` as

``````
bar = foo.loc[foo <= 'b']
``````

and we modify `bar`

``````
bar.iloc[0] = 'z'
``````

`foo` doesn’t get changed, which means that under the hood pandas copied data from `foo` to create `bar`.

``````
foo
## 0    a
## 1    b
## 2    c
## 3    d
## dtype: object
``````

Now, if we set `baz = foo.iloc[:2]`, which selects exactly the same elements as `bar`,

``````
baz = foo.iloc[:2]
``````

and we modify `baz`

``````
baz.iloc[0] = 'z'
``````

this time, `foo` gets changed.

``````
foo
## 0    z
## 1    b
## 2    c
## 3    d
## dtype: object
``````

As far as I can tell, when it comes to Series, selecting with a boolean mask (as in `b.loc[mask]`) returns a copy, while slicing (as in `b.iloc[:2]`) returns a view, but this is undocumented and the rules change when we start using DataFrames. They also depend on your pandas version: with Copy-on-Write enabled (opt-in in pandas 2.x and slated to be the default in pandas 3.0), modifying `baz` would leave `foo` untouched. So I don’t recommend memorizing any hard and fast rules. Instead, experiment, use `.copy()` to be safe, and be aware that this quirky behavior exists. I know it sounds weird, but this is the kind of thing you get a feel for over time.
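Rather than guessing, you can probe a particular selection with `np.shares_memory()`. The sketch below uses an integer Series and the made-up names `nums`, `by_mask` and `by_slice`. Keep in mind that shared memory only tells you the data wasn’t duplicated. Whether a write propagates back to the original also depends on your pandas version and settings.

```python
import numpy as np
import pandas as pd

nums = pd.Series([10, 20, 30, 40])

by_mask = nums.loc[nums <= 20]   # boolean indexing
by_slice = nums.iloc[:2]         # positional slice selecting the same two elements

# Does each selection reuse the original's underlying buffer?
mask_shares = np.shares_memory(nums.to_numpy(), by_mask.to_numpy())
slice_shares = np.shares_memory(nums.to_numpy(), by_slice.to_numpy())
print(mask_shares, slice_shares)
```

When in doubt about a selection you plan to modify, calling `.copy()` on it removes the ambiguity.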

Another situation where it’s important to understand whether pandas is copying data is with pretty much any pandas function that modifies a Series’ data. For example, every Series has a method called `replace()` which lets you replace values with other values. In the case of `foo`, we can do something like replace every ‘a’ with ‘q’ and every ‘d’ with ‘p’. For example,

``````
foo.replace({'a':'q', 'd':'p'})
## 0    z
## 1    b
## 2    c
## 3    p
## dtype: object
``````

The result of this method is a copy of `foo` with the replaced values. So we’re not actually modifying `foo`, we’re just building a brand new Series from it.

Of course, if you wanted to update `foo` with these replacements, you could just set

``````
foo = foo.replace({'a':'q', 'd':'p'})
foo
## 0    z
## 1    b
## 2    c
## 3    p
## dtype: object
``````

This works, but it can be wasteful: pandas builds a whole new Series, `foo` is rebound to it, and the old Series is discarded. To circumvent this, lots of pandas functions have a parameter called `inplace` which, when `True`, tells pandas to modify the data you’re operating on rather than return a modified copy of it. So, rather than do `foo = foo.replace({'a':'q', 'd':'p'})`, you can just call `foo.replace({'a':'q', 'd':'p'}, inplace=True)`. Depending on the method and pandas version, `inplace=True` may still make a temporary copy internally, so don’t count on it for speed.

``````
foo.replace({'a':'q', 'd':'p'}, inplace=True)
foo
## 0    z
## 1    b
## 2    c
## 3    p
## dtype: object
``````
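One way to see the difference between reassigning and using `inplace=True` is to keep a second reference to the original Series and watch what happens to it. This is a minimal sketch, with the hypothetical names `letters` and `original`.

```python
import pandas as pd

letters = pd.Series(['a', 'b', 'c', 'd'])
original = letters                 # second name for the same object

# Reassignment: replace() returns a new Series and `letters` is rebound to it
letters = letters.replace({'a': 'q', 'd': 'p'})
rebound = letters is not original  # `original` still refers to the old object
old_values = original.tolist()     # the old data is untouched

# In place: the existing object itself is updated
original.replace({'a': 'q', 'd': 'p'}, inplace=True)
new_values = original.tolist()
print(rebound, old_values, new_values)
```

This matters when other variables still point at the original Series: reassignment leaves them with the old values, while `inplace=True` changes what they see.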