# Python Pandas For Your Grandpa - 2.4 Series Boolean Indexing

In this section, we’ll see how to use boolean indexing to select values from a Series based on logical conditions. Just like NumPy arrays, you can subset a Pandas Series using a boolean index. For example, say you have a Series of integers called `foo`:

```
import numpy as np
import pandas as pd
foo = pd.Series([20, 50, 11, 45, 17, 31])
print(foo)
## 0    20
## 1    50
## 2    11
## 3    45
## 4    17
## 5    31
## dtype: int64
```

If you check `foo < 20`, you’ll get back a corresponding Series of boolean values.

```
foo < 20
## 0    False
## 1    False
## 2     True
## 3    False
## 4     True
## 5    False
## dtype: bool
```

If you assign that Series to a variable called `mask`, you can use it to subset `foo`, picking out values less than 20.

```
mask = foo < 20
foo.loc[mask]
## 2    11
## 4    17
## dtype: int64
```

Or, if you want to skip the intermediate step, you can do it in one line:

```
foo.loc[foo < 20]
## 2    11
## 4    17
## dtype: int64
```
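As a small aside, plain square brackets accept a boolean Series as well. The sketch below rebuilds `foo` so it runs on its own; the names `via_loc` and `via_brackets` are only illustrative.

```python
import pandas as pd

foo = pd.Series([20, 50, 11, 45, 17, 31])

# Both forms select the rows where the boolean Series is True
via_loc = foo.loc[foo < 20]
via_brackets = foo[foo < 20]

print(via_loc.equals(via_brackets))  # True
print(via_loc.tolist())              # [11, 17]
```

Using `.loc` makes it explicit that the selection is label-based, which matters once the index is no longer a simple 0, 1, 2, ... sequence.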

Now, you might think that the ith value in `foo` gets returned if the ith value in `mask` is True. And you’d kind of be right, but watch what happens if we swap the index labels 4 and 5 in `foo` and then do the exact same boolean subset using `mask`.

```
foo.index = [0, 1, 2, 3, 5, 4]
foo.loc[mask]
## 2    11
## 4    31
## dtype: int64
```

This time, the result includes 31 instead of 17. That’s because `foo.loc[mask]` picks out the elements of `foo` whose index labels match the labels of `mask` where `mask` is True. Usually this is fine, but sometimes it’s not what you want. If you’d rather include or exclude values of `foo` by the corresponding positions of True and False values in `mask`, just use `mask`’s underlying NumPy array to subset `foo`, like this:

```
foo.loc[mask.to_numpy()]
## 2    11
## 5    17
## dtype: int64
```

In this case the third and fifth values of `mask` are True, so we get back the third and fifth values of `foo`.
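To see both behaviors side by side, here is a minimal self-contained sketch. It rebuilds `foo` and `mask` from scratch, and it uses `.iloc` with the NumPy array for the positional case, which should select the same rows as `foo.loc[mask.to_numpy()]`. The names `by_label` and `by_position` are only illustrative.

```python
import pandas as pd

foo = pd.Series([20, 50, 11, 45, 17, 31])
mask = foo < 20                     # built while the labels still run 0..5
foo.index = [0, 1, 2, 3, 5, 4]      # swap labels 4 and 5

by_label = foo.loc[mask]                 # mask is aligned to foo by index label
by_position = foo.iloc[mask.to_numpy()]  # raw booleans, applied by position

print(by_label.tolist())       # [11, 31]
print(by_position.tolist())    # [11, 17]
```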

If you want to combine boolean Series together, you can do that too, using `&` for *and* and `|` for *or*. Note that when you combine two boolean Series, Pandas matches and combines boolean values based on their index labels, not their positions.
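Here is a tiny sketch of that label matching, using two made-up Series `a` and `b` that hold the same labels in a different order (neither name comes from the lesson).

```python
import pandas as pd

a = pd.Series([True, True, False], index=['x', 'y', 'z'])
b = pd.Series([True, False, True], index=['z', 'y', 'x'])  # same labels, reversed

both = a & b     # x: True & True,  y: True & False, z: False & True
either = a | b   # x: True | True,  y: True | False, z: False | True

print(both[both].index.tolist())  # ['x']
print(either.all())               # True
```

If the values were combined by position instead, `a & b` would be True only for the first element paired with the first element, and so on, which would give a different answer here.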

For example, suppose we have a Series called `ages` with the ages of five people,

```
ages = pd.Series(
    data = [42, 43, 14, 18, 1],
    index = ['peter', 'lois', 'chris', 'meg', 'stewie']
)
print(ages)
## peter     42
## lois      43
## chris     14
## meg       18
## stewie     1
## dtype: int64
```

and a corresponding Series called `genders` with their gender.

```
genders = pd.Series(
    data = ['female', 'female', 'male', 'male', 'male'],
    index = ['lois', 'meg', 'chris', 'peter', 'stewie'],
    dtype = 'string'
)
print(genders)
## lois      female
## meg       female
## chris       male
## peter       male
## stewie      male
## dtype: string
```

Even though their indexes are in a different order, we can still answer questions like,

*Who’s a male younger than 18?*

```
mask = (genders == 'male') & (ages < 18)
mask.loc[mask]
## chris     True
## stewie    True
## dtype: bool
```

In this case, we make a Series to identify whether each person is a male, and a second Series to identify whether each person is younger than 18. Then we combine them with an ampersand - i.e. the elementwise *and* operator - to identify whether each person is a male and younger than 18. Then if we assign that to a variable called `mask`, we can index it with itself to keep only the True entries. The names of the males younger than 18 are in the index of the result.
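If you want the names as a plain Python list, or the matching ages, one possible approach is sketched below. It rebuilds `ages` and `genders` so it runs on its own; `young_males` and `their_ages` are illustrative names.

```python
import pandas as pd

ages = pd.Series(
    data=[42, 43, 14, 18, 1],
    index=['peter', 'lois', 'chris', 'meg', 'stewie']
)
genders = pd.Series(
    data=['female', 'female', 'male', 'male', 'male'],
    index=['lois', 'meg', 'chris', 'peter', 'stewie'],
    dtype='string'
)

mask = (genders == 'male') & (ages < 18)

# Keep only the True entries, then pull the labels out of the index
young_males = mask.loc[mask].index.tolist()
print(young_males)            # ['chris', 'stewie']

# The same mask can subset ages, since .loc matches on index labels
their_ages = ages.loc[mask]
print(their_ages.tolist())    # [14, 1]
```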

We can also use the `~` operator to negate a boolean Series. So for example, if we do `~mask`, we can determine “Who’s not both male and younger than 18?” In other words, “Who *is* female or *is* at least 18?”

```
~mask
## chris     False
## lois       True
## meg        True
## peter      True
## stewie    False
## dtype: bool
```
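The two phrasings of that question are equivalent, and a quick check can confirm it. The sketch below uses two hand-built boolean Series, `is_male` and `is_minor` (illustrative names), that share one index.

```python
import pandas as pd

people = ['peter', 'lois', 'chris', 'meg', 'stewie']
is_male = pd.Series([True, False, True, False, True], index=people)
is_minor = pd.Series([False, False, True, False, True], index=people)

negated = ~(is_male & is_minor)         # not (male and younger than 18)
rewritten = (~is_male) | (~is_minor)    # female or at least 18

print(negated.equals(rewritten))          # True
print(negated[negated].index.tolist())    # ['peter', 'lois', 'meg']
```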

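When you combine boolean Series, make sure you wrap each condition in parentheses. In Python, `&` and `|` bind more tightly than comparison operators like `>=` and `<=`, so without parentheses the expression is evaluated in the wrong order and you’ll probably get an error. For example, if we try to find people between 18 and 42 like this, Python reads it as `ages >= (18 & ages) <= 42` and we’ll get an error.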

```
ages.loc[ages >= 18 & ages <= 42] # ERROR
```

The solution here is just to wrap the conditions in parentheses like this.

```
ages.loc[(ages >= 18) & (ages <= 42)]
## peter    42
## meg      18
## dtype: int64
```
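The sketch below catches the error from the unparenthesized version so you can see what kind it is, then shows `Series.between()` as one alternative for range checks, which includes both endpoints by default. The variable names `error_name` and `in_range` are illustrative.

```python
import pandas as pd

ages = pd.Series(
    data=[42, 43, 14, 18, 1],
    index=['peter', 'lois', 'chris', 'meg', 'stewie']
)

# Without parentheses, Python evaluates `18 & ages` first and then tries a
# chained comparison, which needs a single True/False from a whole Series.
try:
    ages.loc[ages >= 18 & ages <= 42]
    error_name = None
except ValueError as err:
    error_name = type(err).__name__

print(error_name)  # ValueError

# between() expresses the same range check with no & at all
in_range = ages.loc[ages.between(18, 42)]
print(in_range.index.tolist())  # ['peter', 'meg']
```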
