Python NumPy For Your Grandma - 3.4 Boolean Indexing

One of the most powerful features of NumPy is boolean indexing. In this section, we’ll see how you can use an array of boolean values to index another array.

For example, suppose we have a 3x3 array of non-negative integers called foo and we’d like to replace every 3 with 0.

import numpy as np

foo = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])

Running foo == 3 gives us a 3x3 array of boolean values which we’ll store in a variable called mask.

mask = foo == 3
print(mask)
## [[ True False False]
##  [False False  True]
##  [ True  True False]]

Now we can use this array of boolean values to index our original array, identifying which elements are 3, and setting them equal to 0.

foo[mask] = 0
print(foo)
## [[0 9 7]
##  [2 0 0]
##  [0 0 1]]
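As a minimal sketch of the same idea, the mask doesn’t need its own variable: the comparison can go straight inside the brackets. The name bar below is hypothetical, a fresh copy of the original array used only for illustration. The sketch also shows that reading (rather than assigning) with a 2d mask returns the matching elements as a 1d array.

```python
import numpy as np

# 'bar' is a hypothetical copy of the original foo, used only for illustration
bar = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])

# Reading with a 2d boolean mask returns the matching elements as a 1d array
threes = bar[bar == 3]
print(threes)
## [3 3 3 3]

# Build the mask and assign through it in a single statement
bar[bar == 3] = 0
print(bar)
## [[0 9 7]
##  [2 0 0]
##  [0 0 1]]
```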

Just like integer arrays, we can use 1d boolean arrays to pick out specific rows or columns of a 2d array. We’ll start by making two length-3 boolean arrays to select rows and columns from foo.

rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])

Here we return rows 1 and 3.

foo[rows_1_and_3]
## array([[0, 9, 7],
##        [0, 0, 1]])

And here we return columns 2 and 3.

foo[:, cols_2_and_3]
## array([[9, 7],
##        [0, 0],
##        [0, 1]])

However, if we combine these row and column selectors together, you might be surprised to see the result is the 1d array [9,1].

foo[rows_1_and_3, cols_2_and_3]
## array([9, 1])

That’s because NumPy treats these boolean indices like integer indices, where the integers are the positions of the True elements. In other words, NumPy treats the boolean index array [True, False, True] just like the integer index array [0, 2], and the boolean index array [False, True, True] just like the integer index array [1, 2].

So, when we selected rows 1 and 3 with our boolean index, it’s the same as if we used the integer row index [0, 2].

foo[[0, 2]]  # same as foo[rows_1_and_3]
## array([[0, 9, 7],
##        [0, 0, 1]])

And when we selected columns 2 and 3 with our boolean index, it’s the same as if we used the integer column index [1, 2].

foo[:, [1, 2]]  # same as foo[:, cols_2_and_3]
## array([[9, 7],
##        [0, 0],
##        [0, 1]])

And when we combined these indexes, it’s just as if we used a combination of integer indexes, which, if you remember, selects elements based on corresponding indices. In this case, that’s the elements at positions (0,1) and (2,2).

foo[[0, 2], [1, 2]]  # same as foo[rows_1_and_3, cols_2_and_3]
## array([9, 1])
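If what you actually wanted was the 2x2 block where the selected rows and columns intersect, one option is np.ix_, which accepts boolean arrays and pairs every selected row with every selected column. The sketch below is one way to do it, not the only one. It rebuilds foo as it looks at this point in the section, and it uses np.flatnonzero to show the integer positions of the True values; the names row_idx, col_idx, and block are introduced here for illustration.

```python
import numpy as np

# foo as it looks after the 3s were replaced with 0s
foo = np.array([
    [0, 9, 7],
    [2, 0, 0],
    [0, 0, 1]
])
rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])

# Positions of the True values, i.e. the equivalent integer indices
row_idx = np.flatnonzero(rows_1_and_3)
col_idx = np.flatnonzero(cols_2_and_3)
print(row_idx, col_idx)
## [0 2] [1 2]

# np.ix_ pairs every selected row with every selected column
block = foo[np.ix_(rows_1_and_3, cols_2_and_3)]
print(block)
## [[9 7]
##  [0 1]]
```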

Logical operators let us combine boolean arrays element by element. They include the “bitwise-and” operator (&), the “bitwise-or” operator (|), and the “bitwise-xor” operator (^).

b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])

# bitwise-and
print(b1 & b2)
## [False False False  True]

# bitwise-or
print(b1 | b2)
## [False  True  True  True]

# bitwise-xor
print(b1 ^ b2)
## [False  True  True False]

We can also negate a boolean array by preceding it with a tilde.

~np.array([False, True])
## array([ True, False])
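NumPy also provides named functions that do the same jobs as these operators on boolean arrays: np.logical_and, np.logical_or, np.logical_xor, and np.logical_not. The sketch below checks that each one agrees with its operator counterpart for b1 and b2; the *_result variable names are introduced here for illustration.

```python
import numpy as np

b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])

and_result = np.logical_and(b1, b2)  # matches b1 & b2
or_result = np.logical_or(b1, b2)    # matches b1 | b2
xor_result = np.logical_xor(b1, b2)  # matches b1 ^ b2
not_result = np.logical_not(b1)      # matches ~b1

print(and_result)
## [False False False  True]
print(not_result)
## [ True  True False False]

# Each function agrees with its operator counterpart on boolean arrays
print(np.array_equal(or_result, b1 | b2), np.array_equal(xor_result, b1 ^ b2))
## True True
```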

Let’s see some examples of how you might use boolean indexing in practice. Here we have an array of person names and corresponding arrays for their age and gender.

names = np.array(["Dennis", "Dee", "Charlie", "Mac", "Frank"])
ages = np.array([43, 44, 43, 42, 74])
genders = np.array(['male', 'female', 'male', 'male', 'male'])

With boolean indexing, we can answer questions like,

Who’s at least 44?

names[ages >= 44]
## array(['Dee', 'Frank'], dtype='<U7')

Which males are over 42?

names[(genders == "male") & (ages > 42)]
## array(['Dennis', 'Charlie', 'Frank'], dtype='<U7')

Who’s not a male or who is younger than 43?

names[~(genders == "male") | (ages < 43)]
## array(['Dee', 'Mac'], dtype='<U7')

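Note that when we combine multiple boolean expressions, we have to wrap each expression in parentheses. Python’s bitwise operators (&, |, ^) bind more tightly than comparison operators like == and >, so without parentheses the bitwise operation would be evaluated before the comparisons.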
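To see what goes wrong without the parentheses, here is a small sketch using the ages array from above. The upper bound of 50 and the names ok and error_message are made up for illustration. Python reads the unparenthesized version as the chained comparison ages > (42 & ages) < 50, which needs a single True/False from an array and so raises a ValueError.

```python
import numpy as np

ages = np.array([43, 44, 43, 42, 74])

# With parentheses, each comparison is evaluated first
ok = (ages > 42) & (ages < 50)
print(ok)
## [ True  True  True False False]

# Without parentheses, & is applied before > and <, so Python reads this as
# ages > (42 & ages) < 50, a chained comparison that needs a single True/False
error_message = None
try:
    bad = ages > 42 & ages < 50
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```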

