Python NumPy For Your Grandma | Section 3.4 | Boolean Indexing
December 29, 2019

Table Of Contents

  1. Introduction
  2. NumPy Arrays
    2.1 What’s A NumPy Array
    2.2 Creating NumPy Arrays
    2.3 Indexing And Modifying 1-D Arrays
    2.4 Indexing And Modifying Multidimensional Arrays
    2.5 Basic Math
  3. Intermediate Array Stuff
    3.1 Broadcasting
    3.2 newaxis
    3.3 reshape
    3.4 boolean indexing
    3.5 nan
    3.6 infinity
    3.7 random
  4. Common Operations
    4.1 where
    4.2 Math Funcs
    4.3 all and any
    4.4 concatenate
    4.5 Stacking
    4.6 Sorting
    4.7 unique
  5. Challenges

This video covers boolean indexing, a technique for subsetting arrays using logical conditions.

Code

import numpy as np

# 3x3 array of non-negative integers. we want to replace every 3 with 0
foo = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])

# Checking foo == 3, numpy gives us a 3x3 array of boolean values
mask = foo == 3

# Now we can use this array of boolean values to index our original array, selecting the elements equal to 3
foo[mask]      # array([3, 3, 3, 3])
foo[mask] = 0  # set those elements to 0
foo            # [[0, 9, 7], [2, 0, 0], [0, 0, 1]]

# use 1d boolean arrays to pick out specific rows or columns
rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])
foo[rows_1_and_3]  # returns rows 1 and 3
foo[:, cols_2_and_3]  # returns cols 2 and 3

# Be careful about using multiple boolean indexes together
foo[rows_1_and_3, cols_2_and_3]  # equivalent to foo[[0,2], [1,2]]; returns [9, 1], not a 2x2 block

# --- logical operators ---------------------------------------

# combine boolean arrays using bitwise operators
b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])
b1 & b2  # [False, False, False,  True]  # and
b1 | b2  # [False,  True,  True,  True]  # or
b1 ^ b2  # [False,  True,  True, False]  # xor

# negation
~np.array([False, True])  # [ True, False]

# Examples
names = np.array(["Dennis", "Dee", "Charlie", "Mac", "Frank"])
ages = np.array([43, 44, 43, 42, 74])
genders = np.array(['male', 'female', 'male', 'male', 'male'])

# Who's at least 44?
names[ages >= 44]  # ['Dee', 'Frank']

# Which males are over 42?
names[(genders == "male") & (ages > 42)]  # ['Dennis', 'Charlie', 'Frank']

# Who's not a male or is younger than 43?
names[~(genders == "male") | (ages < 43)]  # ['Dee', 'Mac']

Transcript

One of the most powerful features of NumPy is boolean indexing. With boolean indexing, you can use an array of boolean values to subset another array. For example, suppose we have a 3x3 array of non-negative integers called foo and we’d like to replace every 3 with 0.
Running foo == 3 gives us a 3x3 array of boolean values which we’ll store in a variable called mask.
Now we can use this array of boolean values to index our original array, identifying which elements are 3, and setting them equal to 0.
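
As a quick illustration of these steps, here is a minimal sketch that prints each intermediate value. It reuses the foo array from the Code section. The one-line form at the end (with a made-up array called bar) is a shorthand that is not shown in the video.

```python
import numpy as np

foo = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])

# Elementwise comparison produces a boolean array with the same shape as foo
mask = foo == 3
print(mask)
# [[ True False False]
#  [False False  True]
#  [ True  True False]]

# Indexing with the mask returns the selected elements as a 1-d array
selected = foo[mask]
print(selected)  # [3 3 3 3]

# Assigning through the mask modifies foo in place
foo[mask] = 0
print(foo)
# [[0 9 7]
#  [2 0 0]
#  [0 0 1]]

# Shorthand: build the mask inline instead of storing it in a variable
bar = np.array([3, 1, 3])  # hypothetical array, for illustration only
bar[bar == 3] = 0
print(bar)  # [0 1 0]
```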

Just like integer arrays, we can use 1d boolean arrays to pick out specific rows or columns of a 2-d array. We’ll start by making two length-3 boolean arrays to select rows and columns from foo.
Here we return rows 1 and 3.
Here we return columns 2 and 3.
However, if we combine these row and column selectors together, you might be surprised to see the result is the 1d array “9 comma 1”. That’s because numpy treats these boolean selectors like integer selectors where the integers used are the indexes of True elements. In other words, the boolean array “True, False, True”, numpy treats just like the integer array “0 comma 2” and the boolean array “False, True, True”, numpy treats just like the integer array “1 comma 2”.
So, when we selected rows 1 and 3 with our boolean index, it’s the same as if we used the integer index 0 comma 2.
And when we selected columns 2 and 3 with our boolean index, it’s just like using the integer index 1 comma 2.
And when we combined these indexes, it’s just as if we used a combination of integer indexes, which if you remember, selects the pair of elements (0,1) and (2,2).
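
To make the pairing behavior concrete, the sketch below compares the combined boolean index with its integer equivalent. It also shows two ways to get the 2x2 block of rows 1 and 3 crossed with columns 2 and 3, in case that is what you wanted: chaining two indexing steps, or using np.ix_. Neither appears in the video, so treat them as one possible approach. Here foo is the array after the 3s were replaced with 0.

```python
import numpy as np

foo = np.array([
    [0, 9, 7],
    [2, 0, 0],
    [0, 0, 1]
])
rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])

# Each boolean selector behaves like the integer positions of its True values
row_idx = np.flatnonzero(rows_1_and_3)  # [0 2]
col_idx = np.flatnonzero(cols_2_and_3)  # [1 2]

# Combined selectors are paired up elementwise: (0, 1) and (2, 2)
paired = foo[rows_1_and_3, cols_2_and_3]
print(paired)                 # [9 1]
print(foo[row_idx, col_idx])  # [9 1]

# To get every selected row crossed with every selected column instead:
block_chained = foo[rows_1_and_3][:, cols_2_and_3]
block_ix = foo[np.ix_(rows_1_and_3, cols_2_and_3)]
print(block_ix)
# [[9 7]
#  [0 1]]
```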

Logical operators let us combine boolean arrays. They include

  • the bitwise and operator (&)
  • the bitwise or operator (|)
  • and the bitwise xor operator (^)

We can also negate a boolean array by preceding it with a tilde.
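
For boolean arrays, these operators give the same results as NumPy's named functions np.logical_and, np.logical_or, np.logical_xor, and np.logical_not. The sketch below checks that on the b1 and b2 arrays from the Code section. The named functions are not used in the video and are shown here only for comparison.

```python
import numpy as np

b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])

print(b1 & b2)  # [False False False  True]
print(b1 | b2)  # [False  True  True  True]
print(b1 ^ b2)  # [False  True  True False]
print(~b1)      # [ True  True False False]

# On boolean arrays, each operator agrees with its named function
same_and = np.array_equal(b1 & b2, np.logical_and(b1, b2))
same_or = np.array_equal(b1 | b2, np.logical_or(b1, b2))
same_xor = np.array_equal(b1 ^ b2, np.logical_xor(b1, b2))
same_not = np.array_equal(~b1, np.logical_not(b1))
print(same_and, same_or, same_xor, same_not)  # True True True True
```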

Let’s see some examples of logical operators in use.
Here we have an array of person names and corresponding arrays for their age and gender. With boolean indexing, we can answer questions like

  • Who’s at least 44?
  • Which males are over 42?
  • and who’s not a male or is younger than 43?

Note that when we combine multiple boolean expressions, we have to wrap each expression in parentheses. In Python, the bitwise operators bind more tightly than comparisons like == and >, so without parentheses the bitwise operators would be evaluated first.
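
Here is a minimal sketch of what goes wrong when the parentheses are left out, using the ages array from the example. The upper bound of 50 is an arbitrary value chosen for illustration.

```python
import numpy as np

ages = np.array([43, 44, 43, 42, 74])

# With parentheses: each comparison is evaluated first, then combined
ok = (ages > 42) & (ages < 50)
print(ok)  # [ True  True  True False False]

# Without parentheses this parses as: ages > (42 & ages) < 50,
# a chained comparison that needs the truth value of a whole array
try:
    bad = ages > 42 & ages < 50
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

The unparenthesized version fails with a ValueError rather than silently giving a wrong answer, so the mistake is usually easy to spot.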

