# Python NumPy For Your Grandma - 3.4 Boolean Indexing

One of the most powerful features of NumPy is boolean indexing. In this section, we’ll see how you can use an array of boolean values to index another array.

For example, suppose we have a 3x3 array of positive integers called `foo` and we’d like to replace every 3 with 0.

``````
import numpy as np

foo = np.array([
[3, 9, 7],
[2, 0, 3],
[3, 3, 1]
])
``````

Running `foo == 3` gives us a 3x3 array of boolean values, which we’ll store in a variable called `mask`.

``````
mask = foo == 3
print(mask)
## [[ True False False]
##  [False False  True]
##  [ True  True False]]
``````

Now we can use this array of boolean values to index our original array, identifying which elements are 3, and setting them equal to 0.

``````
foo[mask] = 0
print(foo)
## [[0 9 7]
##  [2 0 0]
##  [0 0 1]]
``````
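As a small aside, a boolean mask can also read values rather than assign them. The sketch below rebuilds `foo` so that it runs on its own; the variable name `threes` is ours and is not part of the original example.

```python
import numpy as np

foo = np.array([
    [3, 9, 7],
    [2, 0, 3],
    [3, 3, 1]
])

mask = foo == 3

# Indexing with a 2d mask returns the selected elements as a 1d array
threes = foo[mask]
print(threes)
## [3 3 3 3]

# True counts as 1, so summing the mask counts the matches
print(mask.sum())
## 4

# The mask doesn't need its own variable
foo[foo == 3] = 0
print(foo)
## [[0 9 7]
##  [2 0 0]
##  [0 0 1]]
```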

Just as with integer arrays, we can use 1d boolean arrays to pick out specific rows or columns of a 2d array. We’ll start by making two length-3 boolean arrays to select rows and columns from `foo`.

``````
rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])
``````

Here we return rows 1 and 3.

``````
foo[rows_1_and_3]
## array([[0, 9, 7],
##        [0, 0, 1]])
``````

And here we return columns 2 and 3.

``````
foo[:, cols_2_and_3]
## array([[9, 7],
##        [0, 0],
##        [0, 1]])
``````

However, if we combine these row and column selectors, you might be surprised to see that the result is the 1d array `[9, 1]`.

``````
foo[rows_1_and_3, cols_2_and_3]
## array([9, 1])
``````

That’s because NumPy treats these boolean indices like integer indices, where the integers used are the positions of the True elements. In other words, NumPy treats the boolean index array `[True, False, True]` just like the integer index array `[0, 2]`, and it treats the boolean index array `[False, True, True]` just like the integer index array `[1, 2]`.
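If you’d like to see those integer positions for yourself, `np.flatnonzero` returns the positions of the True elements in an array. This is only a way of inspecting the equivalence, not necessarily how NumPy does the conversion internally.

```python
import numpy as np

rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])

# Positions of the True elements in each boolean array
row_idx = np.flatnonzero(rows_1_and_3)
col_idx = np.flatnonzero(cols_2_and_3)

print(row_idx)
## [0 2]
print(col_idx)
## [1 2]
```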

So, when we selected rows 1 and 3 with our boolean index, it’s the same as if we used the integer row index `[0, 2]`.

``````
foo[[0, 2]]  # same as foo[rows_1_and_3]
## array([[0, 9, 7],
##        [0, 0, 1]])
``````

And when we selected columns 2 and 3 with our boolean index, it’s the same as if we used the integer column index `[1, 2]`.

``````
foo[:, [1, 2]]  # same as foo[:, cols_2_and_3]
## array([[9, 7],
##        [0, 0],
##        [0, 1]])
``````

And when we combined these indexes, it was just as if we had used a pair of integer index arrays, which, if you remember, selects elements by pairing up corresponding indices. In this case, that means the elements at positions (0, 1) and (2, 2).

``````
foo[[0, 2], [1, 2]]  # same as foo[rows_1_and_3, cols_2_and_3]
## array([9, 1])
``````
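If what you actually wanted was the 2x2 block where rows 1 and 3 cross columns 2 and 3, one option is `np.ix_`, which accepts boolean arrays and pairs every selected row with every selected column. The sketch below rebuilds `foo` in its current state so it runs on its own; the variable name `sub` is ours.

```python
import numpy as np

foo = np.array([
    [0, 9, 7],
    [2, 0, 0],
    [0, 0, 1]
])
rows_1_and_3 = np.array([True, False, True])
cols_2_and_3 = np.array([False, True, True])

# np.ix_ builds an open mesh from the two selectors
sub = foo[np.ix_(rows_1_and_3, cols_2_and_3)]
print(sub)
## [[9 7]
##  [0 1]]

# Selecting rows first and columns second gives the same block
print(foo[rows_1_and_3][:, cols_2_and_3])
## [[9 7]
##  [0 1]]
```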

Logical operators let us combine boolean arrays. They include the “bitwise-and” operator (`&`), the “bitwise-or” operator (`|`), and the “bitwise-xor” operator (`^`).

``````
b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])

# bitwise-and
print(b1 & b2)
## [False False False  True]

# bitwise-or
print(b1 | b2)
## [False  True  True  True]

# bitwise-xor
print(b1 ^ b2)
## [False  True  True False]
``````

We can also negate a boolean array by preceding it with a tilde.

``````
~np.array([False, True])
## array([ True, False])
``````
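You might wonder why we don’t just use Python’s `and`, `or`, and `not` keywords. Those ask for a single truth value, which a multi-element array can’t provide. The sketch below shows the error, along with the function forms `np.logical_and`, `np.logical_or`, `np.logical_xor`, and `np.logical_not`, which give the same results as the operators on boolean arrays.

```python
import numpy as np

b1 = np.array([False, False, True, True])
b2 = np.array([False, True, False, True])

# `and` needs one truth value per operand, so this raises a ValueError
try:
    b1 and b2
except ValueError as err:
    print("ValueError:", err)

# Function equivalents of &, |, ^ and ~ for boolean arrays
print(np.logical_and(b1, b2))
## [False False False  True]
print(np.logical_or(b1, b2))
## [False  True  True  True]
print(np.logical_xor(b1, b2))
## [False  True  True False]
print(np.logical_not(b1))
## [ True  True False False]
```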

Let’s see some examples of how you might use boolean indexing in practice. Here we have an array of people’s names and corresponding arrays for their ages and genders.

``````
names = np.array(["Dennis", "Dee", "Charlie", "Mac", "Frank"])
ages = np.array([43, 44, 43, 42, 74])
genders = np.array(['male', 'female', 'male', 'male', 'male'])
``````

With boolean indexing, we can answer questions like,

Who’s at least 44?

``````
names[ages >= 44]
## array(['Dee', 'Frank'], dtype='<U7')
``````

Which males are over 42?

``````
names[(genders == "male") & (ages > 42)]
## array(['Dennis', 'Charlie', 'Frank'], dtype='<U7')
``````

Who’s not a male, or who’s younger than 43?

``````
names[~(genders == "male") | (ages < 43)]
## array(['Dee', 'Mac'], dtype='<U7')
``````

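Note that when we combine multiple boolean expressions, we have to wrap each expression in parentheses. In Python, `&`, `|`, and `^` bind more tightly than comparison operators like `==` and `<`, so without parentheses the bitwise operator would be applied before the comparisons.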
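Here’s a small sketch of what goes wrong when the parentheses are left out. It reuses the `ages` array from above; the upper bound of 50 is just a value we picked for illustration.

```python
import numpy as np

ages = np.array([43, 44, 43, 42, 74])

# With parentheses, each comparison is evaluated first
between = (ages > 42) & (ages < 50)
print(between)
## [ True  True  True False False]

# Without them, Python reads this as ages > (42 & ages) < 50,
# a chained comparison that needs a single truth value from an array
try:
    ages > 42 & ages < 50
except ValueError as err:
    print("ValueError:", err)
```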