December 29, 2019

This video covers NumPy broadcasting, a technique for performing operations on arrays with different shapes and sizes.

Code

import numpy as np

#  4x3 array
bart = np.array([
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]
])

# add 5 to the 1st column, 3 to the 2nd column and 10 to the 3rd column
lisa = np.array([
[5, 3, 10],
[5, 3, 10],
[5, 3, 10],
[5, 3, 10]
])
bart + lisa

# same result by adding a 1x3 array
bart + np.array([[5, 3, 10]])

# same result by adding a 1-D array with 3 elements
bart + np.array([5, 3, 10])

# shift and scale array
np.array([1,2,3]) + 0.5  # [1.5, 2.5, 3.5]
np.array([1,2,3]) * -1   # [-1, -2, -3]

# try adding bart to a 4 element vector
bart + np.array([0, 0, 0, 0])  # ValueError: operands could not be broadcast together with shapes (4,3) (4,)

## So how does broadcasting work and when can we use it?
## Suppose we want to add or multiply two arrays, A and B
## Moving backwards from the last dimension of each array, we check if their dimensions are compatible
## Dimensions are compatible if they are equal or either of them is 1
## If all of A's dimensions are compatible with B's dimensions, or vice versa, they are compatible arrays

### Examples

# Example 1
np.random.seed(1234)
A = np.random.randint(low = 1, high = 10, size = (3, 4))
B = np.random.randint(low = 1, high = 10, size = (3, 1))

A.shape  # (3, 4)
B.shape  # (3, 1)
#           ^  ^
#         compatible

# Example 2
np.random.seed(4321)
A = np.random.randint(low = 1, high = 10, size = (4, 4))
B = np.random.randint(low = 1, high = 10, size = (2, 1))

A.shape  # (4, 4)
B.shape  # (2, 1)
#           ^
#         not compatible (4 vs 2, and neither is 1)

# Example 3
np.random.seed(1111)
A = np.random.randint(low = 1, high = 10, size = (3, 1, 4))
B = np.random.randint(low = 1, high = 10, size = (2, 1))

A.shape  # (3, 1, 4)
B.shape  # (   2, 1)
#           ^  ^  ^
#         compatible

Transcript

NumPy uses broadcasting to make arrays with different shapes play together nicely. For example, suppose we have the following 4x3 array called bart, and we’d like to add 5 to the 1st column, 3 to the 2nd column and 10 to the 3rd column.
We’ve seen how we can create a corresponding 4x3 array, which we’ll call lisa, and add the two together. But NumPy gives us a much better way to do this.

Instead of making a 4x3 array with repeated rows, we can add bart to a 1x3 array and NumPy will fill in the gaps for us.
In fact, we can reduce that 1x3 array to a simple 1-D array with 3 elements and the addition will work all the same.
This functionality of operating on arrays with different shapes and dimensionalities is called broadcasting.
You may have seen examples where an array is shifted or scaled by some constant. This is perhaps the simplest form of broadcasting.
Now, if we try to add bart to a 4 element array, we’ll get the error “operands could not be broadcast together with shapes (4,3) (4,)”. So how does broadcasting work and when can we use it?
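
To make the failure concrete, here is a small sketch that catches the error and then shows one shape that does work. The names vec, col and shifted, and the values in col, are made up for illustration; they are not part of the video's code.

```python
import numpy as np

bart = np.array([
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
])
vec = np.array([0, 0, 0, 0])  # shape (4,), last dimension 4 vs bart's 3

try:
    bart + vec
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)

# Illustrative fix: a (4, 1) column lines up with bart's 4 rows,
# and its last dimension of 1 is compatible with bart's 3 columns
col = np.array([10, 20, 30, 40]).reshape(4, 1)
shifted = bart + col
print(shifted)
```

The 4 element vector fails because broadcasting compares the last dimensions first (3 against 4). Reshaping it to 4x1 puts the 4 where it can match bart's rows.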

Suppose we want to add or multiply two arrays, A and B.
Moving backwards from the last dimension of each array, we check if their dimensions are compatible. Dimensions are compatible if they are equal or either of them is 1.
If all of A’s dimensions are compatible with B’s dimensions, or vice versa, they are compatible arrays. Let’s see how this works with some examples.
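
The rule can be written down as a few lines of plain Python. This is only a sketch of the rule as stated above: shapes_are_compatible is a hypothetical helper, not a NumPy function.

```python
def shapes_are_compatible(shape_a, shape_b):
    """Hypothetical helper: apply the trailing-dimension rule to two shapes."""
    # Walk both shapes from the last dimension backwards. zip stops at the
    # shorter shape, so extra leading dimensions are never checked.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


print(shapes_are_compatible((3, 4), (3, 1)))     # True
print(shapes_are_compatible((4, 4), (2, 1)))     # False
print(shapes_are_compatible((3, 1, 4), (2, 1)))  # True
print(shapes_are_compatible((4, 3), (4,)))       # False
```

Recent NumPy versions (1.20 and later) also provide np.broadcast_shapes, which returns the resulting shape or raises a ValueError when the shapes are incompatible.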

Here, A is a 3x4 array and B is a 3x1 array.
We start by comparing the last dimension of each array.
Since the last dimension of A is 4 and the last dimension of B is 1, NumPy can expand B by making 4 copies of it along its second dimension. So, these dimensions are compatible.
Now we have to compare the 1st dimension of A and B. Since they’re both 3, they’re compatible. The only thing left for NumPy is to carry out whatever procedure we wanted on two equivalently sized 3x4 arrays.
In practice, NumPy doesn’t actually follow this algorithm where it expands B, since copying the data would waste a lot of memory. However, conceptually, this is exactly what NumPy does, and it’s a good mental model to understand how broadcasting works.
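
One way to see the "expanded" B without copying anything is np.broadcast_to, which returns a read-only view with the requested shape. The sketch below reuses A and B from Example 1; B_expanded and same are names introduced here for illustration.

```python
import numpy as np

np.random.seed(1234)
A = np.random.randint(low=1, high=10, size=(3, 4))
B = np.random.randint(low=1, high=10, size=(3, 1))

# A (3, 4) view of B; the underlying data is still B's 3 values
B_expanded = np.broadcast_to(B, A.shape)

print(B_expanded.shape)    # (3, 4)
print(B_expanded.strides)  # a stride of 0 on the last axis: the same value is reused

# Adding the view gives the same result as letting NumPy broadcast B itself
same = np.array_equal(A + B, A + B_expanded)
print(same)  # True
```

The stride of 0 along the last axis is what lets the view look like 4 copies of each value while storing only one.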

Let’s see another example.
Here, A is a 4x4 array and B is a 2x1 array.
The last dimension of A is 4 and the last dimension of B is 1, so these dimensions are compatible, and just like the last example we can temporarily transform B by making 4 copies of it along its 2nd axis.
Now we compare the 1st dimension of each array. In this case, A is 4 and B is 2. They aren’t equal and neither of them is 1, so there isn’t an obvious way to expand B into a 4x4 array to match A or vice versa. These arrays are not compatible.
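
A quick check of this case, using arrays of ones instead of random integers since only the shapes matter. The names failed and ok_shapes, and the alternative shapes tried at the end, are illustrative assumptions.

```python
import numpy as np

A = np.ones((4, 4))
B = np.ones((2, 1))

try:
    A + B
    failed = False
except ValueError as err:
    failed = True
    print(err)

# For comparison: a 1st dimension of 1 or 4 would have been compatible with A
ok_shapes = [(A + np.ones(shape)).shape for shape in [(1, 1), (4, 1)]]
print(ok_shapes)  # [(4, 4), (4, 4)]
```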

Let’s see another example where A has 3 dimensions and B has 2 dimensions.
As before, we start by comparing the last dimension of each array. In this case, A’s is 4 and B’s is 1, so we can expand B into a 2x4 array, making these dimensions compatible.
Next, we compare the 2nd to last dimension of each array. In this case, A’s is 1 and B’s is 2. This time, we expand A, making 2 copies of it along its second axis to match B.
At this point, we’re out of B dimensions, so we know A and B are compatible. To complete our mental model of how math between these arrays would work, we can imagine copying B 3 times along a newly added 1st dimension.
We’re left with two transformed arrays, each with shape 3 by 2 by 4, which we can easily add or subtract, or combine in some other way.
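
The same steps can be checked in code with the arrays from Example 3. In this sketch, np.broadcast_arrays returns views of A and B in their common shape; A_full, B_full, total and check are names introduced here for illustration.

```python
import numpy as np

np.random.seed(1111)
A = np.random.randint(low=1, high=10, size=(3, 1, 4))
B = np.random.randint(low=1, high=10, size=(2, 1))

# Views of A and B expanded to the common shape
A_full, B_full = np.broadcast_arrays(A, B)
print(A_full.shape, B_full.shape)  # (3, 2, 4) (3, 2, 4)

total = A + B
print(total.shape)  # (3, 2, 4)

# Each element combines one value from A and one from B:
# total[i, j, k] == A[i, 0, k] + B[j, 0]
check = all(
    total[i, j, k] == A[i, 0, k] + B[j, 0]
    for i in range(3)
    for j in range(2)
    for k in range(4)
)
print(check)  # True
```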