Python NumPy For Your Grandma - 6.1 as_strided()
In this section, we’ll see how you can use the
as_strided() function to create complex views of an existing array.
Let’s start by creating a 2d array with 8 integers called foo.
import numpy as np

foo = np.array([
    [10, 20, 30, 40],
    [50, 60, 70, 80]
])
If you remember, early on in the course we talked about how arrays are stored in contiguous, fixed-size memory blocks. In this case, foo's eight values are stored one after another, row by row: 10, 20, 30, 40, 50, 60, 70, 80.
foo is made up of 64-bit integers (the default integer type on most platforms), so each block of memory is 64 bits. Put another way, each block of memory is 8 bytes, because there are 8 bits in a byte.
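If you want to check the element size on your own machine, here's a quick sketch. It passes dtype=np.int64 explicitly, which is an assumption added here (the original code relies on the platform default) so that the numbers match the text everywhere.

```python
import numpy as np

# Assumption: force 64-bit integers so the sizes match the text on every platform
foo = np.array([[10, 20, 30, 40], [50, 60, 70, 80]], dtype=np.int64)

print(foo.dtype)     # int64
print(foo.itemsize)  # 8   (bytes per element)
print(foo.nbytes)    # 64  (8 elements * 8 bytes each)
```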
So, let’s say we’re at the beginning of the array. If we want to get to the third element, we know we need to jump across 16 bytes of data (two elements at 8 bytes each).
Now let’s say we’re back at the beginning of the array and we want to get to the element at index (1,1). In this case, we can do some basic math to figure out that we need to jump across 32 bytes to get to the second row and then another 8 bytes to get to the second element in the second row. This is exactly what the strides attribute of a numpy array tells us. For example,
foo.strides returns the tuple (32, 8) which means “to get to the next row, you need to jump across 32 bytes and to get to the next column you need to jump across 8 bytes”.
foo.strides
## (32, 8)
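The jump arithmetic above can be sketched in a few lines. byte_offset() is a hypothetical helper written for this illustration, not a NumPy function, and the array is again forced to int64 so the strides come out as (32, 8).

```python
import numpy as np

foo = np.array([[10, 20, 30, 40], [50, 60, 70, 80]], dtype=np.int64)

def byte_offset(arr, index):
    """Hypothetical helper: bytes from the start of arr's data to arr[index]."""
    return sum(i * s for i, s in zip(index, arr.strides))

offset = byte_offset(foo, (1, 1))
print(offset)  # 40, i.e. 1 * 32 + 1 * 8

# Read 8 raw bytes at that offset to confirm they hold foo[1, 1]
raw = foo.tobytes()
value = np.frombuffer(raw, dtype=np.int64, count=1, offset=offset)[0]
print(value)   # 60
```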
Here’s the cool part. With
np.lib.stride_tricks.as_strided(), you can create a new view of an existing array by modifying its strides but NOT copying or modifying its data. For example, we can build a new view of foo like this:
bar = np.lib.stride_tricks.as_strided(x=foo, shape=(3, 4), strides=(16, 8))
print(bar)
## [[10 20 30 40]
##  [30 40 50 60]
##  [50 60 70 80]]
This works because we define a 3x4 array that’s based on the data in
foo, but in this array, we tell NumPy to jump across 16 bytes to get to the next row and 8 bytes to get to the next column.
For example, if we request the element at index (1,0), NumPy starts at the beginning of the array and then jumps across one row and zero columns, so 16 bytes plus 0 bytes, and reads off element 30.
To get to index (1,3), NumPy jumps across one row and three columns, so 16 bytes plus 24 bytes, and reads off element 60.
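Here's a small sketch that repeats this arithmetic in code and checks it against what the strided view actually returns. The third index, (2,1), is an extra example added here to show that two different indices can land on the same element.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

foo = np.array([[10, 20, 30, 40], [50, 60, 70, 80]], dtype=np.int64)
bar = as_strided(foo, shape=(3, 4), strides=(16, 8))

flat = foo.ravel()  # foo's elements in memory order

for index in [(1, 0), (1, 3), (2, 1)]:
    # Bytes to jump = row * row_stride + column * column_stride
    offset = index[0] * bar.strides[0] + index[1] * bar.strides[1]
    position = offset // foo.itemsize  # which element of foo that lands on
    print(index, offset, flat[position], bar[index])

## (1, 0) 16 30 30
## (1, 3) 40 60 60
## (2, 1) 40 60 60
```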
Now, it’s really important to note that
bar is a view of
foo. So if we modify
bar, we’ll also be modifying
foo. For example, if we do
bar[1, 0] = 999
and then we print foo,
print(foo)
## [[ 10  20 999  40]
##  [ 50  60  70  80]]
foo gets modified even though we changed
bar. But not only that, if we print
bar, we can see that both element (1,0) and element (0,2) changed.
print(bar)
## [[ 10  20 999  40]
##  [999  40  50  60]
##  [ 50  60  70  80]]
That’s because both indices point to the same 8-byte block of memory, 16 bytes from the start of the array.
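If you want to confirm the sharing yourself, one way is np.shares_memory(). This sketch rebuilds foo and bar from scratch so it runs on its own.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

foo = np.array([[10, 20, 30, 40], [50, 60, 70, 80]], dtype=np.int64)
bar = as_strided(foo, shape=(3, 4), strides=(16, 8))

# bar owns no data of its own; it reads straight from foo's memory
print(np.shares_memory(foo, bar))  # True

# One write through bar shows up in three places at once
bar[1, 0] = 999
print(foo[0, 2], bar[0, 2], bar[1, 0])  # 999 999 999
```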
If you use this function, you need to be really careful that your strides make sense, and that they don’t spill outside the memory bounds of the original array. If you make your strides too big, you could end up pointing to memory that’s used by a completely different variable and you could end up crashing or corrupting your program. That’s why the docs for
as_strided() have a big red box that says “Warning. This function has to be used with extreme care.”
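Two ways to lower the risk, sketched below under a couple of assumptions. The first passes writeable=False to as_strided(), so an accidental write raises an error instead of changing foo. The second uses sliding_window_view(), which needs NumPy 1.20 or newer and works out the strides for you. Neither is part of the original lesson; they're shown here as options, and the [::2] step is just what happens to reproduce bar for this particular array.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

foo = np.array([[10, 20, 30, 40], [50, 60, 70, 80]], dtype=np.int64)

# Option 1: same shape and strides as bar, but read-only
bar_ro = as_strided(foo, shape=(3, 4), strides=(16, 8), writeable=False)
try:
    bar_ro[1, 0] = 999
except ValueError as err:
    print("write blocked:", err)

# Option 2: let NumPy compute the strides (assumes NumPy >= 1.20).
# Windows of length 4 over the flattened array start at elements 0, 1, 2, 3, 4;
# keeping every second window leaves the ones starting at 0, 2, 4.
windows = sliding_window_view(foo.ravel(), window_shape=4)[::2]
print(windows)
## [[10 20 30 40]
##  [30 40 50 60]
##  [50 60 70 80]]
```

sliding_window_view() returns a read-only view by default and can't be asked for a window that runs past the end of the array, which removes the out-of-bounds risk described above.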