Python NumPy For Your Grandma - 2.2 NumPy Array Basics

Before we can start using NumPy, we need to install it and import it. The easiest way to install it is with pip: pip install numpy.

If you’re using Google Colab like me, you can run !pip install numpy, but you don’t even need to do that because it’s already installed by default.

And then the conventional way to import NumPy is like this.

import numpy as np

We’ll start by making our first array from a list of numbers.

arr = np.array([10, 20, 30, 40, 50])

We can print the array.

print(arr)
## [10 20 30 40 50]

We can check its dimensionality and shape.

arr.ndim
## 1
arr.shape
## (5,)

In this case we have a one-dimensional array with shape (5,), which I’ll call “five by”.

And we can see how many elements the array contains. In this case, five.

len(arr)
## 5

We can also make a two dimensional array from a list of lists.

arr_2d = np.array([
    [10, 20, 30, 40, 50],
    [100, 200, 300, 400, 500]
])

We can print the array.

print(arr_2d)
## [[ 10  20  30  40  50]
##  [100 200 300 400 500]]

We can check its dimensionality and shape.

arr_2d.ndim
## 2
arr_2d.shape
## (2, 5)

Here we have a 2d array with shape “2 by 5”.

We can see how many elements it has.

len(arr_2d)
## 2

Now, you might be surprised to see arr_2d has 2, not 10, elements. That’s because len() only reports the length of the first dimension: arr_2d can be interpreted as an array that contains two arrays inside of it. If you want to get the total number of nested elements in the array, you can use the .size attribute.

arr_2d.size
## 10
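
To make the relationship between these attributes concrete, here is a small sketch. The name grid is just an illustrative choice, not something from the lesson. It shows that len() matches the first entry of .shape, while .size is the product of all the entries in .shape.

```python
import math
import numpy as np

# Hypothetical example array with 2 rows and 5 columns
grid = np.array([
    [10, 20, 30, 40, 50],
    [100, 200, 300, 400, 500],
])

rows = len(grid)    # length of the first dimension only
total = grid.size   # total number of elements across all dimensions

print(rows, grid.shape[0])            # 2 2
print(total, math.prod(grid.shape))   # 10 10
print(grid.ndim == len(grid.shape))   # True
```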

If you want to know the type of data in the array, you might try using Python’s type() function, but this just tells you that the object is a NumPy array.

type(arr_2d)
## <class 'numpy.ndarray'>

If you want to see what kind of data the array is storing, you can use the .dtype attribute.

arr_2d.dtype
## dtype('int64')
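
The int64 shown above is the default integer type on most 64-bit Linux and macOS setups, and it can differ on other platforms. If you would rather not depend on the default, you can request a type when you build the array. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

# Ask for specific element types instead of relying on the platform default
ints = np.array([10, 20, 30], dtype=np.int64)
floats = np.array([10, 20, 30], dtype=np.float32)

print(ints.dtype)    # int64
print(floats.dtype)  # float32

# astype() returns a new array converted to the requested type
as_float = ints.astype(np.float64)
print(as_float.dtype)  # float64
```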

There are two basic rules for every NumPy array.

  1. Every element in the array must be of the same type and size.
  2. If an array’s elements are also arrays, those inner arrays must have the same type and number of elements as each other. In other words, multidimensional arrays must be rectangular and not jagged.
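
As a rough illustration of the first rule, the .itemsize attribute reports how many bytes each element occupies, and .nbytes reports the total for the whole array. The arrays below are made-up examples, with the dtypes chosen explicitly so the sizes are predictable.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)    # 1 byte per element
large = np.array([1, 2, 3], dtype=np.int64)   # 8 bytes per element

print(small.itemsize, small.nbytes)  # 1 3
print(large.itemsize, large.nbytes)  # 8 24

# Rule 2: a rectangular nested list becomes a proper 2-D array
rect = np.array([[1, 2, 3], [4, 5, 6]])
print(rect.shape)  # (2, 3)
```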

For example, I can make a 1d array of integers and this is fine because every element in the array is an integer.

np.array([1, 2, 3])
## array([1, 2, 3])

If I try to make an array from a list that contains a mix of integers and strings, watch what happens.

np.array([1, 'hello', 3])
## array(['1', 'hello', '3'], dtype='<U21')

So NumPy doesn’t raise an error, but it casts the integers to strings in order to satisfy the property that every element is the same type.

Now you might also be wondering, “Hey I thought you said you can’t have an array of strings because strings are objects that vary in size and the whole point of an array is to store fixed-size objects”.

The caveat to what I said is that you can create an array of strings if you restrict the strings to a certain size. And that’s what’s going on here. In the dtype ‘<U21’, the ‘<’ means little-endian byte order and ‘U21’ means Unicode strings of at most 21 characters.

So if we try to reassign the first element to ‘a really really long string’, watch what happens.

foo = np.array([1, 'hello', 3])
foo[0] = 'a really really long string'
print(foo)
## ['a really really long ' 'hello' '3']

You can see that ‘a really really long string’ gets truncated to 21 characters.
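
If truncation is not what you want, one option is to ask for a wider string dtype up front. Here is a minimal sketch, assuming a width of 30 characters is enough for your data. The widths and variable names are illustrative choices, not part of the lesson.

```python
import numpy as np

# Fixed width of 5 characters: longer strings are silently truncated
narrow = np.array(['1', 'hello', '3'], dtype='<U5')
narrow[0] = 'a really really long string'
print(narrow[0])  # a rea

# Fixed width of 30 characters: the 27-character string now fits
wide = np.array(['1', 'hello', '3'], dtype='<U30')
wide[0] = 'a really really long string'
print(wide[0])    # a really really long string

# Each character takes 4 bytes, so the element size grows with the width
print(narrow.itemsize, wide.itemsize)  # 20 120
```

The trade-off is memory: every element reserves room for the full width, even the short ones.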

What if we try to build an array from a list of lists where the first inner list has four integers and the second inner list has four floats?

np.array([
    [1, 2, 3, 4],
    [5.5, 6.5, 5.5, 7.5],
])
## array([[1. , 2. , 3. , 4. ],
##        [5.5, 6.5, 5.5, 7.5]])

In this case, NumPy promotes the integers to floats, again to maintain homogeneous data types, but otherwise builds the array as you’d expect.
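
You can check the promotion yourself. The sketch below rebuilds the same array under the illustrative name mixed and uses np.result_type to ask which dtype NumPy would pick when combining an integer type with a float type.

```python
import numpy as np

mixed = np.array([
    [1, 2, 3, 4],
    [5.5, 6.5, 5.5, 7.5],
])
print(mixed.dtype)  # float64

# result_type reports the dtype NumPy picks when combining two types
print(np.result_type(np.int64, np.float64))  # float64

# Going the other way is lossy: astype() to an integer type drops the fraction
print(mixed.astype(np.int64)[1])  # [5 6 5 7]
```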

Alright, let’s see one last example. What if we try to build an array from a list of lists where the first inner list has four integers and the second inner list has two integers?

np.array([
    [1, 2, 3, 4],
    [5, 6]
])
## array([list([1, 2, 3, 4]), list([5, 6])], dtype=object)
## 
## <string>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray

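In this case NumPy gives us a warning that says “Creating an ndarray from ragged nested sequences is deprecated”. But we do actually get back an array with dtype ‘object’. Now this just means you have an array of pointers, which is more or less the same as a standard Python list. Note that this output comes from an older NumPy release: since NumPy 1.24 the deprecation has expired, and the same call raises a ValueError unless you pass dtype=object yourself.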
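
As the warning message itself suggests, if you really do want a jagged structure you can pass dtype=object explicitly. A minimal sketch (the name ragged is illustrative):

```python
import numpy as np

# Passing dtype=object explicitly tells NumPy the jagged shape is intended
ragged = np.array([
    [1, 2, 3, 4],
    [5, 6],
], dtype=object)

print(ragged.shape)                  # (2,)
print(ragged.dtype)                  # object
print(type(ragged[0]))               # <class 'list'>
print([len(row) for row in ragged])  # [4, 2]
```

The result is a one-dimensional array whose two elements are ordinary Python lists, so you lose most of the speed and convenience that a rectangular numeric array gives you.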

