Python NumPy For Your Grandma | Section 3.5 | nan


Course Contents

  1. Introduction
  2. NumPy Arrays
    2.1 What’s A NumPy Array
    2.2 Creating NumPy Arrays
    2.3 Indexing And Modifying 1-D Arrays
    2.4 Indexing And Modifying Multidimensional Arrays
    2.5 Basic Math
  3. Intermediate Array Stuff
    3.1 Broadcasting
    3.2 newaxis
    3.3 reshape
    3.4 boolean indexing
    3.5 nan
    3.6 infinity
    3.7 random
  4. Common Operations
    4.1 where
    4.2 Math Funcs
    4.3 all and any
    4.4 concatenate
    4.5 Stacking
    4.6 Sorting
    4.7 unique
  5. Challenges

This video covers the special floating point constant, nan, which is commonly used to represent missing or invalid data.


import numpy as np

# bot, 2 missing values
bot = np.ones(shape = (3, 4))
bot[[0, 2], [1, 2]] = np.nan

# check bot == np.nan
bot == np.nan  # every element is False, even where bot is nan

# Be careful
np.nan == np.nan  # False
np.nan != np.nan  # True

# which elements of bot are nan?
np.isnan(bot)

# only works for arrays of floats
foo = np.array([1, 2, 3], dtype = 'int64')
foo[1] = np.nan  # ValueError: cannot convert float NaN to integer


With NumPy, you can use nan to represent missing or invalid values. nan ("not a number") is a special floating point value defined by the IEEE 754 standard, and NumPy exposes it as np.nan.
For example, consider this array called bot which contains 2 missing values.
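If you want to identify which elements of bot are nan, you might be inclined to do something like bot == np.nan, but the result may surprise you: every element comes back False, including the two that are nan.
nan is defined so that nan == nan returns False, but nan != nan returns True. This behavior comes from the IEEE 754 floating point standard, not from a design decision specific to NumPy.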
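The comparison rules are easy to check for yourself. The snippet below is a minimal sketch, not part of the original video; the variable names are only for illustration. It relies on np.nan being an ordinary Python float, which is why the standard library's math.isnan recognizes it too.

```python
import math
import numpy as np

# np.nan is a plain Python float holding the "not a number" value
is_plain_float = isinstance(np.nan, float)

# Equality and ordering comparisons with nan are all False; only != is True
eq = np.nan == np.nan
ne = np.nan != np.nan
lt = np.nan < 1.0
gt = np.nan > 1.0

# The same rules apply elementwise inside an array
x = np.array([1.0, np.nan, 3.0])
self_equal = x == x      # False only where x is nan
self_unequal = x != x    # True only where x is nan

print(is_plain_float, eq, ne, lt, gt)
print(self_equal)
print(self_unequal)
print(math.isnan(np.nan))
```

Because x != x is True only at nan positions, it can serve as a nan test in a pinch, although isnan() states the intent more clearly.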
This is because equality between missing or invalid values is not well defined. In practice, this behavior prevents silent bugs from creeping into your program. In order to see which elements of bot are nan, you can use NumPy’s isnan() function, which returns a boolean array that is True wherever the input is nan.
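Here is a short sketch of isnan() applied to the bot array from this section. The counting, position lookup, and fill-with-zero steps are illustrative additions, not something the video prescribes, and they assume you want to leave bot itself unchanged.

```python
import numpy as np

# Same array as in this section: ones with two missing values
bot = np.ones(shape=(3, 4))
bot[[0, 2], [1, 2]] = np.nan

# Boolean mask, True where bot is nan
mask = np.isnan(bot)

# Count the missing values and list their (row, column) positions
n_missing = int(mask.sum())
positions = np.argwhere(mask).tolist()

# Fill the missing values with 0 in a copy, using boolean indexing
filled = bot.copy()
filled[mask] = 0.0

print(mask)
print(n_missing, positions)
print(filled)
```

The mask produced by isnan() plugs straight into the boolean indexing covered in Section 3.4, which is what makes the fill step a one-liner.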
It’s important to note that nan can only be stored in an array of floats. If you try inserting nan into an array of integers, you’ll get an error, and if you try inserting it into an array of booleans or strings, it gets silently converted to something else.
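The dtype restriction can be seen in a few lines. This is a sketch under the assumption that converting to float64 is acceptable for your data; the names foo_float and flags are made up for the example.

```python
import numpy as np

foo = np.array([1, 2, 3], dtype='int64')

# Assigning nan to an integer array raises ValueError
try:
    foo[1] = np.nan
    raised = False
except ValueError:
    raised = True

# One possible workaround: convert to a float dtype first
foo_float = foo.astype('float64')
foo_float[1] = np.nan

# A boolean array accepts the assignment without error,
# but nan is truthy, so the element is stored as True
flags = np.array([False, False, False])
flags[1] = np.nan

print(raised)
print(foo_float)
print(flags)
```

The boolean case is the more dangerous of the two, since nothing signals that the missing value was lost.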