Python NumPy For Your Grandma | Section 3.5 | nan

December 29, 2019
python courses

This video covers the special floating point constant, nan, which is commonly used to represent missing or invalid data.

Code

import numpy as np

# bot: a 3x4 array of ones with 2 missing values, at [0, 1] and [2, 2]
bot = np.ones(shape = (3, 4))
bot[[0, 2], [1, 2]] = np.nan
bot

# check bot == np.nan (every element is False, even the nan ones)
bot == np.nan

# Be careful
np.nan == np.nan  # False
np.nan != np.nan  # True


# which elements of bot are nan?
np.isnan(bot)

# only works for arrays of floats
foo = np.array([1, 2, 3], dtype = 'int64')
foo[1] = np.nan  # ValueError: cannot convert float NaN to integer

Transcript

With NumPy, you can use nan to represent missing or invalid values. nan ("not a number") is a special floating point value defined by the IEEE 754 standard, and NumPy exposes it as np.nan.
For example, consider this array called bot which contains 2 missing values.
If you want to identify which elements of bot are nan, you might be inclined to do something like bot == np.nan, but the result may surprise you: every element comes back False, including the ones that are nan.
NumPy follows the IEEE 754 floating point standard here, so nan == nan returns False, but nan != nan returns True.
This is because equality between missing or invalid values is not well defined. In practice, this behavior prevents silent bugs from creeping into your program. In order to see which elements of bot are nan, you can use NumPy's isnan() function, which returns a boolean array of the same shape as bot.
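As a rough sketch of the difference, the snippet below rebuilds bot from the Code section and compares the two approaches. The names eq_mask and nan_mask are just illustrative, and counting or locating the nan entries with sum() and np.argwhere() is one possible follow-up, not something the video covers.

```python
import numpy as np

# Rebuild bot: a 3x4 array of ones with nan at [0, 1] and [2, 2]
bot = np.ones(shape=(3, 4))
bot[[0, 2], [1, 2]] = np.nan

# Comparing against nan is False everywhere, even where bot is nan
eq_mask = (bot == np.nan)

# isnan() returns the boolean mask we actually want
nan_mask = np.isnan(bot)

print(eq_mask.any())          # False
print(nan_mask.sum())         # 2
print(np.argwhere(nan_mask))  # (row, column) positions of the nan entries
```

Once you have nan_mask, you can use it like any other boolean index, for example bot[~nan_mask] to select only the non-missing values.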
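It’s important to note that nan is a floating point value, so it only works as intended in an array of floats. If you try inserting nan into an array of integers, you’ll get an error, and with arrays of booleans or strings the value gets silently converted to something else, which is rarely what you want.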
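Here is a small sketch of what that looks like in practice. The array names (ints, flags, floats) are made up for illustration, and converting to float with astype() is just one possible workaround, not the only one.

```python
import numpy as np

# Integers: assignment raises because nan has no integer representation
ints = np.array([1, 2, 3], dtype="int64")
try:
    ints[1] = np.nan
    int_error = None
except ValueError as exc:
    int_error = exc
print(type(int_error).__name__)  # ValueError

# Booleans: no error, but nan is truthy so it silently becomes True
flags = np.array([False, False, False])
flags[1] = np.nan
print(flags[1])  # True

# One workaround: make a float copy first, then insert nan
floats = ints.astype("float64")
floats[1] = np.nan
print(np.isnan(floats))
```

The original integer array is left unchanged by the failed assignment, and astype() returns a new array rather than modifying ints in place.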

