# Python NumPy For Your Grandma - 3.7 random

In this section, we’ll see how you can use NumPy’s random module to shuffle arrays, sample values from arrays, and draw values from a host of probability distributions. Then we’ll see why all of it is deprecated, and how to update it to modern standards.

Let’s see an example of how you might simulate rolling a 6-sided die 3 times. In other words, we want to draw three integers from the range 1 to 6, with replacement. For this we can use the `randint()` function from NumPy’s random module. Note that `high` is exclusive, which is why we pass `high=7` to get values up to 6.

```
import numpy as np
np.random.randint(low=1, high=7, size=3)
## array([6, 3, 1])
```

If you try running this on your machine, you’ll probably get something different. However, we can get reproducible results by setting a random number seed immediately before we generate random numbers. To set a seed, use the `seed()` function with your favorite value passed in. Try this example on your machine and you should get the same result.

```
np.random.seed(123)
np.random.randint(low=1, high=7, size=3)
## array([6, 3, 5])
```
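To make the seeding behavior concrete, here’s a small sketch showing that resetting the seed replays the same draws. The variable names `first`, `unseeded`, and `second` are just for illustration.

```python
import numpy as np

# Seed, then draw three die rolls
np.random.seed(123)
first = np.random.randint(low=1, high=7, size=3)

# Draw again WITHOUT reseeding: the generator has moved on,
# so this is a fresh set of rolls (it may or may not match by chance)
unseeded = np.random.randint(low=1, high=7, size=3)

# Reseed with the same value and draw again: same rolls as `first`
np.random.seed(123)
second = np.random.randint(low=1, high=7, size=3)

print(np.array_equal(first, second))  # True
```

The takeaway is that the seed fixes the whole sequence of draws, not just the next one, so the seed has to be set right before the draws you want to reproduce.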

Now, what if we wanted to draw three values between 1 and 6 without replacement? For this we can use the `choice()` function, giving it

- a 1d array of values to choose from
- the number of samples we want to draw
- whether values should be replaced, which is `True` by default
- and a 1d array of probabilities corresponding to our 1d array of options, which by default gives equal probability to each option.

`choice()` is like a generalized version of `randint()`. Let’s see some examples.

First we’ll draw 3 ints between 1 and 6 without replacement.

```
np.random.seed(2357)
np.random.choice(
    a = np.arange(1, 7),
    size = 3,
    replace = False,
    p = None
)
## array([6, 5, 1])
```

Next we’ll do the same thing, but we’ll give a probability to each element.

```
np.random.choice(
    a = np.arange(1, 7),
    size = 3,
    replace = False,
    p = np.array([0.1, 0.1, 0.1, 0.1, 0.3, 0.3])
)
## array([5, 2, 6])
```

Lastly we’ll draw 3 elements from an array of strings.

```
np.random.choice(
    a = np.array(['you', 'can', 'use', 'strings', 'too']),
    size = 3,
    replace = False,
    p = None
)
## array(['use', 'you', 'can'], dtype='<U7')
```
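Since the `replace` argument is easy to get wrong, here’s a quick sketch of how it changes what `choice()` will let you do. The names `sides`, `rolls`, `raised`, and `unique_rolls` are made up for this example.

```python
import numpy as np

sides = np.arange(1, 7)

# With replacement (the default), we can draw more values than there are options
rolls = np.random.choice(a=sides, size=10)
print(rolls.shape)  # (10,)

# Without replacement, asking for more values than exist raises a ValueError
raised = False
try:
    np.random.choice(a=sides, size=10, replace=False)
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Without replacement, every drawn value is distinct, so drawing all 6
# is just a shuffle of the options
unique_rolls = np.random.choice(a=sides, size=6, replace=False)
print(np.sort(unique_rolls))  # [1 2 3 4 5 6]
```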

Now, what if you wanted to sample the rows from a 5x2 array like this one?

```
foo = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]
])
```

You can use `randint()` and `choice()` for that too. The trick is to generate a random 1d array of row indices and use that result to select rows from the array you wanted to sample.

For example, we can use `randint()` to sample three rows from `foo` with replacement.

```
np.random.seed(1234)
rand_rows = np.random.randint(
    low=0,
    high=foo.shape[0],
    size=3
)
print(rand_rows)
## [3 4 4]
foo[rand_rows]
## array([[ 7,  8],
##        [ 9, 10],
##        [ 9, 10]])
```

And we can use `choice()` to sample three rows from `foo` without replacement.

```
np.random.seed(1234)
rand_rows = np.random.choice(
    a=np.arange(start=0, stop=foo.shape[0]),
    replace=False,
    size=3
)
print(rand_rows)
## [4 0 1]
foo[rand_rows]
## array([[ 9, 10],
##        [ 1,  2],
##        [ 3,  4]])
```

You can also use this technique to shuffle an array, but NumPy makes this even easier with a function called `permutation()`. For example, if we call `np.random.permutation(foo)`, it’ll randomly shuffle the rows of `foo`.

```
np.random.permutation(foo)
## array([[ 7,  8],
##        [ 5,  6],
##        [ 3,  4],
##        [ 1,  2],
##        [ 9, 10]])
```

Unfortunately though, `permutation()` only shuffles the data along its first axis, so we can’t shuffle the columns of `foo` directly, only the rows.
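One workaround, in the same spirit as the row-index trick from earlier, is to shuffle the column indices and then use them to reorder the columns. This is just a sketch; `col_order` and `shuffled_cols` are names invented for the example.

```python
import numpy as np

foo = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]
])

# Shuffle the column indices (here, some ordering of 0 and 1)
col_order = np.random.permutation(foo.shape[1])

# Use the shuffled indices to reorder the columns; the rows stay where they are
shuffled_cols = foo[:, col_order]
print(shuffled_cols.shape)  # (5, 2)
```

With only two columns there are just two possible outcomes, so about half the time `shuffled_cols` will look identical to `foo`.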

We can also sample values from a variety of probability distributions. For example, if we wanted to sample 4 values from the uniform distribution between 1 and 2 to populate a 2x2 array, we can do that with

```
np.random.uniform(low = 1.0, high = 2.0, size = (2, 2))
## array([[1.86066977, 1.15063697],
##        [1.19851876, 1.81516293]])
```

Or we could sample two values from a standard normal distribution.

```
np.random.normal(loc = 0.0, scale = 1.0, size = 2)
## array([-0.00867858, -0.32106129])
```

Or we could build a 3x2 array with random binomial values.

```
np.random.binomial(n = 10, p = 0.25, size = (3, 2))
## array([[2, 4],
##        [1, 0],
##        [2, 0]])
```

There’s a whole bunch of other distributions supported by NumPy, so you can sample whatever your heart desires.
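As a taste of what else is in there, here’s a short sketch using two more distributions, `poisson()` and `exponential()`, plus a quick sanity check on `normal()`. The parameter values and variable names are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)

# Counts of events per interval, with an average rate of 3 per interval
counts = np.random.poisson(lam=3.0, size=5)

# Waiting times with a mean (scale) of 2.0
waits = np.random.exponential(scale=2.0, size=5)

# Sanity check: with lots of draws, the sample mean lands near `loc`
big_sample = np.random.normal(loc=5.0, scale=1.0, size=100_000)
print(round(big_sample.mean(), 1))  # 5.0
```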

Okay, so now let’s see why everything I just showed you is deprecated (or, in NumPy’s own terms, “legacy”)…

Let’s suppose we’re using NumPy version 1.1 and the current random number generator is *ABC1*. So, when you do something like

```
np.random.seed(123)
np.random.randint(3, size=3)
```

under the hood, the random number generator *ABC1* is responsible for making sure you get back a statistically valid sequence of random integers *and* anyone else using NumPy who writes the same exact code gets back the same exact sequence.

Then somebody discovers that a new random number generator, *DEF2*, actually does a better job of creating statistically valid random numbers, so NumPy decides to replace the *ABC1* generator with the *DEF2* generator. Now when you upgrade to NumPy version 1.2 and run the same exact code as before, you get a different sequence of random numbers.

**This is an issue** for two reasons. One, some people have code or documentation that might break if the random numbers they were generating suddenly change. Two, if people can’t share a reproducible example because they’re on different versions of NumPy, that’s really inconvenient. I mean, think of all the old examples on Stack Overflow that would instantly become non-reproducible because NumPy updated its random number generator.

Another possibility is that someone comes along and creates a new random number generator, *GHI3*, that’s way faster than *DEF2* but slightly less statistically valid. Now NumPy has this problem of deciding whether to use the fast generator or the more accurate generator.

The solution to this was to create a generic `Generator` class that lets you pick the random number generator you want to use. For the sake of simplicity, I’m just going to use NumPy’s `default_rng()` function which, at the moment, selects the *PCG64* random number generator. And in fact, when you look at the documentation for functions like `randint()`, that’s what NumPy suggests.

So, all we have to do is say `generator = np.random.default_rng()`, and we have the option to pass in a seed like 123.

```
generator = np.random.default_rng(seed=123)
```
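If you’d rather pick the underlying random number generator yourself instead of relying on the default, a sketch like the following should work. It wraps NumPy’s `PCG64` and `MT19937` bit generators in a `Generator`; the variable names are made up for the example.

```python
import numpy as np
from numpy.random import Generator, PCG64, MT19937

# Pick the underlying bit generator explicitly, each with the same seed
pcg_gen = Generator(PCG64(123))
mt_gen = Generator(MT19937(123))

# Same interface, different algorithms underneath, so the streams are unrelated
print(pcg_gen.integers(low=1, high=7, size=3))
print(mt_gen.integers(low=1, high=7, size=3))

# Two generators built the same way replay the same stream
a = Generator(PCG64(123)).random(5)
b = Generator(PCG64(123)).random(5)
print(np.array_equal(a, b))  # True
```

The design point is that your sampling code only talks to the `Generator` interface, so swapping the algorithm underneath is a one-line change.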

Now let’s see how we can reconstruct some of the examples from earlier. So, if I want to sample three random integers between 1 and 6, instead of using `np.random.randint()` I can use `generator.integers()`. Just like before, `high` is exclusive.

```
generator.integers(low=1, high=7, size=3)
## array([1, 5, 4])
```
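One small nicety of `generator.integers()` is its `endpoint` argument, which makes `high` inclusive. Here’s a sketch comparing the two spellings of a die roll; the sample size of 1000 is an arbitrary choice, just big enough that every face shows up.

```python
import numpy as np

generator = np.random.default_rng(seed=123)

# By default `high` is exclusive, so high=7 means the largest roll is 6
rolls_exclusive = generator.integers(low=1, high=7, size=1000)

# endpoint=True makes `high` inclusive, so high=6 covers the same range
rolls_inclusive = generator.integers(low=1, high=6, size=1000, endpoint=True)

print(rolls_exclusive.min(), rolls_exclusive.max())  # 1 6
print(rolls_inclusive.min(), rolls_inclusive.max())  # 1 6
```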

If I want to choose three values between 0 and 9 with replacement, instead of using `np.random.choice()` I can use `generator.choice()`.

```
generator.choice(a=10, size=3, replace=True)
## array([0, 9, 2])
```
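`generator.choice()` also accepts an `axis` argument, so sampling rows from a 2d array no longer needs the index trick. A minimal sketch, redefining `foo` so the example stands on its own (`sampled_rows` is a name made up here):

```python
import numpy as np

foo = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]
])

generator = np.random.default_rng(seed=123)

# Sample three distinct rows directly; axis=0 means "choose along the rows"
sampled_rows = generator.choice(a=foo, size=3, replace=False, axis=0)
print(sampled_rows.shape)  # (3, 2)
```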

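If I want to permute the rows of `foo`, instead of doing `np.random.permutation(foo)` I can do `generator.permutation(foo)`, and this time I actually get an `axis` argument, so if I wanted to, I could permute the columns of `foo` by setting `axis=1`. (With only two columns, there’s a 50% chance the columns come back in their original order, which is what happened in the run below.)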

```
generator.permutation(foo, axis=1)
## array([[ 1,  2],
##        [ 3,  4],
##        [ 5,  6],
##        [ 7,  8],
##        [ 9, 10]])
```
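Here’s a small sketch that checks two properties of `generator.permutation()` with `axis=1`: it returns a shuffled copy rather than modifying `foo`, and every row keeps its own values. `foo` is redefined so the example runs on its own, and `original` and `cols_shuffled` are names invented for the example.

```python
import numpy as np

foo = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
    [9, 10]
])
original = foo.copy()

generator = np.random.default_rng(seed=123)
cols_shuffled = generator.permutation(foo, axis=1)

# permutation() returns a shuffled copy; foo itself is untouched
print(np.array_equal(foo, original))  # True

# Each row keeps its own values; only the column order can change
print(np.array_equal(np.sort(cols_shuffled, axis=1), foo))  # True
```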

And all the distribution sampling methods we covered like *uniform*, *normal*, and *binomial* are implemented as generator methods as well.

```
generator.uniform(low = 1.0, high = 2.0, size = (2, 2))
## array([[1.1759059 , 1.81209451],
##        [1.923345  , 1.2765744 ]])
generator.normal(loc = 0.0, scale = 1.0, size = 2)
## array([-0.31659545, -0.32238912])
generator.binomial(n = 10, p = 0.25, size = (3, 2))
## array([[2, 2],
##        [4, 1],
##        [3, 3]])
```
