# Python NumPy For Your Grandma - 4.6 Sorting

In this section, we’ll see how you can use NumPy’s `sort()` function to sort the elements of an array.

`sort()` takes three primary parameters:

1. the array you want to sort
2. the axis along which to sort - the default, -1, sorts along the last axis
3. the kind of sort you want NumPy to implement. By default, NumPy implements quicksort.
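As a quick sketch of those three parameters, here’s a call that spells each one out as a keyword argument and compares it with a call that relies on the defaults. The array `demo` is made up purely for illustration.

```python
import numpy as np

# Hypothetical 2d array, used only to illustrate the parameters
demo = np.array([[3, 1, 2], [9, 7, 8]])

# All three parameters written out explicitly
explicit = np.sort(a=demo, axis=-1, kind="quicksort")

# The same call, relying on the default axis and kind
with_defaults = np.sort(demo)

print(np.array_equal(explicit, with_defaults))  # True
```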

For example, here we make a 1d array, `foo`, and then sort it in ascending order.

``````
import numpy as np

foo = np.array([1, 7, 3, 9, 0, 9, 1])
np.sort(foo)
## array([0, 1, 1, 3, 7, 9, 9])
``````

Note that the original array remains unchanged.

``````
print(foo)
## [1 7 3 9 0 9 1]
``````

If you want to sort the values of `foo` “in place”, use the `.sort()` method of the array object. In other words, if you do `foo.sort()`, this time `foo` updates with its values in sorted order.

``````
foo.sort()
print(foo)
## [0 1 1 3 7 9 9]
``````
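One thing to watch for with the in-place method is that it returns `None` rather than the sorted array, so assigning its result to a variable is a common slip. A minimal sketch (the name `result` is just for illustration):

```python
import numpy as np

foo = np.array([1, 7, 3, 9, 0, 9, 1])

# .sort() modifies foo in place and returns None
result = foo.sort()

print(result)  # None
print(foo)     # [0 1 1 3 7 9 9]
```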

If you have an array with `nan` values like this one, `sort()` pushes them to the end of the array.

``````
bar = np.array([5, np.nan, 3, 11])
np.sort(bar)
## array([ 3.,  5., 11., nan])
``````

Unfortunately, `np.sort()` has no parameter for sorting in descending order. However, with a bit of thought, we can cook something up. Two methods stand out.

The first is to sort the array in ascending order and then reverse the result.

``````
np.sort(bar)[::-1]
## array([nan, 11.,  5.,  3.])
``````

The second is to negate the array’s values, sort those in ascending order, and then negate that result.

``````
-np.sort(-bar)
## array([11.,  5.,  3., nan])
``````

The main difference between these techniques is that the first method pushes `nan` values to the front of the array, while the second method pushes `nan`s to the back. Also, the second method won’t work on arrays of strings, since you can’t negate a string.
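To see the string limitation in practice, here’s a small sketch using a made-up array, `words`. The reverse method works as expected, while negating the array raises an error, which we catch here as a `TypeError`.

```python
import numpy as np

# Hypothetical string array, used only for illustration
words = np.array(["pear", "apple", "fig"])

# Method 1: sort ascending, then reverse
descending = np.sort(words)[::-1]
print(descending)  # ['pear' 'fig' 'apple']

# Method 2: negation isn't defined for strings, so this fails
try:
    -words
    negation_failed = False
except TypeError:
    negation_failed = True

print(negation_failed)  # True
```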

What if you wanted to sort a multidimensional array like this?

``````
boo = np.array([
    [55, 10, 12],
    [20, 0, 33],
    [55, 92, 3]
])
``````

In this case, you can use the `axis` parameter of the `sort()` function to specify which axis to sort along. For example, if we do `np.sort(boo, axis = 0)`, it sorts `boo` along axis 0. In other words, it sorts each column of `boo`.

``````
np.sort(boo, axis = 0)
## array([[20,  0,  3],
##        [55, 10, 12],
##        [55, 92, 33]])
``````

And if we do `np.sort(boo, axis = 1)`, it sorts `boo` along axis 1. In other words, it sorts each row of `boo`.

``````
np.sort(boo, axis = 1)
## array([[10, 12, 55],
##        [ 0, 20, 33],
##        [ 3, 55, 92]])
``````

We can also set `axis = -1` to tell NumPy to sort along the last axis of an array. Since `boo` is 2d, that’s the same as sorting `boo` along axis 1.

``````
np.sort(a = boo, axis = -1)
## array([[10, 12, 55],
##        [ 0, 20, 33],
##        [ 3, 55, 92]])
``````
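One more option for the `axis` parameter is `None`, in which case NumPy flattens the array before sorting it. A minimal sketch using the same values as `boo` (the name `flat_sorted` is just for illustration):

```python
import numpy as np

boo = np.array([
    [55, 10, 12],
    [20, 0, 33],
    [55, 92, 3]
])

# axis=None flattens the array, then sorts all nine values together
flat_sorted = np.sort(boo, axis=None)
print(flat_sorted)  # [ 0  3 10 12 20 33 55 55 92]
```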

Cool, but what if we wanted to sort the rows of `boo` according to, say, the values in the 1st column? If we had something to give us the array `[1, 0, 2]`, we could pop that into the row index and get back our desired sorted array.

``````
boo[[1, 0, 2]]
## array([[20,  0, 33],
##        [55, 10, 12],
##        [55, 92,  3]])
``````

The tool we’re looking for is `argsort()`. `argsort()` works just like `sort()`, except it returns an array of indices instead of values: the i-th entry is the index, in the original array, of the element that belongs at position i of the sorted result.

For example, if you had the array `[3, 0, 10, 5]` and you called `argsort()` on it, you’d get back the array `[1, 0, 3, 2]`, because the smallest element is at index 1, the second smallest element is at index 0, and so on. If you used this array to index the original array, you’d get back a sorted array just as if you called `np.sort()`.

``````
goo = np.array([3, 0, 10, 5])  # [3, 0, 10, 5]
np.argsort(goo)                # [1, 0,  3,  2]
## array([1, 0, 3, 2])
goo[np.argsort(goo)]           # [0, 3,  5, 10]
## array([ 0,  3,  5, 10])
``````

So, if you wanted to sort a 2d array’s rows based on a certain column, you just have to call `argsort()` on that column’s values and use the result to select rows from the original array.

Looking back at our 2d array, `boo`, we could sort its rows by the 1st column in ascending order like this.

``````
boo[np.argsort(boo[:, 0])]
## array([[20,  0, 33],
##        [55, 10, 12],
##        [55, 92,  3]])
``````
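The same idea combines with the reversal trick from earlier if you want the rows in descending order. As a sketch, here we sort the rows of `boo` by the column at index 1, largest value first (the name `order` is just for illustration).

```python
import numpy as np

boo = np.array([
    [55, 10, 12],
    [20, 0, 33],
    [55, 92, 3]
])

# argsort the column at index 1, then reverse to get descending order
order = np.argsort(boo[:, 1])[::-1]
print(order)       # [2 0 1]

# Use the indices to reorder the rows
print(boo[order])
# [[55 92  3]
#  [55 10 12]
#  [20  0 33]]
```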

This brings up an important question. If our array has repeated values, like the 55 in this case, how do we guarantee that sorting them won’t alter the order they appear in the original array? For example, these two matrices are both valid sorts of `boo` along its first column, but only the first matrix doesn’t mix up the order of the 55s.

``````
print(boo[[1, 0, 2]])
## [[20  0 33]
##  [55 10 12]
##  [55 92  3]]
print(boo[[1, 2, 0]])
## [[20  0 33]
##  [55 92  3]
##  [55 10 12]]
``````

A sorting algorithm that preserves the original order of equal elements is known as a stable sort. By default, `np.sort()` and `np.argsort()` aren’t guaranteed to use a stable algorithm, but you can force them to by setting `kind='stable'`.
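Here’s a minimal sketch of a stable sort on `boo`’s first column. With `kind='stable'`, the two rows that start with 55 are guaranteed to keep their original relative order (the name `order` is just for illustration).

```python
import numpy as np

boo = np.array([
    [55, 10, 12],
    [20, 0, 33],
    [55, 92, 3]
])

# Stable argsort: ties (the two 55s) keep their original order
order = np.argsort(boo[:, 0], kind="stable")
print(order)       # [1 0 2]

print(boo[order])
# [[20  0 33]
#  [55 10 12]
#  [55 92  3]]
```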