# Python Pandas For Your Grandpa | Section 2.5 | Series Apply

# Course Contents

- Introduction
- Series
  - 2.1 Series Creation
  - 2.2 Series Basic Operations
  - 2.3 Series Basic Indexing
  - 2.4 Series Overwriting Data
  - **2.5 Series Apply**
  - 2.6 Series Concatenation
  - 2.7 Series Boolean Indexing
  - 2.8 Series View Vs Copy
  - 2.9 Series Missing Values
  - 2.10 Series Challenges

```
import numpy as np
import pandas as pd
```

## .apply()

Suppose you have some cool, complicated function like this one, which takes in a scalar value, `x`, subtracts 1 if it’s less than 1.5, and adds 1 otherwise…

```
def my_func(x):
    return x - 1 if x < 1.5 else x + 1
```

You want to apply that function to each element of a series like this.

```
foo = pd.Series([1.3, 1.9, 1.2, 1.0, 1.7, 1.3])
```

Lucky for you, Series has an `apply()` method that lets you do exactly this. In this case, you’d call `foo.apply()`, passing in the function itself, `my_func`, as a callable. The output is a new Series with the results of `my_func` applied to each element of the original Series.

```
foo.apply(my_func)
## 0 0.3
## 1 2.9
## 2 0.2
## 3 0.0
## 4 2.7
## 5 0.3
## dtype: float64
```
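To make the mechanics concrete, here is a minimal sketch showing that `apply()` behaves like calling the function on each value by hand and keeping the original index. The series `bar` and its string labels are made up for illustration; they are not part of the lesson's data.

```python
import pandas as pd

def my_func(x):
    return x - 1 if x < 1.5 else x + 1

# Hypothetical series with string labels, used only for this illustration
bar = pd.Series([1.3, 1.9, 1.2], index=["a", "b", "c"])

# Let pandas call my_func on each element
applied = bar.apply(my_func)

# The same thing by hand: call my_func on each value, then reuse bar's index
by_hand = pd.Series([my_func(v) for v in bar], index=bar.index)

print(applied.equals(by_hand))  # True
```

Note that the result keeps the labels `a`, `b`, `c` of the original Series.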

Now suppose you generalize your function, giving it some parameters like this.

```
def my_func_with_params(x, s=1.5, a=1):
    return x - a if x < s else x + a
```

How do you apply this function to each element of the series, also specifying the parameters you want to use? In this case you just call `.apply()`, passing in named arguments to feed your function.

```
foo.apply(my_func_with_params, s=1.1, a=10)
## 0 11.3
## 1 11.9
## 2 11.2
## 3 -9.0
## 4 11.7
## 5 11.3
## dtype: float64
```
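Named arguments aren't the only option. As a small sketch, the same parameters can also be passed positionally through the `args` tuple of `Series.apply()`, or baked into a lambda. The variable names `by_keyword`, `by_position`, and `by_lambda` are just labels for this example.

```python
import pandas as pd

def my_func_with_params(x, s=1.5, a=1):
    return x - a if x < s else x + a

foo = pd.Series([1.3, 1.9, 1.2, 1.0, 1.7, 1.3])

# Keyword arguments are forwarded to the function
by_keyword = foo.apply(my_func_with_params, s=1.1, a=10)

# args is a tuple of positional arguments placed after each element x
by_position = foo.apply(my_func_with_params, args=(1.1, 10))

# A lambda that fixes the parameters works too
by_lambda = foo.apply(lambda x: my_func_with_params(x, s=1.1, a=10))

print(by_keyword.equals(by_position))  # True
print(by_keyword.equals(by_lambda))    # True
```

Keyword arguments tend to be the most readable of the three, since the parameter names appear right at the call site.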

## Performance

The `apply()` method is great, because it’s easy to use and it generalizes well, but sometimes it’s slow. If we apply `my_func()` from the first example to a series with 10M values, it takes over 2 seconds to execute on my laptop.

```
import timeit
# Create a series of 10M values
bigfoo = pd.Series(np.random.uniform(low=1, high=2, size=10000000))
# Apply my_func() to bigfoo
timeit.timeit(lambda: bigfoo.apply(my_func), number = 1)
## 2.11 seconds
```

This is one of those cases where the function is simple enough that it’d be better to build it using pure NumPy. For example, the function below computes the same values as `my_func()` (returning a NumPy array rather than a Series), and 10 runs of it take about half the time of a single `apply()` call.

```
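def my_numpy_func(x):
    # Work on the underlying NumPy array instead of looping in Python
    a = x.to_numpy()
    return np.where(a < 1.5, a - 1, a + 1)

# Note: number = 10 here, so the result is the total time for 10 runs
timeit.timeit(lambda: my_numpy_func(bigfoo), number = 10)
## 1.05 seconds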
```
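One difference worth knowing: `np.where()` returns a plain NumPy array, so the index of the original Series is lost. If you want a Series back, a minimal sketch is to wrap the result yourself. The helper name `my_numpy_func_as_series` is hypothetical, introduced only for this example.

```python
import numpy as np
import pandas as pd

def my_func(x):
    return x - 1 if x < 1.5 else x + 1

def my_numpy_func_as_series(x):
    # Hypothetical helper: vectorized computation, rewrapped with x's index
    a = x.to_numpy()
    return pd.Series(np.where(a < 1.5, a - 1, a + 1), index=x.index)

foo = pd.Series([1.3, 1.9, 1.2, 1.0, 1.7, 1.3])

vectorized = my_numpy_func_as_series(foo)

# Check that the vectorized version agrees with apply()
print(vectorized.equals(foo.apply(my_func)))  # True
```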

The NumPy solution is faster because it’s vectorized. Without going into too much detail, it hands the entire computation to compiled C code, which is fast, whereas the `apply()` solution calls a Python function once per element, which is slow.
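Since the two timings above use different `number` values, here is a sketch of a like-for-like comparison that runs both versions the same number of times and reports the time per run. The series `smallfoo`, its size of 100,000, and the choice of 5 runs are assumptions made to keep the example quick; actual timings depend on your machine, so none are shown.

```python
import timeit

import numpy as np
import pandas as pd

def my_func(x):
    return x - 1 if x < 1.5 else x + 1

def my_numpy_func(x):
    a = x.to_numpy()
    return np.where(a < 1.5, a - 1, a + 1)

# Assumed smaller series (100k values) so the comparison finishes quickly
rng = np.random.default_rng(0)
smallfoo = pd.Series(rng.uniform(low=1, high=2, size=100_000))

runs = 5  # same number of runs for both versions

# timeit returns the total time for all runs, so divide to get time per run
apply_per_run = timeit.timeit(lambda: smallfoo.apply(my_func), number=runs) / runs
numpy_per_run = timeit.timeit(lambda: my_numpy_func(smallfoo), number=runs) / runs

print(f"apply(): {apply_per_run:.4f} s per run")
print(f"NumPy:   {numpy_per_run:.4f} s per run")
```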