# Python Pandas For Your Grandpa - 3.5 DataFrame apply()

Just as Series has an `apply()` method for applying some function to each element in a Series, DataFrame has an `apply()` method that lets you apply a function to each row or column in a DataFrame. In this video, we’ll see how and when to use it.

We’ll start by making a very simple three-row, two-column DataFrame called `df`.

```
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'A': [5.2, 1.7, 9.4],
    'B': [3.9, 4.0, 7.8]
})
print(df)
##      A    B
## 0  5.2  3.9
## 1  1.7  4.0
## 2  9.4  7.8
```

DataFrame’s `apply()` method has two primary arguments, `func` and `axis`. `func` tells `apply()` what function to use, and `axis` tells `apply()` whether to apply the function along axis 0 (the row axis), which means once per column, or along axis 1 (the column axis), which means once per row. For example, if we call `df.apply(func=np.sum, axis=0)`,

```
df.apply(func=np.sum, axis=0)
## A    16.3
## B    15.7
## dtype: float64
```

we get back a 2-element Series that’s the result of calling `np.sum()` on each column of `df`. If we do the same thing with `axis=1`, we get back a 3-element Series that’s the result of calling `np.sum()` on each row of `df`.

```
df.apply(func=np.sum, axis=1)
## 0     9.1
## 1     5.7
## 2    17.2
## dtype: float64
```
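There’s nothing special about `np.sum` here. As a rough sketch, here’s the same pattern with a small helper of our own, `value_range()` (a made-up name for illustration, not part of Pandas), which shows that the function receives a whole column when `axis=0` and a whole row when `axis=1`.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [5.2, 1.7, 9.4],
    'B': [3.9, 4.0, 7.8]
})

def value_range(values):
    # `values` is a Series: one column of df when axis=0, one row when axis=1
    return values.max() - values.min()

col_ranges = df.apply(value_range, axis=0)  # one result per column
row_ranges = df.apply(value_range, axis=1)  # one result per row

print(col_ranges.index.tolist())  # ['A', 'B']
print(row_ranges.index.tolist())  # [0, 1, 2]
```

Notice that the index of the result tells you which way the function was applied: column labels for `axis=0`, row labels for `axis=1`.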

Now suppose we have a DataFrame called `kids` like this one, with mixed column types.

```
kids = pd.DataFrame({
    'name': pd.Series(['alice', 'mary', 'jimmy', 'johnny', 'susan'], dtype="string"),
    'age': [9, 13, 11, 15, 8],
    'with_adult': [True, False, False, True, True]
})
```

Our goal is to determine whether each child should be allowed in a haunted house. The rules for getting in the house are: *you have to be at least 12 or you need to have adult supervision*.

For tasks like these, `apply()` works great. In this case, we’ll start by making a function called `is_allowed()` that takes `age` (a number) and `with_adult` (a boolean), and returns a boolean indicating whether the kid is allowed to enter the haunted house.

```
def is_allowed(age, with_adult):
    return age >= 12 or with_adult
```

Now we’ll call `kids.apply()`, passing in our function, `is_allowed`, and `axis=1` because we want the function to be applied on a row-by-row basis.

```
kids.apply(is_allowed, axis=1) # ERROR
```
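If you’re curious what that failure looks like, here’s a small sketch that catches it. It assumes the problem surfaces as a Python `TypeError`, which is what you’d expect when a two-argument function gets called with a single argument; the exact wording of the message can vary by Python version.

```python
import pandas as pd

kids = pd.DataFrame({
    'name': pd.Series(['alice', 'mary', 'jimmy', 'johnny', 'susan'], dtype="string"),
    'age': [9, 13, 11, 15, 8],
    'with_adult': [True, False, False, True, True]
})

def is_allowed(age, with_adult):
    return age >= 12 or with_adult

error_message = None
try:
    # apply() hands each row to is_allowed() as ONE argument,
    # so nothing is ever supplied for `with_adult`
    kids.apply(is_allowed, axis=1)
except TypeError as err:
    error_message = str(err)

print(error_message)
```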

Of course, this won’t work because we haven’t told Pandas which values of each row to use for the arguments of our function. In fact, it’s not even clear what’s being passed into our function.

To understand `apply()` with `axis=1`, let’s pick out the first row of our DataFrame.

```
row_0 = kids.iloc[0]
print(row_0)
## name          alice
## age               9
## with_adult     True
## Name: 0, dtype: object
```

What we get back is a Series. Now, you might remember me saying you can’t have a Series of mixed types, so how do we have a Series with a string, an int, and a boolean? The answer is we don’t. We actually have a Series of pointers, i.e. references to objects in memory. That’s why the dtype is reported as ‘object’: every element of the Series is the same kind of thing (a reference), but what it points to could be any type of object in memory.
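To see this for yourself, here’s a quick sketch that pulls out the first row and looks at the type of each element. The Series itself reports a single dtype, `object`, while the things inside it have different types. The exact type names printed in the loop (for example a NumPy integer versus a plain Python `int`) may differ depending on your Pandas version, so they aren’t listed here.

```python
import pandas as pd

kids = pd.DataFrame({
    'name': pd.Series(['alice', 'mary', 'jimmy', 'johnny', 'susan'], dtype="string"),
    'age': [9, 13, 11, 15, 8],
    'with_adult': [True, False, False, True, True]
})

row_0 = kids.iloc[0]
print(row_0.dtype)  # object

# Each element is a reference to some object; print what each one points to
for label, value in row_0.items():
    print(label, type(value).__name__)
```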

In any case, this Series is exactly what the `apply()` method uses for the function input. So, let’s modify our function to expect and operate on this type of input.

```
def is_allowed(kid_series):
    return kid_series.loc['age'] >= 12 or kid_series.loc['with_adult']
```

And now the same `kids.apply()` call we made earlier works.

```
kids.apply(is_allowed, axis=1)
## 0     True
## 1     True
## 2    False
## 3     True
## 4     True
## dtype: bool
```

But let’s not do that because it ruins a perfectly clean and generic `is_allowed()` function. Instead, let’s use a lambda function as a bridge between the Series input and our original `is_allowed()` function.

We’ll start by reverting our `is_allowed()` function to what it was.

```
def is_allowed(age, with_adult):
    return age >= 12 or with_adult
```

And then we’ll call `kids.apply()`, passing in `lambda row: is_allowed(row.loc['age'], row.loc['with_adult'])` along with `axis=1`.

```
kids.apply(lambda row: is_allowed(row.loc['age'], row.loc['with_adult']), axis=1)
## 0     True
## 1     True
## 2    False
## 3     True
## 4     True
## dtype: bool
```

In this case, we’re using the lambda as a wrapper for our `is_allowed()` function.

Tacking that onto our `kids` DataFrame, we can see exactly who’s allowed in our haunted house and who’s not.

```
kids['allowed'] = kids.apply(lambda row: is_allowed(row.loc['age'], row.loc['with_adult']), axis=1)
print(kids)
##      name  age  with_adult  allowed
## 0   alice    9        True     True
## 1    mary   13       False     True
## 2   jimmy   11       False    False
## 3  johnny   15        True     True
## 4   susan    8        True     True
```

Sorry Jimmy!
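As a sanity check on the "when to use it" question, a rule this simple can also be written with plain vectorized column operations instead of `apply()`. This is only a sketch of an alternative, not the approach used above; `allowed_vectorized` is a name made up for illustration.

```python
import pandas as pd

kids = pd.DataFrame({
    'name': pd.Series(['alice', 'mary', 'jimmy', 'johnny', 'susan'], dtype="string"),
    'age': [9, 13, 11, 15, 8],
    'with_adult': [True, False, False, True, True]
})

# Element-wise "or" between two boolean Series; note `|` rather than Python's `or`
allowed_vectorized = (kids['age'] >= 12) | kids['with_adult']

print(allowed_vectorized.tolist())  # [True, True, False, True, True]
```

The result matches what we got from `apply()`. Row-wise `apply()` earns its keep when the logic already lives in a regular Python function like `is_allowed()` that you’d rather not rewrite.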
