Python Pandas For Your Grandpa - 3.14 Challenge: Pot Holes

Setup

Fed up with your city’s roads, you go around collecting data on potholes in your area. Due to an unfortunate coffee spill, you lost bits and pieces of your data. Given your DataFrame of pothole measurements, discard rows where more than half the values are NaN. For the remaining rows, impute NaNs in each numeric column with that column's mean, and NaNs in each non-numeric column with that column's mode.

import numpy as np
import pandas as pd

potholes = pd.DataFrame({
    'length':[5.1, np.nan, 6.2, 4.3, 6.0, 5.1, 6.5, 4.3, np.nan, np.nan],
    'width':[2.8, 5.8, 6.5, 6.1, 5.8, np.nan, 6.3, 6.1, 5.4, 5.0],
    'depth':[2.6, np.nan, 4.2, 0.8, 2.6, np.nan, 3.9, 4.8, 4.0, np.nan],
    'location':pd.Series(['center', 'north edge', np.nan, 'center', 'north edge', 'center', 'west edge',
                          'west edge', np.nan, np.nan], dtype='string')
})
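
Before attempting the challenge, it can help to see where the NaNs actually sit. The following is a minimal sketch that rebuilds the setup data above and counts missing values; the names `missing_per_row` and `missing_per_col` are illustrative and not part of the challenge.

```python
import numpy as np
import pandas as pd

# Same data as the setup above
potholes = pd.DataFrame({
    'length': [5.1, np.nan, 6.2, 4.3, 6.0, 5.1, 6.5, 4.3, np.nan, np.nan],
    'width': [2.8, 5.8, 6.5, 6.1, 5.8, np.nan, 6.3, 6.1, 5.4, 5.0],
    'depth': [2.6, np.nan, 4.2, 0.8, 2.6, np.nan, 3.9, 4.8, 4.0, np.nan],
    'location': pd.Series(['center', 'north edge', np.nan, 'center', 'north edge',
                           'center', 'west edge', 'west edge', np.nan, np.nan],
                          dtype='string')
})

# Count missing values across each row (axis=1) and down each column (default)
missing_per_row = potholes.isna().sum(axis=1)
missing_per_col = potholes.isna().sum()

# Rows 1, 5 and 8 are missing 2 of 4 values; only row 9 is missing 3 of 4
print(missing_per_row)
print(missing_per_col)
```

With four columns, "more than half" means three or more NaNs, so only row 9 qualifies for removal.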

Solution

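# Flag rows where more than half the values are missing (before imputing anything)
drop_rows = potholes.isna().sum(axis=1) > potholes.shape[1] / 2
# Numeric columns: fill NaNs with the column mean (numeric_only skips 'location')
potholes = potholes.fillna(potholes.mean(numeric_only=True))
# Non-numeric column: fill NaNs with the most frequent value (the mode)
potholes['location'] = potholes['location'].fillna(potholes['location'].mode().iat[0])
# Drop the flagged rows; note the means above were computed with those rows included
potholes = potholes.loc[~drop_rows]
print(potholes)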
##      length     width     depth    location
## 0  5.100000  2.800000  2.600000      center
## 1  5.357143  5.800000  3.271429  north edge
## 2  6.200000  6.500000  4.200000      center
## 3  4.300000  6.100000  0.800000      center
## 4  6.000000  5.800000  2.600000  north edge
## 5  5.100000  5.533333  3.271429      center
## 6  6.500000  6.300000  3.900000   west edge
## 7  4.300000  6.100000  4.800000   west edge
## 8  5.357143  5.400000  4.000000      center
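
The solution above names the `location` column explicitly. As a rough sketch of how the same steps might be generalized to any DataFrame, the hypothetical helper below picks the fill value per column by dtype. `clean_missing` and the `toy` frame are made up for illustration; like the solution, it computes means and modes before dropping rows.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def clean_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows that are more than half NaN, impute the rest."""
    # Flag rows where more than half of the values are missing
    mostly_missing = df.isna().sum(axis=1) > df.shape[1] / 2

    # Choose one fill value per column: mean if numeric, otherwise the mode
    fill_values = {}
    for name, col in df.items():
        if is_numeric_dtype(col):
            fill_values[name] = col.mean()
        else:
            modes = col.mode()  # NaNs are ignored; ties come back sorted
            if not modes.empty:
                fill_values[name] = modes.iat[0]

    # Drop the flagged rows, then fill each column with its own value
    return df.loc[~mostly_missing].fillna(fill_values)


# Made-up data: row 3 is missing 3 of 3 values and should be dropped
toy = pd.DataFrame({
    'size': [1.0, np.nan, 3.0, np.nan],
    'weight': [10.0, 20.0, 30.0, np.nan],
    'color': pd.Series(['red', 'red', None, None], dtype='string'),
})

cleaned = clean_missing(toy)
print(cleaned)
```

Passing a dict to `fillna()` maps each column name to its own fill value, which avoids handling the non-numeric columns one at a time.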
