Here’s a practical guide for calculating customer retention and churn from transaction data.

# Preface

The general idea of customer retention is self explanatory; It’s a measure of how well a business retainins their customers. Unfortunately the specifics of how to calculate a retention metric are not so clear. Likewise, customer *churn* is the complement of retention; It’s a measure of how many customers end their relationship with a business - i.e. how *poorly* a business retains their customers. High retention = good. High churn = bad. As soon as you measure one of these metrics, you immediately know the other.

# Motivating Example

Let’s motivate our future retention and churn calculations with an example. Suppose we run an ecommerce store that sells sporting goods and we have the following set of 2,046 transactions from 2010-01-10 to 2018-12-30.

TransactionID | TransactionDate | CustomerID | Amount | Product |
---|---|---|---|---|

1 | 2012-11-02 | 1 | $6.83 | basketball |

2 | 2014-05-18 | 1 | $2.36 | soccerball |

3 | 2013-01-04 | 1 | $7.21 | basketball |

… | … | … | … | … |

2044 | 2016-05-25 | 500 | $5.50 | soccerball |

2045 | 2014-06-24 | 500 | $5.49 | baseball |

2046 | 2014-10-18 | 500 | $6.69 | soccerball |

# Customer Retention

In order to start understanding our customer retention, it makes sense to analyze the behavior of *groups* of customers acquired around the same time. Such an analysis is called a *cohort analysis* where the cohorts are monthly/quarterly/annually acquired customers.

Here we analyze the activity of annually acquired customers across annual periods.

Cohort | 0 - 12 | 12 - 24 | 24 - 36 | 36 - 48 | 48 - 60 | 60 - 72 | 72 - 84 | 84 - 96 | 96 - 108 |
---|---|---|---|---|---|---|---|---|---|

2010-01-01 - 2010-12-31 | 67 | 29 | 34 | 27 | 30 | 27 | 28 | 30 | 27 |

2011-01-01 - 2011-12-31 | 50 | 28 | 22 | 27 | 24 | 23 | 24 | 23 | |

2012-01-01 - 2012-12-31 | 69 | 28 | 35 | 29 | 34 | 23 | 21 | ||

2013-01-01 - 2013-12-31 | 44 | 21 | 16 | 18 | 14 | 18 | |||

2014-01-01 - 2014-12-31 | 49 | 19 | 17 | 26 | 19 | ||||

2015-01-01 - 2015-12-31 | 64 | 26 | 24 | 30 | |||||

2016-01-01 - 2016-12-31 | 56 | 27 | 29 | ||||||

2017-01-01 - 2017-12-31 | 48 | 22 | |||||||

2018-01-01 - 2018-12-31 | 53 |

The first row of this table represents the customers acquired between 2010-01-01 and 2010-12-31. The second row represents all the customers acquired between 2011-01-01 and 2011-12-31. And so on..

The first column of this table represents the number of customers who made at least one purchase between 0 and 12 months since the start of their cohort period. In other words, the columns show the number of customers who were active in subsequent 12-month periods.

To make this more clear, let’s break down the first row of the table. The first entry, 67, tells us that 67 customers made at least one purchase 0-12 months after 2010-01-01. Alternatively stated, 67 customers made a purchase betweeen 2010-01-01 and 2010-12-31. Notice that this is eaxctly the cohort period represented by the row, so this tells us that 67 customers were acquired in that period. The second column shows that only 29 of those 67 customers made a purchase in the subsequent 12-month period, i.e. 12-24 months after 2010-01-01, i.e. between 2011-01-01 and 2011-12-31. Notice we immediately have an insight into our customer retention. More on that in a bit.. The third column shows that 34 customers in the cohort made a purchase in the next year, so we’ve actually regained some customers since the second period.

If we scan our eyes across the top row from left to right, we can get a sense of our customer retention. We started with 67 customers in the first year, dropped to 29 in the second year, rose to 34 the year after, then 27, 30, … So, our retention rate was about 43% in year 2 vs year 1, but appears to have stabilized around that number. With this idea in place, we can build the following table of annual customer retention rates.

Cohort | Customers | 12 - 24 | 24 - 36 | 36 - 48 | 48 - 60 | 60 - 72 | 72 - 84 | 84 - 96 | 96 - 108 |
---|---|---|---|---|---|---|---|---|---|

2010-01-01 - 2010-12-31 | 67 | 43% | 51% | 40% | 45% | 40% | 42% | 45% | 40% |

2011-01-01 - 2011-12-31 | 50 | 56% | 44% | 54% | 48% | 46% | 48% | 46% | |

2012-01-01 - 2012-12-31 | 69 | 41% | 51% | 42% | 49% | 33% | 30% | ||

2013-01-01 - 2013-12-31 | 44 | 48% | 36% | 41% | 32% | 41% | |||

2014-01-01 - 2014-12-31 | 49 | 39% | 35% | 53% | 39% | ||||

2015-01-01 - 2015-12-31 | 64 | 41% | 38% | 47% | |||||

2016-01-01 - 2016-12-31 | 56 | 48% | 52% | ||||||

2017-01-01 - 2017-12-31 | 48 | 46% |

Cool - we now have a table of customer retention we can use to understand our busines. ..Not so fast. There’s a fundamental issue we’ve overlooked and have to deal with.

Suppose we own a car dealership and we have the same Active Customer metrics as above. Again, looking at the row of data we see that we acquired 67 customers in 2010. Well, we certainly wouldn’t expect our customers to come back an buy a new car every year. In fact, it’s possible that the 29 repeat customers in year 2 and 34 repeat customers in year 3 may be completely different customers, in which case our annual retention rate of ~47% is really understating our true customer retention.

The takeaway here is that a retention rate has to be defined in terms of an appropriate purchase duration which is completely dependant on the nature of the business. If your company sells potato chips, you’d expect customers to repeat purchase on the order of weeks or 1-2 months, so it’s appriate to look at quarterly retention rates. On the other hand, if you sell cars, you should probably be measuring 3-5 year retention rates.

# Customer Churn

This’ll be quick. Churn is simply the complement of retention. If your retention rate is 30% then your churn rate is 100% - 30% = 70%, implying that 70% of the customers in a cohort have stopped purchasing from your business. The Customer Churn table implied by the Active Customers table above is the following

Cohort | Customers | 12 - 24 | 24 - 36 | 36 - 48 | 48 - 60 | 60 - 72 | 72 - 84 | 84 - 96 | 96 - 108 |
---|---|---|---|---|---|---|---|---|---|

2010-01-01 - 2010-12-31 | 67 | 57% | 49% | 60% | 55% | 60% | 58% | 55% | 60% |

2011-01-01 - 2011-12-31 | 50 | 44% | 56% | 46% | 52% | 54% | 52% | 54% | |

2012-01-01 - 2012-12-31 | 69 | 59% | 49% | 58% | 51% | 67% | 70% | ||

2013-01-01 - 2013-12-31 | 44 | 52% | 64% | 59% | 68% | 59% | |||

2014-01-01 - 2014-12-31 | 49 | 61% | 65% | 47% | 61% | ||||

2015-01-01 - 2015-12-31 | 64 | 59% | 62% | 53% | |||||

2016-01-01 - 2016-12-31 | 56 | 52% | 48% | ||||||

2017-01-01 - 2017-12-31 | 48 | 54% |