A decision tree is a model that uses a set of criteria to classify something. Suppose you tell your single friend Bill to go out with your new friend Sally. Since Bill has never met Sally, he asks you a series of questions.
Bill: How far from me does she live?
You: 15 miles
Bill: How tall is she?
Bill: Does she have a college degree?
Bill: Is she hot?
Bill: Okay, I’ll go out with her.
This is part of Bill’s decision tree for determining whether a girl is date-worthy. His entire tree might look like
This example is a binary classifier. It classifies women as date-worthy and not date-worthy. Decision trees can also perform multi-class classification as well as regression.
Now suppose you own an online dating site and Bill is one of your users. You want to add a feature to your site that identifies women who Bill might consider date-worthy. You decide that a decision tree would be a good model for the job.
Fortunately, Bill is a pretty lonely guy and he’s viewed hundreds of women on the site. All of those samples will help us build a good model for determining women who Bill deems date-worthy. Here’s what some of those samples might look like.
Since your site doesn’t explicitly track whether Bill thinks a girl is “date-worthy” we’ll have to infer it from whether or not he messaged her. The two should be pretty close.
The dataset above is called a training set. We will use it to construct our decision tree…
We’ve trained our model and now we have a decision tree that can classify women as date-worthy according to Bill’s tastes. Suppose a new lady, Jennifer signs up on your site. She’s 5’5 and lives 10 miles from Bill. Our model will instantly classify her as date-worthy and send Bill an email notification.
Before I get into the magic of how this decision tree was constructed, I’d like to point something out…
Previously, Bill has not messaged any women above 5’6. Because of this, our model classifies all women above 5’6 as not date-worthy. However, there might actually be a 5’7 woman who Bill would date. (Maybe she earns $10 Mil a year). Unfortunately our model would wrongly classify this woman as not date-worthy. Here comes the beauty of machine learning – Bill might find this woman’s profile through a simple search of all women in his area, and when he does he’ll send her a message. Now our model can be retrained with this new information. In other words, the model learns that Bill is willing to message girls above 5’6 in some extreme cases. As Bill views and messages more women, our model becomes more attuned to Bill’s criteria for date-worthiness. This is machine learning.