A decision tree is a machine-learning technique that uses the answers to a series of questions as input to classify data or generate predictions. It is a supervised learning method, which means it is trained and evaluated on data that has already been labeled with the desired outcomes.
There is no guarantee that a decision tree will always lead to a single clear-cut conclusion; the data scientist may be presented with multiple options from which to choose. Because decision trees mimic human thought processes, their results are generally intuitive and easy to interpret.
What is the Function of the Decision Tree?
Let’s establish some key terms of a decision tree before we get into how it works.
- Root node: The base of the decision tree.
- Splitting: The process of dividing a node into multiple sub-nodes.
- Decision node: When a sub-node is further split into additional sub-nodes.
- Leaf node: When a sub-node does not further split into additional sub-nodes; represents possible outcomes.
- Pruning: The process of removing sub-nodes of a decision tree.
- Branch: A subsection of the decision tree consisting of multiple nodes.
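To see how these pieces fit together, here is a minimal, purely illustrative sketch in Python of the structure the terms above describe. The Node class and its field names are hypothetical, not taken from any particular library: a root node splits into decision nodes, which eventually terminate in leaf nodes that hold outcomes.

```python
# A minimal, hypothetical sketch of the structure described above.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    feature: Optional[str] = None        # question asked at a decision node
    yes_branch: Optional["Node"] = None  # sub-node followed when the answer is yes
    no_branch: Optional["Node"] = None   # sub-node followed when the answer is no
    outcome: Optional[str] = None        # set only on leaf nodes

    def is_leaf(self) -> bool:
        return self.outcome is not None


# The root node splits first; each branch either splits again (a decision node)
# or ends in a leaf node that represents a possible outcome.
root = Node(
    feature="raining?",
    yes_branch=Node(outcome="take an umbrella"),   # leaf node
    no_branch=Node(                                # decision node
        feature="cloudy?",
        yes_branch=Node(outcome="take an umbrella"),
        no_branch=Node(outcome="leave it at home"),
    ),
)
```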
In visual terms, a decision tree looks like a tree. The root node is the base from which the rest of the tree grows. Decision nodes branch out from the root, and leaf nodes, which indicate outcomes, branch off from the decision nodes. Each decision node represents a fork in the path, and each leaf node represents a possible final outcome.
Like new leaves on a tree limb, leaf nodes emerge from decision nodes, which is why each division of a decision tree is referred to as a “branch.” Let’s walk through an illustration to see how this works. Suppose you play golf regularly and consistently, and you want to estimate whether your score will be under or over par on any given day.
Even though you play consistently, your scores will vary depending on a few factors. Wind velocity, cloud cover, and temperature all play a role. Your score also varies depending on whether you walk the course or take a trolley, and on whether you’re playing with friends or with complete strangers.
Under par and over par are the two leaf (terminal) nodes in this illustration, and each of the inputs becomes a candidate decision node: Is it windy? Is it cold? Did you play with your friends? Did you walk or take a cart? Assuming you are a steady golfer, a decision tree can be trained with enough data about your past rounds to forecast how you will do on any given day.
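As a rough sketch of how this could look in practice, the snippet below trains scikit-learn’s DecisionTreeClassifier on a handful of invented rounds described by exactly these inputs. All feature names, values, and scores here are made up for illustration; your own scorecard data would replace them.

```python
# A hedged sketch: fit a decision tree to made-up golf rounds and forecast a new day.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row is one past round: wind (mph), cloud cover (%), temperature (F),
# walked the course (1/0), played with friends (1/0).
rounds = pd.DataFrame({
    "wind_mph":      [5, 20, 12, 3, 25, 8],
    "cloud_cover":   [10, 80, 50, 20, 90, 30],
    "temperature_f": [75, 55, 65, 80, 50, 70],
    "walked":        [1, 0, 1, 1, 0, 0],
    "with_friends":  [1, 0, 1, 1, 0, 1],
})
results = ["under par", "over par", "under par", "under par", "over par", "over par"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(rounds, results)

# Print the learned branches as readable if/else rules.
print(export_text(tree, feature_names=list(rounds.columns)))

# Forecast a calm, warm day walking the course with friends.
today = pd.DataFrame([[4, 15, 78, 1, 1]], columns=rounds.columns)
print(tree.predict(today))  # e.g. ['under par']
```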
The Structure and Variables of Decision Trees
Variables like these are called independent: each golf shot’s result doesn’t rely on the previous one, just as each coin flip’s result doesn’t affect the next one. Dependent variables, on the other hand, are those that are influenced by events that came before them.
Building a decision tree involves construction, in which you select the features and criteria that will form the tree, followed by pruning, in which unimportant branches that could limit accuracy are trimmed away. Pruning entails finding outliers (data points far outside the norm) that could throw off the calculations by giving too much weight to infrequent events in the data.
Perhaps the weather has no bearing on your golfing performance, or perhaps one particularly bad round is distorting the picture. As you dig deeper into the information that will inform your decision tree, you can eliminate anomalies like that one awful round of golf. You can also remove the temperature node from the tree if it turns out not to matter for classifying your data.
Decision trees that have been carefully crafted have few nodes and branches. A simple decision tree can be drawn on paper or a whiteboard with a pen and some thought. But for more complicated issues, you’ll need decision tree software.
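If you do reach for software, most libraries can handle the pruning step as well. Below is a hedged sketch using scikit-learn’s cost-complexity pruning (the ccp_alpha parameter); the data is a stand-in generated on the fly rather than real golf records, and the alpha value is purely illustrative.

```python
# A sketch of automated pruning with scikit-learn; stand-in data, illustrative alpha.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder features and labels; substitute your own labeled data.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unpruned tree grows until its leaves are pure and tends to memorize outliers.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Raising ccp_alpha removes branches whose contribution doesn't justify their
# complexity, the automated counterpart of hand-pruning unimportant nodes.
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

print("unpruned leaves:", full_tree.get_n_leaves(), "test accuracy:", full_tree.score(X_test, y_test))
print("pruned leaves:  ", pruned_tree.get_n_leaves(), "test accuracy:", pruned_tree.score(X_test, y_test))
```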
What Are the Types of Decision Tree?
Many common decision tree algorithms are based on Hunt’s algorithm, which was created in the 1960s to simulate human learning in psychology.
ID3
Ross Quinlan is credited with creating ID3, also known as the “Iterative Dichotomiser 3.” This technique uses entropy and information gain to rank potential splits; Quinlan described it in his 1986 paper “Induction of Decision Trees.”
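As a rough illustration of that scoring, the sketch below computes entropy and information gain by hand for a windy-versus-calm split over six invented rounds; the labels and the split are made up purely to show the arithmetic.

```python
# A small sketch of ID3-style scoring: entropy before a split minus the
# weighted entropy after it gives the information gain of that split.
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent_labels, child_groups):
    """Reduction in entropy achieved by splitting the parent into child groups."""
    total = len(parent_labels)
    weighted = sum(len(group) / total * entropy(group) for group in child_groups)
    return entropy(parent_labels) - weighted


# Six rounds of golf: splitting on "windy?" happens to separate the labels cleanly.
scores = ["over", "over", "over", "under", "under", "under"]
windy_rounds = ["over", "over", "over"]
calm_rounds = ["under", "under", "under"]
print(information_gain(scores, [windy_rounds, calm_rounds]))  # 1.0 bit, a perfect split
```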
C4.5
This algorithm, which Quinlan also created, is a refinement of ID3. Candidate splits can be ranked according to information gain or gain ratio.
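The gain ratio divides information gain by the “split information,” the entropy of the split’s own proportions, which penalizes attributes that shatter the data into many tiny branches. A small sketch, reusing the windy/calm numbers from the ID3 example above:

```python
# A sketch of C4.5's gain ratio for the windy/calm split shown earlier.
from math import log2


def split_information(group_sizes):
    """Entropy of the split itself: how evenly the data falls into the branches."""
    total = sum(group_sizes)
    return -sum(n / total * log2(n / total) for n in group_sizes if n)


# The windy/calm split earlier had an information gain of 1.0 bit and divides
# the six rounds 3/3, so its split information is also 1.0 bit.
gain = 1.0
print(gain / split_information([3, 3]))  # gain ratio = 1.0
```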
CART
Leo Breiman coined the phrase “classification and regression trees” (abbreviated as CART). This approach often makes use of Gini impurity to determine which attribute is best for partitioning. Gini impurity measures how often a randomly chosen sample would be incorrectly categorized if it were labeled at random according to the class distribution at that node; a lower Gini impurity value is preferable when evaluating a split.
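For a concrete sense of the measure, here is a small sketch that computes Gini impurity directly from class counts; the label values are invented for illustration.

```python
# A small sketch of the Gini impurity CART uses to score candidate splits.
from collections import Counter


def gini_impurity(labels):
    """Chance that a random sample would be mislabeled if labeled at random
    according to the class proportions in this node."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


print(gini_impurity(["over", "over", "under", "under"]))    # 0.5, a maximally mixed node
print(gini_impurity(["under", "under", "under", "under"]))  # 0.0, a pure node
```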