Look at that! Fewer rules and better accuracy. We clearly built a better model by limiting the maximum depth.
Sometimes, however, capping the maximum depth is too crude a control and can lead to underfitting.
How can we further refine our tree model?
Pruning is a technique used to reduce overfitting. It also simplifies the decision tree by removing its weakest rules. Pruning is usually distinguished as follows:
- Pre-pruning (early stopping) stops the tree before it has perfectly classified the training set,
- Post-pruning lets the tree classify the training set perfectly and then prunes it back.
We focus on post-pruning in this article.
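To make the distinction concrete, here is a minimal sketch in scikit-learn: max_depth acts as a pre-pruning control, while ccp_alpha (explained below) drives post-pruning. The breast-cancer loader is only a stand-in for whatever dataset the article uses, and the alpha value is arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset; substitute the data used earlier in the article.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: stop growth early with a depth limit.
pre_pruned = DecisionTreeClassifier(max_depth=3, random_state=0)
pre_pruned.fit(X_train, y_train)

# Post-pruning: grow the full tree, then prune it back with a
# cost-complexity penalty.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
post_pruned.fit(X_train, y_train)

print("pre-pruned test accuracy: ", pre_pruned.score(X_test, y_test))
print("post-pruned test accuracy:", post_pruned.score(X_test, y_test))
```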
Post-pruning starts from the unpruned tree, generates a sequence of subtrees (pruned trees), and picks the one with the best cross-validated performance.
Pruning must ensure the following:
- The subtree is optimal, meaning it achieves the highest cross-validated accuracy on the training set. (Trees can be optimized for whichever metric matters most to the engineer – not always accuracy.)
- Finding the optimal subtree must be computationally tractable.
ccp_alpha is the cost-complexity parameter. Essentially, pruning recursively finds the node with the “weakest link”. The weakest link is characterized by an effective alpha, and the nodes with the smallest effective alpha are pruned first.
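For intuition, the scikit-learn user guide defines the effective alpha of an internal node t as α_eff(t) = (R(t) − R(T_t)) / (|T_t| − 1): the per-leaf price of collapsing the subtree T_t into a single leaf. A toy sketch of that arithmetic (effective_alpha is a hypothetical helper, not a scikit-learn API):

```python
def effective_alpha(node_error, subtree_error, subtree_leaves):
    # Hypothetical helper: the per-leaf price of replacing a subtree
    # with a single leaf. Not part of scikit-learn.
    return (node_error - subtree_error) / (subtree_leaves - 1)

# Collapsing a 3-leaf subtree raises training error from 0.05 to 0.20,
# so pruning it costs 0.075 error per removed leaf.
print(effective_alpha(0.20, 0.05, 3))  # 0.075
```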
Mathematically, the cost complexity of a tree $T$ is given by

$$R_\alpha(T) = R(T) + \alpha|T|$$

where:
- $R(T)$ – the total training error of the leaf nodes,
- $|T|$ – the number of leaf nodes,
- $\alpha$ – the complexity parameter (a non-negative real number), as the sketch below illustrates.
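As a quick sanity check of the formula (all numbers are made up for illustration), the same tree becomes more “expensive” as α rises:

```python
def cost_complexity(training_error, n_leaves, alpha):
    # R_alpha(T) = R(T) + alpha * |T| -- illustrative helper only.
    return training_error + alpha * n_leaves

# One tree with R(T) = 0.10 and 8 leaves under increasing alpha:
for alpha in (0.0, 0.01, 0.05):
    print(alpha, cost_complexity(0.10, 8, alpha))
# 0.0 -> 0.10, 0.01 -> 0.18, 0.05 -> 0.50
```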
As α grows, more of the tree is pruned, which increases the total impurity of its leaves.
If we only tried to minimize the training error R(T), the result would be relatively large trees (more leaves), leading to overfitting.
Cost-complexity pruning produces a series of trees in which the cost-complexity measure of a subtree $T_t$ is defined in the same way: $R_\alpha(T_t) = R(T_t) + \alpha|T_t|$.
The parameter α limits the complexity of the tree by penalizing the number of leaf nodes, which ultimately reduces overfitting.
Which subtree is ultimately selected depends on α. If α = 0, the largest tree is selected, because the complexity penalty vanishes. As α approaches infinity, a tree of size 1, i.e., the root node alone, is selected.
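Both extremes are easy to verify empirically. A sketch reusing the imports and train/test split from the first example (the exact node counts depend on the dataset):

```python
# alpha = 0: no penalty, so the fully grown tree survives.
full = DecisionTreeClassifier(ccp_alpha=0.0, random_state=0).fit(X_train, y_train)

# A very large alpha prunes everything; on most datasets this
# collapses the tree to the root node alone.
stump = DecisionTreeClassifier(ccp_alpha=1.0, random_state=0).fit(X_train, y_train)

print(full.tree_.node_count)   # many nodes
print(stump.tree_.node_count)  # typically 1 (just the root)
```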
To get an idea of which values of ccp_alpha would be appropriate for reducing the size of the tree, scikit-learn provides the function cost_complexity_pruning_path, which returns the effective alphas and the corresponding total leaf impurities at each step of the pruning process.
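In code (again reusing the split from the first sketch), the path could be inspected like this:

```python
clf = DecisionTreeClassifier(random_state=0)
path = clf.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities

# Each pruning step removes the current weakest link, so the effective
# alphas increase and the total leaf impurity rises with them.
for alpha, impurity in zip(ccp_alphas, impurities):
    print(f"alpha={alpha:.4f}  total leaf impurity={impurity:.4f}")
```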
Let’s build our final tree model and see how it works.
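One way that final step could look, sketched under the assumptions above rather than as the article's exact code: cross-validate one candidate tree per effective alpha from the pruning path and keep the winner.

```python
import numpy as np
from sklearn.model_selection import cross_val_score

# Score one candidate tree per effective alpha from the pruning path.
scores = [
    cross_val_score(
        DecisionTreeClassifier(ccp_alpha=alpha, random_state=0),
        X_train, y_train, cv=5,
    ).mean()
    for alpha in ccp_alphas
]
best_alpha = ccp_alphas[np.argmax(scores)]

final_tree = DecisionTreeClassifier(ccp_alpha=best_alpha, random_state=0)
final_tree.fit(X_train, y_train)
print("best alpha:", best_alpha)
print("test accuracy:", final_tree.score(X_test, y_test))
```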