Decision Tree Regressor in Excel. A Step-by-Step Guide for Machine… | by Angela Shi | Mar, 2023


Photo by Kevin Young on Unsplash

A Step-by-Step Guide for Machine Learning Beginners

I am writing a series of articles about implementing machine learning algorithms using Excel, which is an excellent tool for understanding the workings of these algorithms without programming.

In this article, we will implement the algorithm of the decision tree regressor step by step.

I will use a Google Sheet to demonstrate the implementation process. If you’d like to access this sheet, as well as others I’ve developed — such as linear regression with gradient descent, logistic regression, neural networks with backpropagation, KNN, k-means, and more to come — please consider supporting me on Ko-fi. You can find all of these resources at the following link: https://ko-fi.com/s/4ddca6dff1

Let’s use a simple dataset with only one continuous feature.

Decision tree regression in Excel simple dataset — image by author

We can visually guess that for the first split, there are two plausible values: one around 5.5 and the other around 12. Now the question is: which one do we choose?

To determine this, we can look at the result produced by scikit-learn’s DecisionTreeRegressor estimator. The image below shows that the first split is at x = 5.5, since it leads to the lowest squared error. What does this mean exactly?

Simple Decision tree regression — image by author

This is exactly what we are going to find out: how do we determine the value for the first split with an implementation in Excel? Once we determine the value for the first split, we can apply the same process for the following splits. That is why we will only implement the first split in Excel.

Decision tree algorithms in 3 steps

I wrote an article about distinguishing three steps in machine learning as an effective way to learn it; let’s apply that principle to decision tree regressors:

  • 1. Model: the model here is a set of rules. It is interesting to note that this differs from a mathematical function-based model: for linear regression, we can write the model in the form y = aX + b, where the parameters a and b are to be determined. A decision tree, by contrast, is a non-parametric model.
  • 2. Model fitting: for a decision tree, this process is also called growing the tree. For a fully grown Decision Tree Regressor, each leaf contains only one observation and thus has an MSE of zero.
  • 3. Model tuning: for a decision tree, this is also called pruning, which consists of optimizing hyperparameters such as the minimum number of observations per leaf and the maximum depth.

Training process

Growing a tree consists of recursively partitioning the input data into smaller and smaller chunks or regions. For each region, a prediction can be calculated. In the case of regression, the prediction is the average of the target variable for the region.

At each step of the building process, the algorithm selects the feature and the split value that optimize a criterion; in the case of a regressor, the criterion is usually the Mean Squared Error (MSE) between the actual values and the prediction, which the algorithm seeks to minimize.

Tuning or pruning

The pruning process can be seen as dropping nodes and leaves from a fully grown tree. Equivalently, the building process stops when a criterion is met, such as a maximum depth or a minimum number of samples in each leaf node. These are the hyperparameters that can be optimized during tuning.

Below are some examples of trees with different values of max depth.

Decision tree regression with different max depth — image by author

Inferencing process

Once the decision tree regressor is built, it can be used to predict the target variable for new input instances by applying the rules and traversing the tree from the root node to a leaf node that corresponds to the input’s feature values.

The predicted target value for the input instance is then the mean of the target values of the training samples that fall into the same leaf node.
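The inference rule for a one-split tree can be sketched in plain Python. This is a minimal illustration (not the spreadsheet itself); the function names and the tiny dataset are made up for the example, and the threshold is assumed to leave at least one observation on each side:

```python
def fit_stump(xs, ys, threshold):
    """Fit a one-split tree: each leaf predicts the mean of the
    training targets that fall on its side of the threshold."""
    left = [y for x, y in zip(xs, ys) if x < threshold]
    right = [y for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "left_mean": sum(left) / len(left),
        "right_mean": sum(right) / len(right),
    }

def predict(stump, x):
    """Traverse the (single) rule from root to leaf."""
    return stump["left_mean"] if x < stump["threshold"] else stump["right_mean"]

stump = fit_stump([1, 2, 10, 11], [1.0, 3.0, 9.0, 11.0], 5.5)
print(predict(stump, 0), predict(stump, 20))  # 2.0 10.0
```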

Here are the steps we will follow:

  • List all possible splits
  • For each split, we will calculate the MSE (Mean Squared Error)
  • We will select the split that minimizes the MSE as the optimal next split
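The three steps above can be sketched in plain Python. This is an illustrative sketch, not the spreadsheet: the helper names and the test data are made up, and ties in x are skipped since they cannot be separated by a threshold:

```python
def region_mse(ys):
    """MSE of a region when predicting its mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(xs, ys):
    """Return (threshold, weighted_mse) for the best single split."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best = (None, float("inf"))
    # Step 1: candidate thresholds are midpoints of consecutive x values
    for i in range(len(xs) - 1):
        if xs[i] == xs[i + 1]:
            continue
        t = (xs[i] + xs[i + 1]) / 2
        # Step 2: MSE of each side, weighted by its number of observations
        left, right = ys[: i + 1], ys[i + 1 :]
        mse = (len(left) * region_mse(left)
               + len(right) * region_mse(right)) / len(ys)
        # Step 3: keep the split that minimizes the MSE
        if mse < best[1]:
            best = (t, mse)
    return best

print(best_split([1, 2, 3, 10, 11, 12], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]))
# (6.5, 0.0): the split cleanly separates the two groups
```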

All possible splits

First, we have to list all the possible splits, which are the midpoints between two consecutive sorted values of x. There is no need to test any other values: every threshold lying between the same two observations produces exactly the same partition of the data.
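The enumeration of candidate splits can be sketched as follows (a minimal illustration with made-up data; duplicate x values are collapsed since they cannot be separated):

```python
def candidate_splits(xs):
    """Midpoints between consecutive distinct sorted x values."""
    xs = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(xs, xs[1:])]

print(candidate_splits([1, 2, 3, 4]))  # [1.5, 2.5, 3.5]
```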

Decision tree regression in Excel with possible splits — image by author

MSE calculation for each possible split

As a starting point, we can calculate the MSE before any split. In that case the prediction is just the average value of y, and the MSE is equal to the (population) variance of y.
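This baseline can be checked in a few lines of Python (the dataset here is made up for illustration):

```python
import statistics

def mse_before_split(ys):
    """MSE when the prediction is simply the mean of y."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

# Predicting the mean everywhere gives an MSE equal to the
# population variance of y.
ys = [1.0, 2.0, 4.0, 7.0]
print(mse_before_split(ys), statistics.pvariance(ys))  # both 5.25
```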

Now, the idea is to find a split such that the MSE after splitting is lower than before. If no split significantly improves performance (that is, lowers the MSE), the final tree would be trivial: a single leaf predicting the average value of y.

For each possible split, we can then calculate the MSE (Mean Squared Error). The image below shows the calculation for the first possible split, which is x = 2.

Decision tree regression in Excel MSE for all possible splits — image by author

We can see the details of the calculation:

  1. Cut the dataset into two regions: with the split value x = 2, we get two regions, x < 2 and x ≥ 2, so the x-axis is cut into two parts.
  2. Calculate the prediction: for each part, we calculate the average of y. That is the potential prediction for y.
  3. Calculate the error: we then compare the prediction to the actual value of y.
  4. Calculate the squared error: for each observation, we can now calculate the squared error.

Decision tree regression in Excel with all possible splits — image by author
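The four steps above can be traced for a single candidate split in a few lines of Python. The dataset here is illustrative, not the one from the sheet:

```python
# Candidate split under evaluation
xs = [1, 3, 4, 6]
ys = [2.0, 3.0, 7.0, 9.0]
t = 2

# Step 1: cut the dataset into two regions.
left = [y for x, y in zip(xs, ys) if x < t]
right = [y for x, y in zip(xs, ys) if x >= t]

# Step 2: the prediction in each region is the mean of y.
pred_left = sum(left) / len(left)
pred_right = sum(right) / len(right)

# Steps 3-4: squared error of each observation against its
# region's prediction, averaged over all observations.
sq_errors = ([(y - pred_left) ** 2 for y in left]
             + [(y - pred_right) ** 2 for y in right])
mse = sum(sq_errors) / len(sq_errors)
print(mse)  # 14/3 ≈ 4.667
```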

Optimal split

For each possible split, we do the same to obtain the MSE. In Excel, we can copy and paste the formula; the only value that changes is the candidate split value for x.

Decision tree regression in Excel splits — image by author

Then we can plot the MSE on the y-axis and the possible split values on the x-axis. We can now see that the MSE reaches its minimum at x = 5.5, which is exactly the result obtained with the Python code.

Decision tree regression in Excel: minimization of MSE — image by author

Now, you can play with the Google Sheet:

  • you can modify the dataset
  • you can introduce a categorical feature
  • you can try to find the next split
  • you can change the criterion: instead of MSE, you can use absolute error, Poisson deviance, or friedman_mse, as indicated in the documentation of DecisionTreeRegressor

Using Excel, it is possible to implement one split and gain more insight into how Decision Tree Regressors work. Even though we didn’t build a full tree, this is still instructive, since the most important part is finding the optimal split among all possible candidates.
