Revolutionize Your Binary Classification Analysis with New Graphic Tools in This Release of binclass-tools

by Luca Zavarella | Mar 2023


Discover the power of Calibration Curves, Gain and Lift Plots, and more in the latest version of binclass-tools, your ultimate solution for binary classification problems!

(image from Unsplash)

The binclass-tools package reached its first major version (1.x) a few days ago and has just passed 13K downloads on PyPI! The new version brings many new features, including:

  • Calibration Plot
  • Cumulative Gain Plot
  • Cumulative Lift Plot
  • Response Plot
  • Cumulative Response Plot

Let’s take a closer look at what this is all about.

Calibration plots are a valuable aid in assessing binary classification model performance: they show how closely predicted probabilities correspond to observed outcomes. The fundamental premise of calibration is that if a model predicts a probability of 0.8 for a group of observations, then about 80% of those observations should actually belong to the positive class.

To create a calibration plot, one needs to partition the predicted probabilities into discrete bins, such as 0–0.1, 0.1–0.2, 0.2–0.3, etc. For each bin, it is necessary to compute the average predicted probability along with the proportion of positive classes by leveraging the true labels and predicted probabilities. Thereafter, the average predicted probability is plotted against the proportion of positive classes for each bin.
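To make the binning concrete, here is a minimal sketch using scikit-learn's calibration_curve on synthetic data (this is generic Python for illustration, not the binclass-tools API):

```python
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(42)

# Synthetic example: true labels and (imperfectly calibrated) predicted probabilities
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(y_true * 0.6 + rng.uniform(0, 0.5, size=1000), 0, 1)

# For each of 10 equal-width bins, compute the observed positive rate
# (prob_true) and the mean predicted probability (prob_pred)
prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=10)

for pred, true in zip(prob_pred, prob_true):
    print(f"mean predicted: {pred:.2f}  observed positive rate: {true:.2f}")
```

Plotting prob_pred against prob_true, together with the diagonal reference line, yields exactly the kind of curve discussed below.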

Here is an example of a calibration plot from binclass-tools:

Figure 1 — Example of Calibration Curve (image by the author)

Figure 1 also shows the optimal calibration curve: the diagonal line, which corresponds to predicted probabilities matching observed frequencies perfectly. In practice, imperfect calibration curves are the norm, with predicted probabilities overestimating or underestimating the actual ones in places.

In Figure 1, the calibration curve generated by binclass-tools reveals the error at every point on the curve. Calibration curves also allow computing the Expected Calibration Error (ECE), which gauges the average gap between predicted and actual probabilities across all bins. To calculate ECE, one sums the absolute difference between the mean predicted probability and the proportion of positive classes in each bin, weighting each term by the fraction of observations falling in that bin.
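In code, a bare-bones ECE computation with uniform bins might look like the following (again a generic NumPy sketch, not the package's implementation):

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: weighted mean absolute gap between the mean predicted probability
    and the observed positive rate, weighted by the fraction of samples per bin."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    # Assign each prediction to one of n_bins equal-width bins over [0, 1]
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            weight = mask.mean()  # fraction of all samples falling in this bin
            gap = abs(y_prob[mask].mean() - y_true[mask].mean())
            ece += weight * gap
    return ece
```

A perfectly calibrated model would score an ECE of 0; the larger the value, the wider the average gap between predicted and observed probabilities.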

Calibration plots prove especially useful when comparing several binary classification models. By comparing the calibration curves of different models, it is possible to determine which ones are better calibrated and produce more reliable probabilities. Binclass-tools facilitates this by providing a dedicated function that draws the calibration curves of multiple models on the same chart.

Figure 2 — Calibration curves of multiple models (image by the author)

To supplement calibration curves, Predicted Probability Distribution plots for each model can also be produced.

Predicted probability distribution plots, also known as probability histograms or density plots, are often included alongside calibration curves. These graphs display the distribution of predicted probabilities for all observations in the dataset. In binary classification problems, predicted probabilities should range from 0 to 1, where values closer to 0 indicate a low probability of belonging to the positive class and values closer to 1 suggest a high probability of belonging to the positive class.

Ideally, the predicted probability distribution should reflect both good calibration and good separation. For a reasonably balanced problem, a confident, well-separating classifier tends to produce a roughly bimodal distribution, with most of the mass concentrated near 0 and near 1 and relatively few observations in between: the model can push clearly negative and clearly positive cases toward the extremes. Conversely, a distribution with a single peak near 0.5 and few observations close to 0 or 1 indicates that the model is unable to distinguish between the positive and negative classes with confidence. In practical applications the distribution may also be skewed or show multiple unexpected peaks, which can signal that the model is inadequately calibrated and may require further tuning.

The predicted probability distribution plot provides a useful means to evaluate the calibration of a model and identify possible areas for improvement. Examining the distribution of predicted probabilities offers insights into the model’s performance and can help pinpoint specific areas that may benefit from further optimization.
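Producing such a plot takes little more than a histogram. The snippet below fabricates bimodal scores with two Beta distributions purely for illustration (generic matplotlib code, not the binclass-tools rendering):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Bimodal scores: the shape a confident, well-separating classifier tends to produce
y_prob = np.concatenate([rng.beta(2, 8, 500), rng.beta(8, 2, 500)])

plt.hist(y_prob, bins=20, edgecolor="black")
plt.xlabel("Predicted probability of the positive class")
plt.ylabel("Number of observations")
plt.show()
```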

A cumulative gain chart is a visual display of a binary classification model’s ability to identify the positive class. It indicates the percentage of positive observations captured when examining a given percentage of the population with the highest predicted probabilities of belonging to the positive class.

The chart is built by sorting the observations in the dataset by their predicted probabilities in descending order. The sorted observations are then split into a predetermined number of bins or percentiles, each containing an equal share of the total population. The cumulative gain chart displays the cumulative percentage of positive observations captured up to each bin, progressing from the highest to the lowest predicted probabilities. The x-axis reflects the percentage of the population considered, ranging from 0% to 100%, while the y-axis shows the percentage of all positive observations identified within that portion.
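As a rough NumPy sketch of the construction (not the binclass-tools implementation), the whole curve reduces to a sort and a cumulative sum:

```python
import numpy as np

def cumulative_gain(y_true, y_prob):
    """Fraction of all positives captured within the top x% of the
    population, sorted by descending predicted probability."""
    order = np.argsort(y_prob)[::-1]                   # highest scores first
    y_sorted = np.asarray(y_true)[order]
    gain = np.cumsum(y_sorted) / y_sorted.sum()        # cumulative share of positives
    pct_population = np.arange(1, len(y_sorted) + 1) / len(y_sorted)
    return pct_population, gain
```

Plotting pct_population against gain, with the diagonal as the random baseline, gives a curve of the kind shown in Figure 3.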

Here is an example of a cumulative gain plot from binclass-tools:

Figure 3 — Example of Cumulative Gain Curve from binclass-tools (image by the author)

In a cumulative gain plot, the diagonal line from the origin to the top right corner represents a random model: selecting x% of the population at random captures, on average, x% of the positives. A perfect classifier instead rises steeply, capturing 100% of the positives once the selected fraction equals the overall positive rate, and then stays flat at the top. In practice no model is perfect, but the further the curve bows above the diagonal, the better the model performs.

This plot is particularly useful for resource-constrained decisions. For instance, a model used to identify fraudsters might detect 80% of them by analyzing only 23% of the population sorted by decreasing predicted probability, as highlighted in Figure 3. Such a model would be suitable if the goal is to catch as many fraudsters as possible with limited resources. Nonetheless, it is important to evaluate the model’s performance across different population percentiles to determine where further improvements are needed.

The cumulative lift plot, or lift curve, also illustrates how well a binary classification model identifies positive observations, this time by comparing the model’s performance to that of a random classifier, which assigns positive or negative labels to observations at random.

After the observations have been ranked by their predicted probability of being in the positive class, the dataset is divided into equal-sized bins, also known as quantiles or deciles. The lift value for each bin is then calculated by dividing the observed positive rate by the overall positive rate in the population (what a random selection would be expected to achieve), and the cumulative lift value up to each percentile of the population is plotted.
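Here is a compact NumPy sketch of that computation (generic code, assuming roughly equal-sized quantiles via array_split, not the package's own routine):

```python
import numpy as np

def cumulative_lift(y_true, y_prob, n_bins=10):
    """Cumulative lift at each quantile: the positive rate among the top
    bins so far, divided by the overall positive rate (random baseline = 1)."""
    order = np.argsort(y_prob)[::-1]              # highest scores first
    y_sorted = np.asarray(y_true)[order]
    overall_rate = y_sorted.mean()
    bins = np.array_split(y_sorted, n_bins)       # ~equal-sized quantiles
    sizes = np.cumsum([len(b) for b in bins])
    positives = np.cumsum([b.sum() for b in bins])
    return positives / sizes / overall_rate       # cumulative lift per bin
```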

Here is an example of a lift curve from binclass-tools:

Figure 4 — Example of Lift Curve from binclass-tools (image by the author)

The lift value indicates how much better the model identifies positive observations than a random classifier for each cumulative share of the population. The random model line in a cumulative lift plot is therefore a flat line at a constant lift of 1, since a random selection matches the overall positive rate by construction. A lift value of 1 implies that the model’s performance is identical to that of a random classifier, while a lift greater than 1 means the model outperforms random selection in identifying positive observations. As shown in Figure 4, selecting the top 44% of observations sorted by decreasing predicted probability yields a selection containing 2.2 times the percentage of target class cases that a random model would be expected to capture.

Therefore, the ideal cumulative lift plot would be a curve that is significantly above the random model line, demonstrating that the model can identify positive observations more efficiently than random selection.

The response plot is another graphical tool for assessing the effectiveness of a binary classification model. It shows the percentage of observed positive class cases within each of the ten equal-sized groups into which the observations are split after sorting by predicted probability. As with the previous charts, the x-axis shows the predicted probability deciles in descending order, while the y-axis reports the percentage of actual positive class cases in each group.
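In generic NumPy terms, the per-decile response boils down to the following sketch (illustrative only, not the binclass-tools implementation):

```python
import numpy as np

def response_by_decile(y_true, y_prob, n_bins=10):
    """Observed positive rate within each decile of predicted probability;
    decile 1 holds the highest-scoring observations."""
    order = np.argsort(y_prob)[::-1]
    y_sorted = np.asarray(y_true)[order]
    return [float(b.mean()) for b in np.array_split(y_sorted, n_bins)]
```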

Here is an example of a response curve from binclass-tools:

Figure 5 — Example of Response Curve from binclass-tools (image by the author)

The response plot is a reliable way to gauge the performance of a binary classification model. For a well-calibrated model with strong predictive power, the proportion of actual positive class observations should be highest in the first deciles of predicted probability, where the model is most confident in its predictions, and then decline steadily across the later deciles. In Figure 5, for instance, 65% of the observations in decile 2 belong to the target class (class 1).

For a random model, the expected plot is a flat horizontal line at the overall positive rate, corresponding to an even distribution of positive class observations across all deciles: a random model cannot differentiate between the positive and negative classes, so it provides no predictive power.

Unlike the response plot, the cumulative response plot computes the percentage of actual positive class observations cumulatively, from the first decile up to the decile under analysis.
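The cumulative variant only swaps the per-bin mean for a running ratio, as in this sketch (same caveats as above):

```python
import numpy as np

def cumulative_response(y_true, y_prob, n_bins=10):
    """Positive rate among all observations from decile 1 up to and
    including each decile, sorted by descending predicted probability."""
    order = np.argsort(y_prob)[::-1]
    y_sorted = np.asarray(y_true)[order]
    bins = np.array_split(y_sorted, n_bins)
    sizes = np.cumsum([len(b) for b in bins])
    positives = np.cumsum([b.sum() for b in bins])
    return positives / sizes   # converges to the overall positive rate at decile 10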

Here is an example of a cumulative response curve from binclass-tools:

Figure 6 — Example of Cumulative Response Curve from binclass-tools (image by the author)

In this case, the cumulative response curve never crosses the random model line; it converges to it at the 10th decile, where the cumulative response of any model equals the overall proportion of positive cases, since at that point the entire population has been selected.

The cumulative response plot is one of the most widely used plots because it answers a simple question often asked even by non-experts: “If we apply the model and select up to decile X, what percentage of the selection do we expect to belong to the target class?”

For example, Figure 6 shows that, of the observations in percentiles 1 through 29 taken together, 63% belong to the target class.

To create these plots with minimal effort, you can use the Python binclass-tools package, which is available as an open source project on GitHub. For details on the functions that generate the plots discussed above, refer to the project’s GitHub page.
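Installation is a one-liner from PyPI. Note that the import alias below is the one I recall from the project's README, and the plotting call is a hypothetical placeholder of mine, so verify both against the README for the exact v1.0.0 API:

```python
# Install from PyPI:
#   pip install binclass-tools

# Import alias recalled from the project's README (verify there):
import bctools as bc

# Hypothetical call, for illustration only; the real function names for
# the v1.0.0 plots are listed in the project's README:
# bc.calibration_plot(true_y, predicted_proba)
```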

Please note that the plots presented in this article were added in version 1.0.0 of the package. If you are interested in the theory behind the plots implemented in earlier versions, you can refer to my previous articles on the Interactive Confusion Matrix and on the Interactive ROC and Precision-Recall plots.

As always, any feedback on new package features is welcome.
