
Severe Underestimation of ROC-AUC Values in Lazypredict Library #476

Open
cminuttim opened this issue Feb 7, 2025 · 0 comments

Description:

There is a critical flaw in the lazypredict library that significantly impacts the reliability of its model evaluation metrics, particularly for binary classification tasks. The library consistently underestimates the Area Under the Receiver Operating Characteristic Curve (ROC-AUC), which can lead to misleading results and incorrect conclusions.

Observed Behavior:

The lazypredict library computes ROC-AUC scores using predicted class labels (y_pred) instead of predicted probabilities:

# ROC-AUC is computed from hard class labels rather than probability scores
y_pred = pipe.predict(X_test)
roc_auc = roc_auc_score(y_test, y_pred)

This contradicts the scikit-learn documentation for roc_auc_score, which states that probability estimates (or non-thresholded decision values) should be used to compute ROC-AUC (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html):

In the binary case, it corresponds to an array of shape (n_samples,). Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the probability of the class with the greater label, i.e. estimator.classes_[1] and thus estimator.predict_proba(X, y)[:, 1].
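
For reference, here is a minimal sketch of the difference on synthetic data; the dataset, classifier, and variable names below are illustrative and are not lazypredict's internals:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC from hard class labels: the ROC curve collapses to a single
# operating point, so the score is biased toward 0.5.
auc_from_labels = roc_auc_score(y_test, clf.predict(X_test))

# AUC from probability scores: the full ranking of the positive class
# is used, as the documentation quoted above requires.
auc_from_probabilities = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

Because the label-based version throws away the ranking information, it can only report the AUC of a single threshold, which is what produces the systematic underestimation.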

Severity:

This issue is severe for several reasons:

  1. Incorrect Model Comparison: Underestimated AUC values may lead researchers to prematurely dismiss or favor specific models, resulting in suboptimal choices for their tasks.
  2. Misleading Results in Literature: Research papers and comparative analyses relying on the underestimated AUC values could draw erroneous conclusions, potentially leading to wasted effort and resources in follow-up studies.
  3. Violation of Scientific Integrity: The library's users might unintentionally report flawed results, compromising the integrity of their scientific work.

Urgent Resolution Required:

I strongly urge the lazypredict maintainers to address this critical issue as soon as possible by modifying their code to use predicted probabilities instead of class labels for calculating ROC-AUC scores. This change is essential to restore confidence in the library's output and ensure that it provides accurate, reliable, and fair model assessments.
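
One possible shape of the fix for the binary case, sketched against the snippet quoted above (pipe, X_test, and y_test are as in that snippet; the hasattr fallback for estimators without predict_proba is an assumption about how the maintainers might want to handle such models, not existing lazypredict behavior):

from sklearn.metrics import roc_auc_score

# Prefer probability estimates; fall back to decision scores for
# estimators (e.g. some SVM configurations) that lack predict_proba.
if hasattr(pipe, "predict_proba"):
    y_score = pipe.predict_proba(X_test)[:, 1]
elif hasattr(pipe, "decision_function"):
    y_score = pipe.decision_function(X_test)
else:
    y_score = pipe.predict(X_test)  # last resort: thresholded labels

roc_auc = roc_auc_score(y_test, y_score)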

In the meantime, I recommend that users exercise caution when interpreting AUC values generated by lazypredict and, whenever possible, validate their results using alternative methods or libraries that compute AUC correctly.

Thank you for your immediate attention to this matter. Prompt resolution of this critical issue will greatly benefit the machine learning community and help maintain the integrity of research built upon lazypredict's outputs.
