
Severe Underestimation of ROC-AUC Values in Lazypredict Library #476

Open
cminuttim opened this issue Feb 7, 2025 · 0 comments

Description:

There is a critical flaw in the lazypredict library that significantly impacts the reliability of its model evaluation metrics, particularly for binary classification tasks. The library consistently underestimates the Area Under the Receiver Operating Characteristic Curve (ROC-AUC), which can lead to misleading results and incorrect conclusions.

Observed Behavior:

The lazypredict library computes ROC-AUC scores using predicted class labels (y_pred) instead of predicted probabilities:

# ROC-AUC is computed from hard class labels rather than probability scores
y_pred = pipe.predict(X_test)
roc_auc = roc_auc_score(y_test, y_pred)

This contradicts the scikit-learn documentation for roc_auc_score, which states that probability estimates (or non-thresholded decision values) should be used to compute ROC-AUC (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html):

In the binary case, it corresponds to an array of shape (n_samples,). Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the probability of the class with the greater label, i.e. estimator.classes_[1] and thus estimator.predict_proba(X, y)[:, 1].
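
For reference, here is a minimal sketch of the difference on synthetic data; the dataset, classifier, and variable names below are illustrative and are not lazypredict's internals:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC from hard class labels: the ROC curve collapses to a single
# operating point, so the score is biased toward 0.5.
auc_from_labels = roc_auc_score(y_test, clf.predict(X_test))

# AUC from probability scores: the full ranking of the positive class
# is used, as the documentation quoted above requires.
auc_from_probabilities = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

Because the label-based version throws away the ranking information, it can only report the AUC of a single threshold, which is what produces the systematic underestimation.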

Severity:

This issue is severe for several reasons:

  1. Incorrect Model Comparison: Underestimated AUC values may lead researchers to prematurely dismiss or favor specific models, resulting in suboptimal choices for their tasks.
  2. Misleading Results in Literature: Research papers and comparative analyses relying on the underestimated AUC values could draw erroneous conclusions, potentially leading to wasted effort and resources in follow-up studies.
  3. Violation of Scientific Integrity: The library's users might unintentionally report flawed results, compromising the integrity of their scientific work.

Urgent Resolution Required:

I strongly urge the lazypredict maintainers to address this critical issue as soon as possible by modifying their code to use predicted probabilities instead of class labels for calculating ROC-AUC scores. This change is essential to restore confidence in the library's output and ensure that it provides accurate, reliable, and fair model assessments.
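
One possible shape of the fix for the binary case, sketched against the snippet quoted above (pipe, X_test, and y_test are as in that snippet; the hasattr fallback for estimators without predict_proba is an assumption about how the maintainers might want to handle such models, not existing lazypredict behavior):

from sklearn.metrics import roc_auc_score

# Prefer probability estimates; fall back to decision scores for
# estimators (e.g. some SVM configurations) that lack predict_proba.
if hasattr(pipe, "predict_proba"):
    y_score = pipe.predict_proba(X_test)[:, 1]
elif hasattr(pipe, "decision_function"):
    y_score = pipe.decision_function(X_test)
else:
    y_score = pipe.predict(X_test)  # last resort: thresholded labels

roc_auc = roc_auc_score(y_test, y_score)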

In the meantime, I recommend that users exercise caution when interpreting AUC values generated by lazypredict and, whenever possible, validate their results using alternative methods or libraries that compute AUC correctly.

Thank you for your immediate attention to this matter. Prompt resolution of this critical issue will greatly benefit the machine learning community and help maintain the integrity of research built upon lazypredict's outputs.
