Adding changelog and new model tuner updates

uclamii · Feb 10, 2025 · bac4f8a
1 parent 29e4fce
commit bac4f8a

Showing 21 changed files with 333 additions and 43 deletions.
diff --git a/docs/.buildinfo b/docs/.buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 741cf928a5633c3cf9847fb60a28953a
+config: 467fdae465f986bb3ac9ad8fa5e0cd8e
 tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/docs/.buildinfo.bak b/docs/.buildinfo.bak
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 86681157926dfda84249a46ada0c9ab6
+config: 741cf928a5633c3cf9847fb60a28953a
 tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/docs/_sources/changelog.rst.txt b/docs/_sources/changelog.rst.txt
@@ -25,6 +25,60 @@ Changelog
 .. important::
    Complete version release history available `here <https://pypi.org/project/model-tuner/#history>`_
 
+Version 0.0.26b (Beta)
+-----------------------
+
+- Optimal threshold: users can now specify a target precision or recall, and an optimal
+  threshold is computed for that target
+- Finalised testing: coverage is now at 86% total
+- New get_feature_names() helper function for extracting features
+- n_estimators calculation for boosting algorithms is now fixed
+
+Version 0.0.25a
+----------------
+
+- Pushed fixes for the get_feature_selection_pipeline method.
+- Updated scoring blocks for calibrated KFold models and folded confusion matrix metrics.
+- Added unittests for edge cases, including test_rfe_calibrate_model() and validating confusion matrix alignment.
+- Fixed mismatches between confusion matrix and classification report.
+- Provided fixes for all pipeline getter methods.
+- Integrated verify_imb_sampler prints into KFold logic.
+- Resolved typos in group split configurations and refined nested KFold bug fixes.
+- Adjusted fold metric calculations in report_model_metrics.
+- Moved optimal threshold logic into prediction functionality.
+- Enhanced return metrics dictionary logic to handle all cases and added multilabel classification tests.
+- Addressed Brier score calculation issues and optimized regression reports for KFold.
+- Introduced threshold print updates for clearer reporting.
+- Implemented SHAP scripts and tests for model explainability.
+- Removed outdated calibration reports from documentation and codebase.
+- Fixed bugs in regression metric calculations and refined KFold metric aggregation.
+
+
+Version 0.0.24a
+-----------------
+
+- Updated .gitignore to include doctrees
+- Added pickleObjects tests and updated requirements, tests passed
+- Added bootstrapper test and tests passed
+- Added multi-class test script
+- Updated Metrics Output
+- Added optimal threshold print inside return_metrics
+- KFold metric printing
+- Augmented predict_proba test, and train_val_test_split
+- Fixed pipeline_steps arg in model definition
+- Refactored metrics_df in report_model_metrics for aesthetics
+- Unit Tests
+- Made return_dict optional in return_metrics
+- Added openpyxl versions for all python versions in requirements.txt
+- Refactor metrics, foldwise metrics and foldwise con_mat, class_labels
+- Cleaned notebooks dir
+- Added model_tuner version print to scripts
+- Added fix for sort of pipeline_steps now optional:
+- Added required model_tuner import to xgb_multi.py
+- Added requisite model_tuner import to multi_class_test.py
+- Added catboost_multi_class.py script
+- Removed pip dependency from requirements
+
 Version 0.0.23a
 --------------------
 

diff --git a/docs/_sources/getting_started.rst.txt b/docs/_sources/getting_started.rst.txt
@@ -25,7 +25,7 @@ Welcome to Model Tuner's Documentation!
 ========================================
 
 .. important::
-   This documentation is for ``model_tuner`` version ``0.0.023a``.
+   This documentation is for ``model_tuner`` version ``0.0.26b``.
 
 
 What Does Model Tuner Offer?

diff --git a/docs/_sources/usage_guide.rst.txt b/docs/_sources/usage_guide.rst.txt
@@ -1475,7 +1475,7 @@ Here are some of the available methods:
 
     Extracts both the preprocessing and feature selection parts of the pipeline.
 
-    **Example**::
+    **Definition**::
 
         def get_preprocessing_and_feature_selection_pipeline(self):
             steps = [
@@ -1489,7 +1489,7 @@ Here are some of the available methods:
 
     Extracts only the feature selection part of the pipeline.
 
-    **Example**::
+    **Definition**::
 
         def get_feature_selection_pipeline(self):
             steps = [
@@ -1503,7 +1503,7 @@ Here are some of the available methods:
 
     Extracts only the preprocessing part of the pipeline.
 
-    **Example**::
+    **Definition**::
 
         def get_preprocessing_pipeline(self):
             preprocessing_steps = [
@@ -1513,6 +1513,39 @@ Here are some of the available methods:
             ]
             return self.PipelineClass(preprocessing_steps)
 
+Extracting Feature Names
+--------------------------
+When performing feature selection with tools such as Recursive Feature Elimination (RFE), or
+when using a ColumnTransformer, the feature names fed to the model can be obscured
+and differ from the originals. To get the transformed feature names, or to extract the
+feature names chosen by the feature selection process, use the
+``get_feature_names()`` method.
+
+.. py:function:: get_feature_names()
+
+   Extracts the feature names after they have been processed by the pipeline.
+   This only works if a ColumnTransformer, a OneHotEncoder, or some form of
+   feature selection is present in the pipeline.
+
+   **Definition**::
+
+      def get_feature_names(self):
+          if self.pipeline_steps is None or not self.pipeline_steps:
+              raise ValueError("You must provide pipeline steps to use get_feature_names")
+          if hasattr(self.estimator, "steps"):
+              estimator_steps = self.estimator[:-1]
+          else:
+              estimator_steps = self.estimator.estimator[:-1]
+          return estimator_steps.get_feature_names_out().tolist()
+
+   **Example Usage**::
+
+      ### Assuming you already have fitted a model with some form of feature selection
+      ### or feature transformation in the pipeline e.g. one hot encoder:
+
+      feat_names = model.get_feature_names()
+
+
 Summary
 --------
 
@@ -2880,7 +2913,8 @@ Step 3: Extract feature names from the training data, and initialize the SHAP ex
    import shap
 
    ## Feature names are required for interpretability in SHAP plots
-   feature_names = X_train.columns.to_list()
+   ## Use the transformed names if feature selection, a ColumnTransformer, or a OneHotEncoder was used
+   feature_names = model.get_feature_names()
 
    ## Initialize the SHAP explainer with the model
    explainer = shap.TreeExplainer(xgb_classifier)
@@ -2901,7 +2935,7 @@ Step 5: Generate a summary plot of SHAP values
 
    ## Plot SHAP values
    ## Summary plot of SHAP values for all features across all data points
-   shap.summary_plot(shap_values, X_test_transformed, feature_names=feature_names,)
+   shap.summary_plot(shap_values, X_test_transformed, feature_names=feature_names)
 
 
 .. raw:: html

diff --git a/docs/_static/documentation_options.js b/docs/_static/documentation_options.js
@@ -1,5 +1,5 @@
 const DOCUMENTATION_OPTIONS = {
-    VERSION: '0.0.23a0',
+    VERSION: '0.0.26b0',
     LANGUAGE: 'en',
     COLLAPSE_INDEX: false,
     BUILDER: 'html',

diff --git a/docs/about.html b/docs/about.html
@@ -6,7 +6,7 @@
   <meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>GitHub Repository &mdash; Model Tuner 0.0.23a0 documentation</title>
+  <title>GitHub Repository &mdash; Model Tuner 0.0.26b0 documentation</title>
       <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
       <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=e59714d7" />
       <link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
@@ -16,7 +16,7 @@
     <link rel="canonical" href="https://uclamii.github.io/model_tuner/about.html" />
       <script src="_static/jquery.js?v=5d32c60e"></script>
       <script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
-      <script src="_static/documentation_options.js?v=2b808d0e"></script>
+      <script src="_static/documentation_options.js?v=eb2198be"></script>
       <script src="_static/doctools.js?v=9bcbadda"></script>
       <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
       <script src="_static/clipboard.min.js?v=a7894cd8"></script>

diff --git a/docs/caveats.html b/docs/caveats.html
@@ -6,7 +6,7 @@
   <meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>Zero Variance Columns &mdash; Model Tuner 0.0.23a0 documentation</title>
+  <title>Zero Variance Columns &mdash; Model Tuner 0.0.26b0 documentation</title>
       <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
       <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=e59714d7" />
       <link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
@@ -16,7 +16,7 @@
     <link rel="canonical" href="https://uclamii.github.io/model_tuner/caveats.html" />
       <script src="_static/jquery.js?v=5d32c60e"></script>
       <script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
-      <script src="_static/documentation_options.js?v=2b808d0e"></script>
+      <script src="_static/documentation_options.js?v=eb2198be"></script>
       <script src="_static/doctools.js?v=9bcbadda"></script>
       <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
       <script src="_static/clipboard.min.js?v=a7894cd8"></script>

diff --git a/docs/changelog.html b/docs/changelog.html
@@ -6,7 +6,7 @@
   <meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>Changelog &mdash; Model Tuner 0.0.23a0 documentation</title>
+  <title>Changelog &mdash; Model Tuner 0.0.26b0 documentation</title>
       <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
       <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=e59714d7" />
       <link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
@@ -16,7 +16,7 @@
     <link rel="canonical" href="https://uclamii.github.io/model_tuner/changelog.html" />
       <script src="_static/jquery.js?v=5d32c60e"></script>
       <script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
-      <script src="_static/documentation_options.js?v=2b808d0e"></script>
+      <script src="_static/documentation_options.js?v=eb2198be"></script>
       <script src="_static/doctools.js?v=9bcbadda"></script>
       <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
       <script src="_static/clipboard.min.js?v=a7894cd8"></script>
@@ -85,6 +85,9 @@
 <li class="toctree-l1"><a class="reference internal" href="about.html#acknowledgements">Acknowledgements</a></li>
 <li class="toctree-l1"><a class="reference internal" href="about.html#citing-model-tuner">Citing Model Tuner</a></li>
 <li class="toctree-l1 current"><a class="current reference internal" href="#">Changelog</a><ul>
+<li class="toctree-l2"><a class="reference internal" href="#version-0-0-26b-beta">Version 0.0.26b (Beta)</a></li>
+<li class="toctree-l2"><a class="reference internal" href="#version-0-0-25a">Version 0.0.25a</a></li>
+<li class="toctree-l2"><a class="reference internal" href="#version-0-0-24a">Version 0.0.24a</a></li>
 <li class="toctree-l2"><a class="reference internal" href="#version-0-0-23a">Version 0.0.23a</a></li>
 <li class="toctree-l2"><a class="reference internal" href="#version-0-0-22a">Version 0.0.22a</a></li>
 <li class="toctree-l2"><a class="reference internal" href="#version-0-0-21a">Version 0.0.21a</a></li>
@@ -142,6 +145,62 @@ <h1>Changelog<a class="headerlink" href="#changelog" title="Link to this heading
 <p class="admonition-title">Important</p>
 <p>Complete version release history available <a class="reference external" href="https://pypi.org/project/model-tuner/#history">here</a></p>
 </div>
+<section id="version-0-0-26b-beta">
+<h2>Version 0.0.26b (Beta)<a class="headerlink" href="#version-0-0-26b-beta" title="Link to this heading"></a></h2>
+<ul class="simple">
+<li><p>Optimal threshold: users can now specify a target precision or recall, and an optimal
+threshold is computed for that target</p></li>
+<li><p>Finalised testing: coverage is now at 86% total</p></li>
+<li><p>New get_feature_names() helper function for extracting features</p></li>
+<li><p>n_estimators calculation for boosting algorithms is now fixed</p></li>
+</ul>
+</section>
+<section id="version-0-0-25a">
+<h2>Version 0.0.25a<a class="headerlink" href="#version-0-0-25a" title="Link to this heading"></a></h2>
+<ul class="simple">
+<li><p>Pushed fixes for the get_feature_selection_pipeline method.</p></li>
+<li><p>Updated scoring blocks for calibrated KFold models and folded confusion matrix metrics.</p></li>
+<li><p>Added unittests for edge cases, including test_rfe_calibrate_model() and validating confusion matrix alignment.</p></li>
+<li><p>Fixed mismatches between confusion matrix and classification report.</p></li>
+<li><p>Provided fixes for all pipeline getter methods.</p></li>
+<li><p>Integrated verify_imb_sampler prints into KFold logic.</p></li>
+<li><p>Resolved typos in group split configurations and refined nested KFold bug fixes.</p></li>
+<li><p>Adjusted fold metric calculations in report_model_metrics.</p></li>
+<li><p>Moved optimal threshold logic into prediction functionality.</p></li>
+<li><p>Enhanced return metrics dictionary logic to handle all cases and added multilabel classification tests.</p></li>
+<li><p>Addressed Brier score calculation issues and optimized regression reports for KFold.</p></li>
+<li><p>Introduced threshold print updates for clearer reporting.</p></li>
+<li><p>Implemented SHAP scripts and tests for model explainability.</p></li>
+<li><p>Removed outdated calibration reports from documentation and codebase.</p></li>
+<li><p>Fixed bugs in regression metric calculations and refined KFold metric aggregation.</p></li>
+</ul>
+</section>
+<section id="version-0-0-24a">
+<h2>Version 0.0.24a<a class="headerlink" href="#version-0-0-24a" title="Link to this heading"></a></h2>
+<ul class="simple">
+<li><p>Updated .gitignore to include doctrees</p></li>
+<li><p>Added pickleObjects tests and updated requirements, tests passed</p></li>
+<li><p>Added bootstrapper test and tests passed</p></li>
+<li><p>Added multi-class test script</p></li>
+<li><p>Updated Metrics Output</p></li>
+<li><p>Added optimal threshold print inside return_metrics</p></li>
+<li><p>KFold metric printing</p></li>
+<li><p>Augmented predict_proba test, and train_val_test_split</p></li>
+<li><p>Fixed pipeline_steps arg in model definition</p></li>
+<li><p>Refactored metrics_df in report_model_metrics for aesthetics</p></li>
+<li><p>Unit Tests</p></li>
+<li><p>Made return_dict optional in return_metrics</p></li>
+<li><p>Added openpyxl versions for all python versions in requirements.txt</p></li>
+<li><p>Refactor metrics, foldwise metrics and foldwise con_mat, class_labels</p></li>
+<li><p>Cleaned notebooks dir</p></li>
+<li><p>Added model_tuner version print to scripts</p></li>
+<li><p>Added fix for sort of pipeline_steps now optional:</p></li>
+<li><p>Added required model_tuner import to xgb_multi.py</p></li>
+<li><p>Added requisite model_tuner import to multi_class_test.py</p></li>
+<li><p>Added catboost_multi_class.py script</p></li>
+<li><p>Removed pip dependency from requirements</p></li>
+</ul>
+</section>
 <section id="version-0-0-23a">
 <h2>Version 0.0.23a<a class="headerlink" href="#version-0-0-23a" title="Link to this heading"></a></h2>
 <ul class="simple">

diff --git a/docs/genindex.html b/docs/genindex.html
@@ -5,7 +5,7 @@
 <head>
   <meta charset="utf-8" />
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>Index &mdash; Model Tuner 0.0.23a0 documentation</title>
+  <title>Index &mdash; Model Tuner 0.0.26b0 documentation</title>
       <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
       <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=e59714d7" />
       <link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
@@ -15,7 +15,7 @@
     <link rel="canonical" href="https://uclamii.github.io/model_tuner/genindex.html" />
       <script src="_static/jquery.js?v=5d32c60e"></script>
       <script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
-      <script src="_static/documentation_options.js?v=2b808d0e"></script>
+      <script src="_static/documentation_options.js?v=eb2198be"></script>
       <script src="_static/doctools.js?v=9bcbadda"></script>
       <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
       <script src="_static/clipboard.min.js?v=a7894cd8"></script>
@@ -131,6 +131,8 @@ <h2 id="B">B</h2>
         <li><a href="usage_guide.html#check_input_type">check_input_type()</a>
 </li>
         <li><a href="usage_guide.html#evaluate_bootstrap_metrics">evaluate_bootstrap_metrics()</a>
+</li>
+        <li><a href="usage_guide.html#get_feature_names">get_feature_names()</a>
 </li>
         <li><a href="usage_guide.html#get_feature_selection_pipeline">get_feature_selection_pipeline()</a>
 </li>
@@ -178,21 +180,28 @@ <h2 id="G">G</h2>
 <table style="width: 100%" class="indextable genindextable"><tr>
   <td style="width: 33%; vertical-align: top;"><ul>
       <li>
+    get_feature_names()
+
+      <ul>
+        <li><a href="usage_guide.html#get_feature_names">built-in function</a>
+</li>
+      </ul></li>
+      <li>
     get_feature_selection_pipeline()
 
       <ul>
         <li><a href="usage_guide.html#get_feature_selection_pipeline">built-in function</a>
 </li>
       </ul></li>
+  </ul></td>
+  <td style="width: 33%; vertical-align: top;"><ul>
       <li>
     get_preprocessing_and_feature_selection_pipeline()
 
       <ul>
         <li><a href="usage_guide.html#get_preprocessing_and_feature_selection_pipeline">built-in function</a>
 </li>
       </ul></li>
-  </ul></td>
-  <td style="width: 33%; vertical-align: top;"><ul>
       <li>
     get_preprocessing_pipeline()
 

diff --git a/docs/getting_started.html b/docs/getting_started.html
@@ -6,7 +6,7 @@
   <meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>Welcome to Model Tuner’s Documentation! &mdash; Model Tuner 0.0.23a0 documentation</title>
+  <title>Welcome to Model Tuner’s Documentation! &mdash; Model Tuner 0.0.26b0 documentation</title>
       <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=80d5e7a1" />
       <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=e59714d7" />
       <link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
@@ -16,7 +16,7 @@
     <link rel="canonical" href="https://uclamii.github.io/model_tuner/getting_started.html" />
       <script src="_static/jquery.js?v=5d32c60e"></script>
       <script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
-      <script src="_static/documentation_options.js?v=2b808d0e"></script>
+      <script src="_static/documentation_options.js?v=eb2198be"></script>
       <script src="_static/doctools.js?v=9bcbadda"></script>
       <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
       <script src="_static/clipboard.min.js?v=a7894cd8"></script>
@@ -123,7 +123,7 @@
 <h1>Welcome to Model Tuner’s Documentation!<a class="headerlink" href="#welcome-to-model-tuner-s-documentation" title="Link to this heading"></a></h1>
 <div class="admonition important">
 <p class="admonition-title">Important</p>
-<p>This documentation is for <code class="docutils literal notranslate"><span class="pre">model_tuner</span></code> version <code class="docutils literal notranslate"><span class="pre">0.0.023a</span></code>.</p>
+<p>This documentation is for <code class="docutils literal notranslate"><span class="pre">model_tuner</span></code> version <code class="docutils literal notranslate"><span class="pre">0.0.26b</span></code>.</p>
 </div>
 <section id="what-does-model-tuner-offer">
 <h2>What Does Model Tuner Offer?<a class="headerlink" href="#what-does-model-tuner-offer" title="Link to this heading"></a></h2>