added catboost hyperparams docs
lshpaner committed Dec 7, 2024
1 parent f0200ed commit 0ad8612
Showing 9 changed files with 91 additions and 5 deletions.
Binary file added assets/2016-10-06-lr-067222-730489.pdf
Binary file not shown.
2 changes: 0 additions & 2 deletions docs/_sources/caveats.rst.txt
@@ -236,8 +236,6 @@ where :math:`x_{\min}` and :math:`x_{\max}` represent the minimum and maximum va
By imputing missing values before scaling, we avoid these distortions, ensuring that the scaling operation reflects the true range of the data.




Column Stratification with Cross-Validation
---------------------------------------------
.. important::
31 changes: 31 additions & 0 deletions docs/_sources/usage_guide.rst.txt
@@ -445,6 +445,37 @@ Step 5: Define Hyperparameters for XGBoost

This can be particularly useful for monitoring model performance when early stopping is enabled.

.. important::

   When defining hyperparameters for boosting algorithms, frameworks such as
   XGBoost allow straightforward configuration, e.g., ``n_estimators`` for the
   number of boosting rounds. CatBoost, however, introduces a potential pitfall
   when defining this parameter.

   According to the `CatBoost documentation <https://catboost.ai/docs/en/references/training-parameters/>`_:

      "For the Python package several parameters have aliases. For example,
      the --iterations parameter has the following synonyms: num_boost_round,
      n_estimators, num_trees. Simultaneous usage of different names of one
      parameter raises an error."

   To avoid this error in CatBoost, define only one of these aliases (e.g.,
   ``n_estimators``) and omit the others, such as ``iterations`` or ``num_boost_round``.

Example: Tuning Hyperparameters for CatBoost
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When defining hyperparameters for grid search, specify only one alias in your configuration. Below is an example:

.. code-block:: python

   cat_name = "cat"
   tuned_hyperparameters_cat = {
       f"{cat_name}__n_estimators": [1500],  # use only "n_estimators"
       f"{cat_name}__learning_rate": [0.01, 0.1],
       f"{cat_name}__depth": [4, 6, 8],
       f"{cat_name}__loss_function": ["Logloss"],
   }

This ensures compatibility with CatBoost’s requirements and avoids errors during hyperparameter tuning.


Step 6: Initialize and Configure the ``Model``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

1 change: 1 addition & 0 deletions docs/index.html
@@ -155,6 +155,7 @@ <h1>Model Tuner Documentation<a class="headerlink" href="#model-tuner-documentat
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-3-check-for-zero-variance-columns-and-drop-accordingly">Step 3: Check for zero-variance columns and drop accordingly</a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-4-create-an-instance-of-the-xgbclassifier">Step 4: Create an Instance of the XGBClassifier</a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-5-define-hyperparameters-for-xgboost">Step 5: Define Hyperparameters for XGBoost</a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#example-tuning-hyperparameters-for-catboost">Example: Tuning Hyperparameters for CatBoost</a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-6-initialize-and-configure-the-model">Step 6: Initialize and Configure the <code class="docutils literal notranslate"><span class="pre">Model</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-7-perform-grid-search-parameter-tuning">Step 7: Perform Grid Search Parameter Tuning</a></li>
<li class="toctree-l3"><a class="reference internal" href="usage_guide.html#step-8-fit-the-model">Step 8: Fit the Model</a></li>
Binary file modified docs/objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/searchindex.js

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions docs/usage_guide.html
@@ -85,6 +85,7 @@
<li class="toctree-l3"><a class="reference internal" href="#step-3-check-for-zero-variance-columns-and-drop-accordingly">Step 3: Check for zero-variance columns and drop accordingly</a></li>
<li class="toctree-l3"><a class="reference internal" href="#step-4-create-an-instance-of-the-xgbclassifier">Step 4: Create an Instance of the XGBClassifier</a></li>
<li class="toctree-l3"><a class="reference internal" href="#step-5-define-hyperparameters-for-xgboost">Step 5: Define Hyperparameters for XGBoost</a></li>
<li class="toctree-l3"><a class="reference internal" href="#example-tuning-hyperparameters-for-catboost">Example: Tuning Hyperparameters for CatBoost</a></li>
<li class="toctree-l3"><a class="reference internal" href="#step-6-initialize-and-configure-the-model">Step 6: Initialize and Configure the <code class="docutils literal notranslate"><span class="pre">Model</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#step-7-perform-grid-search-parameter-tuning">Step 7: Perform Grid Search Parameter Tuning</a></li>
<li class="toctree-l3"><a class="reference internal" href="#step-8-fit-the-model">Step 8: Fit the Model</a></li>
@@ -543,6 +544,32 @@ <h3>Step 4: Create an Instance of the XGBClassifier<a class="headerlink" href="#
</ul>
<p>This can be particularly useful for monitoring model performance when early stopping is enabled.</p>
</div>
<div class="admonition important">
<p class="admonition-title">Important</p>
<p>When defining hyperparameters for boosting algorithms, frameworks such as
XGBoost allow straightforward configuration, e.g., <code class="docutils literal notranslate"><span class="pre">n_estimators</span></code>
for the number of boosting rounds. CatBoost, however, introduces a potential
pitfall when defining this parameter.</p>
<p>According to the <a class="reference external" href="https://catboost.ai/docs/en/references/training-parameters/">CatBoost documentation</a>:</p>
<blockquote>
<div><p>“For the Python package several parameters have aliases. For example, the –iterations parameter has the following synonyms: num_boost_round, n_estimators, num_trees. Simultaneous usage of different names of one parameter raises an error.”</p>
</div></blockquote>
<p>To avoid this error in CatBoost, define only one of these aliases (e.g., <code class="docutils literal notranslate"><span class="pre">n_estimators</span></code>) and omit the others, such as <code class="docutils literal notranslate"><span class="pre">iterations</span></code> or <code class="docutils literal notranslate"><span class="pre">num_boost_round</span></code>.</p>
</div>
</section>
<section id="example-tuning-hyperparameters-for-catboost">
<h3>Example: Tuning Hyperparameters for CatBoost<a class="headerlink" href="#example-tuning-hyperparameters-for-catboost" title="Link to this heading"></a></h3>
<p>When defining hyperparameters for grid search, specify only one alias in your configuration. Below is an example:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">cat_name</span> <span class="o">=</span> <span class="s2">&quot;cat&quot;</span>
<span class="n">tuned_hyperparameters_cat</span> <span class="o">=</span> <span class="p">{</span>
<span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">cat_name</span><span class="si">}</span><span class="s2">__n_estimators&quot;</span><span class="p">:</span> <span class="p">[</span><span class="mi">1500</span><span class="p">],</span> <span class="c1"># Use only &quot;n_estimators&quot;</span>
<span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">cat_name</span><span class="si">}</span><span class="s2">__learning_rate&quot;</span><span class="p">:</span> <span class="p">[</span><span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">],</span>
<span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">cat_name</span><span class="si">}</span><span class="s2">__depth&quot;</span><span class="p">:</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">8</span><span class="p">],</span>
<span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">cat_name</span><span class="si">}</span><span class="s2">__loss_function&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Logloss&quot;</span><span class="p">],</span>
<span class="p">}</span>
</pre></div>
</div>
<p>This ensures compatibility with CatBoost’s requirements and avoids errors during hyperparameter tuning.</p>
</section>
<section id="step-6-initialize-and-configure-the-model">
<h3>Step 6: Initialize and Configure the <code class="docutils literal notranslate"><span class="pre">Model</span></code><a class="headerlink" href="#step-6-initialize-and-configure-the-model" title="Link to this heading"></a></h3>
2 changes: 0 additions & 2 deletions source/caveats.rst
@@ -236,8 +236,6 @@ where :math:`x_{\min}` and :math:`x_{\max}` represent the minimum and maximum va
By imputing missing values before scaling, we avoid these distortions, ensuring that the scaling operation reflects the true range of the data.




Column Stratification with Cross-Validation
---------------------------------------------
.. important::
31 changes: 31 additions & 0 deletions source/usage_guide.rst
@@ -445,6 +445,37 @@ Step 5: Define Hyperparameters for XGBoost

This can be particularly useful for monitoring model performance when early stopping is enabled.

.. important::

   When defining hyperparameters for boosting algorithms, frameworks such as
   XGBoost allow straightforward configuration, e.g., ``n_estimators`` for the
   number of boosting rounds. CatBoost, however, introduces a potential pitfall
   when defining this parameter.

   According to the `CatBoost documentation <https://catboost.ai/docs/en/references/training-parameters/>`_:

      "For the Python package several parameters have aliases. For example,
      the --iterations parameter has the following synonyms: num_boost_round,
      n_estimators, num_trees. Simultaneous usage of different names of one
      parameter raises an error."

   To avoid this error in CatBoost, define only one of these aliases (e.g.,
   ``n_estimators``) and omit the others, such as ``iterations`` or ``num_boost_round``.

Example: Tuning Hyperparameters for CatBoost
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When defining hyperparameters for grid search, specify only one alias in your configuration. Below is an example:

.. code-block:: python

   cat_name = "cat"
   tuned_hyperparameters_cat = {
       f"{cat_name}__n_estimators": [1500],  # use only "n_estimators"
       f"{cat_name}__learning_rate": [0.01, 0.1],
       f"{cat_name}__depth": [4, 6, 8],
       f"{cat_name}__loss_function": ["Logloss"],
   }

This ensures compatibility with CatBoost’s requirements and avoids errors during hyperparameter tuning.
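As a quick sanity check before tuning, a small helper can flag grids that mix CatBoost's iteration-count aliases. This is a minimal sketch, not part of CatBoost's API or this library: the helper name is hypothetical, and the alias set is an assumption drawn from the documentation quote above.

```python
# Hypothetical helper (not part of CatBoost): scan a pipeline-style
# parameter grid for simultaneous use of CatBoost's iteration-count
# aliases before handing the grid to a tuner.
ITERATION_ALIASES = {"iterations", "num_boost_round", "n_estimators", "num_trees"}

def conflicting_iteration_aliases(param_grid):
    """Return the aliases in conflict, or an empty set if at most one is used."""
    used = {key.rsplit("__", 1)[-1] for key in param_grid} & ITERATION_ALIASES
    return used if len(used) > 1 else set()

# The grid from the example above uses a single alias and passes the check:
grid_ok = {
    "cat__n_estimators": [1500],
    "cat__learning_rate": [0.01, 0.1],
    "cat__depth": [4, 6, 8],
    "cat__loss_function": ["Logloss"],
}
print(conflicting_iteration_aliases(grid_ok))  # set()

# Mixing two aliases of the same parameter is the case CatBoost rejects:
grid_bad = {"cat__n_estimators": [1500], "cat__iterations": [1500]}
print(sorted(conflicting_iteration_aliases(grid_bad)))  # ['iterations', 'n_estimators']
```

Running such a check up front turns a mid-search failure into an immediate, readable diagnostic.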


Step 6: Initialize and Configure the ``Model``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

