EDA Toolkit 0.0.11a2

@lshpaner lshpaner released this 21 Oct 22:51
· 122 commits to main since this release

Data Doctor Updates

1. Added new_col_name logic for the case where scale_conversion is None but cutoffs are to be applied to a new column. Such calls now go through, and the new column is created.

2. Fix for apply_as_new_col_to_df logic

Updated the logic for generating the new column name when apply_as_new_col_to_df=True. This ensures that the column name is correctly assigned based on the applied transformation or cutoff.

Original code:

```python
# New column name options when apply_as_new_col_to_df == True
if apply_as_new_col_to_df == True and scale_conversion == None and apply_cutoff == True:
    new_col_name = feature_name + "_" + 'w_cutoff'
elif apply_as_new_col_to_df == True and scale_conversion != None:
    new_col_name = feature_name + "_" + scale_conversion
```

**Updated version**:

```python
# Default new column name in case no conditions are met
new_col_name = feature_name

# New column name options when apply_as_new_col_to_df == True
if apply_as_new_col_to_df:
    if scale_conversion is None and apply_cutoff:
        new_col_name = feature_name + "_w_cutoff"
    elif scale_conversion is not None:
        new_col_name = feature_name + "_" + scale_conversion
```

3. Custom ValueError for missing conditions

Added a custom ValueError for the case where the user sets apply_as_new_col_to_df=True but neither specifies a scale_conversion nor sets apply_cutoff=True. This gives users clearer feedback and avoids unexpected behavior.

New error-handling block:

```python
if apply_as_new_col_to_df:
    if scale_conversion is None and not apply_cutoff:
        raise ValueError(
            "When applying a new column with `apply_as_new_col_to_df=True`, "
            "you must specify either a `scale_conversion` or set `apply_cutoff=True`."
        )
```
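
To see how the naming logic and the new check could fit together, here is a minimal standalone sketch. The helper name resolve_new_col_name and its signature are hypothetical and are not part of the EDA Toolkit API. The sketch also assumes the validation runs before the name is built, which the release notes do not state.

```python
from typing import Optional


def resolve_new_col_name(
    feature_name: str,
    scale_conversion: Optional[str] = None,
    apply_cutoff: bool = False,
    apply_as_new_col_to_df: bool = False,
) -> str:
    """Hypothetical helper: return the column name the result would use."""
    if not apply_as_new_col_to_df:
        # No new column requested, so fall back to the default name.
        return feature_name

    # Assumed ordering: reject the unsupported combination first.
    if scale_conversion is None and not apply_cutoff:
        raise ValueError(
            "When applying a new column with `apply_as_new_col_to_df=True`, "
            "you must specify either a `scale_conversion` or set `apply_cutoff=True`."
        )

    if scale_conversion is None:
        # Cutoff only: mark the column as cutoff-adjusted.
        return feature_name + "_w_cutoff"

    # A scale conversion takes precedence in the name, with or without a cutoff.
    return feature_name + "_" + scale_conversion


print(resolve_new_col_name("age", apply_cutoff=True, apply_as_new_col_to_df=True))
print(resolve_new_col_name("age", scale_conversion="log", apply_as_new_col_to_df=True))
```

With these inputs the two calls print age_w_cutoff and age_log. Calling the helper with apply_as_new_col_to_df=True and neither option set raises the ValueError shown above.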

Overall Changes

  • Corrected the logic for generating new column names when transformations or cutoffs are applied.
  • Added a custom ValueError when apply_as_new_col_to_df=True but neither a valid scale_conversion nor apply_cutoff=True is specified.
  • Updated the docstring to reflect the new logic and error handling.