Releases: lshpaner/eda_toolkit
EDA Toolkit 0.0.9
Version 0.0.9
Bug Fixes and Minor Improvements
Improved error messages and validation checks across multiple functions to prevent common pitfalls and ensure a smoother user experience.
Visualization Enhancements
DataFrame Columns: Added a background_color variable to dataframe_columns, allowing the user to enter a string representing a color name or hex value. The styled output is wrapped in a try/except to handle pandas version differences: current releases hide the index with the Styler's hide(), while older releases only provide hide_index(). The highlighted columns allow for easier null versus unique value analysis.
The docstring now clearly describes the purpose of the function: analyzing DataFrame columns to provide summary statistics.
Args:
- The df argument is specified as a pandas.DataFrame.
- The background_color argument is marked as optional, with a brief description of its role.
- The return_df argument is also marked as optional, explaining what it controls.
Returns: The return type is specified as pandas.DataFrame, with a clear explanation of the difference based on the return_df flag.
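The version fallback described for dataframe_columns might look roughly like the following sketch. The helper name hide_index_compat is hypothetical and not part of eda_toolkit; it only assumes an object exposing the pandas Styler method hide() (pandas 1.4 and later) or hide_index() (earlier releases).

```python
def hide_index_compat(styler):
    """Hide the index of a pandas Styler across pandas versions.

    Hypothetical helper for illustration; not the toolkit's actual code.
    """
    try:
        # Styler.hide() hides the index by default in newer pandas releases.
        return styler.hide()
    except AttributeError:
        # Older pandas releases only provide Styler.hide_index().
        return styler.hide_index()
```

With pandas installed, this could be called as hide_index_compat(df.style).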
KDE Distribution Plots: Improved kde_distributions() with enhanced options for log scaling, mean/median plotting, custom standard deviation lines, and better handling of legends and scientific notation.
Scatter Plots: Enhanced scatter_fit_plot() with support for hue-based coloring, best fit lines, correlation display, and flexible grid plotting options.
EDA Toolkit 0.0.8e
Version 0.0.8e
This update introduces several key changes to the plot_3d_pdp function and minor changes to the stacked_crosstab_plot function, simplifying the function's interface and improving usability, while maintaining the flexibility needed for diverse visualization needs.
stacked_crosstab_plot
- Flexible save_formats Input:
  - save_formats now accepts a string, tuple, or list for specifying formats (e.g., "png", ("png", "svg"), or ["png", "svg"]).
  - Single strings or tuples are automatically converted to lists for consistent processing.
- Dynamic Error Handling:
  - Added checks to ensure a valid path is provided for each format in save_formats.
  - Raises a ValueError if a format is specified without a corresponding path, with a clear, dynamic error message.
- Improved Plot Saving Logic:
  - Updated logic allows saving plots in one format (e.g., only "png" or "svg") without requiring the other.
  - Simplified and more intuitive path handling for saving plots.
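As a rough illustration of the save_formats handling described above, the following sketch normalizes the input to a list and checks that each format has a path. The helper name normalize_save_formats and the image_paths mapping are assumptions made for this example, not the toolkit's internals.

```python
def normalize_save_formats(save_formats, image_paths):
    """Return save_formats as a list and verify each format has a path.

    image_paths is a hypothetical mapping such as {"png": "plots/png"}.
    """
    if save_formats is None:
        return []
    if isinstance(save_formats, str):
        save_formats = [save_formats]
    elif isinstance(save_formats, tuple):
        save_formats = list(save_formats)
    elif not isinstance(save_formats, list):
        raise ValueError("save_formats must be a string, tuple, or list.")

    for fmt in save_formats:
        # A missing or empty path for a requested format is an error.
        if not image_paths.get(fmt):
            raise ValueError(
                f"A path for '{fmt}' must be provided when '{fmt}' "
                f"is listed in save_formats."
            )
    return save_formats
```

Building the message from the offending format is one way to get the "dynamic" error text the notes mention.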
plot_3d_pdp
- Parameter Changes
  - Removed Parameters:
    - The parameters x_label_plotly, y_label_plotly, and z_label_plotly have been removed. These parameters previously allowed custom axis labels specifically for the Plotly plot, defaulting to the general x_label, y_label, and z_label. Removing these parameters simplifies the function signature while maintaining flexibility.
  - Default Values for Labels:
    - The parameters x_label, y_label, and z_label are now optional, with None as the default. If not provided, these labels will automatically default to the names of the features in the feature_names_list. This change makes the function more user-friendly, particularly for cases where default labels are sufficient.
  - Changes in Default Values for View Angles:
    - The default values for camera positioning parameters have been updated: horizontal is now -1.25, depth is now 1.25, and vertical is now 1.25. These adjustments refine the default 3D view perspective for the Plotly plot, providing a more intuitive starting view.
- Plot Generation Logic
  - Conditionally Checking Labels:
    - The function now checks whether x_label, y_label, and z_label are provided. If these are None, the function will automatically assign default labels based on the feature_names_list. This enhancement reduces the need for users to manually specify labels, making the function more adaptive.
  - Camera Position Adjustments:
    - The camera positions for the Plotly plot are now adjusted by multiplying horizontal, depth, and vertical by zoom_out_factor. This change allows for more granular control over the 3D view, enhancing the interactivity and flexibility of the Plotly visualizations.
  - Surface Plot Coordinates Adjustments:
    - The order of the coordinates for the Plotly plot’s surface has been changed from ZZ, XX, YY[::-1] to ZZ, XX, YY. This adjustment ensures the proper alignment of axes and grids, resulting in more accurate visual representations.
- Code Simplifications
  - Removed Complexity:
    - By removing the x_label_plotly, y_label_plotly, and z_label_plotly parameters, the code is now simpler and easier to maintain. This change reduces potential confusion and streamlines the function for users who do not need distinct labels for Matplotlib and Plotly plots.
  - Fallback Mechanism for Grid Values:
    - The function continues to implement a fallback mechanism when extracting grid values, ensuring compatibility with various versions of scikit-learn. This makes the function robust across different environments.
- Style Adjustments
  - Label Formatting:
    - The new version consistently uses y_label, x_label, and z_label for axis labels in the Matplotlib plot, aligning the formatting across different plot types.
  - Color Bar Adjustments:
    - The color bar configuration in the Matplotlib plot has been slightly adjusted with a shrink value of 0.6 and a pad value of 0.02. These adjustments result in a more refined visual appearance, particularly in cases where space is limited.
- Potential Use Case Differences
  - Simplified Interface:
    - The updated function is more streamlined for users who prefer a simplified interface without the need for separate label customizations for Plotly and Matplotlib plots. This makes it easier to use in common scenarios.
  - Less Granular Control:
    - Users who need more granular control, particularly for presentations or specific formatting, may find the older version more suitable. The removal of the *_plotly label parameters means that all plots now use the same labels across Matplotlib and Plotly.
- Matplotlib Plot Adjustments
  - Wireframe and Surface Plot Enhancements:
    - The logic for plotting wireframes and surface plots in Matplotlib remains consistent with previous versions, with subtle enhancements to color and layout management to improve overall aesthetics.
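The label defaults and camera scaling described in this section can be sketched as follows. This is an illustrative reconstruction, not the function's source: the helper names, the fallback z-axis label, the default zoom_out_factor, and the mapping of horizontal, depth, and vertical onto the x, y, and z components of Plotly's camera eye are all assumptions.

```python
def resolve_labels(feature_names_list, x_label=None, y_label=None, z_label=None):
    """Fall back to feature names when axis labels are not supplied."""
    x_label = feature_names_list[0] if x_label is None else x_label
    y_label = feature_names_list[1] if y_label is None else y_label
    # The notes do not say what z_label falls back to; this default is assumed.
    z_label = "Partial Dependence" if z_label is None else z_label
    return x_label, y_label, z_label


def camera_eye(horizontal=-1.25, depth=1.25, vertical=1.25, zoom_out_factor=1.0):
    """Scale each view-angle component by zoom_out_factor."""
    return {
        "x": horizontal * zoom_out_factor,
        "y": depth * zoom_out_factor,
        "z": vertical * zoom_out_factor,
    }
```

A dictionary of this shape is what Plotly accepts for the eye entry of a 3D scene camera, for example fig.update_layout(scene_camera=dict(eye=camera_eye(zoom_out_factor=2))).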
Summary
- Version 0.0.8e of the plot_3d_pdp function introduces simplifications that reduce the number of parameters and streamline the plotting process. While some customizability has been removed, the function remains flexible enough for most use cases and is easier to use.
- Key updates include adjusted default camera views for 3D plots, removal of Plotly-specific label parameters, and improved automatic labeling and plotting logic.
EDA Toolkit 0.0.8
Version 0.0.8
This update introduces several key changes to the plot_3d_pdp function and minor changes to the stacked_crosstab_plot function, simplifying the function's interface and improving usability, while maintaining the flexibility needed for diverse visualization needs.
stacked_crosstab_plot
- Flexible save_formats Input:
  - save_formats now accepts a string, tuple, or list for specifying formats (e.g., "png", ("png", "svg"), or ["png", "svg"]).
  - Single strings or tuples are automatically converted to lists for consistent processing.
- Dynamic Error Handling:
  - Added checks to ensure a valid path is provided for each format in save_formats.
  - Raises a ValueError if a format is specified without a corresponding path, with a clear, dynamic error message.
- Improved Plot Saving Logic:
  - Updated logic allows saving plots in one format (e.g., only "png" or "svg") without requiring the other.
  - Simplified and more intuitive path handling for saving plots.
plot_3d_pdp
- Parameter Changes
  - Removed Parameters:
    - The parameters x_label_plotly, y_label_plotly, and z_label_plotly have been removed. These parameters previously allowed custom axis labels specifically for the Plotly plot, defaulting to the general x_label, y_label, and z_label. Removing these parameters simplifies the function signature while maintaining flexibility.
  - Default Values for Labels:
    - The parameters x_label, y_label, and z_label are now optional, with None as the default. If not provided, these labels will automatically default to the names of the features in the feature_names_list. This change makes the function more user-friendly, particularly for cases where default labels are sufficient.
  - Changes in Default Values for View Angles:
    - The default values for camera positioning parameters have been updated: horizontal is now -1.25, depth is now 1.25, and vertical is now 1.25. These adjustments refine the default 3D view perspective for the Plotly plot, providing a more intuitive starting view.
- Plot Generation Logic
  - Conditionally Checking Labels:
    - The function now checks whether x_label, y_label, and z_label are provided. If these are None, the function will automatically assign default labels based on the feature_names_list. This enhancement reduces the need for users to manually specify labels, making the function more adaptive.
  - Camera Position Adjustments:
    - The camera positions for the Plotly plot are now adjusted by multiplying horizontal, depth, and vertical by zoom_out_factor. This change allows for more granular control over the 3D view, enhancing the interactivity and flexibility of the Plotly visualizations.
  - Surface Plot Coordinates Adjustments:
    - The order of the coordinates for the Plotly plot’s surface has been changed from ZZ, XX, YY[::-1] to ZZ, XX, YY. This adjustment ensures the proper alignment of axes and grids, resulting in more accurate visual representations.
- Code Simplifications
  - Removed Complexity:
    - By removing the x_label_plotly, y_label_plotly, and z_label_plotly parameters, the code is now simpler and easier to maintain. This change reduces potential confusion and streamlines the function for users who do not need distinct labels for Matplotlib and Plotly plots.
  - Fallback Mechanism for Grid Values:
    - The function continues to implement a fallback mechanism when extracting grid values, ensuring compatibility with various versions of scikit-learn. This makes the function robust across different environments.
- Style Adjustments
  - Label Formatting:
    - The new version consistently uses y_label, x_label, and z_label for axis labels in the Matplotlib plot, aligning the formatting across different plot types.
  - Color Bar Adjustments:
    - The color bar configuration in the Matplotlib plot has been slightly adjusted with a shrink value of 0.6 and a pad value of 0.02. These adjustments result in a more refined visual appearance, particularly in cases where space is limited.
- Potential Use Case Differences
  - Simplified Interface:
    - The updated function is more streamlined for users who prefer a simplified interface without the need for separate label customizations for Plotly and Matplotlib plots. This makes it easier to use in common scenarios.
  - Less Granular Control:
    - Users who need more granular control, particularly for presentations or specific formatting, may find the older version more suitable. The removal of the *_plotly label parameters means that all plots now use the same labels across Matplotlib and Plotly.
- Matplotlib Plot Adjustments
  - Wireframe and Surface Plot Enhancements:
    - The logic for plotting wireframes and surface plots in Matplotlib remains consistent with previous versions, with subtle enhancements to color and layout management to improve overall aesthetics.
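The grid-value fallback mentioned under Code Simplifications can be illustrated with a small sketch. scikit-learn 1.3 renamed the "values" key returned by partial_dependence to "grid_values", so a version-tolerant lookup might try the newer key first. The helper name below is hypothetical, and the toolkit's actual fallback may differ.

```python
def extract_grid_values(pdp_results):
    """Return the feature grids from a partial_dependence result.

    Works with any mapping-like result; scikit-learn returns a Bunch,
    which supports key access.
    """
    try:
        # Key used by newer scikit-learn releases.
        return pdp_results["grid_values"]
    except KeyError:
        # Key used by older scikit-learn releases.
        return pdp_results["values"]
```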
Summary
- Version 0.0.8 of the plot_3d_pdp function introduces simplifications that reduce the number of parameters and streamline the plotting process. While some customizability has been removed, the function remains flexible enough for most use cases and is easier to use.
- Key updates include adjusted default camera views for 3D plots, removal of Plotly-specific label parameters, and improved automatic labeling and plotting logic.
EDA Toolkit 0.0.8d
Version 0.0.8d
This update introduces several key changes to the plot_3d_pdp function and minor changes to the stacked_crosstab_plot function, simplifying the function's interface and improving usability, while maintaining the flexibility needed for diverse visualization needs.
stacked_crosstab_plot
- Flexible save_formats Input:
  - save_formats now accepts a string, tuple, or list for specifying formats (e.g., "png", ("png", "svg"), or ["png", "svg"]).
  - Single strings or tuples are automatically converted to lists for consistent processing.
- Dynamic Error Handling:
  - Added checks to ensure a valid path is provided for each format in save_formats.
  - Raises a ValueError if a format is specified without a corresponding path, with a clear, dynamic error message.
- Improved Plot Saving Logic:
  - Updated logic allows saving plots in one format (e.g., only "png" or "svg") without requiring the other.
  - Simplified and more intuitive path handling for saving plots.
plot_3d_pdp
- Parameter Changes
  - Removed Parameters:
    - The parameters x_label_plotly, y_label_plotly, and z_label_plotly have been removed. These parameters previously allowed custom axis labels specifically for the Plotly plot, defaulting to the general x_label, y_label, and z_label. Removing these parameters simplifies the function signature while maintaining flexibility.
  - Default Values for Labels:
    - The parameters x_label, y_label, and z_label are now optional, with None as the default. If not provided, these labels will automatically default to the names of the features in the feature_names_list. This change makes the function more user-friendly, particularly for cases where default labels are sufficient.
  - Changes in Default Values for View Angles:
    - The default values for camera positioning parameters have been updated: horizontal is now -1.25, depth is now 1.25, and vertical is now 1.25. These adjustments refine the default 3D view perspective for the Plotly plot, providing a more intuitive starting view.
- Plot Generation Logic
  - Conditionally Checking Labels:
    - The function now checks whether x_label, y_label, and z_label are provided. If these are None, the function will automatically assign default labels based on the feature_names_list. This enhancement reduces the need for users to manually specify labels, making the function more adaptive.
  - Camera Position Adjustments:
    - The camera positions for the Plotly plot are now adjusted by multiplying horizontal, depth, and vertical by zoom_out_factor. This change allows for more granular control over the 3D view, enhancing the interactivity and flexibility of the Plotly visualizations.
  - Surface Plot Coordinates Adjustments:
    - The order of the coordinates for the Plotly plot’s surface has been changed from ZZ, XX, YY[::-1] to ZZ, XX, YY. This adjustment ensures the proper alignment of axes and grids, resulting in more accurate visual representations.
- Code Simplifications
  - Removed Complexity:
    - By removing the x_label_plotly, y_label_plotly, and z_label_plotly parameters, the code is now simpler and easier to maintain. This change reduces potential confusion and streamlines the function for users who do not need distinct labels for Matplotlib and Plotly plots.
  - Fallback Mechanism for Grid Values:
    - The function continues to implement a fallback mechanism when extracting grid values, ensuring compatibility with various versions of scikit-learn. This makes the function robust across different environments.
- Style Adjustments
  - Label Formatting:
    - The new version consistently uses y_label, x_label, and z_label for axis labels in the Matplotlib plot, aligning the formatting across different plot types.
  - Color Bar Adjustments:
    - The color bar configuration in the Matplotlib plot has been slightly adjusted with a shrink value of 0.6 and a pad value of 0.02. These adjustments result in a more refined visual appearance, particularly in cases where space is limited.
- Potential Use Case Differences
  - Simplified Interface:
    - The updated function is more streamlined for users who prefer a simplified interface without the need for separate label customizations for Plotly and Matplotlib plots. This makes it easier to use in common scenarios.
  - Less Granular Control:
    - Users who need more granular control, particularly for presentations or specific formatting, may find the older version more suitable. The removal of the *_plotly label parameters means that all plots now use the same labels across Matplotlib and Plotly.
- Matplotlib Plot Adjustments
  - Wireframe and Surface Plot Enhancements:
    - The logic for plotting wireframes and surface plots in Matplotlib remains consistent with previous versions, with subtle enhancements to color and layout management to improve overall aesthetics.
Summary
- Version 0.0.8d of the plot_3d_pdp function introduces simplifications that reduce the number of parameters and streamline the plotting process. While some customizability has been removed, the function remains flexible enough for most use cases and is easier to use.
- Key updates include adjusted default camera views for 3D plots, removal of Plotly-specific label parameters, and improved automatic labeling and plotting logic.
EDA Toolkit 0.0.8c
EDA Toolkit 0.0.8c:
Summary of Changes:
1. New Features & Enhancements:
- plot_3d_pdp Function:
  - Added show_modebar Parameter: Introduced a new boolean parameter, show_modebar, to allow users to toggle the visibility of the mode bar in Plotly interactive plots.
  - Custom Margins and Layout Adjustments:
    - Added parameters for left_margin, right_margin, and top_margin to provide users with more control over the plot layout in Plotly.
    - Adjusted default values and added options for better customization of the Plotly color bar (cbar_x, cbar_thickness) and title positioning (title_x, title_y).
  - Plotly Configuration:
    - Enhanced the configuration options to allow users to enable or disable zoom functionality (enable_zoom) in the interactive Plotly plots.
    - Updated the code to reflect these new parameters, allowing for greater flexibility in the appearance and interaction with the Plotly plots.
  - Error Handling:
    - Added input validation for html_file_path and html_file_name to ensure these are provided when necessary based on the selected plot_type.
- plot_2d_pdp Function:
  - Introduced file_prefix Parameter:
    - Added a new file_prefix parameter to allow users to specify a prefix for filenames when saving grid plots. This change streamlines the naming process for saved plots and improves file organization.
  - Enhanced Plot Type Flexibility:
    - The plot_type parameter now includes an option to generate both grid and individual plots (both). This feature allows users to create a combination of both layout styles in one function call.
    - Updated input validation and logic to handle this new option effectively.
  - Added save_plots Parameter:
    - Introduced a new parameter, save_plots, to control the saving of plots. Users can specify whether to save all plots, only individual plots, only grid plots, or none.
  - Custom Margins and Layout Adjustments:
    - Included the save_plots parameter in the validation process to ensure paths are provided when needed for saving the plots.
2. Documentation Updates:
- Docstrings:
  - Updated docstrings for both functions to reflect the new parameters and enhancements, providing clearer and more comprehensive guidance for users.
  - Detailed the use of new parameters such as show_modebar, file_prefix, save_plots, and others, ensuring that the function documentation is up-to-date with the latest changes.
3. Refactoring & Code Cleanup:
- Code Structure:
  - Improved the code structure to maintain clarity and readability, particularly around the new functionality.
  - Consolidated the layout configuration settings for the Plotly plots into a more flexible and user-friendly format, making it easier for users to customize their plots.
This version enhances the usability of the plot_3d_pdp and plot_2d_pdp functions, introduces new features for greater flexibility in plot customization, and ensures that the functions are well-documented and easy to use. The updates are backward-compatible and aim to provide a more seamless user experience in generating and saving both 3D and 2D partial dependence plots.
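One way the show_modebar and enable_zoom toggles could translate into Plotly settings is through the config dictionary accepted by fig.show() and fig.write_html(). The sketch below is an assumption about that mapping, using Plotly's displayModeBar and scrollZoom config keys; the function name is hypothetical and the toolkit may wire these options differently.

```python
def build_plotly_config(show_modebar=True, enable_zoom=True):
    """Build a Plotly config dict from two boolean toggles."""
    for name, value in (("show_modebar", show_modebar),
                        ("enable_zoom", enable_zoom)):
        if not isinstance(value, bool):
            raise ValueError(f"{name} must be a boolean, got {value!r}.")
    return {
        "displayModeBar": show_modebar,  # toolbar shown with the figure
        "scrollZoom": enable_zoom,       # zooming with the mouse wheel
    }
```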
EDA Toolkit 0.0.8b
Version 0.0.8b Release Notes
We are excited to announce the release of version 0.0.8b, which introduces significant enhancements and new features to improve the usability and functionality of our toolkit.
New Features:
- Optional file_prefix in stacked_crosstab_plot Function
  - The stacked_crosstab_plot function has been updated to make the file_prefix argument optional. If the user does not provide a file_prefix, the function will now automatically generate a default prefix based on the col and func_col parameters. This change streamlines the process of generating plots by reducing the number of required arguments.
  - Key Improvement:
    - Users can now omit the file_prefix argument, and the function will still produce appropriately named plot files, enhancing ease of use.
    - Backward compatibility is maintained, allowing users who prefer to specify a custom file_prefix to continue doing so without any issues.
- Introduction of 3D and 2D Partial Dependence Plot Functions
  - Two new functions, plot_3d_pdp and plot_2d_pdp, have been added to the toolkit, expanding the visualization capabilities for machine learning models.
    - plot_3d_pdp: Generates 3D partial dependence plots for two features, supporting both static visualizations (using Matplotlib) and interactive plots (using Plotly). The function offers extensive customization options, including labels, color maps, and saving formats.
    - plot_2d_pdp: Creates 2D partial dependence plots for specified features with flexible layout options (grid or individual plots) and customization of figure size, font size, and saving formats.
  - Key Features:
    - Compatibility: Both functions are compatible with various versions of scikit-learn, ensuring broad usability.
    - Customization: Extensive options for customizing visual elements, including figure size, font size, and color maps.
    - Interactive 3D Plots: The plot_3d_pdp function supports interactive visualizations, providing an enhanced user experience for exploring model predictions in 3D space.
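The default-prefix behavior for stacked_crosstab_plot could be sketched as below. The notes only state that the prefix is derived from col and func_col, so the exact "{col}_{func_col}" pattern and the helper name are assumptions.

```python
def resolve_file_prefix(col, func_col, file_prefix=None):
    """Use the caller's prefix if given, otherwise derive one."""
    if file_prefix is not None:
        return file_prefix
    # Assumed naming pattern; the real default may be formatted differently.
    return f"{col}_{func_col}"
```

Keeping file_prefix as a keyword with a None default is what lets existing calls that pass a custom prefix keep working unchanged.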
Impact:
- These updates improve the user experience by reducing the complexity of function calls and introducing powerful new tools for model interpretation.
- The optional file_prefix enhancement simplifies plot generation while maintaining the flexibility to define custom filenames.
- The new partial dependence plot functions offer robust visualization options, making it easier to analyze and interpret the influence of specific features in machine learning models.
We encourage users to explore these new features and provide feedback on their experience. As always, we remain committed to continuous improvement and welcome suggestions for future updates.
EDA Toolkit 0.0.8a
Version 0.0.8 Release Notes
We are excited to announce the release of version 0.0.8, which introduces significant enhancements and new features to improve the usability and functionality of our toolkit.
New Features:
- Optional file_prefix in stacked_crosstab_plot Function
  - The stacked_crosstab_plot function has been updated to make the file_prefix argument optional. If the user does not provide a file_prefix, the function will now automatically generate a default prefix based on the col and func_col parameters. This change streamlines the process of generating plots by reducing the number of required arguments.
  - Key Improvement:
    - Users can now omit the file_prefix argument, and the function will still produce appropriately named plot files, enhancing ease of use.
    - Backward compatibility is maintained, allowing users who prefer to specify a custom file_prefix to continue doing so without any issues.
- Introduction of 3D and 2D Partial Dependence Plot Functions
  - Two new functions, plot_3d_pdp and plot_2d_pdp, have been added to the toolkit, expanding the visualization capabilities for machine learning models.
    - plot_3d_pdp: Generates 3D partial dependence plots for two features, supporting both static visualizations (using Matplotlib) and interactive plots (using Plotly). The function offers extensive customization options, including labels, color maps, and saving formats.
    - plot_2d_pdp: Creates 2D partial dependence plots for specified features with flexible layout options (grid or individual plots) and customization of figure size, font size, and saving formats.
  - Key Features:
    - Compatibility: Both functions are compatible with various versions of scikit-learn, ensuring broad usability.
    - Customization: Extensive options for customizing visual elements, including figure size, font size, and color maps.
    - Interactive 3D Plots: The plot_3d_pdp function supports interactive visualizations, providing an enhanced user experience for exploring model predictions in 3D space.
Impact:
- These updates improve the user experience by reducing the complexity of function calls and introducing powerful new tools for model interpretation.
- The optional file_prefix enhancement simplifies plot generation while maintaining the flexibility to define custom filenames.
- The new partial dependence plot functions offer robust visualization options, making it easier to analyze and interpret the influence of specific features in machine learning models.
We encourage users to explore these new features and provide feedback on their experience. As always, we remain committed to continuous improvement and welcome suggestions for future updates.
EDA Toolkit 0.0.7
Add flex_corr_matrix function for customizable correlation matrix visualization
This release introduces a new function, flex_corr_matrix, which allows users to generate both full and upper triangular correlation heatmaps with a high degree of customization. The function includes options to annotate the heatmap, save the plots, and pass additional parameters to seaborn.heatmap().
Summary of Changes:
- New Function: flex_corr_matrix
  - Functionality:
    - Generates a correlation heatmap for a given DataFrame.
    - Supports both full and upper triangular correlation matrices based on the triangular parameter.
    - Allows users to customize various aspects of the plot, including colormap, figure size, axis label rotation, and more.
    - Accepts additional keyword arguments via **kwargs to pass directly to seaborn.heatmap().
    - Includes validation to ensure the triangular, annot, and save_plots parameters are boolean values.
    - Raises an exception if save_plots=True but neither image_path_png nor image_path_svg is specified.
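The validation rules listed above can be sketched in plain Python. This is not the function's source: the helper name, the choice of ValueError, and the error messages are illustrative, and only the parameter names come from the notes.

```python
def validate_corr_matrix_args(triangular=True, annot=True, save_plots=False,
                              image_path_png=None, image_path_svg=None):
    """Validate flag types and make sure a save path exists when needed."""
    flags = {"triangular": triangular, "annot": annot, "save_plots": save_plots}
    for name, value in flags.items():
        if not isinstance(value, bool):
            raise ValueError(f"{name} must be a boolean value, got {value!r}.")

    # Saving is only possible when at least one output path is given.
    if save_plots and not (image_path_png or image_path_svg):
        raise ValueError(
            "save_plots=True requires image_path_png or image_path_svg."
        )
```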
Usage:
# Full correlation matrix example
flex_corr_matrix(df=my_dataframe, triangular=False, cmap="coolwarm", annot=True)
# Upper triangular correlation matrix example
flex_corr_matrix(df=my_dataframe, triangular=True, cmap="coolwarm", annot=True)
Contingency table df to object type
Convert all columns in the DataFrame to object type to prevent issues with numerical columns.
df = df.astype(str).fillna("")
EDA Toolkit 0.0.6
Add validation for plot_type parameter in kde_distributions function
This release adds a validation step for the plot_type parameter in the kde_distributions function. The allowed values for plot_type are "hist", "kde", and "both". If an invalid value is provided, the function will now raise a ValueError with a clear message indicating the accepted values. This change improves the robustness of the function and helps prevent potential errors due to incorrect parameter values.
# Validate plot_type parameter
valid_plot_types = ["hist", "kde", "both"]
if plot_type.lower() not in valid_plot_types:
    raise ValueError(
        f"Invalid plot_type value. Expected one of {valid_plot_types}, "
        f"got '{plot_type}' instead."
    )
EDA Toolkit 0.0.5
Ensure Consistent Font Size and Text Wrapping Across Plot Elements
Description
This PR addresses inconsistencies in font sizes and text wrapping across various plot elements in the stacked_crosstab_plot function. The following updates have been implemented to ensure uniformity and improve the readability of plots:
- Title Font Size and Text Wrapping:
  - Added a text_wrap parameter to control the wrapping of plot titles.
  - Ensured that title font sizes are consistent with axis label font sizes by explicitly setting the font size using ax.set_title() after plot generation.
- Legend Font Size Consistency:
  - Incorporated label_fontsize into the legend font size by directly setting the font size of the legend text using plt.setp(legend.get_texts(), fontsize=label_fontsize).
  - This ensures that the legend labels are consistent with the title and axis labels.
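Title wrapping of this kind is commonly done with Python's textwrap module. The sketch below assumes text_wrap is a maximum line width in characters, which the notes do not state explicitly, and uses a hypothetical helper name.

```python
import textwrap


def wrap_title(title, text_wrap=50):
    """Break a long plot title into lines of at most text_wrap characters."""
    return textwrap.fill(title, width=text_wrap)
```

The wrapped string could then be passed to ax.set_title(wrap_title(title, text_wrap), fontsize=label_fontsize) so the title shares the label font size.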
Testing
- Verified that titles now wrap correctly and match the specified label_fontsize.
- Confirmed that legend text scales according to label_fontsize, ensuring consistent font sizes across all plot elements.
Outcome
These changes improve the visual consistency of plots generated by the stacked_crosstab_plot function, making the plots more professional and easier to read. This PR should be reviewed and merged to standardize font sizing and text presentation across the codebase.