Diffprivlib 0.3.0

@naoise-h naoise-h released this 26 Jun 06:19

This release of diffprivlib includes a number of new additions, as well as various fixes to existing functionality. Some changes break backward compatibility with previous versions of the library. This version of diffprivlib supports Python 3.5 through 3.8.

The updates are summarised as follows.

Added

  • BudgetAccountant class to keep track of the privacy budget spent in a script, with an accompanying notebook.
  • Budget class to allow easy comparison (with <, >, etc) between privacy budgets of the form (epsilon, delta).
  • count_nonzero, sum and nansum functions to calculate a differentially private count and sum on an array or list.
  • GaussianDiscrete mechanism, the discrete analogue to the Gaussian mechanism.
  • clip_to_bounds and clip_to_norm to clip input data to the given bounds/norm; used in tools and models as appropriate.
  • Notebook demonstrating data exploration and visualisation capabilities.
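
The (epsilon, delta) comparison mentioned for the Budget class can be illustrated with a small stand-alone sketch. This is not the library's implementation: `SketchBudget` is a hypothetical name, and the component-wise ordering (one budget is smaller only if neither epsilon nor delta is larger) is an assumption about how such a comparison might reasonably behave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBudget:
    """Hypothetical stand-in for an (epsilon, delta) privacy budget."""
    epsilon: float
    delta: float

    # Assumed semantics: comparisons are component-wise, so two budgets
    # can be incomparable (neither < nor > holds).
    def __le__(self, other):
        return self.epsilon <= other.epsilon and self.delta <= other.delta

    def __lt__(self, other):
        return self <= other and self != other

    def __ge__(self, other):
        return other <= self

    def __gt__(self, other):
        return other < self


small = SketchBudget(0.5, 0.0)
medium = SketchBudget(1.0, 1e-5)
pure = SketchBudget(2.0, 0.0)

print(small < medium)               # smaller in both components
print(medium < pure, medium > pure)  # incomparable under this ordering
```

Under this assumed ordering, a budget with a smaller epsilon but a larger delta is neither less than nor greater than the other, which is why a plain tuple comparison would not be a good substitute.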

Changed

Breaking:

  • The form/syntax of the bounds parameter passed to tools and models has changed; it is now specified as a tuple of the form (min, max). min and max can be scalars or 1-dimensional arrays.
    Bounds can typically be converted to the new form with new_bounds = ([l for l, _ in bounds], [u for _, u in bounds]).
  • All functions (other than histogram functions) that previously required a range parameter now require bounds instead (e.g. models.LinearRegression, models.StandardScaler, tools.mean).
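
The conversion mentioned above can be wrapped in a small helper. This is a minimal sketch that assumes the old bounds were a list of per-feature (min, max) pairs; `convert_bounds` is a hypothetical helper name, not part of diffprivlib.

```python
def convert_bounds(old_bounds):
    """Turn [(min_0, max_0), (min_1, max_1), ...] into
    ([min_0, min_1, ...], [max_0, max_1, ...])."""
    lower = [lo for lo, _ in old_bounds]
    upper = [hi for _, hi in old_bounds]
    return (lower, upper)


# Assumed old-style bounds: one (min, max) pair per feature.
old_bounds = [(0, 10), (-1, 1), (18, 100)]
new_bounds = convert_bounds(old_bounds)
print(new_bounds)  # ([0, -1, 18], [10, 1, 100])
```

If every feature shares the same range, the new form also allows plain scalars, for example `(0, 1)`.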

Non-breaking:

  • Diffprivlib now requires scikit-learn version 0.22 or later.
  • The Geometric mechanism now has a default of sensitivity=1. This reflects the typical use of the geometric mechanism on count queries, which have sensitivity 1.
  • All mechanisms now support zero sensitivity.
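
To make the sensitivity defaults above concrete, here is an illustrative pure-Python sketch of a geometric mechanism applied to a count. It is not diffprivlib's code: `geometric_mechanism_sketch` is a hypothetical function, the noise is drawn as the difference of two geometric variables, and the floating-point sampling used here is for illustration only and should not be relied on for real privacy guarantees.

```python
import math
import random


def geometric_mechanism_sketch(value, epsilon, sensitivity=1, rng=random):
    """Add two-sided geometric noise to an integer value (illustration only)."""
    if sensitivity == 0:
        # With zero sensitivity the query output does not depend on any
        # individual's data, so no noise is needed.
        return value

    alpha = math.exp(-epsilon / sensitivity)

    def failures_before_success():
        # Inverse-CDF sample with P(G >= k) = alpha ** k.
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return math.floor(math.log(u) / math.log(alpha))

    # The difference of two i.i.d. geometric variables is two-sided geometric.
    return value + failures_before_success() - failures_before_success()


random.seed(0)
noisy_count = geometric_mechanism_sketch(10, epsilon=1.0)  # sensitivity defaults to 1
exact_count = geometric_mechanism_sketch(10, epsilon=1.0, sensitivity=0)
print(noisy_count, exact_count)
```

The default of 1 matches a count query, where adding or removing one record changes the result by at most 1.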

Fixed

  • The publicly exposed class counts in models.GaussianNB now satisfy differential privacy. The class_count_ attribute is therefore noisy, and care must be taken when relying on these values for testing or other purposes.
  • The mean, std and var tools no longer require numpy array inputs and accept any array-like input (e.g. scalars, lists and tuples).
  • Sensitivity calculation when randomising scalar-valued var output.