Diffprivlib 0.3.0
This release of diffprivlib includes a number of new additions, as well as various fixes to existing functionality. Some changes break backward compatibility with previous versions of the library. This version of diffprivlib supports Python 3.5 through 3.8.
The updates are summarised as follows.
Added
- BudgetAccountant class to keep track of privacy budget spent in a script (and associated notebook).
- Budget class to allow easy comparison (with <, >, etc.) between privacy budgets of the form (epsilon, delta).
- count_nonzero, sum and nansum functions to calculate a differentially private count and sum on an array or list.
- GaussianDiscrete mechanism, the discrete analogue to the Gaussian mechanism.
- clip_to_bounds and clip_to_norm to clip input data to the given bounds/norm; used in tools and models as appropriate.
- Notebook demonstrating data exploration and visualisation capabilities.
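To illustrate what comparing budgets of the form (epsilon, delta) can mean, here is a minimal sketch in plain Python. It is not diffprivlib's Budget class: the name SketchBudget is hypothetical, and the componentwise ordering (one budget is smaller only if neither epsilon nor delta is larger) is an assumption made for this example.

```python
class SketchBudget(tuple):
    """Hypothetical (epsilon, delta) pair with componentwise comparisons."""

    def __new__(cls, epsilon, delta):
        if epsilon < 0:
            raise ValueError("epsilon must be non-negative")
        if not 0 <= delta <= 1:
            raise ValueError("delta must be in [0, 1]")
        return super().__new__(cls, (epsilon, delta))

    def __le__(self, other):
        # Smaller-or-equal only if BOTH components are smaller-or-equal.
        return self[0] <= other[0] and self[1] <= other[1]

    def __ge__(self, other):
        return self[0] >= other[0] and self[1] >= other[1]

    def __lt__(self, other):
        return self <= other and tuple(self) != tuple(other)

    def __gt__(self, other):
        return self >= other and tuple(self) != tuple(other)


small = SketchBudget(0.5, 0.0)
large = SketchBudget(1.0, 1e-5)
mixed = SketchBudget(2.0, 0.0)  # larger epsilon, smaller delta than `large`

print(small < large)                 # True
print(mixed < large, mixed > large)  # False False
```

Under this assumed ordering, some pairs such as mixed and large are simply incomparable, so a check like "is the requested spend within the remaining budget" fails unless both components fit.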
Changed
Breaking:
- The form/syntax of the bounds parameter passed to tools and models has changed; it is now specified as a tuple of the form (min, max). min and max can be scalars or 1-dimensional arrays.
  Bounds can typically be converted to the new form with new_bounds = ([l for l, _ in bounds], [u for _, u in bounds]).
- All functions (other than histogram functions) that previously required a range parameter now require bounds instead (e.g. models.LinearRegression, models.StandardScaler, tools.mean, etc.).
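The bounds conversion quoted above can be tried on a small example. This sketch assumes the old form was a list with one (lower, upper) pair per feature, which is what the comprehension implies; the helper name convert_bounds and the sample values are made up for illustration.

```python
def convert_bounds(old_bounds):
    """Turn per-feature (lower, upper) pairs into a (lowers, uppers) tuple."""
    return ([l for l, _ in old_bounds], [u for _, u in old_bounds])


# Hypothetical old-style bounds for three features.
old_bounds = [(0, 10), (-5, 5), (18, 65)]
new_bounds = convert_bounds(old_bounds)

print(new_bounds)  # ([0, -5, 18], [10, 5, 65])
```

When every feature shares the same limits, the scalar form (for example (0, 10)) is the shorter way to write the new-style bounds.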
Non-breaking:
- Diffprivlib now requires scikit-learn version 0.22 or later.
- Geometric mechanism now has default sensitivity=1. This reflects the typical use of the geometric mechanism on count queries with sensitivity 1.
- All mechanisms now support zero sensitivity.
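As a rough illustration of the two points above, the following sketch adds two-sided geometric noise to an integer count using only the standard library. It is not diffprivlib's implementation: the function name geometric_noise_sketch is hypothetical, returning the value unchanged for zero sensitivity is an assumption about what supporting zero sensitivity would mean, and the random module is not suitable for real privacy guarantees.

```python
import math
import random


def geometric_noise_sketch(value, epsilon, sensitivity=1, rng=None):
    """Add two-sided geometric noise to an integer value (illustrative only)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity < 0:
        raise ValueError("sensitivity must be non-negative")
    if sensitivity == 0:
        # Assumption: a query whose answer cannot change needs no noise.
        return value

    rng = rng or random.Random()
    scale = epsilon / sensitivity

    def one_sided():
        # Inverse-transform sample with P(G >= k) = exp(-scale * k).
        u = 1.0 - rng.random()  # uniform on (0, 1]
        return math.floor(-math.log(u) / scale)

    # The difference of two such draws follows a two-sided geometric law.
    return value + one_sided() - one_sided()


noisy_count = geometric_noise_sketch(42, epsilon=1.0)  # sensitivity defaults to 1
exact_count = geometric_noise_sketch(42, epsilon=1.0, sensitivity=0)
print(noisy_count, exact_count)
```

The default sensitivity=1 matches a count query, where adding or removing one record changes the answer by at most 1.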
Fixed
- The publicly-exposed class counts in models.GaussianNB now satisfy differential privacy. The class_count_ attribute is therefore noisy, and care must be taken when relying on these values for testing or other purposes.
- mean, std and var tools no longer require numpy array inputs, and can take all array-like inputs (e.g. scalars, lists and tuples).
- Sensitivity calculation when randomising scalar-valued var output.