Releases: rapidfuzz/RapidFuzz
Release 3.5.0
Changed
- skip pandas pd.NA, similar to None
- add score_multiplier argument to process.cdist which allows multiplying the end result scores with a constant factor
- drop support for Python 3.7
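
A minimal pure-Python sketch of what a constant score multiplier does to a pairwise score matrix. This is not RapidFuzz's implementation: cdist_sketch and ratio are hypothetical names, and the scorer is a difflib stand-in. One plausible use, assumed here, is mapping normalized float scores onto a small integer range.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in normalized similarity in [0, 1] (difflib, not a RapidFuzz scorer)."""
    return SequenceMatcher(None, a, b).ratio()


def cdist_sketch(queries, choices, scorer=ratio, score_multiplier=1):
    """Hypothetical analogue of a pairwise score matrix with a constant multiplier."""
    return [
        [scorer(q, c) * score_multiplier for c in choices]
        for q in queries
    ]


queries = ["apple", "pear"]
choices = ["apple", "apply"]

plain = cdist_sketch(queries, choices)
scaled = cdist_sketch(queries, choices, score_multiplier=100)

# After scaling, rounding yields small integers instead of floats in [0, 1].
as_int = [[round(score) for score in row] for row in scaled]
print(as_int)
```

With a multiplier of 100 every entry is the unscaled score times 100, so the matrix can be rounded into integers between 0 and 100.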
Performance
- improve performance of simd implementation for LCS / Indel / Jaro / JaroWinkler
- improve performance of Jaro and Jaro Winkler for long sequences
- implement process.extract with limit=1 using process.extractOne which can be faster
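
A rough sketch of why the limit=1 case can be delegated to a best-match search: a top-1 query does not need to collect and sort every result. The functions below are hypothetical pure-Python analogues with a difflib stand-in scorer, not the library's code.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in similarity in [0, 100] (difflib, not a RapidFuzz scorer)."""
    return SequenceMatcher(None, a, b).ratio() * 100


def extract_sketch(query, choices, scorer=ratio, limit=5):
    """Score every choice, sort by score, and keep the best `limit` results."""
    scored = [(choice, scorer(query, choice), i) for i, choice in enumerate(choices)]
    scored.sort(key=lambda item: item[1], reverse=True)  # stable: ties keep input order
    return scored[:limit]


def extract_one_sketch(query, choices, scorer=ratio):
    """Single pass that only remembers the best result seen so far."""
    best = None
    for i, choice in enumerate(choices):
        score = scorer(query, choice)
        if best is None or score > best[1]:
            best = (choice, score, i)
    return best


choices = ["new york", "newark", "york", "new york city"]
top1_via_sort = extract_sketch("new york", choices, limit=1)
top1_direct = extract_one_sketch("new york", choices)
print(top1_via_sort, top1_direct)
```

Both paths return the same element, but the single pass avoids building and sorting the full result list.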
Fixed
- the preprocessing function was always called through Python due to a broken C-API version check
- fix wraparound issue in simd implementation of Jaro and Jaro Winkler
Release 3.4.0
Changed
- upgrade to Cython==3.0.3
- add simd implementation for Jaro and Jaro Winkler
Release 2.15.2
Since rapidfuzz v2.x is still widely used, Python 3.12 support is backported to rapidfuzz v2.x.
Added
- add Python 3.12 support
Release 3.3.1
Added
- add missing tag for Python 3.12 support
Release 3.3.0
Changed
- upgrade to Cython==3.0.2
- implement the remaining missing features from the C++ implementation in the pure Python implementation
Added
- added support for Python 3.12
Release 3.2.0
Changed
- build x86 with sse2/avx2 runtime detection
Release 3.1.2
Changed
- upgrade to Cython==3.0.0
Release 3.1.1
Changed
- upgrade to taskflow==3.6
Fixed
- replace usage of isnan with std::isnan which fixes the build on NetBSD
Release 3.1.0
Changed
- added keyword argument pad to Hamming distance. This controls whether sequences of different length should be padded or lead to a ValueError
- improve consistency of exception messages between the C++ and pure Python implementation
- upgrade required Cython version to Cython==3.0.0b3
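
The pad behaviour for Hamming distance can be illustrated with a small pure-Python sketch. hamming_distance is a hypothetical helper, not the library function; the default of pad=True and the exact error message are assumptions.

```python
def hamming_distance(s1, s2, pad=True):
    """Count differing positions; optionally treat the length difference as mismatches.

    Assumption: pad defaults to True. With pad=False, sequences of different
    length are rejected with a ValueError.
    """
    if len(s1) != len(s2) and not pad:
        raise ValueError("Sequences are not the same length.")
    # Positions present in both sequences are compared element by element.
    mismatches = sum(1 for a, b in zip(s1, s2) if a != b)
    # Every position that exists in only one sequence counts as one mismatch.
    return mismatches + abs(len(s1) - len(s2))


print(hamming_distance("karolin", "kathrin"))  # 3
print(hamming_distance("abc", "abcde"))        # 2

try:
    hamming_distance("abc", "abcde", pad=False)
except ValueError as exc:
    print("rejected:", exc)
```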
Fixed
- fix missing GIL restore when an exception is thrown inside process.cdist
- fix incorrect type hints for the process module
Release 3.0.0
Changed
- allow the usage of Hamming for different string lengths. Length differences are handled as insertions / deletions
- remove support for boolean preprocessor functions in rapidfuzz.fuzz and rapidfuzz.process. The processor argument is now always a callable or None.
- update defaults of the processor argument to be None everywhere. For affected functions this can change results, since strings are no longer preprocessed. To get back the old behaviour pass processor=utils.default_process to these functions. The following functions are affected by this:
  - process.extract, process.extract_iter, process.extractOne
  - fuzz.token_sort_ratio, fuzz.token_set_ratio, fuzz.token_ratio, fuzz.partial_token_sort_ratio, fuzz.partial_token_set_ratio, fuzz.partial_token_ratio, fuzz.WRatio, fuzz.QRatio
- rapidfuzz.process no longer calls scorers with processor=None. For this reason user provided scorers no longer require this argument.
- remove option to pass keyword arguments to scorer via **kwargs in rapidfuzz.process. They can be passed via a scorer_kwargs argument now. This ensures this does not break when extending function parameters and prevents naming clashes.
- remove rapidfuzz.string_metric module. Replacements for all functions are available in rapidfuzz.distance
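
The effect of the processor default and of the scorer_kwargs argument can be sketched in pure Python. Everything below is illustrative: simple_process is a rough ASCII-only stand-in for utils.default_process, ratio is a difflib-based scorer, weight is a made-up scorer option, and extract_one_sketch is a hypothetical analogue of a best-match search.

```python
import re
from difflib import SequenceMatcher


def simple_process(s):
    """Rough stand-in for a default processor: drop punctuation, trim, lowercase."""
    return re.sub(r"[^0-9a-zA-Z]+", " ", s).strip().lower()


def ratio(a, b, *, weight=1.0):
    """Stand-in scorer in [0, 100]; `weight` is a hypothetical scorer option."""
    return SequenceMatcher(None, a, b).ratio() * 100 * weight


def extract_one_sketch(query, choices, *, scorer=ratio, processor=None, scorer_kwargs=None):
    """Best match; scorer options travel in a dict, never as loose **kwargs."""
    scorer_kwargs = scorer_kwargs or {}
    # processor=None means the strings are compared exactly as given.
    proc = processor if processor is not None else (lambda s: s)
    processed_query = proc(query)
    best = None
    for i, choice in enumerate(choices):
        score = scorer(processed_query, proc(choice), **scorer_kwargs)
        if best is None or score > best[1]:
            best = (choice, score, i)
    return best


choices = ["New York!", "new jersey"]

raw = extract_one_sketch("NEW YORK", choices)
cleaned = extract_one_sketch("NEW YORK", choices, processor=simple_process)
weighted = extract_one_sketch(
    "NEW YORK", choices, processor=simple_process, scorer_kwargs={"weight": 0.5}
)
print(raw, cleaned, weighted)
```

Without a processor, case and punctuation differences lower the score; passing one restores the exact match. Keeping scorer options in a separate dict means a scorer option could share a name with a parameter of the search function without clashing.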
Added
- added support for arbitrary hashable sequence in the pure Python fallback implementation of all functions in rapidfuzz.distance
- added support for None and float("nan") in process.cdist as long as the underlying scorer supports it. This is the case for all scorers returning normalized results.
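
A small sketch of how a normalized scorer can tolerate missing values inside a pairwise matrix. The helper names are hypothetical, the scorer is a difflib stand-in, and scoring a missing value as 0.0 similarity is an assumption made for the illustration.

```python
import math
from difflib import SequenceMatcher


def is_missing(value):
    """True for None and for a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def normalized_similarity(a, b):
    """Stand-in scorer in [0, 1]; assumption: a missing input scores 0.0."""
    if is_missing(a) or is_missing(b):
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def cdist_sketch(queries, choices, scorer=normalized_similarity):
    """Hypothetical pairwise score matrix as a list of rows."""
    return [[scorer(q, c) for c in choices] for q in queries]


matrix = cdist_sketch(["apple", None], ["apple", float("nan")])
print(matrix)
```

Because a normalized result has a fixed worst value, every cell involving a missing entry can be filled without raising.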
Fixed
- fix division by zero in simd implementation of normalized metrics leading to incorrect results