subdist is a fast Python module for finding fuzzy substring matches. It uses a modified version of the Levenshtein distance algorithm (described here).
- Initial release
- Added get_score function, which returns a score between 0.0 and 1.0 based on the edit distance
- Improved docstrings, refactored the code, and slightly sped up the distance function.
Get the Levenshtein (edit) distance of a substring
    import subdist

    needle = u"short string"
    haystack = u"This is a long string"
    distance = subdist.substring(needle, haystack)
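The exact modification subdist makes to the Levenshtein algorithm is not spelled out here. One well-known way to adapt it to substring search (often attributed to Sellers) is to let a match start and end anywhere in the haystack: the first row of the table is all zeros, and the result is the minimum of the last row. The pure-Python sketch below illustrates that idea only. `substring_distance` is a hypothetical name, not part of subdist, and subdist's own implementation may differ.

```python
def substring_distance(needle, haystack):
    """Smallest edit distance between needle and any substring of haystack.

    Illustrative sketch only; not subdist's actual code.
    """
    # Row 0 is all zeros: a match may begin at any haystack position for free.
    previous = [0] * (len(haystack) + 1)
    for i, n_char in enumerate(needle, start=1):
        # Matching the first i needle characters against nothing costs i.
        current = [i]
        for j, h_char in enumerate(haystack, start=1):
            cost = 0 if n_char == h_char else 1
            current.append(min(
                previous[j] + 1,         # drop n_char from the needle
                current[j - 1] + 1,      # absorb an extra haystack character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    # A match may end anywhere, so take the best value in the last row.
    return min(previous)


print(substring_distance(u"long", u"This is a long string"))  # 0, exact substring
print(substring_distance(u"lang", u"This is a long string"))  # 1, one substitution
```

This version keeps only two rows of the table, so memory grows with the haystack length rather than with the product of both lengths.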
Get the fuzzy match score (0.0 to 1.0) of a substring
    import subdist

    needle = u"short string"
    haystack = u"This is a long string"
    score = subdist.get_score(needle, haystack)
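The text above says only that the score lies between 0.0 and 1.0 and is based on the edit distance. It does not give the formula. A plausible normalization, shown here purely as an assumption, divides the distance by the needle length, since a substring distance can never exceed the number of characters in the needle. `score_from_distance` is a hypothetical helper, not a subdist function.

```python
def score_from_distance(distance, needle_length):
    """Map an edit distance to a score in [0.0, 1.0]; 1.0 means an exact match.

    Assumed formula for illustration; subdist.get_score may compute it differently.
    """
    if needle_length == 0:
        # An empty needle trivially matches anywhere.
        return 1.0
    # Clamp so that out-of-range inputs still yield a score in [0.0, 1.0].
    clamped = min(max(distance, 0), needle_length)
    return 1.0 - clamped / float(needle_length)


print(score_from_distance(0, 12))   # 1.0
print(score_from_distance(3, 12))   # 0.75
print(score_from_distance(12, 12))  # 0.0
```

Under this assumption, a higher score means fewer edits relative to the needle's length, so scores for needles of different lengths are comparable.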
- installer package (subdist-0.2.1.tar.gz)
- Win32 binary installer (Python 2.5) (subdist-0.2.1.win32-py2.5.exe)
- Win32 binary installer (Python 2.4) (subdist-0.2.1.win32-py2.4.exe)