Archive for the 'C' Category

Version 0.2 of subdist module released

Just a quick note that I've released version 0.2 of my subdist module.
What is subdist?
subdist is a Python extension written in C that calculates fuzzy substring matches, based on Levenshtein distance.
subdist works purely with Unicode strings; calling one of its functions with a non-Unicode string will raise an error.
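To give a feel for what "fuzzy substring matching based on Levenshtein distance" means, here is a minimal pure-Python sketch. It is not subdist's code or API: the function name fuzzy_substring_distance is made up for illustration, and the approach shown (the standard Levenshtein dynamic program with a zeroed first row, so a match can start anywhere in the longer string) is an assumption about the general technique rather than a description of the module's internals.

```python
def fuzzy_substring_distance(needle, haystack):
    """Smallest Levenshtein distance between needle and any substring of haystack.

    Hypothetical illustration only; not the subdist API.
    """
    if not needle:
        return 0
    # Row 0 is all zeros: a match may begin at any position in haystack for free.
    prev = [0] * (len(haystack) + 1)
    for i, nc in enumerate(needle, start=1):
        # Column 0: aligning i needle characters with an empty substring costs i edits.
        curr = [i]
        for j, hc in enumerate(haystack, start=1):
            curr.append(min(
                prev[j] + 1,               # skip a needle character
                curr[j - 1] + 1,           # absorb an extra haystack character
                prev[j - 1] + (nc != hc),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    # Taking the minimum over the last row lets the match end anywhere in haystack.
    return min(prev)


if __name__ == "__main__":
    print(fuzzy_substring_distance("abd", "xxabcxx"))
```

The nested loops do work proportional to len(needle) * len(haystack), which is the kind of inner loop that tends to be slow in pure Python and is a natural candidate for a C extension.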
What's new in version 0.2?
Version 0.2 adds a get_score […]

The past, present, and future of optimization

I have a relative ("Dan") who earned a living optimizing code in the late '70s and early '80s. Around then, a newfangled high-level language named "C" was starting to catch on, but companies didn't like all the cycles wasted in C programs by the under-optimized assembly code that their C compilers were […]

Extending Python with C: A case study

Near-100x speedup with a C extension
I recently wrote about an algorithm for fuzzy matching of substrings implemented in Python. This is a feature that I needed for a piece of software I'm currently developing.
When I started using the fuzzy_substring function on some test cases, however, it was unacceptably slow. Using a modestly large test corpus […]