Archive for the 'programming' Category

Stupid gender stereotyping and questionable coding practices 101

The WSJ blog, of all places, has an article about how men and women code (and how women do it better).
Apparently, "testosterone-fueled" men tend to write cryptic code without comments, while "touchy-feely and considerate" women tend to write lots of helpful comments and are just all-around nice people.
If you're going to make sweeping generalizations about […]

Programming as craft

Paul Graham famously compared programming to painting, claiming that programming in its highest form (i.e. what he calls "hacking") is equivalent to art.
This seems to resonate with a lot of programmers, and not just because it makes us feel better about ourselves to believe that we're really creating art when we code our 83rd login […]

The mysterious “ImportError: cannot import name cache”

The scenario: I'm packaging a CherryPy server with py2exe, using Mako as my template engine. So I create my exe file, and fire up the app, and get this 500 Mako error:

Traceback (most recent call last):
  File "cherrypy\_cprequest.pyo", line 551, in respond
  File "cherrypy\_cpdispatch.pyo", line 24, in __call__
  File "main.py", line 210, in index
  File "mako\lookup.pyo", line 70, in […]

“Instant” translation service?

I recently came across an intriguing model for translation services called ICanLocalize. It's similar to what a lot of translation agencies do, but with a couple of twists:

An "instant" translation service with 15-minute (!) turnaround.
Website translation with built-in translation memory.
Translators work from a Web browser using a tool that the service provides.

Good idea for small […]

Counting words (etc.) in an HTML file with Python

In a previous post, I wrote about how to count words, characters, and Asian characters using Python.
In this post I want to pull that together with code to get a word count from an HTML file.
What needs counting
What needs counting depends to some extent on what you need the word count for, but here I'm […]
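
One way the pieces might fit together is sketched below, in Python 3 using only the standard library's html.parser. The names TextExtractor and count_words_in_html are hypothetical, and skipping the contents of script and style tags is an assumption about what needs counting, not necessarily the approach taken in the post.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self.chunks.append(data)


def count_words_in_html(html):
    """Return the number of whitespace-separated words in the visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps text from adjacent elements apart.
    return len(" ".join(parser.chunks).split())
```

Splitting on whitespace only makes sense for space-delimited languages, and joining chunks with a space will split a word that has inline markup in the middle of it, so treat the result as an approximation.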

Speeding up search on Honyaku archive site

Last summer, I launched a new archive site for the Honyaku mailing list.
The site is written in Python using the Django framework, with MySQL as the database. I chose MySQL because my tests showed that it was much faster than PostgreSQL at text searching.
Lately, however, the searches have been taking a huge amount of time. […]

What price elegance?

In a recent post, I gave some code for counting the top n most frequent words in an arbitrary text file using itertools.groupby.
The code is written in a somewhat functional style. It's short and, dare I say, kind of elegant. But it turns out that this code is quite a bit slower than an imperative […]
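
For comparison, here is a rough sketch of the two styles side by side, in Python 3. Neither function is the code from the earlier post: the names top_words_groupby and top_words_loop are made up for illustration, and lowercasing then splitting on whitespace is an assumed tokenization.

```python
from itertools import groupby


def top_words_groupby(text, n):
    """Functional style: sort the words, then count each run with groupby."""
    words = sorted(text.lower().split())
    counts = [(word, sum(1 for _ in group)) for word, group in groupby(words)]
    # Sort by descending count, breaking ties alphabetically.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))[:n]


def top_words_loop(text, n):
    """Imperative style: tally the words in a dict, one at a time."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    # Same ordering rule as above so the two results are comparable.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:n]
```

The groupby version has to sort every word before it can count anything, while the loop makes a single pass over the words. That is one plausible source of a speed gap, but timing both with timeit on your own data is the only way to know.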

Counting occurrences in a sequence with itertools.groupby

itertools.groupby is a great tool for counting the number of occurrences of each item in a sequence.
Here are some examples from the interactive interpreter.
A list of numbers

>>> # Create a random list of numbers
>>> from random import random
>>> numbers = [int(random() * 10) for x in range(20)]
>>> numbers
[8, 0, 3, 2, 3, 9, 8, 2, 8, 3, 0, […]
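
One thing to watch for when counting this way: groupby only merges adjacent equal items, so the input normally has to be sorted first. The short Python 3 sketch below uses a made-up list, not the one from the session above, to show the difference.

```python
from itertools import groupby

# A small fixed list chosen for illustration.
numbers = [3, 1, 3, 2, 1, 3]

# Sorted input: each distinct value forms exactly one group.
counts = {key: len(list(group)) for key, group in groupby(sorted(numbers))}

# Unsorted input: a value that reappears later starts a new group.
runs = [(key, len(list(group))) for key, group in groupby([1, 1, 2, 1])]
```

Here counts maps each number to its total, while runs reports the value 1 twice, once for each separate run.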

Making the robot dance

Some time around 1980, my elementary school classroom got a computer. While most of the other kids fooled around playing Hunt the Wumpus, my friend and I found the BASIC manual that came with the computer. We laboriously copied in the code to make a "robot" appear on the screen. After a lot of typos, […]

Using chardet to convert arbitrary byte strings to Unicode

chardet is a fantastic module for finding the encoding of arbitrary byte strings. You can combine this with a check for a BOM to pretty reliably turn them into Unicode.
Edit: Thanks to Kirit's comment below, I added code to check for UTF-32.

import chardet
def bytes2unicode(bytes, errors='replace'):
    """Convert a byte string into Unicode.
    First checks […]
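
The BOM-checking half of that idea might look roughly like the sketch below (Python 3, standard library only). This is not the post's bytes2unicode: decode_by_bom and its fallback parameter are hypothetical names, and where this sketch falls back to a fixed encoding, the approach described above would ask chardet for a guess instead, for example via chardet.detect(data)['encoding'].

```python
import codecs

# UTF-32 BOMs must be tested before UTF-16: the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_by_bom(data, fallback="utf-8", errors="replace"):
    """Decode bytes using a leading BOM if present, else the fallback encoding."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Strip the BOM so it does not end up in the decoded text.
            return data[len(bom):].decode(encoding, errors)
    # No BOM found: assumed fallback; a detector's guess would go here.
    return data.decode(fallback, errors)
```

The ordering of the table is the important part: checking UTF-16 first would misread UTF-32 LE input as UTF-16 LE.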
