wchartype: Python module to get full-width (double-byte) character types

When dealing with full-width (especially CJK) characters, you'll often want to know the type of a particular character: Kanji/Hanzi/Hanja, hiragana, katakana, and so on.

wchartype is a Python module that will determine the type of a full-width character. The functions all expect Unicode strings of length one.


import wchartype

# U+65E5 is the ideograph 日
if wchartype.is_asian(u'\u65e5'):
    print(u"\u65e5 is an Asian character")

# The functions take one character at a time, so test each character in turn
sentence = u"this has ひらがな"
if any(wchartype.is_hiragana(x) for x in sentence):
    print(u"'%s' has one or more hiragana characters" % sentence)
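
To give a feel for what a character-type check involves, here is a minimal sketch that uses only the standard library. It is not wchartype's implementation: the names classify_char, is_wide and BLOCKS are hypothetical, and the block table is deliberately incomplete (it covers only the base CJK Unified Ideographs block, for instance).

```python
import unicodedata

# Hypothetical table, not taken from wchartype: (first, last, label)
# code point ranges for a few well-known Unicode blocks.
BLOCKS = [
    (0x3040, 0x309F, "hiragana"),
    (0x30A0, 0x30FF, "katakana"),
    (0x4E00, 0x9FFF, "cjk ideograph"),    # base CJK Unified Ideographs only
    (0xAC00, 0xD7A3, "hangul syllable"),
]

def classify_char(ch):
    """Return a rough type label for a string of length one."""
    if len(ch) != 1:
        raise ValueError("expected a string of length one")
    cp = ord(ch)
    for first, last, label in BLOCKS:
        if first <= cp <= last:
            return label
    return "other"

def is_wide(ch):
    """True if Unicode marks the character as Wide ('W') or Fullwidth ('F')."""
    return unicodedata.east_asian_width(ch) in ("W", "F")

# Same sentence as above, written with escapes: "this has ひらがな"
sentence = u"this has \u3072\u3089\u304c\u306a"
labels = [classify_char(c) for c in sentence]
print([label for label in labels if label != "other"])
```

A range table like this is easy to read, but it has to be extended by hand for extension blocks and half-width forms, which is the kind of bookkeeping a dedicated module saves you from.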

wchartype is hosted on PyPI and can be installed via easy_install:

easy_install wchartype

Here is the permanent page for wchartype.
