Honyaku mailing-list archive cleanup

I spent some time this weekend cleaning up the Honyaku mailing-list archive. Here's a summary of what I did:

  1. Removed stray file attachments (including a couple of resumes; marked with "[[attachment stripped]]")
  2. Decoded base64 emails
  3. Removed quoted digests (marked with "[[digest stripped]]")
  4. Fixed some mojibake problems
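For anyone curious what step 2 can look like, here is a minimal sketch using Python's standard `email` module. It is not the script used for this cleanup; the function name `decode_body` and the sample message are made up for illustration, and it only handles single-part messages.

```python
import base64
from email import message_from_string


def decode_body(raw_message: str) -> str:
    """Return the text body of a single-part message, undoing base64 if present."""
    msg = message_from_string(raw_message)
    # decode=True reverses the Content-Transfer-Encoding (base64 or
    # quoted-printable) and returns bytes; it returns None for multipart.
    payload = msg.get_payload(decode=True)
    if payload is None:
        return ""
    # Fall back to UTF-8 when the message does not declare a charset.
    charset = msg.get_content_charset() or "utf-8"
    return payload.decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical sample: a base64-encoded ISO-2022-JP body.
    body = base64.b64encode("こんにちは".encode("iso-2022-jp")).decode("ascii")
    sample = (
        'Content-Type: text/plain; charset="iso-2022-jp"\n'
        "Content-Transfer-Encoding: base64\n"
        "\n" + body + "\n"
    )
    print(decode_body(sample))
```

Multipart messages would need a loop over `msg.walk()` to pick out the text parts, which is left out here to keep the sketch short.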

Some mojibake remains, but the archive already has far less of it than Mizuno-san's old archive: about a tenth, I'd say. Some of it is beyond repair, but I'm going through and fixing what I can. I've found this mojibake fixer very useful for the job.
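To show the general idea behind this kind of repair, here is a rough sketch in Python. It is not the fixer mentioned above; `repair_mojibake` is a hypothetical helper, and it assumes the common case where the original bytes were mistakenly read as Latin-1. The candidate encodings are also just a guess at what a Japanese archive is likely to contain.

```python
def repair_mojibake(text, wrong="latin-1", candidates=("utf-8", "shift_jis")):
    """Try to undo a wrong decode: recover the raw bytes, then re-decode them."""
    try:
        # Reverse the mistaken decode to get the original bytes back.
        raw = text.encode(wrong)
    except UnicodeEncodeError:
        # The text has characters outside the assumed wrong encoding,
        # so it was not produced by that mistake; leave it untouched.
        return text
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # No candidate fit; return the input unchanged rather than guess.
    return text


if __name__ == "__main__":
    garbled = "日本語".encode("shift_jis").decode("latin-1")
    print(repair_mojibake(garbled))
```

The order of `candidates` matters, since some byte sequences are valid in more than one encoding, so the result still needs a human check. Text that has been damaged further (bytes dropped or replaced with "?") can't be recovered this way, which matches the "beyond repair" cases.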
