Version 2.1 of Count Anything released

I just released version 2.1 of my free word-counting program, Count Anything.

Click here to get the latest version.

This is a minor release with two fixes: the "No" button of the check for updates dialog was broken, and files on the results page are now in alphabetical order (in the next release, I'll make all the columns sortable by clicking).

Count Anything is free and released under the MIT license.

45 comments to Version 2.1 of Count Anything released

  • I downloaded and installed your word count software on my OS.

    1. Your software shows just the word count and there is no provision for invoicing as in PractiCount & Invoice.
    2. The results obtained for word count, characters and characters without spaces were: 9,590 56,501 47,470
    3. With PractiCount the similar results were: 9621, 57064 47585

    This is for your information please.

    Regards,
    Dondu N. Raghavan

  • @Dondu

    1. Your software shows just the word count and there is no provision for invoicing as in PractiCount & Invoice.

    That’s true; invoicing is another feature I plan to add at a future date.

    For now, what I usually do is save the results as an HTML file or a tab-delimited text file, and open it in Excel (Excel will open either format), and then do the calculations there.

    2. The results obtained for word count, characters and characters without spaces were: 9,590 56,501 47,470
    3. With PractiCount the similar results were: 9621, 57064 47585

    Without seeing the files in question (and not knowing Practicount’s specific algorithm), it’s hard to say what the cause of the discrepancy is.

    With Count Anything, I try to replicate the word counts produced by MS Word as much as possible, since in my experience it’s kind of a gold standard for word/character counts.

  • Well, there is no question of sending files from my side, as they are confidential. Still, you can run a word count on any of your own files with these two programs and see for yourself.

    Regards,
    Dondu N. Raghavan

  • @Dondu

    Replicating the results of Practicount isn’t a terribly high priority for me, but Count Anything comes with a utility called Dump Text (it should be in the same folder under the Start menu).

    You can use this to dump the text from the file in question, make sure that all the text you expect is there, and then do a word count in MS Word (if you own it) and see if they match.

  • Adam

    Hi Ryan,

    Thanks for this latest version. I use Count Anything all the time, and have recommended it to a lot of my clients too. Really looking forward to the release that will make columns clickable for sorting – I’d been thinking of suggesting this as a feature.

    Keep up the great work!

    Adam

  • Dear Ryan,

    thanks for developing a free universal word count tool: we all need this kind of program. May I suggest that a future version let the user choose between a word count “à la Word” and another one “à la Trados”. As you are probably aware, the TM-tools do not count words the way the word processors do, and the translation companies like mine often want to know the TM count. I do not refer to the comprehensive analyze function (with repetitions, fuzzy matches and matches), but “only” to the word count itself.

    Thanks anyway for the good job!

    Guillaume

  • @Guillaume

    What do you mean by Trados style? Do you mean the Trados word-count algorithm, or how they give percentages for each type, or something else?

    Incidentally, you might be interested in another program of mine, Analyze Assist, which analyzes source files against translation memories.

  • Ryan

    I recently installed 2.1, and now when I try to count words (in an MS Word file) I get “0” results.

    Here’s the output from the log:

    … Error parsing file
    Traceback (most recent call last):
      File "main.py", line 595, in do_wordcount
      File "AnalyzeAssist\segmenter\textseg.pyo", line 58, in get_sentences
      File "AnalyzeAssist\segmenter\textseg.pyo", line 28, in bytes2unicode
    TypeError: unicode() argument 2 must be string, not None

    I can post more info if it might help, please let me know what you need.

    Does anyone else have this problem? Is it possible to downgrade?
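The last line of the traceback above says that the second argument to unicode(), the encoding, was None. As a minimal Python 3 sketch of a decoder that tolerates a missing encoding: the name bytes_to_text and the UTF-8 fallback are assumptions for illustration, not Count Anything's actual code, and the sketch does not claim to identify the real cause of the bug.

```python
def bytes_to_text(data, encoding=None, fallback="utf-8"):
    """Decode bytes to str, tolerating a missing (None) encoding.

    Hypothetical helper: the name and the UTF-8 fallback are assumptions.
    """
    # If no encoding was detected, use the fallback instead of passing None on.
    codec = encoding or fallback
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return data.decode(codec, errors="replace")
```

A caller that cannot detect an encoding can then pass None and still get text back, at the cost of possible replacement characters when the fallback guess is wrong.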

  • @Ryan

    You can get the previous version here:
    http://ginstrom.com/CountAnything/CountAnything_Setup_2.0.exe

    I’m not really sure what the cause of the error is, but it appears to be trying to parse a text file. Are there any text files mixed with the Word file you’re counting?

  • Ryan

    Hi Ryan, no, the files are 100% MS Word files. I’ll try to create some test cases and then send you more detail (if you’re interested?).

    Thanks for the link so I can downrev!

  • Yes, I’d appreciate the test cases, thanks.

  • Ryan

    Hi Ryan,

    In testing, I think I found the problem. The software seems to choke if there are any non-ASCII characters in the file name.

    I created two files: “Test case CA.docx” and “Test case CA和訳.docx”

    Characters were counted in the first file, but character counts for the second one were all “0.” I tested this with other characters (adding a “Ⅱ” and some other variants) and it is pretty consistent. Please send me an email if you’d like source files.

    Thank you

  • @Ryan

    Thanks for that information. Indeed, I think that I can track down this bug. Luckily, it will be easily fixed.

  • May

    I used your latest version of Count Anything to count a Japanese-language document in a scanned PDF, and I got zero results. Please advise. Thank you.

  • @May

    It’s possible that the PDF file was scanned, or that it is of a type not supported by the (open source) PDF parser Count Anything uses, pdfminer.

  • May

    Many thanks, Ryan. You put it very well. I realized that even OCR cannot extract anything from a scanned PDF. This is very inconvenient, especially when I have to work on files with hundreds of pages. The sad thing is that some clients are not able to provide a simple Word or PDF file.

    However, you have done a great job producing Count Anything. I found it more user-friendly than Trados. I will definitely explore your other software to help make life easier in document translation.

    Once again, thanks for sharing your useful software.

  • pozmu

    On Windows Vista I get the following error when trying to count files:

    com_error: (-2147221005, 'Invalid class string', None, None)

  • Did you try to count words for an Office 2007 or 2010 file?

    It’s possible that you don’t have a parser registered for the *.docx/*.xlsx/etc. extension.

    Could you go to the Tools menu, select Options, go to the File Extensions tab, and make sure that the entry for “MS Word” is “*.doc;*.rtf;*.docx”, and so on?

  • Yogini Nazare

    Hello Sir,

    This is Yogini from Patni Computer Systems Ltd.
    We downloaded Count Anything from http://ginstrom.com/CountAnything/ but we are unable to install it.
    We get an error after we press the Finish button of the installer.
    We request your help.

    Thanks and Regards, Yogini

  • When I try to count multiple documents, a Visual Basic error code comes up, which stops the counting.

    The software is not working for me, but it is promising.

    Is there a solution?

  • Anton

    Thanks a lot for the program, I’ve been looking for something like that for a long time.
    Unfortunately it counts WRONG. I used the dump text feature and found out that the program sometimes replaces spaces after periods (full stops) with section marks that are not counted as characters.

    I guess it’s a bug :)

  • Thanks for pointing this out. Could you send me a file for verification to software at ginstrom dot com? Thanks!

  • ARW

    Downloading your program as we speak. This will probably join the others in my free translation toolbox (TMbuilder, Xbench, ASAP Utilities, ExamDiff, Calipers).

  • kiran

    Is the code openly available? This program will be helpful for me, but I would like to automate it. Is there an API that external programs can call to get the line/character count instead of using the UI?

  • The source code isn’t generally available, but that’s more from lack of interest than my desire to hide the code (it’s MIT licensed). If there is enough interest, I’ll get the code in a releasable state — that involves pulling together some different modules into a single release package.

    There is a command-line utility that comes with Count Anything, called count_words.exe. Here is the help for this program:

    usage: count_words.exe [-h] [-x format] [-o outfile] -f filename
                           [filename ...]

    Count the words/characters in the specified input files

    optional arguments:
      -h, --help            show this help message and exit
      -x format             Output format: one of [table, tabs] (optional)
      -o outfile            Output to file 'outfile' (optional)
      -f filename [filename ...]
                            One or more files to process (required)
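For automation, the command line above can be driven from a script. The following is a rough Python sketch that uses only the flags listed in the help text. The executable location, the helper names build_command and run_count, and the assumption that results go to standard output when -o is omitted are all illustrative guesses, and the layout of the output itself is not documented here.

```python
import subprocess

# Assumption: count_words.exe is on PATH; otherwise put its full path here.
COUNT_WORDS = "count_words.exe"


def build_command(files, fmt=None, outfile=None):
    """Build an argument list from the flags shown in the help text."""
    if not files:
        raise ValueError("at least one input file is required (-f)")
    cmd = [COUNT_WORDS]
    if fmt is not None:
        # The help text lists exactly two output formats.
        if fmt not in ("table", "tabs"):
            raise ValueError("fmt must be 'table' or 'tabs'")
        cmd += ["-x", fmt]
    if outfile is not None:
        cmd += ["-o", outfile]
    # -f takes one or more file names, so it goes last.
    cmd += ["-f", *files]
    return cmd


def run_count(files, fmt="tabs", outfile=None):
    """Run the utility and return whatever it wrote to standard output."""
    result = subprocess.run(
        build_command(files, fmt, outfile),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list rather than a single string avoids quoting problems with file names that contain spaces.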

  • kiran

    I used the command line and it looks like it can serve the purpose. But I believe it would be better to have API calls exposed, for better exception handling and customization. Would it be difficult to expose an API for interaction with third-party systems?

  • @kiran: I agree that an API would be useful. This would involve first releasing my segmenter module, which the word-count algorithm uses to get text from various file formats, and then the wordcount module itself. I will try to get those projects up on bitbucket or a similar service.

  • kiran

    Thank you very much. I will be looking forward to the API.

  • kiran

    What does ‘count number-only segments’ in the Tools->Options->Count settings mean?

  • kiran

    @Ryan Ginstrom
    Are you still working on open-sourcing the code?

  • @kiran

    The “count number-only segments” option lets you skip segments that are only numbers. This will skip things like numbers in tables (the main reason is that translators are often not paid for such segments, or are paid differently).

    I haven’t done any work on open sourcing the code, but I have a game plan. I need to split out my segmenter module from another library, open source that, and then I think I can open source the wordcount app as-is.
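As a rough illustration of skipping number-only segments, here is a Python sketch. The definition used (at least one digit, and otherwise only whitespace and common numeric punctuation) and the names is_number_only and countable_segments are assumptions; Count Anything's actual rule is not described here.

```python
import re

# Assumption: a "number-only" segment contains at least one digit and
# otherwise only whitespace and common numeric punctuation.
_NUMBER_ONLY = re.compile(r"(?=.*\d)[\d\s.,:;/%+\-()]+", re.DOTALL)


def is_number_only(segment):
    """Return True if the whole segment consists only of numbers."""
    return _NUMBER_ONLY.fullmatch(segment) is not None


def countable_segments(segments, count_number_only=False):
    """Drop number-only segments unless the option is switched on."""
    return [s for s in segments if count_number_only or not is_number_only(s)]
```

With the option off, a table column of figures contributes nothing to the count, while a segment such as "12 apples" is still counted because it contains a word.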

  • kiran

    Does the character count calculated by Count Anything include spaces as well as vbc’s?

  • @kiran

    Had to look up “vbc.” Count Anything counts “all characters” and “characters (no spaces)”. The second measure is the total character count minus the number of characters for which c.isspace() returns true.
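The two measures described above can be sketched in a few lines of Python. The function name char_counts is an assumption for illustration; only the use of isspace() comes from the description.

```python
def char_counts(text):
    """Return (all characters, characters without spaces) for a string."""
    total = len(text)
    # str.isspace() is True for spaces, tabs, newlines and other Unicode
    # whitespace such as the ideographic space U+3000.
    spaces = sum(1 for c in text if c.isspace())
    return total, total - spaces
```

Because isspace() covers every kind of whitespace, line breaks and tabs are excluded from the second figure along with ordinary spaces.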

  • kiran

    For some files, there is a difference between the UI output and the command prompt output. Is there any reason for this?

  • Dash

    Isn’t it possible to count the number of lines too?
    Thanks

  • I have downloaded your software. I found it very good and very user-friendly, and I appreciate your work. But I think there is a bug in the software: it works fine, but sometimes it gives me the error “com_error: (-2147221005, 'Invalid class string', None, None)”. Please provide me with a solution.
    Thanks in advance.

  • Very useful software, thank you so much!

  • Overall, I’m very satisfied with your product.

    There is just one problem, when using long paths.

    My documents are organized in directories. Because of the number of projects I’m working on, the paths are long. Long enough to crash Count Anything…

    For example, the program crashes on this document: “D:\Mes Documents\______TRADUCTION123______\BackOfficeNLS\12-06-23 – 19h21 SOLUTIONS-NLS\www\release\help\language\fr\___Synthèse v41\v41\Resumé vFR – 41\backup automatic\Consanguinité et élevage d’autruches.txt”

    If I rename the file to consan.txt and put it in c:\, everything works fine.

    It’s not very important now, but I spent at least 2 hours finding out why the program wasn’t working!

  • Thanks for reporting this. This is probably due to a limitation of the underlying wxWidgets GUI framework. I will look into a way to handle longer file paths.

  • If you have a site that contains forms, you could also include this script on the page to spare your visitors the hassle you once had to go through!

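    function countit() {
      // Word count script: counts the whitespace-separated words in the
      // "wordcount2" field and writes the total to the "wordcount3" field.
      var text = document.wordcount.wordcount2.value.trim();
      // Split on runs of whitespace so repeated spaces are not counted as words.
      var words = text === "" ? [] : text.split(/\s+/);
      document.wordcount.wordcount3.value = words.length;
    }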

  • Ryan, I LOVE IT. This is the first time I’ve come across Count Anything, and it ticks a lot of boxes for us. I’ll feed back if we find any anomalies, but please, keep up the good work!

  • kiran

    Is there a way to calculate the character count without spaces using the command line?

  • Muqarram Khorakiwala

    I am trying to count multiple PPT files (97-2000 format). Version 2.1 of Count Anything opens the first file and after that it gives the following error – type:PowerPoint.Application.Visible. However, if I count the files individually, I am able to obtain the word counts. How can this be resolved?

  • Irina Chernykh

    Hi, Ryan, and thank you for such a useful tool (as well as for the PowerPoint2Doc tool)!
    I found it just in time to count my multiple PDFs, but I haven’t tried PPTs yet.
    Are you planning to add support for Adobe InDesign files at some point? That would be great!

    And, please excuse my ignorance, why is Word among the supported files? Are there any other reasons besides batch counting?

  • Hi Irina:

    Yes, Count Anything supports Word files. Just drag and drop them into the file box, and click Count.
