Three reasons to avoid private class members

I recently read an interesting post about the need for access control via the private keyword (or an equivalent mechanism) on the CodeThinked blog. The post raises some valid points, but I still think enforced private/protected access is something to be used sparingly at most. Here are three reasons why.

Private members are a pain to unit test

Classes with lots of private methods and variables are a pain to unit test. The pathological case is the "iceberg class" — one public method with a void return, and lots of hidden private members.

class LogParser
{
    typedef list< string >::iterator token_iterator ;
    list< string > m_tokens ;
    list< string > m_parsed_tokens ;
public:
    void parse_log(const tstring &logfile_name,
                   const tstring &outfile_name) ;
private:
    void write_results(const tstring &outfile_name) ;
    token_iterator tokens_begin() ;
    token_iterator tokens_end() ;
    void parse_token(const token_iterator &token) ;
    void tokenize_file(const tstring &logfile_name) ;
};

(I know that everybody reading this — except me — is much too smart to write a stupid class like this. It's just for demonstration purposes.)

Right now, the only way to test this class is to touch the file system twice — once to read in a test log file, and another time to check the results. In order to unit test private member functions, and check the state of private member variables (more on these below), you've got to resort to some sort of hackery, such as the infamous #ifndef hack:

class LogParser
{

    void parse_log() ;
#ifndef TESTING
private:
#endif

Yes, there are other ways, but they all require a little too much intimacy with the test code for my taste.

Private methods and member variables are a code smell

Having lots of private methods and members often means that you've got another class (or two) dying to break free and live on its own. In the pathological LogParser example above, we really have three functions: we read and tokenize a log file, parse the tokens, then write our results to another file.

Having those private methods is a big hint that this functionality doesn't really belong on our LogParser class.

How about if we decompose this into three classes:

class LogTokenizer
{
public:
    typedef list< string >::iterator token_iterator ;

    token_iterator tokens_begin() ;
    token_iterator tokens_end() ;
    void tokenize_file(const tstring &logfile_name) ;
private:
    list< string > m_tokens ;
};

class LogWriter
{
public:
    void write_token(const string &token, FileWriter &writer) ;
};

class LogParser
{
    typedef list< string >::iterator token_iterator ;
public:
    void parse_log(const tstring &logfile_name,
                   const tstring &outfile_name) ;
private:
    void parse_token(const token_iterator &token) ;

    list< string > m_parsed_tokens ;
};

Here, we've got three much more focused classes, with much tighter responsibilities. This code is more reusable — it would be fairly easy to imagine taking that LogTokenizer and applying it in other situations, for example.

Note that the encapsulation is now better, if anything. Users of our LogParser class still don't have to worry about what's under the hood — and now they won't even be distracted by as many private member declarations when they browse the header file.

It will also be a lot easier to configure our LogParser to take different input and output sources. As one example, we could change LogParser to take a (smart) pointer to a LogTokenizer instance instead of an input filename, and then configure it with different LogTokenizer subclasses — including a fake tokenizer that doesn't touch the file system for testing purposes.

If I were to continue to refactor this code, I would expect to end up with almost everything public.

State is evil

Functional programming teaches that state (in the form of member variables) is evil, because it makes your code more complex and harder to test. No, C++ is not a functional programming language, nor is Java, nor C#, nor Python for that matter. But I think the functional programming style can teach us a lot about good programming in these languages as well.

I therefore try to avoid using member variables (which for C++, will probably be private for reasons touched on here). I generally consider it a problem to have more than about two of them in any given class.

2 comments to Three reasons to avoid private class members

  • “Private methods … are a code smell.”

    Surely, that’s a little harsh. So harsh, in fact, that I’ve to repeat the post I made to CodeThinked, for a little balance.

    Private methods are the most common way of bringing information hiding to class behaviour, and in some systems it’s even mathematically provable that encapsulation and information hiding reduce complexity.

    If you think private methods are a code smell, then why aren’t private classes within a subsystem a code smell?

    And once you lose private classes, i.e., once all your classes are visible to all else, then you’ll inevitably find dependencies springing up between classes that clearly should never have known about one another.

    Unit testing is great. It’s essential. But it’s no excuse for throwing out information hiding. See:

    http://www.edmundkirwan.com/encap/intro.html

    Or just the graph showing potential structural complexity as a function of the number of encapsulated regions in a system:
    http://www.edmundkirwan.com/encap/page5.html

    Regards,

    Ed

  • Calling private methods a code smell is harsh only if you think “code smell” means bad code.

    I see a code smell as an indicator that the code could be bad. For example, many of the code smells are contradictory, such as “large class” versus “lazy class” or “data class.” The idea is that the tension between them should create an equilibrium.

    Likewise, private methods are a code smell that may mean you need to perform extract class; but not always. The tension of other code smells will prevent you in some cases.

    I read your links, but I still think that namespaces are a better way to encapsulate classes. If your internal classes are in a separate namespace, the user of your interface won’t need to deal with that complexity unless they choose to. And a private class is still in a namespace — that of its enclosing class. You’re just enforcing encapsulation, not reducing the number of namespaces. Also, your formula doesn’t seem to take namespace nesting into account.

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>