Sunday, June 04, 2006

Memory Leaking and Automatic Collection

I checked in a change to today that lets one disable the garbage collector more easily. I then tried out a few memory leak detection tools. First I tried ccmalloc. I couldn't even get this working, it just crashes, even with the sample program on the web site. Then I gave mpatrol a go. I'd heard good things about mpatrol. Unfortunately it doesn't seem to like dynamic loading of shared libraries and (for no good reason) we don't link to the loaders statically in Boomerang. So I gave up and installed valgrind. It still rocks. It not only told me how much memory we were leaking and where from, it also told me some use-before-define errors of which I wasn't aware. I checked in fixes.

Next, I had a look at Boost's shared_ptr class. I'm hoping to figure out a way to easily add something like this to replace the garbage collector. Unfortunately, the use of smart pointers is anything but easy. You'd think that you could define something like this:

typedef boost::shared_ptr<prog> ProgPtr;

and then replace every ocurrance of Prog* with ProgPtr. Of course, that doesn't work. Not only do you have the problem of no operator= being defined for a shared_ptr but you also have the problem of where to put this declaration. You can't put it after the definition of the Prog class, cause how do you forward declare ProgPtr if your Prog class includes any references to ProgPtr? So you try to put the typedef before the class definition, but then you get errors which indicate that a forward declaration of Prog is not sufficient.

In many ways, this is the greatest strength of the garbage collector, it's a drop in replacement. I'm starting to think that adding all those thousands of delete statements would be less trouble than using smart pointers. If anyone knows how you go about migrating to smart pointers without so much pain, I'd appreciate an email.


  1. I've had no problems with using shared_ptr before; and it does support the assignment operator. As for the 'is not defined' problem, just forward declare the class;

    class prog;
    typedef boost::shared_ptr<prog> ProgPtr;

    class prog

    The problems your having (no assignment operator, can't typedef the class) suggest the header isn't included properly; are you using #include <boost/shared_ptr.hpp>?

  2. If you have some code like:

    ProgPtr p = new Prog();

    you will get an error. That's the operator= I'm talking about. I believe you can do something like this:

    ProgPtr p = ProgPtr(new Prog());

    but that's just an annoying amount of work.

  3. The constructor is marked explicit for a good reason; automatically assigning a pointer to the shared pointer changes ownership semantics (the shared pointer will now delete the pointer).

    For example, if the constructor was explicit, changing
    void someFunc(SomeClass *ptr)
    void someFunc(boost::shared_ptr<SomeClass> ptr)
    would seriously change the roles of the function; now the pointer passed in would be automatically deleted.

    If the construction was implicit there would be no warnings for doing something like
    SomeClass bob;
    except that you now attempt to delete a local variable...

    Constructing the object can be done like
    ProgPtr p(new Proj);
    without the repetition.

  4. My point is that I have an already existing code base. I want an automatic memory management system that will integrate with that existing system. If I'm going to go change every allocation to not use an = operator then I might as well go find and insert all the delete statements I'm lacking. No, I think the solution to my problem is to figure out why the garbage collector isn't happy with Qt's threading model.

  5. But for boost to have broken functionality so it will be a drop in replacement isn't right either.

    The compile problem isn't that it lacks an assignment operator (it does have one), it's that the constructor to initialise from a raw pointer is marked explicit (to avoid bugs such as I mentioned earlier).

    Changing the constructors is far easier then inserting deletes, because the explicit constructor will cause a build failure (and so is easy to find, and will be a once off task taking around 20-30 minutes), whereas finding where & when to inset the delete is far harder (no build errors or warnings when you miss one), and will be a continous maintenance task.

    At work we recently switched a portion of our code (that had complex ownership semantics of a set of classes) to use boost::shared_ptr, which simplified the code quite a lot.

    That said, all to their own :-)