kremer@cpsc.ucalgary.ca

A solution to the polymorphic class problem in STL

The problem

A serious complaint about STL attacks the very paradigm on which STL is based. One of STL’s central tenants is that containers directly contain there objects, and this is the root of many of its efficiency claims. But in doing this, STL runs contrary to one of the very important tenants in object oriented programming: polymorphic references. For example, a program may contain a list of shapes, where shape is the polymorphic superclass of other concrete classes such as ellipse, rectangle and triangle. Class shape may have a pure virtual method called draw(), while the concrete subtypes all implement the draw() method. The program may iterate the list of shapes calling the polymorphic draw() method which will execute the appropriate code block for each object in the list.

A STL container expects to contain its objects directly, so a programmer’s first attempt might be to code:

	ellipse e(rect1);
	rectangle r(rect2);
	list < shape > shapeList;
	shapeList.push_back(e);
	shapeList.push_back(r);
But this code will not work. If class shape is pure virtual, the code will not compile because list will attempt to generate a shape object (and it can’t because its has a pure virtual member). If class shape is not pure virtual the code will compile, but it will fail at run time: When the list is iterated to draw all the shapes, it will be shape::draw() which is called each time instead of ellipse::draw() and rectangle::draw(). This is because the insertions into the list entail a copy of the object: shapeList.push_back() will do call shape::operator=(&shape) with the ellipse object as a parameter. Clearly, the object in the shape list is a shape and not an ellipse.

Some initial attempts

It becomes apparent that one level of indirection is needed to solve the problem. An obvious solution is to change the list of shapes to a list of pointers to shapes, so that the list declaration in the code fragment above would be:
	list < shape* > shapeList;
and list would be populated:
	shapeList.push_back(&e);
	shapeList.push_back(&r);
This solution seems to work, but it will likely lead to run time errors, for if variables e and r are automatic, they will be destructed at end of thier blocks. If the list's scope lives on past the end of this block, it will be left containing invalid objects (pointers to arbitrary places in or beyond the stack).

One possible solution is to insist that any object placed in such a list must be allocated from the heap:

	list < shape* > shapeList;
	shapeList.push_back(new ellipse(rect1));
	shapeList.push_back(new rectangle(rect2));
Besides the fact that unsuspecting programmers might not read the documentation and violate this rule, this solution leaves open another problem. When the list is destructed, it will leave all of its referenced objects on the heap without calling their destructors or deallocating their memory. While this may not cause any serious run time errors, it is certainly not acceptable memory management!

One could always subclass list < shape* > and destroy the contained objects in the subclass's destructor, but this would defeat a lot of the convenience of using STL's containers: one would have to write a lot of simple constructors and a new destructor for every variation of an STL container.

One could build template subclasses to every one of STL's containers -- an "indirect STL". For example:

template < class T >
class indirect_list : private list < T* >
  {
  ...
  }
but this solution is surprisingly complex: the inheritance should be private because indirect_list < T > is not a subtype of list < T >. This means that the code for indirect_list is just as long as the code for list!

A reasonable solution

An alternative solution is to change the behavior of pointers rather than containers. One could make a new reference template class who's destructor would take care of destructing its referent. One variant of this idea is listed below:
template < class T >
class Ref2
  {
  public:
    Ref2(const T &s)		{KillData = true;  t = s.clone();}
    Ref2(T *s)			{KillData = false; t = s;}
    Ref2(const Ref2 < T > &r)	{KillData = true;
				 t = r.t?r.t->clone():NULL;}
    ~Ref2()			{if (t && KillData) delete t;}

    Ref2& operator= (const Ref2 < T > & r) {
	if (t && KillData) delete t;
	KillData = true;
	t = r.t?r.t->clone():NULL;
	return *this; }
    T* operator->() const			{return t;}
    int operator< (const Ref2 < T > & r) const	{return t?r.t?(*t) < (*r.t):false:true;}
    operator T&() const				{return *t;}
    operator T*() const				{return t;}
    T& operator*() const			{return *t;}
  protected:
    T *t;
  private:
    bool KillData;
  };
Its used like this:
	list < Ref2 < shape > > shapeList;
	shapeList.push_back(ellipse(rect1));
	shapeList.push_back(rectangle(rect2));
In the above usage, the T& constructor is called which sets KillData to true; this means that when shapeList is destroyed, it will call the destructors of the Ref2 objects which will in turn destroy the referent objects. With this variant of reference classes, one can also signal the class not to destroy the object (if that is what is needed) by using the T* constructor instead:
	static ellipse persistentEllipse(rect3);
	shapeList.push_back(&persistentEllipse);
When using the iterators of shapeList, one must pay attention to the extra level of indirection:
	list < Ref2 < shape > >::iterator i;
	for (i=shapeList.begin(); i!=shapeList.end(); i++)
		(*i)->draw();
or
	list < Ref2 < shape > >::iterator i;
	for (i=shapeList.begin(); i!=shapeList.end(); i++)
		(**i).draw();
but this does not seem like overly cumbersome syntax.

One of problems with this solution is that it defeats a lot of the efficiency of the STL. The STL goes to great lengths to efficiently store its objects in blocks (separating memory allocation/deallocation and construction/destruction). The Ref2 class blatantly allocates and deallocates its referents one at a time! Run time efficiency is also compromised by the extra level of indirection. It is not clear that there any possible way around the extra-level-of-indirection problem for containers of polymorphic classes.

kremer@cpsc.ucalgary.ca