OO Concept: Containers

The Need for Dynamic Organisation

A program consisting of a single large object is not OO, indeed with just a single set of common functions (the member functions) operating on a single pool of common data (the data members), its not even a well structured procedural program! No, the power of OO only becomes apparent once there is a set of interacting objects. So in this topic we shall consider the organisation of object collections. FORTRAN 77 has very limited support for data organisation:-

Access control: data can be local to a subroutine/function, or global i.e. a COMMON block.
Data structuring: scalar and static array.

There is a direct translation of this organisation into C++:-

Access control: object can be local to a function, or shared between a set of functions i.e. the data members of a object.
Data structuring: scalar and static array e.g.:-
```
MyClass MyObj;
MyClass MyObjs[10];
```

To do justice to the dynamic nature of OO we need dynamic structures and dynamic ways of expressing relationships between them. To take the second of these first, the way C++ expresses relationships has already been discussed in the OO topic Pointers & References. But what about dynamic structures?

Containers and Iterators

In OO jargon, dynamic structures are containers Let's start by trying to write down a list of requirements for an array container:-

Create a array with a supplied size.
Change the size.
Ask the size.
Add object at index i.
Remove object at index i.
Access object at index i.

We could add a few more features, but that will do. Such a list is easy to compose because the idea of an array is a simple concept. As soon as we hear the word "concept" we should think "object". So a dynamic array should be some type of object. Note also the concept of an array is quite independent of what it is an array of. So we should be able to design the array object and only afterwards decide what type of object to place in it. We could repeat this with a list:-

Create an empty list
Ask the size of the list.
Add object to the end of the list.
Remove object from the end of the list.
Access each member of the list in turn.

Comparing the two sets of requirements we at once we see there are features in common and this should make us think "inheritance".

One requirement they share in common with every other container we can think of is the ability to sequentially access every member in the container. This implies some type of "cursor" that can be initialised and then stepped through the set, together with a way of accessing the object at the cursor. This type of object is called an iterator. Its tempting to think that the container object could provide this iterator service. But what if we want 2 iterators, or 100, all working at once? No, an iterator has to be a separate type of object although it will clearly have a very close relationship to the container it iterates over.

So, starting with a few simple principles we can deduced that:-

Containers are objects. Each container has a corresponding iterator.
Container and Iterator code should not depend on the type of contained object.
Container and Iterators will form inheritance trees.

Membership and Ownership

Now that we have determined a few basic features of containers its time to ask how containers actually contain objects. As the relationship is dynamic i.e. determined at execution time, we know at once that pointers to objects will be involved. So we can imagine member functions passing pointers into and out of the container. The dynamic nature also suggests something else about containers and the objects they contain. If it is possible to add objects to containers then these objects must exist outside, i.e. independently of containers. So a good working rule is: Containers do not own the objects they contain. Put another way, deleting a container should not delete the objects it contained. After all, the same object may be a member of several containers. Why is ownership so important? The answer is that objects occupy memory, and when an object is no longer needed it should be deleted to recover its memory so as to prevent a memory leak.

So containers manipulate sets of pointers, but pointers to what? C++ is a strongly typed language, which is to say you cannot declare a "pointer to any old object" you have to say exactly what it is being pointed to. From this it would appear the following unpleasant conclusions are unavoidable:-

A container can only hold one type of object.
For every possible type of container type and object that it contains, there will have to be a new container class.

In practice containers can be developed based on each of these restrictions.

Inheritance based Containers

This starts from the premise that a container can only hold one type of object, or to be more precise, pointers to one type of object. However containers really ought to hold pointers to a heterogeneous collection objects. In the OO talk on Virtual Functions & Polymorphism we already have seen the answer to this problem. There we had a heterogeneous collection of tracks that could all be processed by driver code that only knew about a primitive Track object embedded within each:-

Track *MyTrack = GetFirstTrack();
while ( ! MyTrack ) {
  MyTrack->Propagate();
  MyTrack = GetNextTrack();
}

We can use the same trick here. So long as each object inherits, directly or indirectly, from some basic type of "container hook" object, then containers can just point to these embedded objects. To recover the "outer" object, we just have to convert the pointer back as described in Pointer Conversion section of the OO topic Inheritance. and shown in the diagram:-

Its worth repeating the warning that this is an unsafe conversion. Although the compiler knows how to adjust the pointer it makes no attempt to confirm that the inner object is indeed embedded in the outer, only that it could be. It trusts the user to know what they are doing. Get it wrong and a corrupt pointer results, and what ensues will not be pleasant!

ROOT Containers are based on this idea. Any object that is to be held in a ROOT container has to inherit from a TObject. Each time an object is passed to the container it is converted to a TObject pointer. The compiler does this implicitly. For example:-

MyCollectionPtr->Add(MyObjPtr);

Each time an object is returned it has to be converted back using an explicit conversion:-

MyObjPtr = (MyClass *) MyCollection->FindObject("MyObject name");

Template Containers

Some people dislike the above scheme because it is not truly "type safe". In the end the compiler leaves it the user to ensure that the types are correct. This only leaves the other alternative: creating a new class each time a new (container organisation) / (contained object) combination is needed. To say the least this could be very tedious. Most people faced with this choice would, for each type of container:-

Define the class leaving a dummy name for the contained object type.
When a particular container was needed, copy the code and then do a global search and replace for the contained class.

C++ has a built in template scheme that effectively does just this. For example:-

template<class T>

class ArrayContainer {
private:
  T *member[10];
...
};

This is a (seriously flawed) array container that has a 10 element array of pointers to T, so could hold up to 10 T objects (I said it was flawed). The important point though is that the template statement indicates that T is a template, or parameterised class. If we need an ArrayContainer for Track objects, it can be created by:-

ArrayContainer<Track> MyTrackArrayContainer;

C++ takes the parameter list , and substitutes Track for T throughout the definition of the class ArrayContainer, them compiles the code so generated, effectively doing the same we could do by hand, but with a lot less effort. This produces code that is type safe, but does have different drawbacks:-

Templated code is harder to read.
At the time of writing these notes, some compilers can be very slow when dealing with templates.
It does not solve the problem when a container has to hold a heterogeneous set of objects.
The system can end up generating a great deal of code; each container/object combination has its own code, a phenomenon that is sometimes referred to as code bloat.

The Standard Template Library (STL) is part on ANSI C++, and includes a set of template containers.

Go Back to the The C++ Crib Top Page

If you have any comments about this page please send them to Nick West