OO Concept: Containers
The Need for Dynamic Organisation
A program consisting of a single large object is not OO, indeed with just a
single set of common functions
(the member functions)
operating on a single pool of common data
(the data members),
its not even a well structured procedural program! No, the power of OO only
becomes apparent once there is a set of interacting objects.
So in this topic we shall
consider the organisation of object collections.
FORTRAN 77 has very limited support for data organisation:-
- Access control: data can be local to a subroutine/function, or
global i.e. a COMMON block.
- Data structuring: scalar and static array.
There is a direct translation of this organisation into C++:-
To do justice to the dynamic nature of OO
we need dynamic structures and dynamic
ways of expressing relationships between them. To take the second of these
first, the way C++ expresses relationships has
already been discussed in the OO topic
Pointers & References.
But what about dynamic structures?
Containers and Iterators
In OO jargon, dynamic structures are
containers
Let's start by trying to write down a list of requirements for an array
container:-
- Create a array with a supplied size.
- Change the size.
- Ask the size.
- Add object at index i.
- Remove object at index i.
- Access object at index i.
We could add a few more features, but that will do. Such a list is easy to
compose because the idea of an array is a simple concept. As soon as we hear
the word "concept" we should think
"object".
So a dynamic array should be
some type of object. Note also the concept of an array is quite independent
of what it is an array of. So we should be able to design the array object and
only afterwards decide what type of object to place in it.
We could repeat this with a list:-
- Create an empty list
- Ask the size of the list.
- Add object to the end of the list.
- Remove object from the end of the list.
- Access each member of the list in turn.
Comparing the two sets of requirements we
at once we see there are features in common and this should make us think
"inheritance".
One requirement they share in common with every other
container we can think of
is the ability to sequentially access every member in the container. This
implies some type of "cursor" that can be initialised and then stepped through
the set, together with a way of accessing the object at the cursor. This type
of object is called an iterator. Its tempting to think that the
container object could provide this iterator service.
But what if we want 2 iterators, or 100, all working at once? No, an iterator
has to be a separate type of object although it will clearly have a very close
relationship to the container it iterates over.
So, starting with a few simple principles we can deduced that:-
- Containers are objects. Each container has a corresponding iterator.
- Container and Iterator code should not depend on the type of contained
object.
- Container and Iterators will form inheritance trees.
Membership and Ownership
Now that we have determined a few basic features of containers its time to ask
how containers actually contain objects. As the relationship is
dynamic i.e. determined at execution time, we know at once that pointers to
objects will be involved. So we can imagine member functions passing pointers
into and out of the container. The dynamic nature also suggests something else
about containers and the objects they contain. If it is possible to add
objects to containers then these objects must exist outside, i.e.
independently of containers. So a good working rule is: Containers do not own the objects they contain. Put another
way, deleting a container should not delete the objects it contained. After
all, the same object may be a member of several containers. Why is ownership
so important? The answer is that objects occupy memory, and when an object is
no longer needed it should be deleted to recover its memory so as to prevent a
memory leak.
So containers manipulate sets of pointers, but pointers to what? C++ is a
strongly typed language, which is to say you cannot declare a "pointer to any
old object" you have to say exactly what it is being pointed to. From this it
would appear the following unpleasant conclusions are unavoidable:-
- A container can only hold one type of object.
- For every possible type of container type and object that it contains,
there will have to be a new container class.
In practice containers can be developed based on each of these restrictions.
Inheritance based Containers
This starts from the premise that a container can only hold one type of
object, or to be more precise, pointers to one type of object. However
containers really ought to hold pointers to a heterogeneous collection objects.
In the OO talk on
Virtual Functions &
Polymorphism
we already have seen the answer to this problem. There we had a heterogeneous
collection of tracks that could all be processed by driver code that only knew
about a primitive Track object embedded within each:-
Track *MyTrack = GetFirstTrack();
while ( ! MyTrack ) {
MyTrack->Propagate();
MyTrack = GetNextTrack();
}
We can use the same trick here. So long as each object inherits, directly or
indirectly, from some basic type of "container hook" object,
then containers can
just point to these embedded objects. To recover the "outer" object, we just
have to convert the pointer back as described in Pointer Conversion section
of the OO topic
Inheritance.
and shown in the diagram:-
Its worth repeating the warning that
this is an unsafe conversion. Although the
compiler knows how to adjust the pointer it makes no attempt to confirm that
the inner object is indeed embedded in the outer, only that it could be. It
trusts the user to know what they are doing. Get it wrong and a corrupt
pointer results, and what ensues will not be pleasant!
ROOT Containers
are based on this idea. Any object that is to be held in a ROOT container has
to inherit from a
TObject.
Each time an object is passed to the container it is converted to a TObject
pointer. The compiler does this implicitly. For example:-
MyCollectionPtr->Add(MyObjPtr);
Each time an object is returned it
has to be converted back using an explicit conversion:-
MyObjPtr = (MyClass *) MyCollection->FindObject("MyObject name");
Template Containers
Some people dislike the above scheme because it is not truly "type safe".
In the end the compiler leaves it the user to ensure that the types are
correct. This only leaves the other alternative:
creating a new class each time a new (container organisation) / (contained
object) combination is needed.
To say the least this could be very tedious. Most people
faced with this choice would, for each type of container:-
- Define the class leaving a dummy name for the contained object type.
- When a particular container was needed, copy the code and then do a
global search and replace for the contained class.
C++ has a built in template scheme that effectively does just this.
For example:-
template<class T>
class ArrayContainer {
private:
T *member[10];
...
};
This is a (seriously flawed) array container that has a 10 element array of
pointers to T, so could hold up to 10 T objects (I said it was flawed). The
important point though is that the template statement indicates that T is a
template, or parameterised class. If we need an ArrayContainer for Track
objects, it can be created by:-
ArrayContainer<Track> MyTrackArrayContainer;
C++ takes the parameter list