Maintaining and refactoring C++

Last week was my last day working with C++ (for a while). It’s been quite fun to revisit both the programming language and source code which kicked of my development career over 12 years ago, and I have enjoyed the experience a lot. There is also a few things to note so I put together a short list of things I found interesting during this short maintainance assignment.

Introducing a source control system

The code was originally written in 1999 and the executable files have been running in production ever since. Today the programs are owned by a group in the enterprise operations team. Their focus is to keep the systems up and running and they have little interest in the development process. There was no source control system available when I originally developed the code so,  before making any changes to the existing source, I was determined to correct that fact. A few months ago I taught myself Git and have never looked back since. Git is an excellent tool and this was an appropriate opportunity to introduce Git as suitable source control system for this code base. Being the sole maintainance developer of these programs I was happy just to add Git to aid my own productivity and give me the ability to safely abort a change should the need arise (and it did), but it will also pay off in the long run.

Updating to new IDE

Once a source control system was in place, the next step was to pick out the correct file candidates from each project to be checked in to the repository. I didn’t want every project file source controlled and this was a good occasion to get a bit more familiar with some of the lesser known project files used by the IDE, and also how to configure Git to filter file names/paths. Originally, the projects were all developed using Microsoft Visual C++ version 6 so the first step was to get them updated to a newer C++ IDE, which just happened to be Visual Studio 2008. Once the project files I needed were identified, these were checked in to the repository and tagged as the base version. Safe and ready to go!

Automatically updating the projects from Visual C++ 6.0 projects to Visual Studio 2008 solutions went ahead problem free – the IDE handled it all. My job was then to rid myself of the unnecessary project files only used by the old IDE. The (newer) Visual Studio C++ compiler has grown a lot “smarter” so a few syntax bugs had to be ironed out before the old code would build. There were also warnings due to calls to C++ standard library functions that now were deemed unsafe. In most cases a safer alternative was suggested.

Visual Studio 2008 is not unfamiliar to me, and those following this blog will know that I have used it for C# development, but never for C++. I was surprised how it lagged it’s C# cousin in functionality. Among other things there is little or no support for MSBuild and the IDE has no refactoring functionality. The latter was a real let down since refactoring C++ proved to be notoriously more difficult than any other modern language I have encountered. However, a few things made the update worth it: a better compiler and some IDE features like folders for structuring the source files. Visual Studio 2008 also has line numbering which I’m pretty sure was missing in the Visual C++ 6 source code editor.

Documentation and getting familiar with the source code

By chance, it just so happened that I came across Doxygen when googling for free C++ tools. Since Doxygen can be used for C#, Java and Python (untried, but according to the documentation) I thought it would be worth the time to take a closer look at this tool and that proved to be a wise decision. Doxygen is brilliant! I have not used it for the other languages it supports, but I plan to for my next project.  It’s syntax may remind you of JavaDoc, but with the correct dependencies installed it can create useful illustrations for viewing code and dependencies. Also, when creating the documentation you can configure it to include the source code in the documentation. For me the output was html and I actually found it easier to browse through the generated Doxygen documentation with my web browser than the source code itself using the IDE! Also useful is the fact that Doxygen can tell you which functions a particular function calls, and which functions your function calls. This proved to be useful when looking for things to refactor while attempting to simplify the code.

Beautiful code

I had never really had the need for a beautifier before, but this time I wanted to make the source easier to read, and also replace tabs with spaces and a few other things. I found a beautifier named UniversalIndentGUI which also works with more than one programming language, which I think is a plus. I fed all the source files to it and out popped “beautifully formatted” C++ source code. Voilà!

Unit testing and mocking framework

In Java development, unit testing is part of everyday life and has been for quite some time. However, where JUnit is the defacto standard for unit testing for Java, there is no similar single tool which has widespread adoption for C++ development. There are many tools available, but I had a hard time picking the one which I thought had the most potential and most active user community. In the end my choice fell on Google Test which proved to be a useful tool. Along with Google Mock, a mocking framework for C++, they provide functionality for unit testing and creating mock objects.

I spent a lot of the project time trying to refactor the code to use these tools. Unfortunately the code was riddled with references to a third part library, Lotus Domino C++ API, which I could not get working with GTest. Therefore a lot of the work was trying to narrow the usage of this library to only certain parts of the code. Although this was always in my plans, I never got quite that far and ran out of time, which was a shame. Refactoring can be time-consuming…

Project improvements

I added a simple readme file and change log to each project and moved any comments referring to a changes from the source code to the change log. I hope this will prove useful to any future developers for getting a head start and saving them from starting off with the source itself. With a simple attribute, Doxygen let me include the contents of each of the files in to the generated Doxygen documentation, which I though was a nice touch.

Lasting impressions

As I said earlier, I will miss working with C++. That said, I feel I can better appreciate the syntax improvements of languages such as C#, Java and Python. I think these languages better facilitate the creation of object-oriented code without syntax getting in the way, so to speak. C++ does make you work harder, but supplies more power in return (if you need it!). It is useful to keep in mind that trying to write C++ code in a Java or C# style may well provide you with unwanted memory leaks. In C++ you use the new and delete operators to create object instances on the heap, whereas Java and C# provide garbage collection to handle the deletion of objects no longer being referenced, as you probably know. Take this example, a Java method for fetching a bucket of water could look something like this:

public Bucket createBucketOfWater() {
    Bucket b = new BucketImpl();
    return b;

Inside the method a new instance of a Bucket class is created and initialised. The memory used for this object will be reclaimed by garbage collection once the myBucket reference to the object is invalidated. The caller does need to think of this – it happens automatically.

// someObjectInstance creates and initialises the Bucket class, the garbage collector handles the memory when the myBucket reference goes out of scope
Bucket myBucket = someObjectInstance.createBucketOfWater();

Doing something similar in C++ may not be a good idea. You may end up with something like:

// create a new Bucket of water, return a pointer to the memory on the heap
Bucket* CreateBucketOfWater() {
    Bucket* b = new BucketImpl();
    return b;

This code works, but will burden the caller to delete the memory used for the Bucket when done. If, for some reason, the caller should forget, the memory will be lost once the pointer variable is invalidated. We then have a memory leak.

// create a new Bucket of water, return a pointer to the memory on the heap
Bucket* b = CreateBucketOfWater();

// must remember to delete memory on heap
delete b;

A useful rule of thumb to remember is that objects should be created and deleted by the same part of the code, not spread around. In other words a function or method should not create an object on the heap and then leave it up to the caller to tidy up when done. So how do we avoid this scenario? A more suitable C++ approach could be something like this:

// function body not relevant
void FillBucketWithWater(Bucket*);
// create a Bucket instance and pass an object pointer to the method, remember to delete the memory when done
Bucket* b = new WaterBucket();
delete b;

So to conclude, where in Java you would ask the method for a bucket of water, in C++ you would supply your own bucket and then use another method to fill it with water! When you are done with the bucket you are responsible for deleting it since you created it.

However, although this is a clear division of responsibilities, it does make me wonder how to properly create a factory method without burdening the caller to delete any created heap objects that the factory creates.