Maintaining and refactoring C++

Last week was my last day working with C++ (for a while). It’s been quite fun to revisit both the programming language and the source code which kicked off my development career over 12 years ago, and I have enjoyed the experience a lot. There are also a few things worth noting, so I put together a short list of things I found interesting during this short maintenance assignment.

Introducing a source control system

The code was originally written in 1999 and the executable files have been running in production ever since. Today the programs are owned by a group in the enterprise operations team. Their focus is to keep the systems up and running and they have little interest in the development process. There was no source control system available when I originally developed the code so, before making any changes to the existing source, I was determined to correct that. A few months ago I taught myself Git and have never looked back. Git is an excellent tool and this was an appropriate opportunity to introduce it as a suitable source control system for this code base. Being the sole maintenance developer of these programs, I was happy just to add Git to aid my own productivity and give me the ability to safely abort a change should the need arise (and it did), but it will also pay off in the long run.

Updating to a new IDE

Once a source control system was in place, the next step was to pick out the correct file candidates from each project to be checked in to the repository. I didn’t want every project file source controlled and this was a good occasion to get a bit more familiar with some of the lesser known project files used by the IDE, and also how to configure Git to filter file names/paths. Originally, the projects were all developed using Microsoft Visual C++ version 6 so the first step was to get them updated to a newer C++ IDE, which just happened to be Visual Studio 2008. Once the project files I needed were identified, these were checked in to the repository and tagged as the base version. Safe and ready to go!

Automatically updating the projects from Visual C++ 6.0 projects to Visual Studio 2008 solutions went ahead problem free – the IDE handled it all. My job was then to rid myself of the unnecessary project files only used by the old IDE. The (newer) Visual Studio C++ compiler has grown a lot “smarter” so a few syntax bugs had to be ironed out before the old code would build. There were also warnings due to calls to C++ standard library functions that now were deemed unsafe. In most cases a safer alternative was suggested.
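To give an idea of the kind of warning involved: a call such as strcpy typically triggers warning C4996 in the newer compiler, which suggests the Microsoft-specific strcpy_s instead. Where the code allows it, a portable alternative is to let std::string manage the buffer. The sketch below is only an illustration of that idea; BuildPath and the path-building scenario are hypothetical and not taken from the original programs.

```cpp
#include <string>

// Hypothetical example: joining a directory and a file name.
// The older style would be something like:
//     char path[260]; strcpy(path, dir); strcat(path, name);
// which overruns the buffer if the inputs are too long.
// std::string grows as needed, so there is no fixed buffer to overrun.
std::string BuildPath(const std::string& dir, const std::string& name) {
    std::string path = dir;
    // Add a separator only if the directory does not already end with one.
    if (!path.empty() && path[path.size() - 1] != '\\') {
        path += '\\';
    }
    path += name;
    return path;
}
```

This only works where the surrounding code can accept a std::string. If a raw char buffer is required by an API, the suggested _s functions may be the simpler fix.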

Visual Studio 2008 is not unfamiliar to me, and those following this blog will know that I have used it for C# development, but never for C++. I was surprised how it lagged its C# cousin in functionality. Among other things, there is little or no support for MSBuild and the IDE has no refactoring functionality. The latter was a real let-down since refactoring C++ proved to be considerably more difficult than in any other modern language I have encountered. However, a few things made the update worth it: a better compiler and some IDE features like folders for structuring the source files. Visual Studio 2008 also has line numbering, which I’m pretty sure was missing in the Visual C++ 6 source code editor.

Documentation and getting familiar with the source code

By chance, I came across Doxygen when googling for free C++ tools. Since Doxygen can also be used for C#, Java and Python (untried, but according to the documentation) I thought it would be worth the time to take a closer look at this tool, and that proved to be a wise decision. Doxygen is brilliant! I have not used it for the other languages it supports, but I plan to for my next project. Its syntax may remind you of JavaDoc, but with the correct dependencies installed it can also create useful diagrams for viewing code and dependencies. Also, when creating the documentation you can configure it to include the source code in the documentation. For me the output was HTML and I actually found it easier to browse through the generated Doxygen documentation with my web browser than the source code itself using the IDE! Also useful is the fact that Doxygen can tell you which functions a particular function calls, and which functions call it. This proved to be useful when looking for things to refactor while attempting to simplify the code.
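For those who have not seen it, a Doxygen comment block looks roughly like the sketch below. The function BucketsNeeded is a made-up example, not code from the project; the \brief, \param and \return commands are standard Doxygen commands.

```cpp
/**
 * \brief Calculates how many buckets are needed to carry a volume of water.
 *
 * Rounds up, so a partly filled bucket still counts as one bucket.
 *
 * \param litres         Total volume to carry, in litres.
 * \param bucketCapacity Capacity of one bucket, in litres.
 * \return The number of buckets needed, or 0 if bucketCapacity is 0.
 */
unsigned int BucketsNeeded(unsigned int litres, unsigned int bucketCapacity) {
    if (bucketCapacity == 0) {
        return 0;  // avoid dividing by zero
    }
    // Integer division that rounds up.
    return (litres + bucketCapacity - 1) / bucketCapacity;
}
```

The call and caller diagrams are switched on in the Doxyfile with options such as HAVE_DOT, CALL_GRAPH and CALLER_GRAPH (these need Graphviz installed), and SOURCE_BROWSER includes the source code in the output.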

Beautiful code

I had never really had the need for a beautifier before, but this time I wanted to make the source easier to read, and also replace tabs with spaces and a few other things. I found a beautifier named UniversalIndentGUI which also works with more than one programming language, which I think is a plus. I fed all the source files to it and out popped “beautifully formatted” C++ source code. Voilà!

Unit testing and mocking framework

In Java development, unit testing is part of everyday life and has been for quite some time. However, where JUnit is the de facto standard for unit testing in Java, there is no single tool with similarly widespread adoption for C++ development. There are many tools available, and I had a hard time picking the one which I thought had the most potential and the most active user community. In the end my choice fell on Google Test, which proved to be a useful tool. Together with Google Mock, a mocking framework for C++, it provides functionality for unit testing and creating mock objects.

I spent a lot of the project time trying to refactor the code to use these tools. Unfortunately the code was riddled with references to a third-party library, the Lotus Domino C++ API, which I could not get working with GTest. Therefore a lot of the work went into trying to narrow the usage of this library to only certain parts of the code. Although this was always in my plans, I never got quite that far and ran out of time, which was a shame. Refactoring can be time-consuming…
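To illustrate what I mean by narrowing the usage of a library, here is a minimal sketch of the general idea: hide the library behind a small interface so that the rest of the code, and the unit tests, never include its headers. All names here (IMailStore, CountUrgent, FakeMailStore) are hypothetical, and the real class that would wrap the Lotus Domino calls is deliberately left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface: the only view of the mail database that the
// rest of the code is allowed to see. No third-party headers needed here.
class IMailStore {
public:
    virtual ~IMailStore() {}
    virtual std::vector<std::string> UnreadSubjects() = 0;
};

// Business logic depends on the interface only, so it can be unit tested
// without the third-party library being present.
std::size_t CountUrgent(IMailStore& store) {
    std::size_t urgent = 0;
    const std::vector<std::string> subjects = store.UnreadSubjects();
    for (std::size_t i = 0; i < subjects.size(); ++i) {
        if (subjects[i].find("URGENT") != std::string::npos) {
            ++urgent;
        }
    }
    return urgent;
}

// Hand-written fake used by the tests in place of the real,
// library-backed implementation.
class FakeMailStore : public IMailStore {
public:
    std::vector<std::string> subjects;
    virtual std::vector<std::string> UnreadSubjects() { return subjects; }
};
```

A test can then fill a FakeMailStore with a couple of subjects and check the result of CountUrgent. In production, a single class implementing IMailStore would be the only place that touches the library.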

Project improvements

I added a simple readme file and change log to each project and moved any comments referring to changes from the source code to the change log. I hope this will give future developers a head start and save them from having to begin with the source itself. With a simple attribute, Doxygen let me include the contents of each of these files in the generated Doxygen documentation, which I thought was a nice touch.

Lasting impressions

As I said earlier, I will miss working with C++. That said, I feel I can better appreciate the syntax improvements of languages such as C#, Java and Python. I think these languages better facilitate the creation of object-oriented code without syntax getting in the way, so to speak. C++ does make you work harder, but supplies more power in return (if you need it!). It is useful to keep in mind that trying to write C++ code in a Java or C# style may well provide you with unwanted memory leaks. In C++ you use the new and delete operators to create object instances on the heap, whereas Java and C# provide garbage collection to handle the deletion of objects no longer being referenced, as you probably know. Take this example, a Java method for fetching a bucket of water could look something like this:

public Bucket createBucketOfWater() {
    Bucket b = new BucketImpl();
    b.fill();
    return b;
}

Inside the method a new instance of a Bucket class is created and initialised. The memory used for this object will be reclaimed by garbage collection once the caller’s reference to the object (myBucket in the calling code below) is no longer reachable. The caller does not need to think about this – it happens automatically.

// someObjectInstance creates and initialises the Bucket class, the garbage collector handles the memory when the myBucket reference goes out of scope
Bucket myBucket = someObjectInstance.createBucketOfWater();
myBucket.DoSomething();

Doing something similar in C++ may not be a good idea. You may end up with something like:

// create a new Bucket of water, return a pointer to the memory on the heap
Bucket* CreateBucketOfWater() {
    Bucket* b = new BucketImpl();
    b->FillWithWater();
    return b;
}

This code works, but will burden the caller to delete the memory used for the Bucket when done. If, for some reason, the caller should forget, the memory will be lost once the pointer variable is invalidated. We then have a memory leak.

// create a new Bucket of water, return a pointer to the memory on the heap
Bucket* b = CreateBucketOfWater();
b->DoSomething();

// must remember to delete memory on heap
delete b;

A useful rule of thumb to remember is that objects should be created and deleted by the same part of the code, not spread around. In other words a function or method should not create an object on the heap and then leave it up to the caller to tidy up when done. So how do we avoid this scenario? A more suitable C++ approach could be something like this:

// function body not relevant
void FillBucketWithWater(Bucket*);
// create a Bucket instance and pass an object pointer to the method, remember to delete the memory when done
Bucket* b = new WaterBucket();
FillBucketWithWater(b);
b->DoSomething();
delete b;

So to conclude, where in Java you would ask the method for a bucket of water, in C++ you would supply your own bucket and then use another method to fill it with water! When you are done with the bucket you are responsible for deleting it since you created it.
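When the bucket does not need to outlive the calling function, the same idea can be taken one step further by skipping new and delete altogether. The sketch below assumes a simple, hypothetical Bucket class with a default constructor; the bucket lives on the stack and is passed by reference.

```cpp
// Hypothetical Bucket class, for illustration only.
class Bucket {
public:
    Bucket() : litres_(0) {}
    void Fill(int litres) { litres_ = litres; }
    int Litres() const { return litres_; }
private:
    int litres_;
};

// Fills a bucket owned by the caller. It takes a reference,
// so no ownership changes hands.
void FillBucketWithWater(Bucket& bucket) {
    bucket.Fill(10);
}

int CarryWater() {
    Bucket b;                  // created on the stack, no new
    FillBucketWithWater(b);
    return b.Litres();
}                              // b is destroyed here automatically, no delete
```

This does not cover every case. If the concrete type is only known at run time, or the object must outlive the function that created it, the heap is still needed.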

However, although this is a clear division of responsibilities, it does make me wonder how to properly create a factory method without burdening the caller to delete any created heap objects that the factory creates.
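One possible answer, as far as I can tell, is a smart pointer: the factory returns an object that owns the Bucket and deletes it automatically when it goes out of scope. The sketch below uses std::unique_ptr, which requires C++11 and is therefore not available in Visual Studio 2008. The Bucket and BucketImpl classes are hypothetical stand-ins for the ones in the examples above.

```cpp
#include <memory>

// Hypothetical classes standing in for the Bucket types used above.
class Bucket {
public:
    virtual ~Bucket() {}
    virtual void FillWithWater() = 0;
    virtual bool IsFull() const = 0;
};

class BucketImpl : public Bucket {
public:
    BucketImpl() : full_(false) {}
    void FillWithWater() override { full_ = true; }
    bool IsFull() const override { return full_; }
private:
    bool full_;
};

// Factory: ownership of the new Bucket is handed to the caller
// through the returned unique_ptr.
std::unique_ptr<Bucket> CreateBucketOfWater() {
    std::unique_ptr<Bucket> b(new BucketImpl());
    b->FillWithWater();
    return b;
}

bool UseBucket() {
    std::unique_ptr<Bucket> b = CreateBucketOfWater();
    return b->IsFull();
}   // the Bucket is deleted here, without an explicit delete
```

The caller still owns the bucket, but can no longer forget to delete it, which seems to keep the convenience of the Java-style factory without the leak.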

Google Test (GTest) setup with Microsoft Visual Studio for C++ unit testing

Introduction

[Links now include solution files for both 2008 and 2010 versions of Visual Studio]

I’m going to be nice to you today and save you some time. What I am about to describe took me the better part of two (half) workdays… with a few hours’ sleep in between. Setting up Google Test with Microsoft Visual Studio can be a bit tricky, but if you really want unit testing for C++ in Visual Studio (and I hope you do) then this is for you. Most of the challenges can be overcome by configuring the compiler and linker correctly.

It’s worth mentioning that before settling on Google Test, or GTest as it’s also named, I did take a look at a few of the other unit test frameworks for C++, but things didn’t seem any easier anywhere else. GTest doesn’t seem like a bad choice: it’s open source, used to test the Google Chromium Projects (Chrome) and, more importantly, seems to be actively maintained.

There is a fair bit of documentation available on the project site, but sometimes you just want to get a feel for something before committing yourself to it. This posting should help you do that, but if you want more, the project has good documentation. In my quest for documentation I noticed several guides, a FAQ, a Wiki and a mailing list. In other words, there are good sources of information available if you choose to dive in.

Disclaimer

I suppose a disclaimer is in order for those wondering:

  • I only work with C++ in passing. It’s not something I do much of these days and my working knowledge of Microsoft Visual Studio for C++ is limited.
  • I used Visual Studio 2008 Professional Edition for this work. I also updated the project using Microsoft Visual Studio 2010 Professional Edition (see links below). Maybe the Express versions will work too?
  • I am not affiliated with Google in any way. The reason I am looking in to this particular framework is because I am currently maintaining some older C++ programs that I wrote 10 years ago. I want to introduce unit testing for them before making changes and GTest seems a good choice.

So, in this posting I want to share with you how I configured Visual Studio 2008 to work with the GTest framework. After spending a fair bit of time getting this to work, I want to write it all down while it’s still fresh in my mind.

The GTest binaries for unit testing

First things first: you need to download the Google Test Framework. I use version 1.5.0, which seems to be the current stable release. I unpacked the GTest project to a folder named C:\Source\GTest-1.5.0\ which I then refer to from other projects in need of the unit testing library. I call this directory %GTest% in the text that follows. Be aware that, as far as I remember reading, Google recommends adding the GTest project to your own solution and building it together with your own code, but this is how I do it for this sample project.

If you are coming from a Java world then this may be where you hit your first snag. It may be a bit different from what you have grown accustomed to with Eclipse, JUnit and all, but you will have to build the unit test binaries from the downloaded C++ source code. Yes, you will actually have to compile and build the GTest libraries yourself, but before you lose heart, let me add that it comes with project files for many popular C++ IDEs, Visual Studio being one of them (older version). In the msvc/ folder of the download you will find two Visual Studio solution files which VS 2008 will ask you to upgrade when you open them.

I had no trouble building the binaries. In fact, I can’t remember actually having to configure anything so don’t be put off by this step. However, there is an issue here: there are two solution files and you must choose the correct version to use with your project. The solution file with the -md suffix uses DLL versions of Microsoft runtime libraries, while the solution with no suffix uses static versions of the Microsoft runtime libraries. The important thing to note is that you must correctly set the C++ Code Generation setting for the Debug and Release configurations in your project to the exact same setting used when building GTest. If you experience linker problems somewhere down the line in your project then this might be the cause. Most of the trouble I have experienced while building has been due to this setting being incorrect. The project’s README file does a better job of explaining all this so be sure to have a look. For my code I am using the static versions of the runtime libraries, so for me that’s /MT for the Release configuration and /MTd for the Debug configuration. I use the GTest solution without the -md suffix.
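If you are unsure which runtime library setting a project is actually being compiled with, the predefined compiler macros can tell you. The helper below is a hypothetical sketch, not something from GTest or my sample project. It relies on the Microsoft compiler defining _DLL for the DLL runtime (/MD, /MDd) and _DEBUG for the debug runtime (/MTd, /MDd).

```cpp
#include <string>

// Hypothetical helper: reports which runtime library option this
// translation unit was compiled with, based on predefined macros.
std::string RuntimeLibraryFlavor() {
#if !defined(_MSC_VER)
    return "not MSVC";   // the macros below are Microsoft-specific
#elif defined(_DLL) && defined(_DEBUG)
    return "/MDd";       // DLL runtime, debug
#elif defined(_DLL)
    return "/MD";        // DLL runtime, release
#elif defined(_DEBUG)
    return "/MTd";       // static runtime, debug
#else
    return "/MT";        // static runtime, release
#endif
}
```

Printing this from both your own project and a small test program linked against the GTest libraries should make a mismatch easier to spot than reading linker errors.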

In any case, if you plan on using both Debug and Release configurations in your own project then you should remember to also build the GTest solution for both Debug and Release configurations. Among other things, the Release configuration will build two files, gtest.lib and gtest_main.lib, and similarly, the Debug configuration will also build two files, namely gtestd.lib and gtest_maind.lib (notice the extra -d- character in the file names).

Project setup

Now that you have successfully generated the libraries for unit testing, we need to incorporate them in to a C++ project. The GTest documentation will show you some simple examples of how to create unit tests using the framework, but it won’t say much about how to set up a good project structure for unit testing. I guess this is to be expected since it could be very environment specific.

My preference is to avoid making the unit tests part of the resulting binary (EXE file), and I don’t want to have to restructure my existing project (too much) to add unit testing. I simply want to add unit tests to my project, but avoid making my existing project code aware that it’s now being unit tested. So, my solution is based on what I’ve grown accustomed to with Java development with Eclipse, or C# development with Visual Studio. Maybe this is also the norm in other C++ projects? The idea is to split the solution in to three separate projects:

  1. One project containing the base code which will function as a library for the others
  2. One project used for running main(), the application entry point, which makes calls to functionality in the library
  3. One project for running unit tests which also makes calls to the same library functionality. In GTest the main() function entry point can be optional if you use gtest_main.lib.

The screenshot below shows what this may look like in Visual Studio:

Solution view in Visual Studio 2008

This setup requires the BaseCode project to be built as a library (LIB) file. The two other projects will build as EXE files that both depend on the LIB file, so each project’s dependencies must be set up to depend on the BaseCode project. When attempting to build the solution using this project structure, these are the things to watch for:

  • The BaseCode project must be configured to build as a library. For both configurations, Release and Debug, you must set the project’s Configuration Type to Static Library (.lib). Its Code Generation must be set to Multi-threaded (/MT) for the Release configuration and Multi-threaded Debug (/MTd) for the Debug configuration (must be identical to the GTest project explained earlier).
  • The RunBaseCode project is used to create the EXE for the resulting application so its Configuration Type is set to Application (.exe), which is the default. It depends on the BaseCode library so its project dependency must be set to depend on the BaseCode project. The Code Generation should also be set as explained above.
  • The TestBaseCode project is also used to create an EXE, but only for running the test cases – it’s not something you ship. It also depends on the BaseCode library so its project dependency must be set to depend on the BaseCode project. As before, its Code Generation should be set as explained above.
  • Since the TestBaseCode project needs to run the unit tests it must refer to the GTest libraries. Of the three projects, it is the only project which needs this. Therefore, for both Release and Debug configurations, set the Additional Include Directory setting to refer to the %GTest%\include directory.
  • The TestBaseCode Release configuration’s Additional Library Directories setting should refer to the %GTest%\msvc\gtest\Release directory. The Additional Dependencies setting should list the libraries gtest.lib and gtest_main.lib. Similarly, for the Debug configuration the Additional Library Directories setting should refer to the %GTest%\msvc\gtest\Debug directory and the Additional Dependencies should list the libraries gtestd.lib and gtest_maind.lib (notice the extra -d- character in the file names). Of course, if you have set up your GTest libraries somewhere else then you have to refer to those directories instead.
  • The Command Line setting for TestBaseCode‘s Post-Build Event can be set to “$(TargetDir)$(TargetFileName)” for both Release and Debug configurations. This will run the unit tests automatically and display the results in the Build output window after building the project.

If you are successful, the build output should look something like this:

Screenshot of the build log

You will notice that the unit tests are run automatically and results displayed. The build creates two EXE files as expected, one for the application and one for the unit tests:

Screenshot of running the code and tests

If you get this far you might also want to check out the gtest-gbar project, which is a graphical UI for the unit tests. It’s a simple, one-file .NET application. By pointing it at the unit test EXE file you can get output like this:

Screenshot of gtest-gbar

Closing

For simplification, I’m linking to the Visual Studio 2008 solution I used to create the example so you can have a look at my solution settings. If you are using Visual Studio 2010 then use this solution. Have a look, build it and see if it works for you! You will also need to download, build and refer to the GTest framework LIB files and include folder as described above. Tell me how you get on and what Visual Studio version you were using (2008, 2010, Express etc). Your feedback would be greatly appreciated!!

Now that I’ve got this set up the next step for me is to incorporate GTest unit testing in to my current C++ projects. There’s a lot to learn…