Introduction to Refactoring

Jim Cooper - Tabdee Ltd


Introduction

What is Refactoring?

Why Refactor?

    Aims of Refactoring

    Opportunities for Refactoring

Refactoring Techniques

    Learning to Apply Refactorings

    Extract Method

    Extract Class

    Inline Class

    Move Method

    Inline Temp

    Split Temporary Variable

    Rename Method

    Replace Magic Number with Symbolic Constant

    Replace Error Code with Exception

    Encapsulate Field

    Self Encapsulate Field

    Remove Control Flag

    Form Template Method

    Introduce Local Extension

    Replace Type Code with State/Strategy

    Replace Conditional with Polymorphism

    Replace Method with Method Object

    Hide Delegate

    Duplicate Observed Data

    Introduce Null Object

    Encapsulate Downcast

    Moving Methods and Fields Up and Down a Hierarchy

Converting Refactorings to Delphi

Bad Smells – When to Refactor

    Duplicated Code

    Long Method

    Large Class

    Long Parameter List

    Divergent Change

    Shotgun Surgery

    Feature Envy

    Data Clumps

    Primitive Obsession

    Switch Statements

    Parallel Inheritance Hierarchies

    Lazy Class

    Speculative Generality

    Data Class

    Refused Bequest

    Comments

Problems

Refactoring Tools

    Rename Symbol

    Sync Edit

    Extract Method

Summary

References

Source code for this paper.


Introduction

This paper will serve as an introduction to the discipline of refactoring. We will discuss what it is, and what benefits and problems there are. We will talk about guidelines for applying refactoring methods - when to refactor, which techniques to use, and how to apply them, and when to stop.

The main reference is Martin Fowler’s book Refactoring, and we will follow his lead in regard to terminology, guidelines and naming. However, he used Java, and some refactorings are language specific, so we will also examine how to modify the techniques for Delphi, and see a couple of examples of Delphi-specific refactorings.

As this is an initiation into the method, there was the difficult decision of what example to use. As always with these sessions, time is short, and speakers are trying to get certain points across. If an example is too complex, too much time is spent explaining the background rather than the important points. So what I'm trying to say, in a weaselly sort of way, is that the main example we will see of refactoring in action will appear quite contrived. It is. I chose it because it's simple to understand, and because it demonstrates a number of refactorings. But because it is so straightforward, we will take the process rather further than would probably be necessary in a real project. I'll mention it again when we start going through the example, but please bear in mind that it is an example of techniques, rather than a convincing example of the long-term benefits of refactoring.

Until you try it yourself in a significant project, you will have to take it on faith that these techniques will make coding faster and easier and software more reliable. Such a claim is difficult to prove in such a short time. It also means that the close relationship between unit testing and refactoring may not be emphasised enough in the demonstration, so I’ll be bringing it up from time to time.

There are other, somewhat longer, examples in the first chapter of Fowler’s book and in the Brandon Smith article in The Delphi Magazine.

You should not necessarily expect to see anything startlingly new in this paper. Many of the techniques will be familiar to you. Like patterns, refactorings are a description of best practice, and many programmers have therefore been applying these principles for years. However, we will see that a more rigorous approach is possible. Many of the same advantages that accrue from work on patterns apply here, too. We have standardised descriptions, advice and a process to follow, but perhaps most importantly, we have names. We have the same sort of verbal shorthand that we use when talking about patterns.

What is Refactoring?

    “Refactoring is the process of changing a software system
    in such a way that it does not alter the external behaviour
    of the code yet improves its internal structure.”

            Martin Fowler, Refactoring

He goes on to describe it as “improving the design after it has been written”. The term also describes the individual techniques in the process.

You should note that sometimes a change in the behaviour of a program is required, but that is not a refactoring. Refactoring only changes the structure of the code, not the overall behaviour of the software.

We will see that each step is very simple and small, and is done in such a way as to minimise the chances of introducing bugs. The effect of a series of these incremental changes can be a radically improved design. When done properly, refactoring is intimately associated with testing, particularly unit testing. After each refactoring, no matter how small, the test suite should be run (sometimes the tests have to be modified too). This way, there is a good chance of finding any bugs as soon as they are introduced, because they are probably in the code you just changed. Waiting until after a series of refactorings to test makes bug finding much more difficult.

It is, of course, possible to refactor and not use unit tests. However, in this case, your confidence in your code should be far lower. I have had problems writing unit tests sometimes, particularly for components that are part of a framework. Two components spring to mind that were part of an object persistence framework, mapping business objects to databases. These components were implementations of the Observer pattern, and were intended to sit between a GUI and the business objects, applying changes to various properties, displaying those changes and so on. Because these components worked together, were quite complex, and relied on a GUI being available, I didn’t bother writing any unit tests. The result was that these components were the buggiest part of the framework. Refactoring them was risky because they were used everywhere, and I could not be confident I hadn’t broken them with any changes I made. By contrast, the part of the framework that handled the mapping of business objects to the database had a suite of tests, and turned out to be far more reliable, and I could make changes in relative safety.

Why Refactor?

What is code for? A chunk of code written in some programming language is NOT a program, just one way of visualising the model of one. There are other ways – Bold uses UML diagrams and OCL, for instance. Code is not required by a computer. In fact, we have to use a computer to turn our code into something the computer understands. Code is purely for use by programmers. As such, it should be easy for humans to read, understand and modify. Practices like refactoring and using design patterns help make the code more usable for us, not computers.

Most software spends far longer in maintenance mode than it took to write in the first place – normally up to 90% of the program lifetime. Maintenance means fixing bugs, changing the program behaviour to meet changing requirements, and adding new features. All of these activities mean modifying or extending existing code. So the readability and maintainability of the code base should be the paramount features of any program you develop. Refactoring is a process that will help achieve that.

Aims of Refactoring

  1. Improve the design of software.
  2. Make the software easier to understand. In fact, Fowler uses refactoring to help him understand unfamiliar code. I’ve tried this myself with some success. One point to note here is that it may not be possible to write tests until you understand the code, so this may be one time to defer, not ignore, unit testing.
  3. Help you find bugs, because refactoring helps give a deeper understanding of the code.
  4. Help you program faster in the long term.
  5. Make it easier to talk about the process by formalising it, and giving the techniques names.

Most refactorings reduce the amount of code (Microsoft therefore seem not to be big fans), and they all work to improve the organisation of code, so applying these techniques helps stave off the chaos that is a feature of most long-lived code. You will often find that a lot of small, useful helper classes fall out of the refactoring process, many of which can be included in libraries for later use.

Opportunities for Refactoring

  1. When you add functionality. One reason to refactor at this point is to make adding the new feature easier; the other is to improve your understanding of the existing code, especially if it was written by someone else. However, it is best to keep the activities of refactoring existing code and adding new code separate. If you find the new feature would be easier to add if the code were restructured, stop adding code and refactor for a while. When you’re happy, go back to adding the new code. This reduces the chance of introducing bugs.
  2. When you need to fix a bug. Again, the need to understand the code drives the need to refactor. The fact that there is a bug suggests that the code was not clear enough, because nobody saw it.
  3. During a code review, both to improve understanding of another’s code, and to come up with suggestions for improvements. Anyone who has done pair programming will know that it’s like a continuous code review, and refactoring is pretty much a constant activity.

Obviously, refactoring takes time to do, and it isn’t always obvious that it will save time in the long run. You may have difficulty convincing management of the benefits, but since each individual refactoring is small, you might be able to sneak it past them without their noticing.

We will examine how to decide what code needs refactoring later, after I cut all the waffle, and we get down to looking at some examples.

Refactoring Techniques

We will examine some of the more common and useful refactorings in some detail. We will not be able to look at all those listed by Fowler (let alone the extra ones on the Refactoring website), and in any case, it isn’t necessary. It’s better to start with the main ones, learning how and when to apply them, and then go browsing through the book, looking for others that strike a chord. You can use the Smells chart inside the back cover as a guide (we will look at smells later).

I haven’t mentioned it for a while, but you will need to implement unit tests. If you’re unfamiliar with these, they are very small, quick tests that check the behaviour of the methods and properties of your classes. They are not intended as a replacement for your normal QA process, which will look at larger scale activities. They are also not really intended for integration testing, and in my opinion, are poorly suited to user interface testing. The references at the end of this paper have more information about this. There is a free library called DUnit which you should probably use, and the website is also given below. The code with this paper includes the unit tests for one of the examples.

Some points to note about refactorings:

  1. They are usually more atomic actions than patterns, but some refactorings do implement patterns (the simpler ones like Template and Strategy).
  2. Some refactorings in Fowler’s book are very Java based, and some rely on Java language features like final. We will see how to convert his instructions to Delphi later. In general, different languages will require some different and/or modified refactorings.
  3. Not many refactorings are about using interfaces.
  4. Many refactorings make use of other, simpler refactorings (e.g. Extract Class uses Move Field and Move Method).
  5. Some seem contradictory (we will see Extract Class and Inline Class later) – the aim is code that is easier to understand. Make decisions based on that.
  6. It is possible to mechanically apply some refactorings, but you cannot necessarily decide mechanically which to apply.
  7. It is possible to go too far.

Learning to Apply Refactorings

  1. Don’t try to learn them all at once. You can always apply new ones later.
  2. Start with the smells and the recommendations to fix them. Then follow other refactoring paths as recommended in the initial fixes.
  3. Always, always, always test!
  4. Don’t refactor blindly – always check to see if this will make your code clearer (because there are opposite refactorings intended for use in different situations).
  5. Develop your own refactorings and bad smells.
  6. Eventually, you’ll apply the principles as you’re writing the code. Be careful how far down that road you go; don’t make the code unnecessarily general.
  7. We’re allowed to disagree with Fowler, but he’s probably right, and we’re probably wrong.
  8. Don’t worry about optimising the code until after you’ve refactored and checked for bottlenecks with a profiler. Programmers are notoriously bad at guessing where optimisations are needed, and are infamous for thinking that that statement applies only to other programmers.

Let’s look at a few refactorings, and see how they work.

Extract Method

Probably the most used technique. We’ve all done this – we’ve identified a piece of code that we want to reuse, or is making a method a bit long, so we create a new method, and cut the offending code out of the old method and paste it into the new one. Seems simple enough, but there are some complications, enough to make Fowler say that this refactoring is the litmus test of a refactoring tool – if it can cope with this properly, then it’s probably useful.

So, what’s different about a formal Extract Method refactoring? Well, like a design pattern description, there is a certain structure to the definition. Firstly we have a name, obviously Extract Method in this case.

Next there is a short summary describing when the refactoring is needed, and what it does. For this example we have:

You have a code fragment that can be grouped together.

Turn the fragment into a method whose name explains the purpose of the method.

The format and text styles are typical, and are used on the website as well as in the book. This is usually followed by a short code example, and then the motivation describing why the refactoring should be done and when it shouldn’t.

This is followed by the mechanics, giving a step by step guide to performing the refactoring, and then one or more examples, showing how the steps are applied. Typically these are longer than the initial one, and contain more discussion of what’s happening.

Obviously, we’re not going to have all that for each of the refactorings that we will examine, as that would be too much plagiarism, even for me. What we’ll do is hit the high points of a few of the refactorings, leaving a fuller discussion to the book and website.

Extract Method is typically used when a method is too long, or the code needs a comment for it to be understandable. Extracting the relevant code out into its own method allows it to be called elsewhere, and makes the original method easier to read. However, this relies on the new method being well named.

The mechanics are as follows:

  • Create a new method, and give it a name that describes what it does (not how it does it). Sometimes a single line of code might be extracted if that will give it a more meaningful name.
  • Copy the extracted code from the original method into the new one.
  • The new method may be using variables and parameters from the original, so we need to deal with them.
  • Any local variables from the original method that are only used in the extracted code can be made local variables of the new method instead.
  • Some of the remaining variables from the original method may be modified in the new method. If there is only one, then make the new method a function and assign the result to the variable in the original routine. If there is more than one, then you need to do something else first. In Delphi it is possible to have several var parameters. Other options include applying the refactorings Split Temporary Variable (using a different local variable each time it is assigned to) and Replace Temp With Query (where the local variable is removed, and we just call the method that computes its value every time we need it; this sounds inefficient, but isn’t necessarily).
  • Pass any local variables from the original method used in the new one as parameters.
  • Compile.
  • Replace the extracted code in the original method with a call to the new one.
  • Compile and test.
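
As a minimal sketch of these mechanics, consider the following console program. TInvoice and all its members are hypothetical and are not part of the source code for this paper. PrintBanner was extracted from code that used no local variables, while CalculateOutstanding was extracted from code that modified exactly one local variable, so it became a function whose result the original method assigns.

```pascal
program ExtractMethodSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical class, used only to illustrate the mechanics
  TInvoice = class(TObject)
  private
    FAmounts : array of Double;

    procedure PrintBanner;
    function  CalculateOutstanding : Double;
  public
    procedure AddAmount(const Value : Double);
    procedure PrintOwing;
  end;

procedure TInvoice.AddAmount(const Value : Double);
begin
  SetLength(FAmounts, Length(FAmounts) + 1);
  FAmounts[High(FAmounts)] := Value;
end;

// Extracted code that used no local variables of the original method
procedure TInvoice.PrintBanner;
begin
  WriteLn('*** Customer owes ***');
end;

// Extracted code that modified one local variable of the original method,
// so it is a function and the caller assigns the result
function TInvoice.CalculateOutstanding : Double;
  var
    i : Integer;
begin
  Result := 0;
  for i := Low(FAmounts) to High(FAmounts) do begin
    Result := Result + FAmounts[i];
  end;
end;

// The original method now reads like a summary of what it does
procedure TInvoice.PrintOwing;
  var
    Outstanding : Double;
begin
  PrintBanner;
  Outstanding := CalculateOutstanding;
  WriteLn(Format('Amount: %.2f', [Outstanding]));
end;

var
  Invoice : TInvoice;
begin
  Invoice := TInvoice.Create;
  try
    Invoice.AddAmount(10.5);
    Invoice.AddAmount(4.5);
    Invoice.PrintOwing;
  finally
    Invoice.Free;
  end;
end.
```

Note that the names say what the extracted code does, not how it does it, which is what lets PrintOwing stay readable without comments.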

Examples of this in action are pretty common, and we’ll see some later in the Replace Type Code with State/Strategy example.

A modification of this procedure that I sometimes use is to extract the method to a nested procedure first. This makes dealing with the local variables a little easier. I can then decide if the method is too dangerous for any other methods to call, in which case I can leave it as a nested procedure, or put it back inline. If it’s safe to move then moving within the same class can proceed as we’ve already seen. If we want to move it to another class as part of an Extract Class refactoring then we can sometimes move it straight from the nested procedure.
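
The nested procedure variation might look something like the sketch below. SplitOnCommas and AddCurrentField are hypothetical names, and the routine is a deliberately naive splitter rather than the parser used later in this paper. The point is only that the nested procedure can see the locals of the enclosing routine, so no parameters are needed at this stage.

```pascal
program NestedExtractSketch;

{$APPTYPE CONSOLE}

uses
  Classes;

// Hypothetical routine: the field-adding code has been extracted into a
// nested procedure, which still has access to Fields and CurField
procedure SplitOnCommas(const s : string; Fields : TStrings);
  var
    i        : Integer;
    CurField : string;

  procedure AddCurrentField;
  begin
    Fields.Add(CurField);
    CurField := '';
  end;

begin
  Fields.Clear;
  CurField := '';

  for i := 1 to Length(s) do begin
    if s[i] = ',' then begin
      AddCurrentField;
    end else begin
      CurField := CurField + s[i];
    end;
  end;

  // Add whatever follows the last comma
  AddCurrentField;
end;

var
  Fields : TStringList;
  i      : Integer;
begin
  Fields := TStringList.Create;
  try
    SplitOnCommas('a,b,,c', Fields);
    for i := 0 to Fields.Count - 1 do begin
      WriteLn(i, ': [', Fields[i], ']');
    end;
  finally
    Fields.Free;
  end;
end.
```

If AddCurrentField were later promoted to a method, Fields and CurField would have to become parameters or fields, which is exactly the decision the nested form lets you postpone.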

Extract Class

This is a more complex refactoring, that’s used when a class is violating the principle of separation of concerns. That is, the class is doing things that should be split into two or more classes. Normally, this is a result of feature creep, where more and more responsibilities get added to a class, until you have a class with too many methods and a lot of data. This can also be the result of doing several Extract Method refactorings.

Sometimes this is perfectly ok, as long as everything is related, but usually you will find that a subset of the data and a subset of the methods go together, or are particularly dependent on each other. These should be split out into a new class.

The creation of the new class involves making a new class, and possibly renaming the old one if its duties have changed. You normally need to create a reference to the new class in the old one. You may not need to expose it if it is purely used internally by the old class. Then you can use the Move Field and Move Method refactorings (see below) repeatedly to move the elements of the new class across. You should compile and test after each move.
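
A sketch of where Extract Class might end up is shown below. TPerson and TTelephoneNumber are hypothetical classes, assumed here to have started life as a single TPerson with area code and number fields. Because Delphi has no garbage collection for objects, the old class creates and frees the instance of the new one.

```pascal
program ExtractClassSketch;

{$APPTYPE CONSOLE}

type
  // New class holding the data and behaviour that belong together
  TTelephoneNumber = class(TObject)
  private
    FAreaCode : string;
    FNumber   : string;
  public
    function AsText : string;

    property AreaCode : string read FAreaCode write FAreaCode;
    property Number   : string read FNumber   write FNumber;
  end;

  // Old class, which now owns a reference to the new one
  TPerson = class(TObject)
  private
    FName        : string;
    FOfficePhone : TTelephoneNumber;
  public
    constructor Create(const AName : string);
    destructor  Destroy; override;

    property Name        : string           read FName;
    property OfficePhone : TTelephoneNumber read FOfficePhone;
  end;

function TTelephoneNumber.AsText : string;
begin
  Result := '(' + FAreaCode + ') ' + FNumber;
end;

constructor TPerson.Create(const AName : string);
begin
  inherited Create;
  FName        := AName;
  FOfficePhone := TTelephoneNumber.Create;
end;

destructor TPerson.Destroy;
begin
  FOfficePhone.Free;
  inherited Destroy;
end;

var
  Person : TPerson;
begin
  Person := TPerson.Create('Jo');
  try
    Person.OfficePhone.AreaCode := '020';
    Person.OfficePhone.Number   := '555 0100';
    WriteLn(Person.Name, ': ', Person.OfficePhone.AsText);
  finally
    Person.Free;
  end;
end.
```

Here the new object is exposed through a read-only property. If it were purely an implementation detail, the property could be dropped and TPerson could delegate to it internally.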

Inline Class

This is the opposite of Extract Class, and should be applied when a class is doing too little. You can often move the fields and methods to a class that uses them. Inverse refactorings like this are common, and are why you cannot refactor blindly.

Move Method

This refactoring is used a lot by other more complex refactorings. The aim is, as always, to end up with simpler classes, which normally means smaller classes, so we often need to move methods from large classes to newer ones. The first thing to do is to decide what elements of the old class are used by the method, as they may also need to be moved. Sometimes there is a group of related methods, in which case you should consider moving them too. However, if the method is declared in a descendant or ancestor class, you may not be able to move it, unless you can create a similar hierarchy with the target class.

To do the move, declare the method in the new class, changing its name if necessary. Copy the code over, making any adjustments needed to make it work. The new class should then compile. Then you need to change the old class to refer to an instance of the new one, at which point the code should pass the original tests.

The next few refactorings will not be examined in any great detail; we’ll just describe briefly what they do.

Inline Temp

Local variables can get in the way of other refactorings, and sometimes they are worth removing just to see what else becomes possible. This refactoring takes a local variable that is assigned a simple expression exactly once, and replaces every use of the variable with the expression itself.

However, you may sometimes be using a temporary variable to avoid breaking the Law of Demeter (too many dereferences in a statement), and you might want to leave it there. It also may be doing no harm, of course.
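
A small before-and-after sketch, using hypothetical routines (BasePrice stands in for a query method on some order object):

```pascal
program InlineTempSketch;

{$APPTYPE CONSOLE}

// Hypothetical stand-in for a query method on some order object
function BasePrice : Double;
begin
  Result := 1250.0;
end;

// Before: the temp is assigned once, from a simple expression
function IsExpensiveBefore : Boolean;
  var
    Price : Double;
begin
  Price  := BasePrice;
  Result := Price > 1000;
end;

// After: each use of the temp is replaced by the expression itself
function IsExpensiveAfter : Boolean;
begin
  Result := BasePrice > 1000;
end;

begin
  // Both versions should agree
  WriteLn(IsExpensiveBefore = IsExpensiveAfter);
end.
```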

Split Temporary Variable

This is also normally done to facilitate other refactorings. You can replace a local variable that is assigned to more than once with a different variable for each assignment. This is done mainly to assist with Extract Method, and to make code clearer for the reader. This latter case is often indicated by the repeated use of a variable called “Temp”.
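
For example, in the hypothetical routines below, Temp means two different things in the first version. Giving each assignment its own variable makes the second version easier to read, and easier to split up with Extract Method.

```pascal
program SplitTempSketch;

{$APPTYPE CONSOLE}

// Before: Temp is assigned twice and means something different each time
procedure DescribeRectangleBefore(const Width, Height : Double);
  var
    Temp : Double;
begin
  Temp := 2 * (Width + Height);
  WriteLn('Perimeter: ', Temp:0:2);
  Temp := Width * Height;
  WriteLn('Area: ', Temp:0:2);
end;

// After: one variable per assignment, each named for what it holds
procedure DescribeRectangleAfter(const Width, Height : Double);
  var
    Perimeter : Double;
    Area      : Double;
begin
  Perimeter := 2 * (Width + Height);
  WriteLn('Perimeter: ', Perimeter:0:2);
  Area := Width * Height;
  WriteLn('Area: ', Area:0:2);
end;

begin
  DescribeRectangleBefore(3, 4);
  DescribeRectangleAfter(3, 4);
end.
```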

Some refactorings just encapsulate good programming practice. Here are a few examples:

Rename Method

Some refactorings seem trivial, but are actually quite important. This is one that is too often ignored. If the task of a method has changed over time, then that method should be renamed to reflect its new duties. If you’ve ever worked on code where this has not been done, you’ll know what a pain it is. It defeats the entire point of encapsulation, because you need to go and read another method in order to understand what the current one does. In the worst case, you get completely confused as to the purpose of the current method.

Replace Magic Number with Symbolic Constant

This is an example of a refactoring that encapsulates a standard piece of programming advice that we all know. It applies equally well to strings, of course.
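
A short sketch with hypothetical names, showing a numeric and a string constant:

```pascal
program SymbolicConstantSketch;

{$APPTYPE CONSOLE}

const
  // Named once, so the meaning is explicit and the value has a single home
  GravitationalAcceleration = 9.81;  // metres per second squared
  MissingQuoteMessage       = 'Missing closing quote';

// Before: the reader has to guess what 9.81 means
function PotentialEnergyBefore(const Mass, Height : Double) : Double;
begin
  Result := Mass * 9.81 * Height;
end;

// After: the constant documents itself
function PotentialEnergyAfter(const Mass, Height : Double) : Double;
begin
  Result := Mass * GravitationalAcceleration * Height;
end;

begin
  WriteLn(PotentialEnergyBefore(2, 10):0:2);
  WriteLn(PotentialEnergyAfter(2, 10):0:2);
  WriteLn(MissingQuoteMessage);
end.
```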

Replace Error Code with Exception

If you don’t already do this, you deserve a spanking! It is poor programming practice to use functions that return error codes in languages that support exceptions. With exceptions, you can write code that doesn’t have a load of if statements testing results; you just write the code as if you could assume it always works. The error handling is then all in one place.

However, don’t use exceptions to replace assertions or checks. Exceptions are for unexpected errors. For instance, you might be testing if an object reference is nil before calling some method of the object. If the object should never be nil, then you should raise an exception. If you are just programming defensively, and making sure your program doesn't crash in those instances where the object is legitimately unassigned, you just need a check. See the Replace Exception with Test refactoring in Fowler for more details.
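
The following sketch contrasts the two styles. TAccount and EInsufficientFunds are hypothetical, and the example assumes that callers are expected to check the balance first, so that an overdraw really is an unexpected error rather than something a simple check should handle.

```pascal
program ErrorCodeToExceptionSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  EInsufficientFunds = class(Exception);

  // Hypothetical class, for illustration only
  TAccount = class(TObject)
  private
    FBalance : Currency;
  public
    constructor Create(const OpeningBalance : Currency);

    function  WithdrawReturningCode(const Amount : Currency) : Integer;
    procedure Withdraw(const Amount : Currency);

    property Balance : Currency read FBalance;
  end;

constructor TAccount.Create(const OpeningBalance : Currency);
begin
  inherited Create;
  FBalance := OpeningBalance;
end;

// Before: every caller must remember to test the result
function TAccount.WithdrawReturningCode(const Amount : Currency) : Integer;
begin
  if Amount > FBalance then begin
    Result := -1;
  end else begin
    FBalance := FBalance - Amount;
    Result   := 0;
  end;
end;

// After: failure is reported by raising an exception
procedure TAccount.Withdraw(const Amount : Currency);
begin
  if Amount > FBalance then begin
    raise EInsufficientFunds.Create('Insufficient funds for withdrawal');
  end;
  FBalance := FBalance - Amount;
end;

var
  Account : TAccount;
begin
  Account := TAccount.Create(100);
  try
    try
      // No result testing between the calls
      Account.Withdraw(30);
      Account.Withdraw(30);
      Account.Withdraw(50);  // only 40 is left, so this raises
    except
      on E : EInsufficientFunds do begin
        WriteLn('Withdrawal failed: ', E.Message);
      end;
    end;
  finally
    Account.Free;
  end;
end.
```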

Encapsulate Field

This refactoring needs to be modified in Delphi in order to introduce properties. This should then be called Replace Field with Property. If you don't use this refactoring everywhere, the Delphi style police will come around and, to quote the great sage, Cartman, “kick you in the nerts”.

Essentially, this replaces a public field with a property. Making fields public is a bad idea (we’ll see in a minute that making them protected isn’t much better). Reasons to make the field a property include the fact that we can change the way the field is stored in the object, whilst preserving the interface. We can also control what happens when the value of the field is changed, and make the property read-only or (less commonly) write-only, which we cannot do with a field.

At a minimum a public field should be changed to be private (and in Delphi, its name should be changed to begin with an “F”). The property should have the original (non-F) name and the read and write specifiers should refer to the private field. It may be necessary to change them to Get and/or Set methods (accessor methods) later to encapsulate some behaviour.
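
A minimal sketch of the minimum case, with hypothetical class names:

```pascal
program FieldToPropertySketch;

{$APPTYPE CONSOLE}

type
  // Before: the class exposes its data directly
  TCustomerBefore = class(TObject)
  public
    Name : string;
  end;

  // After: the field is private and gains an F prefix, and the property
  // takes over the original name
  TCustomer = class(TObject)
  private
    FName : string;
  public
    property Name : string read FName write FName;
  end;

var
  Customer : TCustomer;
begin
  Customer := TCustomer.Create;
  try
    // Reads and writes look exactly as they did with the public field
    Customer.Name := 'Ada';
    WriteLn(Customer.Name);
  finally
    Customer.Free;
  end;
end.
```

Most calling code compiles unchanged after this step. One exception is code that passes the field as a var parameter, which is not allowed for a property.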

Self Encapsulate Field

We might allow direct access to a field from within a class, as long as the field is private. If the visibility needs to be protected or wider, say because a descendant class needs access, we should never expose the field itself, but convert it to a property. Even if there is no descendant class, but we need to add behaviour whenever the field value changes, we should create a property. In this case, the only places for direct field access are in constructors, destructors and accessor methods.

This refactoring needs to follow the guidelines given in Replace Field with Property (what was Encapsulate Field) above.

Remove Control Flag

Some people use a local variable as a control flag to determine when to break out of loops. I recently saw a case where a programmer, who shall remain nameless, used a goto instead of a control flag. You should use Break, Continue, or Exit in Delphi.

For some reason, when the board reviewed this paper for the conference, they took exception (no pun intended) to this one. I don't know why, as it is pretty much the reason for including those built-in functions in Delphi in the first place.
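
A sketch of the idea, using two hypothetical functions that find the first blank line in a string list:

```pascal
program RemoveControlFlagSketch;

{$APPTYPE CONSOLE}

uses
  Classes;

// Before: a Boolean flag decides when the loop stops
function IndexOfBlankBefore(Lines : TStrings) : Integer;
  var
    i     : Integer;
    Found : Boolean;
begin
  Result := -1;
  Found  := False;
  i      := 0;
  while (i < Lines.Count) and not Found do begin
    if Lines[i] = '' then begin
      Result := i;
      Found  := True;
    end;
    Inc(i);
  end;
end;

// After: Break leaves the loop as soon as the answer is known
function IndexOfBlankAfter(Lines : TStrings) : Integer;
  var
    i : Integer;
begin
  Result := -1;
  for i := 0 to Lines.Count - 1 do begin
    if Lines[i] = '' then begin
      Result := i;
      Break;
    end;
  end;
end;

var
  Lines : TStringList;
begin
  Lines := TStringList.Create;
  try
    Lines.Add('first');
    Lines.Add('');
    Lines.Add('third');
    WriteLn(IndexOfBlankBefore(Lines));
    WriteLn(IndexOfBlankAfter(Lines));
  finally
    Lines.Free;
  end;
end.
```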



Some refactorings introduce design patterns:

Form Template Method

This details steps to follow to use the Template Method pattern in existing code.
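
As a rough sketch of the end result, assume two hypothetical statement classes that originally each had their own, nearly identical, Render method. The shared sequence of steps moves up into the ancestor, and the descendants supply only the steps that differ.

```pascal
program TemplateMethodSketch;

{$APPTYPE CONSOLE}

type
  // Hypothetical classes. Render is the template method
  TStatement = class(TObject)
  protected
    function Header : string; virtual; abstract;
    function Footer : string; virtual; abstract;
  public
    function Render(const Body : string) : string;
  end;

  TTextStatement = class(TStatement)
  protected
    function Header : string; override;
    function Footer : string; override;
  end;

  THtmlStatement = class(TStatement)
  protected
    function Header : string; override;
    function Footer : string; override;
  end;

// The order of the steps is fixed here, once
function TStatement.Render(const Body : string) : string;
begin
  Result := Header + Body + Footer;
end;

function TTextStatement.Header : string;
begin
  Result := 'Statement: ';
end;

function TTextStatement.Footer : string;
begin
  Result := '';
end;

function THtmlStatement.Header : string;
begin
  Result := '<p>Statement: ';
end;

function THtmlStatement.Footer : string;
begin
  Result := '</p>';
end;

var
  Statement : TStatement;
begin
  Statement := THtmlStatement.Create;
  try
    WriteLn(Statement.Render('3 rentals'));
  finally
    Statement.Free;
  end;
end.
```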

Introduce Local Extension

This describes how to introduce the Decorator (or Wrapper) pattern.

Replace Type Code with State/Strategy

Let’s have a look at a longer, more complex refactoring, along with some example code. When you have a class whose behaviour depends on the value of some type code (i.e. it is dependent on the state), and that type code changes during the lifetime of an object, then use this refactoring to replace the type code with the State or Strategy pattern. We will look at an example of refactoring to the State pattern. You will need to follow along in the attached code, as there is too much to put everything in this paper at each step.

The problem is that we have a procedure, not even in a class, to parse comma separated value (CSV) files. I have shamelessly filched the code from Julian Bucknall’s excellent Tomes of Delphi: Algorithms and Data Structures. Note that this particular example does not deal with carriage returns or line feeds; it assumes that the lines have been split up already and are fed to the routine one at a time to extract the fields. By a strange coincidence, this procedure implements a state machine, making the choice of which pattern to use pretty easy. The first step is obviously to wrap this procedure in a class, and write some unit tests.

When developing classes like this that are likely to become components, or form part of a library or framework, I use the DUnit test framework as a harness, and do all the development in there. The code attached to this paper shows the state of the parsing class code after each step of the refactoring. Initially, our code looks like this:

unit CsvParser;

interface

uses
  Classes;
  
type
  TCsvParser = class(TObject)
  private
  protected
  public
    procedure ExtractFields(const s : string;FieldList : TStrings);
  published
  end;


implementation

uses
  SysUtils;

  
{ TCsvParser }

procedure TCsvParser.ExtractFields(const s : string;FieldList : TStrings);
  type
    TParserStates = (psFieldStart,psScanField,psScanQuoted,psEndQuoted,psGotError);

  var
    State    : TParserStates;
    i        : Integer;
    Ch       : AnsiChar;
    CurField : string;
begin
  // Initialize by clearing the string list, and starting in FieldStart state
  FieldList.Clear;
  State    := psFieldStart;
  CurField := '';

  // Read through all the characters in the string
  for i := 1 to Length(s) do begin
    // get the next character
    Ch := s[i];

    // Switch processing on the state
    case State of
      psFieldStart : begin
        case Ch of
          '"' : State := psScanQuoted;
          ',' : FieldList.Add('');
        else
          CurField := Ch;
          State    := psScanField;
        end;
      end;

      psScanField : begin
        if (Ch = ',') then begin
          FieldList.Add(CurField);
          CurField := '';
          State    := psFieldStart;
        end else begin
          CurField := CurField + Ch;
        end;
      end;

      psScanQuoted : begin
        if (Ch = '"') then begin
          State := psEndQuoted;
        end else begin
          CurField := CurField + Ch;
        end;
      end;

      psEndQuoted : begin
        if (Ch = ',') then begin
          FieldList.Add(CurField);
          CurField := '';
          State    := psFieldStart;
        end else begin
          State := psGotError;
        end;
      end;

      psGotError : begin
        raise Exception.Create(Format('Error in line at position %d',[i]));
      end;
    end;
  end;

  // If we are in the ScanQuoted or GotError state at the end
  // of the string, there was a problem with a closing quote
  if (State = psScanQuoted) or (State = psGotError) then begin
    raise Exception.Create('Missing closing quote');
  end;

  // If the current field is not empty, add it to the list
  if (CurField <> '') then begin
    FieldList.Add(CurField);
  end;
end;

end.

This is essentially just Master Bucknall’s procedure dumped straight into a class. There are also three test cases in our test framework. One tests that fields are extracted correctly, including empty and quoted fields, and the other two test that exceptions are raised in the right places. Testing that your exceptions are raised when you expect is important. The code at this stage is in the ReplaceTypeCodeWithState\Step0 directory. A quick run of the tests shows we’re up and working, and ready to start modifying the code.
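
A test case along these lines might look like the following sketch. It assumes DUnit's TestFramework unit and the CsvParser unit listed above. The test class, its method names and the input strings are hypothetical, and are not taken from the tests supplied with this paper.

```pascal
unit CsvParserTestsSketch;

interface

uses
  Classes, SysUtils, TestFramework, CsvParser;

type
  // Hypothetical test case; the tests supplied with the paper may differ
  TCsvParserTests = class(TTestCase)
  private
    FParser : TCsvParser;
    FFields : TStringList;
  protected
    procedure SetUp; override;
    procedure TearDown; override;
  published
    procedure TestExtractFields;
    procedure TestMissingClosingQuote;
  end;

implementation

procedure TCsvParserTests.SetUp;
begin
  FParser := TCsvParser.Create;
  FFields := TStringList.Create;
end;

procedure TCsvParserTests.TearDown;
begin
  FFields.Free;
  FParser.Free;
end;

// A plain field, an empty field and a quoted field containing a comma
procedure TCsvParserTests.TestExtractFields;
begin
  FParser.ExtractFields('abc,,"d,e"', FFields);
  CheckEquals(3,     FFields.Count, 'Field count');
  CheckEquals('abc', FFields[0],    'Plain field');
  CheckEquals('',    FFields[1],    'Empty field');
  CheckEquals('d,e', FFields[2],    'Quoted field');
end;

// An unterminated quote should raise an exception
procedure TCsvParserTests.TestMissingClosingQuote;
  var
    Raised : Boolean;
begin
  Raised := False;
  try
    FParser.ExtractFields('"abc', FFields);
  except
    on E : Exception do begin
      Raised := True;
    end;
  end;
  Check(Raised, 'Expected an exception for the unterminated quote');
end;

initialization
  RegisterTest(TCsvParserTests.Suite);

end.
```

Because the tests only use the public ExtractFields method, they should not need to change as the internals of the parser are refactored.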

Before we start with the refactoring proper, we’ll extract the local type definition out of the method and put it in the interface section. We’re now ready to go through Replace Type Code with State. After each step, we should run the tests. The code for each step can be found in the consecutively numbered directories.

The first step is to “self-encapsulate the type code”. We do this by extracting the local variable State out of the method and making it a private field called FState. We then use Self Encapsulate Field (actually, we use the Delphi equivalent of Replace Field with Property) to make a protected property. In this case, a protected property is a good idea, because we aren’t sure exactly how the State pattern stuff is going to work at this stage, and this will protect us from later changes. The tests should be run again to check no damage has been done. This code is in the Step1 directory.

We now create a new class that will replace the TParserState enumerated type. Fowler advocates adding an abstract function to return the parser state type, but in Delphi this is better done by using a read-only property in an abstract class as shown below:

  TCsvParserState = class(TObject)
  private
    function GetParserState : TParserState; virtual; abstract;
  public
    property ParserState : TParserState read GetParserState;
  end;

If you anticipate having state classes defined in different units then you could make the function protected instead of private. For this example we’ll leave everything in the same unit.

We then add a subclass for every state. The GetParserState function should be overridden in each subclass to return the relevant enumerated type value. The code is then compiled to check for syntax errors. There is no point running a test at this point. This code is in the Step2 directory.
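
The result might look something like the sketch below, which is written as a standalone console program so that it can be compiled on its own. The subclass names are the ones that appear in the SetState listing in this paper; everything else about the layout is an assumption, and the code in the Step2 directory may differ.

```pascal
program StateSubclassesSketch;

{$APPTYPE CONSOLE}

type
  TParserState = (psFieldStart, psScanField, psScanQuoted, psEndQuoted, psGotError);

  TCsvParserState = class(TObject)
  private
    function GetParserState : TParserState; virtual; abstract;
  public
    property ParserState : TParserState read GetParserState;
  end;

  TCsvParserFieldStartState = class(TCsvParserState)
  private
    function GetParserState : TParserState; override;
  end;

  TCsvParserScanFieldState = class(TCsvParserState)
  private
    function GetParserState : TParserState; override;
  end;

  TCsvParserScanQuotedState = class(TCsvParserState)
  private
    function GetParserState : TParserState; override;
  end;

  TCsvParserEndQuotedState = class(TCsvParserState)
  private
    function GetParserState : TParserState; override;
  end;

  TCsvParserGotErrorState = class(TCsvParserState)
  private
    function GetParserState : TParserState; override;
  end;

// Each subclass simply reports the enumerated value it stands for
function TCsvParserFieldStartState.GetParserState : TParserState;
begin
  Result := psFieldStart;
end;

function TCsvParserScanFieldState.GetParserState : TParserState;
begin
  Result := psScanField;
end;

function TCsvParserScanQuotedState.GetParserState : TParserState;
begin
  Result := psScanQuoted;
end;

function TCsvParserEndQuotedState.GetParserState : TParserState;
begin
  Result := psEndQuoted;
end;

function TCsvParserGotErrorState.GetParserState : TParserState;
begin
  Result := psGotError;
end;

var
  State : TCsvParserState;
begin
  State := TCsvParserScanQuotedState.Create;
  try
    // The abstract property dispatches to the subclass override
    WriteLn(State.ParserState = psScanQuoted);
  finally
    State.Free;
  end;
end.
```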

We’re going to depart from Fowler’s prescription a little now, because we changed things a bit earlier on, and because he starts to make use of Java’s automatic garbage collection. Instead of creating a new field in the parser class for a new state object of our new abstract class type, we’ll just change the type of the existing private field, and introduce accessor methods for our property. The new definition looks like this:

  TCsvParser = class(TObject)
  private
    FState : TCsvParserState;

    function  GetState : TParserState;
    procedure SetState(const Value : TParserState);
  protected
    property State : TParserState read GetState write SetState;
  public
    procedure ExtractFields(const s : string;FieldList : TStrings);
  published
  end;


implementation

uses
  SysUtils;

{ TCsvParser }

function TCsvParser.GetState : TParserState;
begin
  Result := FState.ParserState;
end;


procedure TCsvParser.SetState(const Value : TParserState);
begin
  FreeAndNil(FState);

  case Value of
    psFieldStart : FState := TCsvParserFieldStartState.Create;
    psScanField  : FState := TCsvParserScanFieldState.Create;
    psScanQuoted : FState := TCsvParserScanQuotedState.Create;
    psEndQuoted  : FState := TCsvParserEndQuotedState.Create;
    psGotError   : FState := TCsvParserEndQuotedState.Create;
  end;
end;


procedure TCsvParser.ExtractFields(const s : string;FieldList : TStrings);
  var
    i        : Integer;
    Ch       : AnsiChar;
    CurField : string;
begin
  // Code removed for conciseness
end;  

We can compile and run the tests to check everything is still ok. Astute readers will have noticed an error in the code above. This is a mistake that I made when writing this paper. The unit tests immediately told me that there was a problem when checking that the bad chars exception got raised. I was immediately able to track down the problem (well, it took me a couple of goes, because like everyone else, I tend to read what I expect to see). Changing the line dealing with the psGotError case in SetState fixed it:

    psGotError   : FState := TCsvParserGotErrorState.Create;

This sort of bug is far easier to find when you’ve just written it. I can’t say enough good things about unit testing. It won’t solve all your problems, but it does give you such a lot more confidence in your code.

The code for this phase is in the Step3 directory.

The final step is itself a refactoring: Replace Conditional with Polymorphism.

Replace Conditional with Polymorphism

You can use this refactoring whenever you have a case statement (or list of if..then..else’s) that chooses different behaviour depending on the value of some type code. It makes the original method abstract and moves the code for each part of the case statement into overriding methods in subclasses. It is of particular use when the case statement appears several times.

You need to have generated the subclasses already before applying this refactoring, either as we have in the Replace Type Code with State/Strategy example above, or by using Replace Type Code with Subclasses. The latter is not an option if the type code changes during the lifetime of an object (as it does in our example), or if you have already introduced subclasses for another reason.

We’re now ready to refactor ExtractFields.

The first thing to do in our example is to use Extract Method to get the case statement into a method by itself. We then need to compile and test again. This code is in the Step4 directory.

Obviously, this new method belongs in the state class, so we use Move Method to move it to the base state class. In our case, this is easy, but we need to change the constructor for our state class so that state objects can call the parser to change state.

Note that this introduces a problem. A state object can now be processing a character and need to change state. But a call to SetState destroys the object doing the calling. This is obviously not good. We should also consider the fact that a lot of creating and destroying of state objects will be going on if we’re parsing a large file. One of the refinements of the State pattern is to create only one instance of each state class. This is appropriate when there are few classes, and the state changes often, but new states are added infrequently, which describes our situation perfectly.

So, what we can do is take the creation code from the SetState method and move it into a constructor, and add five private fields, one of each state subclass. Obviously we need a destructor as well. SetState becomes a simple assignment method.
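To make the idea concrete, here is a cut-down sketch with only two states. It is not the code from the Step5 directory: the class names, the TSketchStateKind enumeration and the CountQuotedChars method are all invented for illustration. What it is meant to show is the shape of the change: the state objects are created once in the constructor, freed in the destructor, and SetState only assigns a reference, so a state can safely ask the parser to change state while it is still executing.

```pascal
program CachedStateSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TSketchStateKind = (skPlain, skQuoted);   // hypothetical type code

  TSketchParser = class;  // Forward declaration

  TSketchState = class(TObject)
  protected
    FParser : TSketchParser;
  public
    constructor Create(AParser : TSketchParser);
    procedure ProcessChar(Ch : Char); virtual; abstract;
  end;

  TPlainState = class(TSketchState)
  public
    procedure ProcessChar(Ch : Char); override;
  end;

  TQuotedState = class(TSketchState)
  public
    procedure ProcessChar(Ch : Char); override;
  end;

  TSketchParser = class(TObject)
  private
    FState       : TSketchState;  // refers to one of the cached objects
    FPlainState  : TPlainState;   // cached instances, created once
    FQuotedState : TQuotedState;
    FQuotedCount : Integer;
    procedure SetState(Value : TSketchStateKind);
  public
    constructor Create;
    destructor  Destroy; override;
    function CountQuotedChars(const s : string) : Integer;
  end;

constructor TSketchState.Create(AParser : TSketchParser);
begin
  inherited Create;
  FParser := AParser;
end;

procedure TPlainState.ProcessChar(Ch : Char);
begin
  if Ch = '"' then
    FParser.SetState(skQuoted);  // safe: SetState no longer frees anything
end;

procedure TQuotedState.ProcessChar(Ch : Char);
begin
  if Ch = '"' then
    FParser.SetState(skPlain)
  else
    Inc(FParser.FQuotedCount);
end;

constructor TSketchParser.Create;
begin
  inherited Create;
  FPlainState  := TPlainState.Create(Self);
  FQuotedState := TQuotedState.Create(Self);
  FState       := FPlainState;
end;

destructor TSketchParser.Destroy;
begin
  FQuotedState.Free;
  FPlainState.Free;
  inherited Destroy;
end;

procedure TSketchParser.SetState(Value : TSketchStateKind);
begin
  // A simple assignment: pick the cached object for the requested state
  case Value of
    skPlain  : FState := FPlainState;
    skQuoted : FState := FQuotedState;
  end;
end;

function TSketchParser.CountQuotedChars(const s : string) : Integer;
var
  i : Integer;
begin
  FQuotedCount := 0;
  FState := FPlainState;
  for i := 1 to Length(s) do
    FState.ProcessChar(s[i]);
  Result := FQuotedCount;
end;

var
  Parser : TSketchParser;
begin
  Parser := TSketchParser.Create;
  try
    WriteLn(Parser.CountQuotedChars('a,"bc",d'));  // prints 2
  finally
    Parser.Free;
  end;
end.
```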

As this is a major change, we need to compile and test again. The code is in the Step5 directory.

In each of the subclasses, make a new method with the same name as the one in the base class (ProcessChar in our case). Make the base class method abstract (and virtual, of course), and move the code for each part of the case statement into the relevant subclass method. Compile and test again. The code is in the Step6 directory. Fowler suggests doing this one subclass/case at a time, and testing after each, but at least in this case, the subclass methods are so trivial, I felt it was ok to do it all at once.

Our example method has now been turned into a set of classes that implement the State pattern. The refactorings we used guided us through every step along the way, making using the pattern easy. Looking back, there are quite a few pages explaining what to do, but actually the changes to the code were pretty minimal at each step, and often consisted of moving code from one place to another. Once you get used to it, the process can go very quickly.

To tidy up the code a bit more, I applied a few other refactorings, but I’m not going to talk you through them all in detail. Because each state knows about the parser object, we can use a variation of Preserve Whole Object and make most of the parameters to ProcessChar fields in the parser class. We don’t have to pass the parser object as a parameter because it has already been done in the constructor.

It’s a bit awkward and unclear to keep calling FParser.State all the time, too, so that was extracted out into a ChangeState method. Once all that was done, the enumerated type for tracking the state could be done away with, along with all the ParserState properties and GetParserState functions. The final version of the code is in the Final directory. Further refinements are left as an exercise for the reader.

The bottom line is whether the code is now any easier to understand and maintain. Maybe not in this artificial example: the case statement in the original code was pretty short. However, were it to become necessary to add more states, this would be a simple matter of creating a new state class, and implementing one method (and perhaps other classes would need to change to model new state transitions). The parser class would only need to change to create new cached state fields, but there would not be any behavioural changes. The gain from doing all this work becomes more obvious in larger examples when there are more states.

The final unit is shown below:

unit CsvParser;

interface

uses
  Classes;

type
  TCsvParser = class;  // Forward declaration
  TParserStateClass = class of TCsvParserState;


  TCsvParserState = class(TObject)
  private
    FParser : TCsvParser;

    procedure ChangeState(NewState : TParserStateClass);
    procedure AddCharToCurrField(Ch : Char);
    procedure AddCurrFieldToList;
  public
    constructor Create(AParser : TCsvParser);

    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); virtual; abstract;
  end;


  TCsvParserFieldStartState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); override;
  end;


  TCsvParserScanFieldState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); override;
  end;


  TCsvParserScanQuotedState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); override;
  end;


  TCsvParserEndQuotedState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); override;
  end;


  TCsvParserGotErrorState = class(TCsvParserState)
  private
  public
    procedure ProcessChar(Ch : AnsiChar;Pos : Integer); override;
  end;


  TCsvParser = class(TObject)
  private
    FState           : TCsvParserState;
    // Cache state objects for greater performance
    FFieldStartState : TCsvParserFieldStartState;
    FScanFieldState  : TCsvParserScanFieldState;
    FScanQuotedState : TCsvParserScanQuotedState;
    FEndQuotedState  : TCsvParserEndQuotedState;
    FGotErrorState   : TCsvParserGotErrorState;
    // Fields used during parsing
    FCurrField       : string;
    FFieldList       : TStrings;

    function  GetState : TParserStateClass;
    procedure SetState(const Value : TParserStateClass);
  protected
    procedure AddCharToCurrField(Ch : Char);
    procedure AddCurrFieldToList;

    property State : TParserStateClass read GetState write SetState;
  public
    constructor Create;
    destructor  Destroy; override;

    procedure ExtractFields(const s : string;AFieldList : TStrings);
  published
  end;

implementation
  // Code removed for space reasons. See the attached source code instead
end.

Other interesting refactorings include:

Replace Method with Method Object

Sometimes you want to split up a long method but the way local variables are used makes it impossible (or at least very messy) to apply Extract Method. You can make the long method a class, with the local variables becoming fields in that class. You can then apply other refactorings to decompose the long method into smaller ones.
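A small sketch of the idea follows. The pricing rule and all the names (TPriceCalculation, CalcPrice and so on) are made up for illustration. Imagine CalcPrice was once a long routine with BasePrice and Discount as local variables; they are now fields of a method object, which lets the steps be extracted without passing locals around.

```pascal
program MethodObjectSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Method object: the locals of the original long method are now fields
  TPriceCalculation = class(TObject)
  private
    FQuantity  : Integer;
    FUnitPrice : Double;
    FBasePrice : Double;
    FDiscount  : Double;
    procedure CalcBasePrice;
    procedure CalcDiscount;
  public
    constructor Create(AQuantity : Integer; AUnitPrice : Double);
    function Compute : Double;
  end;

constructor TPriceCalculation.Create(AQuantity : Integer; AUnitPrice : Double);
begin
  inherited Create;
  FQuantity  := AQuantity;
  FUnitPrice := AUnitPrice;
end;

procedure TPriceCalculation.CalcBasePrice;
begin
  FBasePrice := FQuantity * FUnitPrice;
end;

procedure TPriceCalculation.CalcDiscount;
begin
  // Hypothetical rule: 10% off for more than 100 items
  if FQuantity > 100 then
    FDiscount := FBasePrice * 0.1
  else
    FDiscount := 0;
end;

function TPriceCalculation.Compute : Double;
begin
  CalcBasePrice;
  CalcDiscount;
  Result := FBasePrice - FDiscount;
end;

// The original method shrinks to a delegation to the method object
function CalcPrice(Quantity : Integer; UnitPrice : Double) : Double;
var
  Calc : TPriceCalculation;
begin
  Calc := TPriceCalculation.Create(Quantity, UnitPrice);
  try
    Result := Calc.Compute;
  finally
    Calc.Free;
  end;
end;

begin
  WriteLn(CalcPrice(200, 2.5):0:2);  // prints 450.00
end.
```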

Hide Delegate

This is used to hide entire classes from view. Encapsulation should not be taken to apply only at a class and field level. Sometimes we can call objects that hide entire subsystems from the rest of an application (the Facade pattern facilitates this, for instance).

Duplicate Observed Data

An interesting refactoring, but it’s too long to discuss here. It is concerned with moving to using an Observer to connect business (sometimes called domain) objects to a GUI.

Introduce Null Object

Actually, this could be classed as a refactoring that introduces a pattern, too. Sometimes it’s useful to have a class that encapsulates a null value, so that we don’t have to keep checking to see if an object has been assigned or not. We always get an object, but it does the right thing when there would otherwise be a test for null. The null object should be a constant object, and is often therefore a Singleton (i.e. the same null object is used everywhere).

Whenever a method returns a nil instead of an object reference, and whenever you have tests using Assigned or comparisons to nil, you may have an opportunity to use this refactoring.
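Here is a minimal sketch of how that might look in Delphi. TCustomer, TNullCustomer and FindCustomer are hypothetical names; the point is only that the lookup never returns nil, so callers need no Assigned test. In a real unit the shared null object would more likely be created in an initialization section.

```pascal
program NullObjectSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TCustomer = class(TObject)
  private
    FName : string;
  public
    constructor Create(const AName : string);
    function GetName : string; virtual;
  end;

  // Null object: same interface, harmless default behaviour
  TNullCustomer = class(TCustomer)
  public
    function GetName : string; override;
  end;

constructor TCustomer.Create(const AName : string);
begin
  inherited Create;
  FName := AName;
end;

function TCustomer.GetName : string;
begin
  Result := FName;
end;

function TNullCustomer.GetName : string;
begin
  Result := '(unknown)';
end;

var
  Alice        : TCustomer;
  NullCustomer : TNullCustomer;  // the single shared null instance

// Returns a real customer or the null object, never nil
function FindCustomer(const AName : string) : TCustomer;
begin
  if AName = 'Alice' then
    Result := Alice
  else
    Result := NullCustomer;
end;

begin
  Alice        := TCustomer.Create('Alice');
  NullCustomer := TNullCustomer.Create('');
  try
    WriteLn(FindCustomer('Alice').GetName);  // prints Alice
    WriteLn(FindCustomer('Bob').GetName);    // prints (unknown)
  finally
    NullCustomer.Free;
    Alice.Free;
  end;
end.
```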

Encapsulate Downcast

Sometimes you need to typecast a method result so that a descendant class is returned. A Delphi example is making a type-safe collection.
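The following sketch shows one common way to do this, assuming the TObjectList class from the Contnrs unit. TInvoice and TInvoiceList are invented names. The downcast happens once, inside the list's accessor, so calling code never has to typecast.

```pascal
program EncapsulateDowncastSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Contnrs;

type
  TInvoice = class(TObject)
  public
    Amount : Integer;
    constructor Create(AAmount : Integer);
  end;

  // Type-safe wrapper: only TInvoice goes in, only TInvoice comes out
  TInvoiceList = class(TObjectList)
  private
    function GetInvoice(Index : Integer) : TInvoice;
  public
    function Add(AInvoice : TInvoice) : Integer;
    property Items[Index : Integer] : TInvoice read GetInvoice; default;
  end;

constructor TInvoice.Create(AAmount : Integer);
begin
  inherited Create;
  Amount := AAmount;
end;

function TInvoiceList.GetInvoice(Index : Integer) : TInvoice;
begin
  // The one and only downcast lives here
  Result := inherited Items[Index] as TInvoice;
end;

function TInvoiceList.Add(AInvoice : TInvoice) : Integer;
begin
  Result := inherited Add(AInvoice);
end;

var
  List : TInvoiceList;
begin
  List := TInvoiceList.Create(True);  // True: the list owns and frees its items
  try
    List.Add(TInvoice.Create(120));
    List.Add(TInvoice.Create(80));
    WriteLn(List[0].Amount + List[1].Amount);  // prints 200
  finally
    List.Free;
  end;
end.
```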

Moving Methods and Fields Up and Down a Hierarchy

There are several refactorings to move fields and methods up and down a class hierarchy, as well as ones to extract new ancestor and descendant classes.

As I mentioned earlier, there are very many refactorings, and we’ve only looked at a selection to give you a taste of what’s available.

Converting Refactorings to Delphi

Fowler uses Java in his book, so some things are not the same in Delphi. Here are a few things to help you apply his advice in Delphi.

  1. Use const instead of final variables.
  2. To make a class immutable, make all the properties read-only, and set their values in the constructor. For example:

            interface
            
            type
              TDateRange = class(TObject)
              private
                FStartDate : TDateTime;
                FEndDate   : TDateTime;
              public
                constructor Create(AStartDate,AEndDate : TDateTime);
              published
                property StartDate : TDateTime read FStartDate;
                property EndDate   : TDateTime read FEndDate;
              end;
            
            implementation
            
            constructor TDateRange.Create(AStartDate,AEndDate : TDateTime);
            begin
              inherited Create;
              FStartDate := AStartDate;
              FEndDate   := AEndDate;
            end;
    
  3. For some reason, Fowler uses Query in the names of refactorings when he’s talking about a Function. I don’t know why. Perhaps it’s a delightful English eccentricity, like his propensity for using cricket and beer in his examples.
  4. A temporary variable normally refers to a variable local to a method.
  5. Some refactorings cannot be applied to Delphi in their given forms. Examples are anything that relies on garbage collection, and refactorings like Self Encapsulate Field where Delphi would use properties instead of (or as well as) getter and setter methods.

Bad Smells – When to Refactor

How do you know when to apply refactoring? An intriguing idea is one introduced by Kent Beck of Extreme Programming fame: smells (at the time he was being influenced by the odours of his new baby). There is no absolute measure of when to introduce some refactoring; it’s a matter of intuition tuned by experience. However, sometimes we have code that just doesn’t seem right, and Beck decided to say that sort of situation was a bad smell in code. Note that a smell is not a definite indication of a problem that has to be fixed, but rather an indication of areas where improvements could be considered.

We’ll examine some of the smells Beck and Fowler found, spending most time on those I find in code most often. Sometimes I even find these in my code! Like most of the refactorings, the descriptions of the smells and the cures may well be familiar. We are, after all, cataloguing best practice. As with patterns and refactorings, just naming the smells is a useful exercise. Moreover, it does you no harm to occasionally go through your code looking for these smells (code reviews are a good time).

You might have or develop your own smells and remedies, as well as your own refactorings. If you do, you should keep a list, and if you’re confident you’ve found something useful, you can even submit it to the Refactoring website.

Duplicated Code

One thing we should never do is duplicate code. Obviously, that code should be a method in one place, called from everywhere it is needed. However, there are some complications:

  1. Same code in different methods of the same class. In this case, use Extract Method to make the code a separate routine, and call the new method from all places. We’ve all done this a million times.
  2. Same code in sibling descendant classes. Use Extract Method followed by Pull Up Method to move the method up the inheritance hierarchy.
  3. If the methods are similar, but not exactly the same, use Form Template Method to apply the Template Method pattern.
  4. If the similar methods use different algorithms, use Substitute Algorithm to standardise on the clearest one.
  5. If the duplicated code is in unrelated classes, then you have to decide whether it belongs in one of the classes, being invoked from the others, or whether to use Extract Class to move the code out on its own.

Long Method

The idea here is that short methods are usually better methods because they are easier to read and understand. Not everyone agrees with this point of view (see Steve McConnell’s Code Complete, for a discussion of this), and there is little agreement on how long is too long. What you gain in small, clear methods can be lost if you have to go and examine each of the calls those methods make. The key is clear naming and commenting. Fowler uses the guideline that you should use a (well-named) method whenever you feel the need for a comment. He even advocates extracting a single line of code into a new method if the name of the method explains the code better. I’m not sure I’m totally with him on this point, and I tend to leave the line in with a comment until I duplicate the line somewhere else.

Obviously, the most common way to deal with verbose code is to use Extract Method. You may need to reduce the number of local variables first with Replace Temp with Query, or the numbers of parameters with Introduce Parameter Object (see below). In extreme cases, you may need to use Replace Method with Method Object where all the local variables become fields in the new object.

A point Fowler doesn’t mention, but raised by McConnell, is the order in which statements occur within a routine. Sometimes the order is important, and your refactorings should obviously respect that. You may also find it easier to refactor if you group lines with similar or related functionality together first (see McConnell, Chapter 13, “Organizing Straight-Line Code”).

You might find applying Decompose Conditional useful, where you extract methods from the condition, if part and else part of a complicated if-then-else statement to simplify and shorten the code. Some if-then-else and case statements are an indication that polymorphism and/or the introduction of the State or Strategy pattern is needed. There is a separate bad smell for case statements.

Large Class

I don’t find this one coming up all that often, and when it does, it doesn’t always make sense to split the class into smaller pieces. Again, it all depends on what you define as too large.

The remedies are mostly straightforward though; use either Extract Class to group related fields and methods, or possibly Extract Subclass. You may, of course, be able to do this multiple times. Fowler also recommends using Extract Interface, which I’m not so comfortable with. I believe interfaces are used too often, and they should only be used if they make the code clearer, not because they’re cool. Applying the same interface to otherwise unrelated classes introduces a relationship between those classes, the full implications of which I don’t believe are very clear. There are problems mixing object and interface access in the same code, too, so be careful.

Long Parameter List

There are a few strategies for dealing with painfully long parameter lists. Essentially, the idea is to send something you can query to get the necessary values, rather than sending the values themselves. However, be careful not to make the cure worse than the disease. One refactoring you can do is Introduce Parameter Object, which bundles up a group of parameters to pass as an object. This usually requires you to define a new class, which may not be worth doing if it is only used once. Fowler’s technique also means making the parameter class immutable (i.e. all the parameters read-only), which in my experience leads to the constructor of the new class having the same number of parameters as the method call.

One solution to this dilemma is to make more than one parameter class, where related parameters are grouped together. Another solution available in Delphi is the use of default parameters. This option is most suitable for those occasions where a number of the parameters are the same in most of the calls. Be careful to test thoroughly when changing the parameter order, especially if several are of the same type.
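As a small illustration of the default parameter option, consider the sketch below. FormatField is a made-up routine; the assumption is simply that most callers want a double quote and a comma, so only the unusual calls have to spell those arguments out.

```pascal
program DefaultParamsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical routine: the last two parameters are the same in most calls
function FormatField(const Value : string;
                     Quote       : Char = '"';
                     Separator   : Char = ',') : string;
begin
  Result := Quote + Value + Quote + Separator;
end;

begin
  WriteLn(FormatField('abc'));            // prints "abc",
  WriteLn(FormatField('abc', ''''));      // prints 'abc',
  WriteLn(FormatField('abc', '"', ';'));  // prints "abc";
end.
```

Note that Quote and Separator have the same type, which is exactly the situation where swapping the parameter order can go unnoticed by the compiler.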

Divergent Change

If you find yourself having to change one class to cope with two or more different kinds of modifications, it’s probably time to split the responsibilities into two or more classes.

Shotgun Surgery

If making a particular type of modification requires small changes in many places in many classes, you might consider moving all the changes into one class.

Feature Envy

Sometimes you find a method in a class that mostly uses data (sometimes methods) from another class. You can use Move Method to move the offending method to the other class. If it is not appropriate to move the whole method, you might be able to use Extract Method first to separate out the dependent code, then move the new method. The idea is to keep data and the methods that act on that data together. However, there are exceptions to this rule (Fowler cites the Strategy pattern as an example), so consider whether this smell needs to be combated. I have rarely needed to do this, and when I have, I’ve usually found myself applying the Strategy or Template Method patterns later.

Data Clumps

Sometimes fields seem to group naturally together, and are usually passed as parameters and/or used together, like start and end dates, for instance. Whenever you get a “clump” of fields like this, you can use Extract Class to build a new class. This then gives you a chance to slim down parameter lists, and look for the Feature Envy smell, which will give you some idea what methods to move to the new class. I find myself doing this more often these days, and I have a lot more small, useful classes.

Primitive Obsession

This smell also leads to small classes, and is similar to Data Clumps. This time though, we’re looking for primitive types (integers, strings, floats, TDateTimes and so on) that are being used in cases where small objects would be better. Examples are ranges with minimum and maximum values, complex numbers (before Delphi 6, anyway), money classes with a currency and an amount, special string types like phone numbers, post codes, and so on. Replace Data Value with Object is the basic refactoring to use – it simply builds a class out of the field.

If the primitive type is an enumerated type, Fowler recommends using Replace Type Code with Class. However, this is mainly useful for weakly typed languages like C, where the compiler typically treats enumerations as integers, and therefore does no type checking. Pascal is a strongly typed language so I would suggest that normally all you need do is make sure the elements of the type are clearly named.

The exception to this is when the behaviour depends on the value of the enumerated type field. Usually this is signified by the field appearing in lots of conditional and case statements. In this instance, you may want to look at applying the Switch Statements remedies.

Switch Statements

Whenever you see a case statement (called a switch statement in C-based languages), or a large conditional block with lots of else clauses, there may well be an opportunity to apply polymorphism. Usually the case statement depends on an enumerated type field (if it’s a string you tend to get else..if lines). Fowler explains how to either expand the class hierarchy, or implement the State or Strategy pattern.

If the case statement is only rarely used, then there is no need to go to all that trouble. I wouldn’t bother doing anything much if the case or conditional statement is very large and only used once. All I would do is make sure that the code for each case was encapsulated in a method where necessary using Decompose Conditional.

Another possible time to change is if you pass the selector value of the case statement as a parameter, and this value is used to select different behaviours. Replace Parameter with Explicit Methods shows how to rearrange the code so as to call the behaviour directly.
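A sketch of the before and after, using an invented TBox class: SetDimension takes a selector parameter and switches on it, while SetHeight and SetWidth let the caller name the behaviour directly, with no case statement involved.

```pascal
program ExplicitMethodsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TDimension = (dmHeight, dmWidth);

  TBox = class(TObject)
  private
    FHeight : Integer;
    FWidth  : Integer;
  public
    // Before: the parameter selects the behaviour
    procedure SetDimension(Dimension : TDimension; Value : Integer);
    // After: one explicit method per behaviour
    procedure SetHeight(Value : Integer);
    procedure SetWidth(Value : Integer);
    function Area : Integer;
  end;

procedure TBox.SetDimension(Dimension : TDimension; Value : Integer);
begin
  case Dimension of
    dmHeight : FHeight := Value;
    dmWidth  : FWidth  := Value;
  end;
end;

procedure TBox.SetHeight(Value : Integer);
begin
  FHeight := Value;
end;

procedure TBox.SetWidth(Value : Integer);
begin
  FWidth := Value;
end;

function TBox.Area : Integer;
begin
  Result := FHeight * FWidth;
end;

var
  Box : TBox;
begin
  Box := TBox.Create;
  try
    Box.SetHeight(3);
    Box.SetWidth(4);
    WriteLn(Box.Area);  // prints 12
  finally
    Box.Free;
  end;
end.
```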

Finally, I have often seen Delphi code that uses a case statement where an array would do. For instance, this:

function TSomeClass.GetEnumAsStr(EnumValue : TMyEnum) : string;
begin
  case EnumValue of
    meEnum1 : Result := 'Enum1';
    meEnum2 : Result := 'Enum2';
    // More cases
  end;
end;

would be better expressed as:

function TSomeClass.GetEnumAsStr(EnumValue : TMyEnum) : string;
  const
    // One entry per element of TMyEnum, in declaration order
    NameArray : array [TMyEnum] of string = ('Enum1','Enum2',…);
begin
  Result := NameArray[EnumValue];
end;

Parallel Inheritance Hierarchies

This is when making a subclass of one class requires a matching descendant of another class. I’ve only come across this once, when it was necessary (each business object class in an object persistence framework required its own class to map the values to a database). In this case, Fowler’s recommended solution of moving methods and fields from the mappers to the business objects was definitely not appropriate.

Lazy Class

You get this smell when you have a class that isn’t doing enough to justify its existence. You can subsume a class into an ancestor using Collapse Hierarchy, or move it into a class that uses it with Inline Class.

This seems the opposite of some other smells, like Primitive Obsession and Data Clumps. This sort of thing is one reason you can’t apply these remedies without some thought. You may find that you’ve created a small class unnecessarily by following the earlier advice, for example. When you have conflicting smells, I believe the deciding factor is that the code should be easier to understand when the refactorings are done.

Also, compare this smell with Data Class, as they can be similar.

Speculative Generality

This is a really hard one for most programmers to avoid. We tend to build too much flexibility into our code, because “one day” we’ll need it. Remarkably often that day never comes. If the only user of a method or class is a test case (or worse, it is never used), you don’t need it. Remove the method, or use the Lazy Class remedies to remove the class.

Of course, there are occasions when you definitely know you will need a class or method in the future. Sometimes it makes sense to write a set of related methods at the same time (a set of navigation methods through some unusual data structure, for instance) while it’s all clear in your mind. If you’re writing a component or framework for someone else to use, then obviously you sometimes need to supply methods, classes and interfaces that you believe will be necessary (you can still be ruthless, though).

Most of the time, I use the heuristic “if I need it later, I’ll write it later”.

Data Class

You get this smell when part way through the Data Clumps and Primitive Obsession remedies. It’s very similar to Lazy Class as well. It’s when you have classes with no methods, only fields. You can use Move Method to move code that uses the class into it (possibly using Extract Method first; they often go together). Fowler has a bit of other advice regarding using accessor methods for properties (the Get and Set methods), but in Delphi whether or not you use accessors is less important than making sure you always make your fields private and get at them with properties.

Refused Bequest

I have to say I’m not happy with Fowler’s take on this. He thinks it’s ok for subclasses to have a subset of the data and behaviour of an ancestor, at least sometimes. Whenever I’ve come across cases where I wanted to do this sort of thing, I’ve invariably found a better hierarchy or a pattern to apply.

Comments

Comments can be indicators of other smells. If they are there to explain complex code, then the code may need looking at. Extract Method and Rename Method can be used to make the code clearer, and may render some comments superfluous.

Some comments are essential though – self-commenting code is a myth. You particularly need to comment on why something is being done, as good code usually tells you what is being done. Write down the stuff you’re going to forget in 6 months or a year when you come back to maintain your code.

Another example of useful comments in Delphi is the To-do list entries.

Beck and Fowler identify some other smells that you may want to look up: Temporary Field, Message Chains, Middle Man, Alternative Class with Different Interfaces, Incomplete Library Class.

Problems

When you have a new hammer, every problem seems like a nail, and it’s tempting to apply it willy-nilly with no thought as to the consequences. In fact, often the consequences are not known straight away. Refactoring as a discipline is fairly new, although some programmers have been doing elements of it for years. Some problem areas no doubt remain to be found, and so you should think about what you’re doing when applying these techniques. For instance, the techniques in Fowler’s book do not take into account special considerations needed for concurrent or distributed software. Other problem areas he identifies are:

  1. Databases. There is often a tight coupling between a database and an application. This is especially the case if you use data-aware controls in Delphi. It helps if there is a layer between the database and the object model, so that changes to one are insulated from the other. Even then, changes to the database schema normally imply data migration from the old to the new schema, which can be difficult and error-prone.
  2. Changing interfaces (in the declaration sense of the word). Changing the internals of a class is easy, and one of the main advantages of object oriented programming, but changing the interface of the class is more problematic. This is particularly difficult when the class is used by many applications. Component writers are familiar with this dilemma, and sometimes you have to leave the old interface in place as well as the new. (Internally, the old interface should call the new one.) From Delphi 6 onwards, you can make use of the deprecated keyword so that the compiler will issue warnings if the old interface is used (I cannot find a way of deprecating a property though). You should be careful to expose as little of an interface as possible, but again, this decision is difficult for component and framework writers who may have little or no control over how their components and classes are used.
  3. Designs that are difficult to refactor. Some design decisions may be too crucial to refactor your way out of, if you make a mistake. There is not yet much data on which these are, and Fowler has little advice in this area. My current thoughts are that this is somewhat beyond the scope of refactoring. It isn’t a silver bullet, and won’t solve all your problems. Sometimes you mess up in a big way, in which case, see 4.
  4. The code is beyond help. We’ve all seen (and, if we’re honest, most likely written) code that is beyond any chance of rescue. There are times when you just need to bite the bullet and rewrite. Even then, you may be able to use refactoring techniques to encapsulate anything worth saving.
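To illustrate the point about changing interfaces in item 2, here is a minimal sketch of keeping an old method alongside its replacement. TReport and its method names are invented. The old method is marked with the deprecated directive and simply calls the new one, so existing callers keep compiling but get a compiler warning.

```pascal
program DeprecatedSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TReport = class(TObject)
  public
    function RenderText : string;           // new interface
    function GetText : string; deprecated;  // old interface, kept for callers
  end;

function TReport.RenderText : string;
begin
  Result := 'report body';
end;

function TReport.GetText : string;
begin
  Result := RenderText;  // the old method just calls the new one
end;

var
  Report : TReport;
begin
  Report := TReport.Create;
  try
    WriteLn(Report.RenderText);
    WriteLn(Report.GetText);  // still works, but the compiler warns here
  finally
    Report.Free;
  end;
end.
```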

You shouldn’t get the idea that you can refactor your way out of anything. You still need to do some design upfront. Refactoring is most useful when refining and maintaining that design. It allows you to design something simple and clean, and modify it to changing circumstances later.

There are two traps to which many of us fall prey. One is trying to write code flexible enough to cope with any changes we can conceive, rather than any changes we actually know are going to happen. Instead of designing flexibility in, we can design in ease of refactoring, and make changes a systematic, controlled process when they are required.

The other common problem is optimising code before we know it’s a bottleneck. Refactoring helps by providing smaller, cleaner, clearer blocks of code to analyse and for which we can improve performance.

Refactoring Tools

Many of the refactorings are quite mechanical, and lend themselves to being automated. In some languages, notably Smalltalk and Java, such tools exist. Borland’s JBuilder has support for some refactorings built in. Until now, Delphi programmers have had very little help. Most of us have been left to struggle with a text editor. Some tools, like Modelmaker, purport to support refactoring, but as far as I can determine, they don’t have any refactorings built in, they just offer help for you to manually perform them. For those who use them, CodeRush, Modelmaker Explorer and Castalia may make life easier, too.

However, with Diamondback, life has gotten a lot more promising. A new refactoring engine has been added, and while at the time of writing the number of available refactorings is small, there are quite a few more in the pipeline. I should also point out that since this paper was prepared with a preview version, things may (indeed are quite likely to) have changed. Please bear that in mind when looking at the following section. My understanding is that all these refactorings should work with Delphi code in both Win32 and .NET projects, and with C# code, with the exception of the resource string refactoring, which is not applicable to C#.

Rename Symbol

Clicking on a symbol (or positioning the edit window cursor on a symbol) and right clicking brings up the following context menu:

Figure 1

The same refactoring is enabled if we instead use the main Refactoring menu:

Figure 2

Selecting the Rename refactoring using either method (or the Shift-Ctrl-E keyboard shortcut) brings up a dialog which asks for the new name:

Figure 3

As long as "View references before refactoring" is checked, clicking on OK will bring up a dialog which shows all the places the proposed refactoring will affect. It's shown floating in figure 4, but it can of course also be docked.

Figure 4

Note that renaming this class will result in changes being applied in multiple units. Modifying the class name can be done from anywhere the class name is used. In this case I chose the first occurrence in the forward declaration of the class, but I could also have done it from a method declaration in the implementation section of the unit, for example. As the refactoring dialog shows, there will be 6 changes made, covering the class declaration, subclass declarations, method implementations and property types.

It is possible to close the refactoring dialog before actually applying the changes, and add more:

Figure 5

The toolbar buttons at the top of the dialog apply the selected refactoring, undo the refactoring, remove the selected refactoring and remove all refactorings, respectively.

Naturally, there are rules that apply when trying to rename symbols. To begin with, the refactoring will work on methods, variables, fields, classes, records, structs, interfaces, types, and parameters. The Delphi help lists a few rules for renaming:

  1. You cannot rename a symbol to a keyword.
  2. You cannot rename a symbol to the same symbol name.
  3. You cannot rename a symbol from within a dependent project when the project where the original declaration symbol resides is not open.
  4. You cannot rename symbols imported by the compiler.
  5. You cannot rename an overridden method when the base method is declared in a class that is not in your project.
  6. If an error results from a refactoring, the engine cannot apply the change. For example, you cannot rename a symbol to a name that already exists in the same declaration scope. If you still want to rename your symbol, you need to rename the symbol that already has the target name first, then refresh the refactoring. You can also redo the refactoring and select a new name. The refactoring engine traverses parent scopes, searching for a symbol with the same name. If the engine finds a symbol with the same name, it issues a warning.

Renaming a method has a few extra provisions. If the method is overloaded, then the refactoring engine renames only the selected overload and only the calls to that overload (all other overloads remain unaffected). If you try to rename an overridden method, the engine renames all of the base declarations and descendant declarations, which means the original virtual symbol and all overridden symbols that exist. Looking at rule 5 above, you can see that this can only happen if the base class is in your project (or at least in your project group). So you cannot rename the Paint method of a VCL control, for example.
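To make the overload and override provisions concrete, here is a minimal sketch. TReport, TSummaryReport and their methods are hypothetical names invented for the illustration, not code from this paper's examples. The comments mark what the rules above say a rename should and should not touch.

```pascal
program RenameRulesSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical classes, used only to illustrate the renaming rules.
  TReport = class
  public
    // Two overloads. Renaming Print(Copies) should leave the
    // parameterless Print, and every call to it, untouched.
    procedure Print; overload;
    procedure Print(Copies : Integer); overload;
    // A virtual method declared in our own project, so it can be renamed.
    procedure Render; virtual;
  end;

  TSummaryReport = class(TReport)
  public
    // Renaming Render here or in TReport should rename both declarations,
    // because they are the same virtual method.
    procedure Render; override;
  end;

procedure TReport.Print;
begin
  Print(1);  // a call to the Integer overload: affected if that one is renamed
end;

procedure TReport.Print(Copies : Integer);
begin
  WriteLn('Printing ', Copies, ' copies');
end;

procedure TReport.Render;
begin
  WriteLn('Base render');
end;

procedure TSummaryReport.Render;
begin
  inherited Render;
  WriteLn('Summary render');
end;

var
  Report : TReport;
begin
  Report := TSummaryReport.Create;
  try
    Report.Print;     // unaffected by renaming Print(Copies)
    Report.Print(3);  // affected by renaming Print(Copies)
    Report.Render;    // affected by renaming Render in either class
  finally
    Report.Free;
  end;
end.
```

Had TReport lived in a library outside the project, as the VCL does, rule 5 would prevent renaming Render at all.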

Sync Edit

A new feature that is closely related to the Rename Symbol refactoring is Sync Edit. It's easier to show how this works than to explain it. Suppose we have a section of code where the same identifier is used several times. Selecting that block of code causes the Sync Edit icon to appear in the gutter on the left hand side of the code editor:

Figure 6

Note that I have not selected the entire method. I've done this for a reason that will become obvious shortly. Clicking on the icon gives us the display in figure 7:

Figure 7

There are several things to note here. One is that there are several duplicate identifiers in this snippet of code. Besides FileName, there are FStrategy, ExtractFileExt, Self and Create. All duplicates are underlined, and the currently selected identifier is highlighted, with its duplicates surrounded by a box. The first issue with Sync Edit is now apparent. Notice that the word "create" in the comment is also underlined, as is the word "the". Sync Edit does not do any analysis of the text; it works purely by matching text strings. So by using the term "identifier" I have been misleading you somewhat.

To move between different duplicated strings, you can use the Tab key. In our example, hitting Tab once results in this:

Figure 8

But we'll edit FileName and see what happens:

Figure 9

The other big difference between Sync Edit and the Rename Symbol refactoring should be obvious. Because the method declaration line was not highlighted, changing FileName did not change the parameter declaration. This is why the code has red wiggly lines under each instance of FileName in the method body. Had we used Rename Symbol on any occurrence of FileName, it would have been changed everywhere that was necessary (except if we had needed to change it in the comment). Sync Edit is handy, but it is not a refactoring tool, exactly.

Extract Method

As far as I'm concerned, the best news is the inclusion of Extract Method. This refactoring is core to most of the more complex refactorings, and it shows what the engine is capable of, which gives me great hope for the future. To see how it works, we can look at the same method we examined in the Sync Edit example (this is code from my "More Design Patterns" paper, by the way):

procedure TDocument.OpenFile(const FileName : string);
begin
  FFileText.LoadFromFile(FileName);

  // Could use Factory Method here, but for now, just inline the code to
  // create the new strategy object
  FreeAndNil(FStrategy);

  if ExtractFileExt(FileName) = '.csv' then begin
    FStrategy := TCsvStrategy.Create(Self);
  end else if ExtractFileExt(FileName) = '.xml' then begin
    FStrategy := TXmlStrategy.Create(Self);
  end;
end;

As you can see from the comment, when I was writing the code, I spotted an opportunity to use the Factory Method pattern. Since there are only two strategy classes in the application, and I only do this in the one fairly simple method, I decided that there was no immediate value in splitting that out. For the sake of the demonstration, let's assume the time has come to create a separate factory method. What we do is select the code we want in the new method, and right click (we could also use the main Refactoring menu, or the Shift-Ctrl-M shortcut):

Figure 10

Selecting the Extract Method refactoring brings up the following dialog:

Figure 11

This gives us the opportunity to choose the name for the new method, and shows what it will look like. If we enter "StrategyFactory" as the name, the refactoring engine makes all the necessary changes to the code, so that we end up with this:

procedure TDocument.OpenFile(const FileName : string);
begin
  FFileText.LoadFromFile(FileName);
  StrategyFactory(FileName);
end;

procedure TDocument.StrategyFactory(FileName : string);
begin
  FreeAndNil(FStrategy);

  if ExtractFileExt(FileName) = '.csv' then begin
    FStrategy := TCsvStrategy.Create(Self);
  end else if ExtractFileExt(FileName) = '.xml' then begin
    FStrategy := TXmlStrategy.Create(Self);
  end;
end;

The declaration of the new method is also made in the interface section of the unit (in the private section of the TDocument class). In case you're worried about the formatting changing, I actually went and tidied it up a bit afterwards, and removed the comment, which also made it into the new method. Notice that the refactoring engine dealt with the FileName parameter correctly. It also deals with local variables. However, at the time of writing it did not deal with extracting code from nested procedures. Also, I've only managed to get procedures extracted, not functions, although the documentation suggests both are possible. However, this is a very useful tool, which I know I'll use extensively.
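A factory method more commonly returns the object it creates, so the extracted procedure is a natural candidate for conversion to a function by hand. The following is only a sketch of how that might look, not output from the refactoring engine. TStrategy as a common base class, the CreateStrategy name and the Strategy property are assumptions made so that the example compiles on its own, and the call to FFileText.LoadFromFile is left out so that it runs without a file on disk.

```pascal
program FactoryFunctionSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TDocument = class;

  // TStrategy is a hypothetical base class, assumed for this sketch.
  TStrategy = class
  private
    FDocument : TDocument;
  public
    constructor Create(ADocument : TDocument);
  end;

  TCsvStrategy = class(TStrategy);
  TXmlStrategy = class(TStrategy);

  TDocument = class
  private
    FStrategy : TStrategy;
    // Hand-written function form of the extracted StrategyFactory procedure.
    function CreateStrategy(const FileName : string) : TStrategy;
  public
    destructor Destroy; override;
    procedure OpenFile(const FileName : string);
    property Strategy : TStrategy read FStrategy;
  end;

constructor TStrategy.Create(ADocument : TDocument);
begin
  inherited Create;
  FDocument := ADocument;
end;

function TDocument.CreateStrategy(const FileName : string) : TStrategy;
var
  Ext : string;
begin
  Ext := ExtractFileExt(FileName);
  if Ext = '.csv' then begin
    Result := TCsvStrategy.Create(Self);
  end else if Ext = '.xml' then begin
    Result := TXmlStrategy.Create(Self);
  end else begin
    Result := nil;  // unknown extension: no strategy, as in the original code
  end;
end;

procedure TDocument.OpenFile(const FileName : string);
begin
  // Loading the file text is omitted in this sketch.
  FreeAndNil(FStrategy);
  FStrategy := CreateStrategy(FileName);
end;

destructor TDocument.Destroy;
begin
  FStrategy.Free;
  inherited Destroy;
end;

var
  Doc : TDocument;
begin
  Doc := TDocument.Create;
  try
    Doc.OpenFile('orders.csv');
    WriteLn(Doc.Strategy.ClassName);  // writes TCsvStrategy
    Doc.OpenFile('orders.xml');
    WriteLn(Doc.Strategy.ClassName);  // writes TXmlStrategy
  finally
    Doc.Free;
  end;
end.
```

With this form, freeing the old strategy stays in OpenFile and the factory only decides which class to create, which should make it easier to move into a separate factory class later if more strategies appear.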

The other refactoring built in at the time of writing replaces a string literal with a resource string, which is essentially a variation on Replace Magic Number with Symbolic Constant. This refactoring only applies to Delphi, and a resource string is used over a constant because that is more useful when globalising an application. The refactoring engine is also used for other things, like automatically declaring local variables and fields when you have referenced them in code but not written their declarations yet. While undoubtedly useful, this is more a code completion feature than a refactoring as such, so I won't go into that more here.
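As a rough illustration of what the resource string refactoring amounts to, the sketch below shows the "after" state, with the original literal kept in a comment. The identifier SUnknownFileType and the message text are made up for the example, and the engine's actual naming and placement of the declaration may differ.

```pascal
program ResourceStringSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

resourcestring
  // Hypothetical identifier and text, chosen for this illustration only.
  SUnknownFileType = 'Unknown file type';

begin
  // Before the refactoring: WriteLn('Unknown file type');
  WriteLn(SUnknownFileType);
end.
```

Because the text now lives in a resource string rather than a constant, it can be translated without touching the code that uses it.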

Similarly, the facility to import namespaces, and the "Find References" function (introduced in Delphi 8), are very handy new features that you might want to have a look at. Also included is integration with NUnit and DUnit for writing unit tests, which as I keep pointing out, are essential for refactoring (and many of the so-called Agile methodologies). All in all, there are some great new productivity boosting abilities in Diamondback.

Summary

By now, you should have a fairly good idea what refactoring is all about - what it is, and how you can use it. Like learning about design patterns for the first time, it can take a while for the information to sink in, and for it to become second nature, so you shouldn't try to memorise Fowler's entire book straightaway. But start to apply the techniques, even in a little way at first, and you will soon find it paying dividends. Even if you think all we have discussed is familiar territory, a browse through the book or website can still turn up some hidden gems, and at the very least, should make you look at your code with a more critical eye. That can't be a bad thing.

And by the way, did I mention unit testing?

References

Fowler, Martin. Refactoring. Improving the Design of Existing Code. 2000, Addison-Wesley ISBN 0-201-48567-2

Refactoring website www.refactoring.com for more refactorings, tools, and corrections to the book.

Fowler, Martin. Crossing Refactoring’s Rubicon. www.refactoring.com

Fowler, Martin. Refactoring: Doing Design After the Program Runs. www.refactoring.com

Kerievsky, Joshua. Refactoring to Patterns. 2004, Addison-Wesley ISBN 0321213351

Smith, Brandon. Refactoring in the Real World. The Delphi Magazine, Issue 66, February 2001.

McConnell, Steve. Code Complete. 1993, Microsoft Press ISBN 1-55615-484-4
A new edition has recently been released.

Gamma, Erich et al. Design Patterns. Elements of Reusable Object-Oriented Software. 1995, Addison-Wesley ISBN 0-201-63361-2

Bracken, Rob. Testing: Quality Time with DUnit. The Delphi Magazine, Issue 76, December 2001.

Groves, Malcolm. Automated Unit Testing with Delphi. DCon 2002.

DUnit website dunit.sourceforge.net/

NUnit website www.nunit.org

Carter, Joanna. Most Valuable Player – Parts 1 to 3. UK-BUG Magazine, November/December 2001 to March/April 2002.