Refactoring - CL2 Flashcards

1
Q

Moving Features Between Objects (basic)
Move Method
Move Field

A

MOVE FUNCTION
formerly: Move Method

Motivation
The heart of a good software design is its modularity—which is my ability to make most
modifications to a program while only having to understand a small part of it. To get
this modularity, I need to ensure that related software elements are grouped together
and the links between them are easy to find and understand. But my understanding of
how to do this isn’t static—as I better understand what I’m doing, I learn how to best
group together software elements. To reflect that growing understanding, I need to
move elements around.
All functions live in some context; it may be global, but usually it’s some form of a
module. In an object-oriented program, the core modular context is a class. Nesting a
function within another creates another common context. Different languages provide
varied forms of modularity, each creating a context for a function to live in.
One of the most straightforward reasons to move a function is when it references
elements in other contexts more than the one it currently resides in. Moving it together
with those elements often improves encapsulation, allowing other parts of the software
to be less dependent on the details of this module.
Similarly, I may move a function because of where its callers live, or where I need to call
it from in my next enhancement. A function defined as a helper inside another function
may have value on its own, so it’s worth moving it to somewhere more accessible. A
method on a class may be easier for me to use if shifted to another.
Deciding to move a function is rarely an easy decision. To help me decide, I examine the
current and candidate contexts for that function. I need to look at what functions call
this one, what functions are called by the moving function, and what data that function
uses. Often, I see that I need a new context for a group of functions and create one with
Combine Functions into Class (144) or Extract Class (182). Although it can be difficult
to decide where the best place for a function is, the more difficult this choice, often the
less it matters. I find it valuable to try working with functions in one context, knowing
I’ll learn how well they fit, and if they don’t fit I can always move them later.

Mechanics
Examine all the program elements used by the chosen function in its current
context. Consider whether they should move too.
If I find a called function that should also move, I usually move it first. That way,
moving a cluster of functions begins with the one that has the least dependency on
the others in the group.
If a high-level function is the only caller of subfunctions, then you can inline those
functions into the high-level method, move, and reextract at the destination.
Check if the chosen function is a polymorphic method.
If I’m in an object-oriented language, I have to take account of super- and subclass
declarations.
Copy the function to the target context. Adjust it to fit in its new home.
If the body uses elements in the source context, I need to either pass those elements
as parameters or pass a reference to that source context.
Moving a function often means I need to come up with a different name that works
better in the new context.
Perform static analysis.
Figure out how to reference the target function from the source context.
Turn the source function into a delegating function.
Test.
Consider Inline Function (115) on the source function.
The source function can stay indefinitely as a delegating function. But if its callers
can just as easily reach the target directly, then it’s better to remove the middle
man.
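
A minimal sketch of the Move Function mechanics above, in JavaScript. The Account and AccountType classes, the overdraftCharge name, and all the figures are hypothetical, chosen only to illustrate the steps.

```javascript
// Hypothetical example: overdraftCharge depends mostly on AccountType data,
// so it moves there. Account keeps a delegating getter.
class AccountType {
  constructor(isPremium) {
    this.isPremium = isPremium;
  }

  // Target context: the function after the move. The one value it still needs
  // from the source context (daysOverdrawn) arrives as a parameter.
  overdraftCharge(daysOverdrawn) {
    if (!this.isPremium) return daysOverdrawn * 1.75;
    const baseCharge = 10;
    return daysOverdrawn <= 7
      ? baseCharge
      : baseCharge + (daysOverdrawn - 7) * 0.85;
  }
}

class Account {
  constructor(type, daysOverdrawn) {
    this.type = type;
    this.daysOverdrawn = daysOverdrawn;
  }

  // Source context: now only delegates. It could later be inlined into callers.
  get overdraftCharge() {
    return this.type.overdraftCharge(this.daysOverdrawn);
  }
}

const premium = new Account(new AccountType(true), 5);
const standard = new Account(new AccountType(false), 4);
console.log(premium.overdraftCharge, standard.overdraftCharge);
```

Passing daysOverdrawn as a parameter is one of the two options named in the mechanics; passing the whole account would be the other, and may fit better if the target needs more data from the source later.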

MOVE FIELD
Motivation
Programming involves writing a lot of code that implements behavior—but the strength
of a program is really founded on its data structures. If I have a good set of data
structures that match the problem, then my behavior code is simple and
straightforward. But poor data structures lead to lots of code whose job is merely
dealing with the poor data. And it’s not just messier code that’s harder to understand; it
also means the data structures obscure what the program is doing.
So, data structures are important—but like most aspects of programming they are hard
to get right. I do make an initial analysis to figure out the best data structures, and I’ve
found that experience and techniques like domain-driven design have improved my
ability to do that. But despite all my skill and experience, I still find that I frequently
make mistakes in that initial design. In the process of programming, I learn more about
the problem domain and my data structures. A design decision that is reasonable and
correct one week can become wrong in another.
As soon as I realize that a data structure isn’t right, it’s vital to change it. If I leave my
data structures with their blemishes, those blemishes will confuse my thinking and
complicate my code far into the future.
I may seek to move data because I find I always need to pass a field from one record
whenever I pass another record to a function. Pieces of data that are always passed to
functions together are usually best put in a single record in order to clarify their
relationship. Change is also a factor; if a change in one record causes a field in another
record to change too, that’s a sign of a field in the wrong place. If I have to update the
same field in multiple structures, that’s a sign that it should move to another place
where it only needs to be updated once.
I usually do Move Field in the context of a broader set of changes. Once I’ve moved a
field, I find that many of the users of the field are better off accessing that data through
the target object rather than the original source. I then change these with later
refactorings. Similarly, I may find that I can’t do Move Field at the moment due to the
way the data is used. I need to refactor some usage patterns first, then do the move.
In my description so far, I’m saying “record,” but all this is true of classes and objects
too. A class is a record type with attached functions—and these need to be kept healthy
just as much as any other data. The attached functions do make it easier to move data
around, since the data is encapsulated behind accessor methods. I can move the data,
change the accessors, and clients of the accessors will still work. So, this is a refactoring
that’s easier to do if you have classes, and my description below makes that assumption.
If I’m using bare records that don’t support encapsulation, I can still make a change like
this, but it is more tricky.

Mechanics
Ensure the source field is encapsulated.
Test.
Create a field (and accessors) in the target.
Run static checks.
Ensure there is a reference from the source object to the target object.
An existing field or method may give you the target. If not, see if you can easily
create a method that will do so. Failing that, you may need to create a new field in
the source object that can store the target. This may be a permanent change, but you
can also do it temporarily until you have done enough refactoring in the broader
context.
Adjust accessors to use the target field.
If the target is shared between source objects, consider first updating the setter to
modify both target and source fields, followed by Introduce Assertion (302) to
detect inconsistent updates. Once you determine all is well, finish changing the
accessors to use the target field.
Test.
Remove the source field.
Test.
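
A minimal sketch of where the Move Field mechanics end up, in JavaScript. The Customer and CustomerContract classes and the discountRate field are hypothetical names used only for illustration.

```javascript
// Hypothetical example: discountRate has moved from Customer to CustomerContract.
class CustomerContract {
  constructor(startDate, discountRate) {
    this._startDate = startDate;
    this._discountRate = discountRate; // the field now lives in the target
  }
  get discountRate() { return this._discountRate; }
  set discountRate(value) { this._discountRate = value; }
}

class Customer {
  constructor(name, discountRate) {
    this._name = name;
    // Reference from the source object to the target object.
    this._contract = new CustomerContract(new Date(), discountRate);
  }
  // The accessor keeps its name, so existing clients still work.
  get discountRate() { return this._contract.discountRate; }
  becomePreferred() { this._contract.discountRate += 0.03; }
  applyDiscount(amount) { return amount - amount * this.discountRate; }
}

const customer = new Customer("Acme", 0.25);
console.log(customer.applyDiscount(200));
```

Because the source field was encapsulated first, only the accessor bodies changed; callers of discountRate and applyDiscount are unaffected.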

2
Q

Organizing Data (basic)
Encapsulate Field
Encapsulate Collection

A

ENCAPSULATE RECORD
formerly: Replace Record with Data Class

Motivation
This is why I often favor objects over records for mutable data. With objects, I can hide
what is stored and provide methods for all three values. The user of the object doesn’t
need to know or care which is stored and which is calculated. This encapsulation also
helps with renaming: I can rename the field while providing methods for both the new
and the old names, gradually updating callers until they are all done.
I just said I favor objects for mutable data. If I have an immutable value, I can just have
all three values in my record, using an enrichment step if necessary. Similarly, it’s easy
to copy the field when renaming.
I can have two kinds of record structures: those where I declare the legal field names
and those that allow me to use whatever I like. The latter are often implemented
through a library class called something like hash, map, hashmap, dictionary, or
associative array. Many languages provide convenient syntax for creating hashmaps,
which makes them useful in many programming situations. The downside of using
them is that they aren’t explicit about their fields. The only way I can tell if they use
start/end or start/length is by looking at where they are created and used. This isn’t a
problem if they are only used in a small section of a program, but the wider their scope
of usage, the greater problem I get from their implicit structure. I could refactor such
implicit records into explicit ones, but if I need to do that, I’d rather make them classes instead.
It’s common to pass nested structures of lists and hashmaps which are often serialized
into formats like JSON or XML. Such structures can be encapsulated too, which helps if
their formats change later on or if I’m concerned about updates to the data that are
hard to keep track of.

Mechanics
Use Encapsulate Variable (132) on the variable holding the record.
Give the functions that encapsulate the record names that are easily searchable.
Replace the content of the variable with a simple class that wraps the record. Define
an accessor inside this class that returns the raw record. Modify the functions that
encapsulate the variable to use this accessor.
Test.
Provide new functions that return the object rather than the raw record.
For each user of the record, replace its use of a function that returns the record with
a function that returns the object. Use an accessor on the object to get at the field
data, creating that accessor if needed. Test after each change.
If it’s a complex record, such as one with a nested structure, focus on clients that
update the data first. Consider returning a copy or read-only proxy of the data for
clients that only read the data.
Remove the class’s raw data accessor and the easily searchable functions that
returned the raw record.
Test.
If the fields of the record are themselves structures, consider using Encapsulate
Record and Encapsulate Collection (170) recursively.
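
A minimal sketch of the end state of Encapsulate Record, in JavaScript. The organization record, its fields, and the getOrganization function are hypothetical; the intermediate step with a raw-data accessor is omitted for brevity.

```javascript
// Hypothetical record that used to be read and written directly.
const organizationData = { name: "Acme Gooseberries", country: "GB" };

// Wrapper class: the record's fields are copied in and hidden behind accessors.
class Organization {
  constructor(data) {
    this._name = data.name;
    this._country = data.country;
  }
  get name() { return this._name; }
  set name(value) { this._name = value; }
  get country() { return this._country; }
  set country(value) { this._country = value; }
}

const organization = new Organization(organizationData);

// Encapsulating function: clients call this instead of touching the record.
function getOrganization() { return organization; }

getOrganization().name = "Acme Berries";
console.log(getOrganization().name);
```

Copying the fields into the object, rather than holding on to the input record, breaks the link to the original data, so later writes go only through the setters.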

ENCAPSULATE VARIABLE

formerly: Self-Encapsulate Field
formerly: Encapsulate Field

Motivation
Refactoring is all about manipulating the elements of our programs. Data is more
awkward to manipulate than functions. Since using a function usually means calling it,
I can easily rename or move a function while keeping the old function intact as a
forwarding function (so my old code calls the old function, which calls the new
function). I’ll usually not keep this forwarding function around for long, but it does
simplify the refactoring.
Data is more awkward because I can’t do that. If I move data around, I have to change
all the references to the data in a single cycle to keep the code working. For data with a
very small scope of access, such as a temporary variable in a small function, this isn’t a
problem. But as the scope grows, so does the difficulty, which is why global data is such
a pain.
So if I want to move widely accessed data, often the best approach is to first encapsulate
it by routing all its access through functions. That way, I turn the difficult task of
reorganizing data into the simpler task of reorganizing functions.
Encapsulating data is valuable for other things too. It provides a clear point to monitor
changes and use of the data; I can easily add validation or consequential logic on the
updates. It is my habit to make all mutable data encapsulated like this and only
accessed through functions if its scope is greater than a single function. The greater the
scope of the data, the more important it is to encapsulate. My approach with legacy
code is that whenever I need to change or add a new reference to such a variable, I
should take the opportunity to encapsulate it. That way I prevent the increase of
coupling to commonly used data.
This principle is why the object-oriented approach puts so much emphasis on keeping
an object’s data private. Whenever I see a public field, I consider using Encapsulate
Variable (in that case often called Encapsulate Field) to reduce its visibility. Some go
further and argue that even internal references to fields within a class should go
through accessor functions, an approach known as self-encapsulation. On the whole, I
find self-encapsulation excessive: if a class is so big that I need to self-encapsulate its
fields, it needs to be broken up anyway. But self-encapsulating a field is a useful step
before splitting a class.
Keeping data encapsulated is much less important for immutable data. When the data
doesn’t change, I don’t need a place to put in validation or other logic hooks before
updates. I can also freely copy the data rather than move it—so I don’t have to change
references from old locations, nor do I worry about sections of code getting stale data.
Immutability is a powerful preservative.

Mechanics
Create encapsulating functions to access and update the variable.
Run static checks.
For each reference to the variable, replace with a call to the appropriate
encapsulating function. Test after each replacement.
Restrict the visibility of the variable.
Sometimes it’s not possible to prevent access to the variable. If so, it may be useful
to detect any remaining references by renaming the variable and testing.
Test.
If the value of the variable is a record, consider Encapsulate Record (162).
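
A minimal sketch of Encapsulate Variable in JavaScript. The defaultOwnerData variable and the two function names are hypothetical. Returning and storing copies is an optional extra beyond the mechanics above, shown here as one way to stop callers from changing the stored value through a shared reference.

```javascript
// Hypothetical module-level data that used to be accessed directly.
let defaultOwnerData = { firstName: "Ada", lastName: "Lovelace" };

// Encapsulating functions: every read and write now goes through these.
function defaultOwner() {
  // Return a shallow copy so callers cannot alter the stored value by accident.
  return Object.assign({}, defaultOwnerData);
}

function setDefaultOwner(owner) {
  // A natural place to add validation or logging later.
  defaultOwnerData = Object.assign({}, owner);
}

const owner = defaultOwner();
owner.lastName = "Changed"; // only changes the caller's copy
const unchangedLastName = defaultOwner().lastName;

setDefaultOwner({ firstName: "Grace", lastName: "Hopper" });
console.log(unchangedLastName, defaultOwner().lastName);
```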

ENCAPSULATE COLLECTION

Motivation
I like encapsulating any mutable data in my programs. This makes it easier to see when
and how data structures are modified, which then makes it easier to change those data
structures when I need to. Encapsulation is often encouraged, particularly by object-oriented
developers, but a common mistake occurs when working with collections.
Access to a collection variable may be encapsulated, but if the getter returns the
collection itself, then that collection’s membership can be altered without the enclosing
class being able to intervene.
To avoid this, I provide collection modifier methods—usually add and remove—on the
class itself. This way, changes to the collection go through the owning class, giving me
the opportunity to modify such changes as the program evolves.
If the team has the habit of not modifying collections outside the original module, just
providing these methods may be enough. However, it’s usually unwise to rely on such
habits; a mistake here can lead to bugs that are difficult to track down later. A better
approach is to ensure that the getter for the collection does not return the raw
collection, so that clients cannot accidentally change it.
One way to prevent modification of the underlying collection is by never returning a
collection value. In this approach, any use of a collection field is done with specific
methods on the owning class, replacing aCustomer.orders.size with
aCustomer.numberOfOrders. I don’t agree with this approach. Modern languages
have rich collection classes with standardized interfaces, which can be combined in
useful ways such as Collection Pipelines [mf-cp]. Putting in special methods to handle
this kind of functionality adds a lot of extra code and cripples the easy composability of
collection operations.
Another way is to allow some form of read-only access to a collection. Java, for
example, makes it easy to return a read-only proxy to the collection. Such a proxy
forwards all reads to the underlying collection, but blocks all writes—in Java’s case,
throwing an exception. A similar route is used by libraries that base their collection
composition on some kind of iterator or enumerable object—providing that iterator
cannot modify the underlying collection.
Probably the most common approach is to provide a getting method for the collection,
but make it return a copy of the underlying collection. That way, any modifications to
the copy don’t affect the encapsulated collection. This might cause some confusion if
programmers expect the returned collection to modify the source field—but in many
code bases, programmers are used to collection getters providing copies. If the
collection is huge, this may be a performance issue—but most lists aren’t all that big, so
the general rules for performance should apply (Refactoring and Performance (64)).
Another difference between using a proxy and a copy is that a modification of the
source data will be visible in the proxy but not in a copy. This isn’t an issue most of the
time, because lists accessed in this way are usually only held for a short time.
What’s important here is consistency within a code base. Use only one mechanism so
everyone can get used to how it behaves and expect it when calling any collection
accessor function.

Mechanics
Apply Encapsulate Variable (132) if the reference to the collection isn’t already
encapsulated.
Add functions to add and remove elements from the collection.
If there is a setter for the collection, use Remove Setting Method (331) if possible. If
not, make it take a copy of the provided collection.
Run static checks.
Find all references to the collection. If anyone calls modifiers on the collection,
change them to use the new add/remove functions. Test after each change.
Modify the getter for the collection to return a protected view on it, using a read-only
proxy or a copy.
Test.
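
A minimal sketch of Encapsulate Collection in JavaScript, using the copy-returning getter described above. The Person class and its courses collection are hypothetical names used only for illustration.

```javascript
// Hypothetical Person class that owns a collection of course names.
class Person {
  constructor(name) {
    this._name = name;
    this._courses = [];
  }

  // The getter returns a copy, so callers cannot alter the membership directly.
  get courses() { return this._courses.slice(); }

  // All changes to the collection go through the owning class.
  addCourse(course) { this._courses.push(course); }

  removeCourse(course) {
    const index = this._courses.indexOf(course);
    if (index === -1) throw new RangeError("course not found: " + course);
    this._courses.splice(index, 1);
  }
}

const person = new Person("Kim");
person.addCourse("Math");
person.addCourse("Physics");
person.courses.push("Sneaky"); // modifies only the returned copy
person.removeCourse("Math");
console.log(person.courses);
```

Throwing when asked to remove a missing element is a design choice in this sketch; silently ignoring the request would be an equally valid convention as long as the code base is consistent about it.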

3
Q
Composing Methods (basic)
Extract Method
Inline Method
Inline Temp
Replace Temp with Query
Split Temporary Variable
A

EXTRACT FUNCTION
formerly: Extract Method

Motivation
Extract Function is one of the most common refactorings I do. (Here, I use the term
“function” but the same is true for a method in an object-oriented language, or any kind
of procedure or subroutine.) I look at a fragment of code, understand what it is doing,
then extract it into its own function named after its purpose.
During my career, I’ve heard many arguments about when to enclose code in its own
function. Some of these guidelines were based on length: Functions should be no larger
than fit on a screen. Some were based on reuse: Any code used more than once should
be put in its own function, but code only used once should be left inline. The argument
that makes most sense to me, however, is the separation between intention and
implementation. If you have to spend effort looking at a fragment of code and figuring
out what it’s doing, then you should extract it into a function and name the function
after the “what.” Then, when you read it again, the purpose of the function leaps right
out at you, and most of the time you won’t need to care about how the function fulfills
its purpose (which is the body of the function).
Once I accepted this principle, I developed a habit of writing very small functions,
typically only a few lines long. To me, any function with more than half-a-dozen lines
of code starts to smell, and it’s not unusual for me to have functions that are a single
line of code. The fact that size isn’t important was brought home to me by an example
that Kent Beck showed me from the original Smalltalk system. Smalltalk in those days
ran on black-and-white systems. If you wanted to highlight some text or graphics, you
would reverse the video. Smalltalk’s graphics class had a method for this called
highlight, whose implementation was just a call to the method reverse. The name
of the method was longer than its implementation—but that didn’t matter because
there was a big distance between the intention of the code and its implementation.
Some people are concerned about short functions because they worry about the
performance cost of a function call. When I was young, that was occasionally a factor,
but that’s very rare now. Optimizing compilers often work better with shorter functions
which can be cached more easily. As always, follow the general guidelines on
performance optimization.
Small functions like this only work if the names are good, so you need to pay good
attention to naming. This takes practice—but once you get good at it, this approach can
make code remarkably self-documenting.
Often, I see fragments of code in a larger function that start with a comment to say what
they do. The comment is often a good hint for the name of the function when I extract
that fragment.

Mechanics
Create a new function, and name it after the intent of the function (name it by what
it does, not by how it does it).
If the code I want to extract is very simple, such as a single function call, I still
extract it if the name of the new function will reveal the intent of the code in a better
way. If I can’t come up with a more meaningful name, that’s a sign that I shouldn’t
extract the code. However, I don’t have to come up with the best name right away;
sometimes a good name only appears as I work with the extraction. It’s OK to
extract a function, try to work with it, realize it isn’t helping, and then inline it back
again. As long as I’ve learned something, my time wasn’t wasted.
If the language supports nested functions, nest the extracted function inside the
source function. That will reduce the amount of out-of-scope variables to deal with
after the next couple of steps. I can always use Move Function (198) later.
Copy the extracted code from the source function into the new target function.
Scan the extracted code for references to any variables that are local in scope to the
source function and will not be in scope for the extracted function. Pass them as
parameters.
If I extract into a nested function of the source function, I don’t run into these
problems.
Usually, these are local variables and parameters to the function. The most general
approach is to pass all such parameters in as arguments. There are usually no
difficulties for variables that are used but not assigned to.
If a variable is only used inside the extracted code but is declared outside, move the
declaration into the extracted code.
Any variables that are assigned to need more care if they are passed by value. If
there’s only one of them, I try to treat the extracted code as a query and assign the
result to the variable concerned.
Sometimes, I find that too many local variables are being assigned by the extracted
code. It’s better to abandon the extraction at this point. When this happens, I
consider other refactorings such as Split Variable (240) or Replace Temp with
Query (178) to simplify variable usage and revisit the extraction later.
Compile after all variables are dealt with.
Once all the variables are dealt with, it can be useful to compile if the language
environment does compile-time checks. Often, this will help find any variables that
haven’t been dealt with properly.
Replace the extracted code in the source function with a call to the target function.
Test.
Look for other code that’s the same or similar to the code just extracted, and
consider using Replace Inline Code with Function Call (222) to call the new
function.
Some refactoring tools support this directly. Otherwise, it can be worth doing some
quick searches to see if duplicate code exists elsewhere.
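
A minimal sketch of the result of Extract Function in JavaScript. The printOwing function, the invoice shape, and the helper names are hypothetical. It shows the two variable-handling cases from the mechanics: a local that is only read becomes a parameter, and a single assigned variable becomes the return value of a query.

```javascript
// Hypothetical invoice shape: { customer, orders: [{ amount }] }.
function printOwing(invoice) {
  const outstanding = calculateOutstanding(invoice);
  return formatDetails(invoice, outstanding);
}

// Extracted query: the one variable the fragment assigned is now the result.
function calculateOutstanding(invoice) {
  let result = 0;
  for (const order of invoice.orders) {
    result += order.amount;
  }
  return result;
}

// Extracted fragment: locals it reads from the source are passed as parameters.
function formatDetails(invoice, outstanding) {
  return "name: " + invoice.customer + ", amount: " + outstanding;
}

const sampleInvoice = { customer: "BigCo", orders: [{ amount: 20 }, { amount: 30 }] };
console.log(printOwing(sampleInvoice));
```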

INLINE FUNCTION
formerly: Inline Method

Motivation
One of the themes of this book is using short functions named to show their intent,
because these functions lead to clearer and easier to read code. But sometimes, I do
come across a function in which the body is as clear as the name. Or, I refactor the body
of the code into something that is just as clear as the name. When this happens, I get rid
of the function. Indirection can be helpful, but needless indirection is irritating.
I also use Inline Function when I have a group of functions that seem badly factored.
I can inline them all into one big function and then reextract the functions the way I
prefer.
I commonly use Inline Function when I see code that’s using too much indirection—
when it seems that every function does simple delegation to another function, and I get
lost in all the delegation. Some of this indirection may be worthwhile, but not all of it.
By inlining, I can flush out the useful ones and eliminate the rest.

Mechanics
Check that this isn’t a polymorphic method.
If this is a method in a class, and has subclasses that override it, then I can’t inline
it.
Find all the callers of the function.
Replace each call with the function’s body.
Test after each replacement.
The entire inlining doesn’t have to be done all at once. If some parts of the inline are
tricky, they can be done gradually as opportunity permits.
Remove the function definition.
Written this way, Inline Function is simple. In general, it isn’t. I could write pages on
how to handle recursion, multiple return points, inlining a method into another object
when you don’t have accessors, and the like. The reason I don’t is that if you encounter
these complexities, you shouldn’t do this refactoring.
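
A minimal sketch of the simple case of Inline Function in JavaScript. The driver record and function names are hypothetical; the "before" pair is kept alongside the "after" version only so the two can be compared.

```javascript
// Before (hypothetical): the helper's body is as clear as its name.
function moreThanFiveLateDeliveries(driver) {
  return driver.numberOfLateDeliveries > 5;
}
function ratingBefore(driver) {
  return moreThanFiveLateDeliveries(driver) ? 2 : 1;
}

// After: the call is replaced with the helper's body. Once no callers remain,
// the helper's definition can be removed.
function rating(driver) {
  return driver.numberOfLateDeliveries > 5 ? 2 : 1;
}

console.log(rating({ numberOfLateDeliveries: 7 }));
```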

EXTRACT VARIABLE
formerly: Introduce Explaining Variable

Motivation
Expressions can become very complex and hard to read. In such situations, local
variables may help break the expression down into something more manageable. In
particular, they give me an ability to name a part of a more complex piece of logic. This
allows me to better understand the purpose of what’s happening.
Such variables are also handy for debugging, since they provide an easy hook for a
debugger or print statement to capture.
If I’m considering Extract Variable, it means I want to add a name to an expression in
my code. Once I’ve decided I want to do that, I also think about the context of that
name. If it’s only meaningful within the function I’m working on, then Extract Variable
is a good choice—but if it makes sense in a broader context, I’ll consider making the
name available in that broader context, usually as a function. If the name is available
more widely, then other code can use that expression without having to repeat the
expression, leading to less duplication and a better statement of my intent.
The downside of promoting the name to a broader context is extra effort. If it’s
significantly more effort, I’m likely to leave it till later when I can use Replace Temp
with Query (178). But if it’s easy, I like to do it now so the name is immediately
available in the code. As a good example of this, if I’m working in a class, then Extract
Function (106) is very easy to do.

Mechanics
Ensure that the expression you want to extract does not have side effects.
Declare an immutable variable. Set it to a copy of the expression you want to name.
Replace the original expression with the new variable.
Test.
If the expression appears more than once, replace each occurrence with the variable,
testing after each replacement.
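
A minimal sketch of Extract Variable in JavaScript. The pricing rule, the order shape, and the variable names are hypothetical; the point is only that each side-effect-free sub-expression gets an immutable name.

```javascript
// Before (hypothetical pricing rule): one dense expression.
function priceBefore(order) {
  return order.quantity * order.itemPrice
    - Math.max(0, order.quantity - 500) * order.itemPrice * 0.05
    + Math.min(order.quantity * order.itemPrice * 0.1, 100);
}

// After: each part of the expression is named with a const.
function price(order) {
  const basePrice = order.quantity * order.itemPrice;
  const quantityDiscount =
    Math.max(0, order.quantity - 500) * order.itemPrice * 0.05;
  const shipping = Math.min(basePrice * 0.1, 100);
  return basePrice - quantityDiscount + shipping;
}

console.log(price({ quantity: 10, itemPrice: 5 }));
```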

INLINE VARIABLE
formerly: Inline Temp

Motivation
Variables provide names for expressions within a function, and as such they are usually
a Good Thing. But sometimes, the name doesn’t really communicate more than the
expression itself. At other times, you may find that a variable gets in the way of
refactoring the neighboring code. In these cases, it can be useful to inline the variable.

Mechanics
Check that the right-hand side of the assignment is free of side effects.
If the variable isn’t already declared immutable, do so and test.
This checks that it’s only assigned to once.
Find the first reference to the variable and replace it with the right-hand side of the
assignment.
Test.
Repeat replacing references to the variable until you’ve replaced all of them.
Remove the declaration and assignment of the variable.
Test.
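
A minimal sketch of Inline Variable in JavaScript, with a hypothetical order record. The variable in the "before" version says nothing the expression does not already say.

```javascript
// Before (hypothetical): the variable adds no information.
function isExpensiveBefore(order) {
  const basePrice = order.basePrice;
  return basePrice > 1000;
}

// After: the reference is replaced by the right-hand side of the assignment,
// and the declaration is removed.
function isExpensive(order) {
  return order.basePrice > 1000;
}

console.log(isExpensive({ basePrice: 1500 }));
```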

REPLACE TEMP WITH QUERY

Motivation
One use of temporary variables is to capture the value of some code in order to refer to
it later in a function. Using a temp allows me to refer to the value while explaining its
meaning and avoiding repeating the code that calculates it. But while using a variable is
handy, it can often be worthwhile to go a step further and use a function instead.
If I’m working on breaking up a large function, turning variables into their own
functions makes it easier to extract parts of the function, since I no longer need to pass
in variables into the extracted functions. Putting this logic into functions often also sets
up a stronger boundary between the extracted logic and the original function, which
helps me spot and avoid awkward dependencies and side effects.
Using functions instead of variables also allows me to avoid duplicating the calculation
logic in similar functions. Whenever I see variables calculated in the same way in
different places, I look to turn them into a single function.
This refactoring works best if I’m inside a class, since the class provides a shared
context for the methods I’m extracting. Outside of a class, I’m liable to have too many
parameters in a top-level function, which negates much of the benefit of using a
function. Nested functions can avoid this, but they limit my ability to share the logic
between related functions.
Only some temporary variables are suitable for Replace Temp with Query. The variable
needs to be calculated once and then only be read afterwards. In the simplest case, this
means the variable is assigned to once, but it’s also possible to have several assignments
in a more complicated lump of code—all of which has to be extracted into the query.
Furthermore, the logic used to calculate the variable must yield the same result when
the variable is used later—which rules out variables used as snapshots with names like
oldAddress.

Mechanics
Check that the variable is determined entirely before it’s used, and the code that
calculates it does not yield a different value whenever it is used.
If the variable isn’t read-only, and can be made read-only, do so.
Test.
Extract the assignment of the variable into a function.
If the variable and the function cannot share a name, use a temporary name for the
function.
Ensure the extracted function is free of side effects. If not, use Separate Query from
Modifier (306).
Test.
Use Inline Variable (123) to remove the temp.
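
A minimal sketch of the result of Replace Temp with Query in JavaScript, inside a class as the motivation suggests. The Order class, its fields, and the discount figures are hypothetical.

```javascript
// Hypothetical Order class. basePrice and discountFactor were once temps
// inside the price calculation.
class Order {
  constructor(quantity, itemPrice) {
    this._quantity = quantity;
    this._itemPrice = itemPrice;
  }

  // Former temps, now side-effect-free queries that any method can use.
  get basePrice() { return this._quantity * this._itemPrice; }
  get discountFactor() { return this.basePrice > 1000 ? 0.95 : 0.98; }

  // The temps have been inlined away; price reads the queries directly.
  get price() { return this.basePrice * this.discountFactor; }
}

const largeOrder = new Order(10, 200);
console.log(largeOrder.price);
```

Each query is recomputed on every read, which is safe here because the fields it depends on do not change between reads.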

SPLIT VARIABLE

formerly: Remove Assignments to Parameters
formerly: Split Temp

Motivation
Variables have various uses. Some of these uses naturally lead to the variable being
assigned to several times. Loop variables change for each run of a loop (such as the i in
for (let i=0; i<10; i++)). Collecting variables store a value that is built up
during the method.
Many other variables are used to hold the result of a long-winded bit of code for easy
reference later. These kinds of variables should be set only once. If they are set more
than once, it is a sign that they have more than one responsibility within the method.
Any variable with more than one responsibility should be replaced with multiple
variables, one for each responsibility. Using a variable for two different things is very
confusing for the reader.

Mechanics
Change the name of the variable at its declaration and first assignment.
If the later assignments are of the form i = i + something, that is a collecting
variable, so don’t split it. A collecting variable is often used for calculating sums,
string concatenation, writing to a stream, or adding to a collection.
If possible, declare the new variable as immutable.
Change all references of the variable up to its second assignment.
Test.
Repeat in stages, at each stage renaming the variable at the declaration and
changing references until the next assignment, until you reach the final assignment.
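
A minimal sketch of Split Variable in JavaScript. The functions and the temp variable are hypothetical; in the "before" version one variable carries two unrelated responsibilities.

```javascript
// Before (hypothetical): temp first holds a perimeter, then an area.
function describeBefore(height, width) {
  let temp = 2 * (height + width);
  const first = "perimeter " + temp;
  temp = height * width;
  return first + ", area " + temp;
}

// After: one immutable variable per responsibility.
function describe(height, width) {
  const perimeter = 2 * (height + width);
  const area = height * width;
  return "perimeter " + perimeter + ", area " + area;
}

console.log(describe(2, 3));
```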

4
Q

Simplifying Conditional Expressions (basic)
Decompose Conditional Expression
Consolidate Conditional Expression
Consolidate Duplicate Conditional Fragments
Remove Control Flag
Replace Conditional with Polymorphism

A

DECOMPOSE CONDITIONAL

Motivation
One of the most common sources of complexity in a program is complex conditional
logic. As I write code to do various things depending on various conditions, I can
quickly end up with a pretty long function. Length of a function is in itself a factor that
makes it harder to read, but conditions increase the difficulty. The problem usually lies
in the fact that the code, both in the condition checks and in the actions, tells me what
happens but can easily obscure why it happens.
As with any large block of code, I can make my intention clearer by decomposing it and
replacing each chunk of code with a function call named after the intention of that
chunk. With conditions, I particularly like doing this for the conditional part and each
of the alternatives. This way, I highlight the condition and make it clear what I’m
branching on. I also highlight the reason for the branching.
This is really just a particular case of applying Extract Function (106) to my code, but I
like to highlight this case as one where I’ve often found a remarkably good value for the
exercise.

Mechanics
Apply Extract Function (106) on the condition and each leg of the conditional.
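
A minimal sketch of this step, assuming a hypothetical pricing plan with a summer rate. The plan object, its field names, and the amounts (in cents) are assumptions made for illustration.

```javascript
// Hypothetical pricing plan; all names and values are illustrative.
const plan = {
  summerStart: new Date("2024-06-01"),
  summerEnd: new Date("2024-08-31"),
  summerRate: 8,
  regularRate: 10,
  regularServiceCharge: 50,
};

// Before: the condition and both legs say what happens, not why.
function chargeBefore(aDate, quantity) {
  if (aDate >= plan.summerStart && aDate <= plan.summerEnd) {
    return quantity * plan.summerRate;
  } else {
    return quantity * plan.regularRate + plan.regularServiceCharge;
  }
}

// After: the condition and each leg are extracted into named functions.
function charge(aDate, quantity) {
  return isSummer(aDate) ? summerCharge(quantity) : regularCharge(quantity);
}

function isSummer(aDate) {
  return aDate >= plan.summerStart && aDate <= plan.summerEnd;
}

function summerCharge(quantity) {
  return quantity * plan.summerRate;
}

function regularCharge(quantity) {
  return quantity * plan.regularRate + plan.regularServiceCharge;
}
```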

CONSOLIDATE CONDITIONAL EXPRESSION

Motivation
Sometimes, I run into a series of conditional checks where each check is different yet
the resulting action is the same. When I see this, I use logical operators (and, or) to
consolidate them into a single conditional check with a single result.
Consolidating the conditional code is important for two reasons. First, it makes it
clearer by showing that I’m really making a single check that combines other checks.
The sequence has the same effect, but it looks like I’m carrying out a sequence of
separate checks that just happen to be close together. The second reason I like to do
this is that it often sets me up for Extract Function (106). Extracting a condition is one
of the most useful things I can do to clarify my code. It replaces a statement of what I’m
doing with why I’m doing it.
The reasons in favor of consolidating conditionals also point to the reasons against
doing it. If I consider the checks to be truly independent and not something to think of as
a single check, I don’t do the refactoring.

Mechanics
Ensure that none of the conditionals have any side effects.
If any do, use Separate Query from Modifier (306) on them first.
Take two of the conditional statements and combine their conditions using a logical
operator.
Sequences combine with or, nested if statements combine with and.
Test.
Repeat combining conditionals until they are all in a single condition.
Consider using Extract Function (106) on the resulting condition.
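
The two combining rules can be sketched as follows. The employee fields, function names, and return values are hypothetical and stand in for real business logic.

```javascript
// Before: a sequence of checks that all lead to the same result.
function disabilityAmountBefore(anEmployee) {
  if (anEmployee.seniority < 2) return 0;
  if (anEmployee.monthsDisabled > 12) return 0;
  if (anEmployee.isPartTime) return 0;
  return 100; // placeholder for the real computation
}

// After: the sequence is combined with or, then extracted.
function disabilityAmount(anEmployee) {
  if (isNotEligibleForDisability(anEmployee)) return 0;
  return 100; // placeholder for the real computation
}

function isNotEligibleForDisability(anEmployee) {
  return anEmployee.seniority < 2
    || anEmployee.monthsDisabled > 12
    || anEmployee.isPartTime;
}

// Before: nested ifs guarding a single result.
function vacationRateBefore(anEmployee) {
  if (anEmployee.onVacation) {
    if (anEmployee.seniority > 10) return 1;
  }
  return 0.5;
}

// After: the nested ifs are combined with and.
function vacationRate(anEmployee) {
  if (anEmployee.onVacation && anEmployee.seniority > 10) return 1;
  return 0.5;
}
```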

SLIDE STATEMENTS
formerly: Consolidate Duplicate Conditional Fragments

Motivation
Code is easier to understand when things that are related to each other appear together.
If several lines of code access the same data structure, it’s best for them to be together
rather than intermingled with code accessing other data structures. At its simplest, I
use Slide Statements to keep such code together. A very common case of this is
declaring and using variables. Some people like to declare all their variables at the top
of a function. I prefer to declare the variable just before I first use it.
Usually, I move related code together as a preparatory step for another refactoring,
often an Extract Function (106). Putting related code into a clearly separated function
is a better separation than just moving a set of lines together, but I can’t do the Extract
Function (106) unless the code is together in the first place.

Mechanics
Identify the target position to move the fragment to. Examine statements between
source and target to see if there is interference for the candidate fragment. Abandon
action if there is any interference.
A fragment cannot slide backwards earlier than any element it references is
declared.
A fragment cannot slide forwards beyond any element that references it.
A fragment cannot slide over any statement that modifies an element it references.
A fragment that modifies an element cannot slide over any other element that
references the modified element.
Cut the fragment from the source and paste into the target position.
Test.
If the test fails, try breaking down the slide into smaller steps. Either slide over less
code or reduce the amount of code in the fragment you’re moving.
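
A minimal sketch of a safe slide, assuming a hypothetical order and pricing plan whose field names are chosen only for illustration. The declaration of discount has no side effects and nothing between its source and target positions references it, so it can move down next to its first use.

```javascript
// Before: discount is declared far from where it is first used.
function totalBefore(order, pricingPlan) {
  const baseCharge = pricingPlan.base;
  let discount;
  const chargePerUnit = pricingPlan.unit;
  const units = order.units;
  const charge = baseCharge + units * chargePerUnit;
  discount = Math.max(units - pricingPlan.discountThreshold, 0)
    * pricingPlan.discountFactor;
  return charge - discount;
}

// After: the declaration slides down to sit just before the assignment.
function total(order, pricingPlan) {
  const baseCharge = pricingPlan.base;
  const chargePerUnit = pricingPlan.unit;
  const units = order.units;
  const charge = baseCharge + units * chargePerUnit;
  let discount;
  discount = Math.max(units - pricingPlan.discountThreshold, 0)
    * pricingPlan.discountFactor;
  return charge - discount;
}
```

With the two discount lines now adjacent, they are ready to be merged into one declaration or extracted into their own function.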

REMOVE FLAG ARGUMENT
formerly: Replace Parameter with Explicit Methods

Motivation
A flag argument is a function argument that the caller uses to indicate which logic the
called function should execute. I may call a function that looks like this:
function bookConcert(aCustomer, isPremium) {
  if (isPremium) {
    // logic for premium booking
  } else {
    // logic for regular booking
  }
}
To book a premium concert, I issue the call like so:
bookConcert(aCustomer, true);
Flag arguments can also come as enums:
bookConcert(aCustomer, CustomerType.PREMIUM);
or strings (or symbols in languages that use them):
bookConcert(aCustomer, "premium");
I dislike flag arguments because they complicate the process of understanding what
function calls are available and how to call them. My first route into an API is usually
the list of available functions, and flag arguments hide the differences in the function
calls that are available. Once I select a function, I have to figure out what values are
available for the flag arguments. Boolean flags are even worse since they don’t convey
their meaning to the reader—in a function call, I can’t figure out what true means. It’s
clearer to provide an explicit function for the task I want to do.
premiumBookConcert(aCustomer);
Not all arguments like this are flag arguments. To be a flag argument, the callers must
be setting the boolean value to a literal value, not data that’s flowing through the
program. Also, the implementation function must be using the argument to influence
its control flow, not as data that it passes to further functions.
Removing flag arguments doesn’t just make the code clearer—it also helps my tooling.
Code analysis tools can now more easily see the difference between calling the premium
logic and calling regular logic.
Flag arguments can have a place if there’s more than one of them in the function, since
otherwise I would need explicit functions for every combination of their values. But
that’s also a signal of a function doing too much, and I should look for a way to create
simpler functions that I can compose for this logic.

Mechanics
Create an explicit function for each value of the parameter.
If the main function has a clear dispatch conditional, use Decompose Conditional
(260) to create the explicit functions. Otherwise, create wrapping functions.
For each caller that uses a literal value for the parameter, replace it with a call to the
explicit function.
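
Applied to the bookConcert example, these steps might look like the sketch below. The name regularBookConcert and the returned strings are assumptions; the strings stand in for the booking logic, which the original leaves as comments.

```javascript
// Explicit functions, one per value of the flag.
// The return values are placeholders for the real booking logic.
function premiumBookConcert(aCustomer) {
  return `premium booking for ${aCustomer}`;
}

function regularBookConcert(aCustomer) {
  return `regular booking for ${aCustomer}`;
}

// The original function now only dispatches. It can stay for callers
// that pass the flag as data rather than as a literal.
function bookConcert(aCustomer, isPremium) {
  return isPremium
    ? premiumBookConcert(aCustomer)
    : regularBookConcert(aCustomer);
}

// Callers that passed a literal switch to the explicit function:
// bookConcert("Ann", true)  becomes  premiumBookConcert("Ann")
```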

REPLACE CONDITIONAL WITH POLYMORPHISM

Motivation
Complex conditional logic is one of the hardest things to reason about in programming,
so I always look for ways to add structure to conditional logic. Often, I find I can
separate the logic into different circumstances (high-level cases) to divide the
conditions. Sometimes it’s enough to represent this division within the structure of a
conditional itself, but using classes and polymorphism can make the separation more
explicit.
A common case for this is where I can form a set of types, each handling the conditional
logic differently. I might notice that books, music, and food vary in how they are
handled because of their type. This is made most obvious when there are several
functions that have a switch statement on a type code. In that case, I remove the
duplication of the common switch logic by creating classes for each case and using
polymorphism to bring out the type-specific behavior.
Another situation is where I can think of the logic as a base case with variants. The base
case may be the most common or most straightforward. I can put this logic into a
superclass which allows me to reason about it without having to worry about the
variants. I then put each variant case into a subclass, which I express with code that
emphasizes its difference from the base case.
Polymorphism is one of the key features of object-oriented programming and, like any
useful feature, it’s prone to overuse. I’ve come across people who argue that all
examples of conditional logic should be replaced with polymorphism. I don’t agree with
that view. Most of my conditional logic uses basic conditional statements—if/else and
switch/case. But when I see complex conditional logic that can be improved as
discussed above, I find polymorphism a powerful tool.

Mechanics
If classes do not exist for polymorphic behavior, create them together with a factory
function to return the correct instance.
Use the factory function in calling code.
Move the conditional function to the superclass.
If the conditional logic is not a self-contained function, use Extract Function (106)
to make it so.
Pick one of the subclasses. Create a subclass method that overrides the conditional
statement method. Copy the body of that leg of the conditional statement into the
subclass method and adjust it to fit.
Repeat for each leg of the conditional.
Leave a default case for the superclass method. Or, if superclass should be abstract,
declare that method as abstract or throw an error to show it should be the
responsibility of a subclass.
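
A minimal sketch of these mechanics, assuming hypothetical item types with made-up tax rates. The class names, the createItem factory, and the rates are all illustrative.

```javascript
// Before: behavior is selected by a switch on a type code.
function taxRateBefore(data) {
  switch (data.type) {
    case "book": return 0;
    case "food": return 5;
    default: return 20;
  }
}

// After: the superclass holds the default case...
class Item {
  constructor(data) {
    this._data = data;
  }
  get taxRate() {
    return 20;
  }
}

// ...and each subclass overrides the method with its own leg.
class Book extends Item {
  get taxRate() {
    return 0;
  }
}

class Food extends Item {
  get taxRate() {
    return 5;
  }
}

// Factory function that returns the correct instance for the type code.
function createItem(data) {
  switch (data.type) {
    case "book": return new Book(data);
    case "food": return new Food(data);
    default: return new Item(data);
  }
}

// Calling code goes through the factory.
function taxRate(data) {
  return createItem(data).taxRate;
}
```

The only remaining switch is in the factory, so other functions that vary by type can be added as methods without repeating the type check.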