Five Suggestions for Cleaner Code
Being involved in open source exposes you to a lot of different projects, individuals, coding styles... the works. It also means lots of code reviews, more than I'd typically perform at work. In code reviews, I can be pretty opinionated, but I believe those opinions stem from a strong desire to keep the code base as clean and readable as possible.
With that said, these are all approaches that I would like to see every developer adopt. None of them is hard to reason about, and each should be easy to start applying from day one.
1. Avoid Flag Arguments
What exactly do I mean by a flag argument? I'll demonstrate with an example.
public void DoWork(bool log)
{
    Execute();
    if(log)
    {
        // write to some log
    }
}
In the code above, we have a method that does something. When that something is done, it checks to see if we should log the event or not. If the passed in value is true, we go ahead and log the event. Otherwise, we bail out of the method.
This approach to writing methods is ultimately very confusing for the consumer. The implementation may seem clear enough, but the method is hard to reason about when it's called in the wild.
DoWork(true); // what could true possibly mean?
I feel a much better approach is to actually split this behavior up into two individual methods. One method for each condition.
public void DoWork()
{
    Execute();
}
public void DoWorkWithLogging()
{
    DoWork();
    // log stuff
}
Here, we have a method DoWork that does just that. It only does the work and does not perform any logging. We've also introduced a method called DoWorkWithLogging that just calls DoWork, but additionally performs a logging operation. Hopefully, it's obvious from the name of this method that it will not only do the work, but also log that it did it.
I feel this approach expresses a lot more intent throughout your codebase and reduces the cyclomatic complexity. We don't have any conditionals!
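The difference is easiest to see at the call site. The following is a minimal sketch, not the original implementation: the Worker class, its Log list, and the TimesExecuted counter are hypothetical stand-ins for real work and a real logger, added only so the behavior can be observed.

```csharp
using System.Collections.Generic;

public class Worker
{
    // Hypothetical in-memory log, standing in for a real logging framework.
    public List<string> Log { get; } = new List<string>();

    // Hypothetical counter so callers can observe that the work ran.
    public int TimesExecuted { get; private set; }

    // Does the work and nothing else.
    public void DoWork()
    {
        Execute();
    }

    // Does the same work, then records that it happened.
    public void DoWorkWithLogging()
    {
        DoWork();
        Log.Add("Work completed");
    }

    private void Execute()
    {
        TimesExecuted++;
    }
}
```

A caller now writes worker.DoWork() or worker.DoWorkWithLogging(), and a reviewer can tell which behavior is intended without opening the method.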
2. Do One Thing
Now, we hear this a lot, right? Do one thing and do it well. The single responsibility principle. However, it can be hard to pin down what "one thing" actually means. A lot of it is subjective, but I generally tend to reason about it in terms of abstractions.
An abstraction isn't a responsibility -- the caller doesn't know anything about the implementation details. The caller simply calls the abstraction and nothing more. Let's see an example.
public int CalculateCostForSize(int size)
{
    if(size >= LARGE)
    {
        return size * LARGE_MODIFIER;
    }
    return size * DEFAULT_MODIFIER;
}
At first glance, this method may seem completely acceptable. It's doing one thing, right? It's calculating the cost for a given size. However, there are a lot of implementation details that this method is concerning itself with.
The CalculateCostForSize method has to know what it means to be large. In this case, size is greater than or equal to LARGE. The method also has to know how to actually do the calculation. So one could argue that this method is actually doing three things:
- Figure out if the size is large.
- Calculate the cost when the size is large.
- Calculate the cost when the size is not large.
I feel a better approach would be:
public int CalculateCostForSize(int size)
{
    if(SizeIsLarge(size))
    {
        return HandleLargeSize(size);
    }
    return HandleDefaultSize(size);
}
private bool SizeIsLarge(int size)
{
    return size >= LARGE;
}
private int HandleLargeSize(int size)
{
    return size * LARGE_MODIFIER;
}
private int HandleDefaultSize(int size)
{
    return size * DEFAULT_MODIFIER;
}
Now the method really only knows the algorithm for calculating the cost. It's not aware of any implementation details. Those are hidden behind abstractions.
3. Leverage Intention-Revealing Names
This is one that I see a lot and is an incredibly easy fix. Consider the following code snippet:
public void DoWork()
{
    if(input % 2 == 0)
    {
        // when do I do this?
    }
}
Now as programmers, we probably all know what this code means. Using the modulus operator to figure out if a given number is even or not. But what if it's not clear? That's an easy fix then!
public void DoWork()
{
    // check to see if the input is an even number
    if(input % 2 == 0)
    {
        // oh, why didn't you just say so?
    }
}
This is probably one of the things that grinds my gears the most when it comes to comments. I've ranted on comments before. But this just hurts me. Physically. Please avoid using comments to explain potentially confusing code. I humbly offer an alternative:
public void DoWork()
{
    var inputIsEven = input % 2 == 0;
    if(inputIsEven)
    {
        // easy enough!
    }
}
Put another way, imagine the example were some complicated business rule instead. We've all seen them...
if(input > 5 && input < 10 || (input >= 100))
That really needs an intention-revealing name...
var isWithinAcceptableParameters = (input > 5 && input < 10 || (input >= 100));
if(isWithinAcceptableParameters)
{
}
No comments required!
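When a rule has several parts, each part can get its own name too. This is one possible sketch; InputRules, IsInLowBand, and IsHighValue are hypothetical names, since the example above does not say what the ranges actually represent.

```csharp
public static class InputRules
{
    // Hypothetical name for the "greater than 5 and less than 10" part.
    private static bool IsInLowBand(int input)
    {
        return input > 5 && input < 10;
    }

    // Hypothetical name for the "100 or more" part.
    private static bool IsHighValue(int input)
    {
        return input >= 100;
    }

    // Same result as: input > 5 && input < 10 || (input >= 100)
    public static bool IsWithinAcceptableParameters(int input)
    {
        return IsInLowBand(input) || IsHighValue(input);
    }
}
```

The condition at the call site then reads if(InputRules.IsWithinAcceptableParameters(input)), and each half of the rule can be read and changed on its own.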
4. Use Source Control.. Not Comments
I am a firm believer that code that is not required in any way to run the application should not exist in the code base. While this does include dead code, unreachable code, etc. (which may be hard to detect at times and can sneak in), I mostly mean code that has been commented out.
public void DoWork()
{
    if(input > 0)
    {
    }
    /*
     * why didnt this work?
     * if(input > -1)
     * {
     * }
     */
}
It's an extreme example, but not uncommon in my travels. A more common example might look a little something like
var config = new Config();
config.Title = "title";
//config.MaxSize = 100;
config.MinimumSize = 1;
Source control is an invaluable tool. Most say that regardless of what you're coding, you should always use source control, even for the smallest of projects. There is no reason to muddle up your codebase with these types of comments when source control can retrieve the old code for you at a moment's notice.
5. Avoid Long Parameter Lists
In the world of refactoring, there is a code smell called a data clump. I see this pretty frequently in code reviews: the developer introduces a slew of new methods that all take the same set of data, but pass each piece as an individual parameter.
Data clumps reveal themselves in a couple of ways. Through class members:
private int max;
private int min;
private DateTime date;

private string result;
and through methods:
public string GetResult(int max, int min, int age, DateTime date)
{
}
We even seem to instinctively clump these related pieces of data together without noticing. Doesn't putting that newline between date and result just feel better?
The big problem is when we start copy-pasting these parameters into multiple locations. In order for the dataset to make sense, it requires a max, min, and date, so why not encapsulate all three pieces of data into an object?
public class ResultContext
{
    public ResultContext(int max, int min, DateTime date)
    {
        Max = max;
        Min = min;
        Date = date;
    }
    public int Max { get; }
    public int Min { get; }
    public DateTime Date { get; }
}
Now we can simply pass around the object, rather than a long list of parameters.
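As a rough sketch of what that could look like, GetResult might accept the context instead of separate values. The ResultFormatter class and the body of GetResult are invented for illustration (the original leaves the method body empty), and ResultContext is repeated here only so the snippet stands alone.

```csharp
using System;
using System.Globalization;

public class ResultContext
{
    public ResultContext(int max, int min, DateTime date)
    {
        Max = max;
        Min = min;
        Date = date;
    }

    public int Max { get; }
    public int Min { get; }
    public DateTime Date { get; }
}

// Hypothetical consumer of the context object.
public class ResultFormatter
{
    // One parameter instead of three; the formatting below is illustrative only.
    public string GetResult(ResultContext context)
    {
        var date = context.Date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return context.Min + "-" + context.Max + " on " + date;
    }
}
```

If another related value is needed later, it can be added to ResultContext in one place rather than to every method signature that passes the clump along.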
In the end, the true takeaway in all of this is... please step away from // and /* */