mentis vulgaris
simple thoughts | jason smith
Promises, Promises
Posted by Jason Smith - 03/09/10 at 06:09:10 pm
Promises are like babies: easy to make, hard to deliver. ~Anonymous
Every .NET interface, every COM interface, and every API is a promise. A contract of sorts. We all know that — it’s the point of an API. But, have we really thought about what that implies (especially in the context of object oriented code)?
Consider this interface (assume we’re using namespace System.Collections.Generic):
public interface IAlarm
{
    List<AlarmInfo> GetList(
        HostMachineInfo hostMachine);
    List<AlarmInfo> GetListByEnvironment(
        HostMachineInfo hostMachine,
        EnvironmentInfo environment);
}
What are some of the promises being made here?
The obvious ones:
- You can get the collection of AlarmInfo objects associated with a hostMachine.
- You can get the collection of AlarmInfo objects associated with a hostMachine in an environment.
Maybe a bit less obvious, but still promised by the interface:
- You can modify the list returned by those methods (e.g., you can Add, AddRange, Clear, and Remove to and from the list; you can also Reverse and Sort it in place).
- You can access individual elements by index.
More? How about something a bit more problematic, but required by the implementation of List<T>:
- The list is always in your process space.
- The list is always completely loaded prior to it being returned to you.
In other words, this API promises that every implementation of this interface will satisfy all of those requirements (as well as the promises made by any base interfaces). It says that no matter how the outside world changes, no matter how big the data sets get, or how constrained the memory, no matter how distributed this might be, all of those promises will be satisfied.
Absolutely no future implementation of GetList(HostMachineInfo) can lazy-load AlarmInfo’s — no matter how big that result set gets. Furthermore, all future implementations must gracefully handle a user changing the contents of the list — even when an implementation might return a cached list, shared across other consumers who might be surprised by such a change.
With the current design, the only way we can add any of that new behavior in a later release is to find and modify all consuming code to use a new abstraction, instead of simply supplying a single new implementation of the interface.
In short, the design mistakenly assumes that software doesn’t change and, by actively resisting change, becomes brittle. All of which leads to big, painful, late and over-budget rewrites (or, read another way, we burn cash that could be spent on raises, bonuses and new gee-whiz tools).
So, how do we fix it? In existing/shipped code, it’s probably too late. More likely than not there is other code depending on the promises already made by the offending code. But in new designs, think about the minimum we can promise and still satisfy client code’s needs. If, for example, all consumers really need to do is iterate over the list of alarms, then consider an interface that returns IEnumerable<AlarmInfo> instead — iteration is the only thing IEnumerable<T> promises.
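As a minimal sketch of that narrower promise (not a drop-in replacement for the interface above), the following assumes consumers only ever iterate. The names IAlarmSource, GetAlarms and LazyAlarmSource are hypothetical, and the two Info classes are stripped-down stand-ins for the real types:

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for the post's types, reduced to what the sketch needs (assumption).
public class HostMachineInfo { public string Name { get; set; } }
public class AlarmInfo { public int Id { get; set; } }

// Hypothetical interface: promises iteration and nothing else.
public interface IAlarmSource
{
    IEnumerable<AlarmInfo> GetAlarms(HostMachineInfo hostMachine);
}

// Hypothetical implementation that produces alarms only as they are requested.
public class LazyAlarmSource : IAlarmSource
{
    // Counts how many alarms have actually been materialized.
    public int Produced { get; private set; }

    public IEnumerable<AlarmInfo> GetAlarms(HostMachineInfo hostMachine)
    {
        for (int id = 1; id <= 1000000; id++)
        {
            Produced++;
            yield return new AlarmInfo { Id = id };
        }
    }
}
```

A caller that stops after three alarms causes only three to be created. An implementation returning List<AlarmInfo> could never make that trade, while this interface still permits an implementation that hands back a fully loaded list.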
Careful what you promise: you’ll have to live with it a long time.
Software Craftsmen, Laws and Principles, Oh My
Posted by Jason Smith - 25/03/10 at 12:03:48 pm
Many software developers have written about various Principles and Laws of software development. We talk about things like SOLID, OCP, SRP, LSP, DRY, LoD, and whether these things help or hinder. To outsiders it seems we spend a lot of time arguing the software equivalent of how many angels can dance on the head of a pin.
On the surface Derick Bailey’s blog post and subsequent discussion is a case in point.
There is a reason we believe these things are important. The largest, to me, is that software changes. And anything we can do to make successful change easier is, to a working software developer, the right thing to do.
Derick is a really sharp guy, and I’ve been reading his blog and those of the other Los Techies bloggers for some time. In this particular post, he suggests a particular software technique is exempt from one of our more misunderstood laws: The Law of Demeter (LoD). If I correctly grasp what he’s saying, then I disagree (I’ll be the first to admit I might be misreading him, though).
The LoD is one of those “Laws” or Principles that keeps our code adaptable. It does so by reducing the set of types a body of code is dependent on.
Analogy Time
Think of software as a piece of soft cloth. Like cloth, software with minimal dependencies is extremely flexible — it can be taken from operating system to operating system. It can be built on different architectures. Similarly, cloth is flexible: It can be laid flat to cover a large area. It can be folded to cover a small area. It can be draped over something. Adding a dependency to software is like driving a nail through the cloth into the surface below. The cloth becomes anchored. The cloth can still be laid flat. It can be folded. But it can’t be moved to another area unless we move what it’s been anchored to. Add another nail. Now the cloth can only be laid flat, and can only be folded small enough to include the space between the nails. As we add nails, the ways we can use the cloth decrease. As we add dependencies, the way we can use our software also decreases.
To keep software supple, the LoD seeks to strictly control dependencies, that is, to control the places software is “anchored” to another type. To say a technique is exempt from the law because of the syntactic sugar it adds misses the point (and here’s where I may have missed Derick’s point). His example provides the extension methods that give us the following:
var asset = assets.FirstOrDefault().As<RecordUniqueAsset>();
This is intended to be (roughly) equivalent to the more wordy:
RecordUniqueAsset asset = null; // 'var' cannot infer a type from null
// I assume some intervening code happens here
if (assets != null && assets.Count > 0)
{
    asset = assets[0] as RecordUniqueAsset;
}
As far as concision goes, the former code is a win. Nothing in the change touches on the LoD though. The determining factor about whether the extension method violates the LoD is almost entirely dependent on whether the original code violates it.
One might argue that the static type that “contains” the extension methods is an additional type (which would violate the LoD). I’m not one of those folks. The reason it isn’t an additional type is that (correctly written) extension methods are part of the interface or type they extend.
If you’re willing to read a bit of C++, a great examination of why this is the case can be found in an article written by Scott Meyers (of Effective C++ fame).
That said, any fool can tell you the right number of angels is 42. It’s always 42.
Using Velocity Charts and Burn-Downs to Drive Development
Posted by Jason Smith - 28/02/10 at 01:02:20 pm
Imagine you’re the driver in a car cruising along the Seven Mile Bridge in the Florida Keys. Sunny day. Clear. There are no other cars, and you’ve been given permission from the Florida Troopers to go as fast as you like – so you’re cooking along at 80-90 miles an hour.
Now imagine you’re doing that with someone in the back seat holding their hands over your eyes and letting you take a 3 second peek every 30 seconds or so.
Sound like fun?
Not if you’re anything like me. I’d want to know if I were getting anywhere off track as soon as it happened. Hundreds of pounds of steel, plastic and simulated walnut trim develop a fair bit of momentum. The only way I’ll keep that car under control at that speed is slight steering adjustments as soon as I notice I’m drifting.
So why do we run projects like that? How often have we sat through projects with a weekly status report where folks give the old “we’re on track” for the first 10 weeks of a 12 week project, followed by a “we’re 90% done” for the next 3 weeks? Or we’ve taken on an 18 month project, and only found in month 16 that it’ll take at least 24 months. How well can a project running under a full head of steam radically change directions without crashing and burning?
Velocity Charts (also called Burn-Downs or Burn-Ups) are the simplest way to get immediate visibility into whether a project is drifting off course (and are incredibly effective at identifying fairly early in the process when a project will really be completed).
The way they work is extremely simple and incredibly accurate.
Estimate Sizes
Given a set of tasks, make an estimate of their sizes (often called “story points” or “task points”). There are a number of rules to get good estimates, but the simplest approach uses these rules:
- The estimate is unitless. It’s not hours. Not days, weeks or months. It is just a number. Each task is estimated in size relative to the others.
- The values we use are 0, 1, 2, 3, 5, 8, 13 and infinite/too big (a Fibonacci-style sequence: from 3 on, each value is the sum of the preceding two). Or you can use a doubling sequence: 0, 1, 2, 4, 8, 16 and infinite/too big. These values are used for the following two reasons:
  - Humans are good at saying a task is “bigger than that other one, but not as big as this,” but we have problems identifying precisely how much bigger. Both series account for that.
  - Humans seem not to be very good at estimating sizes that differ by more than an order of magnitude, so we cap it at 13 (or 16).
- If this is the first time you’ve done this, simply identify the smallest task in your set and assign it a value of 1. If this isn’t the first time you’ve done this, compare this to your last project’s items of size 1. Is it the same size as the first? Give it a 1. Bigger? Twice as big? Three times? Give it the appropriate number. Repeat this for each task you’ve identified, comparing their relative sizes to the previously sized tasks.
- Any task bigger than a 13 (or 16 if using the doubling sequence) must be broken down into smaller pieces. A task that big is probably not understood well enough to estimate.
- Record these tasks into a list.
Notice that throughout the exercise above, there is no estimate of time. History demonstrates we software people profoundly suck rocks when estimating how long it takes us to do something (I’m convinced our egos get in the way). This is all about relative size.
Track Progress
This is where time enters the picture – not as a guess or estimate, but as a measure of what’s really happening.
- Write your estimates down on a grid (a spreadsheet works beautifully here). First column is each task, second column is the associated size. The rest of the columns will be updated daily.
- Regularly (I prefer daily, but you can get away with weekly if you like driving with kids’ sticky hands over your eyes), have a quick standup meeting with the team (shouldn’t last longer than 15 minutes). Ask them what items from the list of tasks are complete. If (and only if) the task is complete, enter a zero in the first empty column for that task, otherwise carry over the size from the previous day.
- Sum the columns and generate a line or bar chart from the sums.
- Calculate a regression line from the chart (Excel calls it a Trendline). The negation of the slope of that line is the Velocity for that team for that period. So, if the slope of the line is -1.67 and we are doing this daily, it means the velocity is 1.67 task points per day.
- Divide the sum of the remaining sizes by your Velocity, and you have a good idea of the number of periods (days if you’re doing this daily) before the project is complete.
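If you would rather compute the numbers than read them off a spreadsheet, the arithmetic is small. This is a sketch, assuming one sample per period of the total remaining points. It fits an ordinary least-squares line, negates the slope to get velocity, and divides the latest remaining total by that velocity. The BurnDown class and its method names are made up for illustration:

```csharp
using System;
using System.Linq;

// Hypothetical helper; remaining[i] is the sum of open task sizes at period i.
public static class BurnDown
{
    // Least-squares slope of remaining points against period index 0, 1, 2, ...
    public static double Slope(double[] remaining)
    {
        int n = remaining.Length;
        if (n < 2) throw new ArgumentException("Need at least two samples.", "remaining");

        double meanX = (n - 1) / 2.0;
        double meanY = remaining.Average();
        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++)
        {
            sxy += (i - meanX) * (remaining[i] - meanY);
            sxx += (i - meanX) * (i - meanX);
        }
        return sxy / sxx;
    }

    // Velocity is the negated slope: points completed per period.
    public static double Velocity(double[] remaining)
    {
        return -Slope(remaining);
    }

    // Periods still needed: latest remaining total divided by velocity.
    public static double PeriodsLeft(double[] remaining)
    {
        double velocity = Velocity(remaining);
        if (velocity <= 0) throw new InvalidOperationException("No measurable progress yet.");
        return remaining[remaining.Length - 1] / velocity;
    }
}
```

With made-up daily totals of 40, 38, 36 and 34, the slope is -2, the velocity is 2 points per day, and 34 remaining points works out to 17 more days.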
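Here’s an example chart:
[Figure: example velocity chart (image not available)]
Generated from this data:
[Table: data behind the chart (image not available)]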
Caveats
The velocity’s predictive power improves with more samples, so your first 3 or 4 days won’t give you a very accurate prediction of when the project will be done. Each sample improves things tremendously though.
Different teams have different velocities, based on the size of the task they used as their size 1, as well as the size of the team. As a result of this, DO NOT USE VELOCITY FOR COMPARING THE EFFECTIVENESS OF TEAMS – IF YOU DO, YOU WILL BE COMPARING APPLES TO VOLKSWAGENS AND THE TERRORISTS WILL WIN.
Additional Information
An excellent way to generate even more accurate size estimates can be found here:
http://www.planningpoker.com/detail.html
You can get additional information on Velocity Charts at these locations:
http://www.versionone.com/Resources/Velocity.asp
http://www.controlchaos.com/about/burndown.php
The following is a great way to do burn-downs in the face of changing requirements (that never happens, right?):
http://www.mountaingoatsoftware.com/scrum/alt-releaseburndown
The Bottleneck
Posted by Jason Smith - 20/02/10 at 10:02:47 am
Nearly 70% of all software projects are considered unsuccessful. They are late, over budget and have fewer features than users wanted (and often ones they didn’t want – I’m looking at you, Clippy).
I gotta ask, would this situation exist if we had perfect knowledge of the future?
If we knew the user interface would confuse our users – would we spend as much time as we did implementing it the wrong way? If we knew our design would lead to performance issues, wouldn’t we select an alternate before we were too committed? If we knew the code we just wrote had a bug in it wouldn’t we fix the code before it even consumed a second of QA’s time? If we knew that QA would find a bug in a third party product caused by the way we called the API, would we have altered our design to compensate?
Most of our problems developing software would be solved by knowing, by learning the things we don’t know earlier and faster.
Now consider the Theory of Constraints:
- Any manageable system is limited in achieving more of its goal by a very small number of constraints.
- There is always at least one constraint.
In a nutshell, we will see the greatest improvements in our ability to reach our goals by understanding and defeating our greatest constraints.
If that’s so, then software projects are unsuccessful because we haven’t optimized the activity we spend most of our time doing. That activity is not coding, requirements gathering, bug fixing or dropping in an easter egg. The biggest constraint we have is Learning:
- Learning what the customer wants
- Learning whether what we wrote is what the customer really wanted
- Learning how to use an API
- Learning code the guy down the hall wrote
- Learning when we’re actually going to deliver
- Learning whether our code actually works
- Learning why it didn’t in a special case
- Ad nauseam
Good product developers intuitively understand this. They actively seek out ways to learn faster and earlier.
They learn early that the UI they are doing meets the user’s needs and doesn’t confuse (prototyping and seeking active customer feedback). They understand that it’s orders of magnitude cheaper (in both dollars and effort) to learn about and fix a bug when the bug is introduced (automated unit testing and continuous integration). They understand that the fastest way to help our coworkers learn what we are doing is by collaborating with them (pair programming, daily stand-ups and being colocated). They learn about their risks early by working the risky code early (spike solutions). They know that code that is easy to understand is easier to learn and use (simple design, simple code that expresses intent).
Optimize the process of learning, and we optimize how we develop software.
Exceptions are Easy. Except When They’re Not (Part 2)
Posted by Jason Smith - 13/02/10 at 02:02:28 pm
Nothing is said which has not been said before. ~ Terence (195/185 BC-159 BC) Playwright of the Roman Republic
In a previous post, I posed two questions to my readers (all √(-1) of them): whether a body of code was exception safe, and whether there was anything I needed to change to ensure the code was consistent should an exception be thrown.
The pertinent code is reproduced here:
public class Foo
{
    // ...
    private void Load()
    {
        ServiceProxy<IFooConfig> proxy = null;
        try
        {
            proxy = ServiceHelper.CreateProxyFromServiceContractInf<IFooConfig>(
                InfoSourceUri.ToString());
            IFooConfig config = proxy.ProxyContract;
            this.NumberOfFrobs = config.NumberOfFrobs;
            // FrobSource is wrapped in a Uri to match the property type
            // and the revised Load() later in this post.
            this.WhereToGetThem = new Uri(config.FrobSource);
            foreach (var e in config.MiscellaneousEntries)
            {
                this.Miscellaneous.Add(e);
            }
        }
        catch (Exception ex)
        {
            // Report the error and then ...
            throw;
        }
        finally
        {
            proxy.CleanupProxy(); // Extension method that aborts if proxy is faulted or ignores if null
        }
    }
}
To answer the first question we need to understand what exception safety is. It’s generally broken down into 5 categories:
- No guarantees: when an exception is thrown, the program crashes, often leaving persistent things in an indeterminate state.
- Minimal (or “No Leak”) guarantee. The program doesn’t crash, and resources don’t leak, but there are no guarantees about side effects.
- The basic (or “Weak”) guarantee: the invariants of the component are preserved, and no resources are leaked (called the “weak guarantee”, as the system is left in a safe but altered and unknown state). Recovery is only possible by introspection of the state of the objects involved.
- The strong guarantee: that the operation has either completed successfully or when an exception is thrown, the state is exactly as it was before the operation started.
- The no-throw guarantee: that the operation will not throw an exception.
Given those definitions, Foo.Load() provides at least the “minimal/no leak” guarantee for Foo. Since there aren’t any stated invariants about the class’ state, you could also argue it is in a “valid” state, giving the Basic/Weak guarantee.
The second question I asked points to the problem, though. As written, when an exception is thrown in Foo.Load(), the Foo instance will be in an unknown state. Which means it is unusable (or you have to break encapsulation and engage in some introspection to understand its state).
Let’s say Foo is used to configure something else in your system. It got the “NumberOfFrobs” value for that service, but if the call on the line that set “WhereToGetThem” threw an exception, the old value of WhereToGetThem (possibly) remains in that instance. Even worse, we’re not really sure where the situation broke down (what if the exception was thrown halfway through updating the Miscellaneous dictionary? Which portion is valid?).
This problem becomes very interesting when the behavior of an object depends on the state of several members of the object and only half are current.
There’s a very straightforward solution to this mess: ensure code supports the “strong exception guarantee.” This means calls either succeed or they don’t change the system. Period. When we catch an exception from a method call, we know our program is in the state it was in immediately prior to the call.
If this sounds familiar, it should — it’s just like working with transactional databases (or any other transactional system).
Do the error prone stuff first and then commit your changes in a block of code that can’t throw an exception:
private void Load()
{
    int nFrobs = 0;
    Uri where = null;
    Dictionary<string, object> misc = null;
    ServiceProxy<IFooConfig> proxy = null;
    try
    {
        proxy = ServiceHelper.CreateProxyFromServiceContractInf<IFooConfig>(
            InfoSourceUri.ToString());
        IFooConfig config = proxy.ProxyContract;
        // exception-prone code:
        nFrobs = config.NumberOfFrobs;
        where = new Uri(config.FrobSource);
        misc = new Dictionary<string, object>(config.MiscellaneousEntries);
    }
    catch (Exception ex)
    {
        // Report the error and then ...
        throw;
    }
    finally
    {
        proxy.CleanupProxy(); // Extension method that aborts if proxy is
                              // faulted or ignores if null
    }
    // NO THROW GUARANTEE required:
    // from this point on, the only valid code is code with
    // no-throw guarantee. If default property setters can
    // throw, have a version that doesn’t (or set the field
    // directly).
    this.NumberOfFrobs = nFrobs;
    this.WhereToGetThem = where;
    this.Miscellaneous = misc;
}
Note the comment near the end of the code: to have the strong guarantee, you must have some code that satisfies the no-throw guarantee (i.e., no exceptions are allowed to escape “no-throw” code, which is also a good requirement for the IDisposable.Dispose() method, by the way).
That’s it. If you have a reference to a Foo, outside of Foo.Load(), it’s consistent.
If you’d like to dig further into this subject, David Abrahams originally wrote about this subject several years ago. While his original paper discussed it in the context of C++, the issues apply to .NET based languages as well.
Exceptions are Easy. Except when they’re not.
Posted by Jason Smith - 10/02/10 at 09:02:10 am
Assuming all the code compiles correctly, is Foo.Load() exception safe? Is there anything I need to change in Load’s exception handler to ensure an instance of Foo is always in a valid state?
/// <summary>
/// Defines a type that associates a URI with the
/// count of frobs, miscellaneous information about those frobs
/// and where to get them.
/// </summary>
public class Foo
{
    public Foo()
    {
        Miscellaneous = new Dictionary<string, object>();
    }

    private Uri _infoSource;
    public Uri InfoSourceUri
    {
        get { return _infoSource; }
        set
        {
            _infoSource = value;
            Load();
        }
    }

    private void Load()
    {
        ServiceProxy<IFooConfig> proxy = null;
        try
        {
            proxy = ServiceHelper.CreateProxyFromServiceContractInf<IFooConfig>(InfoSourceUri.ToString());
            IFooConfig config = proxy.ProxyContract;
            NumberOfFrobs = config.NumberOfFrobs;
            WhereToGetThem = new Uri(config.FrobSource);
            foreach (var e in config.MiscellaneousEntries)
            {
                Miscellaneous.Add(e);
            }
        }
        catch (Exception ex)
        {
            // Report the error
        }
        finally
        {
            proxy.CleanupProxy(); // Extension method that aborts if proxy is
                                  // faulted or ignores if null
        }
    }

    public int NumberOfFrobs { get; set; }
    public Uri WhereToGetThem { get; set; }
    public IDictionary<string, object> Miscellaneous { get; private set; }
}
Encapsulating Your Way To Better Unit Tests
Posted by Jason Smith - 07/02/10 at 11:02:01 am
In an earlier post, I discussed the qualities of good unit tests. Primarily, good unit tests are fast, isolating, repeatable, self-validating and timely.
At first blush, that seems easier said than done. We write distributed software and connect to databases. Some of those databases live in another country.
So, how can a test for an object that sets up connections to a database, primes it with test data, consumes that data, manipulates it and then presents it in a UI actually be fast, isolating, repeatable and self-validating? If by fast we mean microseconds per test (that’s exactly what we mean), we’re toast as soon as we even talk to the database. How do we isolate our tests when we write to a persistent store, while another test is running in parallel against the same store? What happens when the database fails for some reason, and the data has been partially changed? Is the error in our code, or the database? And self-validating? Come on! Our UIs need to be poked, prodded and looked at to be tested.
Unit-testing Nirvana sounds like a ton of work.
Actually, it’s the work we already do.
Most of us use object-oriented languages because we buy into the idea of encapsulation or, said another way, we want to reduce coupling to particular implementations. Even if polymorphism and inheritance don’t float your boat, old-fashioned structured design pushes this same idea. We simply hide (encapsulate) implementation behind function calls, so we can focus on the work at hand.
For example, assume we have a window where individual elements within it may be visible or invisible based on user preferences. What is it about the algorithm to determine the initial visibility for elements of a window that requires a database or a .config file? Nothing, right? Either approach will work. So will other changes in the underlying program. All the algorithm really cares about is that it gets configuration data from somewhere. So we should encapsulate that concept.
We know our window needs to be initialized with some configuration data. The window doesn’t really care where it came from. A candidate method might look something like this:
class MyWindow : System.Windows.Forms.Form
{
    // a list of our window controls that we can enable or disable visibility on.
    private List<Control> _toggleVisibility;
    // ... blah blah
    public void SetInitialWindowVisibility(IDictionary<string, bool> visibilities)
    {
        foreach (var control in _toggleVisibility)
        {
            bool visible; // receives the configured value, if there is one
            if (visibilities.TryGetValue(control.Name, out visible))
            {
                control.Visible = visible;
            }
        }
    }
}
We honor the Single Responsibility Principle: No database involved. No System.Configuration.ConfigurationManager. If high cohesion and low coupling are the benchmarks of good design, then this is pretty decent.
Here’s the payoff — this approach gave us a good unit-testable design for free:
- It’s fast: for our test, we simply pass a preloaded dictionary and verify the window controls’ visibilities match the dictionary state.
- It isolates the error: if the unit test fails here, it’s because the visibility-setting behavior was broken. Period. Not a database, not a parser, just the method under test.
- It’s repeatable: every time we run it, we get the same results. No one on the outside can poison the test during its run.
- It’s self-validating: our unit test need only iterate over the embedded controls collection comparing the state of the control.Visible value. No popping up the user interface for hand inspection is necessary.
Sure, that configuration dictionary needs to be initialized somewhere, and that code should be tested as well. But if it connects to a database to get the information, keep the test for that code outside the body of unit tests (recall, unit tests should be run on the developers’ machines, often. If they aren’t fast, they won’t be run at all). Definitely include database tests in an integration or smoke test, though: those are run with much lower frequency, so they can afford to take a bit more time to run.
For the adventurous, you can raise the bar even more: what about this algorithm requires that it be implemented in a System.Windows.Forms.Form derivative? Can the implementation be refactored to remove this dependency?
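One possible answer, offered as a sketch and not the only way to do it: pull the loop out into a static method that knows nothing about Windows Forms. IToggleable, VisibilityInitializer and FakeControl are hypothetical names; a real window would need a thin adapter from Control to IToggleable.

```csharp
using System.Collections.Generic;

// Hypothetical abstraction: anything with a name and a visible flag.
public interface IToggleable
{
    string Name { get; }
    bool Visible { get; set; }
}

public static class VisibilityInitializer
{
    // Same algorithm as SetInitialWindowVisibility, with no Form involved.
    public static void Apply(IEnumerable<IToggleable> targets,
                             IDictionary<string, bool> visibilities)
    {
        foreach (var target in targets)
        {
            bool visible;
            if (visibilities.TryGetValue(target.Name, out visible))
            {
                target.Visible = visible;
            }
        }
    }
}

// Test double standing in for a real control.
public class FakeControl : IToggleable
{
    public string Name { get; set; }
    public bool Visible { get; set; }
}
```

A test can now build a couple of FakeControl instances and a dictionary, call Apply, and check the Visible flags without ever creating a window. Controls missing from the dictionary keep whatever visibility they already had, just as in the original method.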
Qualities of Good Unit Tests
Posted by Jason Smith - 02/02/10 at 11:02:08 am
So, you’ve been tasked with ensuring your code has unit tests. What does that mean, really?
According to Wikipedia,
unit testing is a software design and development method where the programmer gains confidence that individual units of source code are fit for use. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class. — http://en.wikipedia.org/wiki/Unit_testing
If you understand that the primary beneficiaries of unit tests are the developers, then it makes sense that we should make our unit tests as helpful to developers as we possibly can.
Brett Schuchert and Tim Ottinger, old hands at unit testing, provide some pointers. They describe good unit tests as being “FIRST”:
- Fast
If a developer hesitates to run unit tests because they take a long time to run, the tests won’t be run at all. Unit tests need to be run often (as often as you build – the compiler can only check for correct syntax, the tests ensure you got the semantics right), so they need to be cheap and easy to run. One test should take a small fraction of a second. A test suite that takes longer than a few seconds to run is painfully slow.
- Isolated
This actually means two things:
- Unit tests should be small enough that the bug is obvious as soon as the test is viewed. A unit test tests one feature of one class.
- Unit tests should never depend on the order the tests are run – if a test breaks because some other test did or did not successfully run before, then the test can no longer be run in isolation
- Repeatable
Tests must be able to be run over and over without any intervention. No hand-loading of a database or editing configuration files. They run whether there is a network or not. Unit tests do not test external systems.
- Self Validating
It must be obvious whether a test passes or fails. We never have to interpret results in a database or a file somewhere.
- Timely
Tests are written at the right time. Many argue the right time is immediately before the code being tested is written (I’m usually one of them). Regardless, unit tests should always be written before the tested code is checked into source control.
Michael Feathers (the author of one of the original CppUnit unit testing frameworks) also describes some features that should never be in a unit test. He states a test is not a unit test if:
- It talks to the database (not fast, isolated nor repeatable)
- It communicates across the network (not fast, isolated nor repeatable)
- It touches the file system (not fast, isolated nor repeatable)
- It can’t run at the same time as any of your other unit tests (not isolated or repeatable)
- You have to do special things to your environment (such as editing config files) to run it. (not fast, isolated nor repeatable)
He goes on to say:
Tests that do these things aren’t bad. Often they are worth writing, and they can be written in a unit test harness. However, it is important to be able to separate them from true unit tests so that we can keep a set of tests that we can run fast whenever we make our changes.
So — are your unit tests FIRST or are they painful?
Why Unit Test?
Posted by Jason Smith - 20/01/10 at 10:01:27 am
Nobody changes until the pain of staying the same becomes greater than the pain of change ~ Anonymous
If the code I come across in my work is any example, most developers haven’t drunk the Unit Testing Kool-Aid. Given the old saw about change and pain, I can almost understand. Almost.
So, why should I unit test?
Let’s answer that question by understanding who benefits from them. Just like any other software we write, when we understand who we write our software for, we’re more likely to write software that will serve them best.
Unit Tests are for programmers. If you write code, they are for you.
Big Deal
Sure, unit tests help demonstrate a body of code does what it should. So does white box testing, functional testing, and integration testing. Really, wouldn’t our time be better served stepping through the code for some test cases and a sample app or two, and letting those fiends in QA figure out ways to break our code?
In reality, verifying your code works is just one benefit of unit tests. There are other, further-reaching benefits once you realize where the bulk of software development cost really lies:
- They document our code in a way that comments could never hope to match.
- They protect dependent code from future breaking changes.
- They force us to decrease coupling (improving our designs, even more so if you take unit testing all the way to TDD).
- They help us uncover (and prevent) bugs earlier and faster (which beats the pants off of using the debugger to find a break).
Note that those benefits are not only about ensuring the tested code works right now. They are also about ensuring our code won’t break tomorrow. They are about ensuring the newly hired developer understands what the code does so she doesn’t accidentally break it with a “fix”. And when the new developer still doesn’t understand, the tests will complain before your customers can.
Profoundly Better Code Documentation
Most .NET programmers have seen structured XML used to document code. Likely, we’ve filled out several <summary>, <returns> and <param> tags. Maybe several others. The intent behind them (and coding standards mandating them) is to ensure our code is documented enough so the poor slobs who come after us know how to use it.
The problem here is a simple one — comments are pathological liars. They get out of sync with the actual behavior of our code. They often don’t reveal enough about the contract of the code. And when they do, they certainly can’t enforce what is documented (does that <param> accept null values? What happens if we pass a null there? What is the valid range for that integer <param>? What happens to that Stream when there is no data to write? )
Because comments aren’t executable, they have no way to enforce correct behavior. Unit tests, on the other hand, demonstrate exactly how to use a bit of code. Because they are executable, they cannot silently drift out of sync with the code: when the behavior changes, a test fails. Proper unit tests verify that an exception is thrown when an invalid argument is supplied, that an integer is in range, and what happens to a stream passed to a function.
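Here is a minimal sketch of what “tests as documentation” can look like. The Volume class and its contract are hypothetical, invented only for illustration, and the tests are plain methods rather than tests for any particular framework (in practice you would likely use NUnit, xUnit, or MSTest attributes).

```csharp
using System;

// Hypothetical class under test; not from any real library.
public static class Volume
{
    // Contract: label must not be null; level must be between 0 and 10 inclusive.
    public static string Describe(string label, int level)
    {
        if (label == null) throw new ArgumentNullException(nameof(label));
        if (level < 0 || level > 10) throw new ArgumentOutOfRangeException(nameof(level));
        return label + ": " + level;
    }
}

// Framework-free tests: each method returns true when the documented behavior holds.
public static class VolumeTests
{
    // Answers "does that parameter accept null?" in executable form.
    public static bool NullLabelIsRejected()
    {
        try { Volume.Describe(null, 5); return false; }
        catch (ArgumentNullException) { return true; }
    }

    // Answers "what is the valid range for that integer?"
    public static bool LevelAboveRangeIsRejected()
    {
        try { Volume.Describe("tv", 11); return false; }
        catch (ArgumentOutOfRangeException) { return true; }
    }

    // The boundaries themselves are part of the contract.
    public static bool BoundaryLevelsAreAccepted()
    {
        return Volume.Describe("tv", 0) == "tv: 0"
            && Volume.Describe("tv", 10) == "tv: 10";
    }
}
```

If someone later decides that a null label should mean “untitled”, the first test fails and forces that decision into the open.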
Protection Against Breaking Changes
Let’s say you’ve been assigned a task that involves adding a feature. You find a class that does 90% of what you need it to do, so you make your changes. Your change doesn’t change the signature of a method, but it does modify the semantics of a parameter just a teeny, tiny bit. You build the code and sure enough, the compiler is happy. Of course the compiler is happy – you didn’t modify the calling syntax – merely the semantics.
So you run the tests and they break. You look at the breaking tests and find they were checking for that very semantic change. Turns out there are several assemblies that depend on the original behavior. If you had tested only your change, you’d never know you broke other code until the bug reports started pouring in. The unit tests caught that before your users could.
A unit test from a previous release demonstrates behavior that code out there somewhere depends on. It’s a warning you should heed.
Improves your Designs with Decreased Coupling
Wikipedia says:
A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class. — http://en.wikipedia.org/wiki/Unit_testing
Comp.software-eng.testing FAQ says:
A unit typically … does not include any called sub-components (for procedural languages) or communicating components in general.
Unit Testing: in unit testing called components (or communicating components) are replaced with stubs, simulators, or trusted components. Calling components are replaced with drivers or trusted super-components. The unit is tested in isolation.
Isolation is a big deal in unit testing. In almost every case, the only thing that should be tested in a unit test is one method of one class. All other participants in that method are stubs, simulators, and test-drivers. The only way you can get to that is by reducing your coupling. Unit tests are much easier to write if your classes depend on abstractions instead of concretions. You’ll also discover you simply cannot unit test a class that depends on the behavior of globals, singletons and most static members – no way, no how.
Test Driven Design (TDD), which is not the same thing as unit testing but one way of applying it, helps improve designs even further by forcing you to think about your code the way a consumer of your code would. As a result, our APIs are no longer defined by how something is done (leaky abstractions); instead, they are defined by what our users want done, which then constrains our implementation. (TDD’s effect on unit testing is such a big subject that it warrants its own post.)
Better Tests
Finally, unit tests ensure the quality of our code in a way no other tests can. Non-unit tests can only exercise a body of code in the ways the calling code knows how to call it. Consider a function that counts words in a text stream. If that function is currently used only in a program that always supplies text streams of 2KB in size, and there are no unit tests for that function, then the only behavior we can verify is that it can count words in text streams of exactly 2KB, because the calling code can only handle streams of 2KB of data.
We simply don’t know what will happen if we get 0 bytes, 2047 bytes, 2049 bytes or 2MB of data.
Now our requirements change, and we must consume just over 3KB of data. Our functional tests start blowing up. We can either work through the call stack looking for the issue (let’s say it was an off-by-one error in our original function, which used a 2KB buffer with a zero byte terminator … oops), or we could have had unit tests that leveraged the knowledge that our word counting function used a buffer of exactly 2KB, and written edge case tests around that value.
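As a rough sketch of those edge case tests: the WordCounter below is a hypothetical stand-in for the function in the story (the real one isn’t shown in this post). It reads through a 2KB buffer, and the tests deliberately probe sizes just below, at, and just above that buffer size, plus the empty case.

```csharp
using System;
using System.IO;

// Hypothetical word counter that reads its input through a fixed 2KB buffer.
public static class WordCounter
{
    public const int BufferSize = 2048;

    public static int Count(TextReader reader)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        var buffer = new char[BufferSize];
        int words = 0;
        bool inWord = false; // carried across chunks so a word split by the buffer counts once
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                bool isSpace = char.IsWhiteSpace(buffer[i]);
                if (!isSpace && !inWord) words++;
                inWord = !isSpace;
            }
        }
        return words;
    }
}

// Framework-free edge case tests built around the known buffer size.
public static class WordCounterTests
{
    public static bool EmptyInputHasNoWords()
    {
        return WordCounter.Count(new StringReader("")) == 0;
    }

    // One long word of 2047, 2048 and 2049 characters must always count as one word.
    public static bool SizesAroundTheBufferCountOneWord()
    {
        foreach (int length in new[] { 2047, 2048, 2049 })
        {
            if (WordCounter.Count(new StringReader(new string('a', length))) != 1) return false;
        }
        return true;
    }

    // The space is the last character of the first chunk; "b" starts the second chunk.
    public static bool WordBreakAtTheBufferBoundary()
    {
        string text = new string('a', 2047) + " b";
        return WordCounter.Count(new StringReader(text)) == 2;
    }
}
```

None of these cases can be reached through a caller that only ever supplies exactly 2KB, which is the point: only a test that calls the unit directly can ask these questions.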
In upcoming posts, I’ll cover some ways to maximize the benefit of unit tests, while minimizing the pain.
Is Your Code SOLID: The Dependency Inversion Principle
Posted by Jason Smith - 15/01/10 at 09:01:17 am
The mother of all SOLID principles. Nail this one, and you’ll keep your codebase supple, ready for just about any change you throw at it. The Dependency Inversion Principle comes in two flavors:
- HIGH LEVEL MODULES SHOULD NOT DEPEND UPON LOW LEVEL MODULES. BOTH SHOULD DEPEND UPON ABSTRACTIONS.
- ABSTRACTIONS SHOULD NOT DEPEND UPON DETAILS. DETAILS SHOULD DEPEND UPON ABSTRACTIONS.
An example should make this clear.
Consider a very simple requirement: read a serial port for ASCII characters, find any appropriate stock symbol and associated data and copy that data onto a TCP/IP socket (don’t laugh – those were real requirements: a trading partner’s trade desk wanted a connection to a former employer’s trading platform, and the exchange had no more ports to grant).
The MacGyver’ed up solution ended up with a dependency graph in which the symbol lookup module depended directly on both the RS232 serial port reader and the socket.
More abstractly, this program reads characters from a source (the serial port), does a lookup, and then writes textual data to a destination (a socket). A reasonable one-off, but ultimately a problem. Because the high level module (symbol lookup) depended on low level modules (an RS232 serial port reader and a socket), we could only use it in an environment with an RS232 port and a socket. Any other use would require so much refactoring a rewrite actually started to make sense. It was a very simple hack … err … design that became very expensive to maintain.
Instead, if the symbol lookup module depended on a simple text reader abstraction, which the RS232 Serial Port reader then implemented, and the socket implemented a simple text writer abstraction, we suddenly open all the modules up to a world of other uses and environments.
With this change, all the classes follow the Open/Closed Principle (OCP). The behavior of the program is changed by replacing the RS232 Serial Port Reader with some other character reader (i.e., an extension) – no other classes need to change. As long as our line-of-business module (symbol lookup) depends on abstractions (TextReader and TextWriter), the data will flow no matter where it comes from, or goes to.
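A minimal sketch of that inverted design, assuming the lookup table is a simple symbol-to-data dictionary and that symbols arrive one per line (both are assumptions for illustration; the original program isn’t shown). The only real types it leans on are System.IO.TextReader and System.IO.TextWriter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// High-level module: knows nothing about serial ports or sockets,
// only about the TextReader and TextWriter abstractions.
public class SymbolLookup
{
    // Hypothetical lookup table mapping a stock symbol to its associated data.
    private readonly IDictionary<string, string> _quotes;

    public SymbolLookup(IDictionary<string, string> quotes)
    {
        if (quotes == null) throw new ArgumentNullException(nameof(quotes));
        _quotes = quotes;
    }

    // Reads one symbol per line from the source and writes
    // "SYMBOL data" to the destination for every symbol it recognizes.
    public void Pump(TextReader source, TextWriter destination)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            string symbol = line.Trim();
            string data;
            if (_quotes.TryGetValue(symbol, out data))
            {
                destination.WriteLine(symbol + " " + data);
            }
        }
    }
}
```

In a test, the source can be a StringReader and the destination a StringWriter. In production, the same Pump method could be handed a reader wrapped around the serial port and a writer wrapped around the socket, without SymbolLookup changing at all.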
Usually, DIP isn’t violated in such an over-the-top way as that. It’s often much more subtle:
using System.Collections.Generic;

namespace Numa.Infrastructure.Client
{
    public interface IUIHost
    {
        //...
        /// <summary>
        /// Get or Set Environment Configuration Setting
        /// </summary>
        Dictionary<string, string> EnvConfig
        {
            get;
            set;
        }
    }
}
Note the implementation detail: IUIHost.EnvConfig exposes a Dictionary<string, string>. Not only does the interface depend on that detail, it forces the same dependency on all of its consumers simply by making it part of the interface. A reasonable SOLID refactoring of this interface would probably replace the Dictionary<string, string> class with an IDictionary<string, string> interface.
This simple change does two things to improve the quality of our design:
- It protects the clients of our abstractions from changes in our implementation of the abstraction. A derived design might deal with huge result sets, and so it loads the data lazily. A specific Dictionary type can’t do that, but something implemented under the IDictionary interface can.
- It documents a contract between our abstraction and clients of our abstraction – constraining both our designs, and the valid uses of our designs. The change says our interface returns an object that can map a string to another string.
Before we think we’re done, consider this: both the original and the changed code also say the IUIHost.EnvConfig property supplies a type where:
- String mappings can be added
- String mappings can be removed
- String mappings can be cleared
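In other words, this interface promises to support those additional features. If we don’t want to make all of those promises, we should choose an abstraction that better describes what we mean. For example, if we don’t want to allow modifications to the result set, we might consider an IEnumerable<KeyValuePair<string, string>> return type, or IReadOnlyDictionary<string, string> on .NET 4.5 and later. Be careful with ICollection: the non-generic interface has no mutating members, but the generic ICollection<T> still promises Add, Remove, and Clear.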
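As a sketch of the narrower promise, here is one way a read-only version might look. IReadOnlyUIHost and FixedUIHost are hypothetical names made up for this example; they are not part of the Numa code above.

```csharp
using System.Collections.Generic;

// Hypothetical narrowed interface: it promises enumeration and nothing else.
public interface IReadOnlyUIHost
{
    IEnumerable<KeyValuePair<string, string>> EnvConfig { get; }
}

// Hypothetical implementation backed by a private, mutable dictionary.
public class FixedUIHost : IReadOnlyUIHost
{
    private readonly Dictionary<string, string> _settings =
        new Dictionary<string, string>
        {
            { "environment", "test" },
            { "region", "us-east" }
        };

    public IEnumerable<KeyValuePair<string, string>> EnvConfig
    {
        get
        {
            // Yielding the pairs (instead of returning _settings itself) means
            // a caller cannot cast the result back to a Dictionary and modify it.
            foreach (KeyValuePair<string, string> pair in _settings)
            {
                yield return pair;
            }
        }
    }
}
```

The trade-off is that callers lose lookup by key. If they need that without the ability to modify, a read-only dictionary abstraction is probably the better fit.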
Another place we see the DIP violated is in code written from a procedural perspective – code that depends on specific function calls instead of calling through abstractions or interfaces. Such code often lives in types with names like “FooManager” or “FooHelper”. For example:
using System;
using System.Security.Principal;

public static class AuthorizationService
{
    public static bool HasPermission(
        string action,
        IIdentity identity)
    {
        // something (the real permission check is elided)
        throw new NotImplementedException();
    }
}

public class MyClass
{
    public void DoSomething()
    {
        // Hard-wired to the static service and to the current thread's principal.
        bool permitted = AuthorizationService.HasPermission(
            "MyClass.DoSomething",
            System.Threading.Thread.CurrentPrincipal.Identity);
        if (permitted)
        {
            // blah
        }
    }
}
MyClass.DoSomething depends on the following implementation details:
- where the Identity comes from (the Thread’s CurrentPrincipal)
- where the AuthorizationService comes from (it’s a global)
- the specific implementation of the AuthorizationService
To understand this code’s resistance to change, try writing a unit test around MyClass.DoSomething() that only invokes a test stub AuthorizationService. It simply can’t be done.
AuthorizationService exposes the detail that there’s only one of it (implied by the fact that it’s a static class), and that its services will always be accessible via a reference to the class name. Callers will end up depending on those details too. In other words, it simply can’t be changed without forcing a rebuild of all clients.
An improved design might look something like:
using System;
using System.Security.Principal;

public interface IAuthorizationService
{
    bool HasPermission(string action, IIdentity identity);
}

public class MyClass
{
    private readonly IIdentity _user;
    private readonly IAuthorizationService _authSvc;

    // Both collaborators are supplied by whoever constructs MyClass.
    public MyClass(IAuthorizationService authService,
        IIdentity user)
    {
        _authSvc = authService;
        _user = user;
    }

    public void DoSomething()
    {
        bool permitted = _authSvc.HasPermission(
            "MyClass.DoSomething", _user);
        if (permitted)
        {
            // blah
        }
    }
}
This technique, called Constructor Dependency Injection, breaks MyClass’s dependency on the detail that an AuthorizationService is implemented a particular way, and that it’s supplied in a particular way. By breaking that dependency, our designs become much easier to change and test.
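To see the testability payoff, here is a rough sketch of the kind of unit test that was impossible against the static version. GuardedCounter is a hypothetical stand-in shaped like MyClass (it exists only so the test has something observable to check), and StubAuthorizationService is a hand-rolled stub, not a class from any mocking library.

```csharp
using System;
using System.Security.Principal;

public interface IAuthorizationService
{
    bool HasPermission(string action, IIdentity identity);
}

// Hypothetical class shaped like MyClass, with a visible result to assert on.
public class GuardedCounter
{
    private readonly IAuthorizationService _authSvc;
    private readonly IIdentity _user;

    public int Value { get; private set; }

    public GuardedCounter(IAuthorizationService authService, IIdentity user)
    {
        _authSvc = authService;
        _user = user;
    }

    // Increments only when the injected service grants permission.
    public void Increment()
    {
        if (_authSvc.HasPermission("GuardedCounter.Increment", _user))
        {
            Value++;
        }
    }
}

// Test stub: returns a canned answer and records what it was asked.
public class StubAuthorizationService : IAuthorizationService
{
    private readonly bool _answer;

    public string LastAction { get; private set; }

    public StubAuthorizationService(bool answer)
    {
        _answer = answer;
    }

    public bool HasPermission(string action, IIdentity identity)
    {
        LastAction = action;
        return _answer;
    }
}
```

A test can now construct a GuardedCounter with new StubAuthorizationService(false) and a GenericIdentity, call Increment, and check that Value is still zero. No thread principal, no global, and no real authorization backend is involved.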