I am starting up a Hobo-based project that will appear sometime soon. I made a simple mistake along the way, so I thought I should share.

I ran something like this... sudo hobo code_blog
I got a few missing gem problems, so I did a few sudo gem install X.
I did a few ./script/generate hobo_model_resource X.
Finally I get to ./script/generate hobo_migration
and ./script/server

With excitement I fire up firefox and get a lovely screen. I attempt to sign up and get

ActiveRecord::StatementInvalid: SQLite3::SQLException: unable to open database file: INSERT INTO users ("salt", "updated_at", "crypted_password", "remember_token_expires_at", "username", "administrator", "remember_token", "created_at") VALUES('xxxxxxxxxxxxxxxxba4739cda76', '2008-07-28 22:16:31', 'xxxxxxxxxxxxxxxxxxx99537dd568', NULL, 'nigel', 't', NULL, '2008-07-28 22:16:31')
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract_adapter.rb:150:in `log'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/sqlite_adapter.rb:132:in `execute'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/sqlite_adapter.rb:345:in `catch_schema_changes'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/sqlite_adapter.rb:132:in `execute'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:156:in `insert_sql'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/sqlite_adapter.rb:146:in `insert_sql'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:44:in `insert_without_query_dirty'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract/query_cache.rb:19:in `insert'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/base.rb:2272:in `create_without_callbacks'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:226:in `create_without_timestamps'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/timestamp.rb:29:in `create'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/base.rb:2238:in `create_or_update_without_callbacks'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:213:in `create_or_update'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/base.rb:1972:in `save_without_validation'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/validations.rb:934:in `save_without_transactions'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:108:in `save'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/connection_adapters/abstract/database_statements.rb:66:in `transaction'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:80:in `transaction'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:100:in `transaction'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:108:in `save'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:120:in `rollback_active_record_state!'
from /usr/local/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/transactions.rb:108:in `save'
from (irb):7>> exit


OK, so do you spot the mistake?

Well, it's the sudo in sudo hobo code_blog.

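This left all the generated Hobo files owned by root, so the server (running as my normal user) could not open the SQLite database file for writing.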

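The fix: sudo chown -R nigelthorne code_blog (obviously with your own username and project directory). The -R makes chown recurse into subdirectories such as db/, which code_blog/* on its own would not.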
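If you want to confirm the diagnosis before changing anything, you can list everything in the project tree that your own user does not own. This is only a minimal sketch in Python, assuming a Unix-like system; the function name is made up for illustration.

```python
import os


def files_not_owned_by_me(root):
    """Return paths under root whose owner differs from the current user."""
    my_uid = os.getuid()  # Unix only
    foreign = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check the directory itself as well as its files: SQLite needs
        # write access to the directory that holds the database file.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != my_uid:
                foreign.append(path)
    return foreign
```

Calling files_not_owned_by_me("code_blog") after the sudo mistake should list the generated files; after the chown it should return an empty list.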

We live and learn. :)

As we refactor our code, some design changes make existing code redundant. When you have unit tests on all your code, coverage can't find it for you: the tests still exercise the redundant code, so it shows up as covered. So finding this redundant code can be tricky.

What you need is a tool that can analyze your code base. The best I have found for .Net is NDepend.

NDepend provides a query language for code.

I started a project and added all assemblies except the test ones. I then ran the query:
SELECT TYPES WHERE TypeCa == 0

TypeCa refers to the 'Afferent Coupling' of the type, i.e. how many other types refer to this one.

This found 40 files I could delete.

I used ReSharper to double check each one was not used (except in tests).

I got a couple of false positives with types I am only using to pass as type parameters to generic classes. This is the only thing stopping me from adding this check to our build script.

Finding all those classes by hand would have taken much longer, so...

Works for us!

Notwithstanding the fact that I needed to restore my operating system in the first place--due to an inexplicable and catastrophic failure of the Java installation resulting in segfaults--I was able to restore my entire 100GB system in around 4 hours. For posterity:

  • Boot off the OS X System Install DVD--hold down option while the system starts
  • Connect the external drive with the TimeMachine backup--in my case a TimeCapsule attached via ethernet
  • Select "Restore from TimeMachine backup" in the Utilities menu
  • Select the specific backup (by timestamp) from which to restore
  • And away you go!

The disk is then automatically erased and a fully bootable system is restored sans temp directories and cache files. It even managed to restore my PostgreSQL databases that were running at the time--which probably says more about PostgreSQL than anything.

The one grumble I do have is that the timestamps in the names of the backups were offset by some non-obvious period from the actual date the backup was made. The difference wouldn't have been much of an issue had I simply needed to restore the most recent backup, but as it turned out I needed to go back a couple of days in order to get a clean system. Thankfully I got lucky on the second attempt :)

Once I had restored the system I took a look at the backup folders and sure enough there are two timestamps: the one in the folder name, and the created date. The created timestamp was spot on but the one in the folder name--the one presented to you when restoring--was wacky. I honestly didn't spend long enough on it to work out whether the difference was consistent.

What is really interesting is that I had SuperDuper! on my list of software to start using but it would appear there is little need--at least in my case.

My blog administration has been broken due to a forced upgrade, and my motivation to blog has been reduced due to time spent on other tasks. However, I'm planning on fixing my blog software soon and hopefully return to the blogging fold sooner rather than later....

Much of the work performed in web-based application development involves taking data from one state and applying it to some other application. You get stuff from a database, or a middleware service, or some other system and send it to a web browser, or a web service client or whatever. Going the other way, you take data from a web request, or a web service call and pass it on to a database, or middleware. Essentially, much of the work I've been doing over the past 2 or 3 years hasn't had a lot to do with business problems; it's been more about problems of structure and transport. At a 200ft level many of the applications I've worked on look like this:


client tier--------web tier--------business tier-------data/comm tier

html/xml---------pojo-------------pojo---------------pojo/xml


Browser clients speak in POJOs, raw request/response parameters or structured XML; web service clients speak in XML and objects. Middleware services speak in XML or JMS messages; databases speak in JDBC objects.

It's like working as a translator at the UN.

Often the only business-oriented work you need to perform is some sort of validation, or merely providing the structure needed to represent business relationships properly.

So why bother with the middle bit? Why not just let my XML capable client get the XML straight from the middleware tier, and use that to form the UI? Why not make the picture:

client tier--------web tier-------------------------data/comm tier
html/xml---------pojo--------------------------------pojo/xml

or even:

client tier------------------------------------------data/comm tier
html/xml---------------------------------------------pojo/xml

There's a reason why people don't use the SQL tags available for JSP. It's 24 flavours of wrong.

I propose that the correct thing to do in almost all situations involving this type of problem is to convert the incoming data of the systems and clients you're interacting with into a neutral representation of the business concepts at play. We'll call this the 'Neutral Domain'.

The Neutral Domain could take on the form of a set of value objects, or even a map of values. There are many advantages to this, but one of the most important is not the way the data is held together, it's the fact that the application is in control of it. Why is this important?

Multiple Flavoured End Points

Being in control of a structured domain sitting between two disparate systems means that if a third disparate system is added into the mix, all that has to happen is the addition of a language builder for that new system. The Neutral Domain is unaffected. It just sits there representing the structure of the data in a client-neutral state. The work is the responsibility of a layer of translators who speak both the client language and the Neutral Domain. Adding another new client that speaks a new language, or relies on the data in another format, is simply a matter of adding some new builders at a new end point.

At the other end of the application, communicating with backend systems, the same thing applies. The very fact that you have a neutral representation of the data means you can add discrete layers that handle the complexity of structuring data for movement to many other systems which probably all speak a slightly different language.

Validation and 'Business Logic'

A Neutral Domain means you can perform business things on data in an Object Oriented, highly unit testable fashion. I was recently implementing a service that retrieved currency exchange rates in an XML format from a news service and made that data available, also as XML, to a desktop client application (the client pulled the data).

It was entirely possible that I could just pass the XML on to the desktop application, using the same format as the originating news service, without establishing a Neutral Domain. Just a pass-through of XML. Simple.

Let's imagine, then, that the customer wanted two things: a margin of n% on all exchange rates, and the same data displayed in a web page that was not going to be XML capable. Using the 'let's unite the world in a single representation of XML' strategy would doom you to transforming XML from one state into another and performing business logic on XML documents in order to sort out the margins. Now throw in a bunch of other client-specific requirements and a simple pass-through solution is looking a little messy.

Having established a Neutral Domain, the margin requirement was trivial. All of my CurrencyExchangeRate objects were instantiated with an appropriate Margin value. The web application that will only accept POJOs simply used the Neutral Domain to construct View objects of the important structures, and of course the desktop application used XML that was the result of a set of builders specifically designed to convert my Neutral Domain to an agreed format. My Neutral Domain remains unaffected and capable of servicing the business logic regardless of the client that requires a view of the same data.
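To make the shape of this concrete, here is a rough sketch of the idea. It is written in Python rather than Java purely for brevity, and it is not the code from the project: the feed format, the element and attribute names, and the way the margin is applied (rate multiplied by 1 + n/100) are all assumptions made up for illustration. Only the CurrencyExchangeRate name comes from the description above.

```python
from dataclasses import dataclass
from decimal import Decimal
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class CurrencyExchangeRate:
    """Neutral Domain object: knows nothing about XML or web views."""
    base: str
    quote: str
    rate: Decimal
    margin_percent: Decimal = Decimal("0")

    def rate_with_margin(self):
        # Assumed business rule: mark the rate up by margin_percent.
        return self.rate * (Decimal("1") + self.margin_percent / Decimal("100"))


def rates_from_feed(xml_text, margin_percent):
    """Translator: hypothetical incoming feed XML -> Neutral Domain."""
    root = ET.fromstring(xml_text)
    return [
        CurrencyExchangeRate(
            base=node.get("base"),
            quote=node.get("quote"),
            rate=Decimal(node.get("value")),
            margin_percent=margin_percent,
        )
        for node in root.findall("rate")
    ]


def to_desktop_xml(rates):
    """Builder: Neutral Domain -> hypothetical agreed desktop XML format."""
    root = ET.Element("exchangeRates")
    for r in rates:
        ET.SubElement(root, "exchangeRate", {
            "pair": r.base + "/" + r.quote,
            "rate": str(r.rate_with_margin()),
        })
    return ET.tostring(root, encoding="unicode")


def to_web_view(rates):
    """Builder: Neutral Domain -> plain view objects for the web page."""
    return [
        {"pair": r.base + "/" + r.quote, "rate": str(r.rate_with_margin())}
        for r in rates
    ]
```

The margin lives in exactly one place, and neither builder needs to know it exists. Supporting a third client would mean writing a third builder function and nothing else.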

The Neutral Domain here is 'View Agnostic'. Its only concern is to represent structured information that is the same, all the time. It always has the same logical structure, and it always has the same 'business meaning'. It is ubiquitous, regardless of who intends to use it or manipulate its values.

It is in this state that we are most easily able to apply good programming principles: writing clean, testable business code that is separate from the clean, testable transformation code that must happen at different layers of the application.

Insulation and Testing

When you implement a Neutral Domain you decouple the source of the data from the use of the data. Changes in one don't always require a change in the other. Source and Use are insulated from changes in each other. Using nice things like inversion of control we can trivially implement substitutions or multiple implementations of abstract concepts to change behaviour.

Having the application layered in this fashion means we can implement a test suite that treats each layer in isolation; tests only focus on behaviour '1 layer deep'. The majority of your test suite is made up of proper unit tests and not fragile integration tests.
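As a sketch of what '1 layer deep' can look like, the test below exercises a business-layer class against a stub data source that is handed in from outside. It is in Python for brevity, and every name in it (QuoteService, StubRateSource, latest, quote) is hypothetical rather than taken from a real project.

```python
import unittest
from decimal import Decimal


class StubRateSource:
    """Stands in for the real data tier: no network, no XML, no database."""

    def __init__(self, rates):
        self._rates = rates

    def latest(self, pair):
        return self._rates[pair]


class QuoteService:
    """Business layer: depends only on 'something with a latest(pair) method'."""

    def __init__(self, rate_source, margin_percent):
        self._source = rate_source  # injected, so a test can substitute a stub
        self._margin = margin_percent

    def quote(self, pair):
        rate = self._source.latest(pair)
        return rate * (Decimal("1") + self._margin / Decimal("100"))


class QuoteServiceTest(unittest.TestCase):
    def test_applies_margin_to_latest_rate(self):
        source = StubRateSource({"AUD/USD": Decimal("0.50")})
        service = QuoteService(source, Decimal("10"))
        self.assertEqual(service.quote("AUD/USD"), Decimal("0.55"))
```

Run it with python -m unittest. The real source and the stub are interchangeable as far as QuoteService is concerned, which is what keeps this a unit test rather than an integration test.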

Continuous Integration is a common practice in Agile development circles, but I think people (especially those new to agile thinking) sometimes miss the point.

Problem is, the term has become synonymous with build-servers such as CruiseControl (etc, etc), which frequently grab the latest code, build it, and execute automated tests. These are often referred to as "continuous-integration servers", which IMHO is a really bad name, 'cos if there's one thing these servers typically don't do, it's integrate.

And the point of continuous-integration is just that: Integrating. Continuously! Which means:

  • developers frequently updating their working-areas (or personal branches) with the latest code on the mainline branch (typically many times a day), and
  • frequently merging their own changes back into the mainline (typically several times a day).

Unless you're doing this, you ain't "doing continuous integration", however frequently you're running automated builds!

Integrating continuously can be difficult. In particular, it forces you to chunk larger changes and features into small, bite-sized pieces that can be drip-fed into the codebase. And, you have to deal with other developers changing stuff all the time. Build-servers and automated tests are essential tools here, because they help keep the team honest, ensuring that everyone has a stable (if evolving) base to work on.

There are plenty of upsides to frequent integration:

  • each individual integration is smaller, and therefore easier
  • design issues (including differences of opinion) are identified earlier
  • developers can leverage each other's work earlier
  • changes can be tested (and bugs detected) earlier
  • software can be deployed more frequently

In summary: check it in already!

Last weekend I went along to CITCON here in Melbourne. Which was great fun, by the way.

There I ran a session on "Attacking slow-running CI builds". It was a small group, but an interesting discussion, I think. Here are my (rough, unedited) notes:

WHAT is the impact of a slow build?

  • fewer checkins
  • more waiting
  • context switching
  • discourages integration
  • discourages writing of additional tests
  • more chance of overlapping checkins
  • more build breakages
  • more time required to get the build fixed
  • reduced productivity
  • WASTE!

WHY is the build slow?

  • slow tests (particularly acceptance tests)
    • over-testing (testing the same code-paths repeatedly)
    • expensive set-up and tear-down
    • too much testing via the user-interface
    • tests that pause, sleep, or poll (e.g. to deal with AJAX)
  • too much I/O!
  • use of slow infrastructure components (database servers, application servers, etc.)
  • slow hardware

HOW can we make it faster?

  • faster hardware
  • run tests in parallel
  • distribute tests
  • fail fast
    • selective testing: run tests most likely to fail first
      • could use dependency-analysis to identify which tests were affected by recent commits
  • refactor story-based acceptance tests into scenario-based tests
    • bigger tests, with more assertions, offset set-up/tear-down costs
      • but makes tests more complex
  • share test fixtures between a group of tests
    • but breaks test isolation
  • avoid I/O
    • in-memory database
    • in-memory file-store (RAM disk?)
    • stub out infrastructure components
      • avoid testing these components by side-effect
  • populate the database directly, rather than using the user-interface to set-up for a test
  • separate your system into components that can be tested independently
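Two of the suggestions above (an in-memory database, and populating the database directly) combine naturally. Here is a minimal sketch in Python using SQLite's in-memory mode; the users table and the helper names are invented for illustration.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)"
    )


def count_users(conn):
    """Hypothetical code under test."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def make_test_db(usernames):
    # ":memory:" keeps the whole database in RAM: no disk I/O, and nothing
    # to clean up once the connection is closed.
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    # Populate directly instead of driving a sign-up form once per user.
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)",
        [(name,) for name in usernames],
    )
    conn.commit()
    return conn
```

A test can then call make_test_db(["alice", "bob"]) and assert that count_users returns 2. The trade-off is that an in-memory SQLite database may not behave exactly like the production database server, so a few tests against the real thing are probably still worth keeping.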

Thinking about this later ...

There are two types ...

The suggestions for improving build times seemed to fall into two categories:

  1. optimise the build/tests
  2. throw additional hardware at the problem

My problem with the "throw hardware at it" approach is that it typically only helps for the build-server machine; the poor old developers are still left with a slow-running build, and therefore many of the productivity issues still exist.

Another idea

It occurs to me now that we missed a fairly fundamental trick to improve test times: improve the performance of the system-under-test itself. It's a great excuse to start thinking about performance earlier in the project.

"Customer Acceptance Test" does not need to mean end-to-end

On all the projects I've been on in recent years, we've ended up with the majority of the tests being either "developer unit tests", which run super-fast, or "customer acceptance tests" which test end-to-end (browser-to-database) and run super-slow.

Methinks it should be less black-and-white. If we can demonstrate functionality that the customer cares about by calling the underlying logic directly (i.e. at unit-test level), rather than by exercising the user-interface, then what's wrong with that? (We just need one test to prove that the underlying logic has been properly integrated into the UI.)
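For example, suppose the customer's story is "orders of $100 or more ship free". That rule, the amounts, and the function name below are all made up for illustration. A sketch in Python of an acceptance test that states the rule in the customer's terms but calls the logic directly:

```python
import unittest
from decimal import Decimal

FREE_SHIPPING_THRESHOLD = Decimal("100.00")  # hypothetical business rule
STANDARD_SHIPPING = Decimal("9.95")  # hypothetical flat rate


def shipping_cost(order_total):
    """The underlying logic the user interface would call."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return Decimal("0.00")
    return STANDARD_SHIPPING


class FreeShippingAcceptanceTest(unittest.TestCase):
    """Worded in customer terms, but runs at unit-test speed."""

    def test_orders_of_100_dollars_or_more_ship_free(self):
        self.assertEqual(shipping_cost(Decimal("100.00")), Decimal("0.00"))

    def test_smaller_orders_pay_standard_shipping(self):
        self.assertEqual(shipping_cost(Decimal("99.99")), STANDARD_SHIPPING)
```

One slow browser-level test would still be needed to show that the checkout page really calls shipping_cost.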

NCover is an invaluable part of our continuous integration server. (In conjunction with NCoverCop [shameless plug]).

We recently splashed out on the commercial version [NCover 2.0] rather than use the open source version with an aim to speed up the build.

The upgrade was pretty painless. I needed to extend NCoverCop to process the new file format, so the next version will now handle both. It can even compare one against the other.

The results




We managed to shave 2 minutes off the build. Every little helps!

Well, although this feature (matching generic methods by their type arguments) has been mentioned on the web, it doesn't seem to exist in any version of NMock2 I can find. With a bit of digging I worked out how to implement it myself.

You can download the assembly from NMock2Extensions to try it out, or download the source from svn.

Here's an example of using it...


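// Stub: calls to Get<IControl>() return the existing control
Stub.On(childScope)
    .Method(new GenericMethodMatcher("Get", typeof(IControl)))
    .Will(Return.Value(control));

// Expectation: Get<IDocument>() must be called exactly once
Expect.Once.On(childScope)
    .Method(new GenericMethodMatcher("Get", typeof(IDocument)))
    .Will(Return.Value(NewMock<IDocument>()));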


As you can see, you can Stub or Mock on the same method with different generic types and they are handled differently, as you would expect.

Happy Mocking
Check out Getting Started with NDependencyInjection. I'm looking for some feedback. How readable/followable is it?