Wednesday, July 16, 2014

Coding backwards - a design technique

The worst design formula I ever encountered came courtesy of a group of interns, ten years ago. When given the problem statement, they proceeded to convert the top-level business entity names into classes and the verbs into their methods. Needless to say, what came out of the exercise was an abomination: it looked beautifully elegant as a class diagram but, lacking any consideration of how the thing might work at implementation level, it (and the code it inspired) had to be thrown out with prejudice.

I have seen many such formulas over the intervening decade. They all had one thing in common: they were formulas, in that they substituted the hard work, i.e. the thinking, with a series of steps based on some theory (or worse yet, on a collection of buzzwords masquerading as a theory). I myself have tried to come up with a few over the years. Not because I dislike thinking but, as I have come to realize, because I hate chaos.

A design technique, in my opinion, should not try to reduce the amount of thinking required. It should instead try to reduce the amount of chaos in the process so that the developer can do his thinking methodically. Here I outline a process that I have been using for the past four years. It has worked equally well with C++ (my mother tongue of ten years), PHP and especially JavaScript (my recently adopted second language).


Before I found Python, I was a Perl aficionado, and those who know the temperament of that language probably already know its remarkable resistance to attempts at structure. You may start with the best process, standards (and intentions) in mind, and three months into the project Perl has a tendency to suddenly jump out of the repository and proclaim, "Ha ha, look at me, I'm a mess!" So it didn't quite work there. The Pythonic way, on the other hand, intrinsically encourages some of the things I'm trying to achieve with the technique below, so I found that my Python code often came out looking better (and being more maintainable) even without the benefit of this technique.

Enough preamble now. Let us have a look. Compared to most people I know, I develop backwards: while most people write classes, modules and functions and then use them, I use them in code before I write them.

I recently introduced IndexedDB to a hybrid JavaScript application. In plain vanilla JavaScript, IndexedDB usage looks a bit verbose:
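
As a rough illustration only (not the application's actual code), opening a database and adding a single record with the raw API goes something like this. The function name rawAdd, the database name notes-db and the store name notes are all hypothetical, and the IDBFactory is passed in as a parameter (it would be window.indexedDB in a browser) so the sketch does not depend on a global.

```javascript
// Sketch of raw IndexedDB usage: open, create schema if needed, add one record.
// `idb` is an IDBFactory; all names below are made up for illustration.
function rawAdd(idb, record, done) {
  const openRequest = idb.open('notes-db', 1);

  // Schema changes are only allowed inside this handler.
  openRequest.onupgradeneeded = function (event) {
    const db = event.target.result;
    db.createObjectStore('notes', { keyPath: 'id' });
  };

  openRequest.onerror = function () {
    done(openRequest.error);
  };

  openRequest.onsuccess = function (event) {
    const db = event.target.result;
    const tx = db.transaction(['notes'], 'readwrite');
    const addRequest = tx.objectStore('notes').add(record);

    addRequest.onerror = function () {
      done(addRequest.error);
    };

    // Fires after every request in the transaction has succeeded.
    tx.oncomplete = function () {
      db.close();
      done(null);
    };
  };
}
```

That is three levels of handlers and two error paths for a single write, which is the boilerplate the rest of this post tries to get rid of.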



I need to wrap this for a number of reasons. First, I don't allow boilerplate code where I work -- a library or a module may not force its user to do something it can do itself. Usage needs to be a lot more terse. Second, it has to fit into our existing application's event model.

Here's how I start: I assume the existence of a hypothetical module that meets all my requirements, and I start using it:



So we're assuming that the database already exists with the right schema? Can't do that with IndexedDB. The only time you get to meddle with the schema -- whether it is for the first time or during an application upgrade -- is when the onupgradeneeded event is triggered during a database open. So we need to know the schema at open/connect time.
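
One way an open call that knows the schema could be sketched, purely as an assumption about the shape of such a wrapper: the config object (name, version, stores, indexes) and the function name openDatabase are hypothetical, while open(), onupgradeneeded, objectStoreNames.contains(), createObjectStore() and createIndex() are the standard IndexedDB calls.

```javascript
// Sketch: open a database and create any missing stores during the upgrade.
// `idb` is an IDBFactory; the `config` layout is an assumption of this sketch.
function openDatabase(idb, config, callback) {
  const request = idb.open(config.name, config.version);

  // The only place where stores and indexes may be created.
  request.onupgradeneeded = function (event) {
    const db = event.target.result;
    config.stores.forEach(function (spec) {
      if (db.objectStoreNames.contains(spec.name)) {
        return; // store already exists, leave it alone
      }
      const store = db.createObjectStore(spec.name, { keyPath: spec.keyPath });
      (spec.indexes || []).forEach(function (index) {
        store.createIndex(index.name, index.keyPath, { unique: !!index.unique });
      });
    });
  };

  request.onsuccess = function (event) {
    callback({ database: event.target.result });
  };
  request.onerror = function () {
    callback({ error: request.error });
  };
}
```

The caller only supplies the config and one callback; everything that has to happen at upgrade time stays inside the wrapper.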



That looks better. Now I'm going to pull the configs out into a separate JSON file, because schemas are better specified declaratively than constructed procedurally.



We're still missing the scripts or schema deltas for upgrading the database from the two previous versions. I'm not going to handle that problem in its entirety right now, but here's one way I see it happening:



So there's a snapshot of each version's schema in the database config file? That's not going to be maintainable over the long run. Alternatively, you could have delta configs that give you the differences since the previous version, and then at the end a full schema for the latest version. I'm leaning towards the latter solution; otherwise the upgrade script would need to diff the schemas on its own.



The onupgradeneeded handler will have to load the upgrade files in the correct order and run them synchronously. For example: 0 -> 3: no need for a script, create the current schema. 2 -> 3: fetch version3.js and execute it. 1 -> 3: execute version2.js and version3.js in order. Upgrades are done procedurally because there may be more to them than just schema changes. Migration scripts in particular will need to look at existing data.
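
The ordering rule in those three examples can be sketched as a small pure function. The name upgradeScripts is hypothetical; the versionN.js naming follows the examples above, and the old version would come from event.oldVersion in the onupgradeneeded handler, which is 0 when the database does not exist yet.

```javascript
// Sketch: which upgrade scripts to run, in order, to go from oldVersion
// to newVersion. A fresh install (oldVersion 0) needs none, because the
// current schema is created directly.
function upgradeScripts(oldVersion, newVersion) {
  if (oldVersion === 0) {
    return [];
  }
  const scripts = [];
  for (let version = oldVersion + 1; version <= newVersion; version++) {
    scripts.push('version' + version + '.js');
  }
  return scripts;
}

console.log(upgradeScripts(0, 3)); // []
console.log(upgradeScripts(2, 3)); // [ 'version3.js' ]
console.log(upgradeScripts(1, 3)); // [ 'version2.js', 'version3.js' ]
```

Loading and executing the scripts is deliberately left out here; the sketch only covers the ordering.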



Now, how do I start writing data? I tried this first:



Notice I've converted IndexedDB's result.callback = handler syntax to the more familiar operation(callback). If you come from a multi-threaded programming background, the former will look as if it has a race condition (in reality it doesn't, because the JavaScript engine runs the current code to completion before it dispatches any asynchronous callbacks). I've also simplified error reporting to function(result) { if (result.error) ... }, where the rest of the result depends on your operation.
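
A minimal sketch of that conversion, assuming a request object with the usual onsuccess/onerror properties and result/error fields (as IDBRequest has). The helper name withCallback and the value field on the result are assumptions of this sketch, not part of IndexedDB.

```javascript
// Sketch: adapt an IDBRequest-style object to a single callback that
// receives either { error } or { value }.
function withCallback(request, callback) {
  // Assigning handlers after the request was created is safe: its events
  // are only dispatched once the currently running code has finished.
  request.onsuccess = function () {
    callback({ value: request.result });
  };
  request.onerror = function () {
    callback({ error: request.error });
  };
  return request;
}
```

A caller would then write something like withCallback(store.get(key), function (result) { if (result.error) { ... } }), with one handler instead of two.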

What I'm not happy with: that database.getStore('storename') looks superfluous. IndexedDB's transaction model is well suited for transactions spanning multiple stores, but for most day-to-day operations, I'm going to make the following simplification:



So now the datastore is an argument in the add() call. Looks good enough to start.
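
Underneath, such an add() could be sketched as a one-store transaction created on the fly. This is an assumption about the implementation, not the finished module: the Database constructor wraps an already open IDBDatabase, and the key field on the result is made up for illustration.

```javascript
// Sketch: add(storeName, item, callback) as a single-store transaction.
// `db` is an open IDBDatabase (or anything with the same methods).
function Database(db) {
  this.db = db;
}

Database.prototype.add = function (storeName, item, callback) {
  const tx = this.db.transaction([storeName], 'readwrite');
  const request = tx.objectStore(storeName).add(item);
  let key;

  request.onsuccess = function () {
    key = request.result; // the key of the newly stored record
  };
  // Report only once the whole transaction has completed or aborted.
  tx.oncomplete = function () {
    callback({ key: key });
  };
  tx.onabort = function () {
    callback({ error: tx.error });
  };
};
```

Waiting for the transaction's complete event rather than the request's success event means the callback only fires when the write has actually been committed.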

But most operations in our app won't go like that. We need transactions. Here's how I'd like to see those handled:



I've assumed a remove(storename, key) function here. We can easily implement that in IndexedDB. Notice how we don't repeatedly specify store names anymore. Our module's transaction object will have to collect an array of all referenced stores and use that when creating the IndexedDB transaction. So that's an advantage of passing the store name into the operation. Here Transaction.add() will only add the operation to a list. It will only be executed via Database.add() during Transaction.execute().
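
A rough sketch of that bookkeeping, under stated assumptions: the constructor takes an open IDBDatabase, the private helper queue() is hypothetical, and for brevity execute() calls the object stores directly instead of going through Database.add(). The underlying calls (transaction(), objectStore(), add(), delete()) are standard IndexedDB.

```javascript
// Sketch: a transaction object that only records operations until
// execute() is called. `db` is an open IDBDatabase or compatible object.
function Transaction(db) {
  this.db = db;
  this.operations = []; // queued { method, storeName, arg }
  this.storeNames = []; // every store referenced so far, without duplicates
}

Transaction.prototype.queue = function (method, storeName, arg) {
  if (this.storeNames.indexOf(storeName) === -1) {
    this.storeNames.push(storeName);
  }
  this.operations.push({ method: method, storeName: storeName, arg: arg });
  return this; // allow chaining
};

Transaction.prototype.add = function (storeName, item) {
  return this.queue('add', storeName, item);
};

Transaction.prototype.remove = function (storeName, key) {
  return this.queue('delete', storeName, key);
};

Transaction.prototype.execute = function (callback) {
  if (this.operations.length === 0) {
    callback({}); // nothing queued; avoid opening an empty transaction
    return;
  }
  // One IndexedDB transaction spanning every store that was referenced.
  const tx = this.db.transaction(this.storeNames, 'readwrite');
  this.operations.forEach(function (op) {
    tx.objectStore(op.storeName)[op.method](op.arg);
  });
  tx.oncomplete = function () {
    callback({});
  };
  tx.onabort = function () {
    callback({ error: tx.error });
  };
};
```

Because add() and remove() only append to a list, they can be called from inside a loop, for example items.forEach(function (item) { transaction.add('notes', item); }), before a single execute().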

One iteration seems good enough for that bit, but we could also do it this way:



Looks more terse, but implementation might be more complicated. We will need the Database to know the difference between calling add() normally (dispatches the request immediately) and calling add() within a transaction (collects and dispatches only when execute() is called). Plus it doesn't allow you to add items to the transaction within a loop, at least not directly. So for the first version of this module, I'm going to stick with the syntax from ITERATION 1.

Once I've gone through a similar process for the other operations I need (update and query, in this case), I'm ready to implement. If I've done everything right, the above snippets should just work once I include the completed module.
