A brief history of the Future
Lessons learned from API design under pressure
The proposed solution
As one of the most recent additions to the architecture group, I was asked to look into this problem and see if there was something we could do to make it easier for the application and service developers to write readable and maintainable code. I went away and did some research, and came back with an idea, which we called a Future. The Future was a construct based on the idea of a value that would be available “in the future”. You could write your code in a more-or-less straight-line fashion, and as soon as the data was available, it’d flow right through the queued operations.
The design of the Future
Future was based on ideas from Smalltalk (promise), Java (future/promise), Dojo.js (deferred), and a number of other places. The primary design goals were:
- Make it easy to read through a piece of asynchronous code, and understand how it was supposed to flow, in the “happy path” case
- Simplify error handling – in particular, make it easy to bail out of an operation if errors occur along the way
- To the extent possible, use Future for all asynchronous control flow
You can see the code for Future, because it got released along with the rest of the WebOS Foundations library as open source for the Open WebOS project.
My design brief looked something like this:
A Future is an object with these properties and methods:
.result: The current value of the Future. If the Future does not yet have a value, accessing the result property raises an exception. Setting the result of the Future causes it to execute the next “then” function in the Future’s pipeline.
.then(next, error): Adds a stage to the Future’s pipeline of steps. The Future is passed as a parameter to the function “next”. The “next” function is invoked when a value is assigned to the Future’s result, and the (optional) “error” function is invoked if the previous stage threw an exception. If the “next” function throws an exception, the exception is stored in the Future, and will be re-thrown if the result of the Future is accessed.
This is more-or-less what we ended up implementing, but the API did get more complicated along the way. Some of this was an attempt to simplify common cases that didn’t match the initial design well. Some of it was to make it easier to weld Futures into callback-based code, which was ultimately a waste of time, in that Future pretty much wiped out all competing approaches to flow control. And one particular set of changes was thrown in at the last minute to satisfy a request that should just have been denied (see What went wrong, below).
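To make the design brief concrete, here is a minimal sketch of a Future with just a result property and a then method. It is an illustration only, not the Foundations implementation: the class name SketchFuture, the exception property, and the choice to skip stages that have no error function while an exception is stored are all assumptions.

```javascript
// Illustrative sketch of the design brief; not the Foundations implementation.
class SketchFuture {
  constructor() {
    this._stages = [];       // queued { next, error } pairs added by then()
    this._state = "empty";   // "empty" | "value" | "error"
    this._value = undefined; // current value, or the stored exception
    this._pending = false;   // true while a value/exception awaits a stage
    this._running = false;   // guards against re-entrant draining
  }

  // Reading result throws if there is no value yet, or re-throws a stored exception.
  get result() {
    if (this._state === "error") throw this._value;
    if (this._state === "empty") throw new Error("Future has no result yet");
    return this._value;
  }

  // Assigning result hands the value to the next queued stage.
  set result(value) {
    this._settle("value", value);
  }

  // Assumption: the stored exception can be read without re-throwing it.
  get exception() {
    return this._state === "error" ? this._value : undefined;
  }

  then(next, error) {
    this._stages.push({ next, error });
    this._drain();
    return this;
  }

  _settle(state, value) {
    this._state = state;
    this._value = value;
    this._pending = true;
    this._drain();
  }

  _drain() {
    if (this._running) return; // the loop below picks up nested assignments
    this._running = true;
    while (this._pending && this._stages.length > 0) {
      const stage = this._stages.shift();
      const handler = this._state === "error" ? stage.error : stage.next;
      if (!handler) continue; // no handler for this state: skip, so errors propagate
      this._pending = false;
      try {
        handler(this); // the Future itself is the only argument
      } catch (e) {
        this._settle("error", e); // store the exception for a later error function
      }
    }
    this._running = false;
  }
}

// Usage: stages are queued first, then run when a value arrives.
const trace = [];
const future = new SketchFuture();
future
  .then((f) => { trace.push("got " + f.result); f.result = f.result * 2; })
  .then((f) => { trace.push("doubled to " + f.result); });
future.result = 21;
console.log(trace);
```

Note that in this sketch a stage has to assign to result to keep the pipeline moving. A stage that never assigns simply stalls the chain.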
What went right
We shipped a “minimal viable product” extremely quickly
Working from the initial API design document, Tim got an initial version of Future out to the development team in just a couple of days, which had all of the basics working. We continued to iterate for quite a while afterwards, but we were able to start the process of bringing people up to speed quickly.
We did, in fact, eliminate “callback hell” from our code base
After the predictable learning curve, the former Java developers really took to the new asynchronous programming model. We went from “it sometimes kind of works”, to “it mostly works” in an impressively short time. Generally speaking, the Future-based code was shorter, clearer, and much easier to read. We did suffer a bit in ease of debugging, but that was as much due to the primitive debugging tools on Node as it was to the new asynchronous model.
We doubled down on our one big abstraction
Somewhat surprisingly to me, the application teams also embraced Futures. They actually re-wrote significant parts of their code to switch over to Future-based APIs at a deeper level, and to allow much more code sharing between the front end and back end of the Mail application, for example. This code re-use was on the “potential benefits” list, but it was much more of a win than anyone originally expected.
We wrote a bunch of additional libraries on top of Future, for all sorts of asynchronous tasks – file I/O, database access, network and telecoms, and the system bus (dbus) interface. Basically anything that you might have wanted to access on the platform was available as a Future-based API.
The Future-based code was very easy to reason about in the “happy path” case
One of the best things about all this is that with persistent use of Futures everywhere, you could write code that looked like this:
Most cases were a bit more complicated than that (often using inline functions), but the pattern of only handling the success case, and just letting errors propagate, was very common. And in fact, the “error” case was, as often as not, logging a message and rescheduling the task for later.
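That pattern (success-only stages, plus one error function that logs and reschedules) can be sketched roughly as follows. This is an illustration, not code from the Foundations library: SketchFuture is a small stand-in class included only so the example runs on its own, and syncContacts, dedupe, and store are hypothetical names.

```javascript
// Stand-in Future so the example runs on its own; an assumption, not the Foundations code.
class SketchFuture {
  constructor() {
    this._stages = [];
    this._state = "empty"; // "empty" | "value" | "error"
    this._pending = false;
    this._running = false;
  }
  get result() {
    if (this._state === "error") throw this._value;
    if (this._state === "empty") throw new Error("Future has no result yet");
    return this._value;
  }
  set result(value) { this._settle("value", value); }
  get exception() { return this._state === "error" ? this._value : undefined; }
  then(next, error) { this._stages.push({ next, error }); this._drain(); return this; }
  _settle(state, value) {
    this._state = state; this._value = value; this._pending = true;
    this._drain();
  }
  _drain() {
    if (this._running) return;
    this._running = true;
    while (this._pending && this._stages.length > 0) {
      const stage = this._stages.shift();
      const handler = this._state === "error" ? stage.error : stage.next;
      if (!handler) continue; // no handler for this state: skip the stage
      this._pending = false;
      try { handler(this); } catch (e) { this._settle("error", e); }
    }
    this._running = false;
  }
}

// Hypothetical happy-path stages; each passes its output on by assigning result.
const dedupe = (f) => { f.result = [...new Set(f.result.map((name) => name.toLowerCase()))]; };
const store = (f) => { f.result = f.result.length; };

const log = [];
function syncContacts(fetch) {
  const future = new SketchFuture();
  future
    .then(fetch)
    .then(dedupe)
    .then(store)
    .then(
      (f) => { log.push("synced " + f.result + " contacts"); },
      (f) => {
        // The only error function in the pipeline: log, then reschedule.
        log.push("sync failed (" + f.exception.message + "); will retry later");
        f.result = "rescheduled";
      }
    );
  future.result = "start"; // kick off the pipeline
  return future;
}

syncContacts((f) => { f.result = ["Ann", "ann", "Bob"]; });  // happy path
syncContacts(() => { throw new Error("network down"); });    // first stage fails
console.log(log);
```

In the failing run, the dedupe and store stages are skipped because they have no error function, so the exception lands in the one handler at the end.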
The all-or-nothing error propagation technique fit (most of) our use cases really well
The initial use case of the Future was for a WebOS feature called “Synergy”. This was a framework for combining data from multiple sources into a single uniform format for the applications. So, for example, you could combine your contacts from Facebook, Google, and Yahoo into a single address book list, and WebOS would automatically de-duplicate and link related contacts, and sync changes made on the phone to the proper remote service that the data originally came from. Similarly, all of your incoming e-mail went into the same “Mail” database on-device.
In a multi-stage synchronization process like this, there are all sorts of ways that the operation can fail – the remote server might be down, or the network might be flaky, or the user might decide to put the phone into airplane mode in the middle of a sync operation. In the vast majority of cases, we didn’t actually care what the error was, just that an error had occurred. When an error happened, the usual response was to leave the on-phone data the way it was, and try again later. In those cases where “fuck it, I give up” was not the right error handling strategy, the rough edges of the error-handling model were a bit easier to see.
What went wrong
The API could have been cleaner/simpler
It didn’t take long before we were adding convenience features to make some of the use cases simpler. Hence, the “whilst” function on Future, which was intended to make it easier to iterate over a function that returned Futures. There were a couple of other additions that also got a very small amount of use, and could have easily been replaced by documentation of the “right” way to do things.
Future had more complex internal state than was strictly needed
Error handling was still a bit touchy, for non-transactional cases
If you had to write code that actually cared about handling errors, then the “error” function was located in a pretty terrible place: you’d have all these happy-path “then” functions, and one error handler in the middle. Using named functions instead of anonymous inline functions helped a bit with this, but I would still occasionally get called in to help debug a thrown exception that the developer couldn’t find the source for.
It would have been really nice to have a complete stack trace for the exception that was getting re-thrown, but we unfortunately didn’t have stack traces available in both the application context and the service context. In the end, “thou shalt not throw an exception unless it’s uniquely identifiable” was almost sufficient to resolve this.
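One way to follow the “uniquely identifiable” rule is to tag each exception with the stage that raised it, so a single error function can still tell where the failure came from without a stack trace. The sketch below is an assumption-laden illustration rather than what we shipped: StageError and run are hypothetical, and SketchFuture is a small stand-in class included only so the example runs on its own.

```javascript
// Stand-in Future so the example runs on its own; an assumption, not the Foundations code.
class SketchFuture {
  constructor() {
    this._stages = [];
    this._state = "empty"; // "empty" | "value" | "error"
    this._pending = false;
    this._running = false;
  }
  get result() {
    if (this._state === "error") throw this._value;
    if (this._state === "empty") throw new Error("Future has no result yet");
    return this._value;
  }
  set result(value) { this._settle("value", value); }
  get exception() { return this._state === "error" ? this._value : undefined; }
  then(next, error) { this._stages.push({ next, error }); this._drain(); return this; }
  _settle(state, value) {
    this._state = state; this._value = value; this._pending = true;
    this._drain();
  }
  _drain() {
    if (this._running) return;
    this._running = true;
    while (this._pending && this._stages.length > 0) {
      const stage = this._stages.shift();
      const handler = this._state === "error" ? stage.error : stage.next;
      if (!handler) continue; // no handler for this state: skip the stage
      this._pending = false;
      try { handler(this); } catch (e) { this._settle("error", e); }
    }
    this._running = false;
  }
}

// Hypothetical exception type that records which stage threw it.
class StageError extends Error {
  constructor(stage, message) {
    super(message);
    this.name = "StageError";
    this.stage = stage;
  }
}

const seen = [];
function run(failAt) {
  const future = new SketchFuture();
  future
    .then((f) => {
      if (failAt === "parse") throw new StageError("parse", "bad payload");
      f.result = "parsed";
    })
    .then((f) => {
      if (failAt === "merge") throw new StageError("merge", "conflicting edits");
      f.result = "merged";
    })
    .then(
      (f) => { seen.push("ok:" + f.result); },
      (f) => {
        // The handler cannot see which stage ran last, but the exception says so.
        seen.push("failed in " + f.exception.stage);
        f.result = null;
      }
    );
  future.result = "raw";
  return future;
}

run("parse");
run("merge");
run(null);
console.log(seen);
```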
I caved on a change to the API that I should have rejected
Fairly late in the process, someone came to me and said “I don’t like the ‘magic’ in the way the result property works. People don’t expect that accessing a property will throw an exception, so you should provide an API to access the state of the Future via function calls, rather than property access”. At this point, we had dozens of people successfully using the .result API, and very little in the way of complaints about that part of the design.
I agreed to make the addition, so we could “try it out” and see whether the functional API was really easier or clearer to use.
Nobody seemed to think so, except for the person who asked for it. Since they were using it, it ended up having to stay in the implementation. And since it was in the implementation, it got documented, which just confused later users (especially third parties), who didn’t understand why there were two different ways to accomplish the same tasks.
How do I feel about this, 8 years later?
Pretty good, actually. Absent a way to see into the future, I think we made a pretty reasonable decision with the information we had available. The Bedlam team did an amazing bit of work, and WebOS got rapidly better after the big re-architecture. In the end, it was never quite enough to displace any of the major mobile OSes, but I still miss some features of Synergy, even today. After all the work Apple has done over the years to improve contact sync, it’s still not quite as good (and not nearly as open to third parties) as our solution was.