Inspired by the resurgent interest in both OT and Node.js, and by recent projects such as LakTEK's announcement of a collaborative editing tool in Node.js, I'm announcing my own little Node.js project - Pluto.
Pluto will be an Operational Transform library written in JavaScript and optimised to run on Node.js, with the aim of greatly simplifying the development of collaborative web applications. The library should be sufficiently abstract that it can handle more or less any document that you could persist in a browser (such as word processing documents, spreadsheets, drawings, game states etc.), and not be bound to any particular transport system (such as WebSockets or long polling).
Pluto's High Level Architecture
I've sketched out how I think this library could work, and how it can take the meat of the OT problem away from a developer. In essence, events from the user (onPress, onClick etc.) that a web page would normally relay to UI elements in the DOM are instead captured by a PlutoDocument object, which translates them into OT-expressible mutation commands.
These are then sent to the PlutoClient object (where the actual OT magic happens). The PlutoClient object relays back any changes that need to be made to the document, also as OT commands; these may come not just from this client but also from other clients editing the same document. Those commands are then translated into actual changes in the client's DOM.
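To make that flow a little more concrete, here is a rough sketch of how the two objects might talk to each other. Only the names PlutoDocument and PlutoClient come from the description above; every method name, the operation shape and the `send` callback are assumptions made up for illustration, the OT logic itself is left out, and a plain string stands in for the DOM so the sketch runs under Node.

```javascript
// Everything below is a hypothetical sketch: the method names, the operation
// shape and the `send` callback are assumptions, not Pluto's actual API.

class PlutoClient {
  constructor(send) {
    this.send = send;          // transport-agnostic: any function will do
    this.listeners = [];
  }
  onChange(listener) { this.listeners.push(listener); }
  // Called by the document with a local mutation.
  submit(op) {
    this.send(op);             // forward to the server (OT logic omitted)
    this.emit(op);             // relay back so the document can render it
  }
  // Called by the transport when another client's mutation arrives.
  receive(op) { this.emit(op); }
  emit(op) { this.listeners.forEach((listener) => listener(op)); }
}

class PlutoDocument {
  constructor(client) {
    this.text = '';            // stands in for the DOM in this sketch
    this.client = client;
    client.onChange((op) => this.render(op));
  }
  // A UI event handler would call this instead of touching the DOM directly.
  handleKeyPress(pos, char) {
    this.client.submit({ type: 'insert', pos, text: char });
  }
  // Translate an operation into an actual change to the document.
  render(op) {
    this.text = this.text.slice(0, op.pos) + op.text + this.text.slice(op.pos);
  }
}

const sent = [];
const client = new PlutoClient((op) => sent.push(op));
const doc = new PlutoDocument(client);
doc.handleKeyPress(0, 'H');
doc.handleKeyPress(1, 'i');
client.receive({ type: 'insert', pos: 2, text: '!' });  // from another client
console.log(doc.text, sent.length);
```

The point of the sketch is the direction of the arrows: the document never edits itself directly, it only renders operations handed back by the client, whether they originated locally or remotely.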
Why care about Operational Transform?
If you've taken a look at how Google Wave works, you'll have come across the concept of Operational Transform. And if you've used Google Wave you'll see why it kicks ass. Basically, OT is a mechanism for keeping complex documents in sync across different locations, without locking. This last part is crucial, because it means that lots of people can make large numbers of changes simultaneously on a document and not have to stop working because someone else is also making changes at the same time.
Although Google Wave is geared for managing Waves, the underlying OT implementation is actually more generic - and can be used to collaboratively manage almost any kind of structured document. The intent behind Pluto is to provide this low level OT management functionality, while leaving the binding of it to a specific type of document up to the application developer.
Two other great features of OT follow from the way it expresses documents as a series of mutations (a kind of collaborative command pattern). First, it's fairly straightforward to add history support to systems that use it. Second, using a technique called composition you can compile a sequence of many small individual OT operations into larger, coarser-grained operations that are still equivalent. This means that while you can collaboratively edit a document character by character, you don't necessarily need to store or relay each character change out to every other client.
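As a rough illustration of composition, the sketch below merges a run of single-character inserts into one larger insert and checks that both versions produce the same document. The operation shape `{ type, pos, text }` and the function names are assumptions made up for this example, not Pluto's format, and only the simplest case (each insert starting where the previous one ended) is handled.

```javascript
// Hypothetical operation shape (an assumption, not Pluto's format):
//   { type: 'insert', pos: <index into the document>, text: <string> }

// Apply a single insert to a plain string document.
function applyOp(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Compose a sequence of inserts, merging each op into the previous one
// when it starts exactly where the previous insert ended.
function compose(ops) {
  const out = [];
  for (const op of ops) {
    const last = out[out.length - 1];
    if (last && op.pos === last.pos + last.text.length) {
      last.text += op.text;   // extend the previous insert
    } else {
      out.push({ ...op });    // copy so the input is not mutated
    }
  }
  return out;
}

// Three keystrokes, as they might be captured one character at a time.
const typed = [
  { type: 'insert', pos: 0, text: 'c' },
  { type: 'insert', pos: 1, text: 'a' },
  { type: 'insert', pos: 2, text: 't' },
];
const composed = compose(typed);

// Both sequences should yield the same document when applied to ''.
const viaTyped = typed.reduce(applyOp, '');
const viaComposed = composed.reduce(applyOp, '');
console.log(composed.length, viaTyped, viaComposed);
```

Here three operations collapse into one, so only a single insert of "cat" would need to be stored or relayed.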
OT is a big field and it's been around for some time - there are many different applications of the algorithm that vary in performance characteristics. The mechanism that Wave (and Pluto) follows is a client-server implementation where the server holds a canonical document state, and the clients are responsible for staying up to date with it. This pushes a whole bunch of the grunt work onto the client, which is OK since it's much easier for a single client to figure out how to catch up to the server than for a server to keep track of the state of many clients. This will help Pluto stay true to its efficiency goal.
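The primitive that lets a client catch up is the transform function, which rewrites one operation so it can be applied after a concurrent one. The sketch below handles only concurrent inserts on a plain string; the operation shape `{ pos, text, site }`, the tie-breaking rule and the function names are assumptions for illustration, not how Wave or Pluto define them.

```javascript
// Hypothetical operation shape (an assumption): { pos, text, site }
// `site` is an id used only to break ties when two inserts share a position.

function applyInsert(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after the concurrent insert `other`.
function transformInsert(op, other) {
  const otherGoesFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherGoesFirst
    ? { ...op, pos: op.pos + other.text.length }  // shift right past `other`
    : { ...op };                                  // unaffected by `other`
}

const base = 'helo';
const a = { pos: 3, text: 'l', site: 1 };  // client A fixes the typo
const b = { pos: 4, text: '!', site: 2 };  // client B appends punctuation

// A's order: apply a, then b transformed against a.
const docA = applyInsert(applyInsert(base, a), transformInsert(b, a));
// B's order: apply b, then a transformed against b.
const docB = applyInsert(applyInsert(base, b), transformInsert(a, b));
console.log(docA, docB);
```

Both orders end at the same text, which is the convergence property the whole scheme depends on. A real implementation would also need deletes, and a client would transform its pending operations against everything the server has accepted since it last synced.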
For a detailed theoretical explanation of how this type of OT works, I strongly recommend Daniel Spiewak's brilliant introduction to Operational Transformation.
Why Node.js?
Node.js, if you haven't heard, is an event-driven JavaScript runtime environment. An oversimplified description would be to say it's Google's V8 engine ripped out of Chrome, with some custom extensions that allow it to talk to things like TCP sockets and local file systems instead of a browser's DOM.
Any potentially blocking operation (such as reading from a disk) is accessed through Node via a callback. So instead of writing code like:
// Blocking style: each call waits for the previous one to finish
const fileContents = fm.readFile("/somefile.txt");
const processedContents = fp.Process(fileContents);
You typically express it like:
fm.readFile("/somefile.txt", function(fileContents) {
  fp.Process(fileContents, function(processedContents) {
    // Do more stuff..
  });
});
These aren't real code examples but you get the idea. The advantage of using Node and this sequence of callbacks is that Node will handle the queuing and memory allocation while the process is waiting, which means that in situations where large numbers of operations are requesting access to a shared resource, that access can be handled much more efficiently.
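For comparison, a runnable version of the same pattern using Node's real `fs.readFile` might look like the following. The `processContents` step, the `readAndProcess` wrapper and the temp file name are assumptions invented for this example; only the `fs`, `os` and `path` calls are actual Node APIs.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical processing step (an assumption for illustration): upper-cases
// the text and reports it through a Node-style (err, result) callback.
function processContents(contents, callback) {
  setImmediate(() => callback(null, contents.toUpperCase()));
}

// Read a file, then process it, without blocking the event loop.
function readAndProcess(file, callback) {
  fs.readFile(file, 'utf8', (err, fileContents) => {
    if (err) return callback(err);   // pass read errors to the caller
    processContents(fileContents, callback);
  });
}

// Usage: write a small temp file first so the example is self-contained.
const file = path.join(os.tmpdir(), `pluto-example-${process.pid}.txt`);
fs.writeFileSync(file, 'hello node');
readAndProcess(file, (err, processed) => {
  if (err) throw err;
  console.log(processed);
});
```

Note the error-first callback convention: Node passes any error as the first argument rather than throwing, so each step has to check it before carrying on.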
The JavaScript language is naturally well suited to this style of writing code, and of course it makes sense for concurrent servers to be event driven.
There's a lot of choice for concurrent server frameworks, such as Twisted, Tornado and EventMachine, many of which are far more mature. But apart from the points above, Node is a standout candidate for this project for two reasons.
Reason 1 - Node's concurrency performance is awesome
I won't get drawn into the inevitable benchmarking flamewar - I'm not qualified - but these early stats look promising.
Efficient concurrency is obviously crucial to a project that enables collaboration - and it means we can hopefully delay worrying about scalability.
Reason 2 - It's written in JavaScript
OT is an algorithm that relies strongly on complex complementary operations being performed on both the client and the server. So it helps a lot to be able to write the same code to deploy on the client and the server. Given the number of browser JS inconsistencies I might be naive in this view, but it's a start.
(As an aside, Google Wave solved this problem by going the other way - they wrote their OT implementation in Java and compile those Java libraries to JavaScript using the excellent Google Web Toolkit.)
Design goals
The exact implementation will no doubt vary as we go, but will stay true to the following key design goals.
- Be efficient by optimising application design for non-blocking concurrency and pushing as much concurrent state management as possible to the client. This should allow a single server to handle a large number of clients mutating the same document, without the need for federation.
- Support document playback
- Be independent of transport and persistence (although we'll probably focus on key/value support for the latter)
- Be largely independent of the document being worked on - such that the library can be utilised for a wide range of applications
- Be open source (probably an MIT-style license)
The expected deliverables of the project will be:
- A comprehensively tested library that can support OT between browsers on a wide range of applications, independent of transport mechanism and storage
- A reference implementation of the above, that includes a working application (probably a document editor of some kind), as well as support for transport
What next?
This project is a long way from completion - there's not even a repo for it yet. But I'm announcing this now because I want your help to get this moving.
If you're thinking about building something similar, I'd love to hear from you so that hopefully we can join forces - or at least swap war stories.
If you're just curious or enthusiastic about the project, comment, wave, tweet or drop me a line to show your support (which will hopefully inspire me to give up more weekends to devote to this project).
If you like making cute logos, this project needs one.
And if you're interested in sponsoring the development of a tool like this, I'd obviously love to hear from you.
And of course, if you like it, spread the word.