A device driver 'DSL' using core.async
One of the killer features of Livespaces has always been its ability to automate control of many disparate hardware devices in our meeting rooms and labs. For example, a Livespace meta app can set up a room for a meeting by setting the level of the room’s lighting, resetting audio mixer levels, powering up the front displays and setting their default inputs, and then loading up a presentation on one display, all the while playing a pleasing background humming noise just to let you know how happy it is to do all this for you.
Livespaces acts like an operating system for a roomful of hardware and software, and, like an operating system, it has a plugin device driver architecture. Livespaces currently has drivers for audio amplifiers, echo cancellers, audio/video switching matrices, video cameras, LCD displays, electrical control buses (lights, etc.), projectors, fibre switching matrices, servo actuators, and more.
Which is all very froody, except that these drivers have been written by a variety of developers on an ad hoc basis over nearly a decade now, and I think it’s fair to say the word ‘ghetto’ begins to describe the quality of the code as it stands today.
Ad Hoc’ery
Up until recently we’ve dealt with this by simply avoiding looking at it too closely, but when a ‘must not fail’ demonstration did in fact fail, due in large part to the drivers for the big screen LCDs deadlocking the Livespace server’s main event loop and thus bringing everything else to a screeching halt, we began to realise that Something Must Be Done.
As well as the drivers being of variable quality, they are also very ad hoc, re-implementing their own communication layers, adhering to various multi-threading disciplines (including the ever-popular ‘optimistic’ threading control, i.e. none), and having no standard error detection and handling behaviour.
That last one is super important when dealing with large amounts of hardware: in that environment what can go wrong, will … randomly. It’s crucial that drivers keep on going as well as they can in the face of failure, while flagging the problem loud and clear for the poor sod assigned to run the room.
In fact, only a very small amount of code should typically be needed for any given device driver. Manufacturers typically design very simple device-control protocols that just allow you to get & set properties on the device, and the only variations are the framing of messages (including things like headers, checksums, etc.) and the actual codes needed to read or write various states to and from the device.
For example, an LCD display entity might have just these properties:
Property | Values |
power | true, false |
input | dvi, video, s-video, hdmi, etc |
And an HD video camera might have these properties:
Property | Values |
power | true, false |
focus | auto, manual |
focus-position | integer |
auto-exposure | auto, manual, shutter-priority, iris-priority, bright |
The rest of the device control stack is about communicating with the device (typically over IP or serial) and representing the device as a Livespace entity. A Livespace entity is a data object containing name/value pairs which is replicated across any number of hosts in a room. A device entity publishes the hardware’s state, and allows clients to control the device by changing properties of the entity.
Async
We often need new drivers in a hurry when putting together new equipment, and the person doing the integration is often not a software developer. Since drivers are so conceptually simple, we wanted to try to see if we could make writing new drivers accessible to technically-proficient non-developers. A driver Domain-Specific Language (DSL) immediately came to mind.
I had gotten as far as making a design based on expect-style interactions to drive a state machine, but then I serendipitously read another great article on Clojure’s core.async library, and the reminder of core.async’s use of state machines behind the scenes lit up a few connections: core.async would be perfect for writing imperative driver code rather than having to juggle event notifications and track the state explicitly (and deal with the inevitable heisenbugs that this approach would bring).
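To make the "imperative driver code" idea concrete, here is a minimal sketch of an expect-style request/response written as straight-line code in a go block. The request function, the channel names and the timeout value are all hypothetical and are not taken from the Livespaces code; the only real API used is core.async itself.

```clojure
(require '[clojure.core.async :refer [go chan >! <! <!! alts! timeout]])

(defn request
  "Hypothetical helper: sends message on out, then waits up to timeout-ms
  for a reply on in. Returns a channel yielding [:complete reply] or
  [:timeout message]."
  [out in message timeout-ms]
  (go
    (>! out message)
    ;; Wait for whichever comes first: a reply or the timeout.
    (let [[reply port] (alts! [in (timeout timeout-ms)])]
      (if (= port in)
        [:complete reply]
        [:timeout message]))))

;; A stand-in "device" that answers the first message with :ack.
(let [out (chan)
      in  (chan)]
  (go (when-some [_ (<! out)]
        (>! in :ack)))
  (<!! (request out in [:power-inquiry] 1000)))
;; => [:complete :ack]
```

The send, the wait and the timeout handling read top to bottom, with no callbacks and no explicit state variable; core.async turns the go block into a state machine behind the scenes.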
Design
The top-level design I came up with was:
The Device component keeps a presentation entity in sync with the hardware by creating a core.async transport channel (over IP or serial), and having the driver talk to the hardware over it. The Device initially reads the state of the hardware into the entity, publishes it, then listens for changes to the entity and syncs them back to the hardware. If any error occurs, the entity is un-published until it’s resolved.
All of this is done over core.async channels. Messages over the transport, property change notifications from the entity, timeouts and the shutdown signal: all channels.
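As a rough illustration of what "all channels" buys you, the sketch below waits on a change channel, a shutdown channel and a timeout in a single alts! call. It is not the actual Device code: sync-loop, apply-change!, poll! and the return values are hypothetical names chosen for this example.

```clojure
(require '[clojure.core.async
           :refer [go-loop chan alts! timeout close! >!! <!!]])

(defn sync-loop
  "Hypothetical sketch: applies each entity change with apply-change!
  until shutdown delivers or closes, calling poll! whenever nothing
  has happened for idle-ms milliseconds."
  [changes shutdown apply-change! poll! idle-ms]
  (go-loop []
    ;; :priority true makes shutdown win if several channels are ready.
    (let [[value port] (alts! [shutdown changes (timeout idle-ms)]
                              :priority true)]
      (cond
        (= port shutdown) :shutdown
        (= port changes)  (if (nil? value)
                            :changes-closed
                            (do (apply-change! value) (recur)))
        :else             (do (poll!) (recur))))))

;; Usage: two property changes, then the change channel closes.
(let [changes  (chan 10)
      shutdown (chan)
      applied  (atom [])
      done     (sync-loop changes shutdown #(swap! applied conj %)
                          (fn [] nil) 1000)]
  (>!! changes [:power true])
  (>!! changes [:input :hdmi])
  (close! changes)
  [(<!! done) @applied])
;; => [:changes-closed [[:power true] [:input :hdmi]]]
```

Because timeouts and shutdown are just more channels, adding another event source means adding one more entry to the alts! vector rather than another event handler.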
Transport
The transport is a fairly obvious bridging of a Netty channel to a core.async channel. The core.async channel initiates a network connection on first access, then syncs messages to/from Netty and core.async, closing the channel when done.
Driver
The driver is defined by a set of properties implementing the IProperty Clojure protocol:
    (defprotocol IProperty
      "A device driver property. The property has an access mode
      (read-only/read-write/write-only/constant) and fn's to asynchronously
      read and write the property value."

      (property-name [this]
        "The name (keyword) of the property.")

      (property-mode [this]
        "The change mode: :rw (read-write), :ro (read-only), :wo
        (write-only), :const (constant).")

      (get-property-value [this channel]
        "Async get the property value from channel, returning a channel
        with one of [:complete value], or [error-code error-message].")

      (set-property-value [this channel value]
        "Async set property to value, returning a channel containing
        [:complete] or [error-code error-message].")

      (affects-properties [this]
        "A seq of property names whose values may change when this
        property changes."))
Essentially each property has a name, access mode (read/write) and methods to read/write values from/to a channel. Most drivers will use the default IProperty implementation, which uses pluggable codecs for encoding commands and property values. You’ll see detail on that later in the example drivers.
So writing a driver usually boils down to writing a codec for encoding commands to the device and decoding responses: the other details — transport, error handling, entity sync — all come for free. And, as you’ll see later, I think it’s not unreasonable to ask non-developer systems integrators to learn enough of the Clojure ‘DSL’ for drivers (mostly by example) that they can develop new ones themselves.
Device
The Device implementation is where it all comes together, and where core.async really makes a radical difference.
Here’s the state diagram for the Device:
Imagine writing that as an explicit state machine driven by four or five event handlers, with blocking network IO and multi-thread locking. In fact I don’t have to: that is essentially how the old code worked.
In the new system, each arrow is an async channel read, and the code itself makes perfect sense when read sequentially. It’s still not simple, but I think it’s close to being as simple as it can be.
Example Driver
To get a feel for how a driver is specified in this approach, here’s a look at the driver for the Sony FCB H11 block camera, a professional-grade HD video camera that we use for video conferencing. Here’s the core of the driver:
    (defn make-sony-fcb-driver
      "Sony FCB H11 camera driver. Camera number starts at 1, will only be
      higher when using daisy-chained cameras. Returns an IDriver
      implementation.

      Product page: http://pro.sony.com/bbsc/ssr/product-FCBH11/"
      [camera-number]
      (let [properties (make-fcb-properties camera-number)]
        (reify IDriver
          (device-type [this] "camera")

          (init-device-channel [this channel]
            (sentinel-delimited-channel channel 0xff))

          (all-properties [this] properties))))

    (defn make-fcb-codec [camera-number]
      (reify ICommandCodec
        (encode-update [this property value]
          [(+ 0x80 camera-number) 0x01 0x04 (:command-code property)
           (encode-value property value)])

        (encode-inquiry [this property]
          [(+ 0x80 camera-number) 0x09 0x04 (:command-code property)])

        (decode-response [this property [camera code & params]]
          (let [camera (- (bit-shift-right camera 4) 8)]
            (if (= camera camera-number)
              (case (bit-shift-right code 4)
                4 [:ack]
                5 (let [result (if params (decode-value property params))]
                    (if (vector? result)
                      result
                      [:complete result]))
                6 [:error (error-description (first params))]
                [:protocol-error (str "Unknown response code: " code)])
              [:protocol-error (str "Invalid camera ID: " camera)])))))
In make-sony-fcb-driver’s IDriver implementation we return a set of properties (created by make-fcb-properties, shown next), all of which use the same codec (created by make-fcb-codec). The driver also wraps the raw transport channel with one that merges incoming blobs of data into messages terminated by 0xFF, and adds that message delimiter to outgoing messages.
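The framing step can be sketched as two pure functions. This is only an illustration of sentinel-delimited framing, not the real sentinel-delimited-channel: split-frames and frame-message are hypothetical names, and the choice to strip the terminator from decoded frames is an assumption.

```clojure
(defn split-frames
  "Splits buffer (a seq of ints 0-255) into complete frames terminated
  by sentinel. Returns [frames remainder], where remainder holds the
  bytes of a trailing, not yet terminated frame.
  Assumption: the terminator itself is dropped from each frame."
  [buffer sentinel]
  (loop [frames [] current [] remaining (seq buffer)]
    (if-let [[b & more] remaining]
      (if (= b sentinel)
        (recur (conj frames current) [] more)
        (recur frames (conj current b) more))
      [frames current])))

(defn frame-message
  "Appends the sentinel to an outgoing message."
  [message sentinel]
  (conj (vec message) sentinel))

;; Two complete messages followed by the start of a third.
(split-frames [0x90 0x41 0xff 0x90 0x51 0xff 0x90] 0xff)
;; => [[[144 65] [144 81]] [144]]

(frame-message [0x81 0x09 0x04 0x00] 0xff)
;; => [129 9 4 0 255]
```

A channel wrapper would keep the remainder between reads and prepend it to the next blob of incoming data, so a message split across two network reads is still delivered whole.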
The make-fcb-codec function is the core of the driver: it encodes state inquiry & update commands, and decodes responses according to the documented protocol for Sony FCB cameras. This protocol is simply a fixed header (0x80 + camera-number), followed by 0x01 0x04 for set value (encode-update) or 0x09 0x04 for get value (encode-inquiry), then the specific code for the property, and the new value (if it’s a set command). The response (parsed by decode-response) is headed by the camera number, success/error code, and parameters containing the property value if it’s a response to an inquiry.
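As a worked example of the byte layout just described, the standalone functions below mirror the arithmetic in make-fcb-codec without depending on ICommandCodec. The function names are hypothetical, the value bytes are purely illustrative, and appending the value bytes flat onto the message is an assumption about what encode-value produces.

```clojure
(defn inquiry-message
  "Header, 0x09 0x04 (get value), then the property's command code."
  [camera-number command-code]
  [(+ 0x80 camera-number) 0x09 0x04 command-code])

(defn update-message
  "Header, 0x01 0x04 (set value), command code, then the value bytes."
  [camera-number command-code value-bytes]
  (into [(+ 0x80 camera-number) 0x01 0x04 command-code] value-bytes))

(defn classify-response
  "Pulls the camera number and response kind out of a response, using
  the same bit arithmetic as decode-response."
  [[header code & params]]
  {:camera (- (bit-shift-right header 4) 8)
   :kind   (case (bit-shift-right code 4)
             4 :ack
             5 :complete
             6 :error
             :unknown)
   :params params})

;; Inquiry for the :power property (command code 0x00) on camera 1.
(inquiry-message 1 0x00)
;; => [129 9 4 0], i.e. 0x81 0x09 0x04 0x00

;; A response whose header is 0x90 and whose code is 0x50.
(classify-response [0x90 0x50 0x02])
;; => {:camera 1, :kind :complete, :params (2)}
```

The header 0x90 shifts right to 9, and subtracting 8 gives camera 1; the code 0x50 shifts right to 5, which the codec treats as a completion carrying the property value in the remaining bytes.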
The codec works with messages represented as Clojure integer sequences, so we can use Clojure’s pattern matching and destructuring to slice and dice the incoming messages, and simply write a sequence literal when composing them in the encode-* methods: compare that to fiddling with Java ByteBuffers. This also avoids the problems caused by bytes being signed in Java, making them a PITA to work with directly. For an example of how this is nice, have a look at the decode-response method’s use of destructuring to pull out the camera and response code from an incoming message.
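The signed-byte problem is easy to see at the REPL. The sketch below is not from the driver; raw-bytes and bytes->ints are hypothetical names, and masking with 0xff is simply the usual way to recover the unsigned value.

```clojure
;; A Java byte array as it might arrive from the network.
(def raw-bytes
  (byte-array (map unchecked-byte [0x90 0x50 0x02])))

;; Java bytes are signed, so 0x90 shows up as a negative number.
(vec raw-bytes)
;; => [-112 80 2]

(defn bytes->ints
  "Converts signed Java bytes to unsigned ints in the range 0-255."
  [raw]
  (mapv #(bit-and % 0xff) raw))

(bytes->ints raw-bytes)
;; => [144 80 2]

;; With plain ints, destructuring picks the message apart directly.
(let [[header code & params] (bytes->ints raw-bytes)]
  [header code params])
;; => [144 80 (2)]
```

Doing the conversion once at the transport boundary means the codecs never see a negative byte, so comparisons against hex literals like 0x90 behave as expected.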
Driver Properties
Below are the properties of the driver. Notice that a number of properties specify their possible values via inline Clojure maps, which the property-table function turns into value codecs for us. You can also implement custom codecs, like the pqrs-codec which reads and writes a 16-bit value spread over four bytes (no, I have no idea why Sony chose that encoding either).
    (defn make-fcb-properties [camera-number]
      (property-table (make-fcb-codec camera-number)
        [:name "Sony FCB H11 Camera"]

        ;; CAM_Power
        [:power :rw 0x00 boolean-codec]

        ;; CAM_Focus (p38) / CAM_FocusModeInq. "auto/manual" (0x10) is
        ;; write-only, toggles focus between auto and manual.
        [:focus :rw 0x38 auto-manual-codec]

        ;; CAM_FocusPos. Can only set focus pos when in manual mode,
        ;; otherwise get "command not executable".
        [:focus-pos :rw 0x48 pqrs-codec]

        ;; CAM_SpotAE
        [:spot-auto-exposure :rw 0x59 boolean-codec]

        [:auto-exposure :rw 0x39
         {:auto 0x00 :manual 0x03 :shutter-priority 0x0A
          :iris-priority 0x0B :bright 0x0D}]

        ;; CAM_WB / CAM_WBModeInq.
        [:white-balance :rw 0x35
         {:normal-auto 0x00 :indoor-mode 0x01 :outdoor-mode 0x02
          :one-push 0x03 :auto-trace 0x04 :manual 0x05
          :outdoor-auto 0x06 :sodium-auto-lamp 0x07 :sodium-lamp 0x08}]

        ;; CAM_WB One Push Trigger. Changes whitebalance to manual.
        [:white-balance-trigger :wo 0x10 {true 0x05} [:white-balance]]

        ;; CAM_Gain. Cannot always set when autoexposure not manual (?)
        [:gain :rw 0x4C pqrs-codec]

        ;; CAM_RGain
        [:r-gain :rw 0x43 pqrs-codec]

        ;; Cam_BGain
        [:b-gain :rw 0x44 pqrs-codec]

        ;; Cam_Shutter
        [:shutter :rw 0x4A pqrs-codec]

        ;; Cam_Iris
        [:iris :rw 0x4B pqrs-codec]

        ;; CAM_SlowShutterMode
        [:slow-shutter :rw 0x5A auto-manual-codec]))
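To give a feel for what the value codecs in this table might do, here is a minimal sketch of two of them. Neither is the real implementation: map-codec, encode-pqrs and decode-pqrs are hypothetical names, and the pqrs layout (one 4-bit nibble per byte, most significant first) is an assumption about how the 16-bit value is spread over four bytes.

```clojure
(defn map-codec
  "Builds encode/decode lookups from an inline value map such as
  {:auto 0x00 :manual 0x03}."
  [table]
  {:encode table
   :decode (into {} (map (fn [[k v]] [v k])) table)})

(def auto-exposure
  (map-codec {:auto 0x00 :manual 0x03 :shutter-priority 0x0A
              :iris-priority 0x0B :bright 0x0D}))

((:encode auto-exposure) :manual)   ;; => 3
((:decode auto-exposure) 0x0B)      ;; => :iris-priority

(defn encode-pqrs
  "Splits a 16-bit value into four nibbles, one per byte (assumed layout)."
  [value]
  (mapv #(bit-and (bit-shift-right value %) 0x0f) [12 8 4 0]))

(defn decode-pqrs
  "Reassembles a 16-bit value from four nibble-carrying bytes."
  [[p q r s]]
  (bit-or (bit-shift-left p 12)
          (bit-shift-left q 8)
          (bit-shift-left r 4)
          s))

(encode-pqrs 0x1234)        ;; => [1 2 3 4]
(decode-pqrs [1 2 3 4])     ;; => 4660, i.e. 0x1234
```

A map-based codec covers every enumerated property in the table with no extra code per driver, which is a large part of why the property list can stay almost purely declarative.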
The whole driver is 128 lines of Clojure code, down from over 1,000 in the original Java implementation.
A Driver DSL For Non Programmers?
I think the sort of code shown here should make sense to non-programmers, especially after they’ve had a look at existing example drivers. It requires very little ‘real’ Clojure coding, no Java coding, and the user doesn’t need to spin up a complex development environment. And as more device drivers get written in the new style, I expect we’ll recognise higher-level patterns and make the process even simpler.
This is early days, and we’re not yet sure whether this will appeal enough that systems integrators will actually write their own drivers. One thing I can say for sure now, though, is that it’s already making it easier and far more enjoyable for me!