A device driver 'DSL' using core.async.

One of the killer features of Livespaces has always been its ability to automate control of many disparate hardware devices in our meeting rooms and labs. For example, a Livespace meta app can set up a room for a meeting by setting the level of the room’s lighting, resetting audio mixer levels, powering up the front displays and setting their default inputs, and then loading up a presentation on one display, all the while playing a pleasing background humming noise just to let you know how happy it is to do all this for you.

Livespaces acts like an operating system for a roomful of hardware and software, and, like an operating system, it has a plugin device driver architecture. Livespaces currently has drivers for audio amplifiers, echo cancellers, audio/video switching matrices, video cameras, LCD displays, electrical control buses (lights, etc.), projectors, fibre switching matrices, servo actuators, and more.

Which is all very froody, except that these drivers have been written by a variety of developers on an ad hoc basis over nearly a decade now, and I think it’s fair to say the word ‘ghetto’ begins to describe the quality of the code as it stands today.

Ad Hoc’ery

Up until recently we’ve dealt with this by simply avoiding looking at it too closely, but when a ‘must not fail’ demonstration did in fact fail, due in large part to the drivers for the big screen LCDs deadlocking the Livespace server’s main event loop and thus bringing everything else to a screeching halt, we began to realise that Something Must Be Done.

As well as the drivers being of variable quality, they are also very ad hoc, re-implementing their own communication layers, adhering to various multi-threading disciplines (including the ever-popular ‘optimistic’ threading control, i.e. none), and having no standard error detection and handling behaviour.

That last one is super important when dealing with large amounts of hardware: in that environment what can go wrong, will … randomly. It’s crucial that drivers keep on going as well as they can in the face of failure, while flagging the problem loud and clear for the poor sod assigned to run the room.

In fact, only a very small amount of code should typically be needed for any given device driver. Manufacturers typically design very simple device-control protocols that just allow you to get & set properties on the device, and the only variations are the framing of messages (including things like headers, checksums, etc.) and the actual codes needed to read or write various states to and from the device.

For example, an LCD display entity might have just these properties:

Property   Values
power      true, false
input      dvi, video, s-video, hdmi, etc.

And an HD video camera might have these properties:

Property        Values
power           true, false
focus           auto, manual
focus-position  integer
auto-exposure   auto, manual, shutter-priority, iris-priority, bright

The rest of the device control stack is about communicating with the device (typically over IP or serial) and representing the device as a Livespace entity. A Livespace entity is a data object containing name/value pairs which is replicated across any number of hosts in a room. A device entity publishes the hardware’s state, and allows clients to control the device by changing properties of the entity.
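To make the entity idea concrete, here is a minimal sketch that models an entity as a plain Clojure map and works out which properties a client has changed. This is an illustration only: the real Livespace entity API is not shown in this article, and the function name changed-properties is hypothetical.

```clojure
(defn changed-properties
  "Hypothetical helper: returns the name/value pairs in new-state that
  differ from (or are missing in) old-state."
  [old-state new-state]
  (into {}
        (remove (fn [[k v]] (= v (get old-state k ::absent))))
        new-state))

;; A client switches the display on; only :power needs to go to the hardware.
(changed-properties {:power false :input :hdmi}
                    {:power true  :input :hdmi})
;; => {:power true}
```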

Async

We often need new drivers in a hurry when putting together new equipment, and the person doing the integration is often not a software developer. Since drivers are so conceptually simple, we wanted to try to see if we could make writing new drivers accessible to technically-proficient non-developers. A driver Domain-Specific Language (DSL) immediately came to mind.

I had gotten as far as making a design based on expect-style interactions to drive a state machine, but then I serendipitously read another great article on Clojure’s core.async library, and the reminder of core.async’s use of state machines behind the scenes lit up a few connections: core.async would be perfect for writing imperative driver code rather than having to juggle event notifications and track the state explicitly (and deal with the inevitable heisenbugs that this approach would bring).
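As a rough illustration of what "imperative" means here, the sketch below writes an expect-style exchange (send a message, wait for the reply or a timeout) as straight-line code inside a go block. It assumes core.async is on the classpath. The names request!, to-device and from-device are hypothetical and are not part of the Livespaces code.

```clojure
(require '[clojure.core.async :as async :refer [chan go >! <! alts! timeout]])

(defn request!
  "Hypothetical helper: puts msg on to-device, then waits up to timeout-ms
  for a reply on from-device. Returns a channel that yields
  [:complete reply] or [:timeout nil]."
  [to-device from-device msg timeout-ms]
  (go
    (>! to-device msg)
    (let [[reply port] (alts! [from-device (timeout timeout-ms)])]
      (if (= port from-device)
        [:complete reply]
        [:timeout nil]))))

;; A fake device that answers a single message with [:ack msg].
(let [to-device   (chan)
      from-device (chan)]
  (go (>! from-device [:ack (<! to-device)]))
  (async/<!! (request! to-device from-device [0x81 0x09 0x04 0x00] 1000)))
;; => [:complete [:ack [129 9 4 0]]]
```

There is no explicit "waiting for reply" state to track: the go macro turns the sequential code into a state machine for us.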

Design

The top-level design I came up with was:

[Figure: Design for async drivers]

The Device component keeps a presentation entity in sync with the hardware by creating a core.async transport channel (over IP or serial), and having the driver talk to the hardware over it. The Device initially reads the state of the hardware into the entity, publishes it, then listens for changes to the entity and syncs them back to the hardware. If any error occurs, the entity is un-published until it’s resolved.

All of this is done over core.async channels. Messages over the transport, property change notifications from the entity, timeouts and the shutdown signal: all channels.
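Once everything is a channel, a single loop can wait on all of them at once. The following is a minimal sketch of that idea, not the actual Device code: run-device-loop, the map keys and the handle! callback are all hypothetical names, and it assumes core.async is on the classpath.

```clojure
(require '[clojure.core.async :as async :refer [chan go-loop alts! timeout close!]])

(defn run-device-loop
  "Hypothetical event loop. Waits on whichever channel is ready first and
  passes [event-type payload] to handle!. Returns a channel that yields
  :stopped on shutdown or :transport-closed if the transport goes away."
  [{:keys [transport changes shutdown poll-ms]} handle!]
  (go-loop []
    ;; :priority true means shutdown is always checked first.
    (let [[v port] (alts! [shutdown transport changes (timeout poll-ms)]
                          :priority true)]
      (cond
        (= port shutdown)  :stopped
        (= port transport) (if (nil? v)
                             :transport-closed
                             (do (handle! :message v) (recur)))
        (= port changes)   (if (nil? v)
                             :stopped
                             (do (handle! :property-change v) (recur)))
        ;; Otherwise the timeout fired: a natural place to poll the hardware.
        :else              (do (handle! :poll nil) (recur))))))
```

Closing the shutdown channel (or putting any value on it) ends the loop, which is considerably easier to reason about than interrupting a thread blocked on network IO.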

Transport

The transport is a fairly obvious bridging of a Netty channel to a core.async channel. The core.async channel initiates a network connection on first access, then syncs messages to/from Netty and core.async, closing the channel when done.

Driver

The driver is defined by a set of properties implementing the IProperty Clojure protocol:

(defprotocol IProperty
  "A device driver property. The property has an access
  mode (read-only/read-write/write-only/constant) and fn's to
  asynchronously read and write the property value."

  (property-name [this]
    "The name (keyword) of the property.")

  (property-mode [this]
    "The change mode: :rw (read-write), :ro (read-only), :wo (write-only),
     :const (constant).")

  (get-property-value [this channel]
    "Async get the property value from channel, returning a channel
     with one of [:complete value], or [error-code error-message].")

  (set-property-value [this channel value]
    "Async set property to value, returning a channel containing
     [:complete] or [error-code error-message].")

  (affects-properties [this]
    "A seq of property names whose values may change when this
    property changes."))

Essentially each property has a name, access mode (read/write) and methods to read/write values from/to a channel. Most drivers will use the default IProperty implementation which uses pluggable codecs for encoding commands and property values — you’ll see detail on that later in the example drivers.

So writing a driver usually boils down to writing a codec for encoding commands to the device and decoding responses: the other details — transport, error handling, entity sync — all come for free. And, as you’ll see later, I think it’s not unreasonable to ask non-developer systems integrators to learn enough of the Clojure ‘DSL’ for drivers (mostly by example) that they can develop new ones themselves.

Device

The Device implementation is where it all comes together, and where core.async really makes a radical difference.

Here’s the state diagram for the Device:

[Figure: State diagram for a device]

Imagine writing that as an explicit state machine driven by four or five event handlers, with blocking network IO and multi-thread locking. In fact I don’t have to: that is essentially how the old code worked.

In the new system, each arrow is an async channel read, and the code itself makes perfect sense when read sequentially. It’s still not simple, but I think it’s close to being as simple as it can be.

Example Driver

To get a feel for how a driver is specified in this approach, here’s a look at the driver for the Sony FCB H11 block camera, a professional-grade HD video camera that we use for video conferencing. Here’s the core of the driver:

(defn make-sony-fcb-driver
  "Sony FCB H11 camera driver. Camera number starts at 1, will only be
  higher when using daisy-chained cameras. Returns an IDriver
  implementation.

  Product page: http://pro.sony.com/bbsc/ssr/product-FCBH11/"
  [camera-number]
  (let [properties (make-fcb-properties camera-number)]
    (reify
      IDriver
      (device-type [this]
        "camera")

      (init-device-channel [this channel]
        (sentinel-delimited-channel channel 0xff))

      (all-properties [this]
        properties))))

(defn make-fcb-codec [camera-number]
  (reify ICommandCodec
    (encode-update [this property value]
      [(+ 0x80 camera-number) 0x01 0x04
       (:command-code property)
       (encode-value property value)])

    (encode-inquiry [this property]
      [(+ 0x80 camera-number) 0x09 0x04
       (:command-code property)])

    (decode-response [this property [camera code & params]]
      (let [camera (- (bit-shift-right camera 4) 8)]
        (if (= camera camera-number)
          (case (bit-shift-right code 4)
            4
            [:ack]
            5
            (let [result (if params (decode-value property params))]
              (if (vector? result)
                result
                [:complete result]))
            6
            [:error (error-description (first params))]
            [:protocol-error (str "Unknown response code: " code)])
          [:protocol-error (str "Invalid camera ID: " camera)])))))

In make-sony-fcb-driver’s IDriver implementation we return a set of properties (created by make-fcb-properties, shown next), all of which use the same codec (created by make-fcb-codec). The driver also wraps the raw transport channel with one that merges incoming blobs of data into messages terminated by 0xFF, and adds that message delimiter to outgoing messages.
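The framing step can be sketched as a pair of pure functions. This is only an illustration of the idea behind sentinel-delimited-channel, whose implementation is not shown here: the names split-frames and frame are hypothetical, and the real version works on channels rather than on an in-memory sequence.

```clojure
(defn split-frames
  "Splits a seq of byte values into complete frames terminated by sentinel.
  Returns [frames remainder]: the sentinel is dropped from each frame, and
  remainder holds trailing bytes that have not been terminated yet."
  [sentinel data]
  (loop [frames [] current [] bs (seq data)]
    (if-let [[b & more] bs]
      (if (= b sentinel)
        (recur (conj frames current) [] more)
        (recur frames (conj current b) more))
      [frames current])))

(defn frame
  "Appends the sentinel to an outgoing message."
  [sentinel message]
  (conj (vec message) sentinel))

;; Two complete messages followed by the start of a third.
(split-frames 0xff [0x90 0x41 0xff 0x90 0x51 0xff 0x90])
;; => [[[144 65] [144 81]] [144]]

(frame 0xff [0x81 0x09 0x04 0x00])
;; => [129 9 4 0 255]
```

The remainder would be kept and prepended to the next blob of data read from the transport.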

The make-fcb-codec function is the core of the driver: it encodes state inquiry & update commands, and decodes responses according to the documented protocol for Sony FCB cameras. This protocol is simply a fixed header (0x80 + camera-number), followed by 0x01 0x04 for set value (encode-update), 0x09 0x04 for get value (encode-inquiry), then the specific code for the property, and the new value (if it’s a set command). The response (parsed by decode-response) is headed by the camera number, success/error code, and parameters containing the property value if it’s a response to an inquiry.
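To make the byte layout concrete, here is a small sketch that builds an inquiry message and picks apart a response using the same arithmetic as make-fcb-codec. The function names are hypothetical and the response bytes are made up for illustration. The 0xFF terminator is added by hand here, whereas in the driver the delimited channel appends it.

```clojure
(defn inquiry-bytes
  "Header, inquiry marker (0x09 0x04), property code, then the terminator."
  [camera-number command-code]
  [(+ 0x80 camera-number) 0x09 0x04 command-code 0xff])

(defn response-camera
  "Recovers the camera number from the first byte of a response."
  [[camera & _]]
  (- (bit-shift-right camera 4) 8))

(defn response-kind
  "Classifies a response by the high nibble of its second byte."
  [[_camera code & _params]]
  (case (bit-shift-right code 4)
    4 :ack
    5 :complete
    6 :error
    :unknown))

;; Inquiry for the :power property (command code 0x00) on camera 1.
(inquiry-bytes 1 0x00)               ;; => [129 9 4 0 255]

;; An illustrative response, with the terminator already stripped.
(response-camera [0x90 0x50 0x02])   ;; => 1
(response-kind   [0x90 0x50 0x02])   ;; => :complete
```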

The codec works with messages represented as Clojure integer sequences, so we can use Clojure’s destructuring to slice and dice the incoming messages, and simply write a vector literal when composing them in the encode-* methods. Compare that to fiddling with Java ByteBuffers. This also avoids the problems caused by bytes being signed in Java, which makes them a PITA to work with directly. For an example of how nice this is, have a look at the decode-response method’s use of destructuring to pull out the camera and response code from an incoming message.
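The signed-byte problem is easy to demonstrate. Below is a minimal sketch of the conversion that has to happen somewhere between the raw transport and the integer sequences the codec sees. The helper names are hypothetical, and where this conversion actually lives in Livespaces is not shown in this article.

```clojure
(defn unsigned
  "Converts a signed Java byte (-128..127) to an integer in 0..255."
  [b]
  (bit-and b 0xff))

(defn byte-array->ints
  "Raw bytes from the wire to a vector of unsigned integers."
  [arr]
  (mapv unsigned arr))

(defn ints->byte-array
  "Unsigned integers (0..255) back to a Java byte array.
  unchecked-byte wraps 0xff to -1 instead of throwing, as (byte 0xff) would."
  [ints]
  (byte-array (map unchecked-byte ints)))

;; 0xff and 0x90 are negative when stored as Java bytes.
(vec (ints->byte-array [0xff 0x90 0x01]))            ;; => [-1 -112 1]
(byte-array->ints (ints->byte-array [0xff 0x90 0x01])) ;; => [255 144 1]
```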

Driver Properties

Below are the properties of the driver. Notice that a number of properties specify their possible values via inline Clojure maps, which the property-table function turns into value codecs for us. You can also implement custom codecs, like the pqrs-codec which reads and writes a 16-bit value spread over four bytes (no, I have no idea why Sony chose that encoding either).
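Neither kind of value codec is shown in this article, so the following is only a sketch of what they might boil down to. It assumes that "spread over four bytes" means one 4-bit nibble per byte, most significant first, and that a map-based codec is simply the map plus its inverse. The function names are hypothetical.

```clojure
(require '[clojure.set :as set])

(defn encode-pqrs
  "Splits a 16-bit value into four bytes, one nibble per byte, high nibble
  first (an assumption about the layout)."
  [value]
  [(bit-and (bit-shift-right value 12) 0x0f)
   (bit-and (bit-shift-right value 8) 0x0f)
   (bit-and (bit-shift-right value 4) 0x0f)
   (bit-and value 0x0f)])

(defn decode-pqrs
  "Reassembles a 16-bit value from four nibble-carrying bytes."
  [[p q r s]]
  (bit-or (bit-shift-left (bit-and p 0x0f) 12)
          (bit-shift-left (bit-and q 0x0f) 8)
          (bit-shift-left (bit-and r 0x0f) 4)
          (bit-and s 0x0f)))

(defn map-value-codec
  "Builds a pair of lookup tables from a map of value -> device code."
  [value->code]
  {:encode value->code
   :decode (set/map-invert value->code)})

(encode-pqrs 0x1234)     ;; => [1 2 3 4]
(decode-pqrs [1 2 3 4])  ;; => 4660

(def exposure-codec
  (map-value-codec {:auto 0x00 :manual 0x03 :shutter-priority 0x0A
                    :iris-priority 0x0B :bright 0x0D}))

((:encode exposure-codec) :manual)  ;; => 3
((:decode exposure-codec) 0x0B)     ;; => :iris-priority
```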

(defn make-fcb-properties [camera-number]
  (property-table (make-fcb-codec camera-number)
    [:name         "Sony FCB H11 Camera"]
    ;; CAM_Power
    [:power        :rw 0x00 boolean-codec]
    ;; CAM_Focus (p38) / CAM_FocusModeInq. "auto/manual" (0x10) is
    ;; write-only, toggles focus between auto and manual.
    [:focus        :rw 0x38 auto-manual-codec]
    ;; CAM_FocusPos. Can only set focus pos when in manual mode,
    ;; otherwise get "command not executable".
    [:focus-pos    :rw 0x48 pqrs-codec]
    ;; CAM_SpotAE
    [:spot-auto-exposure :rw 0x59 boolean-codec]
    [:auto-exposure :rw 0x39 {:auto 0x00 :manual 0x03 :shutter-priority 0x0A
                             :iris-priority 0x0B :bright 0x0D}]
    ;; CAM_WB / CAM_WBModeInq.
    [:white-balance :rw 0x35 {:normal-auto      0x00
                              :indoor-mode      0x01
                              :outdoor-mode     0x02
                              :one-push         0x03
                              :auto-trace       0x04
                              :manual           0x05
                              :outdoor-auto     0x06
                              :sodium-auto-lamp 0x07
                              :sodium-lamp      0x08}]
    ;; CAM_WB One Push Trigger. Changes whitebalance to manual.
    ;; The affected property is named to match :white-balance above.
    [:white-balance-trigger :wo 0x10 {true 0x05} [:white-balance]]
    ;; CAM_Gain. Cannot always set when autoexposure not manual (?)
    [:gain         :rw 0x4C pqrs-codec]
    ;; CAM_RGain
    [:r-gain       :rw 0x43 pqrs-codec]
    ;; Cam_BGain
    [:b-gain       :rw 0x44 pqrs-codec]
    ;; Cam_Shutter
    [:shutter      :rw 0x4A pqrs-codec]
    ;; Cam_Iris
    [:iris         :rw 0x4B pqrs-codec]
    ;; CAM_SlowShutterMode
    [:slow-shutter :rw 0x5A auto-manual-codec]))

The whole driver is 128 lines of Clojure code, down from over 1,000 in the original Java implementation.

A Driver DSL For Non-Programmers?

I think the sort of code shown here should make sense to non-programmers, especially after they’ve had a look at existing example drivers. It requires very little ‘real’ Clojure coding, no Java coding, and the user doesn’t need to spin up a complex development environment. And as more device drivers get written in the new style, I expect we’ll recognise higher-level patterns and make the process even simpler.

This is early days, and we’re not yet sure whether this will appeal enough that systems integrators will actually write their own drivers. One thing I can say for sure now, though, is that it’s already making it easier and far more enjoyable for me!