Cobalt: the technical basis

The Cobalt project started in 2009 with the aim of creating a new video analytics system to replace the older Astraguard system, which had been in service since 2000 (and which was itself based on an earlier system developed from the late 1980s onwards). Cobalt was to be entirely new, not a mere upgrade of the earlier systems: no code was carried over from them. The understanding gained from those previous systems was, of course, of fundamental value in defining Cobalt's scope and aims.

Achievement and ambition

Among the desiderata for a modern video analytics system are:

  • Support for both IP and analogue cameras
  • Full TCP networking
  • Equal performance in thermal and optical wavebands
  • High-quality data compression
  • A strongly encrypted data stream
  • Image-DNA as the basis for analytics
  • Fast data retrieval and analytics
  • Self-evident software interfaces

The first items in this list were not particularly novel in 2009, so the first real challenge was high-quality data compression. Earlier versions of Astraguard had developed a format known internally as "AG7". This was a substantial improvement on MPEG-4 in terms of picture quality, degree of compression and computational efficiency.

That improvement came mainly from the use of a compressed wavelet representation of the image data. Typically, AG7 would store at least one or two days of high quality data on a single DVD (4.5GB), and storage levels in excess of a week were not unusual.
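The storage figures above imply a fairly demanding target bitrate. As a back-of-envelope check (the 4.5GB capacity and the one-day and one-week spans come from the text; the rest is plain arithmetic):

```python
DVD_BYTES = 4.5e9  # usable DVD capacity quoted above


def required_bitrate_kbps(hours: float, capacity_bytes: float = DVD_BYTES) -> float:
    """Average video bitrate (kbit/s) needed to fit `hours` of footage
    into `capacity_bytes` of storage."""
    seconds = hours * 3600
    return capacity_bytes * 8 / seconds / 1000


# One day on a single DVD needs an average of roughly 417 kbit/s;
# a full week needs roughly 60 kbit/s.
print(f"24 h on one DVD : {required_bitrate_kbps(24):.0f} kbit/s")
print(f"one week on DVD : {required_bitrate_kbps(7 * 24):.0f} kbit/s")
```

Sustaining watchable surveillance video at those rates is what made the compression quality of the codec the deciding factor.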

Comparison of MP4 and AG7

Comparison of the detail in the AG7 and MP4 renditions of the same image. The fine resolution of the AG7 image is striking compared with the blocky structure of the MP4 one. To produce the comparison, the video stream was split in two and each stream was encoded separately and simultaneously. (The pictures have been slightly enhanced for clarity.)


AG7 is a wavelet-based image format describing the image pixel data. Fast video-stream analysis requires that additional information, providing a synopsis of activity in the scene, be encoded along with the pixel data. The early work was directed towards encoding specific attributes closely related to detecting motion in the video.
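AG7's actual transform, quantiser and entropy coder are not described here, but the general principle behind wavelet compression can be sketched: a Haar transform concentrates a smooth image's energy into a few coefficients, and zeroing the small ones yields a sparse, highly compressible representation. Everything below (the one-level Haar transform, the `keep` fraction) is illustrative, not AG7 itself.

```python
import numpy as np


def haar2d(block: np.ndarray) -> np.ndarray:
    """One level of an (unnormalised) 2-D Haar transform: rows, then columns.

    Each pass replaces adjacent pairs of samples with their average and
    half-difference, so smooth regions produce near-zero detail coefficients.
    """
    def pairs(x):
        avg = (x[..., ::2] + x[..., 1::2]) / 2
        det = (x[..., ::2] - x[..., 1::2]) / 2
        return np.concatenate([avg, det], axis=-1)

    return pairs(pairs(block).T).T


def compress(block: np.ndarray, keep: float = 0.1) -> np.ndarray:
    """Zero all but the largest `keep` fraction of wavelet coefficients."""
    c = haar2d(block.astype(float))
    thresh = np.quantile(np.abs(c), 1 - keep)
    return np.where(np.abs(c) >= thresh, c, 0.0)
```

A real codec would apply several transform levels and entropy-code the surviving coefficients; the point here is only that, for typical camera images, most of the detail coefficients are close to zero and can be discarded with little visible loss.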

We referred to the early version of this information stream as "synoptic data". The synoptic data was derived only from the greyscale part of the image data, since that was all the relatively poor cameras in use at the time warranted. Moreover, it used a relatively simple way of identifying the salient events in a video. Despite that simplicity, it worked very well in the context of post-recording analysis.
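The original synoptic-data format is not described in the text; as a minimal sketch of the kind of simple greyscale summary it alludes to, one can record, per frame, a small grid of motion scores derived from the difference between consecutive frames. The grid size and the frame-differencing scheme here are assumptions for illustration only.

```python
import numpy as np


def synopsis(prev: np.ndarray, curr: np.ndarray, grid=(4, 4)) -> np.ndarray:
    """Per-cell mean absolute greyscale difference between two frames.

    Returns a small grid of activity scores -- a compact synopsis of where
    motion occurred, cheap to store alongside the compressed video and to
    scan at retrieval time.  (Illustrative only; not the actual synoptic
    data format.)
    """
    diff = np.abs(curr.astype(float) - prev.astype(float))
    h, w = diff.shape
    gh, gw = grid
    # Trim to a multiple of the grid, then average within each cell.
    return diff[: h - h % gh, : w - w % gw].reshape(
        gh, h // gh, gw, w // gw).mean(axis=(1, 3))
```

Because the summary is tiny compared with the pixel data, post-recording analysis can scan hours of footage by reading only the synopsis stream.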

Several years' experience with the synoptic data allowed us to widen the scope of the analysis considerably and to remove the need for the user to enter any numbers or parameters as part of the configuration. This was achieved by adding optimization algorithms which would automatically adapt to widely different scenes and conditions. With that, the user would merely have to identify areas of interest within the scene and the program would report when anything happened (or stopped happening) there.
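The optimization algorithms themselves are not published. One common way to achieve this kind of parameter-free behaviour, shown here purely as a hypothetical stand-in, is to derive the detection threshold from running statistics of the scene's own activity scores, so the threshold adapts to each scene instead of being configured.

```python
import math


class AdaptiveDetector:
    """Flags activity scores that stand out from the scene's own history.

    The threshold is mean + 3*sigma of exponential moving estimates, so
    nothing has to be configured per scene.  (A hypothetical sketch, not
    the unpublished algorithms mentioned above.)
    """

    def __init__(self, alpha: float = 0.05):
        self.alpha = alpha   # smoothing factor for the running statistics
        self.mean = 0.0
        self.var = 1.0

    def update(self, score: float) -> bool:
        threshold = self.mean + 3 * math.sqrt(self.var)
        alert = score > threshold
        if not alert:        # learn only from 'normal' frames, so a
            d = score - self.mean      # genuine event does not inflate
            self.mean += self.alpha * d        # the baseline statistics
            self.var += self.alpha * (d * d - self.var)
        return alert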

We now refer to this as "Image-DNA". The name reflects the long-standing idea that we might produce a multi-attribute signature of the activity in the video stream that paralleled the behaviour of parts of the human visual system.

The development of Image-DNA after 2009 followed the route of replacing the optimization algorithms with what is fashionably called neural-net learning. In that sense the data contained in the Image-DNA is a surrogate for the output of the LGN (lateral geniculate nucleus) of the human visual system. It should be stressed that this was not brain modelling: we were merely replicating the processes taking place in one specific part of the brain, generating data that can be interpreted and acted upon by a piece of software: Cobalt. More details on the nature and workings of Image-DNA are found here.