10 questions to Michael Sparks, OS Guru at BBC (Part 1)

We've drawn up a list of ten questions for Michael Sparks, a BBC Senior Research Engineer who specialises in, and is passionate about, open source. He tells us about his encounter with the OS movement and, more interestingly, offers some very good insight into the Kamaelia project. Brace yourself for 6000+ words of answers. Part 2 will be published tomorrow.

First off, let me preface my answers by saying that these answers must be construed generally as personal opinions and preconceptions, and not in any shape or form an official BBC position. I do not speak for the BBC here despite working there, and despite speaking on behalf of the BBC at Linux World London.

1. Who is Michael Sparks?

I'm currently a Senior Research Engineer at the BBC and a computer science graduate from Manchester University, and I spent my professional career prior to the BBC working on scaling large network systems. I'm a software engineer, and I'm interested in improving the tools we use for creating systems both large and small, so that we can get on and implement all the cool tools that SciFi inspires. I also like cats and being random :-)

2. How did you develop a passion for Open Source?

Interesting question. To me Open Source embodies an ideal that scientists have had for many years, hundreds even. In science you explore the possible, and report your findings in a repeatable way. Usually you do this by publishing your work and the mechanisms required to do this. Open source falls under the category of publishing your work.

The initial hook that got me started in using open source however goes back to my student days. I was doing computer science and needed access to a free compiler (penniless student you see :), and didn't want to rip anyone off (Whatever anyone says about copyright, infringing on a state granted time limited monopoly isn't nice, or frankly necessary).

I saw people copying Microsoft or Borland developer tools and thought "If I'm ending up in that industry, maybe, then it seems hypocritical to start off by ripping off". As a result I downloaded DJGPP - a port of GCC to DOS, and then moved onto a 4 floppy distribution of Linux.

From then on it was natural (as a student) to look for a legal gratis alternative, and to contribute back where you can. The open source aspects came later.

After all, it's very clear that in a situation where someone is giving you code, the best way you can pay for that gift is to "Pay It Forward" in exactly the same way as in the film of the same name. That film, for me, is one of the best single analogies as to how open source works in practice.

After reading the GNU manifesto however, I personally came down heavily on the side of the BSD license. Developers do have a choice in our society (rightly or wrongly) on how (or if) they charge for their services.

At the end of the day, the market will decide. The growth of Linux over the past 15 years is an interesting decision by the market. I like the BSD license because it allows people to take my code and generate wealth from it. This is precisely what's happened with software systems built on this license over a long time period.

TCP/IP, DNS, Web, Email - the major initial implementations all had BSD or BSD-like licenses in that the code can be closed (PD code can be closed).

However between them they generate more tax revenue for the government per year than the BBC license fee, and significantly more for the general population. Not bad for something you can get hold of for free. The mere existence of the closable, but not closed versions tends to keep the closed versions more "honest" - since otherwise they get replaced by the closable (or non-closable) versions.

3. To what extent is Open Source used in the BBC? Have you for example migrated to Open Office?

This is perhaps the wrong question, but I'll answer as I see it directly. It's used on the server side fairly heavily (in addition to a number of proprietary tools), and on the desktop side proprietary software dominates almost exclusively. In BBC Research, open source is heavily used as a development platform however, and this drives the use of tools like OpenOffice that support open data formats.

That said, as an aside, Apple computers are very highly rated by a significant number of BBC employees - largely due to the number of multimedia applications that exist on the OS X platform. Given OS X has a fairly substantial amount of open source software beneath it, there are a number of people using open source and not aware of it.

Personally I've been using OpenOffice since it was called Star Office - for around 8 years now, and it's been nice to see the tool mature over that time period. The BBC however as a whole simply isn't ready to make that switch right now (even if some parts could).

Why?

Well, the guiding principle as to whether to use a piece of software is not about whether the code is open source or proprietary. It's "does it do the job?", "is it value for money?", "is it secure?", "who do we get support from?", "what tools are the normal workforce used to?", "what formats do we communicate with indies?", "what tools do we use to communicate to indies", etc. ie standard business reasons rather than ideological.

In the server space switching to Open Source solutions is most often pushed by arguments relating to supportability, stability, scalability, standards, security & flexibility. On the desktop the issues are more people and task oriented.

The savings on license fees have to be weighed against the costs of retraining, and loss of features (even if replaced by an equivalent but different feature).

The development of open standards for document interchange (something the IFF people worked on many years back to great effect) is extremely important for one reason - it allows those working on a desktop system to move their content from one system to another seamlessly, whatever licensing scheme the systems have.

As a result, the question on the desktop isn't really "has the BBC moved to OpenOffice.org?" but rather "is it moving to ODF?" (the Open Document format from OASIS - a cross industry standardisation group for documents). On this front the BBC's current involvement is as a member of the ODF Alliance.

Joining the ODF Alliance is an initiative from BBC Research in recognition that this is an attempt at a vendor neutral format - ie an Open Standard - for document interchange.

The question of whether the BBC is moving to ODF is, however, a complex one. The BBC works with many companies and documents of all kinds - from email, through web pages, through to code, video files, traditional documents, spreadsheets etc - fly between these companies. The key thing about this is that the format used has to be understood by both sides.

So, sure, the BBC could mandate it will only use Gnumeric and .gnumeric files for all spreadsheets, but it wouldn't be able to talk to its partners.

However in practice, your clients will send you .xls files, .doc files and so on. Unless you can open them to look precisely as sent (pivot tables intact, templated fields in tables of contents & scripts intact), then you can't migrate fully.

As OpenOffice.org continues to grow in richness and capability it is likely to be authorised for use on the official BBC Desktop - as Firefox has been.

However its adoption is likely to be slow initially, since this will depend on the general BBC population.

Some people like to say that support is a problem, and whilst that used to be an issue many years ago, with companies like IBM and Novell providing professional support services for many of the major open source desktop applications this isn't really the dominating factor.

The dominating factor in the decision list above was "what tools are the normal workforce used to?". In this case, at the moment "ICT skills" are deemed to be, effectively "can you drive Microsoft products", not "can you use a word processor".

For proof of this you can look at the ECDL. If you download the sample tests for this from this site, you'll note that the zip file contains almost exclusively document formats created by Microsoft, including some files for specific versions of Microsoft products. Until you see a change in these to open document formats, I would expect Microsoft skills rather than ICT skills to dominate in the workforce.

Also whilst I feel this reinforces the situation where Microsoft product skills are taught over transferable ICT skills, this isn't a criticism of Microsoft or the ECDL. It just reflects the workforce marketplace at present. It's one indication of the level of change required in society before you can say "has company X migrated to OpenOffice.org".

In a way, I expect the surge of interest in Apple & Apple products in recent years may act as a factor in increasing awareness that there are spreadsheets, wordprocessors and presentation tools that aren't Microsoft based, and that these will cause a change in the way these skills are taught. Products supporting open data standards will naturally benefit as a side effect here.

As a result personally speaking I suspect that the BBC will migrate on the desktop at more or less the same speed as the general population, or at best a bit faster due to the BBC's general investment in open data formats. In non-desktop areas though, I'd expect open source and open data formats to continue to be adopted at an ever faster rate. (predictions can be wrong here for both)

4. BBC’s current leading open source project is Kamaelia; can you tell us more about it?

Kamaelia is designed as a reinvention of the way we design and build software with an aim of making it simpler to maintain, and to put the programmer's power in the hands of the creative user. An eventual aim is to put the same sort of power into the average user's hands as the spreadsheet does, but for generic applications using multimedia and network systems. This is some time off however!

Its current incarnation originated in the original problem space of online delivery of BBC content, which is a naturally concurrent problem. (20 million people watching different things is a naturally concurrent problem). As a result one of the core design goals was to make the maintenance of concurrent systems as simple as it can be. This has the side effect of making systems more comprehensible, sometimes easier to build, and hopefully more accessible.

In practical terms Kamaelia is designed as a tool that allows us to collaborate with others in figuring out how to make the best of the resources the BBC has, either in areas of production where the BBC can be assisted by Kamaelia, or in online delivery and the collaborative development of more appropriate protocols for country population sized delivery.

It's based on some very sound ideas in CSP distilled down to a very simple essence - no shared data, and small pieces of code loosely joined - with data taken from outboxes delivered to inboxes, based on wiring done at run time.

If that sounds familiar, it should - it's directly akin to what happens with Unix pipelines, except:

* We can send arbitrary objects (with methods) rather than just file like data

* We can create arbitrary shapes rather than pipelines (eg 5 buttons converging on a display control, controlling an image display for a presentation tool)

* We default to single threaded single process using co-routines, but can use threads as well, rather than being forced to be heavyweight multiprocess.
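
The model described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Kamaelia's actual API: the names `Component`, `link` and `run` are hypothetical and chosen only for this example. Each component is a generator-based co-routine with its own inbox and outbox, nothing is shared between components, and the wiring is done at run time. Two buttons converge on one display, a shape that a simple pipeline cannot express.

```python
from collections import deque


class Component:
    """A hypothetical minimal component: one inbox, one outbox, no shared data."""

    def __init__(self):
        self.inbox = deque()
        self.outbox = deque()

    def main(self):
        """Generator body; each `yield` hands control back to the scheduler."""
        raise NotImplementedError


class Button(Component):
    """Emits a fixed number of click messages, one per scheduler step."""

    def __init__(self, label, clicks):
        super().__init__()
        self.label = label
        self.clicks = clicks

    def main(self):
        for n in range(self.clicks):
            # Messages are arbitrary Python objects, here a tuple.
            self.outbox.append((self.label, n))
            yield


class Display(Component):
    """Records every message that arrives in its inbox."""

    def __init__(self):
        super().__init__()
        self.shown = []

    def main(self):
        while True:
            while self.inbox:
                self.shown.append(self.inbox.popleft())
            yield


def link(links, source, sink):
    """Wire source's outbox to sink's inbox (decided at run time)."""
    links.append((source, sink))


def run(components, links, steps):
    """Single-threaded round-robin scheduler over generator co-routines."""
    tasks = [c.main() for c in components]
    for _ in range(steps):
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)
        # Deliver messages from outboxes to the inboxes they are wired to.
        for source, sink in links:
            while source.outbox:
                sink.inbox.append(source.outbox.popleft())


prev_button = Button("prev", clicks=1)
next_button = Button("next", clicks=2)
display = Display()

links = []
link(links, prev_button, display)
link(links, next_button, display)

run([prev_button, next_button, display], links, steps=4)
print(display.shown)  # prints [('prev', 0), ('next', 0), ('next', 1)]
```

Because the components only ever touch their own boxes, swapping the scheduler for a threaded one would not require changing the components themselves, which is the point of the design as described here.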

However it's also inspired by a naive engineer's view of biology. In animals we have 2 main communication systems which allow the billions of cells (which don't know of each other's existence) to operate as a whole. We have the neural system (which Kamaelia's core - Axon - deliberately alludes to) which can be viewed as the brain's main (or most obvious) way of controlling the body. The other main system the body has of communication is the hormonal system which provides essentially mechanisms to say "starvation mode, conserve energy", "horny mode, reproduce", "pregnant mode", and so on. This is much slower acting, but provides global useful information.

Software doesn't really work in exactly this way, but it's useful to remember the two modes - essentially point to point for fast comms and global communications for slow changing globally useful information. In a way the unix environment is accidentally similar to this. In Kamaelia the decision to add in something akin to a Linda tuple space for global communications of certain current states (like the existence of an active display) was a conscious decision based on the recognition that if nature finds the two modes useful, developers might too. This has proven to be a good decision.
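
The slow, global mode can be illustrated with a tiny tuple-space-like registry. This is a sketch in the spirit of Linda, not Kamaelia's implementation: the class `TupleSpace` and its `out` and `rd` methods are hypothetical names borrowed from Linda's vocabulary, and pattern matching is simplified to using `None` as a wildcard.

```python
class TupleSpace:
    """A hypothetical, minimal Linda-style space for slow-changing global state."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Publish a tuple so that any component can find it later."""
        self._tuples.append(tuple(fields))

    def rd(self, *pattern):
        """Return the first tuple matching the pattern without removing it.

        A None field in the pattern matches any value. Returns None if
        nothing matches (a classic Linda `rd` would block instead).
        """
        for candidate in self._tuples:
            if len(candidate) == len(pattern) and all(
                want is None or want == got
                for want, got in zip(pattern, candidate)
            ):
                return candidate
        return None


space = TupleSpace()

# One component announces that an active display exists.
space.out("display", "active", (800, 600))

# Another component, which knows nothing about the first, looks it up.
found = space.rd("display", "active", None)
missing = space.rd("audio", None, None)
print(found, missing)
```

The publisher and the reader never reference each other directly; they only agree on the shape of the tuple, which suits information that changes rarely but is useful everywhere.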

The upshot from a developer's perspective is that it means you can often hack together a quick and dirty implementation that satisfies your immediate problem, and then have a clear and simple route to making that implementation scalable and maintainable. This allows you to rapidly answer "what if" style questions and find out if it's worth spending the extra time on fully answering the question.

As the set of components grows, the tools we have for building Kamaelia systems also mature. For example, we have a system for visually joining components together which evolved from a system for looking inside running Kamaelia systems. As a result you can now visually create systems for a rapidly growing number of problems in a growing number of problem domains. The key thing however here is that whilst the visual tool is nice, it's not just a pretty interface with little behind it, it's grown the other way up - it's a pretty interface that represents the tip of the iceberg.

In practical terms we have tools for creating network systems based on TCP, UDP & multicast; higher level support for HTTP & BitTorrent & growing support for RTP; tools for producing traditional GUI apps; tools for games style interfaces - which includes rich media interfaces; tools for working with 3D; tools for audio capture, encoding etc; tools for working with DVB (freeview), and video, etc.

I could go on, but the key thing about this is that all these tools can be used in the same system with each other in a mix and match fashion, quite literally like lego or K'Nex, etc. On a practical level, we use Kamaelia to collaborate across sites on opposite sides of the country using a whiteboarding application where clients can connect to each other in a P2P fashion, and as well as sharing a traditional whiteboard, audio is also shared (and mixed) across the network.

For more information on this people can see this month's issue (December 2006) of Linux Format.

5. The title of your session at Linuxworld is Open Sourcing at The BBC: When, Why, Why Not and How. Any particular reason why you have decided to include “Why Not”?

Balance.

Specifically the title is intended to allude to a variety of activities with regard to open source, ranging from use of open source to origination of open source. There are situations where for a business it doesn't make sense to release code as open source, and situations where it does. Similarly there are situations where not using open source makes sense as much as using open source. To not recognise this would be an unbalanced view, and so the title reflects this intended balance.

The "why not" however itself has to be balanced against the fact that if you don't release your product as open source, at some point in the future someone else may release a product with sufficiently similar functionality that is open source. However regarding that point, that's a common business risk and doesn't just apply to open source.

Also, by including "why not" it forces me, as a speaker, to think about the flipside. It provides the opportunity to re-examine if those points are valid, and opportunities for others to come back and say "that's simply not the case".

(Part 2 will be published tomorrow)