BT Digital Archives project: A closer look at how 165 years of content were digitised

When the world's oldest communications group wanted to share hundreds of thousands of documents with the wider Internet, it partnered with the National Archives and Coventry University to make it a reality. We spoke to David Hay, Head of Heritage at BT Group, to find out more about the project (BT Digital Archives), how it came to life and the challenges he met in bringing it to fruition.

What is the BT Digital Archives?

The BT Digital Archives is a new online resource of almost 500,000 images of photographs, research reports and policy and operational files from BT's archives dating back to 1846. BT is the oldest and most established communications company in the world, dating back to the dawn of telecommunications, and its world-class archives – recognised as such by UNESCO and Arts Council England – reflect that rich heritage.

The Digital Archives also incorporates BT's existing online catalogue, so as well as holding the digital content created under this project, it serves as a catalogue to BT's entire available archive. The collection includes records of BT itself, its predecessor Post Office Telecommunications, and the private telegraph and telephone companies that the Post Office took over during the nineteenth and early twentieth centuries, dating back to BT's first ancestor company, the Electric Telegraph Company, founded in 1846.

How was it created?

The project is a collaboration between the academic, private and public sectors, making the best use of the skills and culture from all three. Coventry University recognised the significance and research potential of BT's archives, which reflect the leading role played by the UK in the development of communications technology over more than 165 years, a story that has not had the attention it merits.

Coventry University and BT Heritage successfully submitted a joint bid to Jisc – a charity which encourages innovative use of digital technologies to maintain the UK's position as a leader in education and research – as part of Jisc's Content programme. The National Archives is the third partner, invited to take part because of its experience and expertise in scanning archival records. Total funding was £1 million, and the project ran from November 2011 to July 2013.

What obstacles or challenges did you face in creating the digital archive?

This has been a huge project, and a massive logistical exercise. Less than 10 per cent of the overall collection could be covered by the project budget, so first of all there was much discussion and consultation between the partners and potential users on what should be included. Once that was done, the documents had to be carefully packaged and transported to The National Archives, where they were conserved where necessary before being scanned and returned to BT Archives. The scans then had to be quality checked, correctly named and matched to the data in BT Archives' catalogue. In many cases, entirely new metadata was created, which had to be incorporated into the catalogue. There was also a substantial amount of transcription and optical character recognition to allow free-text searching across the records wherever possible, and this again had to be incorporated. Finally, and not least, the platform to host and access all of this content had to be built, and this was done by Serious Games International, a wholly owned subsidiary of Coventry University, in collaboration with BT and the supplier Axiell.

On a project of this scale, there were always going to be unforeseen obstacles, but we have worked through most of these and the site is now live and working. Over the next few weeks we will be sorting out the last few snags, and over the next few months adding more content.

What are some of the highlights?

The three main categories are:

  • 45,000 photographs and pictures, c1865 – 1982
  • 190,000 pages from over 13,500 research reports, 1878 – 1981
  • 230,000 documents from over 550 policy and operational files, 1851 – 1983

We were particularly pleased to be able to include the whole of BT's archive of research reports of work done at the Post Office Research Establishment at Dollis Hill and later at its successor, BT's research laboratories at Martlesham near Ipswich. Covering 1878 – 1981, they document over a century of British scientific endeavour, until recently largely hidden. They include work done by Tommy Flowers and his team, who built the world's first programmable computer at Dollis Hill for the Government code-breaking centre at Bletchley Park in 1943. Although the Colossus report is not included – it was so sensitive it was destroyed on Churchill's orders at the end of the war – other reports written by Flowers and his team reflect their work on the technology that led to it pre-war, and where it took them post-war, including other computer projects.

Other highlights from the collection that have been scanned include:

  • documents relating to BT's ancestor company, the Electric Telegraph Company, which in 1846 became the first nationwide communications firm in the world;
  • correspondence from 1877 with Alexander Graham Bell's agent offering Bell's telephone to the British government, who turned it down;
  • correspondence between Guglielmo Marconi and the General Post Office from 1896 discussing the Italian's "new system of telegraphy without wires";
  • photos of Britain's first national telephone kiosks with concept drawings and correspondence detailing their design and public reaction to their introduction;
  • pictorial records of the advent of the world's first emergency call service, 999;
  • an image of Central Telegraph Office staff dealing with congratulatory telegrams to Buckingham Palace on the birth of Prince Charles, the future Prince of Wales;
  • documents illustrating the role of British telecommunications workers in the war effort during both world wars.

What's the benefit of digitising the BT archive?

Before this project, very little of the archive was available online. Researchers wanting to study any of the collection always had to visit the BT Archives in Holborn, London, in person and ask for the relevant files or reports to be brought to them. Although nothing can replace the experience of physically handling the archives, it's not always convenient or practical to make the journey, particularly from overseas.

Now users from anywhere in the world will be able to log on to www.bt.com/btdigitalarchives and explore 50 terabytes' worth of images and documents detailing how Britain laid the foundations for global telecommunications, including the first telephone exchange in 1879 and the Queen making the first automatic long-distance telephone call in the fifties.

The collection is also a lot more searchable, as people can make free-text searches across the content, and registered users can also add their own search tags and comments or facts about individual records, adding to the knowledge base that others can benefit from. Academic registered users can also download PDFs of whole files and reports to study offline.

There are also a number of learning resources based on the Archives, produced by Coventry University academics on subjects around design, computing, linguistics and problem-based learning, that students and teachers can use in learning environments.

Serious Games International has also developed a fun, interactive, non-academic way into the collection. Called Mosaic, it features a number of the archive photographs converted into mosaics made up of thousands of individual images. You can zoom into the mosaic to view these images, and clicking on them takes you to the catalogue entry for the image, from where you can go on to associated images. It's a great way of browsing through the photographs and realising the sheer breadth and scope of the collection.

Of course, not all of the BT Archives collection has been digitised, so people will have to visit in person to view the whole range of material on any given topic, but this is a huge resource that will cover most people's needs for research, imagery, study resources and a lot more.

Tell us about some of the biggest or most notable archive projects you've been involved in

I've been working in archives for almost 30 years, most of that time with BT, and we've been involved in some big projects during that time. We've moved the collection a couple of times, most recently to our purpose-adapted accommodation in 1997, and moving over 3km of shelved records is quite a task. Some years back, BT reviewed all the title deeds to its estate, which for a national company is considerable, and we distributed thousands of old documents that it no longer needed to almost every local authority record office in the country. Some of the stuff dated back to medieval times, which really surprised me.

On the technology front, the biggest project we've done before now was the digitisation of the historical phone book collection in a partnership with www.ancestry.co.uk. Ancestry recognised the family history value of the phone books, and as they didn't have a major scanning operation in the UK at the time, we had to ship over 1,700 directories covering 1880 – 1984 to Utah. That again was a massive logistical exercise, but for that project the format and content of the records were pretty consistent, so it was a lot less complicated than this latest one.

Where do you see the future of archiving at large? What role does technology play in this?

I doubt that anything will replace the experience of handling original analogue archives, and that sense of becoming part of the document's history and the legacy of the people who previously used it and originally created it.

But there's no doubt that the future is digital, not just in disseminating analogue paper and audiovisual records to a much wider audience much more easily and usefully, but in accessing and using information that is born digital and has existed in no other way. There's a whole set of issues and challenges around identifying, preserving and accessing electronic records. The principles are the same, but the practical tools and techniques are very different, and will obviously be technology-based.