The size of the Library of Alexandria before its destruction was legendary, but now a series of ambitious digital projects is aiming to take the vision of this ancient library and create vast online libraries fit for the digital age.
Yesterday, Google announced that it had made available the first books and documents scanned and indexed as part of its Google Print project, an ambitious scheme that sees the search engine giant attempting to digitise the libraries of Stanford University, Harvard University, Oxford University, the University of Michigan and the New York Public Library, and make all or parts of these books available online.
Google is competing with a rival project run by the Open Content Alliance, which was founded by the Internet Archive and is supported by Microsoft and Yahoo!. Microsoft revealed today that it is working with the British Library to scan 100,000 books and put them online as part of Microsoft’s own book search service, set to launch next year.
Libraries had already been working on digitising print material, but with the support of technology heavyweights their aims can now become far loftier. The prospect of pooling the knowledge contained in the valuable collections of these different libraries, and making it searchable online for the whole of humanity, is an enticing one.
There is, however, one group that is proving less keen on the idea: the publishers. It is significant that the first works Google chose to make available are in the public domain and, therefore, not subject to copyright restriction.
The Authors Guild and five publishers have filed lawsuits against Google over the scanning of copyrighted works. The search giant says that it doesn’t break copyright laws because only a short excerpt of text around a user's search term is shown, unless a publisher gives permission for more to be displayed. Google would seem to have a fair point here.
Publishers’ concerns are understandable, but the project could prove to be a useful tool for driving sales, particularly for smaller publishers and lesser-known authors who lack the muscle to promote their works heavily.
Perhaps it’s more a case of Google’s apparent arrogance in launching the scheme without seeking the publishers' approval that has got up their noses. Not to mention the fact that Google plans to make a whole heap of money from advertising next to user searches of the online works.
The Open Content Alliance says it has been more careful to involve publishers from the start and that, by using the Creative Commons licence, it allows copyright holders to say how the material is used.
Copyright issues will take a while to be ironed out, but in the meantime there are plenty of public domain works to keep the Google, Microsoft and Yahoo! scanners busy for years yet.