Web Archiving and Legal Deposit

Some very interesting discussions over lunch, especially with Ian Oi from Blake Dawson Waldron (lawyers to the National Library of Australia as well as key collaborators with QUT on the translation of the Creative Commons framework into the Australian legal context). Talking to Paul Koerbin from the NLA also reminded me that the changes we've recently made to the server setup of M/C - Media and Culture (now using the three sub-domains more effectively) may mean that the archiving of the site by PANDORA as it's happened so far may not work so well any more - I'll have to check back with the NLA to make sure there are no longer-term problems.

Craddock Morton, the Director of the National Museum of Australia, is chairing the next session. He introduces Hans Liegmann from the German National Library, which is based in three sites in Leipzig, Frankfurt a.M., and Berlin. He points out that the legal deposit law as it exists in Germany today does not currently include online publications; but some changes may be underway.

TEL, The European Library, was a peak body for European libraries, and ran between 2001 and 2004; it was charged with investigating collaborative approaches across Europe (to some extent also beyond the EU itself). It developed consensus on missions, metadata standards, and business models, and led to the development of the European Library Office which will develop a combined portal to the services of the various national libraries, in all official European languages (this will also apply for the new European Union member states). The system will need to interface with various distributed databases across all partner institutions as well as work with a central index for some of its sources (there are plenty of acronyms here, which I won't go into). The National Library of the Netherlands will serve as the service provider for TEL.

In Germany, of course, culture is an area governed by the individual states (even though there is a ministerial-level cultural advisor to the federal chancellor). Here, a key project is Nestor, which runs from 2003 to 2006. It will develop approaches to Web archiving, including the setup of an information and communication platform and network involving its various partner institutions. It aims to distribute the labour required and identify the potential synergies between individual institutions. There is also a need to bridge the language barrier between German efforts and the wider international projects now underway (as evidenced by this very conference). Ultimately, there is also a need to organise workshops and develop mechanisms for surveys and statistics, and finally to interconnect local and international projects as well as standards for archiving and metadata, certification of trusted repositories, and preservation guidelines. (Various more specific studies into a number of the issues raised throughout the conference are already underway as well…)

Finally, there is also a project called Kopal, for the cooperative development of a long-term digital information archive (building on the IBM DIAS project which Hans Jansen from the Dutch Library already talked about). This is compared to an information bank with safe deposit boxes, and will therefore be a trusted digital information repository, using standard object formats and establishing effective cooperative models.

Next up is John Tuck, Head of British Collections at the British Library, speaking on the BL's efforts in Web archiving. It, too, benefits from legal deposit legislation at least in the area of print, and it seeks to develop as complete an archive of electronic publications as it has in print. Legal deposit law has changed over the centuries, and was last updated in 2003 through the Legal Deposit Libraries Act. The Act is enabling legislation (it enables more specific regulations to be established later): it did not change legislation for printed publications or itself extend to non-print, but it does provide a framework for the Secretary of State to make new regulations, both for offline/online formats (including the associated computer programs needed to provide access to works) and on provisions for libraries' gathering of online resources. An advisory panel, to be established by the Department for Culture, Media and Sport, will advise the Secretary on regulations, format by format; a Joint Committee on Legal Deposit has also been set up and has working groups on reviewing offline publication deposit schemes, issues around how to define UK publications, and questions of how to deal with e-journals.

The BL itself has a significant Web archiving programme which aims to define collection development policy and interfaces with the UK Web Archiving Consortium (UKWAC), the IIPC, and the Internet Archive. A small six-month experiment to archive some 100 UK sites began in 2001, building on voluntary archiving approaches (publishers were asked for their permission); this explored issues around selection criteria and processes and focussed on specific topical areas. From here, a more detailed Web Archiving Strategy has been formulated. Clearly, truly comprehensive coverage of the UK Web will not be possible - rather, there is a dual strategy of both taking full Web snapshots and performing far more selective harvesting of a limited and well-defined range of sites. Sites to be selected will include those that are relevant to research, culture, and Web innovation in Britain, building on curatorial expertise within the BL itself and on external experts.

The UKWAC was launched in June 2004 and comprises a number of key institutions in Britain. It hopes to collect (voluntarily) some 6000 sites within the two years of its lifespan using the NLA's PANDAS software system for Web archiving. In terms of its collaboration with the Internet Archive, there is interest in acquiring content dating back to the IA's first collection of the British Web in 1997. Overall, too, all such efforts need to lead to the development of sustainable, systematic, and standardised approaches to the preserving of Internet content.

Finally to Penny Carnaby, the CEO and National Librarian of the National Library of New Zealand. She notes that today we're in a 'press delete' generation, and that in some years' time we may well regret the level of information that has been lost through this 'digital amnesia'. In 1999, the Library and Information Association of New Zealand Aotearoa (LIANZA) challenged the NZ government to develop a national information strategy aiming to preserve digital material. The three key terms here were connectivity, content, and capability (and today, also continuity and collaboration), always also with a sensibility for the needs of preserving indigenous knowledge. In May 2003, the National Library Act was revised by the NZ government, and brought the concept of legal deposit into an electronic domain: it defined the term 'electronic document', and as a subset of this, also 'Internet document'.

In November 2003, the World Summit on the Information Society (WSIS) in Geneva (to take place again in 2005 in Tunis) also considered issues around the conservation of digital information, and the NLNZ took part in an important way by outlining some of the key issues of digital preservation (the Library is set up as an NZ government department with its own National Library Minister (!), and was therefore very well able to participate in this governmental summit). In 2005, countries will need to audit their progress against the goals set by the 2003 summit, so there is now a real need to follow through on such goals and develop national e-strategies.

New Zealand, in fact, now has its own digital strategy (which builds on the connectivity/content/capability framework), and the NLNZ has received some NZ$24 million for its digital repository in the latest annual New Zealand budget. The approaches here will include snapshots as well as selective harvesting, but also direct engagement with relevant research, community and industry bodies in diverse fields. There is, however, an issue around copyright and access, of course - if there is, for example, a legal deposit system for electronic resources and these are then made available through the Library's Website, it may undercut the ability of content creators to exploit their intellectual property (but the NLNZ and the NZ government, which is a strong supporter of the creative industries agenda, are very sympathetic to these concerns as well).

The next step, then, is the implementation of such programmes into a national digital archive, NZ Online. This also links back to the various component parts which have led to these developments (such as the second WSIS phase), and looks out to further international cooperation and collaboration. (New Zealand's intellectual environment clearly is one of the most supportive for digital preservation issues, in comparison to most of the countries we've heard from here…)