Archiving the Danish Web

Niels Finnemann is next, speaking about the archiving of Danish election sites, which led on to the question of developing a national Danish Internet archive. They found neither the Australian nor the Swedish model appropriate for their purposes, and so tried to find a middle way.

Three basic elements: 1. snapshots of the .dk domain (four per year, ideally; this means that less work is necessary in between); 2. selective archiving remains important, but (unlike in the Australian model) it is used to identify sites which are archived more than four times a year (e.g. newspaper sites which change very frequently, very high-use sites, or specific Internet genres which are of particular interest); 3. ideally, an additional opportunity for on-demand archiving (at the request of an interested archivist or researcher).
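To make the three tiers concrete, here is a minimal Python sketch of how such a policy might decide whether a site is due for a crawl. Everything in it is my own illustration rather than anything shown in the talk: the Site class, the crawls_per_year field and the is_due function are hypothetical names, and the example URLs are made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Site:
    url: str
    # Assumed baseline: the quarterly snapshot of the whole .dk domain.
    # Selectively chosen sites (e.g. newspapers) get a higher value.
    crawls_per_year: int = 4


def crawl_interval(site: Site) -> timedelta:
    """Time between scheduled crawls for this site."""
    return timedelta(days=365 / site.crawls_per_year)


def is_due(site: Site, last_crawl: date, today: date, on_demand: bool = False) -> bool:
    """A site is crawled if someone requested it or its interval has elapsed."""
    if on_demand:
        return True
    return today - last_crawl >= crawl_interval(site)
```

The point of the design is that selection only raises the frequency for a site; nothing in .dk falls below the quarterly baseline.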

An interesting issue: right now, unauthorised archiving isn't legal in Denmark, although there are exceptions for libraries (which still can't make archived content available to the wider public on the Net - go figure).

The archive will start to operate from 1 January 2005. There is an existing Scandinavian project to develop a shared interface to an Internet archive, but this project isn't necessarily directly related to it. There is also the issue of how 'live' the presented content is: do hyperlinks on archived pages connect to other pages in the archive (to preserve the authentic experience of surfing sites from a specific point in time) or to current sites directly on the Web (or can there be a choice)?
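The hyperlink question can be sketched in a few lines of Python. This is only an illustration under my own assumptions: resolve_link, the captures index (a mapping from URL to a sorted list of capture dates) and the /archive/<date>/<url> path scheme are all hypothetical, not features of the Danish archive or the Scandinavian interface project.

```python
from bisect import bisect_right
from datetime import date


def resolve_link(target_url, viewed_on, captures, mode="archive"):
    """Decide where a hyperlink on an archived page should point.

    captures maps a URL to a sorted list of capture dates (assumed index).
    mode="live" always points at the current Web; mode="archive" points at
    the latest capture made on or before the date being viewed.
    """
    if mode == "live":
        return target_url
    dates = captures.get(target_url, [])
    idx = bisect_right(dates, viewed_on) - 1
    if idx < 0:
        # Nothing archived early enough: fall back to the live site.
        return target_url
    return f"/archive/{dates[idx].isoformat()}/{target_url}"
```

Offering the mode as a parameter corresponds to the 'can there be a choice' option; the fallback to the live site when no capture exists is one possible behaviour, not the only sensible one.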

There's a very interesting discussion about metadata (or perhaps meta-metadata) emerging now - data collected about researchers' use of archived material. It could lead to a kind of Amazon.com-style system ('researchers who looked at this site also studied these'), or a system that links archived materials with papers that researchers have written about them. Libraries wouldn't be able to track all of this, of course; the library system would need to be developed in a way that enables researchers to connect directly with the archive (i.e. annotate the archive collection with their own research). Semantic Web, anyone?
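The 'also studied' idea boils down to counting which archived sites turn up together in researchers' sessions. A minimal Python sketch, assuming usage data arrives as one list of site names per researcher session; the function names and the data shape are my own assumptions, not something proposed in the discussion.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_usage(sessions):
    """Count how often each pair of archived sites appears in the same session."""
    co_usage = defaultdict(Counter)
    for sites in sessions:
        # Deduplicate so that a site visited twice in one session counts once.
        for a, b in combinations(sorted(set(sites)), 2):
            co_usage[a][b] += 1
            co_usage[b][a] += 1
    return co_usage


def also_studied(co_usage, site, n=3):
    """Return up to n sites most often studied alongside the given site."""
    return [other for other, _ in co_usage.get(site, Counter()).most_common(n)]
```

Linking archived materials to the papers written about them would need a different structure (explicit annotations rather than counts), which is where the researcher-supplied metadata comes in.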