Non-Content Features of Material in the Internet Archive

The third presenter in this Web Science 2016 session is Tu Ngoc Nguyen, who reintroduces us to the Internet Archive's Wayback Machine. This is a useful service, but searching it is not necessarily straightforward. Is it possible to draw on non-content features of the archived material to improve search results?

The project drew on the full archive of the German Web, and used a number of techniques to assess and rank documents based on twenty non-content features. I'm frankly unable to follow the numerical data presented in the tables here (sorry!), but from what I do understand, using these additional features does improve the retrievability of relevant information.