
Understanding Computational Methods in the Digital Humanities

Canberra.
The final panellist on this DHA 2012 panel on ‘Big Digital Humanities’ is John Unsworth. His definition of the digital humanities is narrower than that of the others: he defines it as a form of humanities scholarship that builds centrally on computational methods – for example, research which uses ‘big data’ resources to do work which could not be done in any other way.

John uses the Hathi Trust Digital Library as an example: a collection of some 10 million (and growing) digitised publications which emerged in tandem with the Google Books initiative and is supported by libraries which contributed to the initiative; the Trust also operates a research centre which enables users to do computational work building on this vast resource.

But such computational research, if it is to be meaningful, will still require a substantial amount of further preparatory work: for example, to clean and structure the available data, and to improve the tools which are used to work with the dataset. The vision of a seamless humanities computing future which Peter Robinson presented earlier in this session is some way off at best, and utopian at worst. The difficulties which Hathi Trust partners had in retrieving digital library data from Google at the conclusion of its digitisation processes (via data dump downloads, rather than through an API) are an obvious example of this.

The obvious conclusion from such experiences is that a range of legal and organisational frameworks are required to facilitate greater cross-institutional collaboration – as only such collaboration will enable work with large datasets. At the same time, individual researchers need the skills to understand the tools they’re using, and to build some of their own, at least on a small scale.

Such tools and technologies cannot be treated simply as a black box – it’s important for the researcher to know enough to be able to tinker, and indeed being able to do so is a sign of their seriousness. Not everyone has to do it, or to be interested in it, but for all of us to ignore it altogether would be far worse.