The second day of the iCS Symposium at the IT University of Copenhagen starts with a keynote by Lina Dencik. She explores the difficulties of researching the datafied society, building on several of the projects currently underway at the Data Justice Lab at Cardiff University. This work must involve not only researchers, but also civil society actors, practitioners, journalists, and others.
The datafied society represents an immensely fast-moving space: there are constant updates on development projects, company initiatives, government actions, data scandals, etc. As researchers, it is important that we introduce a sense of slowness into this environment from time to time, in order to take a more considered and careful look at what is going on. Yet the speed at which new data-driven technologies are being implemented across society, often without having been fully trialled and tested, makes this very difficult, and gives a great deal of unchecked power to the companies that provide these technologies.
When governments use these technologies to predict and modify human behaviour, we are positioned in new ways as citizens, and it is important to investigate the roles that such initiatives envisage for us. We become citizens through our engagement with these digital environments as they profile, categorise, and sort us: our digital traces are used variously to determine our citizenship status (a new form of ius algorithmi rather than ius sanguinis); to detect and police domestic extremism and disorder (drawing especially on social media data captured through commercial analytics platforms, as a new form of ‘open source intelligence’); and to ‘score’ citizens on a number of dimensions in order to optimise – or perhaps minimise – public service delivery (the Chinese ‘social credit’ score is just the most egregious example of a much wider trend here).
These approaches seek to develop integrated, clearly defined datasets that capture the available information about each citizen, and thereby to generate reliable and standardised scores for everyone. In the U.K. this takes place against the backdrop of continuing austerity policies; the agencies carrying out such work are therefore data-rich but resource-poor, and rely mainly on commercial data gathering and warehousing solutions that are not necessarily well suited to the tasks to which they are applied.
How are we to study these ‘big data’, algorithmic initiatives? Even just conducting technical analyses of the algorithms being used is exceptionally difficult, because they are undisclosed and opaque; Freedom of Information requests provide some details, yet government agencies often refuse such requests on the grounds that revealing the algorithms’ inner workings might enable citizens to mislead and game them. In truth, government agencies often do not themselves know how the (commercial) algorithms on which they rely actually work. Alternatively, it may be possible to examine the past scholarly publications of the Chief Technology Officers of the companies that provide such algorithmic services: earlier in their careers, they may have published papers about the algorithms they designed.
But even if we know how the algorithms work, how does that help us? How do we audit the ‘fairness’ of these algorithms – and indeed, what do we mean by fairness in this context? Perhaps it is more important, instead, to examine the broader role of these initiatives within society, and to study the operative logics that they reveal. These developments point to a broader shift in governance from reactive to preemptive logics, with an increasing focus on risk management. Social science researchers can make useful contributions here, by decentring the role of data and data systems and focussing instead on what people do with the data, and why. This also serves to politicise data systems as sites of struggle, and may reveal opportunities to push back against these developments.
It is especially important to assess the impact of these datafied approaches, yet doing so is often difficult because the impacts tend to be indirect and hard to confirm without direct access to decision-making processes. Governments and companies are also pushing back against this work, and there is an agenda to narrow our understanding of the role of data in governance. In response to such pushback, Lina highlights in particular the Funding Matters initiative, which emerged from the controversy over the role of shady data analytics company Palantir as a sponsor of an academic conference on privacy: does funding support from this and similar companies undermine the independence of scholarly research?
There is in this, if nothing else, a need for greater transparency about the funding sources of scholarly research. We have already seen a number of high-profile cases of tech industry funding being directed to ‘friendly’ researchers (or conversely, of such funding being withdrawn from scholars whom companies have seen as too critical); especially in light of the growing reliance of scholarly research on commercial funding, such problems are only likely to increase in the coming years.