The final keynote at ASMC14 for today is by Bernhard Rieder from the Digital Methods Initiative, who stepped in at short notice for Tarleton Gillespie, who could not be here. He begins by noting the role of algorithms in our experience of information and media: they select what information is considered most relevant to us, and are now a crucial part of our participation in public life. This raises a number of questions – and, starting with search engines, such algorithms have received increasing attention from researchers.
One way to approach algorithms is by considering the question of knowing: what style of reasoning do algorithms implement, and how do they connect this to forms of performativity? Bernhard has been one of the chief developers of the Digital Methods Initiative, and in this role works closely with algorithms as well as thinking critically about and through them; this is also a process of opening the black box of the algorithms which shape our online experiences.
Such black boxes are critical: users do whatever they do on a platform, then the black-box algorithm performs "some maths", as Bernhard says, and finally we end up with some kind of structural order. How can we understand such processes more fully? On the input side, we have a system in use, accepting user inputs; this feeds into an algorithm which processes these data, and finally returns structured outputs that provide a digested overview of content.
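To make that input–algorithm–output pipeline concrete, here is a minimal Python sketch; the action schema, the weighting, and the scoring rule are all invented assumptions for illustration, not any platform's actual code.

```python
# A hypothetical sketch of the black-box pipeline: user activity goes in,
# "some maths" happens in the middle, and a structured ordering comes out.
from dataclasses import dataclass

@dataclass
class UserAction:
    user: str
    item: str
    kind: str  # e.g. "like", "share", "comment"

def some_maths(actions: list[UserAction]) -> dict[str, float]:
    """Aggregate raw activity into per-item scores (the opaque middle step)."""
    weights = {"like": 1.0, "comment": 2.0, "share": 3.0}  # assumed weighting
    scores: dict[str, float] = {}
    for a in actions:
        scores[a.item] = scores.get(a.item, 0.0) + weights.get(a.kind, 0.5)
    return scores

def structured_output(scores: dict[str, float], top_n: int = 3) -> list[str]:
    """Return a digested, ordered overview of content."""
    return [item for item, _ in sorted(scores.items(), key=lambda kv: -kv[1])][:top_n]

actions = [UserAction("ann", "cat-photo", "like"),
           UserAction("bob", "cat-photo", "share"),
           UserAction("ann", "news-item", "comment")]
print(structured_output(some_maths(actions)))  # ['cat-photo', 'news-item']
```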
But what is fed into the process is content of a wide variety, from cat photos through political propaganda to marketing content; these are each channelled into highly standardised grammars of action, however, and thereby made capturable and processable, in pursuit of specific operational goals. Grammars must become progressively more pervasive and more explicit – deeper – to enable more data to be captured; this is somewhat like living inside a survey, or even like a very large experiment in which a great many variables are controlled.
In turn, such data are again made available to be used and built upon, in the same way that Tinder builds on Facebook; this is nothing new, as semantic categorisation and data classification problems have a very long history which predates the age of computers by some centuries.
But what do we need such classifications and calculations for? We need them because we have been dealing with increasingly high numbers of individual (if not independent) entities whose activities and behaviours must be understood – and to understand them we have increasingly required statistical and related computational approaches. Social media, too, deal with various kinds of such "too many" problems – they work to provide algorithmic answers to the question of where to direct our attention.
Classification is still very important in many ways, then, but processing and calculation make things even more complicated. The result is a process of semanticisation, of connecting items, entities, things, based on various shared attributes. These relationships are encoded into the underlying classification structures which the platforms we engage with offer – and occasionally they change, as when Facebook introduced a "custom" option to its gender selection box which broke out from the limiting male/female binary.
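As a toy illustration of this kind of semanticisation, the following fragment (with entirely made-up items and attributes) links entities through the attributes they share rather than through any fixed, essential category:

```python
# Invented items, each described by a set of attribute tags.
items = {
    "page_a": {"topic:politics", "lang:en"},
    "page_b": {"topic:politics", "lang:de"},
    "page_c": {"topic:cats", "lang:en"},
}

def shared_attributes(a: str, b: str) -> set[str]:
    """Entities become related through what they share, not what they 'are'."""
    return items[a] & items[b]

for pair in [("page_a", "page_b"), ("page_a", "page_c"), ("page_b", "page_c")]:
    print(pair, shared_attributes(*pair))
```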
Some data, of course, are simply a byproduct of actual users actually using the system; these are not predefined categories, but arise from aggregate patterns of usage. The observations based on such patterns are probabilistic, then – they generate predictive categories, which are at present most frequently used in advertising (Facebook enables ad targeting by what it calls "ethnic affinity", for example).
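A hedged sketch of how such a predictive category might be derived from aggregate behaviour: the like data, the indicator pages, and the crude frequency estimate below are all assumptions for illustration, not Facebook's actual "ethnic affinity" model.

```python
# Observed behaviour: which pages each (invented) user has liked.
likes = {
    "u1": {"page_x", "page_y"},
    "u2": {"page_x"},
    "u3": {"page_z"},
}
# A seed set of pages treated as indicative of some interest category (assumed).
indicator_pages = {"page_x", "page_y"}

def affinity_score(user: str) -> float:
    """Share of a user's likes falling on indicator pages: a crude probability."""
    user_likes = likes[user]
    if not user_likes:
        return 0.0
    return len(user_likes & indicator_pages) / len(user_likes)

for u in likes:
    print(u, round(affinity_score(u), 2))  # u1: 1.0, u2: 1.0, u3: 0.0
```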
So much for the user input phase, then. How algorithms tie together all of these diverse data sources becomes even more interesting, and complex: there is a vocabulary of methods which are used to derive meaning from data, and such techniques stem from a wide variety of disciplines and philosophies. Bernhard demonstrates this by visualising his Facebook friends network, and notes that there is an infinite number of ways in which such a network could be processed and visualised; by adding more of the data which are available about interests, likes, shares, etc., it can be made infinitely complex.
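The point about the many possible processings can be illustrated in a few lines of Python (the graph is invented, and networkx is assumed to be available): different metrics and groupings foreground different aspects of the same friends network.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# An invented friends network.
G = nx.Graph()
G.add_edges_from([("ann", "bob"), ("bob", "carl"), ("ann", "carl"),
                  ("carl", "dora"), ("dora", "eve")])

# One possible processing: rank people by betweenness centrality (who bridges groups?) ...
print(nx.betweenness_centrality(G))
# ... or another: partition the network into communities instead.
print(list(greedy_modularity_communities(G)))
# Adding interests, likes, and shares as further nodes, attributes, or edge weights
# would multiply the possible readings of the same data again.
```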
Each algorithmic technique highlights certain aspects of the data, in different ways; each technique can be further adjusted by changing the specific parameters that control it; and such parameters are increasingly set by continuous testing. Here empirical practices and operational goals converge, as actual behaviours and algorithmic results can be compared against each other. We move from algorithm to practice, as it provides certain outputs for users to engage with, but also back from practice to algorithm, as the algorithm is trained to better anticipate actual uses.
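A minimal sketch of such parameter-setting by testing, with an invented scoring formula, click log, and evaluation metric: candidate parameter values are compared against observed behaviour, and the best-matching one is kept.

```python
observed_clicks = {"item_a": 40, "item_b": 10, "item_c": 50}
recency = {"item_a": 0.2, "item_b": 0.9, "item_c": 0.7}
popularity = {"item_a": 0.8, "item_b": 0.3, "item_c": 0.6}

def ranked(weight_recency: float) -> list[str]:
    """Score items with one tunable parameter and return them best-first."""
    score = {i: weight_recency * recency[i] + (1 - weight_recency) * popularity[i]
             for i in observed_clicks}
    return sorted(score, key=score.get, reverse=True)

def agreement(ranking: list[str]) -> int:
    """Crude evaluation: how many observed clicks did the top-ranked item collect?"""
    return observed_clicks[ranking[0]]

# "Continuous testing", radically simplified: try candidate settings, keep the best one.
best = max([0.0, 0.25, 0.5, 0.75, 1.0], key=lambda w: agreement(ranked(w)))
print(best, ranked(best))  # 0.5 ['item_c', 'item_b', 'item_a']
```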
This moves past any intentional interpretation of specific signals, and instead strives to make things inherently calculable; every signal receives a certain value which feeds back into the evaluation. The result is a kind of generated (or even generative?) algorithm which sits outside Tarleton Gillespie's editorial/algorithmic dichotomy – and there is growing industry rhetoric around such algorithms as new modes of empirical validation.
Such algorithms fold into one the analysis of data and the production of order – they are engines of order. This move from classification to calculation is a move from essence to relation, and a good analogue to such algorithmic configurations is found in multi-sided markets; these, too, are places of truth in the sense that truth is produced as a byproduct of their optimal functioning. The right algorithm produces the optimal equilibrium between user satisfaction and value extraction.
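A toy rendering of that "optimal equilibrium" as an objective function; the satisfaction and revenue curves and the trade-off weight below are invented purely for illustration.

```python
def satisfaction(n_ads: int) -> float:
    return 1.0 - 0.05 * n_ads              # assumed: satisfaction falls with ad load

def revenue(n_ads: int) -> float:
    return 0.2 * (1 - 0.6 ** n_ads)        # assumed: revenue rises, with diminishing returns

def objective(n_ads: int, alpha: float = 0.5) -> float:
    """Balance user satisfaction against value extraction with one weight."""
    return alpha * satisfaction(n_ads) + (1 - alpha) * revenue(n_ads)

best_n = max(range(0, 11), key=objective)
print(best_n, round(objective(best_n), 3))  # settles on a small, non-zero ad load
```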
We should not dismiss such developments out of hand, or view their proponents from simplistic perspectives: they are neither simply naive nor cynical. We face a series of important problems: there is likely to be further concentration and concentric diversification of large Internet companies through tipping markets; operational concepts of knowledge and truth will become even more pervasive; privacy issues pale compared to the threat of knowledge monopolisation and the commercial reconfiguration of publicness; and political institutions are horribly unprepared for dealing with algorithmic engines of order, both technically and normatively.