Classes of Content in Content Moderation Approaches

The next speakers in this AoIR 2023 session are João Carlos Magalhães and Emilie de Keulenaar, who begin by outlining the recent history of platform content moderation – from the relatively minimalist approach of the 2000s to early 2010s, influenced by a maximalist and very American understanding of free speech and executed mainly through manual means, to the more interventionist moderation since the mid-2010s, recognising the multiple harms of unlimited free speech, building on a more European and international human rights framework, and utilising much more automated means of moderation.

Content moderation is among the most consequential and controversial systems of speech control ever created; it is shaped by legal, technological, economic, and political factors on a global scale. Do these complexities mean that effective content moderation is essentially impossible, however? A history of content moderation enables us to examine the structures of power that are involved in its creation and operation, and to reconsider whether and how it has worked and how it should work (and should be regulated). It may also highlight the potential paths not taken.

The present study focusses on Twitter, collecting monthly archives of Twitter’s About and Rules pages relating to hateful and abusive content from the Wayback Machine. This showed a gradual expansion of these rules, as well as a shift in focus: from an early focus on violent and physical threats, through attention to spam and to hateful and abusive conduct, to a major shift in Twitter’s approach after its acceptance of the Santa Clara Principles.
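The presentation doesn’t detail the authors’ exact tooling, but a minimal sketch of this kind of monthly sampling might use the Internet Archive’s CDX API, collapsing captures to one per month; the URL, date range, and helper name below are illustrative assumptions rather than the authors’ method:

```python
# Hypothetical sketch: retrieve roughly one archived capture per month of a
# Twitter policy page via the Wayback Machine's CDX API.
import requests

CDX_API = "http://web.archive.org/cdx/search/cdx"

def monthly_snapshots(page_url, start="2009", end="2023"):
    """Return Wayback Machine snapshot URLs, at most one per month."""
    params = {
        "url": page_url,
        "output": "json",
        "from": start,
        "to": end,
        "filter": "statuscode:200",   # keep only successfully archived captures
        "collapse": "timestamp:6",    # first 6 digits = YYYYMM, i.e. one per month
        "fl": "timestamp,original",
    }
    rows = requests.get(CDX_API, params=params, timeout=30).json()
    if not rows:
        return []
    captures = rows[1:]               # first row is the CDX field header
    return [f"https://web.archive.org/web/{ts}/{orig}" for ts, orig in captures]

# e.g. monthly versions of the Twitter Rules page (illustrative URL)
for snapshot in monthly_snapshots("twitter.com/rules"):
    print(snapshot)
```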

The speakers distinguish three classes of content: ‘ugly’ content and behaviours that are always moderated; ‘bad’ hate speech and abusive behaviour whose objectionability is variable; and content of variable or emergent objectionability that may be seen as problematic today but could become more acceptable at a later stage (e.g. as governments, platform leadership, or political contexts change). Moderation and enforcement approaches to these forms of content may change over time.

‘Ugly’ content is always objectionable in part because it is defined not by Twitter but by other, governmental and transnational actors like the Global Internet Forum to Counter Terrorism. ‘Bad’ content, by contrast, is defined by platform policies and addressed through internal dispute resolution processes, which evolve over time. Content of variable or emergent objectionability attracts uneven and changing moderation practices that the platform may adjust on a whim or in response to user activism or public pressure.

This points to a form of moderated moderation – a pragmatic, crisis-resistant speech architecture that is able to withstand external shocks resulting from the unpredictable ways in which speech becomes unacceptable.