Snurblog — Axel Bruns

Challenges in Building Moderation Bots

Snurb — Thursday 12 June 2025 22:18
Artificial Intelligence | Social Media | Bots Building Bridges 2025 | Liveblog

The next speakers at the Bots Building Bridges project workshop are Zlata Kikteva and Arthur Romazanov, representing the DeLab (or Deliberation Laboratory) team at the University of Passau. Their team has developed a bot to take on moderation tasks.

This builds on research by members of the team on how humans moderate online discussions, which has explored key moderation strategies: soft moderation such as probing for elaborations, tone policing, social norm policing, agenda control, fact-checking, and inviting experts to contribute, as well as hard moderation such as removing content and users from the discussion.

But can we delegate such moderation effectively to bots? Such moderation might need diverse background and contextual knowledge on the discussion topic, the nature of the community, and the specific participants involved in order to determine appropriate courses of action; could bots ever model these human attributes?

At the same time, automated moderation remains desirable since human moderation is labour-intensive, expensive, challenging for large content volumes, not available around the clock, and especially difficult to sustain for small platforms; human moderators can also be perceived to be more biased than automated systems.

This project conducted ethnographic field work on the small-scale FoodLog network as well as a self-moderation study on Xitter and intervention studies on Reddit and Mastodon; this produced some 211 conversation samples, around 41% of which carried 'true' labels.

From this, it was possible to identify nine features that could be operationalised in the making of moderation decisions; these features are not all necessarily equivalent to each other (e.g. the detection of hate speech would always lead to an intervention, irrespective of other features), but a combination of these assessments can then be used to determine whether a given comment reaches a set threshold for intervention.
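Purely as an illustration of that kind of decision logic (the feature names, weights, and threshold below are hypothetical, not the DeLab team's actual operationalisation), a combined assessment with a hard override for hate speech might be sketched along these lines:

```python
# Hypothetical sketch: combine weighted feature assessments against a threshold,
# with some features (e.g. hate speech) always triggering an intervention.

FEATURE_WEIGHTS = {
    "needs_elaboration": 0.4,
    "off_topic": 0.5,
    "uncivil_tone": 0.7,
    "unsupported_claim": 0.6,
    # ... further features as operationalised by the project
}

AUTO_TRIGGER_FEATURES = {"hate_speech"}  # always intervene, regardless of other scores
INTERVENTION_THRESHOLD = 1.0             # illustrative cut-off


def should_intervene(feature_scores: dict[str, float]) -> bool:
    """Return True if a comment's feature assessments warrant an intervention."""
    # Hard override: some detections always lead to an intervention.
    if any(feature_scores.get(f, 0.0) > 0 for f in AUTO_TRIGGER_FEATURES):
        return True
    # Otherwise, combine the weighted feature scores and compare to the threshold.
    combined = sum(FEATURE_WEIGHTS.get(f, 0.0) * s for f, s in feature_scores.items())
    return combined >= INTERVENTION_THRESHOLD
```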

If the threshold is met, this triggers a prompt for a Large Language Model to produce the content of the intervention in response to the preceding conversation.
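A minimal sketch of this final step, assuming (for illustration only) an OpenAI-style chat completions client; the model name and prompt wording here are assumptions rather than the project's actual implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is available in the environment


def generate_intervention(conversation: list[str]) -> str:
    """Generate a moderation comment in response to the preceding conversation."""
    thread = "\n".join(conversation)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a discussion moderator. Write a brief, polite "
                        "intervention that addresses the problems in the latest comment."},
            {"role": "user", "content": thread},
        ],
    )
    return response.choices[0].message.content
```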
