The next speaker in this AoIR 2024 conference session is Nanna Bonde Thylstrup, who begins by noting the critical role of data annotation practices in shaping the machine learning process underlying generative AI; such annotation is a world-making practice, must align with editorial values and the journalistic ethos of objectivity, and can of course also reproduce pre-existing societal biases.
In a sense, then, the algorithms of generative AI must also seek to reproduce (and perhaps improve upon) the famous ‘gut feeling’ of conventional human journalism. The present project worked with developers and data annotators at Danish news organisations – but notably not with journalists, who never engage with such in-house divisions – to observe internal discussion processes about the development and deployment of such technologies, and noted typical divisions between different perspectives (development, financial, management) and philosophies (objective, moral, procedural) about data annotation.
Part of this is about developing an in-house data annotation protocol, which must reflect the news outlet’s positioning, the assumed values of its audience, and the complexity of the real world; this cannot necessarily build on existing generic taxonomies that may be available from outside sources and projects. And while such work is critical to the functioning of the news organisation, it is also largely invisible within the organisation, since it is seen as mere support work for the core business; this is also reflected in the physical setup of the organisation, where the newsroom and the IT department are located on different floors and do not generally interact with each other.
There is also friction between merely modelling existing reality and modelling reality as people might think it should be; choosing between these is a political decision, and necessarily biases the model. Annotators were very aware of the world-building they contributed to, and of the long-term consequences that such world-building might have. How such decision-making takes place across the many organisations in this space requires a great deal more investigation.