Snurblog — Axel Bruns

Exploring the Use of LLMs in News Content Coding

Snurb — Tuesday 15 July 2025 17:11
Politics | Polarisation | Journalism | Industrial Journalism | Artificial Intelligence | Dynamics of Partisanship and Polarisation in Online Public Debate (ARC Laureate Fellowship) | IAMCR 2025 | Liveblog |

The final speaker in this session at the IAMCR 2025 conference in Singapore is my excellent colleague Laura Vodden, presenting on the methodology of our ongoing analysis of climate coverage in the Australian media. This explores patterns of polarisation within journalistic content, but polarisation is not particularly well-defined in the literature, so we have developed the concept of destructive polarisation as an approach to defining when polarisation becomes problematic.

There is no clear information on how polarised the Australian media landscape is. This project therefore examines climate change coverage across some 26 Australian news outlets, from the mainstream to the margins, sampling articles from these outlets over two constructed weeks.
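For readers unfamiliar with constructed-week sampling, here is a minimal sketch of the general idea: for each day of the week, pick random dates from the study period, so that two constructed weeks contain two Mondays, two Tuesdays, and so on. The function name, the seed, and the 2024 date range are my own assumptions for illustration; the talk did not specify the sampling period or procedure.

```python
import random
from collections import defaultdict
from datetime import date, timedelta


def constructed_weeks(start, end, n_weeks=2, seed=0):
    """Pick n_weeks random dates per weekday between start and end (inclusive).

    Hypothetical helper: illustrates constructed-week sampling in general,
    not the project's actual procedure.
    """
    rng = random.Random(seed)

    # Group every date in the period by its weekday (0 = Monday ... 6 = Sunday).
    by_weekday = defaultdict(list)
    day = start
    while day <= end:
        by_weekday[day.weekday()].append(day)
        day += timedelta(days=1)

    # Draw n_weeks dates without replacement for each weekday.
    sample = []
    for weekday in range(7):
        sample.extend(rng.sample(by_weekday[weekday], n_weeks))
    return sorted(sample)


# Assumed study period, for illustration only.
sample_dates = constructed_weeks(date(2024, 1, 1), date(2024, 12, 31))
```

Articles published on the sampled dates would then form the corpus, which keeps day-of-week publishing rhythms balanced without having to code every day in the period.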

The project manually coded these articles for seven different variables, including which actors and actor types are featured, and what claims are made about climate change and about other political actors. Such detailed coding is slow, laborious, and expensive, however, so the project is also exploring the use of Large Language Models like ChatGPT-4o to extend the coding to a larger dataset.

So far, this has covered three areas. First, determining whether climate change is in fact the central topic of an article. For this, the LLM performed better than older methods like LDA or BERTopic, though it does not always agree with the human coders (and the human coders are not always unanimous in their views either).

Second, identifying the actors named in an article, and assigning an actor category. For the identification step, the challenge is to clearly extract the text span that describes an actor (including or excluding indefinite or definite articles, as well as further role descriptions that may also be present). Actor categories are chosen from a predefined list, so matches between human and LLM coders are easier to assess there. LLM runs turned out to produce highly consistent (but not entirely identical) results; LLM and human coder agreement is generally acceptable, and similar to agreement levels between different human coders. In some edge cases, the LLM performs better than the human coders.
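To make the span-matching problem concrete, here is a minimal sketch of one lenient way to compare an actor span chosen by a human coder with one returned by an LLM: strip a leading article, ignore case and spacing, and accept a match when one span contains the other (which tolerates an added role description). The function names and the matching rule are my own assumptions, and the example strings are illustrative rather than taken from the project's data.

```python
import re

# Leading definite or indefinite article, followed by whitespace.
LEADING_ARTICLE = re.compile(r"^(the|a|an)\s+", re.IGNORECASE)


def normalise_span(span):
    """Collapse whitespace, drop a leading article, and ignore case."""
    text = " ".join(span.split())
    text = LEADING_ARTICLE.sub("", text)
    return text.casefold()


def spans_match(human_span, llm_span):
    """Lenient match: equal after normalisation, or one span contains the other.

    Assumed rule for illustration; containment can over-match short spans,
    so a real pipeline would likely need manual spot checks.
    """
    h, m = normalise_span(human_span), normalise_span(llm_span)
    if not h or not m:
        return False
    return h == m or h in m or m in h


# Illustrative comparisons (not project data).
article_only = spans_match("the Climate Council", "Climate Council")
with_role = spans_match("Energy Minister Chris Bowen", "Chris Bowen")
different = spans_match("Climate Council", "Greenpeace")
```

A stricter variant would require exact equality after normalisation; which rule is appropriate depends on how much the coding scheme cares about role descriptions.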

Finally, classifying the claims made in the article again produced good consistency between LLM coding runs, and solid agreement between LLMs and humans; the human coders were themselves often somewhat inconsistent with one another.
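Agreement between two coders on categorical labels, whether human and human, human and LLM, or two LLM runs, is commonly summarised with a chance-corrected statistic. The talk did not say which measure the project uses, so the sketch below assumes Cohen's kappa purely as an illustration; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: share of items where both coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each coder's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy labels, for illustration only.
human = ["politician", "scientist", "activist", "politician"]
llm = ["politician", "scientist", "politician", "politician"]
kappa = cohens_kappa(human, llm)
```

Running the same function on pairs of human coders and on pairs of repeated LLM runs gives comparable numbers for the three kinds of consistency described above.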

Overall, then, LLMs have potential in content coding, and their use often fosters a reflective approach to the research, since it can also point out issues with the human coding itself. LLMs also continue to develop, of course, so further testing across a broader range of models will be important too.


Except where otherwise noted, this work is licensed under a Creative Commons BY-NC-SA 4.0 Licence.