The final speaker in this session at the IAMCR 2025 conference in Singapore is my excellent colleague Laura Vodden, presenting on the methodology of our ongoing analysis of climate coverage in the Australian media. The project explores patterns of polarisation in journalistic content; however, polarisation is not particularly well-defined in the literature, so we have developed the concept of destructive polarisation as an approach to identifying when polarisation becomes problematic.
There is no clear information on how polarised the Australian media landscape actually is. The project therefore examines climate change coverage across some 26 Australian news outlets, from the mainstream to the margins, drawing on the articles these outlets published during two constructed weeks.
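To illustrate what constructed-week sampling involves (this is a generic sketch, not the project's actual sampling code, and the study period shown is hypothetical), one random Monday, one random Tuesday, and so on are drawn from across the study period for each constructed week:

```python
# Generic constructed-week sampling sketch; dates and parameters are hypothetical.
import random
from datetime import date, timedelta

def constructed_weeks(start: date, end: date, n_weeks: int = 2, seed: int = 42):
    """Sample n constructed weeks: one random Monday, Tuesday, ..., Sunday each."""
    rng = random.Random(seed)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    weeks = [[] for _ in range(n_weeks)]
    for weekday in range(7):  # 0 = Monday ... 6 = Sunday
        picks = rng.sample([d for d in days if d.weekday() == weekday], n_weeks)
        for week, day in zip(weeks, picks):
            week.append(day)
    return [sorted(week) for week in weeks]

# Hypothetical one-year study period; the project's actual date range may differ.
for week in constructed_weeks(date(2023, 1, 1), date(2023, 12, 31)):
    print([d.isoformat() for d in week])
```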
The project manually coded these articles for seven different variables, including which actors and actor types are featured, and what claims are made about climate change and about other political actors. Such detailed coding is slow, laborious, and expensive, however, and the project is therefore also exploring the use of Large Language Models (LLMs) such as OpenAI's GPT-4o to extend this coding process to a larger dataset.
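As a rough sketch of what such LLM-assisted coding can look like (the prompt, the variable list, and the category scheme below are illustrative stand-ins, not the project's actual codebook or pipeline):

```python
# Minimal sketch of LLM-assisted content coding via the OpenAI API; the prompt
# and variables are illustrative only, not the project's actual codebook.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

CODING_PROMPT = (
    "You are a content coder. For the news article below, return JSON with: "
    "'central_topic_climate' (true/false), 'actors' (named actors with their "
    "text spans and a category from: politician, scientist, activist, business, "
    "citizen, other), and 'claims' (claims made about climate change)."
)

def code_article(article_text: str) -> str:
    """Send one article to the model and return its raw JSON coding."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # reduce variation between repeated coding runs
        messages=[
            {"role": "system", "content": CODING_PROMPT},
            {"role": "user", "content": article_text},
        ],
    )
    return response.choices[0].message.content
```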
So far, this has covered a number of coding tasks. The first is determining whether an article covers climate change as its central topic: here, the LLM performed better than older approaches such as LDA or BERTopic topic modelling, though it does not always agree with the human coders (and the human coders are not always unanimous in their views either).
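Once the LLM's relevance decisions are in hand, they can be compared against the human coders' majority decisions much like any binary classifier; a minimal sketch with made-up labels (not the project's results) might look like this:

```python
# Comparing binary "centrally about climate change?" decisions from the LLM
# against the human coders' majority labels; the labels here are made up.
from sklearn.metrics import classification_report

human_majority = [1, 1, 0, 1, 0, 0, 1, 0]  # 1 = centrally about climate change
llm_decision   = [1, 1, 0, 1, 1, 0, 1, 0]

print(classification_report(human_majority, llm_decision,
                            target_names=["not central", "central"]))
```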
Second, identifying the actors named in an article and assigning each an actor category: here, the challenge is to cleanly extract the text span that describes an actor (including or excluding indefinite or definite articles, as well as any further role descriptions that may also be present); actor categories, by contrast, are chosen from a predefined list, so matches between human and LLM coders are easier to assess. LLM runs turned out to produce highly consistent (though not entirely identical) results; LLM and human coder agreement is generally acceptable, and similar to the levels of agreement between different human coders. In some edge cases, the LLM even performs better than the human coders.
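One common way to put such comparisons on an equal footing is a chance-corrected agreement measure such as Cohen's kappa, computed for human-human as well as human-LLM pairs; here is a sketch with placeholder labels (the category scheme shown is not the project's actual list):

```python
# Pairwise agreement on actor categories via Cohen's kappa; the labels and
# category scheme below are placeholders, not the project's actual data.
from sklearn.metrics import cohen_kappa_score

coder_a = ["politician", "scientist", "activist", "politician", "business"]
coder_b = ["politician", "scientist", "politician", "politician", "business"]
llm_run = ["politician", "scientist", "activist", "politician", "other"]

print("human vs. human:", cohen_kappa_score(coder_a, coder_b))
print("human vs. LLM:  ", cohen_kappa_score(coder_a, llm_run))
```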
Finally, classifying the claims made in each article again produced good consistency between LLM coding runs, and solid agreement between LLMs and humans; the human coders were also often somewhat inconsistent with one another.
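Run-to-run consistency can be checked in a similarly simple way, for instance by asking how often repeated LLM coding runs assign the same claim category to an article, and by taking the majority label; again a sketch with invented labels:

```python
# Consistency across repeated LLM coding runs for the same articles; the claim
# categories and labels below are invented for illustration.
from collections import Counter

runs = [
    ["supports_action", "disputes_science", "supports_action", "neutral"],
    ["supports_action", "disputes_science", "neutral", "neutral"],
    ["supports_action", "disputes_science", "supports_action", "neutral"],
]

per_article = list(zip(*runs))  # the labels each run gave to one article
unanimous = sum(len(set(labels)) == 1 for labels in per_article) / len(per_article)
majority = [Counter(labels).most_common(1)[0][0] for labels in per_article]

print(f"articles with unanimous labels across runs: {unanimous:.0%}")
print("majority labels:", majority)
```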
Overall, then, LLMs have potential in content coding, and their use often fosters a more reflective approach to the research, since the comparison can also point out issues with the human coding itself. LLMs also continue to develop, of course, so further testing across a broader range of models will be important too.