I disappeared on summer holidays pretty much immediately after my keynote on practice mapping at the ACSPRI conference in Sydney in late November, so I haven’t yet had a chance to round up my own and our joint last few publications for the year (as well as a handful of early arrivals from 2025). And what a year it’s been – although it’s felt as if I’ve taken a more supportive than leading role these past few months, there have still been quite a few new developments, and a good lot more to come. I’ll group these thematically here:
And the final speaker in this final AoIR 2024 conference session is the excellent Fabio Giglietto, whose focus is on coding Italian news data using Large Language Models. The project worked with some 85,000 news articles shared on Facebook during the 2018 and 2022 Italian elections, and first classified these URLs as political or non-political; it then produced and clustered text embeddings for these articles, and used GPT-4-turbo to classify the dominant topics in these clusters.
This required considerable prompt crafting, not least to ensure that prompts remained within the LLM’s token limits. Key challenges here included the choice of LLM …
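To illustrate the token-limit constraint mentioned above, here is a minimal sketch of how article texts might be packed into prompt batches that stay under a fixed budget. This is not the method presented in the talk: the function names, the greedy packing strategy, and the whitespace-based token estimate are all my own assumptions, and a real pipeline would count tokens with the model's actual tokenizer.

```python
from typing import Iterable, List


def approx_tokens(text: str) -> int:
    """Rough token estimate: count whitespace-separated words.

    Assumption for illustration only; real LLM tokenizers count differently.
    """
    return len(text.split())


def batch_within_budget(
    texts: Iterable[str], instructions: str, max_tokens: int
) -> List[List[str]]:
    """Greedily pack texts into batches so that the instructions plus the
    texts in each batch stay within max_tokens (by the rough estimate)."""
    base = approx_tokens(instructions)
    if base >= max_tokens:
        raise ValueError("instructions alone exceed the token budget")

    batches: List[List[str]] = []
    current: List[str] = []
    used = base
    for text in texts:
        cost = approx_tokens(text)
        if base + cost > max_tokens:
            # A single text that cannot fit even on its own is truncated.
            words = text.split()[: max_tokens - base]
            text = " ".join(words)
            cost = len(words)
        if current and used + cost > max_tokens:
            # Close the current batch and start a new one.
            batches.append(current)
            current, used = [], base
        current.append(text)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Each resulting batch could then be sent as one prompt; whether truncating oversized articles is acceptable depends on the coding task, so that step in particular should be read as a placeholder.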
The next speaker in this final AoIR 2024 conference session is the great Hendrik Meyer, whose interest is in detecting stances in climate change coverage. This focusses especially on climate change debates in German news media, and in particular on climate protests, discussions about speed limits, and discussions about heating and heat pump regulations.
Here stances might be better understood as evaluations related to a given issue or policy, and Large Language Models can be useful tools in assessing this, but this also requires considerable prompt crafting in order to generate consistent results. Computational costs for doing so (especially with complex prompts) …
The next speaker in this session at the AoIR 2024 conference is my QUT colleague Tariq Choucair, whose focus is especially on the use of LLMs in stance detection in news content. A stance is a public act by a social actor, achieved dialogically through communication, which evaluates objects, positions the self and other subjects, and aligns with other subjects within a sociocultural field.
Here, the focus is broadly on stances towards issues, persons, groups, and organisations. There are some tools for detecting such stances, but they mainly focus on English-language content, are designed for specific types of data, and tend …
The second speaker in this final session at the AoIR 2024 conference is Bruna Silveira de Oliveira, whose focus is on using LLMs to study content in the Brazilian manosphere. Extremist groups in this space seek legitimisation, and the question here is whether LLMs can be used productively to analyse their posts.
This analysis focusses on some 2,500 episodes of Brazilian masculinist podcasts across ten streaming platforms. It engaged in an assisted content analysis using OpenAI’s GPT-4 model, and explored whether this could identify detailed variables in the content. The podcast episodes were transcribed using automated tools, and 52 episodes …
The final (!) session at this wonderful AoIR 2024 conference is on content analysis, and starts with Ahrabhi Kathirgamalingam. Her interest is especially in questions of agreement and disagreement between content codings; the gold standard here has for a long time been intercoder reliability, but this tends to presume a single ground truth which may not exist in all coding contexts.
The concept of ‘constructs of marginalisation’ might be useful here: marginalised people are underrepresented; existing structural power defines who defines such constructs; they are historically and culturally shaped; and explicit as well as ambiguous and evasive language that discriminates …
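For readers unfamiliar with intercoder reliability as mentioned above, here is a minimal sketch of one conventional agreement measure, Cohen's kappa for two coders. The choice of this particular statistic is my assumption; the talk excerpt does not say which measure was used, and the function name is hypothetical. Kappa compares observed agreement with the agreement expected by chance from each coder's label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who labelled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share of
    items with matching labels and p_e is the agreement expected by chance.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty set of items")

    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: product of the two coders' marginal label shares,
    # summed over all labels (labels unused by either coder contribute zero).
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

A statistic like this treats every disagreement as error against one correct coding, which is exactly the presumption of a single ground truth that the talk calls into question.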