How Microsoft Copilot Provided (Mis)information about the 2024 Taiwanese Presidential Election

The third presenter in this IAMCR 2024 session is Joanne Kuai, whose interest is in LLM-powered chat bots and search engines. There is a considerable shift now underway in search: instead of presenting a list of search results, search engines are gradually moving towards presenting a summary of the search topic, with references attached. This is true for Google’s Gemini, Microsoft Copilot, and Baidu search, and is especially important as more than half of the world’s population participates in elections in 2024.

This project focussed on results from Microsoft Copilot on the Taiwanese presidential election earlier in 2024. In particular, it assessed dimensions of information readiness, factual correctness, the norms represented in responses, and sourcing practices across five languages (English, traditional and simplified Chinese, German, and Swedish). The study took a multi-turn conversational approach, querying Copilot from Sweden in January 2024, some days before the election, and posed several key questions related to the election.

Results from this were coded manually, and show that the chat bot answered nearly 60% of all questions; it deflected mainly on questions about the best candidate to vote for, and its deflection strategies also differed between the two versions of Chinese (which represent usage in Taiwan and mainland China, respectively). Factual correctness was largely good: German answers had the lowest error rate (possibly as a result of earlier critical coverage by AlgorithmWatch), while fully 50% of responses in traditional Chinese were factually incorrect (stating, for instance, that there were no election candidates yet, or that an ineligible candidate would run again). In English, one factual error concerning the leading candidates’ polling performance persisted throughout.

Explicit norms were also frequently expressed, mostly declaring the chat bot’s neutrality. Sourcing behaviours also differed across languages, and information was sometimes attributed to the wrong sources. At this stage, therefore, this particular AI chat bot should not be trusted to provide reliable information.