Addressing the Wicked Problem of Account Matching across Platforms

Snurb — Thursday 26 September 2024 03:30

The next speaker in this ECREA 2024 session is Azade Kakavand, whose interest is in mapping far-right voices across platforms. This is methodologically difficult because it requires matching user identities across platforms, and far-right actors in particular are well-known for using multiple platforms for a variety of distinct purposes.

The present study employs the process of user identity linkage (UIL), which was developed in computer science for user profiling, marketing, and cybersecurity purposes. Here, however, the approach is not limited to natural persons but is applied to human and non-human accounts of any kind. The project draws on data from some 1,000 Facebook pages and 40,000 Twitter accounts, which it identified through a three-step snowball sample that started with some 40 German far-right accounts.

Matching between these accounts started with a manually matched gold-standard sample of accounts across both platforms, with additional categorisation of actor types and affiliations; computational user identity linkage then built on attribute features (user names, screen names, etc.) and content features (post content, temporal distribution). This is a reduced feature set compared to UIL in other contexts, in part due to limitations in data availability.
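To make the attribute-based step more concrete, here is a minimal Python sketch of how exact matching on handles or names could be scored against a manually matched gold standard. It is an illustration only, not the project's actual pipeline: the Account type, the normalise() rule, the helper functions, and all account data are hypothetical and invented for this example.

```python
from typing import NamedTuple


class Account(NamedTuple):
    """Hypothetical minimal account record (not the project's data model)."""
    platform: str
    handle: str
    name: str


def normalise(text: str) -> str:
    """Case-fold and keep only letters and digits (an assumed normalisation rule)."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def exact_matches(facebook, twitter, field):
    """Pair accounts whose normalised `field` ("handle" or "name") is identical."""
    index = {}
    for tw in twitter:
        key = normalise(getattr(tw, field))
        if key:  # skip empty keys so blank fields never match each other
            index.setdefault(key, []).append(tw)
    pairs = set()
    for fb in facebook:
        for tw in index.get(normalise(getattr(fb, field)), []):
            pairs.add((fb.handle, tw.handle))
    return pairs


def precision_recall(predicted, gold):
    """Compare predicted pairs with the manually matched gold-standard pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented toy data, purely for illustration.
facebook = [
    Account("facebook", "example.party", "Example Party"),
    Account("facebook", "newsblog.de", "News Blog"),
]
twitter = [
    Account("twitter", "Example_Party", "Example Party e.V."),
    Account("twitter", "nb_de", "News Blog"),
]
gold = {("example.party", "Example_Party"), ("newsblog.de", "nb_de")}

by_handle = exact_matches(facebook, twitter, "handle")
by_name = exact_matches(facebook, twitter, "name")
print("handle:", precision_recall(by_handle, gold))
print("name:  ", precision_recall(by_name, gold))
```

In this toy data, each attribute recovers a different one of the two gold-standard pairs, which hints at why a single attribute is rarely sufficient on its own.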

Some 111 user handles matched perfectly, as did 206 user names; the remainder were more complicated. Fuzzy matching was unsuccessful, especially for party names and abbreviations: AfD and FDP use almost the same set of letters, for example, but are very different parties. Overall, name matching was more successful than handle matching, but neither measure is ideal.
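The AfD/FDP problem is easy to reproduce. The following Python sketch uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure; the talk did not specify which fuzzy matching method was used, so the measure, the 0.6 threshold, and the length guard are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.6  # hypothetical cut-off, not a value from the study


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def letter_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the sets of letters used in each string."""
    set_a, set_b = set(a.casefold()), set(b.casefold())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def fuzzy_match(a: str, b: str, threshold: float = THRESHOLD) -> bool:
    """Naive fuzzy matching: accept any pair at or above the threshold."""
    return similarity(a, b) >= threshold


def guarded_match(a: str, b: str, min_len: int = 5) -> bool:
    """One possible mitigation (an assumption, not the speaker's method):
    require exact equality for short strings such as party abbreviations."""
    if min(len(a), len(b)) < min_len:
        return a.casefold() == b.casefold()
    return fuzzy_match(a, b)


print(similarity("AfD", "FDP"))      # shared substring "fd" drives the score up
print(letter_overlap("AfD", "FDP"))  # two of four distinct letters are shared
print(fuzzy_match("AfD", "FDP"))     # naive matcher accepts the pair
print(guarded_match("AfD", "FDP"))   # guarded matcher rejects it
```

With these assumed settings, the naive matcher treats the two abbreviations as the same actor, while the length guard rejects them; the guard in turn would miss genuine abbreviation variants, so it trades one kind of error for another.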

So, more work is needed here. More manual coding is necessary for the development of a gold-standard dataset; sentence embedding and named entity recognition might also help connect accounts.
