The next speaker at the Social Media Access Days at the German National Library is Robert Jäschke. He begins by noting the legal constraints on social media data sharing, including Terms of Service, copyright, and other restrictions. One way of managing this is the approach that Twitter took: sharing datasets that contained only lists of tweet IDs, without any further content, was allowed, and researchers then needed to ‘rehydrate’ them by regathering the tweet data. Another approach is to share only aggregate metrics rather than the source data themselves; or to share derived datasets (like term matrices, n-gram datasets, or word embeddings) …
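To make the idea of a derived dataset a little more concrete, here is a minimal Python sketch of my own (not something shown in the talk): it reduces a handful of made-up posts to bigram counts and keeps only those n-grams that occur at least twice, so that neither the original texts nor any post IDs are part of what is shared. The function names `ngram_counts` and `shareable`, the example posts, and the `min_count` threshold are all assumptions for illustration.

```python
from collections import Counter


def ngram_counts(texts, n=2):
    """Count word n-grams across all texts (naive whitespace tokenisation)."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        # zip over n shifted copies of the token list to form the n-grams;
        # texts shorter than n tokens simply contribute nothing
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


def shareable(counts, min_count=2):
    """Keep only n-grams seen at least min_count times (hypothetical threshold)."""
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


# Made-up example posts; a real project would load its collected data here.
posts = [
    "data access is hard",
    "data access is limited",
    "research needs data access",
]

derived = shareable(ngram_counts(posts, n=2))
print(derived)
```

Whether a dataset like this is actually permissible to share still depends on the platform's terms and the applicable law; the threshold only reduces the chance that rare phrases point back to individual posts.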
And the next session at the Social Media Access Days at the German National Library starts with Ofra Klein, who will outline the challenges of studying far-right mobilisation in spite of the constraints of social media data access regimes. The far right use social media very extensively to promote their propaganda, and this can lead to physical demonstrations, riots, and violence; as and when this happens, social media posts and accounts are then often removed by the perpetrators or the platforms, complicating any meaningful research.
Further, the range of platforms used for far-right agitation has diversified substantially; in addition …
Day Two at the Social Media Access Days in Frankfurt starts with my keynote, taking stock of how we access and engage with social media data nearly ten years after the drastic changes to many platforms' data access regimes following the 2018 Cambridge Analytica scandal. Back then, I wrote an article for Information, Communication & Society about the APIcalypse; how have things developed since then?
The final speaker at the Social Media Access Days at the German National Library today is Kristina Petzold, whose focus is on the question of whether music-related user-generated content can be seen as cultural heritage – this includes, for instance, some of the creative content generated and shared during the COVID-19 pandemic, and is part of post-digital everyday practice.
This includes content remixes, memes, and mashups, and is therefore highly referential; it is culturally relevant (and the cultural relevance of remix practices is now formally recognised under German law); but it is also highly ephemeral, especially where it exists in …
The next speakers in this session at the Social Media Access Days at the German National Library are Gabriel Viehhauser and Carl Friedrich Haak, whose interest is in making use of donated social media data – the concrete context here is that the Austrian author Clemens J. Setz, who has at times posted some of his short-form work on Twitter, has donated his archive of tweets to a library in Vienna, which was unsure about what to do with this gift.
Such work is diverse in its formats; further, Setz is author, but also interlocutor, curator, recipient, object of mentions …
The next speaker in this session at the Social Media Access Days at the German National Library is Catharina Ochsner, whose focus is on the archiving of scholarly blogs. Such blogs are engaged in science communication and thereby introduce more transparency into the scientific process; they exist in many different formats and across various major and minor platforms, and frequently link to each other and to other external resources.
But their long-term availability is limited, and depends on the blogger’s continued activity. There is a need for long-term archival of such resources in their original form, which also implies a …
The second session at the Social Media Access Days at the German National Library begins with a paper by Mia Berg and Oliver Vettermann, who examine social media data scraping, with a particular focus on TikTok. TikTok does offer an API for data access (at least in Europe), but unfortunately it remains severely limited and unreliable; this is problematic given that many user practices and content formats are in urgent need of further analysis. One example of such a content genre is AI-generated video content, such as POV videos that purport to imagine historical situations.
The next speaker in this session at the Social Media Access Days at the German National Library is Katharina Maubach, whose focus here is on data formats for archiving social media data. She works with a project exploring liking activities on social media platforms, especially relating to content from news sites; this covers Disqus, Facebook, YouTube, Xitter, and Instagram.
Ideally, such a cross-platform dataset should be shared with other researchers under FAIR principles (findable, accessible, interoperable, and reusable), but under the Terms of Service of such platforms and their data access conditions this is very difficult; the focus of Katharina’s …
And next up at the Social Media Access Days at the German National Library are Marco Wähner and Jan Dennis Gumz, exploring the further use of Wikipedia data on the early German federal election in 2025. Because of the unusual circumstances of the election, following the failure of the governing coalition, there was an increased need for information about the election amongst voters, and Wikipedia (as the only public-interest Very Large Online Platform classified by the EU) played an important role here.
But as a collaboratively edited online platform, Wikipedia represents a very particular information ecosystem; editing activity here also …
It’s a chilly Tuesday in Frankfurt, the Matildas just advanced to the final of the 2026 Women’s Asian Cup, and I’m at the opening of the Social Media Access Days at the German National Library, co-organised by my dear friend Katrin Weller from GESIS, the Leibniz Institute for the Social Sciences. The programme begins with a day in German, and opens with a paper by Pascal Siegers, who introduces the AVERA project. This emerged from a federal ministry project supporting the collection and sharing of data from research projects on racism and far-right extremism, and a first need it …