
'Big Data'

Modelling Discrete Choice Problems

Post-lunch, the final day of Web Science 2016 continues with a keynote by Andrew Tomkins, whose focus is on the dynamics of choice in online environments. He begins by highlighting R. Duncan Luce's work, including his choice axiom, and then points to the subsequent work that has further extended the methods for analysing discrete choice. Today, the most powerful models are mathematically complex and computationally intractable, and also require sophisticated external representations of dependence.

From this work it has become clear that Luce's choice axiom holds only under relatively select conditions. Contextual data is of great importance here, and additional approaches to modelling the general behaviour of discrete choice are required. The Random Utility Model, for instance, assigns a random utility value to each available choice, and in an ideal world users would then select the item with maximum utility; but because of existing preferences, real-world users will deviate from such choices.
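
To make the random utility idea above a little more concrete, here is a minimal sketch in Python. It is not taken from the keynote: the item names, the base utility values, and the choice of Gumbel-distributed noise are all assumptions made for illustration. Under that particular noise assumption, picking the item with the highest noisy utility is known to produce multinomial logit choice probabilities, which is one setting in which Luce's choice axiom holds.

```python
import math
import random
from collections import Counter


def sample_choice(base_utilities, rng):
    """Simulate one decision: add random noise to each utility, pick the maximum."""
    noisy = {}
    for item, value in base_utilities.items():
        u = rng.random()
        while u == 0.0:  # rng.random() may return exactly 0.0; log(0) is undefined
            u = rng.random()
        # Standard Gumbel noise via inverse transform (an assumed noise model)
        noisy[item] = value - math.log(-math.log(u))
    return max(noisy, key=noisy.get)


def logit_probabilities(base_utilities):
    """Closed-form choice probabilities implied by Gumbel noise (softmax)."""
    total = sum(math.exp(v) for v in base_utilities.values())
    return {item: math.exp(v) / total for item, v in base_utilities.items()}


def simulate(base_utilities, n, seed=0):
    """Estimate choice frequencies from n simulated decisions."""
    rng = random.Random(seed)
    counts = Counter(sample_choice(base_utilities, rng) for _ in range(n))
    return {item: counts[item] / n for item in base_utilities}


if __name__ == "__main__":
    # Hypothetical items with made-up base utilities
    utilities = {"article": 1.0, "video": 0.0, "ad": -1.0}
    print("simulated:", simulate(utilities, 20000))
    print("logit:    ", logit_probabilities(utilities))
```

The simulated frequencies should land close to the closed-form values. Contextual effects of the kind mentioned above would show up as systematic departures from them, which this simple sketch does not attempt to model.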

How Facebook Uses Computational Processes to Police Its Ads

The final Web Science 2016 keynote for today is by Daniel Olmedilla, whose work at Facebook is to police the ads being posted on the site. Ads are the only part of Facebook where inherently unsolicited content is pushed to users, so the quality of those ads is crucial – users will want relevant and engaging content, while advertisers need to see a return on investment. Facebook itself must ensure that its business remains scalable and sustainable.

Key problem categories are legally prohibited content (e.g. ads for illegal drugs); shocking and scary content; sexually suggestive material; violent and confronting content; offensive before-and-after images; ads with inappropriate language; and images containing a large amount of text.

Predicting Twitter-Based Information Cascades

The next session at Web Science 2016 starts with a paper by Jure Leskovec on information cascades. Such cascades emerge as users of social media platforms (re)share content through their networks, and the prediction of such processes is traditionally very difficult.

Current Practices in Social Media Data Sharing between Researchers

The next WebSci 2016 presenters are Katharina Kinder-Kurlanda and Katrin Weller, who argue that it is necessary to address the digital divides in data accessibility in social media research. They interviewed a large number of social media researchers, and what emerges from this work is that much data sharing is already taking place, but under varying circumstances.

Identifying MOOC Learners on Social Media Platforms

We start the first paper session at WebSci 2016 with a paper by Guanliang Chen that examines learner engagement with Massive Open Online Courses (MOOCs). These generate a great deal of data about learner engagement during the MOOC itself, but there's very little information about learners before and after this experience. Can we use external social Web data to identify and profile these learners, in order to better customise the learning experience for them?

Web Science and Biases in Big Data

It's a cool morning in Germany, and I'm in Hannover for the opening of the 2016 Web Science conference, where later today my colleague Katrin Weller and I will present our paper calling for more efforts to preserve social media content as a first draft of the present. But we start with an opening keynote by Yahoo!'s Ricardo Baeza-Yates, on Data and Algorithmic Bias in the Web.

Ricardo begins by pointing out that all data have a built-in bias; additional bias is added in the data processing and interpretation. For instance, some researchers working with Twitter data then extrapolate across entire populations, although Twitter's demographics are not representative of the wider public. There are even biases in the process of measuring for bias.

Twitter in Germany: A Big Data Perspective (GAU 2015)

Georg-August-Universität Göttingen

Twitter in Germany: A Big Data Perspective

Axel Bruns

  • 3 June 2015 – Georg-August-Universität Göttingen
