
'Big Data'

The Limitations of Twitter as a Data Source

The next speaker in this ICA 2018 session is Fabian Pfaffenberger, who also highlights the unreliability of Twitter data. The API’s 1% sample is extremely biased, and the search API is also unreliable in what it delivers; historical data is especially incomplete as the search API delivers only tweets posted in the past 6-7 days and will not include deleted tweets or tweets from subsequently deleted or suspended accounts.

The Unreliability of the Twitter API

I’ve now moved on to an ICA 2018 high-density session on computational methods, which starts with Rebekah Tromble. She begins by noting the uncertainty about what Twitter data actually represent, and her project was to explore these questions.

Understanding the Factors That Affect Facebook’s Algorithmic Profiling of Users

The first ICA 2018 session I’m seeing this Monday morning is on echo chambers, and starts with Kelley Cotter and Mel Medeiros, who outline the processes by which social media platforms generate algorithmic identities for their users. These identities determine what kind of content users encounter in their (algorithmically curated) newsfeed.

The Datafication Logics of Social Media Profile-Making

The final speaker in this ICA 2018 session is Lukasz Szulc, who shifts our attention to our digital profiles. Profile making is now ubiquitous in digital culture, especially of course in social networking sites and with the continuing move towards a platformisation of the Internet. Through our increased use of mobile devices, profiles have also become more pervasive.

The Materiality of Big Data Technologies

The next speaker in this ICA 2018 session is Zane Cooper, whose interest is in the material constitution of big data. Big data make use of earth and labour that do not easily track with their digital manifestations: they generate a long supply chain of physical hardware that supports the big data cloud. There is therefore a need to distinguish between what big data infrastructures are (their constitutional logic) and what they do (their operational logic).

A New Map of the Australian Twittersphere

Together with some of my colleagues from the QUT Digital Media Research Centre, I’ve just released a new, detailed analysis of the structure of the Australian Twittersphere. Covering some 3.72 million Australian Twitter accounts, the 167 million follower/followee connections between them, and the 118 million tweets posted by these accounts during the first quarter of 2017, the new article with Brenda Moon, Felix Münch, and Troy Sadkowsky, published in December 2017 in the open-access journal Social Media + Society, maps the structure of the best-connected core of the Australian Twittersphere network:

The Australian Twittersphere in 2016: Mapping the Follower/Followee Network

Twitter is now a key platform for public communication between a diverse range of participants, but the overall shape of the communication network it provides remains largely unknown. This article provides a detailed overview of the network structure of the Australian Twittersphere and identifies the thematic drivers of the key clusters within the network. We identify some 3.72 million Australian Twitter accounts and map the follower/followee connections between the 255,000 most connected accounts; we utilize community detection algorithms to identify the major clusters within this network and examine their account populations to identify their constitutive themes; we examine account creation dates and reconstruct a timeline for the Twitter adoption process among different communities; and we examine lifetime and recent tweeting patterns to determine the historically and currently most active clusters in the network. In combination, this offers the first rigorous and comprehensive study of the network structure of an entire national Twittersphere.
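To make the community detection step in the abstract above more concrete, here is a minimal sketch in plain Python. It is not the method used in the article: the abstract does not name an algorithm, so simple label propagation is swapped in purely for illustration. The account names, the `EXAMPLE_FOLLOWS` edge list, and the `detect_communities` function are all hypothetical, and ties are broken deterministically (real implementations usually break them at random) so that the result is reproducible.

```python
from collections import Counter, defaultdict

# Hypothetical toy follower/followee pairs (follower, followee): two tightly
# connected groups of three accounts, joined by a single follow a3 -> b1.
EXAMPLE_FOLLOWS = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "b1"),
]


def detect_communities(follow_edges, max_rounds=20):
    """Cluster accounts with a simple label propagation pass.

    Assumption: follow direction is ignored, so every follow counts as an
    undirected tie between two accounts.
    """
    neighbours = defaultdict(set)
    for follower, followee in follow_edges:
        neighbours[follower].add(followee)
        neighbours[followee].add(follower)

    # Every account starts out in its own community.
    labels = {account: account for account in neighbours}

    for _ in range(max_rounds):
        changed = False
        for account in sorted(neighbours):
            # Count how often each label occurs among the neighbours.
            counts = Counter(labels[n] for n in neighbours[account])
            top = max(counts.values())
            candidates = [label for label, n in counts.items() if n == top]
            if labels[account] in candidates:
                continue  # the current label is already a majority label
            # Deterministic tie-break for reproducibility (an assumption).
            labels[account] = max(candidates)
            changed = True
        if not changed:
            break

    clusters = defaultdict(set)
    for account, label in labels.items():
        clusters[label].add(account)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    for members in detect_communities(EXAMPLE_FOLLOWS):
        print(members)
```

On this toy network the sketch separates the two densely connected groups despite the single follow between them. A network of hundreds of thousands of accounts, as described in the abstract, would call for a dedicated graph library rather than this illustration.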

I published a preview of some of the study’s key findings in The Conversation in May 2017. Meanwhile, my paper at the Future of Journalism conference in Cardiff in September 2017 built on this new Twittersphere map to test for the existence of echo chambers and filter bubbles in Australian Twitter – and found little evidence to support the thesis.

Towards e-Privacy by Design in European Union Legislation

The second keynote at AoIR 2017 is by Marju Lauristin, who is both a professor at the University of Tartu and the rapporteur on e-privacy at the European Parliament, where she also represents Estonia as an MEP; indeed she has been named one of the most influential Estonian women in the world. This week the Parliament voted on new EU privacy regulations which Marju has been instrumental in developing.

Her focus here is on the impact of algorithms on deliberative democracy, and the short summary of the situation is that algorithms will severely affect democracy if the companies that utilise them remain unchecked, and that they will be prevented from doing so only if effective legislation is enacted to protect democratic processes.

Different Bots in the 2016 U.S. Presidential Election

The next speaker at AoIR 2017 is Olga Boichàk, who begins by highlighting the role of social media platforms in structuring specific forms of human sociality. But this also means that automated accounts – specifically, bots – can imitate and affect genuine human interactions in these spaces. What does this mean for online discussions in the context of the 2016 U.S. election campaign, then?

Donald Trump's Campaign and the Hybrid Media System

The first keynote at AoIR 2017 is by Andrew Chadwick, who explores what the 2016 U.S. presidential campaign means for our understanding of the hybrid media system. Political communication is in the middle of a chaotic transitional period, due in good part to the disruptions brought by newer, digital media; some older media have also been renewed by integrating the logics of newer media. This then represents a systemic perspective that examines forces while they are in flow.

The hybrid media system is built on the interactions of older and newer media logics in the reflexively connected field of media and politics. Actors in this field tap and steer information flows in ways that suit their goals, and enable or disable the agency of others, across various older and newer media settings. 'Hybrid' here shifts our conceptualisation from 'either/or' to 'not only, but also'; it foregrounds complexity, interdependence and transition. We pay more attention to boundaries, flux, and liminal spaces, where practices intermesh and co-evolve.

Testing the Validity of Twitter API Data

The next speaker in this AoIR 2017 session is Rebekah Tromble, whose focus is on the impact of digital data collection methods on scientific inference. When we collect data from social media APIs, how can we know whether we have 'good', valid data?
