
'Big Data'

The Opportunities and Challenges of 'Big Data' Research

At the end of an extended trip to a range of conferences and symposia I've made my way to Vienna, where I'm attending the DGPuK Digital Methods conference at the University of Vienna. The conference is in German, but I'll try to blog the presentations in English nonetheless - wish me luck... We begin with a keynote by Jürgen Pfeffer, addressing - not surprisingly - the question of 'big data' in communications research.

Jürgen begins by asking what's different about 'big data' research. In our field, we're using 'big data' on communication and interaction to work towards a real-time analysis of large-scale, dynamic sociocultural systems, which necessarily relies on computational approaches in particular. This draws on the data available from major social networks and other participative sites, but it aims to research not "the Internet" but society, by examining communication patterns on the Internet (and elsewhere).

Distinguishing Chain and Name Networks in Social Network Analysis

The final speaker in this "Compromised Data" session is Anatoliy Gruzd, whose interest is in the automated discovery and visualisation of communication networks from social media data. (He's also just launched a new journal in this field, Big Data and Society.) How can such networks be discovered and visualised, and how can we evaluate the sense of community which may exist in them?

Social network analysis enables us to investigate the connections between users in social networks. It reduces large quantities of messages to a smaller number of nodes exchanging communication; it can track developments over time; it can show the social dynamics of interaction around specific topics and events; and it can differentiate between different types of network formation in social interaction.
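To make the first of these points concrete, here is a minimal sketch of how a pile of messages might be reduced to nodes and weighted connections. It is not the method presented in the talk: the @mention rule, the function names `build_mention_network` and `nodes_of`, and the sample messages are all assumptions made purely for illustration.

```python
import re
from collections import Counter

# Assumption: users are referenced in message text as @username.
MENTION = re.compile(r"@(\w+)")


def build_mention_network(messages):
    """Collapse (sender, text) pairs into weighted directed edges.

    Each edge (sender, mentioned_user) counts how often the sender
    mentioned that user; self-mentions are ignored.
    """
    edges = Counter()
    for sender, text in messages:
        for target in MENTION.findall(text):
            if target.lower() != sender.lower():
                edges[(sender.lower(), target.lower())] += 1
    return edges


def nodes_of(edges):
    """Return the set of users that appear in at least one edge."""
    return {user for edge in edges for user in edge}


# Hypothetical sample data, invented for this example.
sample_messages = [
    ("alice", "@bob have you seen the new dataset?"),
    ("bob", "@alice yes, sharing it with @carol now"),
    ("alice", "@bob thanks!"),
    ("carol", "no mentions here"),
]

network = build_mention_network(sample_messages)
for (source, target), weight in sorted(network.items()):
    print(f"{source} -> {target} (weight {weight})")
```

Four messages collapse into three weighted edges between three users here. Tracking developments over time would mean building one such network per time slice and comparing them.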

Bottom-Up Measurements of Network Performance

The next session at "Compromised Data" starts with Fenwick McKelvey, who begins with a reference to the emergence of digitised methods for the study of the Web during the mid-2000s. This was around the time that the latest generation of social media emerged, enabling us to begin thinking about society through the study of the Internet. That in turn required the development of new research methods, by repurposing computer science methods for social science research.

In Toronto, Infoscape Labs developed a number of tools for the exploration of political discourse in Web 2.0, including the Blogometer. This is the emergence of platform studies, paying attention to the platform itself - but this also introduces challenges about how to study the platform, as the core object of research itself intervenes in its study, e.g. through the politics of APIs. This work also required compromises around data access and utilisation, and a growing bifurcation between scholarly and commercial research activities emerged.

Archiving Our Personal Digital Milieux

The final presenter in this morning session at "Compromised Data" is Yuk Hui, who will present a social media self-archiving project. He has worked for years on audiovisual archives, but much of the work in this field has focussed on institutional rather than personal archives, with the latter often concerned mainly with privacy issues.

But another set of problems relates to data management instead: we are working with multiple cloud-based systems, but rarely archive our digital objects effectively - archiving is not just about storing, but about preserving the context of digital objects as well: the digital milieu.

Social Media Data and Their Utopian Assumptions

The next speaker at "Compromised Data" is Ingrid Hoofd, whose interest is in how new technologies make certain types of representation possible or impossible. The neoliberalisation of universities, for example, leads to a quantification of research data which generates poor research. This is the violence of numbers: how do we assess the way new media technologies change the face of social sciences research, then?

Social media data mining methodology provides an allegory of the technological apparatuses that use it. This hinges on these technologies' propensity to speed up, and on the associated notion of change. There is a strong emphasis on objectivity, generating coverage of the conditions of the real that is both more accurate and more questionable. Social science via data mining tools is implicated in a push towards an idealised data-driven utopia.

Haunted Data in Cross-Media Controversies

The second day of "Compromised Data" starts with Lisa Blackman, who is tracking social media controversies and mapping information contagion. Can we use quantitative methods in non-positivist ways to understand these processes?

Lisa introduces the idea of haunted data, and suggests that we need to think about digital methods as performative: we need to move behind infographics when thinking about visualising data. Part of this is about priming: creating an experimental apparatus that makes people feel that their actions are self-directed, but actually generates such actions through the interventions of the apparatus. Such research is controversial because of its early ties to research into psychic phenomena, but it is useful for exploring information contagion and virality, especially in the context of social media controversies.

The Push towards Niche Geosocial Data

The final speaker on this first day of "Compromised Data" is Sidneyeve Matrix, who shifts our focus towards geosocial information as generated by smartphones and other mobile devices. Only 12% of US users surveyed by the Pew Centre posted Foursquare check-ins in 2013, for example, down from 18% in 2011 - but this may mask a greater take-up of other location-based services, not least the Frequent Locations functionality in iOS 7.

There is a continuing trend towards the consumerisation of geodata. Geosocial cultural arrangements are explored through the use of mobile communication patterns, but such analysis is notoriously difficult - not because of a lack of data, but because of the difficulties in assigning meaning to the geolocated information which is available from a variety of platforms.

Towards a More User-Centric Perspective in Utilising 'Big Data'

The next speaker at "Compromised Data" this afternoon is Asta Zelenkauskaite, who notes the increasing interweaving of social and mainstream media; based on the properties of 'big data' it therefore becomes important to explore how users engage with mass media and cross-media contexts. How relevant are 'big data' to the mass communication field?

Traditional media outlets have mainly focussed on a quasi-passive engagement with media content, while social media now offer two-way interaction by providing back channel functionality. Mass media content, user-generated content, and the digital imprints of user interactions are coming together to shape this cross-media environment.

'Big Data' and Government Decision-Making

The next speaker at "Compromised Data" is Joanna Redden, whose interest is in government uses of 'big data', especially in Canada. There's a great deal of hype surrounding 'big data' in government at the moment, which needs to be explored from a critical perspective; the data rush has been compared to the gold rush, with similarly utopian claims - here especially around the ability of 'big data' to support decision-making and democratic engagement, and the contribution 'big data'-enabled industries can make to the GDP.

But how are 'big data' actually being used in government contexts? New tools and techniques for the analysis of 'big data' are of course being used in government, but how these affect policy decisions remains unclear. Social media analysis is similarly being used for public policy and service delivery; sentiment analysis is used for some decisions around law enforcement and service delivery, but adoption to date is slow.
