
'Big Data'

Exploring the Global Demographics of Twitter (AoIR 2014)

Association of Internet Researchers conference 2014

Exploring the Global Demographics of Twitter

Axel Bruns, Darryl Woodford, and Troy Sadkowsky

In spite of the substantial international success of Twitter as a social media platform, reliable information about its userbase is surprisingly difficult to come by. Other than the 232 million “monthly active users” reported in the company’s disclosures to the U.S. Securities and Exchange Commission ahead of its listing on the stock exchange, and some high-level breakdowns of account numbers across a number of key markets, most other assumptions about the Twitter userbase remain guesswork or are based on surveys with comparatively limited sample sizes. This paper takes a different approach to exploring the demographics of the platform: by undertaking a long-term crawl process across the entire Twitter user ID numberspace, we have gathered the publicly available details on every Twitter user account created between the platform’s emergence in 2006 and the conclusion of our crawl in 2013. By identifying the key patterns within this database of some 872 million accounts existing during our collection period, we are able to provide a much more comprehensive overview of Twitter’s footprint across the globe, its patterns of growth, and of typical user careers as listeners, followers, hubs and communicators than has been possible in any previous study.
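As a rough illustration of what a crawl across a numeric ID space involves, the sketch below splits a range of candidate user IDs into fixed-size batches that could each be sent off as one lookup request. The function name `id_batches` is hypothetical. The batch size of 100 is an assumption modelled on the per-request limit of Twitter's v1.1 `users/lookup` endpoint; the abstract does not state which endpoint or batch size the crawl actually used.

```python
from typing import Iterator, List


def id_batches(start: int, stop: int, size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive batches of candidate user IDs in [start, stop)."""
    for lower in range(start, stop, size):
        # The final batch may be shorter than `size`.
        yield list(range(lower, min(lower + size, stop)))


# Illustrative range only; the real numberspace is many orders of magnitude larger.
batches = list(id_batches(1, 251))
```

Each batch would then be looked up in turn, with IDs that return no account (never allocated, deleted, or suspended) simply skipped.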

‘Big Social Data’ in Context: Connecting Social Media Data and Other Sources (ACSPRI 2014)

Australian Consortium for Social and Political Research Incorporated (ACSPRI) Social Science Methodology Conference 2014

‘Big Social Data’ in Context: Connecting Social Media Data and Other Sources

Axel Bruns and Tim Highfield

The current “computational turn” (Berry, 2012) in media and communication studies is driven largely by the increased programmatic accessibility of large and very large sources of structured data on the online activities and content of Internet users – and here, especially of data from platforms such as Facebook and Twitter. Such ‘big social data’ are being used to examine the social media response to issues and events ranging from national elections (Larsson & Moe, 2014) through natural disasters (Bruns et al., 2012) to popular entertainment (Highfield et al., 2013), and in doing so tell a detailed and real-time story of how large populations of Internet users engage with the topics that concern them.

The study of user activities in specific social media spaces alone, however, necessarily isolates such activities from their wider context. Self-evidently, users’ activities do not remain limited to Facebook or Twitter alone: they cross over between these and other social media platforms, and intersect with other online and offline activities. To develop a more comprehensive picture of how citizens engage with and respond to current issues, even only in an online environment, it would therefore be necessary to connect and correlate the data sourced from social media platforms with data from a range of other sources which describe other aspects of the overall online experience.

This paper describes the approach and presents early outcomes from one such initiative to put ‘big social data’ in a wider context. As part of an ARC Future Fellowship project, we draw both on large, longitudinal Twitter and Facebook datasets which describe how Australian social media users engage with and share the news articles published by a range of leading Australian news and commentary sites, and on complementary, representative data from the market research company Experian Hitwise which track, through anonymised data collection at the ISP level across millions of households, what terms Australian Internet users are searching for, and how their attention is distributed across available Websites.

The combination of these sources provides an important new dimension beyond mere social media metrics themselves: in aggregate, our sources show the extent to which users’ searching and browsing activities around current events (which generally remain invisible to their peers) correlate with active news sharing and dissemination activities (which are designed to alert peers to an issue), and how such correlations differ across different themes and events, and different social media platforms. This constitutes an important further methodological and conceptual advance not only for the study of social media, but for media and communication studies as such.

Berry, D., ed. (2012). Understanding Digital Humanities. London: Palgrave Macmillan.

Bruns, A., Burgess, J., Crawford, K., & Shaw, F. (2012). #qldfloods and @QPSMedia: Crisis Communication on Twitter in the 2011 South East Queensland Floods. Brisbane: ARC Centre of Excellence for Creative Industries and Innovation. Retrieved from http://cci.edu.au/floodsreport.pdf.

Highfield, T., Harrington, S., & Bruns, A. (2013). Twitter as a Technology for Audiencing and Fandom: The #Eurovision Phenomenon. Information, Communication & Society, 16(3), 315-39. doi:10.1080/1369118X.2012.756053

Larsson, A.O., & Moe, H. (2014). Twitter in Politics and Elections: Insights from Scandinavia. In K. Weller, A. Bruns, J. Burgess, M. Mahrt, & C. Puschmann (Eds.), Twitter and Society (pp. 319–330). New York: Peter Lang.

Mapping a National Twittersphere: A ‘Big Data’ Analysis of Australian Twitter User Networks (ECREA 2014)

European Communication Conference (ECREA) 2014

Mapping a National Twittersphere: A ‘Big Data’ Analysis of Australian Twitter User Networks

Axel Bruns, Darryl Woodford, Troy Sadkowsky, and Tim Highfield

Twitter research to date has focussed mainly on the study of isolated events, as described for example by specific hashtags or keywords relating variously to elections (Larsson & Moe, 2012), natural disasters (Mendoza et al., 2010), entertainment (Highfield et al., 2013) and sporting events (Bruns et al., 2014), and other moments of heightened activity in the network. This limited focus is determined in part by the limitations placed on large-scale access to Twitter data by Twitter, Inc. itself. By contrast, only a handful of studies – usually by researchers associated with commercially funded research organisations or with Twitter, Inc. itself – have utilised the Twitter ‘firehose’ or similar more comprehensive sources of data to explore broader patterns of traffic flows or follower connections on the platform (e.g. Leetaru et al., 2013).

This project builds on a long-term, large-scale analysis of the global Twitter userbase which has managed to identify, within the more than 725 million globally registered Twitter accounts, some 2.5 million Australian accounts (by matching profile details such as location, description, and timezone against a set of relevant criteria). Further, we analysed the follower/followee connections of these 2.5 million accounts and from this developed a first comprehensive map of account relationships within the Australian Twittersphere. In-depth network analysis of this map reveals the existence of a range of clusters of especially tightly interconnected users, linked to each other by other accounts acting as bridges between the clusters. In turn, qualitative exploration of the leading accounts’ profiles in each cluster provides an indication of the various areas of thematic focus which have determined the formation of these clusters, and their association with other clusters in the same network vicinity. Further correlation with other relevant profile data (including the creation date for each account, its level of tweeting activity, and the date of the account’s last tweet) offers additional opportunities to trace the emergence and growing complexity of the Australian Twittersphere over time, from the earliest adopters of the platform to its most recent users, and to filter the overall network for the most active and most persistent users.
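The profile-matching step can be pictured with a minimal sketch along the following lines. Everything specific here is an assumption for illustration: the place names, the timezone labels, the field names (`location`, `description`, `time_zone`) and the function `looks_australian` are hypothetical and are not the criteria used in the study.

```python
import re

# Hypothetical criteria: an illustrative list of place names, not the study's own.
AU_PLACES = re.compile(
    r"\b(australia|sydney|melbourne|brisbane|perth|adelaide|hobart|darwin|canberra)\b",
    re.IGNORECASE,
)
# Hypothetical timezone labels as they might appear in a profile setting.
AU_TIMEZONES = {
    "Sydney", "Melbourne", "Brisbane", "Perth",
    "Adelaide", "Hobart", "Darwin", "Canberra",
}


def looks_australian(profile: dict) -> bool:
    """Return True if any profile field matches one of the illustrative criteria."""
    location = profile.get("location") or ""
    description = profile.get("description") or ""
    timezone = profile.get("time_zone") or ""
    return bool(
        AU_PLACES.search(location)
        or AU_PLACES.search(description)
        or timezone in AU_TIMEZONES
    )


profiles = [
    {"id": 1, "location": "Brisbane, QLD", "description": "", "time_zone": None},
    {"id": 2, "location": "London", "description": "Football fan", "time_zone": "London"},
    {"id": 3, "location": "", "description": "Proud Melbourne coffee snob", "time_zone": None},
]
australian_ids = [p["id"] for p in profiles if looks_australian(p)]
```

A simple matcher like this will produce false positives (Perth in Scotland, for instance), so in practice the criteria would need to be considerably more careful than this sketch suggests.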

This study represents the first ever comprehensive investigation of the development of a national Twittersphere as an entity in its own right. While the global nature of Twitter as a social media platform means that Australian accounts will also be connected with their counterparts in other countries, it is still to be expected that shared interests and identity lead the majority of connections between accounts to occur within the same national user population, and our analysis of these connection patterns provides an important indicator of the themes around which these connections crystallise, as well as of the longitudinal development of these clusters of interests.

Bruns, A., Weller, K., & Harrington, S. (2014). Twitter and Sports: Football Fandom in Emerging and Established Markets. In K. Weller, A. Bruns, J. Burgess, M. Mahrt, & C. Puschmann (Eds.), Twitter and Society (pp. 263–280). New York: Peter Lang.

Highfield, T., Harrington, S., & Bruns, A. (2013). Twitter as a Technology for Audiencing and Fandom: The #Eurovision Phenomenon. Information, Communication & Society, 16(3), 315–39. doi:10.1080/1369118X.2012.756053

Larsson, A.O., & Moe, H. (2012). Studying Political Microblogging: Twitter Users in the 2010 Swedish Election Campaign. New Media & Society, 14(5). doi:10.1177/1461444811422894

Leetaru, K., Wang, S., Cao, G., Padmanabhan, A., & Shook, E. (2013). Mapping the Global Twitter Heartbeat: The Geography of Twitter. First Monday, 18(5). doi:10.5210/fm.v18i5.4366

Mendoza, M., Poblete, B., & Castillo, C. (2010). Twitter under Crisis: Can We Trust What We RT? Paper presented at Social Media Analytics, KDD '10 Workshops, Washington, DC, 25 July 2010. Available from: http://research.yahoo.com/files/mendoza_poblete_castillo_2010_twitter_terremoto.pdf

Mapping Online Publics: New Methods for Twitter Research (Twitter Analytics Workshop 2014)

Twitter Analytics Workshop 2014

Mapping Online Publics: New Methods for Twitter Research

Axel Bruns, Jean Burgess, and Darryl Woodford

  • 12 June 2014 – Twitter Workshop: Analysing Network Data, Göttingen

The study of Twitter at large scale and in close to real time requires the development of new methodological approaches which are able to process, analyse, and visualise the ‘big social data’ which can be accessed through the Twitter API. The Mapping Online Publics project in the ARC Centre of Excellence for Creative Industries and Innovation (CCI) at Queensland University of Technology has developed a number of approaches to the study of short- and long-term Twitter publics, from analyses of the dynamics of ad hoc issue publics around natural disasters and political crises through the tracking of information flows and audience interests across mainstream and social media to the comprehensive mapping of the Australian Twittersphere. This presentation will outline the methodological approaches developed for this work, and reflect on the opportunities and challenges facing social media researchers.

 

Axel Bruns is an Australian Research Council Future Fellow and Associate Professor in the Creative Industries Faculty at Queensland University of Technology in Brisbane, Australia. He leads the QUT Social Media Research Group and is the author of Blogs, Wikipedia, Second Life and Beyond: From Production to Produsage (2008) and Gatewatching: Collaborative Online News Production (2005), and a co-editor of Twitter and Society (2014), A Companion to New Media Dynamics (2012) and Uses of Blogs (2006). His current work focusses on the study of user participation in social media spaces such as Twitter, especially in the context of acute events. His research blog is at http://snurb.info/, and he tweets at @snurb_dot_info. See http://mappingonlinepublics.net/ for more details on his research into social media.

Jean Burgess is Deputy Director of the ARC Centre of Excellence for Creative Industries & Innovation (CCI) and Associate Professor, Digital Media in the Creative Industries Faculty at Queensland University of Technology. She is involved in several research projects that apply computer-assisted methods to the analysis of large-scale social media data. Her books include YouTube: Online Video and Participatory Culture (Polity Press, 2009), Studying Mobile Media: Cultural Technologies, Mobile Communication, and the iPhone (Routledge, 2012) and A Companion to New Media Dynamics (Wiley-Blackwell, 2013). Over the past decade she has worked with a large number of government, industry and community-based organisations, focusing on the uses of social and co-creative media to increase participation, advocacy and engagement.

Darryl Woodford is a Research Fellow in the ARC Centre of Excellence for Creative Industries & Innovation (CCI) at Queensland University of Technology. He has a background in Engineering and Game Studies, including research on the agency of avatars in virtual environments. His current research includes work on social norms and regulation in the video game and gambling industries, and he is leading the development of new digital methods for measuring and evaluating television audience engagement using social media analytics.

The Global Demographics of Twitter

This final morning at AoIR 2014 opens with my paper with Darryl Woodford and Troy Sadkowsky which explores the global Twitter userbase. Our slides are below:

The Relevance of Devices in Divergent Tweeting Practices

The first presenters on this second day at AoIR 2014 are Bernhard Rieder and Carolin Gerlitz, whose interest is in using data from Twitter's 'spritzer' firehose, which delivers a random 1% of all current tweets. How can this be used to identify individual types of activity in relation to the wider platform ecology? In particular, for the purposes of this paper, what light does it shed on the use of different devices for tweeting?

The project collected some 32 million tweets from the spritzer firehose over the course of one week; the most prominent tools for tweeting were iPhone and Android devices. These device data may also be combined with the tweet contents themselves, to see which devices contribute especially strongly to specific hashtags, for example.
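To make the device-by-hashtag idea concrete, here is a minimal sketch of how such a tally could be computed. It assumes tweets in Twitter's v1.1 JSON format, where `source` holds the posting client as an HTML link and `entities.hashtags` lists the hashtags used. The helper names `client_name` and `devices_by_hashtag` and the sample tweets are made up for illustration; this is not the presenters' own tooling.

```python
import re
from collections import Counter, defaultdict

TAG = re.compile(r"<[^>]+>")


def client_name(source_html: str) -> str:
    """Strip the HTML link markup from a tweet's `source` field."""
    return TAG.sub("", source_html).strip()


def devices_by_hashtag(tweets):
    """Count posting clients per (lowercased) hashtag."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        client = client_name(tweet.get("source") or "")
        for tag in tweet.get("entities", {}).get("hashtags", []):
            counts[tag["text"].lower()][client] += 1
    return counts


# Made-up sample tweets, reduced to the two fields used here.
IPHONE = '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
ANDROID = '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>'
sample = [
    {"source": IPHONE, "entities": {"hashtags": [{"text": "Eurovision"}]}},
    {"source": ANDROID, "entities": {"hashtags": [{"text": "eurovision"}, {"text": "auspol"}]}},
    {"source": IPHONE, "entities": {"hashtags": []}},
]
counts = devices_by_hashtag(sample)
```

Comparing each hashtag's client distribution against the distribution across the whole sample would then show which devices are over-represented for that hashtag.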

Entering the Age of the Generative Algorithm

The final keynote at ASMC14 for today is by Bernhard Rieder from the Digital Methods Initiative, who stepped in at short notice for Tarleton Gillespie who could not be here. He begins by noting the role of algorithms in our experience of information and media; they select what information is considered most relevant to us, and are now a crucial part of our participation in public life. This raises a number of questions – and starting with search engines, such algorithms have been considered increasingly by researchers.

One way to approach algorithms is by considering the question of knowing: what style of reasoning do algorithms implement, and how do they connect this to forms of performativity? Bernhard has been one of the chief developers of the Digital Methods Initiative, and in this role works closely with as well as thinks critically through algorithms; this is also a process of opening the black box of the algorithms which shape our online experiences.

Conference Blogging Coming Up

I’m currently on the road again, as part of a trip which has already taken me through Hamburg (for a meeting with our research partners at the Hans-Bredow-Institut) and Göttingen (for the inaugural workshop of our new ATN-DAAD-funded research collaboration with colleagues at the Göttingen Digital Humanities Centre). The latter will focus especially on developing new methods for analysing and visualising social media networks, building on the considerable work we’ve already done in this area – and at the workshop last week we already made good progress towards a few new ideas for what we can do. With my colleagues Jean Burgess and Darryl Woodford I also participated in a public symposium at the GCDH, and I’ll make the slides and audio from our talk available here soon.

A Mid-Year Update of Recent Publications

I’ve continued to update my lists of publications and presentations over the past months, but I think it’s time to do another quick round-up of recent work before all the new projects start in earnest.

First off, my colleagues Darryl Woodford, Troy Sadkowsky and I have been making some good progress developing further methodological approaches to Twitter research – focussing this time especially on examining how accounts gain their followers (for some of the outcomes from that research, also see our coverage at Mapping Online Publics):

Axel Bruns, Darryl Woodford, and Troy Sadkowsky. “Towards a Methodology for Examining Twitter Follower Accession.” First Monday 19.4 (2014).

Axel Bruns and Darryl Woodford. “Identifying the Events That Connect Social Media Users: Charting Follower Accession on Twitter.” In SAGE Research Methods Cases. London: Sage, 2013.

More generally, I’ve also been involved in a couple of related publications led by Stefan Stieglitz from the University of Münster (one in English, one in German) which highlight the contribution which the emerging field of social media analytics will be able to make to the disciplines of business informatics and information systems:

Different Forms of Talk on Twitter

It’s been a little quiet again here, as I’ve taken February and March off on Long Service Leave. That’s all about to change, though, because two major new research projects are about to start now – more on these soon.

For the moment, here’s my first conference presentation for 2014, from the Media Talk symposium at Griffith University in Brisbane. I used this to work through the three layers of communication on Twitter which Hallvard Moe and I have identified in our chapter in Twitter and Society, and to provide some examples for how these layers operate in practice.

This is also the first time I’m trying Penxy as a tool for archiving my slides with audio recordings, since Slideshare has made the unfortunate decision to discontinue its slidecasts and remove any audio recordings from its site. Most of my past slidecasts are therefore also on the Penxy site now, and I’ll try to update the existing links to recorded presentations on this site when I get a chance.

Here’s my talk:

Layers of Communication: Forms of Talk on Twitter
