The Limitations of Twitter as a Data Source

Mon, 28/05/2018 - 22:16 — Snurb

ICA 2018

The next speaker in this ICA 2018 session is Fabian Pfaffenberger, who also highlights the unreliability of Twitter data. The API’s 1% sample is extremely biased, and the search API is also unreliable in what it delivers. Historical data is especially incomplete: the search API returns only tweets posted in the past six to seven days, and it excludes deleted tweets as well as tweets from accounts that have since been deleted or suspended.

User information is also incomplete, and geodata is largely unreliable and available for only some 1% of all tweets. Further, genuine users are mixed with bots in the datasets, so better bot identification tools are sorely needed. And whatever we encounter may not be representative in any meaningful way: Twitter is already a niche medium, and Twitter users may be especially interested in engaging with leading users. Its userbase also appears to be stagnating at this stage.
