
Facepager: A Tool for Gathering Facebook Data

The final panel at Digital Methods in Vienna is on Web monitoring, and starts with a paper by Jakob Jünger on Facepager, a tool for gathering data from Facebook. Such data could be scraped directly from the Web pages, or retrieved through the API; Facepager takes the second route, which has specific implications for the kinds of data that are available to it.

For example, popular Facebook pages show a general estimate of how many likes they've received (e.g. "700k"), while the API returns an exact number; this needs to be considered in any analysis which examines the actual user experience, of course.
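To make that gap concrete, here is a minimal Python sketch that converts an abbreviated on-page label such as "700k" into a number and compares it with an exact count. The function names, the supported suffixes and the figures are illustrative assumptions; none of this comes from Facepager or from Facebook's API.

```python
import re

# Multipliers for the abbreviations assumed here; real pages may use others.
_SUFFIXES = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_displayed_count(label):
    """Turn a rounded on-page label such as '700k' into an integer."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([kKmM]?)\s*", label)
    if match is None:
        raise ValueError(f"unrecognised count label: {label!r}")
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix.lower()])


def display_gap(exact_count, label):
    """Difference between an exact count and what the page shows users."""
    return exact_count - parse_displayed_count(label)


# Hypothetical figures: the exact count 703412 is shown on the page as "700k".
gap = display_gap(703412, "700k")
```

An analysis concerned with what users actually saw would work with the parsed label, while one concerned with precise activity levels would use the exact count.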

Facepager takes a list of Facebook page names, and then enables the user to gather all posts or likes as well as a number of other types of data; these can be exported as data files for further processing and analysis. And despite the name, Facepager appears to capture Twitter data as well, and has a generic API interface which can connect with a variety of other services, too. The tool is available under an open source licence.
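As a rough illustration of that export step, the following Python sketch writes a handful of gathered posts to CSV text for later analysis. The record layout (the "id", "message" and "likes" fields) and the function name are hypothetical; this is not Facepager's actual export format or code.

```python
import csv
import io

# Hypothetical column set; a real export would depend on the data gathered.
FIELDS = ["id", "message", "likes"]


def posts_to_csv(posts):
    """Serialise a list of post dictionaries to CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not listed in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for post in posts:
        writer.writerow(post)
    return buffer.getvalue()


# Made-up sample records, purely for illustration.
sample_posts = [
    {"id": "1", "message": "Hello, world", "likes": 12, "extra": "dropped"},
    {"id": "2", "message": "Second post", "likes": 3},
]
csv_text = posts_to_csv(sample_posts)
```

The resulting text can be saved to a file and opened in a spreadsheet or statistics package.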

As with any automated, API-based data gathering tool, there are some methodological issues here, of course. APIs are often far from transparent, data are thus not necessarily complete, and the indicators for activity on social media platforms which emerge from the data are therefore not necessarily always entirely valid.

On Facebook, some comments may not be delivered, for example, but the reasons for this are far from clear; it may have something to do with the Facebook network of the researcher doing the data gathering, which would be a significant limitation, of course. Similarly, user activity metrics may be affected by automated posting, but such automated posts could be difficult to distinguish from genuine activity. And API functionality might change and isn't always well-documented; the metrics an API returns might change overnight, for example, invalidating the research results.

There are also concerns over research ethics and user privacy, of course. The availability and apparent straightforwardness of data is problematic in its attractiveness for researchers.


The latest binary of our Facepager is available here

Btw: The next version (v 3.4) of Facepager is able to capture data from streaming APIs (like Twitter's)