The next session at "Compromised Data" starts with Frauke Zeller, who begins by noting the multimodality of communication, including through social media: many texts use more than one semiotic mode, combining text, images, audio and video. How, though, can the existing methods for studying multimodality be transferred to online environments, and to research building on 'big data'?
Some such work begins with exploring the networks between users, and between texts, but this is not enough - how do we move from the macro to the meso and micro levels of communication? How do we move from the manifest to more latent content, especially where non-textual content is involved?
Social interactions are mainly manifested through language, and the words we use are also related to social status and other factors; but language also stretches to non-textual content. Investigations of image-based exchanges, for example, also need to take into account other forms of image communication - e.g. the comments and sub-images in Facebook threads.
A mixed-methods approach which looks at this, and includes the quantitative analysis of manifest content as well as the qualitative analysis of latent content, also requires some standardisation in order to be repeatable and rigorous. This might involve standard tools for keyword and frequency analysis, KWIC analysis and co-occurrence, and sentiment analysis, leading to the identification of textual and linguistic characteristics, and to text-image analysis, and from there feeding back into the first stages.
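To make the first of those standard tools concrete, here is a minimal sketch of keyword frequency and KWIC (keyword in context) analysis over a handful of posts. It is an illustration only, not the tooling used in the talk: the sample posts, the simple regex tokeniser, and the function names `keyword_frequencies` and `kwic` are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    # Naive tokeniser (an assumption): lowercase runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_frequencies(posts):
    # Count how often each token occurs across all posts.
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts

def kwic(posts, keyword, window=3):
    # Return (left context, keyword, right context) for every occurrence.
    lines = []
    for post in posts:
        tokens = tokenize(post)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, tok, right))
    return lines

# Hypothetical sample data, invented for illustration.
posts = [
    "Love this photo of the harbour at sunset",
    "That photo is blurry but the sunset is lovely",
]

freqs = keyword_frequencies(posts)
concordance = kwic(posts, "photo", window=2)
```

Frequent terms in `freqs` would suggest which keywords to look up, and the `concordance` lines show each keyword with its neighbouring words, which is the kind of material a qualitative reading could then start from.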
Frauke's experimental case study explored this by developing a methodological pipeline from quantitative to qualitative analysis - using word frequencies as a first indicator of relevant linguistic patterns, computing collocation analyses around word occurrences, using this as a code list for Atlas.ti, and from these identifying the images used in connection with specific codes.
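The middle of that pipeline, going from collocations around a keyword to a candidate code list, might be sketched roughly as follows. This is a hedged illustration rather than the method presented: the stopword list, the window size, the `min_count` threshold, the sample posts, and the function names are all assumptions, and the result is a plain Python list, not an Atlas.ti import format.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "of", "is", "and", "this", "that", "at", "but"}

def tokenize(text):
    # Naive tokeniser: lowercase runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def collocates(posts, keyword, window=3):
    # Count non-stopword tokens within `window` positions of each keyword hit.
    counts = Counter()
    for post in posts:
        tokens = tokenize(post)
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(t for t in context if t not in STOPWORDS)
    return counts

def candidate_codes(posts, keyword, window=3, min_count=2):
    # Keep collocates seen at least `min_count` times as candidate codes.
    counts = collocates(posts, keyword, window)
    return sorted(word for word, n in counts.items() if n >= min_count)

# Hypothetical sample data, invented for illustration.
posts = [
    "Great photo of the sunset over the harbour",
    "This sunset photo is stunning",
    "Another photo of the harbour",
]

codes = candidate_codes(posts, "photo")
```

A researcher would still need to review such a list by hand before using it for qualitative coding; raw co-occurrence counts are only a first indicator, in keeping with the mixed-methods framing above.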
But there are issues with such approaches, too: picture analysis in social networks needs to face problems with anonymity and quantity, with data access on specific platforms, and with multilingual and multimodal data. A mixed-methods approach including a critical assessment of these and other issues is necessary here.