Concept Maps for Selected Australian Political Blogs, Part I

(Cross-posted from Gatewatching.)

In a previous post, I mentioned our work in developing a new methodology for mapping link and concept networks in the Australian blogosphere. For a first test run of this project, we archived posts in some 300-400 Australian political blogs between the start of November 2007 (the last month of the federal election campaign) and the end of January 2008, and we've now begun an exploratory analysis of this corpus of data.

As noted in our discussion paper for this project, the first step in this analysis is to distinguish between different functional components of blogs and blog pages (something that does not necessarily happen in comparable studies, by the way). So, what I'm focussing on here are the blog posts themselves, which are of course the major discursive element of any blog - as part of our approach, we've separated these posts from all other content on the blog (headers, footers, blogrolls, sidebars, comments sections, etc.). While I'll mainly discuss content analysis here, this separation is just as important for link analysis, where blogroll, comment, and other links skew the data if we want to focus on examining the discursive network between blog posts.

So, building on this corpus of blog post data, here are some preliminary observations. What I've done in the first place is to run the concept mapping software Leximancer over the content gathered from a selection of key Australian blogs, both to fine-tune that process and to see if any discernible differences between individual blogs emerge. I'll present the results in two ways: the first simply lists the key terms for each blog in order of frequency (giving a quick indication of what they're frequently talking about), and the second maps these key terms in relation to one another - in other words, terms which frequently co-occur in close proximity in the text are located closer to one another on the map than terms which don't. (I'll post these maps later, in the second part of this post.)
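To make the idea of 'co-occurring in close proximity' a little more concrete, here is a rough Python sketch that counts how often two key terms appear within a few words of each other. This is not Leximancer's actual algorithm (which I'm not reproducing here); the function name, the `window` parameter, and its default of 5 are all assumptions made purely for illustration.

```python
from collections import Counter


def cooccurrence_counts(tokens, terms, window=5):
    """Count how often pairs of key terms occur within `window` tokens of each other.

    `tokens` is a list of lower-cased words, `terms` the set of key terms
    of interest. The window size is an illustrative assumption.
    """
    # Keep only the positions at which a key term occurs.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in terms]
    pairs = Counter()
    for a in range(len(positions)):
        i, first = positions[a]
        for b in range(a + 1, len(positions)):
            j, second = positions[b]
            if j - i > window:
                break  # positions are sorted, so later ones are even further away
            if first != second:
                # Sort the pair so (x, y) and (y, x) are counted together.
                pairs[tuple(sorted((first, second)))] += 1
    return pairs


tokens = "rudd wins election as howard loses seat after long campaign by rudd".split()
pairs = cooccurrence_counts(tokens, {"rudd", "howard", "election", "seat"}, window=3)
```

Pairs with higher counts would then be drawn closer together on a map; how exactly counts are turned into distances is a separate layout question that this sketch leaves open.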

Before I go on to the results, though, I need to say a little more about the Leximancer process. Leximancer begins by simply trawling through the body of text and counting how often each word occurs, but in doing so it already drops a number of meaningless 'kill words' (the, and, or, etc.). Beyond this, it allows for further manual editing: in this second stage, it's possible to combine terms which have the same meaning (e.g. ALP, Labor; Liberals, Libs, Liberal Party; Howard, John Howard) and to remove further terms which aren't interesting for the intended analysis (finally, apparently, significantly, think, show, say, etc.). Choices made during this process necessarily skew the eventual result, so I want to be quite clear about them.
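The two stages just described can be sketched in a few lines of Python. To be clear, this is only an illustration of the general idea, not Leximancer's implementation: the word lists below are short hypothetical samples, and the sketch works on single words only, so multi-word variants such as 'John Howard' or 'Liberal Party' would need extra handling.

```python
import re
from collections import Counter

# Hypothetical sample of 'kill words'; Leximancer's own list is not reproduced here.
KILL_WORDS = {"the", "and", "or", "a", "of", "to", "in", "is"}

# Hypothetical merge table: variant spellings mapped to one canonical concept.
MERGE = {
    "alp": "labor",
    "libs": "liberal",
    "liberals": "liberal",
}

# Hypothetical sample of terms removed by hand as uninteresting.
DROPPED = {"finally", "apparently", "significantly", "think", "show", "say"}


def concept_counts(text):
    """Count concepts after kill-word removal, merging, and manual drops."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        if word in KILL_WORDS:
            continue  # first stage: automatic kill words
        concept = MERGE.get(word, word)  # second stage: merge equivalent terms
        if concept in DROPPED:
            continue  # second stage: manually removed terms
        counts[concept] += 1
    return counts


sample = "The ALP and Labor think the Libs say the Liberals finally lost"
counts = concept_counts(sample)
```

Changing any of the three lists changes the resulting counts, which is exactly why these editorial choices need to be spelled out.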

Perhaps the most contentious omission I've made in the exploration discussed below is to drop the "Australia, Australian, Australians" cluster of terms from the analysis - my reasoning here is that a) in discussions of Australian politics these terms will occur so often and in so many different contexts (Australian government, Australian elections, Australian media, Australia's policy, etc.) as to be essentially meaningless, and b) it's impossible for Leximancer to distinguish between mentions of the Australian election/media/... and The Australian newspaper (not least because 'the' is itself a 'kill word', of course). Clearly this is not without its problems, though - and at a later stage I'll be keen to run a few comparative analyses to see whether leaving Australia(n/ns) in makes a significant difference to the results.

In addition to this cluster of terms, I've also dropped a number of the other terms originally identified by Leximancer in its first pass over the data. The full list of such dropped terms is different for each blog, but usually includes things like the names of months and days, linking words like 'finally', 'first', etc., and terms occurring in frequent phrases ('instance' from 'for instance'; 'writes', 'says', etc. from '[journalist/blogger] writes'; and so on). Eventually, I'm keen to fine-tune this list, and perhaps develop a list of standard 'kill words' that applies more generally to blog analysis.

OK, so - having said all of that, here are a few preliminary results. The blogs I've looked at for the purposes of this analysis are Andrew Landeryou's The Other Cheek, Larvatus Prodeo, and Club Troppo. Let's begin with a frequency analysis of the most common terms in each of them (listing the top 20 most commonly used terms for each blog):

The Other Cheek

 #   Concept      Absolute Count   Relative Count
 1   Labor        145              100%
 2   Liberal      145              100%
 3   OC           143              98.6%
 4   political    104              71.7%
 5   election     87               60%
 6   government   83               57.2%
 7   campaign     76               52.4%
 8   people       65               44.8%
 9   patriot      62               42.7%
10   seat         62               42.7%
11   Rudd         56               38.6%
12   candidate    56               38.6%
13   state        50               34.4%
14   Greens       49               33.7%
15   public       48               33.1%
16   federal      42               28.9%
17   left         42               28.9%
18   work         38               26.2%
19   Game         38               26.2%
20   Age          38               26.2%

Larvatus Prodeo

 #   Concept      Absolute Count   Relative Count
 1   Labor        127              100%
 2   Howard       102              80.3%
 3   government   102              80.3%
 4   election     91               71.6%
 5   party        80               62.9%
 6   Rudd         79               62.2%
 7   year         78               61.4%
 8   political    76               59.8%
 9   campaign     66               51.9%
10   policy       54               42.5%
11   issue        53               41.7%
12   Coalition    48               37.7%
13   work         42               33%
14   uranium      41               32.2%
15   public       41               32.2%
16   change       39               30.7%
17   seats        39               30.7%
18   blog         38               29.9%
19   candidates   37               29.1%
20   person       37               29.1%

Club Troppo

 #   Concept      Absolute Count   Relative Count
 1   national     263              100%
 2   government   241              91.6%
 3   years        228              86.6%
 4   economic     204              77.5%
 5   policy       202              76.8%
 6   people       194              73.7%
 7   Howard       154              58.5%
 8   election     133              50.5%
 9   Labor        124              47.1%
10   world        122              46.3%
11   countries    114              43.3%
12   politics     109              41.4%
13   problem      105              39.9%
14   public       91               34.6%
15   issue        91               34.6%
16   work         90               34.2%
17   Rudd         86               32.6%
18   change       84               31.9%
19   tax          84               31.9%
20   price        78               29.6%
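A quick note on the Relative Count column: it expresses each term's absolute count as a percentage of the count for the most frequent term in that blog. The figures above appear to be truncated to one decimal place rather than rounded (91 out of 127 is 71.65%, listed as 71.6%); that is an inference from the numbers themselves, not documented Leximancer behaviour. Under that assumption, a minimal Python sketch (with a hypothetical function name) would be:

```python
def relative_counts(absolute):
    """Express each count as a percentage of the largest count.

    Values are truncated (not rounded) to one decimal place, which is an
    assumption inferred from the published figures.
    """
    top = max(absolute.values())
    # Integer arithmetic gives tenths of a percent, truncated; then scale back.
    return {term: (1000 * n // top) / 10 for term, n in absolute.items()}


# First rows of the Larvatus Prodeo list above.
lp = relative_counts({"Labor": 127, "Howard": 102, "election": 91})
```

Because the percentages are relative to each blog's own top term, they are comparable within a list but not directly across blogs.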

Clearly there are some notable differences in the key terms used in each of these. The Other Cheek, for example, is big on what we might call the generic technical terms of politics - the names of major parties, 'politics/al', 'election', 'government', 'campaign', 'seat', 'candidate', etc. Owing to Landeryou's writing style, 'OC' itself also pops up as a very frequent term, as does 'patriot' as a term of endearment for those in either party with whose political views he agrees. (By the way, 'Game' at #19 is reflective of another stylistic quirk: Landeryou's frequent use of the phrase 'Game on.') On the other side of the political ledger, 'Greens' is also a relatively frequent term, but beyond this there is a notable absence of terms related to specific policies and initiatives - except perhaps for 'work' (which could relate to workplace relations), but then 'work' is also commonly used in non-issue-specific phrases ('this policy won't work', 'our agenda of work', etc.), so that its ranking among the top 20 terms here might be misleading. It's somewhat surprising that John Howard isn't mentioned very often (he appears only at #36 in the list, with 31 mentions) - but remember that more than two thirds of the corpus for this analysis stems from after the election; what this may mean, therefore, is that Landeryou is strongly focussed on current events and scandals, not so much on longer-term analysis.

This is one clear point of difference with Larvatus Prodeo; here, over the same circumelection period, Howard ranked at #2 and was mentioned 102 times. Otherwise, the broad trend is perhaps similar to OC; generic technical terms such as 'government', 'election', 'party', 'political', 'campaign' are similarly strong, but 'policy' and 'issue' also get a showing. One notable term here is 'uranium' at #14 (41 mentions), which clearly points to the presence of more specific topical debates in addition to the broader coverage of electioneering and political processes. Note also 'blog' as a key term - in part, this may reflect the fact that LP has been a keen follower of the pseph wars, and that questions related to the role of citizen journalism in Australia are not uncommon, but like 'work', its ranking might be inflated by more generic uses ('as X writes in their blog'). By comparison, 'blog' appears only further down the order for Landeryou, with 21 mentions.

The picture is considerably different for Club Troppo, where terms related to specific policy fields are notably more common. Here, even terms such as 'Howard', 'Labor', and 'Rudd' are outranked by 'economy/ic' and 'policy', and 'world', 'countries' (often in the context of 'developed/ing countries'), 'problem', 'issue', and 'tax' also make the top twenty list. Many more such terms - 'rates', 'recession', 'market', 'child/ren', 'international', 'community' - occur in the next twenty places in the list, unlike for The Other Cheek and, less strongly so, also unlike for Larvatus Prodeo. To take just one further example of these differences: 'intervention' (used mainly in relation to the Howard government's intervention in indigenous communities) ranks at #57 for Troppo, at #63 for LP, and doesn't appear in OC at all. Related to this, 'child/ren' ranks at #32 for Club Troppo, but doesn't appear in the list of key concepts for either of the other blogs. My reading of this is that Troppo focusses much more strongly on policy analysis than on political wonkery and insider gossip; for OC, the balance is reversed, while LP sits somewhere in the middle - but I'd be interested in how our readers would interpret this...

I'll post the concept maps based on these rankings in my next post, in a few days. The differences between the three blogs become even more obvious there.
