8. Timeline of External Events can be constructed automatically

The starting point for this hypothesis is the idea that a news platform naturally reacts to external events.

Given the result of H2, it seems plausible that important events draw the community’s attention. Because the community filters and spreads news from other sources, global events can be expected to change the community’s interest structure.

The Group-Betweenness-Centrality (GBC) measures the level of networking. A high GBC indicates the existence of one very central actor, so that most of the communication has to pass through this actor (e.g. when the other members don’t know each other). Accordingly, we expect a low GBC in very democratic communities, in which nearly everyone communicates with everyone else. In TeCFlow, GBC is graphically represented as a trend line over time.
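The two extremes described above can be illustrated with a small sketch. It assumes that GBC corresponds to Freeman's betweenness centralization (the normalised gap between the most central actor and all others); how TeCFlow computes the value internally is not stated here, so the function names and the formula are illustrative assumptions rather than TeCFlow's implementation.

```python
from collections import deque

def betweenness(adj):
    """Normalised betweenness (0..1) for an unweighted, undirected graph.
    adj maps each node to the set of its neighbours (Brandes' algorithm)."""
    nodes = list(adj)
    bc = {v: 0.0 for v in nodes}
    for s in nodes:
        stack, queue = [], deque([s])
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}   # number of shortest paths from s
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in nodes}
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(nodes)
    scale = (n - 1) * (n - 2)           # each unordered pair is counted twice
    return {v: (bc[v] / scale if scale > 0 else 0.0) for v in nodes}

def group_betweenness_centralization(adj):
    """Assumed GBC: Freeman centralization of normalised betweenness."""
    scores = betweenness(adj)
    n = len(scores)
    if n < 3:
        return 0.0
    top = max(scores.values())
    return sum(top - c for c in scores.values()) / (n - 1)

def star(n):
    """One central actor (node 0) connected to n - 1 others."""
    adj = {0: set(range(1, n))}
    for i in range(1, n):
        adj[i] = {0}
    return adj

def complete(n):
    """Everyone communicates with everyone else."""
    return {i: set(range(n)) - {i} for i in range(n)}

gbc_star = group_betweenness_centralization(star(6))
gbc_complete = group_betweenness_centralization(complete(6))
```

Under this assumption the star network, in which all communication passes through one actor, reaches the maximum value, while the fully connected "democratic" network reaches the minimum.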

A global event would probably cause someone to write an article about it very quickly. This article would then be edited by many different people. In this way the event would change the GBC-Plot of Dataset2: it would cause the value to rise. This can be explained by the fact that one user, the writer of the article, should become very central, since he or she is established as the recipient of every edit.

Concentrating on those points in time when the GBC is higher than the normal level, and looking at the individual actors’ centrality at these points in time, enabled us to clearly identify the actor who caused a peak.
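The peak search can be sketched as follows. The text does not define the "normal level", so the threshold used here (mean plus `k` standard deviations) is an assumption, as are the function names and the sample values, which are invented purely for illustration.

```python
from statistics import mean, stdev

def find_peaks(gbc_series, k=2.0):
    """Return the (time, value) pairs whose GBC exceeds mean + k * stdev.
    The threshold rule is an assumption; the text only says 'higher than normal'."""
    values = [v for _, v in gbc_series]
    threshold = mean(values) + k * stdev(values)
    return [(t, v) for t, v in gbc_series if v > threshold]

def most_central_actor(centrality_by_time, t):
    """Return the actor with the highest individual centrality at time t."""
    actors = centrality_by_time[t]
    return max(actors, key=actors.get)

# Invented sample data: one GBC value per time window.
sample_series = [
    ("w1", 0.10), ("w2", 0.12), ("w3", 0.11), ("w4", 0.09),
    ("w5", 0.60), ("w6", 0.10), ("w7", 0.11), ("w8", 0.12),
]
# Invented per-actor centralities for the window that contains the peak.
sample_centrality = {"w5": {"actor_a": 0.70, "actor_b": 0.10, "actor_c": 0.05}}

peaks = find_peaks(sample_series)
peak_actors = {t: most_central_actor(sample_centrality, t) for t, _ in peaks}
```

The articles touched by each actor in `peak_actors` would then be inspected manually, as described in the following paragraphs.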

However, looking at the articles which led to this actor’s high centrality shows that it was not a single article but rather a collection of many “unimportant” articles that caused the actor’s high centrality.

Evidently, users do not become very central in this dataset by writing single articles that are edited by many other users. Instead, our approach identified those actors who are the global connectors at these points in time, perhaps because these actors had only recently entered the community and were therefore concerned with many different topics. Another possible explanation is that an actor removed many junk edits while searching for suspect articles.

Since this first approach did not work as expected, we tested the hypothesis again using Dataset1. Analogous to the previous procedure, there should be an important event whenever there is a minimum in the GBC-Plot of Dataset1. At such a point in time no actor has a high centrality, so there is a lot of communication between many different actors. But the results did not support the hypothesis either. Compared with Dataset2, there are only very small changes in the GBC. These changes could not be used for any further analysis; they seemed to be random.

Another approach to testing this hypothesis was possible on our database as well: using the Term Analysis, it should be feasible to identify dominant key terms, which should then lead us to the important topics the community is dealing with at particular points in time.

In doing so, we move away from analyzing Wikinews’ structure and instead run a content-based analysis by means of a Term Analysis. We then tried to find out whether different events can be recognized by measuring the centrality of different key terms.

In the Term Analysis every word of the whole communication is weighted and connected to other terms. In the static or dynamic view the words are shown according to their centrality and are connected to those terms that occurred in the same communication.
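A minimal sketch of such a term network is given below. The weighting scheme of the Term Analysis is not specified in the text, so this sketch assumes a simple one: two terms are linked whenever they occur in the same piece of communication, and a term's centrality is approximated by the sum of its link weights. The stop-word list, the function names and the sample headlines are invented for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Small illustrative stop-word list (an assumption, not TeCFlow's list).
STOPWORDS = {"the", "and", "for", "with", "from", "over", "after"}

def tokenize(text):
    """Lower-case the text and return its distinct content words, sorted."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if len(w) > 2 and w not in STOPWORDS})

def term_network(texts):
    """Count how often each pair of terms occurs in the same text."""
    weights = Counter()
    for text in texts:
        for a, b in combinations(tokenize(text), 2):
            weights[(a, b)] += 1
    return weights

def term_centrality(weights):
    """Weighted degree of each term: the sum of its co-occurrence counts."""
    strength = defaultdict(int)
    for (a, b), w in weights.items():
        strength[a] += w
        strength[b] += w
    return dict(strength)

def top_terms_by_period(dated_texts, top=3):
    """Group (period, text) pairs by period and rank terms within each period."""
    by_period = defaultdict(list)
    for period, text in dated_texts:
        by_period[period].append(text)
    result = {}
    for period in sorted(by_period):
        scores = term_centrality(term_network(by_period[period]))
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[period] = [term for term, _ in ranked[:top]]
    return result

# Invented sample headlines, for illustration only.
sample = ["Bush defends Iraq war", "Iraq war protests grow", "Bush visits Iraq"]
sample_weights = term_network(sample)
sample_scores = term_centrality(sample_weights)
```

Computing the ranking per time window, as in `top_terms_by_period`, corresponds to the dynamic view; computing it once over all texts corresponds to the static view.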

The first communication content we looked at consisted of the articles’ headlines. This made data processing faster, because the amount of data would otherwise have been huge. We also did not expect a loss of quality in the result, because headlines already select the important terms in a natural way.

By running the dynamic view we could essentially see an overview of the most important events of the last year without using any other variable. In the correct chronological order we could recognize the tsunami catastrophe, the election of the pope, the election in Great Britain, the terrorist attacks on the subway in London, Hurricane Katrina and the earthquake in Pakistan. The term “Bush” was quite central during the whole time and became strongly connected to the term “Iraq”, which also had a high centrality and was connected with the term “war” almost constantly.