tommorris.org

Discussing software, the web, politics, sexuality and the unending supply of human stupidity.


Wikipedia stats: weekend traffic as a clue to determining 'seriousness' of articles?

In all the meta-discussion that goes on around Wikipedia and Wikipedia-style sites (Citizendium, for example), one of the frequently recurring themes is the role of pop culture in building an encyclopedia.[1]

Lots of people turn their noses up at Wikipedia because of the sheer amount of attention paid to Pokémon compared to perhaps more serious topics like politics, science, the arts and so on. There’s something very important here in terms of building policies regarding accuracy: to me, an inaccuracy in the article on, say, evolutionary biology matters in a way that an inaccuracy in the article on Britney Spears doesn’t. When designing policies and building community infrastructure around a project like Wikipedia, it seems important to ensure that articles on “serious” topics get cared for more than articles on popular topics, because they have more real-world significance. If Wikipedia has the wrong date for a Britney Spears single, the knock-on effects are less significant than if it has the wrong date for something like the Balfour Declaration. In an ideal world, there would be no inaccuracies and everything would be perfect. But this is not an ideal world.

I went on to Emw’s article stats page today to look up the numbers for various things I’ve been helping with: specifically the Hallmarks of Cancer article, which got written about on the BBC News website and in the Times of London (and, slightly less significantly, this week’s Signpost). I also nominated it for a DYK, so it’ll be interesting to see whether BBC News sends it more traffic than a DYK does.

But statistics are addictive and very useful.[2] So I started punching in all sorts of articles on topics I’m interested in, mostly in philosophy. I tried John Rawls and Thomas Aquinas and Philosophy of Science. I was only interested in recent stats, so I set it to only show me the last 63 days (from 1 February onwards).

One interesting thing I found recurring through all the philosophy-related pages I looked at was a drop in views at the weekend: like clockwork, there’d be a huge drop in page views on Saturdays, and then a slight nudge upwards on Sundays, then back to a steady number during the week. I kept on trying: Bertrand Russell, William of Ockham, Friedrich Nietzsche (extravagantly so!), Ludwig Wittgenstein, Jean-Paul Sartre and John Locke. A few didn’t fit: Saul Kripke and Gottlob Frege. But most of the articles I tried in philosophy fitted the pattern.

I did not find this effect when I searched for pop culture related topics: I tried a variety of video games and pop musicians/rock bands.

A tentative hypothesis lurks there: if Wikipedia is being used as a quick reference by university students, school students, teachers and lecturers during the week, that would explain why there seems to be a dip at the weekend. (And an even more ad hoc explanation: on Saturday, people are going out and having fun or watching TV; on Sunday, they are doing their homework, hence more page views on Sunday than Saturday.) But because movies and TV shows and pop music are fun, people still read about them on Wikipedia at the weekend, while academic articles are more of a work thing for people in universities.
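One rough way to put a number on that weekend dip is to divide an article's average weekend views by its average weekday views. The sketch below is only an illustration: the `weekend_ratio` function, the start date and the view counts are all made up for the example, and it assumes you have already got daily view counts from somewhere as a mapping of date to count.

```python
from datetime import date, timedelta
from statistics import mean


def weekend_ratio(daily_views):
    """Mean weekend views divided by mean weekday views.

    daily_views maps datetime.date -> view count. A ratio well below 1.0
    means the article is read less at the weekend than during the week.
    """
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    weekend = [v for d, v in daily_views.items() if d.weekday() >= 5]
    weekday = [v for d, v in daily_views.items() if d.weekday() < 5]
    if not weekend or not weekday:
        raise ValueError("need at least one weekend day and one weekday")
    return mean(weekend) / mean(weekday)


# Made-up numbers for illustration only, not real Wikipedia stats.
start = date(2024, 1, 1)  # an arbitrary Monday
views = [1000, 1000, 1000, 1000, 1000, 600, 800] * 2  # Mon..Sun, two weeks
sample = {start + timedelta(days=i): v for i, v in enumerate(views)}

print(round(weekend_ratio(sample), 2))  # 0.7
```

A ratio near 1.0 would suggest no weekend effect, though a single ratio says nothing about whether the difference is statistically meaningful.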

There’s a whole stack of things one needs to control for, and this is a long way from being confirmed. It’s still a very early hypothesis. For a start, I’ve only used a very limited sample: a few dozen articles, restricted to the intersection of philosophy articles and things which came into my mind this afternoon.

If further statistical analysis pans out, it may be possible to use such a correlation to help with directing academic contributors through WikiProjects, selection of featured articles and other featured content (DYKs etc.) and a whole lot of other stuff. It could also simply be useful to help sort articles within WikiProjects: for generalist or broad WikiProjects like United Kingdom or Europe or United States or Biology, being able to statistically sort “serious/academic” articles from more entertaining or frivolous/pop culture articles may be useful. I mean, imagine if a musicologist turns up at Wikipedia and wants to help with writing articles about music: being able to nudge them towards the academic topics may be a useful thing.
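If the correlation did hold up, sorting the articles in a WikiProject could be as simple as ranking them by how much their traffic dips at the weekend. This is a hypothetical sketch, not an existing tool: `rank_by_weekend_dip`, the placeholder titles and the averages are invented for illustration, and it assumes the weekday and weekend means have already been worked out and that no weekday mean is zero.

```python
def rank_by_weekend_dip(articles):
    """Sort article titles by weekend/weekday view ratio, biggest dip first.

    articles maps title -> (mean weekday views, mean weekend views).
    """
    def ratio(title):
        weekday_mean, weekend_mean = articles[title]
        return weekend_mean / weekday_mean

    # Ascending order: the lowest ratio (largest weekend dip) comes first.
    return sorted(articles, key=ratio)


# Placeholder titles and made-up averages, not real Wikipedia stats.
project_articles = {
    "Article A": (1000, 650),
    "Article B": (1000, 1050),
    "Article C": (1000, 900),
}

print(rank_by_weekend_dip(project_articles))
# ['Article A', 'Article C', 'Article B']
```

Under the hypothesis, the titles at the top of the list would be the candidates for “serious/academic” articles, and those at the bottom the more entertaining ones.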

Rather than keep this bottled up in my head, I thought I might as well share it so that others interested in statistical analysis of wikis like Wikipedia, and in the debate over academic contributions to sites like Wikipedia, can use this as a starting point for further analysis of the data.

  1. I heard a guy a while back talking about building a Wikipedia in one of the African languages, and he said that pop culture and sport brought people in and got them editing. They might start by editing not-so-important articles, but they would often progress to editing more serious stuff. So we should probably be careful not to slag off pop culture articles too much.

  2. Especially for outreach: being able to tell people working inside existing institutions like museums or scientific research laboratories or universities or whatever that topics they work on are viewed however many thousands of times a day makes the question of contributing to Wikipedia less of a matter of idealistic hippie sharing and more of a public information and education issue. You can write a scholarly monograph and maybe 50 of your peers will read it. Or you can fix an error on Wikipedia and a few hundred or thousand people—men and women on the Clapham omnibus!—will read it. On average, about 6,000 people a day read the Wikipedia article on John Locke. And about 7,700 people a day read the article on cancer.