Spatial Interaction Modelling for Access to Higher Education

This is the first in a series detailing the projects I have worked on at UCL in the last academic year.

My main project through the last year has been to test a hypothesis, developed by Professor AG Wilson, that the flows of students moving from school to university can be approximated by spatial interaction modelling (SIM). Put simply, SIM is a variant of the 300-odd-year-old Newton’s Law of Universal Gravitation, i.e. the attraction between two bodies depends on each of their masses and the distance between them. Replace the masses with the number of final-year pupils at a school and a university’s capacity, make the distance decay exponential instead of inverse-square, and you have the basics of the model. A similar theory has been applied to great effect by Joel Dearden of CASA, in his retail SIM, which has shown a “tipping point” explaining how supermarkets and out-of-town retail developments have become attractive to shoppers over the last forty years.
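As a rough illustration of the basic idea (a minimal sketch, not the code used in the project), here is an origin-constrained version in Python. The function name, the argument layout and the choice to make every school's flows add up to its pupil count are all assumptions made for the example.

```python
import math

def predict_flows(pupils, capacities, distances, beta):
    """Return flows[i][j]: predicted students going from school i to university j.

    pupils[i]       -- final-year pupils at school i (the "origin mass")
    capacities[j]   -- capacity of university j (the "destination mass")
    distances[i][j] -- distance between school i and university j
    beta            -- distance-decay parameter of exp(-beta * distance)

    Assumption for this sketch: the model is origin-constrained, so each
    school's flows are scaled to sum to its pupil count.
    """
    flows = []
    for i, n_pupils in enumerate(pupils):
        # Unnormalised "pull" of each university on school i.
        pulls = [cap * math.exp(-beta * distances[i][j])
                 for j, cap in enumerate(capacities)]
        total = sum(pulls)
        # Share the school's pupils out in proportion to the pull.
        flows.append([n_pupils * p / total for p in pulls])
    return flows
```

With two equally sized universities, a school sends more of its pupils to the nearer one, and the imbalance grows as beta increases.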

Of course, it’s a little more complicated than that, and even with the more complex model I’ve tested, a large number of simplifying assumptions have to be made.

Two main refinements are added to the model. (1) Universities have an “attractiveness factor” above and beyond their size; I have used one of the common university league tables to provide values for this factor. (2) The distance-decay is not uniform across all types of school students, but varies by their background. By splitting up the final-year school students by demographic, the variation in the distance-decay can be seen, and this is used to calibrate the model.
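To make the two refinements concrete, here is a small Python sketch under stated assumptions: the attractiveness score simply multiplies capacity (raised to a hypothetical exponent `alpha`), and each demographic group has its own beta. How the factors are actually combined in the project is not spelled out here, and all numbers below are made up for illustration.

```python
import math

def destination_shares(capacities, attractiveness, distances, beta, alpha=1.0):
    """Share of one group's students, from one school, going to each university.

    Assumed form for this sketch: weight = capacity * attractiveness**alpha
    * exp(-beta * distance), then normalised so the shares sum to 1.
    """
    weights = [cap * (att ** alpha) * math.exp(-beta * d)
               for cap, att, d in zip(capacities, attractiveness, distances)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative numbers only: a nearby university and a distant,
# higher-ranked one of the same size.
capacities = [1000, 1000]
attractiveness = [1.0, 2.0]
distances = [5.0, 150.0]

# A group with a low beta is barely put off by distance...
low_beta_shares = destination_shares(capacities, attractiveness, distances, beta=0.005)
# ...while a group with a high beta overwhelmingly stays local.
high_beta_shares = destination_shares(capacities, attractiveness, distances, beta=0.05)
```

The same school and the same pair of universities give very different splits depending on the group's beta, which is why the model is calibrated per demographic group.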


The seven OAC demographic supergroups are shown here – the horizontal scale is distance and is the same in each graph. (Only English-based school students going to English universities are considered in the study.) The vertical scale is the proportion of students, of that OAC supergroup, in each distance bucket. The actual number of students in each supergroup varies dramatically and this is not shown in the graphs.

The graphs show there is indeed considerable variation between supergroups in the “beta value” of the drop-off if approximated as exponential, and also in the “R-squared” fit to true exponential decay.

  1. Blue collar.
  2. City living – this group strongly favours London, Birmingham and Manchester, i.e. the same or other “big cities” in England, hence characteristic peaks appear at these distances – accentuated by the relatively small school-age population in this group.
  3. Countryside – this group rises before falling, as there is a minimum distance they need to travel to get to even their nearest university.
  4. Prospering suburbs – the lowest beta-value, in other words this group attaches the least importance to school-university distance.
  5. Constrained by circumstance – similar to the first group.
  6. Typical traits – the “average” group which encouragingly also has an average looking graph.
  7. Multi-cultural – more distance-sensitive than the others, hence the very steep drop-off. This suggests that people living in areas classified as multi-cultural more strongly prefer a university that is very close to home.
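The beta value and R-squared for a supergroup could be estimated along the following lines. This is a sketch using an ordinary least-squares fit on the logarithm of the proportions, which is one common way to fit an exponential; it is an assumption that the graphs were fitted this way, and the function name is made up for the example.

```python
import math

def fit_exponential_decay(distances, proportions):
    """Fit proportion ~ a * exp(-beta * distance); return (beta, r_squared).

    Works by fitting a straight line to log(proportion) against distance.
    Empty distance buckets (proportion 0) are skipped, since log(0) is
    undefined. R-squared is measured in log space.
    """
    pts = [(d, math.log(p)) for d, p in zip(distances, proportions) if p > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two non-empty distance buckets")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    if sxx == 0:
        raise ValueError("distances must not all be identical")
    beta = -sxy / sxx  # the slope of the line is -beta
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return beta, r_squared
```

A group whose graph rises before falling, or has peaks at big-city distances, would show up here as a lower R-squared rather than as a different beta.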

Prof Wilson’s theory also factors in the subject that the student is studying (not all universities offer all subjects, and most are strong in certain subjects and weak in others), and their attainment at school (i.e. they might really want to study Maths at Oxford, and be at a school very nearby, but if they get a D in Maths at A-Level, they aren’t going to be able to do that).

Universities also come in two types – “recruiting”, where there are more places than students genuinely intending to study there, and “selective”, where there are more prospective students than places. One interesting effect of the recent economic downturn is the massive increase in people applying for university in 2009-10 – UCL saw a 12% increase for undergraduate courses, for example. This has had the effect of making more universities selective.

In order to consider both types in the same model, it was necessary to develop what is known as a “partially constrained” SIM. The details are for a future article, but, put simply, the approach is iterative: students are assigned to a university, and then the weakest students at any over-capacity university are reassigned.
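The assign-then-reassign loop can be sketched in Python as below. This is a deliberately simplified illustration and not the partially constrained SIM itself: it assumes each student has a fixed, ranked list of universities and a single attainment score, and every name in it is hypothetical.

```python
def assign_students(preferences, attainment, capacity):
    """Iteratively place students, bumping the weakest from full universities.

    preferences -- {student: [university, ...]} in descending order of preference
                   (every university listed must also appear in `capacity`)
    attainment  -- {student: score}, higher is stronger
    capacity    -- {university: number of places}
    Returns {student: university, or None if no place could be found}.
    """
    next_choice = {s: 0 for s in preferences}
    placed = {u: [] for u in capacity}
    unplaced = list(preferences)
    while unplaced:
        # Step 1: send each unplaced student to their next-preferred university.
        for s in unplaced:
            prefs = preferences[s]
            if next_choice[s] < len(prefs):
                placed[prefs[next_choice[s]]].append(s)
            # Otherwise the student has run out of options and stays unplaced.
        unplaced = []
        # Step 2: for over-capacity universities, keep the strongest students
        # and push the weakest on to their next choice.
        for u, students in placed.items():
            if len(students) > capacity[u]:
                students.sort(key=lambda s: attainment[s], reverse=True)
                for s in students[capacity[u]:]:
                    next_choice[s] += 1
                    unplaced.append(s)
                del students[capacity[u]:]
    result = {s: None for s in preferences}
    for u, students in placed.items():
        for s in students:
            result[s] = u
    return result
```

In this sketch a “recruiting” university is simply one that never fills up, so it never bumps anyone, while a “selective” one ends up keeping only its strongest applicants.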

I built a GUI in Java – it’s the language I’m most comfortable with for “proper” programming – to quickly visualise the results and compare them with real-life flows. Here’s a bit of it:


This shows the perhaps not very surprising prediction that BIRM7s (multi-cultural school students living in Birmingham) are pretty likely to also go to university in Birmingham (AST = Aston, BCU = Birmingham City University, BIR = University of Birmingham), rather than elsewhere in the country.

When compared with the actual flows:
…the model under-predicts the flow to Birmingham City University, possibly because BCU’s desirability amongst this demographic group is mis-calibrated. Further-education students are also not present in the predicted model, but are included in the actual flows, so the two are not, as presented, normalised.

The model needs to be developed further before it can be presented formally. In particular, attainment is almost certainly a necessary component.


Quantum GIS 1.3

A new version of Quantum GIS, the free, open-source and user-friendly GIS, has been released today.

See the official blog for all the details, but the most exciting addition for me is the OpenStreetMap integration. Now, you can download data directly from the OSM servers into the application. OSM-like stylings are applied to the data to make it look a bit more like a map, and you can easily view all the tags and relations on each object. You can also edit the data directly in QGIS, as if it were normal GIS data, and then save it straight back to the server. This could potentially make it a good alternative to the Potlatch and JOSM editors that are currently used for the bulk of additions to OpenStreetMap. The integration isn’t perfect – on my first attempt at downloading data I got a server-side bounding box error, which should have been caught locally – but it’s pretty impressive nonetheless.

With QGIS’s excellent Python integration, it should be possible to write other plugins to, for example, create well-shaped building outlines with perfect right angles. I think you can do this in JOSM too, but I’ve always found JOSM a rather unfriendly application to use.

Here’s some OpenStreetMap data of my local area, in Quantum GIS, with a road I added highlighted in red:


WordPress and Yepsport

Yepsport is now on the WordPress CMS, having been on Movable Type since it was first started earlier this year. The key difference (other than the look) is that WordPress pages are generated on the fly for every view. WordPress also has a very active community right now, so expect lots of new and cool webloggy stuff to appear in the months ahead.

To celebrate, I’ll be posting a number of new articles in the next few days.

I think I’ve got everything just about right in the moving across process, but if you notice anything odd, or something that looks like it doesn’t work, please drop me an email at o.obrien (at)

My other weblog (Yablog) will be moved across shortly. NB. If you currently subscribe to Yepsport’s RSS, XML or Atom feeds, you’ll need to change the link you use. Follow the links at the bottom of the panel on the right.