Categories
Conferences

Conference Review: GIScience 2014

IMG_0953c

I was in Vienna for most of last week, presenting at a satellite workshop of the GIScience conference, before joining the main event for the latter part of the week.

GIScience is a biennial international academic conference, alternating between America and Europe, that sits at the intersection of geography, GIS and information visualisation. It is very much academically focused, which contrasts strongly with FOSS4G (GIS technology), WhereCamp (GIS community) and the AGI (GIS business).

My highlights for this year’s conference:

  • Jason Dykes (City) gave a keynote on balancing geovisualisation and information visualisation. As ever with presentations from City’s GICentre unit, the graphics were presented by way of various live demos and compellingly explained.
  • UCL Geography/CEGE had a strong presence at the conference and several of my colleagues gave presentations, a number focusing on using geolocated social media, both as a tool for research (e.g. population synthesis) and as a subject of research in itself. There was also an unveiling of LOAC (UCL/Liverpool), a classification specially built for London. Further details on this will follow soon, as LOAC is signed off and rolled out.
  • Another UCL Geography presentation on comparing surname clustering and genotype clustering in the UK.
  • An interesting presentation from TU Eindhoven on automatically creating and simplifying network diagrams using circular arcs.
  • Automatic Itinerary Reconstruction from Texts (LIUPPA/Pau) – showed how a fairly accurate map can be made simply by scanning prose, and otherwise unknown locations of places can be roughly determined by their textual relations to other, known places.

Many of the talks appear in an LNCS proceedings book.

Outside of the conference, much Wiener Schnitzel and Gelato was consumed, and historic old Vienna was explored. A highlight was conference drinks in the huge barrelled halls underneath the very grand city hall.

IMG_0963ec

Categories
Conferences Geodemographics

Mapping Geodemographic Classification Uncertainty

oxford_sed

I’m presenting a short paper today at the Uncertainty Workshop at GIScience 2014 in Vienna, looking at cartographic methods of showing uncertainty in the new OAC 2011 geodemographic maps of the UK, using textures and hatching to show the quality of fit of areas to their defined “supergroup” geodemographic cluster.

Mapnik was used – its compositing operations allow the easy combination of textures and hues from the demographic data and uncertainty measure onto the same tile, suitable for displaying on a standard online map.
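
To illustrate the general idea of combining a hue and a texture on one tile, here is a minimal Python sketch of a “multiply” blend, which is one of several possible compositing operations. It does not use Mapnik or its API: the hatch pattern, function names and pixel values are all assumptions made for illustration, and the operation actually used for the maps is not stated here.

```python
def multiply(base, texture):
    """Multiply-blend two RGB pixels whose channels are in the range 0-255."""
    return tuple(b * t // 255 for b, t in zip(base, texture))


def hatch(x, y, spacing):
    """Hypothetical hatch texture: mid-grey on diagonal lines, white elsewhere."""
    return (128, 128, 128) if (x + y) % spacing == 0 else (255, 255, 255)


def render_tile(hue, spacing, size=8):
    """Fill a small square tile with one hue, darkened along the hatch lines.

    A smaller spacing gives denser hatching, which could stand for a poorer
    quality of fit (an assumption for this sketch).
    """
    return [
        [multiply(hue, hatch(x, y, spacing)) for x in range(size)]
        for y in range(size)
    ]


tile = render_tile((200, 100, 50), spacing=4)
```

White texture pixels leave the demographic hue untouched, while grey ones darken it, so both pieces of information survive on the same tile.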

These are my presentation slides (if you get a bandwidth message, try refreshing this webpage, or download here):

You can download a PDF of the short paper from here.

A special version of the OAC map, which includes the special uncertainty layers that you can see in the paper/presentation, can temporarily be found here. Use the extra row of buttons at the top to toggle on/off uncertainty effects, and see the SED scores at the bottom left, as you mouse over areas. Note that this URL is a development one and so likely to change/break at some point soon.

Background mapping is Crown Copyright and Database Right Ordnance Survey 2014, and the OAC data is derived from census data that is Crown Copyright the Office for National Statistics. Both are used under the terms of the Open Government Licence.

Categories
Data Graphics Geodemographics

A Result/Turnout Correlation for the Scottish Independence Referendum?

graph_corr2

A final update to my Scottish Independence Referendum Data Map – the circle borders now show the turnout percentage, with the highest (>90%) as a solid green, the lowest showing as red.

There is a weak (R^2 = 0.177) negative correlation between the Yes vote %, and the Turnout %, suggesting that the Yes campaign had more difficulty in getting its supporters to vote on the day. This may be due to the traditional tendency for older voters to turn out more than younger ones, and the polls suggesting that younger people were more likely to vote Yes. (The BBC has more on the demographics of the Scottish voters.)
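
For anyone wanting to reproduce this kind of figure, here is a minimal sketch of the calculation, assuming a simple linear fit, in which case R^2 is the square of the Pearson correlation coefficient. The four data points are invented purely for illustration and are not the referendum results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented example values (percentages), NOT the actual council results.
yes_pct = [35.0, 45.0, 55.0, 40.0]
turnout_pct = [90.0, 85.0, 80.0, 88.0]

r = pearson_r(yes_pct, turnout_pct)
r_squared = r ** 2  # share of variance explained by a straight-line fit
```

The sign of r gives the direction of the relationship (negative here), while R^2 alone only says how strong it is.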

You can see this weak correlation on the map, with green borders (high turnout %) on red circles (low Yes %), and some of the bluer areas (high Yes %) having red borders (low turnout %), although East Dunbartonshire is a noticeable exception.

map_corr

Categories
OpenLayers Technical

OpenLayers 3

ol-logo

As a learning exercise, I have been trying to “migrate” my recent #indyref map from OpenLayers 2.13.1 to the very new version 3.0.0 of the popular mapping API. It seemed a good time to learn this, because the OpenLayers website now shows v3 as the default version for people to download and use. Much of my output in the last few years has been maps based on OpenLayers, so I have considerable interest in the new version. There are some production sites using OpenLayers 3 already – for example, the official Swiss map.

I use the term “migrate” in inverted commas, because, really, OpenLayers 3 is pretty much a rewrite, with an altered object model, and accordingly requires coding a new map from scratch rather than just changing a few lines. It has so far taken me four times as long to do the conversion as it did to create the original map, although that is an inevitable consequence of learning as I go along.

I’ll update this blogpost as I discover workarounds.

Shortcomings in v3 that I have come across so far:

  • No Permalink control. This is unfortunate, particularly as “anchor” style permalinks, which update as you move around the map, are very useful for visualisations like DataShine where people share specific views and places, and I can inject extra parameters in. The site linked above suggests this is a feature that should not be in the core mapping library, but instead an additional library can query/construct necessary parameters. Perhaps, but I think layer/zoom/lat/lon parameters are such a key part of a map (as opposed to other interactive content) that they still deserve to be treated specially.
  • The online documentation, particularly the apidoc, is very sparse in places. As mentioned above, there is also a mismatch between the functionality suggested in the online tutorials and what is actually available. Another example is the use of “font” instead of “fontSize” and “fontStyle” for styles. This will improve I am sure, and there is at least one book available on OpenLayers 3, but it’s still a little frustrating at this stage.
  • Label centering on the circle vectors is not as good as with OL 2. This is possibly due to antialiasing of the circle itself. You can see the labels “jump” slightly when comparing the two versions – see links below.
  • Much, much slower on my iPhone 4 (and also on a friend’s Android phone). This is not what I was expecting! This is the “killer” problem for me which means I’ve kept my map on OL 2 for now. Wrapping my vector layer in an Image layer is supposed to speed things up, but causes the layer not to display on my iPhone. Disabling the potentially expensive mousemove listener did not make a difference. Adding a viewport meta tag with width=device-width speeded things up a lot so that it was almost as fast as OL 2 (without the meta tag) but then I would need to rewrite my own UI for mobile – something I don’t need to do with the OL 2 version!
  • No support (yet) for UTFGrids. These are a form of vector tiles, for metadata rather than geographic features, which I use on the DataShine project.
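
On the Permalink point above, the logic an add-on library would need is small. Here is a language-neutral sketch of it, written in Python rather than JavaScript and not using any OpenLayers API. The parameter names (zoom, lat, lon, layers), the “#key=value” layout and the function names are assumptions for illustration, as is the extra “metric” parameter in the example.

```python
from urllib.parse import urlencode, parse_qsl


def build_hash(zoom, lat, lon, layers, extra=None):
    """Build an anchor-style permalink fragment from the current map view."""
    params = {
        "zoom": zoom,
        "lat": round(lat, 5),
        "lon": round(lon, 5),
        "layers": layers,
    }
    params.update(extra or {})  # site-specific parameters injected by the page
    return "#" + urlencode(params)


def parse_hash(fragment):
    """Split a fragment back into the map view and any extra parameters."""
    params = dict(parse_qsl(fragment.lstrip("#")))
    view = {
        "zoom": int(params.pop("zoom")),
        "lat": float(params.pop("lat")),
        "lon": float(params.pop("lon")),
        "layers": params.pop("layers"),
    }
    return view, params  # whatever is left over is treated as extra


fragment = build_hash(7, 56.8, -4.2, "BT", {"metric": "turnout"})
view, extra = parse_hash(fragment)
```

In a browser, the build step would run whenever the map finishes moving, and the parse step once on page load.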

Things which I like about the new version:

  • Smooth vector resizing/repositioning when zooming in/out on a computer. (N.B. This is only when using a Vector layer and a Vector source, rather than Image layer with an ImageVector source that itself uses a Vector source.)
  • Attribution is handled better, it looks nicer.
  • No need to have a 100% width/height on the map div any more.
  • Resolution-specific styling. I’ve used this to hide the labels when zoomed out beyond a certain amount.
  • Can finally specify (in a straightforward fashion) a minimum zoom level.
  • Point coordinates and extents/bounds are specified in a much simpler way.
  • On a more general note, the new syntax is more complete and feels less “hacky”. The developers have taken the opportunity to do it “right” and remove inconsistencies, misplaced functionality and other quirks from the old version. For example, separating out visual UI controls and interaction management controls into two separate classes.
  • Drag-and-drop addition of KML/GeoJSON vector features. Example (use this file as a test).

Some gotchas, which got me for a bit, but I was able to solve:

  • You need to link in a new ol.css stylesheet, not just the JavaScript library, in order to get the default controls to display and position correctly.
  • Attribution information is attached to a source object now, not directly to the layer. A layer contains a source.
  • Attribute-based vector styling is a lot more complicated to specify. You need to create a function which you feed in to an attribute. The function has to return a style wrapped in an array – this may be the closure syntax in JavaScript that I have not come across before.
  • Hover/mouseover events are not handled directly by OpenLayers any more – but click events are, so the two event types require quite different setups.
  • Minor differences between the debug and regular versions of the library. The example I noticed is that the debug version allows ol.control.ScaleLineUnits.METRIC to be specified as an attribute for the ScaleLine control, but the non-debug version needs to use an explicit string “metric”.
  • No separate opacity attribute on individual styles – only on layers. I initially thought this meant I couldn’t have the circles with the fill at 80% opacity but the text at 100% opacity. In fact, opacity can be set on styles, but it has to be specified as part of the colour, in RGBA format (where A is the alpha, i.e. opacity, you want) rather than as a separate attribute. This is contrary to the tutorials on the website. Layer opacity can continue to be specified as a separate attribute.
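
For the last gotcha, the conversion from a hex colour plus a separate opacity into a single RGBA colour is easy to automate. This is a small sketch of that conversion in Python (the map itself is JavaScript, and nothing here is an OpenLayers call). The helper name and the example colours are my own, and it assumes the standard CSS-style “rgba(r, g, b, a)” string is what gets passed as the style colour.

```python
def rgba(hex_colour, opacity):
    """Fold a separate opacity (0.0-1.0) into a CSS-style rgba() colour string."""
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be between 0 and 1")
    hex_colour = hex_colour.lstrip("#")
    # Read the red, green and blue channels from the six hex digits.
    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({r}, {g}, {b}, {opacity})"


fill_colour = rgba("#ff0000", 0.8)  # circle fill at 80% opacity
text_colour = rgba("#000000", 1)    # label text fully opaque
```

Using one colour for the fill and another for the text gives the mixed opacities within a single layer.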

My OpenLayers 3 version of the #indyref map is here – compare with the OpenLayers 2 one. Note that, since first writing this blogpost, I’ve subsequently updated the OpenLayers 2 one to change the cartography there further.

Categories
Data Graphics

Scottish Independence Referendum: Data Map

indyref

Scotland’s population is heavily skewed towards the central belt (Glasgow/Edinburgh) which will affect likely reporting times of the independence referendum in the early hours of Friday 19 September, this being dependent both on the overall numbers of votes cast in each of the 32 council areas, and the time taken to get ballot boxes from the far corners of each area to the counting hall in each area. Helicopters will be used, weather permitting, in the Western Isles!

There is also likely to be significant variation in the result that each area declares – with regions next to England (so dependent on trade with it) and furthest away from it (so benefiting most from support) likely to vote strongly “No”, the major cities being difficult to call, and the rural areas and smaller, less affluent cities of the central belt much more likely to vote “Yes”. Note that unlike a constituency election, which is “first past the post” for each area, the referendum is a simple sum-total for everyone, so while it will be interesting hearing each individual result, ultimately we won’t know the outcome until almost every area has declared and the lead for one side becomes unassailable (areas will declare the size of the vote well before the result, which will make this possible).

A screenshot of a table, in a report “Scotland referendum: Looking through the mist” from the Credit Suisse Economics Research unit, was circulating on Twitter a couple of days ago:

Expected #indyref declaration times for every council area. Good find by @gerrybraiden. pic.twitter.com/ryzCtDbRCQ

— Scott Reid (@scottreid1980) September 12, 2014

It has estimates on all three of these metrics, so I’ve taken this, combined it with centroids of each of the council areas, and produced a map. Like many of my maps these days, coloured circles are the way I’m showing the data. Redder areas are more likely to vote no, and larger circles have a larger registered population. The numbers show the estimated declaration times. Looks like I’ll be up all night on Thursday. Mouse over a circle for more information.

View the live #indyref map here.

ps. I’ve subsequently got hold of a copy of the report concerned. To quote the methodology for determining the “Yes” rating, it’s

“derived from support for the Scottish National Party in the 2012 local elections. We… show a range from 0 (the lowest local vote [share] for SNP in 2012, excluding Orkney and Shetland where the vote was negligible) to 10 (highest local vote share for SNP).”

This implies the Orkney/Shetland results were not used in the 0-10 scaling, as their very low results for the SNP overly skewed the metric.
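
A rough sketch of how such a rating could be computed, assuming a straight linear rescaling between the lowest and highest remaining vote shares (the report quote does not say whether the scale is linear). The area names and vote shares below are invented for illustration.

```python
def scale_0_to_10(shares, excluded=()):
    """Linearly rescale vote shares to 0-10, ignoring the excluded areas."""
    kept = {area: s for area, s in shares.items() if area not in excluded}
    lowest, highest = min(kept.values()), max(kept.values())
    return {
        area: 10 * (share - lowest) / (highest - lowest)
        for area, share in kept.items()
    }


# Invented vote shares (%), NOT the actual 2012 local election results.
shares = {"Area A": 20.0, "Area B": 30.0, "Area C": 40.0, "Outlier": 1.0}
ratings = scale_0_to_10(shares, excluded={"Outlier"})
```

Leaving the outlier in would set the bottom of the scale at 1% rather than 20%, pushing every other area towards the top of the range, which is the skew described above.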

Categories
Cycling London

An East-West and North-South Cycle Superhighway for London?

eastwest

TfL is currently consulting on a couple of proposed “Cycle Superhighways” – an East-West route from Paddington to Tower Hill and a North-South route from St Pancras to Elephant & Castle. The consultations close on 12 October.

The Cycle Superhighways punch right through the centre of London; they are generally wide and properly segregated from traffic. The space is often being made available by reclaiming a traffic lane. The Mayor has referred to them as a “Crossrail for Bikes”, which is a fair description. The two routes meet at the Blackfriars junction.

The east-west route has some curious quirks – it takes a circuitous route around Hyde Park, whereas a new lane going right through the park, or the existing cycle track in the north-east, would surely work better. I expect this is thanks to a lack of cooperation from the Royal Parks authorities – they really should travel to Central Park in New York City to see how world city parks are done properly. It also has a strange section where it takes another tunnel alongside the Blackfriars Tunnel, even though the latter is having one lane closed anyway to keep the traffic lane count consistent. But overall it is a well planned route. Cyclists retain right-of-way over most of the side streets, they don’t have annoying chicanes around the “floating” bus stops, and the “early start” lights (which actually simply act to ensure a cyclist will never get a green light right through) are few and far between.

The north-south route is less completely planned – the core section from Farringdon to Elephant & Castle however is ready for the detailed consultation. A strange dogleg on the approach to Elephant & Castle is unfortunate – “Superhighway” cyclists are always going to be looking for the fastest route, which the route does not take here – but apart from that it is a good, and straight, route.

I very much hope these two routes get built in their planned form and the proposals don’t get watered down. But also I would like Transport for London to focus on improving the busiest existing infrastructure too. Today on my research blog I publish a map showing estimated routes for 12 million bikeshare trips earlier this year. It shows the “Route 0” cyclepath, south of and parallel to Euston Road, as being the busiest of all. There is a good section of segregated two-way cycleway, but it’s horribly cramped, with queues of cyclists at rush-hour often so long that they back onto the next junction. The roadway alongside is normally less busy and therefore often makes for a quicker cycle route. I would also like many more one-way streets to be made two-way for cycles only – the “Sauf Vélo” popular in France, but for London. This can be done on a “lightweight” basis with minimal signage change, so there should be many, many more streets allowing such flows. After all, we don’t make pedestrians walk in a single direction!

Categories
Bike Share London

From Putney to Poplar: 12 Million Journeys on the London Bikeshare

london_barclayscyclehire

The above graphic (click for full version) shows 12.4 million bicycle journeys taken on the Barclays Cycle Hire system in London over seven months, from 13 December 2013, when the south-west expansion to Putney and Hammersmith went live, until 19 July 2014 – the latest journey data available from Transport for London’s Open Data portal. It’s an update of a graphic I’ve made for journeys on previous phases of the system in London (& for NYC, Washington DC and Boston) – but this is the first time that data has been made available covering the current full extent of the system. From the most westerly docking station (Ravenscourt Park) to the most easterly (East India), the shortest route is over 18km.

As before, I’ve used Routino to calculate the “ideal” routes – avoiding the busiest highways and taking cycle paths where they are nearby and add little distance to the journey. Thickness of each segment corresponds to the estimated number of bikeshare bikes passing along that segment. The busiest segment of all this time is on Tavistock Place, a very popular cycle track just south of the Euston Road in Bloomsbury. My calculations estimate that 275,842 of the 12,432,810 journeys, for which there is “good” data, travelled eastwards along this segment.
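
The counting step can be sketched as follows, assuming each station pair has already been routed into a list of nodes (the routing itself, done by Routino, is not shown). The tiny network, the node names and the journey counts are invented for illustration, and segments are treated as directional so that eastbound and westbound flows are counted separately.

```python
from collections import Counter


def segment_counts(routes, journeys):
    """Sum the journeys passing along each directed segment of the network.

    routes:   {(start, end): [node, node, ...]} precomputed "ideal" routes
    journeys: {(start, end): number of journeys between that station pair}
    """
    counts = Counter()
    for pair, n in journeys.items():
        path = routes[pair]
        # Every consecutive pair of nodes on the route is one segment.
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += n
    return counts


# Invented example: two station pairs sharing the segment B -> C.
routes = {("A", "C"): ["A", "B", "C"], ("B", "C"): ["B", "C"]}
journeys = {("A", "C"): 3, ("B", "C"): 2}

counts = segment_counts(routes, journeys)
busiest_segment, busiest_count = counts.most_common(1)[0]
```

The line thickness on the map would then be drawn in proportion to each segment’s count.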

The road and path network data is from OpenStreetMap and it is a snapshot from this week. This means that Putney Bridge, which is currently closed, shows no cycles crossing it, whereas in fact it was open during the data collection period. There are a few other quirks – the closure of Upper Ground causes a big kink to appear just south of Blackfriars Bridge. The avoidance of busier routes probably doesn’t actually reflect reality – the map shows very little “Boris Bike” traffic along Euston Road or the Highway, whereas I bet there are a few brave souls who do take those routes.

My live map of the docking stations, which like the London Bikeshare itself has been going for over four years, is here.

[Update – A version of the map appears in a Telegraph article. N.B. The article got a little garbled between writing it and its publication, particularly about the distinction between stats for the bikeshare and for commuter cyclists in London.]

Categories
BODMAS Geodemographics

DataShine: 2011 OAC

oac2

The 2011 Area Classification for Output Areas, or 2011 OAC, is a geodemographic classification that was developed by Dr Chris Gale during his Ph.D at UCL Geography over the last few years, in close conjunction with the Office for National Statistics, who have endorsed it and adopted it as their official classification and who collected and provided the data behind the classification – namely the 2011 Census.

A geodemographic classification such as this takes the datasets and looks for clusters, where particular places have similar characteristics across many of the variables. It does this on a non-geographic basis, but spatial autocorrelation means that geographic groupings do typically appear – e.g. a particular part of an inner city will typically have more in common with another part of the inner city than with the suburbs. However, these areas will often also share much in common with other “inner city” parts of cities elsewhere. Names are then assigned, to attempt to succinctly describe the clusters.
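
As a toy illustration of “looking for clusters” on a non-geographic basis, here is a minimal k-means-style sketch in plain Python. k-means is used only as a common example of a clustering method; the two variables, the four areas and the starting centroids are invented, and this is not the actual 2011 OAC methodology or data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points in variable space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if empty)."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members))
            )
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Alternate the assign and update steps a fixed number of times."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return assign(points, centroids), centroids


# Invented areas described by two standardised census-style variables.
areas = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels, centroids = kmeans(areas, [(0.0, 0.0), (1.0, 1.0)])
```

Note that no coordinates are involved: two areas end up in the same cluster because their variables are similar, wherever they happen to be on the map.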

As part of the DataShine project, we have taken the classifications, and mapped them, using the DataShine style of restricting the classification colouring to built up areas and (when zoomed in) individual rows of houses. The map is the third DataShine output, following maps of individual census tables and also the new Travel to Work Flows table.

We’re just mapping the eight “Supergroups”, the top-level clusters. A pop-up shows the more detailed groups and subgroups, and you can find pen-portraits for all these classifications on the ONS website.

Click on the box for an individual supergroup, in the key at the top, to see a map showing just that supergroup on its own. For example, here are the “Cosmopolitan” dwellers of London:

oac3

Like 2011 OAC itself, the map covers all of the UK, including Scotland and Northern Ireland. For the latter, there is no Ordnance Survey Open Data which is how we created the building/urban outlines, so we have improvised with data from OpenStreetMap and NISRA (Northern Ireland Statistics).

The map is part of DataShine, an output of the BODMAS project, but is also in conjunction with the new Consumer Data Research Centre, an ESRC Data Investment which is being set up here at UCL and other institutions. As such, there is a CDRC version of the map.

As part of the BODMAS project we have also been studying the quality of fit of 2011 OAC for different parts of the UK, and techniques to visualise the uncertainty and quality of the classifications. We will be presenting these findings at the Uncertainty workshop at the GIScience conference in Vienna, later this month.

Direct link to the map.
See also the DataShine blog.

Categories
Conferences

Workshop on Big Data and Urban Informatics

IMG_0716

I attended the Big Data and Urban Informatics workshop at UIC Chicago in early August. My previous blog post outlined my presentation at the workshop. Here are my notes and thoughts on some of the other talks that I attended.

IMG_0724

  • Above, the AURIN Workbench is a sophisticated platform for city authorities in Australia to output their data and visualise it through a portal. It’s an academic and commercial partnership. A key focus is data consolidation and normalisation, to allow for straightforward comparisons. This is a challenging aspect with so many data sources, from many authorities and places, and as such there is a large team of people involved with the ever-necessary data processing.
  • CASA scholar Greg Erhardt presented on Ph.D work, below, combining public transport datasets for San Francisco, to build up a multi-modal database. One particular challenge is the incomplete adoption of smartcard-based travel. Here in London, we are lucky that Oyster-card usage is so high that it forms a near-complete picture of public transport usage in many parts of London. This is not the case in San Francisco and many other cities.
  • An update on UrbanSim (picture at bottom), one of many urban models, a reworked version of which now uses the Python Data Science Data Stack and is hosted on GitHub – both of these potentially opening the model up to discovery, use and adaptation by new groups. ActivitySim is launching as part of the project – this will be an open activity based travel demand model, to complement UrbanSim’s land-use focus.

IMG_0730

IMG_0739

I saw several other interesting talks and presentations, and it was striking to see just how much activity is going on in the urban informatics space, particularly with ever-increasing volumes of so-called “big data” becoming easily available to researchers and visualisers.