China: ICSDM Conference


Last week I was in China for the 2nd IEEE International Conference on Spatial Data Mining (ICSDM), travelling with my lab’s director who was keynoting and giving a day’s teaching at the conference’s accompanying summer school. The conference was based in Fuzhou University, on the western edge of Fuzhou in Fujian Province, a city of five million people about 90 minutes north east of Hong Kong by plane, and an hour’s drive inland from the ocean. The city’s setting is rather dramatic – it is surrounded by forested mountains, and the greenery extends into the city too, where it helps absorb pollution.

The conference consisted of a number of keynote presentations given by domain experts on topics ranging from Big Models for Big Data, to social media geographic data mining and classification, to multi-source pollution monitoring and modelling. Interspersed with the keynotes were parallel tracks of project presentations, many (but not all) of which were given by Ph.D. candidates and other students at various universities elsewhere in China, as well as at Fuzhou itself. Remote sensing was a major theme of the conference, but other topics included modelling house prices based on demographic information and looking at movements of people using the Chinese equivalents of Facebook and Twitter.

As well as the conference itself there was time for a number of walks in the local forest parks and up some mountains – tough in the heat and humidity of southern China in the summer, but well worth it for the views. We also visited a number of temple buildings and other areas popular with tourists.

It was a well organised conference and was interesting to attend – not least to see that the sorts of research topics that we are familiar with here in quantitative geography at UCL are carried out in China too – but with a local perspective, based on the different datasets available and cultural habits. The keynote talks also added a good, rounded perspective on the spatial data mining field as it currently stands. All in all, an eye-opening week.


The week before last I was at GISRUK, the long-running annual academic conference for early (and not-so-early) career researchers in GI Science in the UK, Ireland and further afield. This year’s conference was in Leeds and attracted a record number of 250+ participants. I presented a poster at a meeting the day before the main conference, but otherwise had no talk to give at Leeds, which meant I was able to relax and focus on seeing the most interesting sessions. This year we had some great keynotes, including two visually impressive talks from Google’s Ed Parsons and MIT’s Sarah Williams, opening and closing the conference respectively. Outside of the keynotes, there were three main streams running simultaneously, but with the theme regularly changing after each group of talks, which made for plenty of room swapping.

Some of my favourite talks:

  • With a large UCL attendance, there were plenty of talks on geodemographics and socioeconomic mapping. One of my favourites was from Monsuru Adepeju of the UCL Crime, Policing and Citizenship project; the talk looked at a new way of detecting crime hotspots. The presentation included the below map showing a crime-weighted geodemographic map of London.
  • Staying with UCL and geodemographics, but going from crime to food, this classification, developed alongside a major food retailer in the UK, was presented by UCL’s Guy Lansley of the Consumer Data Research Centre. The work linked ethnic-weighted classifications with the popularity of certain food types, to simplify the task of providing particular food types popular with one or more major ethnic groups in the UK, as the country’s population demographic continues to change and move.
  • Staying with the geodemographic theme, Mark Birkin of Leeds gave an overview of geodemographics research in the era of big data, where ever increasing amounts of data allow ever more sophisticated analysis to be performed. The below image shows a slide from the presentation, with a very detailed geodemographic map – right down to postcode (typically 50 homes) level.
  • Away from geodemographics and to cartography: Jonny Huck of Lancaster presented the results of a study into creating a number of map types that encouraged good interaction with the map itself – the aim being to make maps for mobile devices that were engaging and encouraged people to look at the screen frequently when navigating, but were not so difficult to interpret that they were frustrating. Four styles of map, of the Lancaster University campus, were created from a Google Maps base, and participants were asked to navigate around the campus. The style that proved to be most effective in terms of engagement, while being fun to use, was the “PacMap”, a screenshot of which is shown below. Ironically Google released an unrelated PacMap for the whole world, as part of this year’s “April Fool” Google Maps hack.
  • Ed Manley of UCL showed some results of using mobile phone data to derive patterns of mobility through certain parts of an urban area, showing that different communities experience their cities in different ways and to different extents.
  • I didn’t see the presentation by Robin Lovelace (Leeds) on his work-in-progress on creating an R/Shiny-based tool for visualising current inter-neighbourhood cycling flows, and predicting future flows based on several scenarios, but I did get a demo of the tool, which is looking impressive, and will be a powerful way to communicate and interrogate a complex dataset.
  • Some other highlights included TransportOAC (Nick Bearman, Liverpool) which is a geodemographic map focused on how people move around the UK. The classification is relatively “noisy” spatially, and London’s unique transport system (compared with the rest of the UK) means it gets a number of classification groups to itself. I also enjoyed Nilufer Aslam’s talk about linking metro smartcard data (from TfL’s Oyster Card) with journey and usage information of bikeshare systems, to see whether they indeed formed a “last mile” option for commuters, and how availability patterns affected this.
  • I presented a poster, below, on DataShine, at the poster session for a meeting immediately prior to GISRUK. The poster summarises the three websites that are my principal output thus far, from the BODMAS project.

So, an excellent conference, full of interesting talks on geodemographics and various other GIS-related research. Thanks to the organisers for their hard work in staging a smoothly-run and successful three days.

Conference Review: GIScience 2014


I was in Vienna for most of last week, presenting at a satellite workshop of the GIScience conference, before joining the main event for the latter part of the week.

GIScience is a biennial international academic conference, alternating between America and Europe, at the intersection between geography, GIS and information visualisation. It is very much academically focused, which contrasts strongly with FOSS4G (GIS technology), WhereCamp (GIS community) and the AGI (GIS business).

My highlights for this year’s conference:

  • Jason Dykes (City) gave a keynote on balancing geovisualisation and information visualisation. As ever with presentations from City’s GICentre unit, the graphics were presented by way of various live demos and compellingly explained.
  • UCL Geography/CEGE had a strong presence at the conference and various of my colleagues gave presentations, a number focusing on using geolocated social media, both as a tool for research (e.g. population synthesis) and for research itself. There was also an unveiling of LOAC (UCL/Liverpool), a classification specially built for London; further details on this will follow soon, as LOAC is signed off and rolled out.
  • Another UCL Geography presentation on comparing surname clustering and genotype clustering in the UK.
  • An interesting presentation from TU Eindhoven on automatically creating and simplifying network diagrams using circular arcs.
  • Automatic Itinerary Reconstruction from Texts (LIUPPA/Pau) – showed how a fairly accurate map can be made simply by scanning prose, and otherwise unknown locations of places can be roughly determined by their textual relations to other, known places.

Many of the talks appear in an LNCS proceedings book.

Outside of the conference, much Wiener Schnitzel and Gelato was consumed, and historic old Vienna was explored. A highlight was conference drinks in the huge barrelled halls underneath the very grand city hall.


Mapping Geodemographic Classification Uncertainty


I’m presenting a short paper today at the Uncertainty Workshop at GIScience 2014 in Vienna, looking at cartographic methods of showing uncertainty in the new OAC 2011 geodemographic maps of the UK, using textures and hatching to show the quality of fit of areas to their defined “supergroup” geodemographic cluster.

Mapnik was used – its compositing operations allow the easy combination of textures and hues from the demographic data and uncertainty measure onto the same tile, suitable for displaying on a standard online map.

These are my presentation slides (if you get a bandwidth message, try refreshing this webpage, or download here):

You can download a PDF of the short paper from here.

A special version of the OAC map, which includes the special uncertainty layers that you can see in the paper/presentation, can temporarily be found here. Use the extra row of buttons at the top to toggle on/off uncertainty effects, and see the SED scores at the bottom left, as you mouse over areas. Note that this URL is a development one and so likely to change/break at some point soon.
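As a rough illustration of the kind of score involved, here is a minimal Python sketch. It assumes that SED stands for squared Euclidean distance between an area's standardised census variables and the centroid of its assigned cluster, which is a common quality-of-fit measure for k-means-style classifications; the post itself does not spell this out. The function names, the banding thresholds and the hatching labels are all hypothetical and are not taken from the paper or the map.

```python
def squared_euclidean_distance(values, centroid):
    """Squared Euclidean distance between an area's variables and a centroid."""
    if len(values) != len(centroid):
        raise ValueError("values and centroid must have the same length")
    return sum((v - c) ** 2 for v, c in zip(values, centroid))


def uncertainty_band(sed, thresholds=(0.5, 1.0)):
    """Map a distance score to a texture band (hypothetical thresholds)."""
    low, high = thresholds
    if sed < low:
        return "none"            # area fits its cluster well
    if sed < high:
        return "light hatching"  # moderate fit
    return "dense hatching"      # poor fit, high uncertainty


# Hypothetical standardised values for one area and its cluster centroid.
area = (0.5, 0.5)
centroid = (0.0, 0.0)
score = squared_euclidean_distance(area, centroid)
band = uncertainty_band(score)
```

A larger score means the area sits further from its cluster centre, so a stronger texture would be drawn over its hue on the tile.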

Background mapping is Crown Copyright and Database Right Ordnance Survey 2014, and the OAC data is derived from census data that is Crown Copyright the Office for National Statistics. Both are used under the terms of the Open Government Licence.

Workshop on Big Data and Urban Informatics


I attended the Big Data and Urban Informatics workshop at UIC, Chicago, in early August. My previous blog post outlined my presentation at the workshop. Here are my notes and thoughts on some of the other talks that I attended.


  • Above, the AURIN Workbench is a sophisticated platform for city authorities in Australia to output their data and visualise it through a portal. It’s an academic and commercial partnership. A key focus is data consolidation and normalisation, to allow for straightforward comparisons. This is a challenging aspect with so many data sources, from many authorities and places, and as such there is a large team of people involved with the ever-necessary data processing.
  • CASA scholar Greg Erhardt presented on Ph.D work, below, combining together public transport datasets for San Francisco, to build up a multi-modal database. One particular challenge is the incomplete adoption of smartcard-based travel. Here in London, we are lucky that the Oyster-card usage is so high, that it forms a near-complete picture of public transport usage in many parts of London. This is not the case in San Francisco and many other cities.
  • An update on UrbanSim (picture at bottom), one of many urban models, a reworked version of which now uses the Python Data Science Data Stack and is hosted on GitHub – both of these potentially opening the model up to discovery, use and adaptation by new groups. ActivitySim is launching as part of the project – this will be an open activity based travel demand model, to complement UrbanSim’s land-use focus.



I saw several other interesting talks and presentations, and it is striking just how much activity is going on in the urban informatics space, particularly with ever larger volumes of so-called “big data” becoming easily available to researchers and visualisers.

On City Dashboards and Data Stores

Earlier this month, I gave a short presentation at the Big Data and Urban Informatics Workshop, which took place at UIC (University of Illinois at Chicago). My presentation was an abridged version of a paper that I prepared for the workshop. In due course, I plan to publish the full paper, possibly as a CASA working paper or in another open form. The full paper had a number of authors, including Prof Batty and Steven Gray.

Below are the slides that formed the basis of my presentation. I left out contextual information and links in the slidedeck itself, so I’ve added these in after the embedded section:


Slide 3: MapQuest map showing CASA centrally located in London.
Slides 4-5: More information.
Slide 6: More information about my Bike Share Map, live version.
Slide 7: More information.
Slide 8: More information about CityDashboard, live version.
Slide 10: Live version of CityDashboard’s map view.
Slide 11: More information about the London Periodic Table, live version.
Slide 14: More information about Prism.
Slide 15: London and Paris datastores.
Slide 16: Chicago, Washington DC, Boston data portals.
Slide 17: The London Dashboard created by the Greater London Authority. Many of its panels update very infrequently.
Slide 18: Washington DC’s Open Government Dashboard and Green Dashboard, these are rather basic dashboards, the first being simply a graph and the second having just three categories.
Slide 19: The Amsterdam Dashboard created by WAAG, a non-profit computer society based in the heart of the city.
Slide 20: The Open Data City Census (US version/UK version) created by OKFN – a great idea to measure and compare cities by the breadth and quality of their open data offerings.
Slide 21: More information.
Slide 22: More information.
Slide 23: Pigeon Sim.
Slide 24: Link to iCity, More information on DataShine, live version.
Slide 25: More information on DataShine Travel to Work Flows, live version.

Some slides contain maps, which are generally based on OpenStreetMap (OSM) or Ordnance Survey Open Data datasets.

GISRUK 2014 (Part 3)

A final post where I highlight more of the best papers at GISRUK 2014 in Glasgow – see Part 1 and Part 2.

Geodemographic classification for Ireland

It was an early start on a Bank Holiday Good Friday, particularly as I was commuting from Edinburgh, but I made it in for the second half of Chris Brunsdon (NUI Maynooth)’s talk on creating a geodemographic classification for Ireland. The work applies many of the same techniques used to produce the 2001 (and indeed the forthcoming 2011) OAC for the UK, but with an Irish emphasis – where availability of septic tanks is an important census variable – using PAM rather than K-means clustering, and ensuring a fully reproducible approach. Six “broad clusters” were identified, as shown on the colourful dendrogram here. Chris also showed maps of the classification, both for Ireland in general and Dublin in particular.
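For readers unfamiliar with the difference: K-means represents each cluster by a computed mean, whereas PAM (Partitioning Around Medoids) represents it by an actual observation, the medoid. Below is a minimal Python sketch of the swap idea behind PAM, on made-up two-dimensional points. It is a simplification and not the talk's implementation: it starts from the first k points instead of running PAM's BUILD phase, and the names `total_cost` and `pam_swap` are my own.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam_swap(points, k):
    """Greedy medoid swapping until no single swap lowers the total cost."""
    medoids = list(range(k))  # simplified start: first k points (not PAM's BUILD)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing the i-th medoid with a non-medoid point.
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return sorted(medoids), best


# Made-up data: two obvious groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoid_indices, cost = pam_swap(points, k=2)
```

Because the cluster centres are real observations, each cluster can be described by pointing at a representative area, and the method is less sensitive to outliers than a mean.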


Mapping neighbourhoods from internet-derived data

Defining London’s “real” neighbourhoods is something of a preoccupation for me at the moment, with a number of related maps on the Mapping London blog, so this was a talk of great interest to me, given by Paul Brindley (Nottingham). There is a wide variety of potential sources of data to define neighbourhoods – social media, Flickr photograph tags, OpenStreetMap etc. Paul concentrated on postal addresses – specifically the “unnecessary” bit between the street and city, which people habitually still include. By mapping these extra pieces of information to postcodes, and also looking at their population and where their footprints overlapped, an informal geography of neighbourhoods, defined by people themselves, is revealed. The pre-press version of the paper is online.


Whitebox Geospatial Analysis Toolkit

Finally, a bit of a surprise, and a talk that would have fitted in well at FOSS4G in Nottingham last year: Whitebox GAT is a GIS package focused on complex raster (e.g. LIDAR) manipulation and analysis. The open-source project looks powerful and impressive, but has a low profile, particularly as it’s not part of OSGeo, so the lead author was at the conference and gave this talk as part of an effort to increase its profile.

After the conference concluded, I took the opportunity of the unusual weather for Glasgow (i.e. sunny, warm) for a wander around the city, going via the University campus, the new Riverside Museum (and tall ship), the “Squinty Bridge” and Glasgow Green.



Above: View of the Glasgow University campus from Dumbarton Bridge, and the Riverside Museum building.

GISRUK 2015 will be at Leeds University.

GISRUK 2014 (Part 2)

Following on from part one of my conference review, here are my favourite talks from the middle part of the conference.

Social media and spatial modelling – Tweets and museums

Robin Lovelace (Newcastle) won best paper at the end of the conference, for this talk on examining tweets “geofenced” around many local museums, to see from where these people travelled and what they had to say about the museum.

Agent Based Models and GIS for disaster zones

Sarah Wise (George Mason University & UCL) presented a chapter from her Ph.D on the use of GIS in immediate post-disaster zones, focusing on the Haiti earthquake. OpenStreetMappers quickly mapped Port-au-Prince and other badly damaged areas, using satellite and aerial imagery made available, and Sarah studied the resulting crowdsourced GI information. An agent-based model was then used, with the fractured road network, to model how survivors would move to locations where food and other aid was made available, the visualisation of the model output showing how well different areas, some with considerable damage to the road network, were served in the days after the disaster. Sarah won Best Paper on Spatial Analysis which is awarded by CASA based on submitted abstracts for the conference.


Visualising activity spaces of urban utility cyclists

This talk by Seraphim Alvanides (Northumbria) showed that utility cyclists – those aiming to get from A to B as efficiently as possible – are often poorly served by dedicated cycling infrastructure. Where a road route is shorter than a cycleway, more people than you might expect will take the former, and the talk showed some graphics of flows along roads and paths to demonstrate this.

Exposure to air pollution: the quantified self

Jonny Huck (Lancaster) gave one of my favourite presentations of the conference, and certainly one of the most visually impressive. It first explored personal sensors (for heart rate, breathing etc) and the internet of things (with small internet-connected devices), then combined the two in a device, based on Arduino, e-Health, Waspmote and Android, for monitoring exposure to pollution. The device combined breathing rate and air pollution levels for a walk around the campus at Lancaster University, where climbing up steep hills in the campus had as much impact as walking alongside major roads. It’s early stage research and I’m not sure the very intrusive breathing monitor is going to catch on, but it certainly points to a quantified future. At CASA, we have started to acquire and evaluate personal and environmental sensors, with FitBits and pollution sensors in the office, so a CASA-centric approach to this kind of research might not be too far off.
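To see why a steep hill can matter as much as a busy road, here is a minimal Python sketch of one common simplification: inhaled dose as concentration multiplied by ventilation rate multiplied by time. This formula is my assumption for illustration and is not necessarily how the device in the talk combined its readings; the function name and all of the numbers are made up.

```python
def inhaled_dose(concentration_ug_m3, ventilation_m3_min, minutes):
    """Approximate inhaled dose in micrograms (simplified model)."""
    return concentration_ug_m3 * ventilation_m3_min * minutes


# Hypothetical ten-minute segments of a campus walk.
# Steep hill: cleaner air, but heavy breathing.
hill = inhaled_dose(concentration_ug_m3=20, ventilation_m3_min=0.05, minutes=10)
# Major road: dirtier air, but gentle breathing on the flat.
road = inhaled_dose(concentration_ug_m3=50, ventilation_m3_min=0.02, minutes=10)
```

With these made-up figures the two segments come out at about the same dose, which is the sort of effect the talk described: measuring air quality alone would miss the contribution of exertion.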


The final session of the day took the form of a series of plenaries about interdisciplinary research. While some of these were interesting in their own right (particularly, an unexpected one on cellular biology!) I didn’t get as much out of them as I did from the paper sessions.

At the end of the second day of the conference most people went to the dinner – I didn’t have a ticket for this though, so headed back to central Glasgow with Addy, who’s written up his thoughts on the conference here on the EDINA Go-Geo blog. My comments on the final day will appear in the final part, tomorrow.

GISRUK 2014 (Part 1)

I was at the Geographic Information Systems Research United Kingdom (GISRUK) 2014 conference last week. GISRUK is the key GIS conference for early-career academic researchers in the UK and Ireland, and is hosted by a different university in the British Isles each year. The audience are mainly UK academics, with young researchers and professors in roughly equal attendance, along with some academics from abroad, including Malaysia, Nigeria and Canada. They are definitely more geo and less tech, the conference being relatively quiet on Twitter, especially compared to conferences such as State of the Map or Wherecamp EU.

This year the conference was hosted up at Glasgow University. Being tucked into the Easter break might have meant a reduced attendance on previous years. However, there were many good talks in the two parallel streams that ran through the three days of the conference – some 50 talks altogether, plus plenaries – and some talks were very popular, with attendees just about squeezing in to the venue.

In this post (and in the second and third parts, to follow) I’ve highlighted the talks that I found the most interesting. Of course, with two streams, there were inevitably interesting sessions which overlapped, and so I may have missed some of the best of all – in a couple of cases I ended up changing room half-way through a session. I’ve paraphrased the talk titles here.

Streets vs landmarks for text-based directions for pedestrians

This talk, given by William Mackaness from Edinburgh, was on an interesting study of how people get from A to B, given one of two kinds of text directions: landmark based (“turn left at the Bank of Scotland branch coming up on the left”) or street based (“continue on George Street, turn left onto Frederick Street in 500m”). The study monitored, with GPS and movement sensors, how well participants moved through the urban realm, with landmark based directions proving better. Of course, these are harder for automated systems, as street names are a more uniform and consistent storage type than landmarks.

Clustering landmark tags in urban images

This was probably my favourite talk of the whole conference. By the same team as the above, it was presented by Phil Bartie (St Andrews) and outlined algorithms used to detect buildings and other landmarks from photos, by looking at where people tag interesting features in set photographs, how they tag them, and then linking the tags and locations together to try and separate visually close (but distinct) features, and combine different elements of the same feature that are spatially far apart. The heatmap examples used in the talk were compelling.


Using social media data to assess crime hotspots

Nick Malleson’s (Leeds) talk looked at tackling the “daytime population” problem – crime statistics tend to exaggerate city centres, as these have a large daytime population but a low residential (i.e. census/official) population, and it is the residential population that crime counts are typically divided by to produce a crime rate. By looking at georeferenced social media activity as a proxy for daytime population, the city centre hotspots disappear and move into the most deprived suburbs – although this also needs to be controlled for a possible lower-than-average use of social media in such areas.
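The effect of the choice of denominator is easy to show with a small Python sketch. The two areas and all of their figures are invented for illustration, and the daytime population here simply stands in for whatever proxy (such as social media activity) is used to estimate it; none of this comes from the talk itself.

```python
def rate_per_1000(crimes, population):
    """Crimes per 1,000 people; None when there is no population to divide by."""
    if population <= 0:
        return None
    return 1000.0 * crimes / population


# Hypothetical figures for illustration only.
areas = {
    "city_centre": {"crimes": 500, "residents": 2000, "daytime": 50000},
    "suburb": {"crimes": 300, "residents": 10000, "daytime": 6000},
}


def compare_rates(areas):
    """Compute the crime rate for each area under both denominators."""
    result = {}
    for name, a in areas.items():
        result[name] = {
            "residential_rate": rate_per_1000(a["crimes"], a["residents"]),
            "daytime_rate": rate_per_1000(a["crimes"], a["daytime"]),
        }
    return result


rates = compare_rates(areas)
```

With these numbers the city centre has by far the higher rate when divided by residents, but the suburb has the higher rate when divided by the daytime population, so the apparent hotspot moves.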


Exploring links between coal-mining, deprivation and health

This known link was mapped out well by Paul Norman (Leeds), using some great maps of the relevant census data. The talk included a potted history of coal mines and their phased closures. The study was longitudinal – combining statistics over multiple censuses, with data on opening and closing of mines (mine opening dates often being hard to determine).


At the end of the first day of the conference, there was a reception at the opulent City Chambers in the centre of Glasgow, where I had the novelty of being served a glass of Irn Bru (Scotland’s other national drink, and tougher to find in London) by a waiter, in a room surrounded with marble and various paintings of former council leaders!


Part two to follow tomorrow. Addy Pope at EDINA Go-Geo has also reviewed the conference.



I’m now the proud owner of this lovely green glass globe paperweight – it was my prize from the web map category of the mapping competition at the FOSS4G conference last year, but it’s taken me this long to finally get my hands on it, as I was disappearing on a train before the end of the conference, and accidentally delegated receipt of the prize to a friend who I thought lived in London – actually he lives several hundred miles away. Anyway, thanks to the organisers for coming up with such an inspired prize, one that is useful and beautiful.