On City Dashboards and Data Stores

Earlier this month, I gave a short presentation at the Big Data and Urban Informatics Workshop, which took place at UIC (University of Illinois at Chicago). My presentation was an abridged version of a paper that I prepared for the workshop. In due course, I plan to publish the full paper, possibly as a CASA working paper or in another open form. The full paper had a number of authors, including Prof Batty and Steven Gray.

Below are the slides that formed the basis of my presentation. I left out contextual information and links in the slidedeck itself, so I’ve added these in after the embedded section:

Notes

Slide 3: MapQuest map showing CASA centrally located in London.
Slides 4-5: More information.
Slide 6: More information about my Bike Share Map, live version.
Slide 7: More information.
Slide 8: More information about CityDashboard, live version.
Slide 10: Live version of CityDashboard’s map view.
Slide 11: More information about the London Periodic Table, live version.
Slide 14: More information about Prism.
Slide 15: London and Paris datastores.
Slide 16: Chicago, Washington DC, Boston data portals.
Slide 17: The London Dashboard created by the Greater London Authority. Many of its panels update very infrequently.
Slide 18: Washington DC’s Open Government Dashboard and Green Dashboard. These are rather basic dashboards, the first being simply a graph and the second having just three categories.
Slide 19: The Amsterdam Dashboard created by WAAG, a non-profit computer society based in the heart of the city.
Slide 20: The Open Data City Census (US version/UK version) created by OKFN – a great idea to measure and compare cities by the breadth and quality of their open data offerings.
Slide 21: More information.
Slide 22: More information.
Slide 23: Pigeon Sim.
Slide 24: iCity and DataShine Travel to Work Flows.

Some slides contain maps, which are generally based on OpenStreetMap (OSM) or Ordnance Survey Open Data datasets.

Visit the new oobrien.com Shop
High quality lithographic prints of London data, designed by Oliver O'Brien
Electric Tube
London North/South

Borough Tops

Screen Shot 2014-08-05 at 14.49.16

The Diamond Geezer is, this month, climbing the highest tops in each one of London’s 33 boroughs.

To find the highest points, he’s used a number of websites which list the places. These derive the data from contour lines, perhaps supplemented with GPS or other measurements. However, another interesting – and new – data source for calculating this kind of metric is OS Terrain 50. Released as part of the Ordnance Survey Open Data packages, it is a gridded DEM (Digital Elevation Model). It’s right up to date, with a 50m x 50m horizontal resolution and a 10cm vertical resolution, and it should correct for buildings, so it shows the true ground height.

Looking at the DEM for Newham, I think it reveals a new highest point – not Wanstead Flats at 15m above sea level, as Diamond Geezer’s lists suggest, but Westfield Avenue, the new road that runs through the Olympic Park. Beside John Lewis, the road rises, to a highest point of 21.6m. It shows as purple in the graphic above. Nearby, the new “bowl” of the lower part of the Olympic Stadium can be seen, as well as the trench through which High Speed 1 runs, at Stratford International Station.

I can’t argue with the Chancery Lane/Holborn junction as being the highest ground-point in the City of London, at 21.9m. In Tower Hamlets, it’s more tricky. The old railyards between Shoreditch High Street and the lines into Liverpool Street look like they are at 21.7m. However, the ground here is not publicly accessible, and the DEM is quite noisy, with only part of the railyard showing this height.

I’m looking for a way to do this programmatically – calculating the highest DEM value for each borough. I’ve tried using QGIS’s Zonal Statistics plugin, with polygon shapefiles of London’s boroughs, but this only shows the mean value of the DEM for each borough.
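As a rough illustration of the calculation being described, here is a minimal sketch in plain Python. It is not the QGIS plugin's API: the grid layout, the `zonal_max` and `point_in_polygon` names, and the toy rectangles standing in for borough polygons are all assumptions made for the example. Each cell is assigned to a zone by testing its centre point, and the highest value per zone is kept.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_max(dem, origin_x, origin_y, cell_size, zones):
    """Return {zone name: (height, centre_x, centre_y)} for the highest cell.

    dem is a list of rows, row 0 being the southernmost (an assumption of
    this sketch); (origin_x, origin_y) is the south-west corner of the grid.
    """
    best = {}
    for row, heights in enumerate(dem):
        cy = origin_y + (row + 0.5) * cell_size
        for col, height in enumerate(heights):
            cx = origin_x + (col + 0.5) * cell_size
            for name, polygon in zones.items():
                if point_in_polygon(cx, cy, polygon):
                    if name not in best or height > best[name][0]:
                        best[name] = (height, cx, cy)
    return best


# Toy 4 x 4 grid of 50m cells and two made-up rectangular "boroughs".
dem = [
    [10.0, 12.0, 5.0, 6.0],
    [11.0, 21.6, 7.0, 8.0],
    [9.0, 13.0, 15.0, 4.0],
    [8.0, 10.0, 3.0, 2.0],
]
zones = {
    "West": [(0, 0), (100, 0), (100, 200), (0, 200)],
    "East": [(100, 0), (200, 0), (200, 200), (100, 200)],
}
highest = zonal_max(dem, 0, 0, 50, zones)
print(highest)
```

Because only cell centres are tested, a cell that straddles a boundary is counted for whichever zone contains its centre, which is the same limitation as measuring by hand.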

Here’s the list I’ve created by measuring – the main issue with my dataset is that the measurements are only at the centre of each 50m x 50m cell.

Borough | Height (m) | 50m cell centre, 10-digit grid ref | Description of approximate location | By edge?
Barking and Dagenham | 45.3 | TQ 48590 89948 | Industrial area just E of northern part of Whalebone Lane North. |
Barnet | 146.1 | TQ 21955 95622 | Just south of the water tower to the east of Rowley Lane, near Rowley Green. |
Bexley | 81 | TQ 45737 71256 | Langdon Shaw, southwest side. | Yes
Brent | 91.2 | TQ 20732 88877 | Junction of Wakemans Hill Avenue and The Grove. |
Bromley | 246.5 | TQ 43637 56487 | A233 – where Main Road changes name to Westerham Hill | Yes
Camden | 135.6 | TQ 26277 86225 | Lower Terrace, just off Heath Street in Hampstead. | Yes
City of London | 21.9 | TQ 30970 81612 | NW edge – junction of Holborn and Chancery Lane. |
Croydon | 175.7 | TQ 34330 61827 | Sanderstead Plantation, SW path crossroads. |
Ealing | 81.5 | TQ 16177 84398 | Horsenden Hill |
Enfield | 118.7 | TQ 25632 97674 | Just north of Camlet Way, Hadley Wood, opposite Calderwood Place. | Yes
Greenwich | 131.1 | TQ 43831 76583 | Southern end of Eaglesfield Recreation Ground on Shooters Hill. |
Hackney | 39.8 | TQ 32025 87574 | In Finsbury Park, beside Green Lanes, opposite No. 330. | Yes
Hammersmith and Fulham | 45.9 | TQ 22960 82756 | Harrow Road at north end of bridge over the railway line near Kensal Green station. | Yes
Haringey | 129 | TQ 28326 87479 | Ground by Highgate School Chapel, just north of Highgate High Street. |
Harrow | 153.4 | TQ 15288 93808 | Magpie Hall Road, between The Common and Alpine Walk. | Yes
Havering | 106 | TQ 51192 93055 | Churchyard of St John the Evangelist church (also Broxhill Road by the cricket pitch) |
Hillingdon | 130.5 | TQ 10585 91678 | Junction of South View Road and Potter Street Hill | Yes
Hounslow | 33.6 | TQ 11320 78815 | Western Road – bridge over the Grand Union Canal. |
Islington | 99.9 | TQ 28874 87217 | Highgate Hill and Hornsey Lane junction. | Yes
Kensington and Chelsea | 45.7 | TQ 23014 82728 | Kensal Green Cemetery, northern edge, beside the Harrow Road, above the railway line. | Yes
Kingston upon Thames | 91.3 | TQ 16644 60376 | Telegraph Hill |
Lambeth | 110.9 | TQ 33620 70729 | Westow Hill and Jasper Road junction. | Yes
Lewisham | 111.2 | TQ 33918 71779 | Sydenham Hill and Rock Hill junction. | Yes
Merton | 56 | TQ 23627 70823 | Lauriston Road and Wilberforce Way NW junction. |
Newham | 21.6 | TQ 37967 84530 | Westfield Avenue, outside John Lewis in Westfield Stratford City. |
Redbridge | 91.5 | TQ 47945 93784 | Cabin Hill |
Richmond upon Thames | 56 | TQ 18779 73065 | Bridleway/path junction just east of Queens Road, opposite the Pembroke Lodge car-park and to the NE of it. |
Southwark | 111.5 | TQ 33926 71686 | Sydenham Hill, between Chestnut Place and Bluebell Close. | Yes
Sutton | 146.4 | TQ 28383 59986 | Middle of rectangle of land south-east of Corrigan Avenue and south-west of Richland Avenue. |
Tower Hamlets | 21.7 | TQ 33720 82184 | Railway yards between Shoreditch High Street station and the railway lines leading to Liverpool St Station. |
Waltham Forest | 92.2 | TQ 38415 95010 | Pole Hill (north top) |
Wandsworth | 60.7 | TQ 22881 72780 | Big Alp, Wimbledon Common |
Westminster | 53 | TQ 26627 18386 | Finchley Road and Boundary Road junction. | Yes

DataShine Travel to Work Flows

datashinecommute

Today, the Office for National Statistics (ONS) have released the Travel to Work Flows based on the 2011 census. These are a giant origin-destination matrix of where people commute to work. There are various tables that have been released. I’ve chosen the Method of Travel to Work and visualised the flows, for England and Wales, on this interactive map. The map uses OpenLayers, with an OpenStreetMap background for context. Because we are showing the flows and places (MSOA population-weighted centroids) as vectors, a reasonably powerful computer with a large screen and a modern web browser is needed to view the map. The latest versions of Firefox, Safari or Chrome should be OK. Your mobile phone will likely not be so happy.

Blue lines represent flows coming in to the selected place, where people work. Red lines show flows out from the selected place, to work elsewhere.
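The split between inbound and outbound lines can be sketched from an origin-destination list as follows. This is only an illustration: the record layout, the place codes "A", "B" and "C", the counts and the `flows_for_place` helper are made up for the example and are not taken from the ONS table or the map's code.

```python
from collections import defaultdict

# Hypothetical flow records: (home place, workplace, number of commuters).
flows = [
    ("A", "B", 120),
    ("A", "C", 30),
    ("B", "A", 45),
    ("C", "A", 10),
    ("C", "B", 80),
]


def flows_for_place(flows, place):
    """Split the flows touching `place` into inbound and outbound totals."""
    inbound = defaultdict(int)   # people who work in `place` (blue lines)
    outbound = defaultdict(int)  # people who live in `place` (red lines)
    for home, work, count in flows:
        if work == place:
            inbound[home] += count
        if home == place:
            outbound[work] += count
    return dict(inbound), dict(outbound)


inbound, outbound = flows_for_place(flows, "A")
print("Inbound to A:", inbound)
print("Outbound from A:", outbound)
```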

The map is part of the DataShine platform, an output of the BODMAS project led by Dr Cheshire, where we take big, open datasets and analyse them. The data – both the travel to work flows and the population-weighted MSOA centroids – come from the ONS, table WU03EW.

View the interactive map here.

lichfieldcommute

London Words

Screen Shot 2014-07-21 at 15.46.02

Above is a Wordle of the messages displayed on the big dot-matrix displays (aka variable message signs) that sit beside major roads in London, over the last couple of months. The larger the word, the more often it is shown on the screens.

The data comes from Transport for London via their Open Data Users platform, through CityDashboard’s API. We now store some of the data behind CityDashboard, for London and some other cities, for future analysis into key words and numbers for urban informatics.

Below, as another Wordle, are the top words used in tweets from certain London-centric Twitter accounts – those from London-focused newspapers and media organisations, tourism organisations and key London commentators. Common English words (e.g. to, and) are removed. I’ve also removed “London”, “RT” and “amp”.

Screen Shot 2014-07-21 at 15.56.57

Some common words include: police, tickets, City, crash, Boris, Thames, Park, Festival, Bridge, bus, Kids.
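The counting behind a word cloud like this can be sketched in a few lines. This is not Wordle's own method or the Big Data Toolkit's code: the stop-word list, the sample messages and the `word_counts` helper are assumptions made for the example, and words are lower-cased here for simplicity.

```python
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"to", "and", "the", "a", "of", "in", "on", "for", "is", "at"}
# Extra terms removed by hand, as described above.
EXTRA_REMOVED = {"london", "rt", "amp"}
REMOVED = STOP_WORDS | EXTRA_REMOVED


def word_counts(messages):
    """Count words across messages, skipping mentions, links and removed words."""
    counts = Counter()
    for message in messages:
        for token in message.split():
            if token.startswith(("@", "http")):
                continue
            word = token.strip(".,:;!?&\"'()#").lower()
            if word and word not in REMOVED:
                counts[word] += 1
    return counts


# Made-up messages, purely for illustration.
messages = [
    "RT @example: Police close Tower Bridge after crash",
    "Tickets for the Thames Festival on sale in London",
    "Bus diverted at Tower Bridge &amp; police on scene",
]
counts = word_counts(messages)
print(counts.most_common(5))
```

The size of each word in the graphic would then be scaled by its count.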

Finally, here are the notes that OpenStreetMap editors use when they commit changes to the open, user-created map of the world, for the London area:

Screen Shot 2014-07-21 at 16.10.50

Transport and buildings remain a major focus of the voluntary work that contributors are carrying out to complete and maintain London’s map.

There is no significance to the colours used in the graphics above. Wordle is a quick-and-dirty way to visualise data like this; we are looking at more sophisticated, and “fairer”, methods as part of ongoing research.

This work is preparatory work for the Big Data and Urban Informatics workshop in Chicago later this summer.

Thanks to Steve and the Big Data Toolkit, which was used in the collection of the Twitter data for CityDashboard.

Introducing DataShine

kingston_5beds

This week, James and I launch DataShine: Census. This is part of the ESRC BODMAS project, here at UCL’s Centre for Advanced Spatial Analysis, that is led by James, and which started at the beginning of this year.

DataShine: Census shows web maps of the Quick Statistics aggregate tables of Census data for England/Wales for 2011, that were published last year by the Office for National Statistics.

DataShine: Census is the successor to CensusProfiler which I put together when I was at UCL’s Department of Geography in 2009. The main difference, apart from being a more modern website with updating URLs, geolocation etc, is that the data maps presented are “shone” through buildings, rather than covering all the land area. This has two advantages, and two disadvantages. The two advantages are that it means the countryside doesn’t dominate, and that the urban form (building blocks, parks, road structures) is more recognisable – so it looks more like a map of real places rather than a complicated patchwork of bright colours with abstract boundaries. The two disadvantages are that buildings can be individually represented, implying a greater level of spatial precision than is the case.

For the Census data, I wanted to come up with a good way of showing an interesting map, for all ~900 census aggregate variables, without having to make 900 decisions manually. To do this, I calculated the average percentage population, based on the populations across the output areas (~150 houses each), and the standard deviation of the percentage population. When you do this, and then plot the two statistics for each variable against each other, you get a graph like this:

census_qsgraph

Most variables have very small averages and so cluster at the bottom left hand side. The distinctive line of variables with small averages and high standard deviations are where the overall population is care homes and other institutions, rather than people or standard households.
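For a single census variable, the two statistics plotted on the graph can be sketched like this. It is a minimal illustration under stated assumptions: the `percentage_stats` name and the sample counts are made up, and the mean is taken as a simple unweighted average across output areas, which may differ from how the averages on DataShine were actually derived.

```python
from statistics import mean, pstdev


def percentage_stats(counts, totals):
    """Mean and standard deviation of a variable's percentage across areas.

    counts[i] is the number of people (or households) in output area i that
    fall in the category; totals[i] is that area's overall population.
    """
    percentages = [100.0 * c / t for c, t in zip(counts, totals) if t > 0]
    return mean(percentages), pstdev(percentages)


# Three made-up output areas.
average, deviation = percentage_stats([10, 20, 30], [100, 100, 100])
print(average, deviation)
```

Repeating this for every variable gives one (average, standard deviation) point per variable, which is what the scatter plot shows.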

I have split the variables into four sections, each of which is grouped differently for the key. The ones under the main triangle are mapped using a divergent colour scheme (red/green by default) from the average, which always appears in the middle of the key:

divergencemean

The ones above it (high standard deviations) are mapped as simple equal intervals of eighths, between 0 and 100%:

equalintervals

Finally, variables with very small/large averages, and large standard deviations, are mapped as multiples of the average (or 1-average) – here the average will always appear one from the beginning or the end of the key:

highav_highsd lowav_highsd

(The other three are using sequential colour ramps.)
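Two of the key styles described above can be sketched as break calculations. This is a hypothetical illustration only: the function names, the default of eight classes for the multiples-of-average key and the 50% cut-off for switching to "1-average" are assumptions, not the values DataShine uses.

```python
def equal_interval_breaks(classes=8):
    """Equal intervals between 0 and 100%, eighths by default."""
    return [100.0 * i / classes for i in range(classes + 1)]


def multiple_of_average_breaks(average, classes=8):
    """Breaks at multiples of the average, or of (100 - average) when it is large.

    For small averages the key counts up from 0, so the average is the second
    break. For large averages it counts down from 100, so the average is the
    second-to-last break.
    """
    if average <= 50:  # assumed cut-off between "small" and "large"
        return [average * i for i in range(classes + 1)]
    gap = 100 - average
    return [100 - gap * i for i in range(classes, -1, -1)]


print(equal_interval_breaks())
print(multiple_of_average_breaks(2.0, classes=4))
print(multiple_of_average_breaks(98.0, classes=4))
```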

DataShine is a platform for creating these kinds of web maps. As well as the initial census example, we are hoping to use it to create other sorts of web maps, and I hope to release and blog about those soon! I am also running a dedicated DataShine blog, which currently features some examples of particularly interesting maps coming from DataShine: Census, as well as some technical detail of the “geostack” behind the platform.

James has also written about the project.