Workshop on Big Data and Urban Informatics

IMG_0716

I attended the Big Data and Urban Informatics workshop at UIC in Chicago in early August. My previous blog post outlined my presentation at the workshop. Here are my notes and thoughts on some of the other talks that I attended.

IMG_0724

  • Above, the AURIN Workbench is a sophisticated platform for city authorities in Australia to output their data and visualise it through a portal. It’s an academic and commercial partnership. A key focus is data consolidation and normalisation, to allow for straightforward comparisons. This is a challenging aspect with so many data sources, from many authorities and places, and as such there is a large team of people involved with the ever-necessary data processing.
  • CASA scholar Greg Erhardt presented his Ph.D. work, below, combining public transport datasets for San Francisco to build up a multi-modal database. One particular challenge is the incomplete adoption of smartcard-based travel. Here in London, we are lucky that Oyster card usage is so high that it forms a near-complete picture of public transport usage in many parts of the city. This is not the case in San Francisco and many other cities.
  • There was an update on UrbanSim (picture at bottom), one of many urban models. A reworked version now uses the Python data science stack and is hosted on GitHub, both of which potentially open the model up to discovery, use and adaptation by new groups. ActivitySim is launching as part of the project; this will be an open activity-based travel demand model, to complement UrbanSim’s land-use focus.

IMG_0730

IMG_0739

I saw several other interesting talks and presentations, and it is striking just how much activity is going on in the urban informatics space, particularly with ever-larger volumes of so-called “big data” becoming easily available to researchers and visualisers.

Visit the new oobrien.com Shop
High quality lithographic prints of London data, designed by Oliver O'Brien
Electric Tube
London North/South

On City Dashboards and Data Stores

Earlier this month, I gave a short presentation at the Big Data and Urban Informatics Workshop, which took place at UIC (University of Illinois at Chicago). My presentation was an abridged version of a paper that I prepared for the workshop. In due course, I plan to publish the full paper, possibly as a CASA working paper or in another open form. The full paper had a number of authors, including Prof Batty and Steven Gray.

Below are the slides that formed the basis of my presentation. I left contextual information and links out of the slide deck itself, so I’ve added these after the embedded section:

Notes

Slide 3: MapQuest map showing CASA centrally located in London.
Slides 4-5: More information.
Slide 6: More information about my Bike Share Map, live version.
Slide 7: More information.
Slide 8: More information about CityDashboard, live version.
Slide 10: Live version of CityDashboard’s map view.
Slide 11: More information about the London Periodic Table, live version.
Slide 14: More information about Prism.
Slide 15: London and Paris datastores.
Slide 16: Chicago, Washington DC, Boston data portals.
Slide 17: The London Dashboard created by the Greater London Authority. Many of its panels update very infrequently.
Slide 18: Washington DC’s Open Government Dashboard and Green Dashboard. These are rather basic dashboards, the first being simply a graph and the second having just three categories.
Slide 19: The Amsterdam Dashboard created by WAAG, a non-profit computer society based in the heart of the city.
Slide 20: The Open Data City Census (US version/UK version) created by OKFN – a great idea to measure and compare cities by the breadth and quality of their open data offerings.
Slide 21: More information.
Slide 22: More information.
Slide 23: Pigeon Sim.
Slide 24: Link to iCity, More information on DataShine, live version.
Slide 25: More information on DataShine Travel to Work Flows, live version.

Some slides contain maps, which are generally based on OpenStreetMap (OSM) or Ordnance Survey Open Data datasets.

Borough Tops

Screen Shot 2014-08-05 at 14.49.16

The Diamond Geezer is, this month, climbing the highest tops in each one of London’s 33 boroughs.

To find the highest points, he’s used a number of websites which list the places. These derive the data from contour lines, perhaps supplemented with GPS or other measurements. However, another interesting, and new, data source for calculating this kind of metric is OS Terrain 50. Released as part of the Ordnance Survey Open Data packages, it is a gridded DEM (Digital Elevation Model). It’s right up to date, with a 50m x 50m horizontal resolution and a 10cm vertical resolution, and it should correct for buildings, so showing the true ground height.

Looking at the DEM for Newham, I think it reveals a new highest point – not Wanstead Flats at 15m above sea level, as Diamond Geezer’s lists suggest, but Westfield Avenue, the new road that runs through the Olympic Park. Beside John Lewis, the road rises, to a highest point of 21.6m. It shows as purple in the graphic above. Nearby, the new “bowl” of the lower part of the Olympic Stadium can be seen, as well as the trench through which High Speed 1 runs, at Stratford International Station.

I can’t argue with the Chancery Lane/Holborn junction as being the highest ground-point in the City of London, at 21.9m. In Tower Hamlets, it’s more tricky. The old railyards between Shoreditch High Street and the lines into Liverpool Street look like they are at 21.7m. However, the ground here is not publicly accessible, and the DEM is quite noisy here, with only part of the railyard showing this height.

I’m looking for a way to do this programmatically, calculating the highest DEM value for each borough. I’ve tried using QGIS’s Zonal Statistics plugin, with polygon shapefiles of London’s boroughs, but this only shows the mean value of the DEM for that borough.
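As a rough sketch of what such a calculation could look like outside QGIS: assuming the DEM and the borough polygons have already been rasterised onto the same grid, the highest cell per borough can be found in a single pass. The function name `zonal_max`, the toy grids and the zone labels "A" and "B" below are all made up for illustration; they are not OS Terrain 50 data.

```python
def zonal_max(dem, zones):
    """Return {zone: (max_height, row, col)} for two equally shaped grids.

    dem   -- list of rows of heights in metres
    zones -- list of rows of zone labels (None = outside every zone)
    """
    best = {}
    for r, (dem_row, zone_row) in enumerate(zip(dem, zones)):
        for c, (height, zone) in enumerate(zip(dem_row, zone_row)):
            if zone is None:
                continue  # cell lies outside all boroughs
            # Keep the first cell seen for a zone, then any strictly higher one.
            if zone not in best or height > best[zone][0]:
                best[zone] = (height, r, c)
    return best


# Toy 3x4 grids with invented values (hypothetical, for illustration only).
dem = [
    [12.0, 15.5, 21.6, 9.0],
    [11.0, 14.0, 20.1, 21.9],
    [10.5, 13.2, 18.7, 19.4],
]
zones = [
    ["A", "A", "A", "B"],
    ["A", "A", "B", "B"],
    [None, "A", "B", "B"],
]

for zone, (height, row, col) in sorted(zonal_max(dem, zones).items()):
    print(f"{zone}: {height} m at cell ({row}, {col})")
```

The row and column of the winning cell could then be converted back to a grid reference for the cell centre. Rasterising the borough boundaries in the first place is the harder step and is not shown here.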

Here’s the list I’ve created by measuring – the main issue with my dataset is that the measurements are only at the centre of each 50m x 50m cell.

Borough | Hgt (m) | 10-digit grid ref of 50m cell centre | Description of approximate location | By edge?
Barking and Dagenham | 45.3 | TQ 48590 89948 | Industrial area just E of northern part of Whalebone Lane North. |
Barnet | 146.1 | TQ 21955 95622 | Just south of the water tower to the east of Rowley Lane, near Rowley Green. |
Bexley | 81 | TQ 45737 71256 | Langdon Shaw, southwest side. | Yes
Brent | 91.2 | TQ 20732 88877 | Junction of Wakemans Hill Avenue and The Grove. |
Bromley | 246.5 | TQ 43637 56487 | A233 – where Main Road changes name to Westerham Hill | Yes
Camden | 135.6 | TQ 26277 86225 | Lower Terrace, just off Heath Street in Hampstead. | Yes
City of London | 21.9 | TQ 30970 81612 | NW edge – junction of Holborn and Chancery Lane. |
Croydon | 175.7 | TQ 34330 61827 | Sanderstead Plantation, SW path crossroads. |
Ealing | 81.5 | TQ 16177 84398 | Horsenden Hill |
Enfield | 118.7 | TQ 25632 97674 | Just north of Camlet Way, Hadley Wood, opposite Calderwood Place. | Yes
Greenwich | 131.1 | TQ 43831 76583 | Southern end of Eaglesfield Recreation Ground on Shooters Hill. |
Hackney | 39.8 | TQ 32025 87574 | In Finsbury Park, beside Green Lanes, opposite No. 330. | Yes
Hammersmith and Fulham | 45.9 | TQ 22960 82756 | Harrow Road at north end of bridge over the railway line near Kensal Green station. | Yes
Haringey | 129 | TQ 28326 87479 | Ground by Highgate School Chapel, just north of Highgate High Street. |
Harrow | 153.4 | TQ 15288 93808 | Magpie Hall Road, between The Common and Alpine Walk. | Yes
Havering | 106 | TQ 51192 93055 | Churchyard of St John the Evangelist church (also Broxhill Road by the cricket pitch) |
Hillingdon | 130.5 | TQ 10585 91678 | Junction of South View Road and Potter Street Hill | Yes
Hounslow | 33.6 | TQ 11320 78815 | Western Road – bridge over the Grand Union Canal. |
Islington | 99.9 | TQ 28874 87217 | Highgate Hill and Hornsey Lane junction. | Yes
Kensington and Chelsea | 45.7 | TQ 23014 82728 | Kensal Green Cemetery, northern edge, beside the Harrow Road, above the railway line. | Yes
Kingston upon Thames | 91.3 | TQ 16644 60376 | Telegraph Hill |
Lambeth | 110.9 | TQ 33620 70729 | Westow Hill and Jasper Road junction. | Yes
Lewisham | 111.2 | TQ 33918 71779 | Sydenham Hill and Rock Hill junction. | Yes
Merton | 56 | TQ 23627 70823 | Lauriston Road and Wilberforce Way NW junction. |
Newham | 21.6 | TQ 37967 84530 | Westfield Avenue, outside John Lewis in Westfield Stratford City. |
Redbridge | 91.5 | TQ 47945 93784 | Cabin Hill |
Richmond upon Thames | 56 | TQ 18779 73065 | Bridleway/path junction just east of Queens Road, opposite the Pembroke Lodge car-park and to the NE of it. |
Southwark | 111.5 | TQ 33926 71686 | Sydenham Hill, between Chestnut Place and Bluebell Close. | Yes
Sutton | 146.4 | TQ 28383 59986 | Middle of rectangle of land south-east of Corrigan Avenue and south-west of Richland Avenue. |
Tower Hamlets | 21.7 | TQ 33720 82184 | Railway yards between Shoreditch High Street station and the railway lines leading to Liverpool St Station. |
Waltham Forest | 92.2 | TQ 38415 95010 | Pole Hill (north top) |
Wandsworth | 60.7 | TQ 22881 72780 | Big Alp, Wimbledon Common |
Westminster | 53 | TQ 26627 18386 | Finchley Road and Boundary Road junction. | Yes

DataShine Travel to Work Flows

datashinecommute

Today, the Office for National Statistics (ONS) have released the Travel to Work Flows based on the 2011 census. Together these form a giant origin-destination matrix of where people commute to work. There are various tables that have been released. I’ve chosen the Method of Travel to Work and visualised the flows, for England and Wales, on this interactive map. The map uses OpenLayers, with an OpenStreetMap background for context. Because we are showing the flows and places (MSOA population-weighted centroids) as vectors, a reasonably powerful computer with a large screen and a modern web browser is needed to view the map. The latest versions of Firefox, Safari or Chrome should be OK. Your mobile phone will likely not be so happy.

Blue lines represent flows coming in to the selected place, where people work. Red lines show flows out from the selected place, to work elsewhere.
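To make the inbound/outbound distinction concrete, here is a minimal sketch of how the two sets of lines for one selected place could be pulled out of origin-destination records. It is an illustration only, not the code behind the map: the function name `flows_for`, the place codes "P1" to "P3" and the commuter counts are all invented.

```python
from collections import defaultdict

# (origin, destination, commuters) records; codes and counts are made up.
flows = [
    ("P1", "P2", 120),
    ("P1", "P3", 40),
    ("P2", "P1", 75),
    ("P3", "P1", 10),
    ("P3", "P2", 55),
    ("P1", "P1", 300),  # people who live and work in the same place
]


def flows_for(place, records):
    """Split the records touching `place` into inbound and outbound totals."""
    inbound = defaultdict(int)   # keyed by origin: would be drawn in blue
    outbound = defaultdict(int)  # keyed by destination: would be drawn in red
    for origin, destination, count in records:
        if origin == destination:
            continue  # no line to draw for an internal flow
        if destination == place:
            inbound[origin] += count
        if origin == place:
            outbound[destination] += count
    return dict(inbound), dict(outbound)


inbound, outbound = flows_for("P1", flows)
print("in:", inbound)
print("out:", outbound)
```

Each key in the two dictionaries corresponds to one line on the map, with the count available to set the line’s thickness.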

The map is part of the DataShine platform, an output of the BODMAS project led by Dr Cheshire, where we take big, open datasets and analyse them. The data, both the travel to work flows and the population-weighted MSOA centroids, come from the ONS, table WU03EW.

View the interactive map here.

lichfieldcommute

London Words

Screen Shot 2014-07-21 at 15.46.02

Above is a Wordle of the messages displayed over the last couple of months on the big dot-matrix displays (aka variable message signs) that sit beside major roads in London. The larger the word, the more often it is shown on the screens.

The data comes from Transport for London via their Open Data Users platform, through CityDashboard’s API. We now store some of the data behind CityDashboard, for London and some other cities, for future analysis into key words and numbers for urban informatics.

Below, as another Wordle, are the top words used in tweets from certain London-centric Twitter accounts – those from London-focused newspapers and media organisations, tourism organisations and key London commentators. Common English words (e.g. to, and) are removed. I’ve also removed “London”, “RT” and “amp”.

Screen Shot 2014-07-21 at 15.56.57

Some common words include: police, tickets, City, crash, Boris, Thames, Park, Festival, Bridge, bus, Kids.
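The counting step behind a word cloud like this can be sketched in a few lines. This is an assumption-laden illustration rather than what Wordle or CityDashboard actually do: the stop-word list is a tiny stand-in for a proper one, the function name `word_counts` is hypothetical, and the two sample messages are invented.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"to", "and", "the", "a", "of", "in", "on", "at", "for", "is"}
# Extra tokens removed by hand, as described above.
EXTRA_REMOVED = {"london", "rt", "amp"}


def word_counts(messages):
    """Count words across messages, ignoring case, stop words and extras."""
    skip = STOP_WORDS | EXTRA_REMOVED
    counts = Counter()
    for message in messages:
        # Runs of letters/apostrophes; "&amp;" therefore yields the token "amp".
        for word in re.findall(r"[A-Za-z']+", message):
            word = word.lower()
            if word not in skip:
                counts[word] += 1
    return counts


# Invented sample messages, not real tweets.
messages = [
    "RT Police close the bridge in London",
    "Bus tickets for the festival &amp; bridge walk",
]

for word, count in word_counts(messages).most_common(3):
    print(word, count)
```

Lower-casing everything is a simplification; the graphics above keep capitalisation, so “City” and “city” would be counted separately there.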

Finally, here are the notes that OpenStreetMap editors use when they commit changes to the open, user-created map of the world, for the London area:

Screen Shot 2014-07-21 at 16.10.50

Transport and buildings remain a major focus of the voluntary work that contributors are carrying out to complete and maintain London’s map.

There is no significance to the colours used in the graphics above. Wordle is a quick-and-dirty way to visualise data like this; we are looking at more sophisticated, and “fairer”, methods as part of ongoing research.

This is preparatory work for the Big Data and Urban Informatics workshop in Chicago later this summer.

Thanks to Steve and the Big Data Toolkit, which was used in the collection of the Twitter data for CityDashboard.