Category Archives: Data Graphics

Eight Ways to Better Flow Maps

As part of a presentation I gave yesterday at the RSAI-BIS (Regional Science Association International – British & Irish Section) annual conference, on DataShine Travel to Work maps, I outlined the following eight techniques to avoid swamping origin/destination (aka flow) maps with masses of data, typically shown as straight lines between each pair of locations.

Lines tend to obscure other lines, making the flows of interest and significance harder to spot, and creating an ugly visual impact. See above for an extreme example which shows (all) cycle-to-work flows in inner-city London. Large numbers of flow lines, if delivered as vectors to a web browser, can also cause the web browser to run slowly or run out of memory, affecting the user experience.

To avoid this, I generally try to use one or several of the following techniques.

1. Restrict to a single origin or a single destination. This does require the user to first click on a location of interest before any flow can be seen:

From L to R, DataShine Commute, Understanding Scotland’s Places (USP) and DataShine Region Commute, the last one showing that, in some cases, this can still produce an overload of lines.

2. Only show flows above a threshold. This could be a simple minimum value threshold (e.g. 10 people), a set number of lines (e.g. 1000 largest flows) or dynamic value-based limit (e.g. only where flow is 1% of the origin population), the latter generally only working if a single origin is shown at a time:

From L to R, The Great British Bike To Work (with a simple flow-size threshold) and Understanding Scotland’s Places, which uses a dynamic origin-based threshold, shown here contrasting the number of bidirectional flows visualised from a large city (centre) with those from a small town (right), each being selected in turn.
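The three kinds of threshold in technique 2 can be sketched in a few lines. This is only an illustration: the `(origin, destination, people)` tuple layout, the function names and the sample numbers are all made up for the example, not taken from any of the sites mentioned.

```python
def min_value_filter(flows, minimum):
    """Keep flows carrying at least `minimum` people."""
    return [f for f in flows if f[2] >= minimum]


def top_n_filter(flows, n):
    """Keep only the n largest flows."""
    return sorted(flows, key=lambda f: f[2], reverse=True)[:n]


def origin_share_filter(flows, origin_population, share):
    """Keep flows that are at least `share` of their origin's population."""
    return [f for f in flows if f[2] >= share * origin_population[f[0]]]


# Hypothetical sample data: (origin, destination, people)
flows = [
    ("A", "B", 120),
    ("A", "C", 8),
    ("B", "A", 45),
    ("C", "A", 3),
]
origin_population = {"A": 5000, "B": 9000, "C": 200}

simple = min_value_filter(flows, 10)
largest = top_n_filter(flows, 2)
dynamic = origin_share_filter(flows, origin_population, 0.01)
```

Note how the dynamic filter keeps the tiny flow from the small origin "C" but drops the larger flow from "B", which is why it tends to suit single-origin views.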

3. Minimise the overall number of possible origins/destinations. What you lose in detail you might gain in clarity and simplicity. DataShine Region Commute only shows flows between LAs, rather than the spatial detail of flows within them.

4. Restrict the geography. The Propensity to Cycle Tool (Lovelace R et al, 2017) shows the main flows (based on a threshold) on a county-by-county basis, with easy and clear prompts to allow the user to move to a neighbouring county if they wish.

5. Bend the lines. Tools, such as the Stanford Flow Map Layout tool or Gephi with the “Geo Layout” and curved lines, allow flow lines to be clustered or curved in a way that reduces clutter, while retaining geography. The first approach clumps pairs of flow lines together in a logical way, as soon as they approach each other. The second approach simply curves all the lines, on a clockwise basis, generally removing them from the central area unless that is their destination. See also this paper by Bernhard Jenny (Jenny B. et al, 2017) which details the benefits of curving lines and further cartographic modifications, and this paper by Stefan Hennemann (Hennemann S. et al, 2015) which outlines a sophisticated approach to grouping together flow lines, on a world-wide basis.

From L to R: Commutes into London from districts outside London, from the 2001 census, by Alastair Rae (Rae A., 2010) using the Stanford Flow Map Layout tool, and top destination for each origin tube station, based on Oyster card data, by Ed Manley (Manley E., 2014) using a particular Gephi flow layout.
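For a rough idea of the simpler "curve everything the same way" approach in technique 5, here is a minimal sketch using a quadratic Bezier curve. It is not the Gephi or Stanford algorithm; the function name, the `bend` parameter and the choice of bending to the right of the direction of travel (clockwise when y points up) are assumptions made for illustration.

```python
def curved_line(start, end, bend=0.2, steps=16):
    """Return points along a quadratic Bezier curve from start to end.

    The control point sits beside the midpoint of the straight line,
    offset to the right of the direction of travel by bend * length.
    """
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0
    # Midpoint, pushed sideways along the right-hand perpendicular (dy, -dx)
    cx = (x0 + x1) / 2 + bend * dy
    cy = (y0 + y1) / 2 - bend * dx
    points = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        points.append((x, y))
    return points


outbound = curved_line((0, 0), (10, 0), steps=2)
inbound = curved_line((10, 0), (0, 0), steps=2)
```

Because the bend is always relative to the direction of travel, the A to B and B to A lines bow to opposite sides, so bidirectional flows no longer sit on top of each other.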

6. Route the flow. Snap the lines to roads or other appropriate linear infrastructure, using shortest-path or sensible-path routing, and combining the segments of lines that meet together, either by increasing the width or adjusting the hue or translucency.

From L to R: The Propensity to Cycle Tool (Lovelace R et al, 2017) routed for shortest path, and journeys on the “Boris Bikes” bikeshare system in central London, routed with OSM data to the shortest cycle-friendly route. In both cases, journeys meeting along a segment cause the segment to widen proportionally.
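The "combine the segments that meet" step of technique 6 might look something like the sketch below, assuming each journey has already been routed into a list of node identifiers. The data layout, the function names and the decision to treat both directions of travel as one segment are assumptions for the example, not a description of how either tool does it.

```python
from collections import Counter


def segment_loads(routed_journeys):
    """Sum the journey counts along every segment that the routes share.

    routed_journeys is a list of (path, count) pairs, where path is the
    list of node ids the route passes through.
    """
    loads = Counter()
    for path, count in routed_journeys:
        for a, b in zip(path, path[1:]):
            # Sort the pair so both directions count towards one segment
            loads[tuple(sorted((a, b)))] += count
    return loads


def line_width(load, max_load, max_width=12.0, min_width=0.5):
    """Width in pixels, proportional to the load on the segment."""
    return max(min_width, max_width * load / max_load)


# Hypothetical routed journeys
journeys = [
    (["a", "b", "c"], 10),
    (["d", "b", "c"], 5),
    (["c", "b"], 2),
]
loads = segment_loads(journeys)
widest = line_width(loads[("b", "c")], max(loads.values()))
```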

7. Don’t use a simple geographical map. This map, created by Robert Radburn at City University (Radburn R, 2015) in Tableau, is a “small multiple” style map of car commutes between London boroughs, with a map of London being made up itself of miniature maps of London. Each inner map shows journeys originating from the highlighted borough to the other boroughs. These maps are then arranged in a map themselves. It takes a little getting used to but is an effective way to show all the flows at once, without any potentially overlapping lines.

8. Miss out the flow lines altogether. Here, a selected origin (in green) causes the destination circles to change in size and colour, depending on the flow to them. In this case, the flow is modelled commutes on the London Underground network – made clearer by the addition of the tube lines themselves on the second map – but just as a background augmentation rather than flow lines.

Visit the new oobrien.com Shop
High quality lithographic prints of London data, designed by Oliver O'Brien

Evolution of London’s Rush Hour Traffic Mix

My latest London data visualisation crunches an interesting dataset from the Department for Transport. The data is available across England, although I’ve chosen London in particular because of its more interesting (i.e. not just car dominated) traffic mix. I’ve also focused on just the data for 8am to 9am, to examine the height of the morning rush hour, when the roads are most heavily used. 15 years’ worth of data is included – although many recording stations don’t have data for each of those years. You can choose up to three modes of transport at once, with the three showing as three circles of different colours (red, yellow and blue) superimposed on each other. The size of each circle is proportional to the flow.

It’s not strictly a new visualisation, rather, it’s an updated version of an older one which had data from just one year, using “smoothed” counts. But it turns out that the raw counts, while by their nature more “noisy”, cover a great many more years and are split by hours of the day. I’ve also filtered out counting stations which haven’t had measurements made in the last few years.

Note also the graph colours and map colours don’t line up – unfortunately the Google Material API, that I am using for the charting, does not yet allow changing of colours.

An alternate mode for the map, using the second line of options, allows you to quantify the change between two years, for a single selected type of transport. Green circles show an increase between the first and second year, with purple indicating decreases.
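As a sketch of the comparison mode, the function below turns two yearly counts into a circle colour and radius. The green/purple rule is as described above; scaling the circle's area (rather than its radius) with the size of the change, and the `scale` factor itself, are assumptions for the example.

```python
import math


def change_circle(count_first, count_second, scale=0.5):
    """Return (colour, radius) for the change between two years."""
    change = count_second - count_first
    if change > 0:
        colour = "green"    # increase between the first and second year
    elif change < 0:
        colour = "purple"   # decrease
    else:
        colour = "none"     # no change, nothing to draw
    # Radius grows with the square root so that area tracks the change
    radius = scale * math.sqrt(abs(change))
    return colour, radius
```

For example, `change_circle(100, 500)` and `change_circle(900, 500)` give circles of the same size in green and purple respectively.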

Lives on the Line v2: Estimated Life Expectancy by Small Areas

I’ve produced an updated version of a graphic that my colleague Dr James Cheshire created a few years ago, showing how the estimated life expectancy at birth varies throughout the capital, using a geographical tube map to illustrate sometimes dramatic change in a short distance.

You can see an interactive version on my tube data visualisation platform. Click a line colour in the key on the bottom right, to show just that line. For example, here’s the Central line in west London.

The data source is this ONS report from 2015 which reports averages by MSOA (typical population 8000) for 2009-2013. I’ve averaged the male and female estimates, and included all MSOAs which touch or are within a 200m radius buffer surrounding the centroid of each tube, DLR and London Overground station and London Tram stops. I’ve also included Crossrail which opens fully in 2019. The technique is similar to James’s; he wrote up how he did it in this blogpost. I used QGIS to perform the spatial analysis. The file with my calculated numbers by station is here and I’m planning on placing the updated code on GitHub soon.
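The analysis itself was done in QGIS, but the selection rule can be illustrated in plain Python. The sketch below assumes coordinates in metres (for example British National Grid), simple polygons without holes, and an unweighted mean across the selected MSOAs; the function names, the dictionary layout and the sample figures are all invented for the example.

```python
import math


def point_in_polygon(pt, polygon):
    """Ray-casting test: is pt inside the polygon (a list of (x, y))?"""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(pt, a, b):
    """Shortest distance from pt to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def touches_buffer(pt, polygon, radius):
    """True if the polygon contains pt or comes within radius of it."""
    if point_in_polygon(pt, polygon):
        return True
    n = len(polygon)
    return any(
        distance_to_segment(pt, polygon[i], polygon[(i + 1) % n]) <= radius
        for i in range(n)
    )


def station_life_expectancy(station, msoas, radius=200.0):
    """Average the male/female mean over MSOAs touching the buffer."""
    selected = [m for m in msoas if touches_buffer(station, m["polygon"], radius)]
    if not selected:
        return None
    return sum((m["male"] + m["female"]) / 2 for m in selected) / len(selected)


# Hypothetical square MSOAs and made-up estimates
msoas = [
    {"polygon": [(-500, -500), (500, -500), (500, 500), (-500, 500)], "male": 78, "female": 82},
    {"polygon": [(550, -500), (1550, -500), (1550, 500), (550, 500)], "male": 80, "female": 84},
    {"polygon": [(-1500, -500), (-600, -500), (-600, 500), (-1500, 500)], "male": 70, "female": 74},
]
estimate = station_life_expectancy((400, 0), msoas)
```

Here the station sits inside the first area and within 150m of the second, so both contribute, while the third is too far away to be included.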

My version uses different aggregation units (MSOAs) to James’s original (which used wards). As such, due to differing wards and MSOAs being included within each station’s buffer area, you cannot directly compare the numbers between the two graphics. An addition is that I can include stations beyond the London boundary, as James’s original dataset was a special dataset covering the GLA area only, while my dataset covers the whole of England. The advantage of utilising my data-driven platform means that I can easily update the numbers, as and when new estimates are published by the ONS.

Estimating life expectancies at birth for small areas, such as MSOAs, is a tricky business and highly susceptible to change, particularly due to London’s high rates of internal migration and environmental change. Nevertheless it provides a good snapshot of a divided city.

View the interactive version.

Data: ONS. Code: Oliver O’Brien. Background mapping: HERE Maps.

Big Data Here

The Consumer Data Research Centre (CDRC) at UCL is organising a short pop-up exhibition on hyperlocal data: Big Data Here. The exhibition is taking place in North Lodge, the small building right beside UCL’s main entrance. The exhibition materials are supplied by the Centre for Advanced Spatial Analysis (CASA).

Inside, a big projection shows local digital information. What the screen shows will change daily between now and Friday, when the exhibition closes. Today it is showing a live to-the-second feed of bus arrivals at the bus stop outside the North Lodge, and tube train arrivals at Euston Square station just up the road. Watch the buses zip by as they flash up “Due” in big letters on the feed. Both of these are powered by Transport for London’s Unified Push API, and we are planning on publishing the visualisation online next week. Tomorrow will be showing a different local data feed, and then a final one on Friday.

Opposite the projection is the iPad Wall. This was created by CASA a few years back by mounting a bank of iPads to a solid panel (above photo shows them in test mode) and allowing remote configuration and display. The wall has been adapted to show a number of metrics across its 12 panels. Four of these showcase footfall data collected by one of our data partners, and being used currently in CDRC Ph.D. research. The other panels show a mixture of air quality/pollutant measures, tube train numbers and trends, and traffic camera videos.

We hope that passersby will enjoy the exhibition visuals and use them to connect the real world with the digital space, a transposition of a digital data view onto the physical street space outside.

The exhibition runs 24 hours a day until Friday evening, with the doors open from noon until 3:30pm each day. The rest of the time, the visualisations will be visible through the North Lodge’s four windows. The exhibition is best viewed at night, where the data shines out of the window, spilling out onto the pavement and public space beyond:

Big Data Here is taking place during Big Data Week 2016. Visit the exhibition website or just pop by UCL before Friday evening.

SIMD 2016: The Scottish Index of Multiple Deprivation

Like its English counterpart IMD, SIMD is released every few years by the Scottish government, as a dataset which scores and ranks every small statistical area in Scotland according to a number of measures. These are then combined to form an overall rank and measure of deprivation for the area. This can then be mapped to show the geographical variation and spread of deprived (and non-deprived) communities across the country. I mapped SIMD 2012 for The New Booth Map and it also appears as a layer in CDRC Maps.

Dr Cheshire and I were recently commissioned to produce a new website to showcase the older SIMD 2012, and for the release of SIMD 2016, that contained tools useful for researchers and other specialist users, such as specific area data selection and retrieval and map downloads. The base of the website was the “DataShine” mapping style used in both the above examples, where only buildings are coloured, so that urban areas can be easily seen and related. With the great majority of the Scottish population in urban areas, and vast areas of unpopulated land in the country, this style of mapping is very useful both to draw the eye to where the population is, and also present a map that is a more familiar representation of the country. As such, even though this is intended to be a “pro users” site, it is accessible and useful to the general viewer too.

The new website, SIMD.scot, was launched by the Scottish Government along with the SIMD 2016 statistical release, at the end of August. It was featured on the BBC News Scotland website, as well as on the Daily Record and Scotsman newspaper websites, drawing 60,000+ visits in the first few days.

Some technical notes about SIMD.scot:

  • UTFGrids, at 4×4 pixel resolution, are used for the mouseover popup data on component indices.
  • I use HTML5 Download to create a PNG image of the current map view – this works only in Chrome and Firefox.
  • A “mobile” version of the website starts with an area chooser dialog, when viewed on screens smaller than 800px wide.
  • The website uses static content (except for the postcode search) in order to load quickly, even when many people are viewing the site at once.
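To show what a UTFGrid lookup at 4×4 pixel resolution involves, here is a minimal sketch of decoding the feature under a given pixel of a tile, following the character encoding in the UTFGrid specification. It is not the site's actual code (which runs in the browser); the function name, the tiny 4×4 grid and the keys and "rank" values are invented for the example.

```python
def utfgrid_lookup(utfgrid, pixel_x, pixel_y, resolution=4):
    """Return the data record under a pixel of the tile, or None."""
    # Each grid character covers a resolution x resolution block of pixels
    row = utfgrid["grid"][pixel_y // resolution]
    code = ord(row[pixel_x // resolution])
    # Undo the encoding, which skips the '"' (34) and '\' (92) characters
    if code >= 93:
        code -= 1
    if code >= 35:
        code -= 1
    code -= 32
    key = utfgrid["keys"][code]
    # An empty key means there is no feature at this location
    return utfgrid["data"].get(key) if key else None


# Hypothetical grid for a 16 x 16 pixel tile (a real tile is much larger)
sample = {
    "grid": ["  !!", "  !!", "##!!", "##!!"],
    "keys": ["", "S01000001", "S01000002"],
    "data": {"S01000001": {"rank": 120}, "S01000002": {"rank": 4310}},
}
under_mouse = utfgrid_lookup(sample, 9, 2)
```

Because the grid is plain JSON that can be served as a static file, this kind of lookup fits the "static content" approach mentioned in the last bullet.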

The work was carried out through UCL Consultants. Explore the SIMD 2016 map itself at SIMD.scot or see the 2012 version.

Tube Heartbeat

Tube Heartbeat is an interactive map that I recently built as part of a commission by HERE, using the HERE JavaScript API. It visualises a fascinating dataset that TfL makes available sporadically – the RODS (Rolling Origin Destination Survey) – which reveals the movements of people on the London Underground network in amazing detail.

The data includes, in fifteen-minute intervals throughout a weekday, the volume of tube passengers moving between every adjacent pair of stations on the entire tube network – 762 links across the 11 lines. It also includes numbers entering, exiting and transferring within each of the 268* tube stations, again at a 15 minute interval from 5am in the morning, right through to 2am. It has an origin/destination matrix too, again at fine-grained time intervals. The data is modelled, based on samples of how and where passengers are travelling, during a specimen week in the autumn – a period not affected either by summer holidays or Christmas shopping. The size of the sample, and the careful processing applied, means that we can be confident that the data is an accurate representation of how the system is used. The data is published every few years – as well as the most recent dataset, I have included an older one from 2012, to allow for an easy comparison.
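As an illustration of working with the fifteen-minute intervals, the sketch below numbers the slots from 5am (slot 0) to the one ending at 2am (slot 83), formats a slot as an HHMM-HHMM label, and picks out the busiest slot from a list of counts. The slot numbering and function names are assumptions for the example, not the structure of the RODS files.

```python
FIRST_SLOT_MINUTES = 5 * 60  # the data starts at 5am
SLOT_MINUTES = 15
SLOTS_PER_DAY = 84           # 5am through to 2am is 21 hours


def slot_label(index):
    """Format a slot as HHMM-HHMM, e.g. slot 14 is '0830-0845'."""
    def hhmm(minutes):
        return f"{(minutes // 60) % 24:02d}{minutes % 60:02d}"

    start = FIRST_SLOT_MINUTES + index * SLOT_MINUTES
    return f"{hhmm(start)}-{hhmm(start + SLOT_MINUTES)}"


def busiest_slot(counts):
    """Return (label, count) for the slot with the highest count."""
    index = max(range(len(counts)), key=lambda i: counts[i])
    return slot_label(index), counts[index]


# Hypothetical link with a single made-up spike in the morning
counts = [0] * SLOTS_PER_DAY
counts[14] = 8208
peak = busiest_slot(counts)
```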

As well as the animation of the data, showing the heartbeat of London as the lines pulse with passengers squeezing along them, I’ve included graphs for each station and each link. These show all sorts of interesting stats. For example, Leicester Square has a huge evening peak, when the theatre-goers head for home:

Or Croxley, in suburban north-west London, with a very curious set of peaks, possibly relating to the condensed school day:

Walthamstow (along with some other east London stations) has two morning rush-hours with a slight lull between them:

Check the later panels in the Story Map, the intro which appears when first viewing Tube Heartbeat, for more examples of local quirks.

This is my first interactive web map produced using the HERE JavaScript API – in the past, I have extensively used OpenLayers, as well as, a long while back, the Google Maps API. The API was quick to pick up, thanks to good examples and documentation, and while it isn’t quite as full-featured as OpenLayers in terms of the cartography, it does include a number of extra features, such as being quickly able to implement direction arrows along lines, and access to a wide variety of HERE map image tiles. I’m using two of these – a subdued gray/green background map for the daytime, and an equivalent darker one for the evening data. You’ll see the map transition between the two in the early evening, when you “play” the animation or scrub the slider forwards.

Additionally, I’ve overlaid a translucent light grey rectangle across the map, which acts to further diffuse the background map and highlight the tube data on top. The “killer” feature of the HERE JavaScript API, for me, is that it’s super fast – much faster than OpenLayers for displaying complex vector-based data on a map, on both computer and smartphone. Being part of the HERE infrastructure makes access to the wide range of HERE map tiles, with their distinctive design, easy, and gives the maps a distinctive look. I have previously used HERE mapping for some cities in the Bike Share Map (& another example), initially where the OpenStreetMap base data was low in detail for certain cities, but now for all new cities I “onboard” to the map. The attractive cartography works well at providing context for the bikeshare station data there, and the tube flow data here.

There is some further information about the project on the HERE 360 blog, and I am looking to publish a more detailed blogpost soon about some of the technical aspects of putting together Tube Heartbeat.

Stats

  • Number of stations: 268*
  • Number of lines: 11
  • Number of line links between stations: 762

Highest flows of people in 15 minutes, for the four peaks:

Between stations (all are on the Central line):

  • Morning: 8208, 0830-0845, Bethnal Green to Liverpool Street
  • Lunchtime: 2570, 1230-1245, Chancery Lane to Holborn
  • Afternoon: 7166, 1745-1800, Bank/Monument to Liverpool Street
  • Evening: 2365, 2230-2245, St Paul’s to Bank/Monument

Station entries:

  • Morning: 7715, 0830-0845, Waterloo
  • Lunchtime: 1798, 1130-1145, Victoria
  • Afternoon: 5825, 1730-1745, Bank/Monument
  • Evening: 2095, 1015-1030, Leicester Square

Station interchanges:

  • Morning: 5881, 0830-0845, Oxford Circus
  • Lunchtime: 2060, 1330-1345, Oxford Circus
  • Afternoon: 5043, 1745-1800, Oxford Circus
  • Evening: 1109**, 2215-2230, Green Park

Station exits:

  • Morning: 6923, 0845-0900, Bank/Monument
  • Lunchtime: 2357, 1145-1200, Oxford Circus
  • Afternoon: 7013, 1745-1800, Waterloo
  • Evening: 1203, 1015-1030, Waterloo

* Bank/Monument treated as one station, as are the two Paddington stations.
** Other stations have higher flows at this time but as a decline from previous peak.

I’m also hoping, as time permits, to extend Tube Heartbeat to other cities which make similar datasets available. At the time of writing, I have found no other urban transport authority that publishes data quite as detailed as London does, but San Francisco’s BART system publishes origin/destination data on an hourly basis, there is turnstile entry/exit data from New York’s MTA subway, although only at a four-hour granularity, and Washington DC’s metro also publishes a range of usage data. I’ve not found an equivalent dataset elsewhere in Europe, or in Asia; if you know of one, please do let me know below.

The data represented in Tube Heartbeat is Crown copyright & database right, Transport for London 2016. Background mapping imagery is copyright HERE.

Putting Cartography Back on the Map – Google Maps Getting Prettier

There was a time when Google Maps was an ugly duckling. It started life as a road map, and its grey background was decried at a memorable keynote at the British Cartographic Society annual conference 8 years ago, contrasting with the classic Ordnance Survey Landranger maps where the spaces between roads were normally full of “something” – be it contours, trees or antiquities. Google’s features, on the other hand, were pretty messy, and often wrong. However, Google has been steadily beautifying its functional map (and correcting it), focusing on the cartography as well as the data, as it turns from a map of roads and POI pins, to a map of everything. 2013 was a big step forward, when the map became vector-based and superimposed features customised to just you. Now in 2016, it’s the look of the map itself that is the focus. Cartography on digital maps is far from dead.

This week, Google has unveiled the latest update to Google Maps, showing that it is serious about the cartography and colour. The map has a cleaner, more refined look that continues its trend of taking out the detail you don’t need and focusing on the information that you are looking for. The two most obvious changes are (a) a new, brown/orange shading showing “areas of interest” – think high streets and tourist attractions, and (b) smaller roads have had their borders removed and are now simple white lines overlaid on a grey, green or brown background. I have been keen on this technique, using it for OOMap, DataShine and CDRC Maps. MapBox’s basic-style map of OpenStreetMap data also has taken this “white on grey, + data” approach which I am sure has helped inspire Google’s new look. (OpenStreetMap.org has always taken a different approach: with the many contributors wanting their particular mapping visible, it has always looked very busy and colourful. Unlike MapBox and Google Maps, OpenStreetMap.org’s map is to be seen “as is”, rather than acting as a background map upon which colourful project-specific data is intended to be overlaid.)

An accompanying blog post goes into more details about the changes. It includes a nice graphic demonstrating the new colour palette used and how Google are using colour to group and categorise map features, which I’ve reproduced here:

There is a clear use of complementary colours to balance out the map – the search results and current user interest shown in red, man-made features in pinks and oranges, and natural features in greens and blues, all criss-crossed with the white (and yellow) transport networks. It makes for a map that is logical to look at – and crucially, one that is immediately pleasing to the eye. It doesn’t “shout” at you any more.

One final note – the “Areas of Interest” is a powerful new bit of cartography – it draws the eye to it, and means Google Maps has a significant influence on what parts of an unfamiliar city you are likely to visit. It’s a subtle but key bit of “suggestive” mapping. Bad news for the businesses though that rely on passing trade, and are not in these areas.

Inside HERE

A startup with a billion dollar asset. This is how HERE’s new CEO Edzard Overbeek describes the location services company that is making a striking pitch for being the third major digital mapping and location platform alongside Google and Apple.

HERE has had an interesting recent history. Originally NAVTEQ, one of the major cross-world road network databases, used by various “sat nav” systems, it was bought by Nokia and became Nokia Maps, before being rebranded as Ovi Maps. Nokia then sold its phone business to Microsoft – but as the latter already had Bing Maps, the digital mapping business was spun off into a new unit and sold to a consortium of German car companies. At the time, this perhaps seemed a surprising new set of owners but it has quite quickly become obvious – with self-driving car technology suddenly seemingly closer on the horizon, the need to have a global, highly precise digital map of the world’s streets is suddenly incredibly important – the aforementioned billion dollar asset. Google has been building it up from its initial, low-precision mapping, using its fleet of LIDAR mapping cars, and Apple has been doing the same, arguably starting from an even worse base. HERE has arrived in the space with the highest quality start, having been based on a digital map that is over 20 years old.

The insideHERE Event

HERE was kind enough to invite me to an event, insideHERE, at their European headquarters in the heart of Berlin, for demonstrations of their portfolio, using some of the platforms used recently at MWC, CES and the other major trade exhibitions in the technology and mobile space. They also discussed a few “under the hood” features, and what they are working on right now.

There were three themes, reflecting the three main segments of digital mapping at the moment – business, consumer and auto. A cancelled flight at very short notice (thanks for nothing, Norwegian!) meant that I arrived in Berlin late and so missed the first two. The first can be summarised with the HERE Reality Lens product, which provides high quality asset and street furniture mapping for use and management by local authorities, and the second is encompassed by the HERE mobile app, which occupies the same space as the Apple Maps and Google Maps apps, aiming to displace these on their respective platforms. This is a challenge of course, as the existing apps are pretty good, so HERE’s unique selling point is that its app is designed for offline use from the ground up (Google Maps offers this on a slightly more restricted basis, but HERE will be available in offline mode for an area as soon as you initially load it up online). Reality Lens and the HERE Offline Maps app are nice pieces of technology that utilise data from HERE’s car data gathering options and make it accessible to public sector and consumer users respectively, but it was clear, both from HERE’s new owners and the comparative length of time used during the day, that HERE Auto is the key sector for the company now.

Geodemographics

HERE have developed geodemographic profiles for car users (drivers/passengers), based on surveys in the USA, South Korea and Germany. Using cluster analysis of the results, they have identified six characteristic types of users, based on how they use cars and other transport options, day to day:

Autonomous Navigation Data

Here’s a visualisation of the datasets that HERE use for self-driving cars. These are datasets designed for machines, not people, and the maps of the datasets, shown here, show the breadth and detail of the information used by self-driving cars to determine road information:

The data in these maps is highly compressed and delivered to cars, anywhere in the world, in cacheable 2km x 2km squares. (N.B. In one of the three pictures showing the maps of these datasets here, there is a mistake with the data shown. Can you see it? It’s obvious – once you’ve spotted it. No, it’s not that the cars are on the wrong side of the road, as it’s showing a German autobahn rather than a British highway. Leave a comment if you find it!)

Spatial Data Visualisation

HERE also have some nice demo rigs to show their data in a context that is familiar to people, such as using a top-down projection on a 3D model city section, allowing data to be draped over the buildings and street structure:

Transit Demand Modelling

We also saw a glimpse of a microsimulation-based travel demand model (TDM) for central Berlin, with what-if scenarios possible by placing various objects on the screen visualising the output of the model, such as a rain shower or closed road. The transport mode share will likely continue to adjust in large cities throughout the world, while the street network will often remain static, so such models (and associated visualisations) try and predict what will happen on the ground:

The other maps shown were in the user interface (i.e. dashboard/HUD) of a car test-rig, which is being used for UX/UI testing of autonomous/mixed-mode driving. I wrote about this in this previous blogpost.

HERE and the Future

Perhaps the most “exclusive” part of the day’s event was an hour long “fireside” chat with the new CEO of the company. As a relatively small group (there were around 10 of us), this was an excellent opportunity to grill the top guy of one of the world’s three from-technology digital mapping providers (as opposed to from-GIS like ESRI or from-paper like the OS). Edzard Overbeek answered every question we threw at him efficiently. I quizzed him on whether indoor digital mapping, the “next frontier” identified by Google at least, will also be a priority for HERE given its new driving focus, to which Mr Overbeek was clear that, in order to be a serious player in the space, it needs to be mapping everything, so that a single platform is available cross-use, i.e. if a customer journey ends with a walk through a department store, the platform needs to do the “last 100m” mapping too. It’s clear also that the HERE offline maps app will remain a key part of the company’s offering – not just to realise the value of their existing, long-built-up “consumer-grade” mapping, but to build the “HERE” brand to consumers. Ultimately though, their most important clients are the car companies – both the three that own the company and others needing a “car mapping operating system”.

named

named is a little website that I have recently co-written as part of an ongoing ESRC-funded project on UK surnames that we are conducting here at UCL Department of Geography. I put together the website and adapted for the UK some code on generating heatmaps showing regions of unusual popularity of a surname, that was created by researchers in the School of Computing, Informatics & Decision Systems Engineering at ASU (Arizona State University) in the USA.

The website is deliberately designed to be simple to use and “stripped down” – all you do is enter your surname and the website maps where in the UK there is an unusually high number of people with that surname living. There is also an option to enter an additional surname (for example, a maiden name for yourself or your partner, or the name of a friend) – and, by combining heatmaps of both names, we try and draw out where we think you might have met each other, or grown up together.

The Research

Of most interest to us is the quality of the technique with pairs of surnames. It is well known already (for example, J A Cheshire, P A Longley (2012) Identifying Spatial Concentrations of Surnames, International Journal of GIS 26(2) pp309-325) that most traditional UK surname distributions remain surprisingly unchanged over many years – internal migration in the UK is a lot less than might be traditionally perceived. One of the research questions in the underlying project is to see whether this extends to marriages and other pairings too. So we encourage you to use this mode and help us understand and evaluate pairing surname distributions and patterns.

The site is also a useful information gathering tool – we are only in the early stages of evaluating the validity or accuracy of this method – we know it works well for certain regional UK names which are not too popular or too rare, at least. We ask for optional quick feedback following a search, so we can evaluate if the result feels right for you. So far, with the website having been operational for around a week, nearly 10% of people are giving feedback, and around half of those suggest that it is a good result for them. If it doesn’t highlight where you live now, it might be showing your ancestral home or other region that you have a historical link to. Or it may be showing complete rubbish – but let us know either way!

Try it out for yourself – visit here and see what it says for your surname. The site should be quite quick – it will take up to 10 seconds for names which have not already been searched, but is much faster if getting information that’s previously been searched for.

How it Works

The system is creating a probabilistic kernel density estimate (KDE), based on surname distributions (in a postcode) for an old electoral roll. It finds the relative frequency/density of the surname compared with the general population in the area. So, in most cases, it will highlight an area in the countryside – a sparse population, but maybe with a cluster of people with that surname. As such, it will only rarely highlight London and the other major cities of the UK, except for exceptionally urban-centric surnames, typically of foreign origin. The method is not perfect – the “bandwidth” is fixed, which means that neighbouring cities and other population fluctuations can cause false-positive results. However, we have seen enough “good” results that we think the simple technique has some validity, with the structure of the UK’s names.
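The idea of a relative density with a fixed bandwidth can be illustrated with a much simplified sketch: smooth the surname counts and the overall population with the same Gaussian kernel, then take the ratio. This is not the project's code or the ASU method; the function names, the tuple layout and the numbers are invented for the example.

```python
import math


def kernel(distance, bandwidth):
    """Gaussian kernel weight for a point `distance` away."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def relative_density(location, postcodes, bandwidth):
    """Kernel-smoothed surname count divided by smoothed population.

    postcodes is a list of (x, y, surname_count, population) tuples and
    the same fixed bandwidth is used everywhere.
    """
    name_sum = 0.0
    pop_sum = 0.0
    for x, y, surname_count, population in postcodes:
        w = kernel(math.hypot(location[0] - x, location[1] - y), bandwidth)
        name_sum += w * surname_count
        pop_sum += w * population
    return name_sum / pop_sum if pop_sum > 0 else 0.0


# Hypothetical data: a village and a city, each with 5 people of the name
postcodes = [
    (0.0, 0.0, 5, 100),        # rural cluster
    (100.0, 0.0, 5, 10000),    # large city
]
rural = relative_density((0.0, 0.0), postcodes, bandwidth=10.0)
urban = relative_density((100.0, 0.0), postcodes, bandwidth=10.0)
```

Both places have the same number of people with the surname, but the rural value comes out around a hundred times higher, which is why sparse countryside tends to be highlighted rather than the big cities.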

named1

Design

From a design perspective, I wanted to build a website that looks different from the normal “full screen slippy maps” that I have designed for a lot of my research projects. Maps are normally rectangular, so I played with some CSS and a nice jQuery visual effects library to create a circular map instead, which appears to be on the back of an information disc.

Data Quality and Privacy

The map is deliberately small and low on detail because having a more detailed map would imply a higher level of precision for the underlying names data than can actually be justified. The underlying dataset has issues but is considered to be sufficient for this purpose, as long as the spatial resolution is low. Additionally, for rare names where a result may appear for only a small number of people with that name (when in rural places) we don’t want to be flagging individual villages or houses. The data is just not good enough for that for many names (though it may well be for some), and a detailed map could imply that we are mapping exact data over someone’s house, possibly raising privacy issues. We are not, but at high resolution a result could still happen to line up with a very local feature by coincidence.

It should give an indication of the general area where your name is unusually popular relative to the local population there (N.B. not quite the same as where your name is popular in absolute terms) but I would be wary of the quality of the result if you were identifying a particular small town or exact location.

[A little update as one user worried that it was just showing a population heatmap. This would only happen for names which have a higher relative population in denser areas of the UK. Typically, older common foreign-origin names will most likely show this, as foreigners traditionally migrate to cities in the UK first. The only name so far that I’ve seen it for (I haven’t tested it for many) is Zhang, which is a very common surname. Compare Zhang (left) with an overall population heatmap (using the same buffer and KDE generation as the rest of the maps):

named_zhang_allpop

Some newer foreign origin names show an even more pronounced urban tendency, such as Begum and Mohammed.]

More…

Try named now, or if you are interested in surnames across the world, see the older WorldNames website, and for comparisons between 1881 and 1998 distributions in the UK, see GB Names.

If named shows “No Data” and you have entered a real surname, this may be because there are only very few of you in the UK – and in this case, I show the “No Data” graphic to protect your privacy. Otherwise I’d be mapping your house – or at least, your local neighbourhood.

Changes in Deprivation in England, 2010-15

Click any of the images in this article to go to the interactive map.

imd2015_londonup
Above: A significant reduction in relative deprivation in Blackheath and Maze Hill since 2010.

I’ve just now published a number of maps on the CDRC Maps platform which uses the DataShine mapping style (more about DataShine) to show demographic data relating to consumer and other datasets.

The maps relate to the Indices of Deprivation 2015, small-area measures of deprivation in England, which were compiled and published at the end of September by OCSI on behalf of the UK Government.

imd2015
Above: Deprivation varies between Tottenham, Walthamstow and Woodford Green, in 2015.

The Indices of Deprivation (of which the Index of Multiple Deprivation, or IMD, is the overall index) split England into around 32000 areas (“LSOAs”), each containing a typical population of 1500. Each area is scored for several components, which are then combined (with different weights) to produce an overall score of deprivation for the area. Note that areas with little deprivation may be mainly composed of people who are not “wealthy” but just not deprived, and therefore rank the same as areas mainly populated by extremely affluent people. IMD is a measure of deprivation, not affluence.

The look of these maps, with their Red-Yellow-Green colour ramp, is intentionally similar to my New Booth map of the 2010 IMD deciles which was my first “colour the houses” map and the precursor to DataShine and therefore CDRC Maps.

imd2015_miltonkeynes
Above: Milton Keynes has a characteristic strip of high deprivation, running north/south.

These scores cannot be directly compared with those from previous exercises (2010, 2007 and 2004 are the recent ones) due to slight methodological alterations. However, we can rank each area based on the overall score – this is the Index of Multiple Deprivation – and then compare ranking changes between the years. It should be noted that a decrease in rank (i.e. an increase in deprivation measure compared with other areas) does not mean that an area has become more deprived in absolute terms – it may be just becoming less deprived at a slower rate. I have mapped the overall rank change from 2010 to 2015, and also the rank change of the component which measures the effects of crime on deprivation, as this shows some particularly interesting spatial characteristics.
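The rank comparison can be sketched in a few lines. This is a minimal illustration with made-up area codes and scores, assuming the usual IMD convention that rank 1 is the most deprived area; ties are ignored and the function names are hypothetical.

```python
def rank_by_score(scores):
    """Return {area: rank}, where rank 1 is the highest (most deprived) score."""
    ordered = sorted(scores, key=lambda area: scores[area], reverse=True)
    return {area: position for position, area in enumerate(ordered, start=1)}


def rank_change(scores_old, scores_new):
    """New rank minus old rank for every area present in both years.

    A positive value means the area has moved away from rank 1, i.e. it is
    now less deprived relative to the other areas.
    """
    old = rank_by_score(scores_old)
    new = rank_by_score(scores_new)
    return {area: new[area] - old[area] for area in old if area in new}


# Hypothetical scores for three areas in two releases.
scores_2010 = {"A": 40.0, "B": 30.0, "C": 20.0}
scores_2015 = {"A": 25.0, "B": 28.0, "C": 15.0}

changes = rank_change(scores_2010, scores_2015)
```

Here area B drops one place in rank only because area A has overtaken it. The change describes B's position relative to the other areas, not an absolute change in its own deprivation.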

Looking at the overall changes, London’s pattern is striking:
imddelta_london
Above: London has an inner-city “ring” of blue showing a large reduction in relative deprivation since 2010.

London’s inner city areas – Zones 2-4 – have become significantly less deprived in the last year. Indeed London, in general, has done very well recently relative to the rest of England, with only a few areas (St John’s Wood, Thornton Heath, Mill Hill, East Barnet and Hounslow) showing a significant increase in relative deprivation levels. Again, this may mean that they are still becoming less deprived, just at a slower rate. By comparison, Blackheath, Ealing, Upton, North Wembley and Crouch End have become dramatically less deprived since 2010. There are smaller pockets throughout the city which are also showing marked moves in both directions – see the interactive map. I use a different (Red-White-Blue) colour ramp for these maps, to emphasise that they are showing changes.

imddelta_readingbury
Above: The contribution of crime to deprivation has significantly dropped in Reading and increased in Bury.

Some of the more notable results for changes in the crime component ranking of the IMD are in Reading (where the impact of crime on deprivation has significantly reduced) and Bury (where it has had a significantly greater impact). In both towns (see above, presented at different scales), however, other components have acted in the opposite direction, such that the deprivation ranking of these two places, with respect to the rest of England, has not significantly changed in five years. Bury was, and still is, significantly more deprived than Reading, and the difference between the two has increased.

Another example: comparing Gateshead with nearby South Shields. The former coming up, the latter going down:
imd_gateshead
Gateshead is almost universally moving out of deprivation at a faster rate than the rest of England, while South Shields is changing much more slowly.

The components are income, employment, education, health, crime, barriers to housing and services, and living environment. Their weights are summarised in this nice infographic from gov.uk.

There is also an official summary which maps the data slightly differently. One of its analyses – Chart 6 – shows the local authorities (LA) where relative deprivation has significantly fallen, by measuring the proportion of areas within the LA that have moved out of the bottom 10% in the IMD, between 2010 and 2015. The top four are: Hackney, Tower Hamlets, Greenwich and Newham. These are four of the five Olympic Boroughs. The fifth, Waltham Forest, is also in the top 10. East London is changing.
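A measure of this kind can be approximated as follows. This is only a sketch of the general idea, not the official calculation: it assumes rank 1 is the most deprived area, treats the "bottom 10%" as the most deprived tenth of all areas, and uses every area in the local authority as the denominator. The LA names and ranks are invented.

```python
from collections import defaultdict


def share_leaving_most_deprived_decile(areas, total_areas):
    """Per-LA share of areas that were in the most deprived 10% and no longer are.

    areas: iterable of (la_name, old_rank, new_rank), with rank 1 = most deprived.
    total_areas: number of ranked areas nationally (around 32000 LSOAs in England).
    """
    cutoff = total_areas / 10
    left = defaultdict(int)
    count = defaultdict(int)
    for la, old_rank, new_rank in areas:
        count[la] += 1
        # Inside the decile before, outside it now.
        if old_rank <= cutoff < new_rank:
            left[la] += 1
    return {la: left[la] / count[la] for la in count}


# Hypothetical ranks out of 100 areas, so the decile cutoff is rank 10.
areas = [
    ("LA one", 3, 25),   # left the decile
    ("LA one", 8, 9),    # still in the decile
    ("LA one", 40, 35),  # never in the decile
    ("LA one", 5, 14),   # left the decile
    ("LA two", 2, 4),    # still in the decile
    ("LA two", 60, 7),   # moved into the decile, not counted
]

shares = share_leaving_most_deprived_decile(areas, total_areas=100)
```

Sorting the resulting dictionary by value would give a league table like the one in the official chart, although the published figures may be defined differently.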

See these maps and various geodemographic classifications at CDRC Maps.

imd2015_midlands
Across middle England, cities are more deprived than the countryside, with notable exceptions (such as Shrewsbury, Cambridge, northern Leeds and western Sheffield).