Category Archives: Bike Share

How Mexico City Does Bikeshare

The above map shows the estimated routes and flows of over 16 million users of the bikeshare in Mexico City, “ECOBICI”, across the 22 months between February 2015 and November 2016, using data from their open data portal. The system has been around since 2011 but its most recent major expansion, to the south, was in early February 2015, which is why I have shown the flows from this date. The wider the lines, the more bikeshare bikes have been cycled along that street. The bikes themselves don’t have GPS, so the routes are estimated on an “adjusted shortest route” basis using OpenStreetMap data on street types and cycleways, where any nearby cycleway acts as a significant “pull” from the shortest A-to-B route. Having cycled myself on one of the bikes in November (and hence my journey is one of the 16.6 million here) I fully appreciate the benefits of the segregated cycle lanes along some of the major streets. As my routes are estimates, they don’t account for poor routes taken by people, or “tours” which end up at the same places as they started. So, the graphic is just a theoretical illustration, based on the known start/end data.
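To make the “adjusted shortest route” idea concrete, here is a minimal sketch rather than the routing code actually used for the map. It runs a plain Dijkstra search in which each street's length is multiplied by a factor for its type, so a cycleway can win even when it is the longer way round. The factor values, the toy network and the function name are all invented for illustration.

```python
import heapq

# Hypothetical cost multipliers per street type (assumed values, not from the map):
# below 1 "pulls" routes onto cycleways, above 1 pushes them off busy roads.
COST_FACTOR = {"cycleway": 0.5, "residential": 1.0, "primary": 1.5}

def adjusted_shortest_route(graph, start, end):
    """Dijkstra over (length * type factor); returns the node list or None."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == end:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, length_m, street_type in graph.get(node, []):
            new_cost = cost + length_m * COST_FACTOR.get(street_type, 1.0)
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if end not in best:
        return None
    route = [end]
    while route[-1] != start:
        route.append(previous[route[-1]])
    return route[::-1]

# Toy network: A to B directly along a primary road (400 m),
# or via C along cycleways (300 m + 300 m).
toy_graph = {
    "A": [("B", 400, "primary"), ("C", 300, "cycleway")],
    "C": [("B", 300, "cycleway")],
    "B": [],
}
route = adjusted_shortest_route(toy_graph, "A", "B")
print(route)
```

In this toy case the 600 m cycleway detour costs less than the 400 m direct road once the factors are applied, which is the kind of “pull” described above.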

The bikeshare journeys are in a dark green shade, ECOBICI’s brand colour, with docking stations shown as magenta dots. Magenta is very much the colour of CDMX, the city government, and it consequently is everywhere on street signs and government employee uniforms. Mexico City doesn’t have rivers, which are the “natural” geographical landmark for cities like London and New York where I’ve created similar maps, so I’ve used the motorways (shaded grey) and parks (light green), to provide some context. Mexico City extends well beyond the ECOBICI area.

The map shows huge flows down the “Paseo de la Reforma”. This route is always popular with cyclists, thanks to large, segregated cycle lanes in both directions, on the parallel side roads. On Sunday mornings, the main road itself is closed to motor traffic, along with some other link routes. This is not reflected in my routing algorithm but also acts to increase the popularity of the flow in this general area. To the north, a cluster of docking stations and a large flow indicates the location of Buena Vista station, the only remaining commuter rail terminal in Mexico City. Further south, the curved roads around Parque México and Parque España are also popular with bikeshare users, in this leafy area that very much feels like the “Islington” of Mexico City:

Mexico City’s ECOBICI is one of the 150+ systems I’m tracking live on Bike Share Map. You can see the live situation, or an animation for the last 48 hours.

Visit the new oobrien.com Shop
High quality lithographic prints of London data, designed by Oliver O'Brien

London’s Bikeshare Needs A Redistribution of Stations


Here’s an interesting graph, which combines data on total journeys per day on London’s bicycle sharing system (currently called “Santander Cycles”) from the London Data Store, with counts of available bicycles per day to hire, from my own research database. The system launched in summer 2010 and I started tracking the numbers almost from the start.

You can see the two big expansions of the system as jumps in the numbers of available bikes – to all of Tower Hamlets in early 2012, and to Putney and Fulham in late 2013. Since then, the system has somewhat stagnated in terms of its area of availability, although encouragingly the number of available bikes has remained constant at around 9500, suggesting that at least the operator is on top of being able to maintain and repair the bikes (or regularly source new ones). Some of the individual bikes have had 4000 trips on them. There is a small expansion due in the Olympic Park in spring 2016, but the 8 new docking stations represent only a 1% increase in the number of docking stations across the system, so I doubt it will have a significant impact on the numbers of available bikes for use.

There is a general downward trend in the numbers of uses of each bike per day, since the halcyon Olympic days of Summer 2012, over and above the normal seasonal variation, which concerns me. The one-year moving average recently dipped below 3 uses of each bike per day, this summer, and I am not confident it will pick up any time soon. (The occasional spikes in uses/bike, by the way, generally correspond to sunny summer bank holidays, tube strikes and Christmas Day).
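For anyone wanting to reproduce this kind of figure, here is a minimal sketch of the calculation, assuming two aligned daily series: journeys per day and available bikes per day. The function names and the tiny sample numbers are invented; a real run would use a 365-day window on the actual data.

```python
def uses_per_bike(journeys, bikes):
    """Daily uses per bike: journeys that day divided by bikes available that day."""
    return [j / b for j, b in zip(journeys, bikes)]

def trailing_average(values, window):
    """Mean of the most recent `window` values; None until the window is full."""
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        averages.append(running / window if i >= window - 1 else None)
    return averages

# Invented sample data: four days, 9000 bikes available on each.
daily_journeys = [9000, 18000, 27000, 36000]
daily_bikes = [9000, 9000, 9000, 9000]

daily_uses = uses_per_bike(daily_journeys, daily_bikes)
smoothed = trailing_average(daily_uses, window=2)  # use window=365 for a one-year average
print(smoothed)
```

A trailing window smooths out the seasonal swings, which is why a dip in the one-year average is more worrying than a quiet winter month.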

To rejuvenate the system and draw in more users, rather than relying on the established commuter and tourist flows which likely dominate the current usage, I am convinced that the system needs to expand – not necessarily in terms of the number of bikes or docking stations, but in its footprint. I think the system would be much improved by dropping the constraining rule on density (which approximates to always having one docking station every 300m) and instead redistributing some of the poorly performing docking stations themselves further out. It’s crazy that, five years on, there are no docking stations in central Hackney, Highbury, or Brixton, three areas with an established cycling culture and easily cycle-able into the centre of London. Conversely, Putney and Tower Hamlets simply don’t need the high density of docking stations that they currently have, except in specific areas (such as around the train/tube stations in Putney, and Canary Wharf).

Ideally we would have a good density of docking stations throughout cycleable London but, as docking stations (and bikes) are very expensive, I would suggest that TfL instead adopts the model used in Bordeaux (below). Here, the city retains a high-density core serving tourists, commuters and other centrally-based workers, but adopts a much lower density in the suburbs, so that, while tourists can still “run into” docking stations they don’t know about in the centre thanks to the high density, local users can benefit from the facility in their neighbourhood too, even if it requires a little longer walk to get to it.


Technical note: Before November 2011, the London numbers included bicycles that were in a docking station but not available to hire (i.e. marked as broken). This exaggerates the number of available bikes (and correspondingly reduces the number of hires/bike/day from the true value) in this period by a small amount – typically around 3-5%, an effect I am not considering significant for this analysis.


Seeing Red: 15 Ways the Boris Bikes of London Could be Better


A big announcement for the “Boris Bikes” today, aka Barclays Cycle Hire. London’s bikeshare system, the second largest in the western world after Paris’s Velib and nearly five years old, will be rebranded as Santander Cycles, and the bikes will have a new, bright red branding – Santander’s corporate colour, and conveniently also London’s most famous colour. As well as the Santander logo, it looks like the “Santa Bikes” will have outlines of London’s icons – the above publicity photo showing the Tower of London and the Orbit, while another includes the Shard and Tower Bridge. A nice touch to remind people these are London’s bikes.

It’s great that London’s system can attract “big” sponsors – £7m a year with the new deal – but another document that I spotted today reveals (on the last page) that, despite the sponsorship, London’s system runs at a large operating loss – this is all the more puzzling because other big bikeshare systems can (almost) cover their operating costs – including Washington DC’s which is both similar to London’s in some ways (a good core density, same bike/dock equipment) and different (coverage into the suburbs, rider incentives); and Paris’s (right), which has a very different funding model, and its own set of advantages (coverage throughout the city) and disadvantages (little incentive to expand/intensify). What are they doing right that London is not?

In financial year 2013/4, London’s bikeshare had operating costs of £24.3m. Over this time period, the maximum number of bikes that were available to hire, according to TfL’s Open Data Portal was 9471, on 26 March 2014. This represents a cost of just over £2500 per bike, for that year alone. If you look at it another way, each bike is typically used three times a day or ~1000 times a year, so that’s about £2.50 a journey, of which, very roughly, the sponsor pays about £0.50, the taxpayer £1 and the user about £1. In those terms it does sound better value but it’s still a surprisingly expensive system.
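The back-of-envelope sums above can be checked in a few lines. This is only a sketch of the arithmetic, using the figures quoted in the paragraph; the variable names are mine. Note that it uses 3 × 365 journeys per bike per year rather than the rounded ~1000, so the per-journey figure comes out a little below the rounded £2.50.

```python
operating_cost_gbp = 24_300_000  # 2013/4 operating costs, as quoted above
peak_bikes = 9471                # maximum bikes available to hire, as quoted above
uses_per_bike_per_day = 3        # "typically used three times a day"

cost_per_bike = operating_cost_gbp / peak_bikes
journeys_per_bike_per_year = uses_per_bike_per_day * 365
cost_per_journey = cost_per_bike / journeys_per_bike_per_year

print(f"Cost per bike per year: £{cost_per_bike:,.0f}")
print(f"Journeys per bike per year: {journeys_per_bike_per_year}")
print(f"Cost per journey: £{cost_per_journey:.2f}")
```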

As operating costs, these don’t include the costs of buying the bikes or building the docking stations. Much of the cost therefore is likely occurring in two places:

  1. Repairing the bikes – London’s system is wildly* successful, so each bike sees a lot of use every day, and the wear and tear is likely to be considerable. This is not helped by the manufacturers of the bikes going bust a couple of years ago – so there are no “new” ones out there to replace the older ones – New York City, which uses the same bikes, is suffering similar problems. (* Update: To clarify, based on a comment from BorisWatch, this assertion is a qualitative one, based on seeing huge numbers of the bikes in use, in certain places at certain times of the day. Doubtless, some do remain dormant for days.)
  2. Rebalancing/redistribution activity, operating a fleet of vehicles that move bikes around.

I have no great issues with the costs of the bikes – they are a public service and the costs are likely a fraction of the costs of maintaining the other public assets of roads, buses, railway lines – but it is frustrating to see, in the document I referred to earlier, that the main beneficiaries are in fact tourists (the Hyde Park docking stations consistently being the most popular), commuters (the docking stations around Waterloo are always popular on weekdays), and those Londoners lucky enough to live in Zone 1 and certain targeted parts of Zone 2 (south-west and east). Wouldn’t it be great if all Londoners benefited from the system?

Here are 15 ways that London’s bikeshare could be made better for Londoners (and indeed for all) – and maybe cheaper to operate too:

  1. Scrap almost all rebalancing activity. It’s very expensive (trucks, drivers, petrol), and I’m not convinced it is actually helping the system – in fact it might be making it worse. Most cycling flows in London are uni-directional – in to the centre in the morning, back out in the evening – or random (tourist activity). Both of these kinds of flows will, across a day, balance out on their own. Rebalancing disrupts these flows, removing the bikes from where they are needed later in the day (or the following morning) to address a short-term perceived imbalance that might not be real on-the-ground. An empty docking station is not a problem if no one wants to start a journey there. Plus, when the bikes are sitting in vans, inevitably clogged in traffic, they are of no use to anyone. Revealingly, the distribution drivers went on strike in London a few months ago and basically everything carried on as normal. Some “lightweight” rebalancing, using cycle couriers and trailers, could help with some specific small-scale “pinch points”, or responding to special events such as heavy rainfall or a sporting/music event. New York uses cyclists/trailers to help with the rebalancing.
  2. Have a “guaranteed valet” service instead, like in New York. This operates for a certain number of key docking stations at certain times of the day, and guarantees that someone can start or finish their journey there. London already has this, to a certain extent, at some stations near Waterloo, but it would be good to highlight this more and have it at other key destinations. This “static” supply/demand management would be a much better use of the time of redistribution drivers.
  3. Have “rider rewards”, like in Washington DC. Incentivise users to redistribute the bikes themselves, by allowing a free subsequent day’s credit (or free 60-minute journey extension) for journeys that start at a full docking station and end at an empty one. This would need to be designed with care to ensure “over-rebalancing”, or malicious marking of bikes as broken, was minimised. Everyone values the system in different ways, so some people benefit from a more naturally balanced system and others benefit from lower costs using it.
  4. Have more flexible user rules. Paris’s Velib has an enhanced membership “Passion” that allows free single journeys of up to 45 minutes rather than every 30 minutes. London, like Paris, is a large city, and the current 30 minute cutoff seems short and arbitrary, when considering most bikes are used around three times a day. Increasing the window would therefore have little impact on the overall distribution of the system and might in fact benefit it – because the journeys from the terminal stations to the City or the West End, which are the most distinctive flows seen, are achieved comfortably in under half an hour. In London, you have to wait 5 minutes between hires, but most systems (Paris, Boston, New York) don’t have this “timeout” period. To stop people “guarding” recently returned bikes for additional use, an alternative could be to make it a 10 minute timeout but tie it to the specific docking station (or indeed a specific bike) rather than system-wide. Then, if people are prepared to switch bikes or docking stations, they can continue on longer journeys for free.
  5. Adjust performance metrics. TfL (and the sponsors) measure performance of the system in certain ways, such as the time a docking station remains empty at certain times of the day. I’m not sure that these are helpful – surely the principal metric of value (along with customer service resolution) is the number of journeys per time period and/or number of distinct users per time period. If these numbers go down, over a long period, something’s wrong. The performance metrics, as they stand, are perhaps encouraging the unnecessary and possibly harmful rebalancing activity, increasing costs with no actual benefit to the system.
  6. Remove the density rule (one docking station every ~300 metres) except in Zone 1. Having high density in the centre and low density in the suburbs works well for many systems – e.g. Bordeaux, Lyon (above) and Washington DC, because it allows the system to be accessible to a much larger population, without flooding huge areas with expensive stations/bikes. An extreme example, this docking station is several miles from its nearest neighbour, in a US city.
  7. Build a docking station outside EVERY tube station, train station and bus station inside the North/South Circular (roughly, Zones 1-3). Yes, no matter how hilly* the area is, or how little existing cycling culture it has – stop assuming how people use bikes or who uses them! Bikeshare is a “last mile” transport option and it should be thought of as part of someone’s journey across London, and as a life benefit, not as a tourist attraction. The system should also look to expand into these areas iteratively rather than having a “big bang” expansion by phases. It’s crazy that most of Hackney and Islington doesn’t have the bikeshare, despite having a very high cycling population. Wouldn’t it be great if people without their own bikes could be part of the “cycling cafe culture” strong in these places? For other places that have never had a cycling culture, the addition of a docking station in a prominent space might encourage some there to try cycling for the first time. (*This version of the bikes could be useful.)
  8. Annual membership (currently £90) should be split into peak and off-peak (no journey starts from 6am-10am) memberships, the former increased to £120 and the latter decreased back to £45. Unlike the buses and trains, which are always full peak and pretty busy off-peak too, there is a big peak/offpeak split in demand for the bikes. Commuters get a really good deal, as it stands. Sure, it costs more than buying a very cheap bike, but actually you aren’t buying the use of a bike – you are buying the free servicing of the bike for a year, and free distribution of “your” bike to another part of central London, if you are going out in the evening. Commuters that use the bikes day-in-day-out should pay more. Utility users, who use the bike to get to the shops, are the sorts that should be targeted more, with off-peak membership.
  9. A better online map *cough* of availability. The official map still doesn’t have at-a-glance availability. “Rainbow-board” type indications of availability in certain key areas of London would also be very useful. Weekday use, in particular, follows distinct and regular patterns in places.
  10. Better indication of where the nearest bikes/docks are, if you are at a full/empty docking station, i.e. a map with route indication to several docking stations nearby with availability.
  11. Better static signage of your nearest docking station. I see very few street signs pointing to the local docking station, even though they are hard-built into the ground and so generally are pretty permanent features.
  12. Move more services online, have a smaller help centre. A better view of journeys done (a personal map of journeys would be nice) and the ability to question overpayments/charges online.
  13. Encourage innovative use of the bikeshare data, via online competitions – e.g. Boston’s Hubway data visualisation competitions have had lots of great entries. These get further groups interested in the system and ways to improve it, and can produce great visuals to allow the operator/owner to demonstrate the reach and power of the system.
  14. Allow use of the system with contactless payment cards, and so integration with travelcards, daily TfL transport price caps etc. The system can’t use Oyster cards because of the need to have an ability to take a “block payment” charge for non-return of the bikes. But with contactless payment, this could be achieved. The cost of upgrading the docking points to take cards would be high, but such docking points are available and in use in many of the newer US systems that use the same technology.
  15. Requirement that all new housing developments above a certain size, in say Zone 1-3 London, include a docking station with at least one docking point per 20 residents and one new bike per 40 residents, either on their site or within 300m of their development boundary. (Update: Euan Mills mentions this already is the case, within the current area. To clarify, I would like to see this beyond the current area, allowing an organic growth outwards and linking with the sparser tube station sites of point 7.)

London has got much right – it “went big” which is expensive but the only way to have a genuinely successful system that sees tens of thousands of journeys on most days. It also used a high-quality, rugged system that can (now) cope with the usage – again, an expensive option but absolutely necessary for it to work in the long term. It has also made much data available on the system, allowing for interesting research and increasing transparency. But it could be so much better still.

Washington DC’s system – same technology as London’s, not that much smaller, but profitable.

From Putney to Poplar: 12 Million Journeys on the London Bikeshare


The above graphic (click for full version) shows 12.4 million bicycle journeys taken on the Barclays Cycle Hire system in London over seven months, from 13 December 2013, when the south-west expansion to Putney and Hammersmith went live, until 19 July 2014 – the latest journey data available from Transport for London’s Open Data portal. It’s an update of a graphic I’ve made for journeys on previous phases of the system in London (& for NYC, Washington DC and Boston) – but this is the first time that data has been made available covering the current full extent of the system – from the most westerly docking station (Ravenscourt Park) to the most easterly (East India), the shortest route is over 18km.

As before, I’ve used Routino to calculate the “ideal” routes – avoiding the busiest highways and taking cycle paths where they are nearby and add little distance to the journey. Thickness of each segment corresponds to the estimated number of bikeshare bikes passing along that segment. The busiest segment of all this time is on Tavistock Place, a very popular cycle track just south of the Euston Road in Bloomsbury. My calculations estimate that 275,842 of the 12,432,810 journeys, for which there is “good” data, travelled eastwards along this segment.

The road and path network data is from OpenStreetMap and it is a snapshot from this week. This means that Putney Bridge, which is currently closed, shows no cycles crossing it, whereas in fact it was open during the data collection period. There are a few other quirks – the closure of Upper Ground causing a big kink to appear just south of Blackfriars Bridge. The avoidance of busier routes probably doesn’t actually reflect reality – the map shows very little “Boris Bike” traffic along Euston Road or the Highway, whereas I bet there are a few brave souls who do take those routes.

My live map of the docking stations, which like the London Bikeshare itself has been going for over four years, is here.

[Update – A version of the map appears in a Telegraph article. N.B. The article got a little garbled between writing it and its publication, particularly about the distinction between stats for the bikeshare and for commuter cyclists in London.]

More Cities, More Bikes, More Data


I presented some research I’ve carried out at CASA, at the Cycle City conference in Leeds last week. The research shows how the numbers of bikeshare bikes and docking stations have varied between 2010 and 2014, for 46 systems across the world (not all systems have numbers for the whole period of study). The numbers are from the database which backs my live global map.

View the slides from my presentation here.

The work has been written up into a CASA Working Paper (#196). The appendix includes the numbers of bikes and docking stations, for the 46 systems, across eight periods of collection in six-monthly intervals from October 2010. You can view the paper as a PDF by following the link above.

5.5 Million Journeys at NYC Bike Share


[Updated – timeperiod-split maps added] Following on from my London bikeshare journeys graphic, here is the same technique applied with the data released by NYC Bike Share (aka Citi Bike) earlier this week.

If you look carefully at the full size map you can see a thin line heading north-eastwards, initially well out of the bikeshare “zone”, representing journeys between Williamsburg and Central Park, via the Queensboro Bridge cycle path. We see a similar phenomenon for journeys between Tower Bridge and Island Gardens in London. Whether any of the riders actually take this route, of course, is open to question – they might take a longer – but more familiar – route, that stays more within the area of the bikeshare.

Below is a version of the graphic with the data split into four timeperiods – weekday rush-hour peaks (7-10am and 4-7pm starts), weekday interpeak (10am-4pm), weekday nights (7pm-7am) and finally weekends. The data is scaled so that the same thicknesses of lines across the four maps represent the same number of journeys along each street segment – but bear in mind that there are fewer weekends than weekdays. While, as would be expected, the rush-hour peaks see the greatest number of journeys, there is less spatial variation across the city, between the four timeperiods, than I expected. Click on the graphic for a larger version.


The graphics were produced by creating idealised routes (near-shortest path, but weighted towards dedicated cycle routes and quieter roads) between every pair of the ~330 docking stations in the system, using Routino and OpenStreetMap data (extracted using the Overpass API). Edge weights were then built up using a Python script, a WKT file was created and then mapped in QGIS, with data-based stroke widths applied from the weights.
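The edge-weight step can be sketched roughly as follows. This is an illustration under assumptions, not the script used for the graphic: it assumes each station pair already has a route expressed as a list of coordinates (in practice that would come from the routing output), and the function names, the station IDs and the semicolon-delimited WKT layout are all hypothetical. Segments are kept directional, so the two directions of a street are counted separately.

```python
from collections import Counter

def accumulate_edge_weights(routes, journey_counts):
    """Add each station pair's journey count to every segment on its route."""
    weights = Counter()
    for pair, coords in routes.items():
        count = journey_counts.get(pair, 0)
        for start_point, end_point in zip(coords, coords[1:]):
            weights[(start_point, end_point)] += count
    return weights

def to_wkt_rows(weights):
    """One 'LINESTRING(...);weight' row per segment, for a delimited-text layer."""
    rows = []
    for ((x1, y1), (x2, y2)), weight in weights.items():
        rows.append(f"LINESTRING({x1} {y1}, {x2} {y2});{weight}")
    return rows

# Invented example: two station pairs whose routes share the final segment.
routes = {
    ("S1", "S2"): [(0, 0), (1, 0), (2, 0)],
    ("S3", "S2"): [(1, 1), (1, 0), (2, 0)],
}
journey_counts = {("S1", "S2"): 10, ("S3", "S2"): 5}

weights = accumulate_edge_weights(routes, journey_counts)
rows = to_wkt_rows(weights)
print("\n".join(rows))
```

The weight column is then what a data-defined stroke width would be driven from when the file is styled in a GIS.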

The routes are only as good as the OpenStreetMap data – I think the underlying data is pretty good for NYC, thanks to great community work on the ground, but there is still a possibility that it has missed obvious routes, or proposed wacky ones. It also doesn’t account for journeys starting or ending at the same place, or journeys where the prime purpose is an exploration by bike – with the user unlikely therefore to take an “obvious” A-B route.

Even with that caveat, it’s still a revealing glimpse into the major route “vectors” of bikeshare in New York City.

London Cycle Hire on the Cover of BMJ

I produced this data map which forms the front cover of this week’s British Medical Journal (BMJ). The graphic shows the volumes of Barclays Cycle Hire bikeshare users in London, based on journeys from February 2012 to January 2013 inclusive. The routes are the most likely routes between each pair of stations, as calculated using Routino and OpenStreetMap data. The area concerned includes the February 2012 eastern extension to Tower Hamlets (including Canary Wharf) but not the December 2013 extension to Putney. The river was added in from Ordnance Survey’s Vector Map District, part of the Open Data release. QGIS was used to put together the calculated results and apply data-specified styling to the map.

The thickness of each segment corresponds to the volume of cyclists taking that link on their journey – assuming they take the idealised calculated route, which is of course a not very accurate assumption. Nevertheless, certain routes stand out as expected – the Cycle Superhighway along Cable Street between the City and Canary Wharf is one, Waterloo Bridge is another, and the segregated cycle route south of Euston Road is also a popular route.

The graphic references an article in the journal issue which compares the health benefits and disbenefits of using the system with those of other forms of transport in central London. Pollution data is combined with accident records and models. The paper was written by experts at the UKCRC and the London School of Hygiene and Tropical Medicine (LSHTM) and I had only a very small part in the paper itself – a map produced by Dr Cheshire and myself was used to illustrate the varying levels of PM2.5 (small particulate matter) pollution in different parts of central London and how these combine with the volume of bikeshare users on the roads and cycle tracks. The journal editors asked for a selection of images relating to cycle hire in London in general and picked this one, as the wiggly nature and predominant red colour looks slightly like a blood capillary network.

A larger version of the graphic, covering the whole extent of the bikeshare system at the time, is here or by clicking on this thumbnail of it:


Very rare journeys, such as those from London Bridge to Island Gardens, have faded out to such an extent that they are not visible on the map here. An example route, which the map doesn’t show due to this, goes through Deptford and then through the Greenwich Foot Tunnel.

For an interactive version of the graphic (using a slightly older dataset) I recommend looking at Dimi Sztanko’s excellent visualisation.

Citibike beating Barclays Cycle Hire

NYC Citibike’s meteoric rise continues – for October, the New York City bikeshare beat London’s Barclays Cycle Hire on average journeys per day, for both weekdays and weekends. Even more impressive considering that it’s only just over half the size.

Thanks to this release published today at the London Data Store, and this daily updating data from New York, I’ve been able to plot month-by-month figures, for the last three years for the Barclays Cycle Hire, and the last few months for Citibike, on the same graph. I’ve split out weekdays and weekends. Grey/black is New York City’s Citibike, while the colours (red, orange, green, blue) are the Barclays Cycle Hire for 2010, 2011, 2012 and 2013 respectively.


Click for the large version.

What’s even more impressive is that Citibike is currently physically smaller than London’s Barclays Cycle Hire. It currently has 330 docking stations and 4500 bikes, while London has 558 docking stations and 7600 bikes. These numbers don’t match exactly with official numbers, as I combine a small number of adjacent docking stations, and don’t count bikes in repair or otherwise unavailable for use.

London’s more temperate climate (a “warm/cool” city) means it should have a lead on NYC (a “hot/cold” city) in the summer and winter, while NYC may well be strongest in the spring and autumn.

Apologies for the rather lame looking graph. Excel crashed as I was setting it up, I sneaked a screenshot as the crash reporter popped up, but had to add the NYC data in manually in GraphicConverter…

Tracking, Visualising and Cycling

Along with Martin Zaltz Austwick, who blogs as Sociable Physics, I led a workshop session as part of CASA’s annual conference. The topic was “Tracking, Visualising and Cycling” and focused on analysing and mapping bikeshare data. I concentrated on mapping the near-real-time docking station data, while Martin graphed journey data. Both of us used Google Drive as a quick and easy platform to map spatial data and graph it. The techniques that the participants were led through are relatively rudimentary, but hopefully achieved our main purposes of demonstrating the availability of such data and the utility of Google Drive for quick analysis, without leaving anyone on the course behind.

After short presentations by Martin and myself, presenting our recent related output, there were two practical sessions. In the first session, I led participants through downloading the live dock locations/status JSON data files from bikeshare systems in the US, before hacking the JSON into a CSV suitable for upload to Google Drive and showing on a map as a Google Fusion Table. A calculated column was then added to show the empty/full ratio and the docking stations on the maps were coloured appropriately. The result looked a bit like this (if the New York dataset was picked):


A couple of gotchas we ran into: (1) If using Notepad, don’t save the JSON text, as that will “burn in” linebreaks that break it. (2) If you don’t see Google Fusion Tables in your Google Drive apps menu, you need to add it as an app using the button at the bottom of the popup.
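In the workshop the JSON was reshaped by hand, but the same step can be scripted. The sketch below is an assumption-laden illustration: the JSON layout and field names (`stations`, `bikes`, `empty_docks` and so on) are invented, since each real feed uses its own, and the “fill ratio” here is defined as bikes divided by total docks, which is one reasonable reading of the empty/full calculated column.

```python
import csv
import io
import json

# Invented sample in a hypothetical feed layout; real feeds name their fields differently.
sample_json = """
{"stations": [
  {"name": "Dock A", "lat": 40.70, "lon": -74.01, "bikes": 3, "empty_docks": 9},
  {"name": "Dock B", "lat": 40.72, "lon": -73.99, "bikes": 0, "empty_docks": 0}
]}
"""

def stations_to_csv(json_text):
    """Flatten the station list to CSV text and add a calculated fill-ratio column."""
    stations = json.loads(json_text)["stations"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "lat", "lon", "bikes", "empty_docks", "fill_ratio"])
    for station in stations:
        total = station["bikes"] + station["empty_docks"]
        # Leave the ratio blank for docks reporting no capacity at all.
        ratio = station["bikes"] / total if total else ""
        writer.writerow([
            station["name"], station["lat"], station["lon"],
            station["bikes"], station["empty_docks"], ratio,
        ])
    return buffer.getvalue()

csv_text = stations_to_csv(sample_json)
print(csv_text)
```

Parsing with a JSON library rather than a text editor also sidesteps the linebreak problem mentioned in the first gotcha.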

Martin then followed by showing participants how to download journey data from the Washington DC “Capital Bikeshare” website, extracting just the data for Saturday 30 June 2012, extracting the number of minutes each journey took in Excel, binning the journeys by minute and then plotting it on a Google Spreadsheet chart. An additional section was breaking down the plots by user type – showing a pronounced difference between Subscriber and Casual hires – the latter generally taking much longer for their journeys.
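The binning step was done in Excel on the day, but it is only a few lines in a script. This sketch assumes each journey has already been reduced to a duration in seconds and a user type; the sample journeys are invented and the function name is hypothetical.

```python
from collections import Counter

def bin_by_minute(journeys):
    """Count journeys per whole minute of duration, separately for each user type."""
    bins = {}
    for duration_seconds, user_type in journeys:
        minute = duration_seconds // 60
        bins.setdefault(user_type, Counter())[minute] += 1
    return bins

# Invented sample journeys: (duration in seconds, user type).
journeys = [
    (420, "Subscriber"),
    (450, "Subscriber"),
    (610, "Subscriber"),
    (2700, "Casual"),
    (3100, "Casual"),
]

bins = bin_by_minute(journeys)
for user_type, counts in bins.items():
    print(user_type, sorted(counts.items()))
```

Plotting one series per user type from these counts gives the Subscriber/Casual comparison described above.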

You can view the slides here.

Analysing “CitiBike” in New York City

The above interactive map compares the popularity of different CitiBike docking stations in New York City, based on the number of journeys that start/end at each dock. The top 100 busiest ones are shown in red, with the top 20 emphasised with pins. Similarly, the 100/20 least popular ones are shown in blue*.

CitiBike is a major bikesharing system that launched in New York City earlier in the summer and has been pulling in an impressive number of rides in its first few weeks – it regularly beats London’s equivalent, whose technology it shares, in terms of daily trip counts, even though London’s system is almost twice as big (compare NYC).

Different areas have different peak times

Here are three maps showing the differences in the popularity of each docking station at different times of the day: left covers the “rush hour” periods (7-10am and 4-7pm), the middle is interpeak (10am-4pm), the domain of tourists, and on the right is evening/night (7pm-7am) – bar-goers going home? The sequence of maps shows how the activity of each docking station varies throughout the day, not how popular each docking station is in comparison to the others.

nyc_rushhour_small

Red pins = very popular, red = significantly more popular than average, green = significantly less popular than average. Binning values are different for each map. Google Maps is being used here. See the larger version.

Some clear patterns above – with the east Brooklyn docks being mainly used in the evenings and overnight, the rush hours highlighting major working areas of Manhattan – Wall Street and Midtown, and interpeak showing a popular “core” running down the middle of Manhattan.
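The three time bands used for these maps can be written as a small Python classifier. Treating each band as start-inclusive and end-exclusive (so 10:00 counts as interpeak, not rush hour) is an assumption, as is the function name.

```python
def time_band(hour):
    """Map an hour of the day (0-23) to one of the three map periods."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if 7 <= hour < 10 or 16 <= hour < 19:
        return "rush hour"       # 7-10am and 4-7pm
    if 10 <= hour < 16:
        return "interpeak"       # 10am-4pm
    return "evening/night"       # 7pm-7am


if __name__ == "__main__":
    for h in (6, 7, 12, 16, 19, 23):
        print(h, time_band(h))
```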

The maps are an output of the stats I put together after a couple of requests for CitiBike data came through recently, from the New York Times and Business Insider. It was a good opportunity to get around to something I had been meaning to do for a while: see if I could iterate through the docking station bike count data, spot fluctuations, and infer the number of journeys starting and ending at each docking station.

I was able to put together the Python script to do this fluctuation analysis relatively quickly, and so present the results here. I can potentially repeat this analysis for any of the 100+ cities I’m currently visualising and collecting data for. Some of these cities (not New York yet) provide journey-level data in batches, which is more accurate as it’s not subject to the issues described below, but tends to only appear a few months later, and only around five cities have released such data so far.

Places with persistently empty or full docks differ

Here are two maps highlighting docks that are persistently empty (left) or full (right).

nyc_emptyfulldocks

Left map: green = empty <10% of the time, yellow = 10-15%, red = 15-20%, red pins = empty 20%+ of the time. Right map: green = full <2% of the time, yellow = 2-3%, red = 3-4%, red pins = full 4%+ of the time. Google Maps is being used here. Live version of full map, live version of empty map.

The area near Central Park seems to often end up with empty docking stations, caused perhaps by tourists starting their journeys here, going around Central Park and then downtown. Conversely, Alphabet City, a residential (and not at all touristy) area, fairly often has full docking stations: plenty of bikes for the residents to use to get to work, although not ideal if you are the last one home on a bike.
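The colour bands in the caption above come down to two steps: work out the share of samples in which a dock was empty (or full), then compare it against a set of thresholds. A minimal Python sketch follows, with hypothetical function names and made-up samples; placing each boundary value in the higher band is an assumption.

```python
# Thresholds from the map caption, as fractions of the time.
EMPTY_THRESHOLDS = (0.10, 0.15, 0.20)
FULL_THRESHOLDS = (0.02, 0.03, 0.04)


def share_of_time(samples, predicate):
    """Fraction of (bikes, spaces) samples for which predicate is true."""
    samples = list(samples)
    if not samples:
        return 0.0
    hits = sum(1 for bikes, spaces in samples if predicate(bikes, spaces))
    return hits / len(samples)


def colour_band(share, thresholds):
    """Return 'green', 'yellow', 'red' or 'red pin' for a share of time."""
    low, mid, high = thresholds
    if share < low:
        return "green"
    if share < mid:
        return "yellow"
    if share < high:
        return "red"
    return "red pin"


if __name__ == "__main__":
    # Made-up samples: one empty reading, one full reading, 18 in between.
    samples = [(0, 10)] + [(5, 5)] * 18 + [(10, 0)]
    empty = share_of_time(samples, lambda bikes, spaces: bikes == 0)
    full = share_of_time(samples, lambda bikes, spaces: spaces == 0)
    print("empty", empty, colour_band(empty, EMPTY_THRESHOLDS))
    print("full", full, colour_band(full, FULL_THRESHOLDS))
```

The same 5% share lands in different bands on the two maps because full docks are much rarer than empty ones, hence the tighter thresholds.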

How the stats were assembled and mapped

As mentioned above, I assembled the stats by looking at the data collected every two minutes, iterating through it, and counting the changes detected as docking or undocking “events”, while also recording the number of spaces or bikes remaining, for the second set of maps.

There are a couple of big flaws to this technique. Firstly, if a bike is returned and hired within a single two minute interval (i.e. between measurements) then neither event will be detected, as the total number of bikes in that docking station will have remained constant. This problem mainly affects the busiest docks, and those that see the most variation in incoming/outgoing flows, i.e. near parks and other popular tourist sites. The other issue is that redistribution activities (typically trucks taking bikes from A to B, ideally from full docks to empty docks) are not distinguishable from real journeys. In large systems, like New York’s, this activity is however a very small proportion of the total activity, maybe less than 5%, and so generally discountable in a rough analysis like this. I detected 1.6 million “events” which equates to 0.8 million journeys, as each journey has a start and an end event. The official website is reporting 1.1 million journeys during the same period, suggesting that this technique is able to detect around 64% of journeys.
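The core of the fluctuation technique, and its first flaw, can be shown in a few lines of Python. This is a simplified sketch and not the actual script: the function names are hypothetical, and counting a change of n bikes between two readings as n events is an assumption.

```python
def count_events(bike_counts):
    """Infer events from successive bike counts at one docking station.

    Returns (undocked, docked): bikes that appear to have left and arrived.
    """
    undocked = docked = 0
    for previous, current in zip(bike_counts, bike_counts[1:]):
        change = current - previous
        if change < 0:
            undocked += -change   # fewer bikes than before: journeys started
        elif change > 0:
            docked += change      # more bikes than before: journeys ended
    return undocked, docked


def estimated_journeys(total_events):
    """Each journey has one start event and one end event."""
    return total_events / 2


if __name__ == "__main__":
    # Readings taken every two minutes at one (made-up) station.
    print(count_events([5, 4, 4, 6, 3]))   # changes of -1, 0, +2, -3

    # The flaw: a bike returned and another hired between two readings
    # leaves the count unchanged, so neither event is seen.
    print(count_events([5, 5]))

    print(estimated_journeys(1_600_000))
```

Redistribution by truck shows up in exactly the same way as real journeys, which is why it cannot be separated out from the counts alone.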

I’ve used Google Fusion Tables to show the results. Although its “Map” function is somewhat limited, it is dead easy to use – just upload a CSV of results, select the lat/lon columns, create a map, and then set the field to display and which value bins correspond to which pin types. Just a couple of minutes from CSV to interactive map. There are a few other similar efforts out there – which aim to take point-based data and stick it quickly on a map, but Google’s Fusion Tables does the job and is easy to remember.

The data is one month’s worth of journeys, 17 July to 16 August. One note about the popularity map data is given in the footnote below. I am really just scratching the surface of what can be done with the data. One obvious next step is to break out weekend and weekday activity. There are a few other analysis projects around – this website is analysing the data as it comes in, to an impressive level of detail.

* Any docks added in the last month will probably show as being unpopular at the moment, as it’s an absolute count over the last month, regardless of whether the dock was there or not.