Sponsored: UK Analytical Datasets based on OpenData - An Overview of GeoLytix GeoData

By Annette Dellevoet


We at Sainsbury’s have been a customer of GeoLytix’s consulting services for over a year, and I am passionate about data, especially open data. I was therefore excited when Blair Freebairn, the Managing Director of GeoLytix, told me this spring they were building a comprehensive set of Geodata for the UK using only OpenData as a source. After reviewing the early output and comparing it against the data we currently license, we quickly decided to sign up and eagerly awaited delivery of the data.
In total there are over 100 map layers arranged into seventeen ‘themes’ or bundles. These bundles cover everything from a very detailed road display layer to the UK’s administrative geographies, a whole slew of Points of Interest and some innovative derived datasets. A summary of the offer is on the GeoLytix website.
In Britain a very rich set of government data, including nearly all the Ordnance Survey mid-scale vector products, has been released under a wonderfully permissive Open license. The GeoLytix data license, although paid-for, is similarly permissive. This means I don’t have to worry about how many of my users have the data on their machines, or whether I am allowed to share it with colleagues elsewhere in the business or our other partners like Aimia who look after our Nectar card data. As long as the data is being used for Sainsbury’s own purposes I can share it internally and we can use it as we wish.
GeoLytix also decided to make some of their datasets truly OpenData, with charges only levied for updates and add-ons. To date these include postcode sector boundaries, Retail Places and a standard set of OA demographics derived from the latest 2011 Census.
One of the best and most useful datasets is their detailed set of postal sector boundaries for the whole of Great Britain. As a current client, I was given early access to samples of these data as the methodology for creating the boundaries developed, and I had the chance to give feedback. GeoLytix have built the geographies using an innovative ‘snapping’ algorithm that makes nearly all the boundaries follow real-world features like railways, rivers or roads. This results in very intuitive and accurate boundaries. I think it is great that such a key dataset, which was previously only available under restrictive and expensive terms, has been made OpenData. It means many of the smaller GIS teams, and even non-GIS people, can now get access to these boundaries.
The GeoLytix Postal Sector Boundaries showing correct postcode point enclosure with real-world features as boundaries.
Within the main data bundles there are the latest versions of all of the existing OS OpenData, but GeoLytix have consolidated all the data, styled it, made it all seamless and corrected lots of little errors. We could have done most of this processing ourselves but it would have taken ages. In addition GeoLytix have ‘mined’ the whole open data portfolio from data.gov.uk and other sources to add in a whole raft of extra key data. Most importantly they have created some new, innovative and very useful derived datasets. One dataset I particularly like, and use often, is the town and suburb boundaries. This collection of polygons represents real ‘places’ that our location planners (and indeed non-GIS specialists) are instantly comfortable with. This data is simply not available from any other supplier. We have found many uses for these, in particular in building network plans for our Convenience portfolio and in accurately describing the locations of new competitor developments to some of our non-GIS users.
Another bundle we get a lot of use out of is the public transport pack. This is based on definitive national transport data. For example it includes over 380,000 bus stops. What is particularly powerful is that GeoLytix have added in passenger volumes, either definitive real values or modelled where they aren’t collected. This is a dataset I have built myself in the past, but it was a real pain trying to source all the data from National Rail, TfL and the smaller transport operators, not to mention matching the volumes up to the correct station locations, so it is great that now GeoLytix has done all the hard work!
We also use Retail Places: a set of polygons defining all of the UK’s town centres, retail parks, shopping centres and parades. For the largest town centres there is a 10m grid overlaid that scores the ‘pitch’ within each centre, which is a nice extra. Some of the naming of these centres was a bit confusing, and the polygons weren’t as accurate as I would have liked. When I raised my concerns with Blair he took them on board and produced a new, much better release in January 2013. The dataset has been refined to remove the least significant areas, and the polygons have been constrained using land use coding from OS MasterMap so they follow the exact lines of the retail areas. I still have some concerns regarding the naming, as names are only unique within towns, which makes it tricky to find the right location when searching at a national or regional level. A ranking of places would also be really useful, but I feel sure these issues will be addressed in the next release. One of the USPs of GeoLytix’s service is their willingness to accept constructive criticism and work with the customer to make each product the best that it can be.
A snapshot of GeoLytix Retail Places and their pitch rating in Nottingham and the surrounding area.
The GeoLytix Retail Places outlines within Central London. The full dataset is shown in green; the Open data is labelled and shown in red. The Open Retail Places cover 342 of the largest towns and retail parks.
GeoLytix’s latest release is a dataset created from the 2011 Census bulk outputs which includes a subset of the most useful demographic counts at OA level in CSV format. It would be very time-consuming to produce this in-house from the many Key Stats and Quick Stats tables released by ONS and NISRA, not to mention working out the differences between the definitions of certain counts for each country, but fortunately GeoLytix has again done all the grafting. Not only that, they have released this as OpenData and included a comprehensive user guide, lookup tables and estimates of the proportion of each 2001 OA that falls in each 2011 OA, so that accurate estimates of 2001 Census statistics can be made for the 2011 OAs for comparison with the latest data. I am looking forward to this dataset being available for the whole of the UK once Scottish Census data becomes available.
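The 2001-to-2011 comparison described above amounts to a weighted sum: each 2011 OA receives a share of the count from every 2001 OA that overlaps it. The Python below is only a minimal sketch of that idea. The OA codes, the counts, the function name and the layout of the proportions table are all hypothetical and are not taken from the GeoLytix lookup files, whose actual format may differ.

```python
from collections import defaultdict


def apportion_2001_to_2011(counts_2001, proportions):
    """Estimate 2001 counts on 2011 OA boundaries.

    counts_2001: dict mapping a 2001 OA code to its 2001 Census count.
    proportions: iterable of (oa_2001, oa_2011, share) rows, where share is
                 the assumed fraction of the 2001 OA falling in the 2011 OA.
    """
    estimates = defaultdict(float)
    for oa_2001, oa_2011, share in proportions:
        # Each 2011 OA accumulates its share of every overlapping 2001 OA.
        estimates[oa_2011] += counts_2001.get(oa_2001, 0) * share
    return dict(estimates)


# Hypothetical example: 2001 OA "A" sits wholly inside 2011 OA "X",
# while 2001 OA "B" is split between 2011 OAs "X" and "Y".
counts_2001 = {"A": 300, "B": 200}
proportions = [
    ("A", "X", 1.0),
    ("B", "X", 0.25),
    ("B", "Y", 0.75),
]

estimates = apportion_2001_to_2011(counts_2001, proportions)
print(estimates)
```

As long as the shares for each 2001 OA sum to one, the national total is preserved, which is a useful sanity check when working with any lookup of this kind.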
There are also three collections of Points of Interest (POI). The education one is especially useful to us as it includes all the schools, colleges and universities together with pupil numbers, age ranges and official lookup codes. Education facilities are a key source of non-residential trade for our convenience stores. The sports stadia one, which includes capacities, teams etc and covers every location where professional sport is played, has some use to us but, in truth, is probably of most interest to my male colleagues for purely personal reasons! Some of the POI layers are of little use to us - including courts and prisons! But I do see how they will be of more interest to other potential users. Again they all come with official identifiers, so users should be able to easily link them to other data sources.
GeoLytix are currently working with our team to trial building a bespoke demand surface that takes into account multi-channel behaviour and a customer’s mission type.  This will help us to better understand the potential for convenience, superstore and online sales in an area. 
The routable road network for use in drive time software is one bundle we do not use. We license a proprietary road network, for which we have built our own speed model using real-world speeds. Changing the road network would trigger a re-build of many of our tools and models.
The data is supplied as either shapefile or MapInfo TAB files in British National Grid. This is fine for our purposes, and if colleagues need it in other formats or projections I can convert it myself very simply in our GIS. Because the data bundles include some very detailed 10K Raster layers and the nominal scale of the vector data is generally 1:10,000, the package is quite hefty at 15 GB. Every data bundle has full documentation explaining the sources and outlining the methodology. They also include a comprehensive data dictionary that explains every data field in every layer. The guides and other supporting documents are well written and easy to understand, but there are a few minor errors I have picked up that will hopefully get corrected as the product matures. The data has been designed to be used together, so all the layers ‘nest’ perfectly and the vector layers exactly match the appropriate Raster backgrounds. GeoLytix have included Northern Irish data where possible, which is a big advantage over the regular OS OpenData. Wherever official codes exist they have been included, such as administrative area codes, hospital identifiers and public transport codes. This allows me to easily create links from the GeoLytix data to our internally built and externally supplied datasets.
I like the easy and transparent pricing approach from GeoLytix. You can pick three bundles for £8,000 a year, six bundles for £14,000 or have all seventeen for £25,000 (if you want to, you can also choose to license each individually). Currently we use the six bundle option, but have signed up to the full pack for next year. GeoLytix are also more than happy to share any dataset with a potential customer on a 30-day evaluation license, which is much better than only being able to review a specific town or area.
The postal geography updates we license only cost £1,200 a year, which is about an 80% saving over what we paid before. We have also been able to replace several data sets we previously licensed piecemeal. We have ended up with a much wider, richer and more up-to-date collection of data than before, at less than we used to pay.
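For readers comparing the options, the arithmetic behind these figures can be worked through in a few lines of Python. This is only an illustrative sketch using the prices quoted above; the variable names are made up, and the previous postal geography cost is an inference from the "about 80%" saving, not a figure stated anywhere in this article.

```python
# Annual price in pounds for each bundle tier quoted in the article.
tiers = {3: 8000, 6: 14000, 17: 25000}

# Effective cost per bundle at each tier, rounded to the nearest penny.
per_bundle = {n: round(price / n, 2) for n, price in tiers.items()}

# Rough previous cost implied by "about an 80% saving" on a £1,200 fee.
# Integer percentages keep the arithmetic free of floating-point noise.
current_fee = 1200
saving_percent = 80
implied_previous_fee = current_fee * 100 / (100 - saving_percent)

for n, cost in per_bundle.items():
    print(f"{n} bundles: about £{cost:,.2f} per bundle per year")
print(f"Implied previous postal geography fee: about £{implied_previous_fee:,.0f}")
```

On these numbers the per-bundle cost falls from roughly £2,667 at three bundles to roughly £1,471 for the full set of seventeen.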
As a large team with over thirty analysts and planners we particularly like the flexible unlimited licensing. For smaller teams I think the pricing will be pretty much on par with their current arrangements, but the data does include several innovations you won’t see anywhere else and will definitely work out more cost-effective than trying to process all the data yourself. For one-person GIS teams or non-specialists the pricing may put some people off; maybe GeoLytix could introduce more flexible pricing or offers for these smaller users? To get the most out of the data you will need to know your way around your GIS. The data is ‘flat’ and does not come with any reporting or analysis tools. Currently it will not be a huge amount of use to anyone without access to a GIS.
During the nine months we have been using the data I have found GeoLytix to be very responsive and attuned to our needs. When we first licensed the data they came into the office to run the team through all the datasets, and having access to the team who actually build and maintain the data means we can always get an instant, direct and definitive answer to any queries.

Published Wednesday, April 17th, 2013

Published in

Location Intelligence

© 2017 Directions Media. All Rights Reserved.