Three Standard Geocoding Methods
By Ela Dramowicz, Nova Scotia Community College
October 24, 2004


How Do They Work and What Accuracy Do They Offer?

Ed. Note: This is the first in a series of "Business Geographics 101" articles to be provided by Dr. Ela Dramowicz that will cover the basic techniques required for spatial business analytics. This series will also include articles on the correct use of thematic mapping techniques, assessment of market potential and trade area analysis. Although these articles will be illustrated with Canadian examples, the techniques are generally considered universal. Dr. Dramowicz is a member of the GIS for Business faculty at the Centre of Geographic Sciences at the Nova Scotia Community College.
 
To be geocoded, data must contain information about location, such as a street address, a postal code (or at least part of it), or the name of an area, e.g. county, census subdivision, etc. Geocoding is about adding x, y coordinates to point locations represented by these pieces of information. Three main methods of geocoding are available:

  • by street address,
  • by postal code, and
  • by boundary.
Geocoding is performed using a reference layer. In Canada, the reference layer could be a street network file, a Postal Code Conversion File (PCCF), a Forward Sortation Area (FSA) file or any other boundary file. These layers can be purchased from Statistics Canada or other data providers. The same types of reference files are available for many other countries as well. The different geocoding methods pinpoint records on a map with different levels of accuracy. The most accurate of these three methods is geocoding based on a street address.
 
Geocoding by street address
Address geocoding involves matching addresses in the table to be geocoded to the street names and address ranges in the street network file. The geocoding software reads the first record in the table. It first matches the street name in the table to be geocoded against the street name in the reference table, which is accompanied by a map. Once the street name is matched, all address ranges for this street are examined to determine the street segment on which the particular address is found, and whether it lies on the odd or even number side of the street. This part of the geocoding process is known as address matching. Since the reference layer holds the coordinates of the street segment endpoints, as well as the range of street numbers for the segment, the software can interpolate the coordinates of the address.

In the example below, the table to be geocoded (Table 1) consists of a few restaurants and their addresses. Table 2 is an example of a reference attribute table, with the fields specific to a particular address style (U.S. streets in this case). The address of Harvey's Restaurants (marked with a red arrow) is found in the second record in the reference table, on the even number side of the street (FromLeft, ToLeft). Interpolated addresses are distributed evenly along a street segment, as illustrated in Figure 1. The 5650 Spring Garden Rd address is the sixth point from the east (symbolized on the map with a red dot). Most likely this is not the true location of the restaurant: street address geocoding produces only an approximation of it.
 

Table 1. Table to be geocoded.
 

Table 2. Sample records from the attribute table for a reference layer.
 

Figure 1. Illustration of street address geocoding using the street network file as a reference layer.
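
The address matching and interpolation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular GIS package: the `Segment` class, the `interpolate_address` function, the address ranges and the endpoint coordinates are all hypothetical, and only the FromLeft/ToLeft/FromRight/ToRight field names echo the address style mentioned for Table 2. The sketch also assumes straight segments and evenly spaced house numbers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """One street segment from a hypothetical street network reference layer."""
    street: str
    from_left: int
    to_left: int
    from_right: int
    to_right: int
    start: Tuple[float, float]  # (x, y) at the "from" end of the segment
    end: Tuple[float, float]    # (x, y) at the "to" end of the segment


def interpolate_address(number: int, street: str,
                        segments: List[Segment]) -> Optional[Tuple[float, float]]:
    """Return interpolated (x, y) for a house number, or None if unmatched."""
    for seg in segments:
        # Step 1: match the street name (a simple case-insensitive comparison).
        if seg.street.lower() != street.lower():
            continue
        # Step 2: find the side whose range holds the number with the same parity.
        for first, last in ((seg.from_left, seg.to_left),
                            (seg.from_right, seg.to_right)):
            in_range = min(first, last) <= number <= max(first, last)
            same_parity = number % 2 == first % 2
            if in_range and same_parity:
                # Step 3: linear interpolation between the segment endpoints.
                t = 0.5 if first == last else (number - first) / (last - first)
                x = seg.start[0] + t * (seg.end[0] - seg.start[0])
                y = seg.start[1] + t * (seg.end[1] - seg.start[1])
                return (x, y)
    return None


# Hypothetical reference records: the ranges and coordinates are made up.
segments = [
    Segment("Spring Garden Rd", 5500, 5598, 5501, 5599, (0.0, 0.0), (100.0, 0.0)),
    Segment("Spring Garden Rd", 5600, 5698, 5601, 5699, (100.0, 0.0), (200.0, 0.0)),
]

print(interpolate_address(5650, "Spring Garden Rd", segments))
```

In this sketch the returned point depends only on where the house number falls within the address range, so it always lies on the segment itself at a proportional distance from the endpoints. That is why the result approximates, rather than pinpoints, the building.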

Address geocoding offers the same accuracy in urban and rural areas, as illustrated in Figures 2 and 3. In rural areas in Canada, however, address geocoding often cannot be completed, because address ranges are not available for small towns and other small communities. This is improving every year, as street network data providers release new versions of street network files.
 

Figure 2.
 

Figure 3.
 
Geocoding by postal code
Any database that includes postal code information can be geocoded by postal code, using a file that consists of postal code centroid points with geographic coordinates attached to them. In Canada, the Postal Code Conversion File (PCCF) also provides a correspondence between the postal code and Statistics Canada's standard geographic areas for which census data are produced. Through the link between postal codes and standard geographic areas, the PCCF permits the integration of data from various sources. According to the Canada Post Corporation, the postal code consists of six characters (e.g. B0S 1M0), of which the first three identify the Forward Sortation Area (FSA). Forward Sortation Areas are large polygons with distinct boundaries, whereas postal codes do not have boundaries even though they represent areas. Figures 4 and 5 show the spatial relationship between postal codes and Forward Sortation Areas and the density of postal codes in urban and rural areas.


Figure 4.


Figure 5.
 
Geocoding using postal codes involves matching postal code information in the table to be geocoded to the postal code information (FSALDU, Forward Sortation Area Local Delivery Unit) in the PCCF file, which acts as the reference layer. Tables 3 and 4 are examples of data sets used in this process. Postal codes in the table to be geocoded must not contain spaces, so that they match the FSALDU values.
 

Table 3. Table to be geocoded.
 

Table 4. Sample records and fields from the Postal Code Conversion File.
 
Based on the match (a relational join operation), the table to be geocoded receives coordinates from the PCCF file, and points can then be created on a map. The geocoded records will have locations identical to the centroid locations of the postal codes to which they were matched. In the above example, only one record from Table 3 (indicated with a red arrow) was matched to a record in Table 4.
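
The join described above can be sketched in Python as a dictionary lookup keyed on the space-free postal code. This is only an illustration: the function names, the record layout, the customer rows, the postal code B3J 3R4 and the coordinates are hypothetical placeholders, not values taken from the PCCF. Only B0S 1M0 comes from the example quoted earlier, and the coordinates attached to it here are not its real centroid.

```python
from typing import Dict, List, Tuple


def normalize_postal_code(code: str) -> str:
    """Remove all whitespace and upper-case the code, as FSALDU matching requires."""
    return "".join(code.split()).upper()


def geocode_by_postal_code(records: List[dict],
                           pccf: Dict[str, Tuple[float, float]]):
    """Join records to postal code centroids; return (matched, unmatched)."""
    matched, unmatched = [], []
    for rec in records:
        key = normalize_postal_code(rec["postal_code"])
        centroid = pccf.get(key)
        if centroid is None:
            unmatched.append(rec)
        else:
            # The record inherits the centroid coordinates of its postal code.
            matched.append({**rec, "lat": centroid[0], "lon": centroid[1]})
    return matched, unmatched


# Hypothetical reference rows keyed by FSALDU; the coordinates are placeholders.
pccf = {"B0S1M0": (45.0, -65.0)}

# Hypothetical table to be geocoded.
customers = [
    {"name": "Customer 1", "postal_code": "B0S 1M0"},
    {"name": "Customer 2", "postal_code": "b0s 1m0"},
    {"name": "Customer 3", "postal_code": "B3J 3R4"},
]

matched, unmatched = geocode_by_postal_code(customers, pccf)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

Every record that shares a postal code lands on exactly the same point, and records whose postal code is absent from the reference file stay ungeocoded, which mirrors the partial match between Tables 3 and 4.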
 
Geocoding based on a postal code produces radically different results in urban and rural areas. Urban postal codes represent very small areas, as they approximate a block face, that is, one side of the street between two intersections. The results of geocoding are therefore fairly accurate. The red circles in Figure 2, representing postal code-geocoded restaurants, are within a reasonable distance of the green dots, representing street address-geocoded restaurants. Rural postal codes are very large, covering many communities; therefore the results of geocoding are less accurate. In Figure 3 all restaurants are geocoded to a single postal code location. There are no other postal codes within the area shown in this figure.

The question is: who would use data sets geocoded this way? For some businesses it is the only available information about the location of customers, gathered most likely from customer satisfaction surveys that include a question about postal code. Knowing who patronizes their businesses (using the link between postal codes and census standard geographic areas) and where those customers are located (what part of the city, town, or county) helps in planning marketing campaigns and targeting new customers. For some research projects, due to data confidentiality, customer data is often aggregated by postal code (in both urban and rural areas), then geocoded and analyzed.

Geocoding by boundary
Geocoding by boundary is the least accurate of the three methods of mapping point locations. Any boundary file, such as counties or Forward Sortation Areas, can be used, as long as the boundary name in the table to be geocoded can be matched with the boundary name in the reference table. For example, GIS software reads the FSA name in the first record in the table to be geocoded (Table 5), matches it with the FSA name in the reference table (Table 6, record marked with a red arrow) and assigns the FSA centroid coordinates to this record. The geocoded location will display at the FSA centroid location. If the restaurants in Figure 2 were geocoded using the FSA, all would display at the red dot location that represents the centroid of this FSA. The FSA centroid for the area shown in Figure 3 is located outside of the map, but again, all restaurants would be located there.
 

Table 5. Table to be geocoded.
 

Table 6. Sample records and fields from the Forward Sortation Area reference table.
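
Boundary geocoding as described above amounts to looking up a centroid by boundary name. The Python sketch below shows how every record in the same FSA collapses onto a single point. The function name, the restaurant names, the FSA codes and the centroid coordinates are hypothetical placeholders, not values from Tables 5 and 6.

```python
from typing import Dict, List, Optional, Tuple


def geocode_by_boundary(records: List[dict],
                        centroids: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Attach the boundary centroid to each record; None if the name is unmatched."""
    geocoded = []
    for rec in records:
        name = rec["fsa"].strip().upper()
        xy: Optional[Tuple[float, float]] = centroids.get(name)
        geocoded.append({**rec, "xy": xy})
    return geocoded


# Hypothetical FSA reference table: boundary name -> centroid (x, y).
fsa_centroids = {"B3J": (10.0, 20.0), "B0S": (30.0, 40.0)}

# Hypothetical table to be geocoded.
restaurants = [
    {"name": "Restaurant A", "fsa": "B3J"},
    {"name": "Restaurant B", "fsa": "b3j"},
    {"name": "Restaurant C", "fsa": "B3J"},
    {"name": "Restaurant D", "fsa": "X9X"},
]

result = geocode_by_boundary(restaurants, fsa_centroids)
distinct_points = {r["xy"] for r in result if r["xy"] is not None}
print(len(distinct_points), "distinct location(s) for",
      sum(r["xy"] is not None for r in result), "matched records")
```

The count of distinct locations makes the loss of detail explicit: however many records share a boundary, the map shows one point for them.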
 
Geocoding using FSAs produces inaccurate results in both urban and rural areas. The larger the FSA, the less accurate the results. But still, the map produced with this method is better than no map at all. Learning more about customers would be difficult, however, because FSAs are large areas and they correspond to large census standard geographic areas that are more complex to profile.
 
None of the discussed geocoding methods produces accurate results in the sense of positional accuracy. In cases where such accuracy matters, a GPS device could be used for coordinate collection. GIS software can easily create points from pairs of x, y coordinates provided by GPS.
 
With the advent of GPS technology, more people are starting to use accurate location information instead of street addresses. According to Natural Area Coding (NAC) Geographic Products Inc., geographic coordinates have too many digits for consumers to deal with. NAC developed the Natural Area Coding System, which may revolutionize addresses all over the world. Natural Area Codes are defined by a series of grids applied to the earth's surface. There is an unlimited number of NAC grids, with cell sizes ranging from a thousand kilometers to one meter, a few centimeters, or even smaller. An eight- or 10-character NAC (also called a Universal Address) is a cell on the fourth- or fifth-level NAC grid, with a width/length of about 30 meters or one meter, respectively. There are many advantages to using Universal Addresses, and this technology has already been adopted by a number of companies. NAC's intention is not to replace existing street addresses but to complement them, so that people who know how to use Universal Addresses can benefit from using them.


Your Comments
All comments provided in this section are those of the individual who has created the post. These are not the opinions of Directions Media, its editors, staff or owners unless otherwise noted. Directions Media retains the right to edit or delete any comments posted herein.

Three Standard Geocoding Methods by Ela Dramowicz (#1)
by Leslie Cooper, Wellington Underwriting Plc
   
Date: October 20, 2004 11:24 AM
These may be the three geocoding methods used in the USA and Canada, but there is a whole world out here (i.e. Europe, the Far East, etc.). Geocoding techniques vary considerably: in the UK there is data from the Ordnance Survey which enables geocoding to be carried out that will give a point guaranteed to be within the bounds of the given property.
In the Far East addressing structures may well vary considerably in shape, and in Japan I believe street numbers do not run along the street in the same way as they may elsewhere.

Our experience of trialling different geocoders (for 'worldwide' geocoding), both via web services and via APIs, is very mixed.
One provider may distort results by almost completely basing the geocoding on the zip/post code - so we may supply an address which is completely correct apart from the postcode which is slightly wrong. The result will then be based on the zip/post code. Another provider may not base geocoding results on the same methods.

The other major problem we currently face is that different geocoding providers may produce different levels of geocoding:
Provider A has geocoding levels 'Address', 'Street' and 'Town/City'.
Provider B has levels of 'Address', 'InterpolatedAddress', 'Street', 'Town/City' and 'Region'.
How do you then use these different results in an application that works across all these different result types?

The NAC initiative would certainly help resolve some of the issues involved in worldwide geocoding, but it would be a long haul. Would there then be something which could resolve an NAC against an address to ensure that the NAC or the address is correctly entered?

I hope you do not mind my ramblings on this subject!


Three Standard Geocoding Methods (#2)
by Ela Dramowicz, Centre of Geographic Sciences, NSCC
   
Date: October 26, 2004 15:40 PM
The article was addressing the North American situation. If we add other parts of the world to the list mentioned by Leslie, the geocoding issue will get even more complicated. The “worldwide” geocoding problem is far from being solved for a variety of reasons. From the technical point of view it involves the development of standards, availability of data, availability of data in a digital format and associated issue of accuracy. The example of Japan is simply dealing with different standards. In many African countries, the main problem is with the availability of data (especially in a digital format) since many communities often do not have street names! Even in Canada, you cannot geocode street addresses in small communities because street network files with corresponding address ranges do not cover 100% of the country.

Different levels of geocoding are associated with so called address style. Depending on what type of data you want to geocode and what data you have as a reference layer you can choose between various address styles.


Three Standard Geocoding Methods - Possibility in India (#3)
by Harsha Vardhan, JT Maps
   
Date: October 27, 2004 10:09 AM
Sir,
I have read the article "Three Standard Geocoding Methods". These methods are very much possible in countries like Canada, where house or block numbers are distributed in order. But in countries like India, house or block numbers are very unevenly distributed, even in urban areas. I would like to hear your comments, and any other methodologies, so that we can accurately locate a point in such cases.


3 Std GCdng Mtds - others (#4)
by Peter Rabley, International Land Systems (ILS), Inc
   
Date: October 27, 2004 14:00 PM
Sirs
I agree with Harsha and would add that interpolation of addresses based on some set distance just does not work in most of the rest of the world. I understand the focus was NA and Canada, where data is well developed and structured, but I do agree that more on non-standard addressing approaches would be far more useful. Most GIS-savvy persons understand how geocoding works in well-structured environments.

Yours


Three Standard Geocoding Methods - Styles (#5)
by Leslie Cooper (Mr), Wellington Underwriting Plc
   
Date: October 28, 2004 10:07 AM
Ela
You state in response to my ramblings:
"Different levels of geocoding are associated with so called address style"
I should clarify our understanding and position of things:
We have tried and are currently trying to bring together geocoding results from different sources. We are provided with schedules of locations by many different third parties (brokers) who will in turn have received these from many different sources. The schedules may well cover many countries and will inevitably be of varying quality. It is almost impossible to do address cleansing with the large number of countries involved.
There is no definitive geocoding source for geocoding worldwide addresses. There are a growing number with partial coverage.
We are trying to bring together results from multiple vendors and form one uniform/normalised set of geocoding levels. The levels are arrived at through a couple of possible routes:
(1) The user may only wish to geocode to postcode level (this is not our situation).
(2) The geocoding service/application may not have sufficient data to be able to do better than say postcode level.

Our aim is to achieve the highest possible accuracy of geocoding for locations all over the world. So we have the situation where we have different results from different vendors. It's not about our choice of differing ranges of levels i.e. Vendor ABC has results at levels BUILDING, STREET and CITY, Vendor XYZ has results at levels BUILDING, POSTCODE, STREET, REGION and CITY. We need to 'normalise' the results so that we can analyse across the whole result set. We need to protect ourselves so that we can add and switch vendors as required.

I hope this clarifies things a little especially with respect to my gender.
Mr. Leslie Cooper
Wellington Underwriting Plc


Appreciation (#6)
by Muhammad Tubbsum, University of Calgary
   
Date: March 19, 2006 20:02 PM
I have gone through this web site. Lots of good info in your study.

I appreciate your work

Take care


Three Standard Geocoding Methods - Possibility in India (#7)
by Ashok Verma, Spatial Hydrology
   
Date: April 13, 2006 20:40 PM
In order to geocode, a set of addressing standards is a must, and such standards do not exist in most developing nations. Hence, we have almost no way to geocode street addresses there. A neural network of fuzzy logic concept might work but I am not sure.

I have been working on address matching issues for the last 2 years. Sometimes, it's impossible to geocode a valid existing address. So I am not sure how an address can be geocoded with no set of addressing standards. If you find any methods, please let us know.

