Since OpenStreetMap’s inception in 2004, its growth and maturation have been evident in many ways: in its data collection methodology, its editorial support infrastructure, its policies and standards, and, of course, in the quality and quantity of the data itself. As founder Steve Coast has said, “When I created OSM more than 10 years ago, I was just out to create a map, and that’s it. Since then, the tool and its uses have evolved and I believe that it’s an amazing thing that there are people involved in the [OSM] project that want to do socially beneficial things with the tool and that they’re interested in OSM from a different perspective.” In this article we will consider the OSM community of contributors, the data they produce, and the projects that have been made possible. Some of this information was presented during a session on OSM at the recent American Association of Geographers’ 2016 meeting in San Francisco.
Who are the mappers?
As of today, more than 2.4 million people have registered with OSM to add or edit data. Over the last few years, about 25,000 different people have typically been active mappers in any given month. That sounds like a lot of volunteered geographic information, but in reality a few people contribute a great deal and many more contribute little or nothing. The result is the familiar long-tail distribution seen in many volunteer efforts.
One distinctive feature of OSM is that its contributors and editors leave digital fingerprints. Scholars have begun to examine these historical records to better understand the patterns of who has contributed what and when. To study “the crowd,” geographer Sterling Quinn is using the data from individual contributors in several small cities across several countries to classify, compare and contrast the frequency and type of edits. His Crowd Lens for OpenStreetMap tool (in beta) enables users to systematically compare individual entries over time. With information like this, he has begun to categorize contributors by their motivations, including casual or systematic mappers, casual or systematic fixers, and those who participate because they have a particular connection to that one place. Those who add and edit data associated with crises and other humanitarian events are another large group.
What are they mapping?
The very first contribution to OpenStreetMap was probably a road. The road network has been an obvious starting place for mappers, given its central role in transportation, navigation, emergency management, routing and so on. Roads are also a relatively easy type of map feature to “trace,” which is the primary method of data entry. Using one of several editing interfaces, a mapper looks at an aerial photograph of a place and traces along the lines (roads, streams, paths) or around the areas (buildings, playgrounds, forest patches) visible in the photo. In OSM lingo, these lines and areas are ways.
What about points? Sure. Flag poles, fire hydrants and obelisks become nodes in the system. When ways and nodes fit together or have some kind of functional relationship, such as sections of a connected trail network, or a train station and the tracks that lead from it, the topology can be indicated via relations.
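The three element types described above can be sketched as simple data structures. This is a minimal illustration, not the OSM API or any particular library: the class names, field names, IDs and coordinates are all assumptions made for the example, while the tag keys shown (`emergency`, `building`, `highway`, `type`, `route`) are ones commonly used in OSM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A single point with a position and optional tags."""
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node IDs: a line, or an area when it closes on itself."""
    id: int
    node_ids: List[int]
    tags: Dict[str, str] = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A closed way starts and ends on the same node (at least a triangle).
        return len(self.node_ids) >= 4 and self.node_ids[0] == self.node_ids[-1]


@dataclass
class Member:
    """One participant in a relation, referenced by element type and ID."""
    type: str  # "node", "way" or "relation"
    ref: int
    role: str = ""


@dataclass
class Relation:
    """A group of elements that belong together, such as a trail network."""
    id: int
    members: List[Member]
    tags: Dict[str, str] = field(default_factory=dict)


# Invented example data: a hydrant, a building outline and a two-section trail.
hydrant = Node(1, 37.7749, -122.4194, {"emergency": "fire_hydrant"})
building = Way(100, [2, 3, 4, 5, 2], {"building": "yes"})
trail_a = Way(101, [6, 7, 8], {"highway": "path"})
trail_b = Way(102, [8, 9], {"highway": "path"})  # shares node 8 with trail_a
trail_network = Relation(
    200,
    [Member("way", 101), Member("way", 102)],
    {"type": "route", "route": "hiking"},
)

print(building.is_closed(), trail_a.is_closed())
```

The building outline is a way whose first and last node are the same, while the two trail sections stay open lines that share an end node and are grouped by the relation.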
But the real added value of the data lies in the attributes, or map features, that the local community continually adds as tags to the captured geometry. It’s not just the polygon shape of a building that is important: it’s all of the additional information about the nature and functions of the building and the services it provides within. What matters is not only that a structure has been mapped, but that it is known to be a commercial structure that repairs Volvos. Or that a particular ATM node is associated with the Bank of America and can or cannot accept cash deposits.
The level of detail that can now be attached to each segment of a roadway is formidable. Names, surface types, speed and weight limits, and even very localized restrictions on passing are among the attributes that can be recorded.
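As a rough illustration, the attributes of a road segment or an ATM can be pictured as plain key-value pairs. The keys below (`highway`, `name`, `surface`, `maxspeed`, `maxweight`, `overtaking`, `amenity`, `operator`, `cash_in`) are commonly used OSM tagging keys, but the values, the street name and the helper `describe` are invented for this sketch.

```python
# Hypothetical tags for one road segment; every value is made up.
road_segment_tags = {
    "highway": "residential",
    "name": "Example Street",
    "surface": "asphalt",
    "maxspeed": "30",    # speed limit
    "maxweight": "7.5",  # weight limit
    "overtaking": "no",  # localized restriction on passing
}

# Hypothetical tags for an ATM node that does not accept cash deposits.
atm_tags = {
    "amenity": "atm",
    "operator": "Bank of America",
    "cash_in": "no",
}


def describe(tags):
    """Return a one-line 'key=value; key=value' summary, sorted by key."""
    return "; ".join(f"{key}={value}" for key, value in sorted(tags.items()))


print(describe(atm_tags))
# amenity=atm; cash_in=no; operator=Bank of America
```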
Entering the geometry of features, by tracing what is visible, can technically be done from anywhere and by anyone. But the more precious commodity is the local knowledge associated with the attributes, even if it is just the names of the roads. For example, Mapzen’s Indy Hurt has documented the expanding collection of OSM data in three Indian cities. The overall cumulative length of the road network and the number of buildings are simple metrics that show growth, but the increase in the tags themselves is stronger evidence of the enrichment of the data through local knowledge. A single long road is of less interest than the same length of road divided into its constituent segments, each with its own collection of tags.
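The difference between growth in length and growth in tags can be shown with a toy comparison. This is only a sketch under stated assumptions: the data, the `summarize` helper and its output fields are hypothetical, and it is not the method used in the study mentioned above.

```python
# Invented data: the same 3 km of road, mapped two different ways.
one_long_road = [
    {"length_km": 3.0, "tags": {"highway": "residential"}},
]

same_road_segmented = [
    {"length_km": 1.0, "tags": {"highway": "residential",
                                "name": "Example Street",
                                "surface": "asphalt"}},
    {"length_km": 1.5, "tags": {"highway": "residential",
                                "name": "Example Street",
                                "surface": "asphalt",
                                "maxspeed": "30"}},
    {"length_km": 0.5, "tags": {"highway": "residential",
                                "name": "Example Street",
                                "surface": "unpaved"}},
]


def summarize(ways):
    """Return total length, way count, tag count and distinct tag keys."""
    distinct_keys = {key for way in ways for key in way["tags"]}
    return {
        "length_km": sum(way["length_km"] for way in ways),
        "ways": len(ways),
        "tags": sum(len(way["tags"]) for way in ways),
        "distinct_keys": len(distinct_keys),
    }


print(summarize(one_long_road))
print(summarize(same_road_segmented))
```

Both versions report the same total length, but the segmented version carries ten tags across four distinct keys instead of one, which is the kind of enrichment a length metric alone would miss.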
What about data quality?
From day one, skeptics have voiced concerns that user-generated content would lack the quality of data generated by authoritative sources and developed under strict guidelines. In reality this fear has not been borne out. For one thing, the OSM community has designed tools that support the overall editing process in several ways, from requiring topology between intersecting roads to providing robust, pre-populated collections of “tags” that are approved for use. Infrastructure like this prevents some of the willy-nilly data entry that some predicted would occur.
Moreover, over the last decade, what has been borne out is Linus’s Law: when many people pay attention to a situation, that attention can be enough to identify and then address existing problems.
The OSM community has also designed tools that make it easier to detect when deliberate errors are introduced and vandalism is at play. There are no perfect data sets, but the OSM collection has matured to the point that it is being used by many companies and agencies.
A more complex problem going forward is the ability of the crowd to keep the data current. The world is in constant flux, and unless tags are kept up to date and the geometries of nodes and ways are revised as they inevitably change, OSM would quickly lose value. Alan McConchie of Stamen Design likens the process to gardening. The unknown but necessary number of gardening edits equals the number of new nodes times an error rate, plus the number of existing nodes times the rate of real-world change. How that will play out remains to be seen!
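The gardening estimate can be written out as a small calculation. The function name and all four input values here are assumptions chosen only to show the arithmetic, and both rates are taken to cover the same time period; none of the numbers are measured OSM figures.

```python
def gardening_edits(new_nodes, error_rate, existing_nodes, change_rate):
    """New nodes times error rate, plus existing nodes times real-world change rate."""
    return new_nodes * error_rate + existing_nodes * change_rate


# Invented inputs: 1 million new nodes with a 2% error rate,
# and 50 million existing nodes of which 1% change in the real world.
estimate = gardening_edits(
    new_nodes=1_000_000,
    error_rate=0.02,
    existing_nodes=50_000_000,
    change_rate=0.01,
)

# 20,000 corrections to new data plus 500,000 updates to existing data.
print(round(estimate))
```

With these made-up inputs the second term dominates, which suggests why upkeep of existing data could outweigh corrections to new data as the map grows.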
What’s next with OSM?
Change is what’s constant. New tools and lines of research are continuously emerging in efforts to better understand the mapping community and support its work in producing high-quality data. Projects such as missingmaps.org and youthmappers.org specifically target high-priority areas, where the demand for and supply of data are most out of sync. The Humanitarian OpenStreetMap Team coordinates extensive first-response mapping following crisis events, and is also involved in developing tools to analyze data. Jennings Anderson and others at the University of Colorado’s Project EPIC are working on real-time analysis of data during crisis mapping activities and on making its interpretation more accessible. Many branches of the U.S. federal government, including the State Department with its MapGive project, are promoting data creation efforts. Steve Coast has moved on to what3words, but his vision will continue to advance as long as people continue to plant, water, weed and harvest the garden.