In the old days, it didn't matter. Storage was expensive (my first add-on hard drive cost $320 for 20 MB) and we were happy to get a 24-baud connection on e-mail. CPUs ran at glacial speeds. Thus most data sharing was manual and limited. Providers and users could make arrangements via phone or e-mail. Those days are over, yet we have not adapted or adopted technologies and policies for geospatial data transactions in the Internet Age. Where information that could support automated exchange and use of data exists at all, it typically sits in a separate file or data stream, so the arrangements still have to be made via phone or e-mail.
Notice that we are talking about more than metadata. The additional data that define rights management, security and privacy are considered to be separate (though the ISO standard provides for them, as described below), so let's group these types of data with metadata under the heading, "Related Exploitation Data," or "RED." The benefits of RED should be immediately apparent to the user: This is the ancillary information you need so that you can make the best use of the data!
The challenge is to find a way to advance the provision and use of these exploitation data. The vendors have provided the tools to capture some of the classic metadata, but until users acknowledge the value of those data there is little payback to those who have worked so hard to get us where we are.
How do we break this tradition of selling cars without an owner's manual? We demonstrate the value of that manual to the consumers and they will then demand that it be included in every model. That is the good news. The bad news is that the concurrent need to keep it simple means that the first time a user may want more information usually comes after something bad has happened.
The commercial side is relatively straightforward, as companies are simple creatures; show them that providing exploitation data and the tools that collect them will make them money and it will be done. Their good will in the technology creation process has been phenomenal, but at the end of the day there still needs to be a financial benefit as perceived by the consumer.
Classic geospatial metadata are typically described as "data about data," but this definition continues to suffer from an ambiguity captured in the phrase "one man's data is another man's metadata." Metadata is a four-letter word to many in the business. The best thing might be to deprecate the use of the word "metadata," and eventually do away with it altogether.
With a nod to XML, I like to think of all of the data that provide the information needed to exploit and use content in applications as being related to that content. These helper data could be embedded in the data, or, with XML technology, they could be linked to the data. The underlying and seldom-met requirement is that these data must be immediately available for use by the human being AND by the machine or Web service finding and crunching the content. The same unmet requirement applies to all four kinds of modifiers on the use of the content:
- Classic metadata used to discover, assess, access and exploit the data from an historical perspective
- Geospatial rights management, like digital rights management in other electronic media worlds, stems from data providers' desire to control and profit from the use of their intellectual property. In the geospatial world, the motives are more diverse, including important applications in science and government, and the problems are more complex, such as the problem of different rights being associated with different layers or even different attributes on a single feature in a multi-source data product.
- Security was once driven entirely by the defense world, but more and more people outside that world are realizing that all of us have a stake in data security.
- Privacy, which was not much of an issue separate from security as long as most data came from governments, is now an issue thanks to GPS receivers, traffic cameras and other devices that take away our freedom to legally move about in public spaces without being under continual surveillance.
All of these issues and others involve data about how to exploit data - just like that owner's manual. You cannot take full advantage of today's cars without a manual and you cannot take full advantage of geospatial data and services without a user manual, or rather a package of information about how the data can and may be used.
It should be possible for a single piece of software, or a chain of Web services, to parse all four kinds of exploitation data and act on them.
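As a rough illustration of that idea, here is a minimal Python sketch of a single package that carries all four kinds of exploitation data, so that one routine can inspect them together. The class name, field names and sample values are hypothetical and are not drawn from any ISO or OGC standard.

```python
from dataclasses import dataclass, field

# Hypothetical section names for the four kinds of exploitation data.
SECTION_NAMES = ("metadata", "rights", "security", "privacy")


@dataclass
class RelatedExploitationData:
    """Illustrative container; not a standardized schema."""
    metadata: dict = field(default_factory=dict)  # classic metadata
    rights: dict = field(default_factory=dict)    # geospatial rights management
    security: dict = field(default_factory=dict)  # security markings
    privacy: dict = field(default_factory=dict)   # privacy restrictions


def missing_sections(red):
    """Return the names of the sections that carry no information."""
    return [name for name in SECTION_NAMES if not getattr(red, name)]


# Made-up example: a dataset that arrives without any privacy information.
red = RelatedExploitationData(
    metadata={"title": "County road centerlines"},
    rights={"license": "internal use only"},
    security={"classification": "unclassified"},
)
print(missing_sections(red))  # ['privacy']
```

A service that receives such a package can see at once which questions would still have to be settled by phone or e-mail.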
We have most of the standards we need:
- For some years we have had ISO standards for metadata and XML encoding of classic metadata. Section B.2.3 of the 19115 standard allows for entry of legal and security characteristics associated with geospatial data.
- We have also had OGC standards for encoding data (OGC Geography Markup Language [GML] Encoding Standard) and for publishing geospatial data and geoprocessing services online in "the cloud" (OGC Catalog Services - Web Interface Standard). The interface standards (from OGC and ISO) that enable interprocess communication among geoprocessing systems work with GML.
- User communities working with XML encodings for both metadata and data can dramatically reduce complexity by using tailored subsets (profiles or "application schemas") of the voluminous standards, and software products are available that automate the creation and use of schemas.
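To make the first item in the list above concrete, the sketch below reads legal and security constraints from a small XML fragment using Python's standard library. The fragment is a hand-written, simplified sample modeled on the ISO 19139 (gmd) encoding of ISO 19115 constraint information; a real record carries more elements and attributes (for example a codeList attribute on each code), and the function name and output keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the ISO 19139 XML encoding of ISO 19115 metadata.
NS = {"gmd": "http://www.isotc211.org/2005/gmd"}

# Simplified, hand-written sample; not a complete or validated record.
SAMPLE = """\
<gmd:MD_DataIdentification xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:resourceConstraints>
    <gmd:MD_LegalConstraints>
      <gmd:useConstraints>
        <gmd:MD_RestrictionCode codeListValue="license"/>
      </gmd:useConstraints>
    </gmd:MD_LegalConstraints>
  </gmd:resourceConstraints>
  <gmd:resourceConstraints>
    <gmd:MD_SecurityConstraints>
      <gmd:classification>
        <gmd:MD_ClassificationCode codeListValue="unclassified"/>
      </gmd:classification>
    </gmd:MD_SecurityConstraints>
  </gmd:resourceConstraints>
</gmd:MD_DataIdentification>
"""


def read_constraints(xml_text):
    """Collect use-constraint and classification codes from the fragment."""
    root = ET.fromstring(xml_text)
    use = root.findall(".//gmd:useConstraints/gmd:MD_RestrictionCode", NS)
    cls = root.findall(".//gmd:classification/gmd:MD_ClassificationCode", NS)
    return {
        "use_constraints": [e.get("codeListValue") for e in use],
        "classification": [e.get("codeListValue") for e in cls],
    }


print(read_constraints(SAMPLE))
# {'use_constraints': ['license'], 'classification': ['unclassified']}
```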
These four kinds of exploitation data could be, and almost certainly will be, spun together into something much stronger and more substantial. As our industry enters the world of Web services, "data files" drop into the background. With this transition, I believe that data will live with and travel with their accompanying Related Exploitation Data.
- The OGC is tracking the IETF GeoPriv effort and concurrently working on the special Web service mechanisms and rights languages needed to articulate, manage and protect the rights of all participants in the geographic information marketplace, including the owners of intellectual property and the users who wish to use it.
- The OGC's GeoDRM Reference Model defines the framework for such standards. We anticipate that the OGC membership will use the GeoDRM Reference Model in developing OGC implementation standards for open interfaces and encodings that will enable diverse systems to participate in transactions involving data, services and intellectual property protection.
This union will greatly reduce obstacles to exploitation of data, because Web services will be able to immediately discover the fields in the Related Exploitation Data that indicate what can be done with the data and what may be done with the data. This will increase the value of data and services, reduce costs and simplify implementation.
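The distinction between what can be done and what may be done lends itself to a small sketch. The Python below assumes a hypothetical multi-source product in which each layer carries its own exploitation data; the class, the operation names and the three-way answer are illustrative assumptions, not part of any OGC or ISO interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerRED:
    """Hypothetical per-layer exploitation data."""
    name: str
    supported: frozenset  # what CAN be done (fitness for use)
    permitted: frozenset  # what MAY be done (rights, security, privacy)


def decide(layer, operation):
    """Answer 'ok', 'cannot' or 'may not' for one layer and one operation."""
    if operation not in layer.supported:
        return "cannot"
    if operation not in layer.permitted:
        return "may not"
    return "ok"


def decide_product(layers, operation):
    """Apply the same check to every layer of a multi-source product."""
    return {layer.name: decide(layer, operation) for layer in layers}


# Made-up layers with different capabilities and rights.
roads = LayerRED("roads", frozenset({"display", "route"}),
                 frozenset({"display", "route"}))
parcels = LayerRED("parcels", frozenset({"display", "overlay"}),
                   frozenset({"display"}))

print(decide_product([roads, parcels], "overlay"))
# {'roads': 'cannot', 'parcels': 'may not'}
```

Keeping the two answers separate lets a service tell a user whether the obstacle is technical or contractual, which is exactly the question that otherwise ends up in a phone call.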
These benefits are substantial, and a growing number of people now understand the potential. The most direct way forward would be for a group of data providers and data user organizations to join together in sponsoring an OGC testbed with this focus.
Ed. Note: Bacharach is leaving the OGC staff to return to the implementation world.