Common Pitfalls When Analyzing WMS/WFS Capabilities

In this article I will point out some of the obstacles met when reading and analyzing capabilities from an Open Geospatial Consortium (OGC) compliant Web Map Service (WMS) or Web Feature Service (WFS). This seemingly simple task involves some major difficulties, and I offer my experience to help others who may have come across similar issues. Please post a comment to let me know if I am succeeding!

I chose the WMS/WFS capabilities specifications as a platform to demonstrate some of the complexities developers encounter when working with OGC standards. I will offer a bit of my hands-on experience on how to deal with these difficulties. There is a lot to say about interacting with other OGC-based services, such as parsing GML or performing WFS-T transactions. I will try to approach these topics in future articles.

I will approach this issue from a client perspective, with the goal of interacting with services from any vendor. Theoretically, the task of interacting with various WMS or WFS providers should be straightforward. By definition, that is what OGC-based interoperability was designed for.

Well, it's not so simple. There are subtle and not-so-subtle differences in data delivery when accessing services from different vendors. In addition to the problems created by the robust nature of the specs, keep in mind that OGC is very active in producing new versions of the specifications. While the specs keep updating and evolving, some vendors lag in their adoption of the newer versions. For example, a few still use the old WMS 1.0.0. Supporting older versions will increase the range of servers your software can interact with.

Our work on CarbonTools (an OGC-compliant software development toolkit) was mainly COM and .NET based, but I will try to stick with general descriptions and avoid platform-specific source code so developers on any platform will be able to relate. The purpose of this article is to enlighten developers about issues not frequently discussed and to help project managers assess the difficulty level involved with this task. I will assume the reader has some familiarity with OGC specifications, as well as with XML and XML parsing. I will skip the issue of fetching the capabilities XML from the server.

Differences between WFS and WMS capabilities

Developers who need to read capabilities from both WMS and WFS will discover an unfortunate fact: the capabilities specifications for the two services are not identical. Although they look somewhat similar at first glance, you will notice many differences, some of which are illustrated in the following table.

These are some of the noticeable differences, but by no means all of them. Some of these differences are arguably understandable, like referring to <Layer> vs. <FeatureType>, and require a different set of local parsing methods. Other differences, such as the use of a DTD vs. a schema, may be handled transparently (if you are lucky) by the XML parser used. Some of the discrepancies are just annoying and require dual parsing methods, such as the online-resource handling. In CarbonTools, for example, we simply established a single handling method for the <DCPType> element and let it deal with either form of online resource, regardless of whether it comes from a WMS or a WFS. This way we don't need to worry about which is which, or about future versions of the standards converging on one form or the other.
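
As a rough illustration of such a single handling method, here is a minimal Python sketch (not the CarbonTools implementation). It assumes the two forms are a child <OnlineResource> element carrying an xlink:href attribute and an onlineResource attribute placed directly on the <Get> or <Post> element. The function names and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree prepends to a tag."""
    return tag.rsplit("}", 1)[-1]


def dcp_urls(dcp_type):
    """Return a {'Get': url, 'Post': url} mapping for one <DCPType> element."""
    urls = {}
    for http in dcp_type:
        if local_name(http.tag) != "HTTP":
            continue
        for method in http:
            # Attribute form: <Get onlineResource="..."/>
            url = method.get("onlineResource")
            if url is None:
                # Child form: <Get><OnlineResource xlink:href="..."/></Get>
                for child in method:
                    if local_name(child.tag) == "OnlineResource":
                        url = child.get(XLINK_HREF)
            if url:
                urls[local_name(method.tag)] = url
    return urls


# Two hypothetical fragments, one per form.
child_form = ET.fromstring(
    '<DCPType xmlns:xlink="http://www.w3.org/1999/xlink"><HTTP><Get>'
    '<OnlineResource xlink:href="http://example.com/wms?"/>'
    "</Get></HTTP></DCPType>"
)
attribute_form = ET.fromstring(
    '<DCPType><HTTP><Get onlineResource="http://example.com/wfs?"/></HTTP></DCPType>'
)

print(dcp_urls(child_form))
print(dcp_urls(attribute_form))
```

The caller never needs to know which service or version produced the element; both forms come back as the same mapping.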

Versioning

Some of the issues described in this article are the result of the evolution of OGC specs. Naturally, newer versions of specifications add to, change or enhance the previous ones. Although I do not intend to provide full coverage of the capabilities version history, there are two issues worth mentioning. The first is the version of the capabilities file returned by the service. The second is the significant change of terms between WMS 1.0.0 and later WMS versions.

Version negotiation rules for WMS are described in the latest WMS specifications (currently only one version of WFS is available). It is important to remember, however, that the response from the service may not be in the same version as the one specified in the 'GetCapabilities' query. If the server does not support the requested version, it will return a different version in the capabilities XML. Therefore it is necessary to parse the version from the response rather than rely on the version specified in the query.
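
A minimal Python sketch of that check, assuming the version is carried in the version attribute of the capabilities root element. The sample response and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def response_version(capabilities_xml):
    """Return the version declared on the root element of the response."""
    root = ET.fromstring(capabilities_xml)
    return root.get("version")


# Hypothetical exchange: the client asked for 1.1.1, the server answered 1.1.0.
requested = "1.1.1"
response = '<WMT_MS_Capabilities version="1.1.0"><Service/></WMT_MS_Capabilities>'

actual = response_version(response)
if actual != requested:
    print(f"Requested {requested}, but the server answered with {actual}")
```

All later parsing decisions should then branch on the parsed value, never on the requested one.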

The changes made from the earliest WMS version (1.0.0) to later versions are significant. Pay close attention to the DTD descriptions (found in the OGC specs) when attempting to support WMS version 1.0.0, as there are many changes in the XML terminology. One major difference worth pointing out is the change in the names of available WMS operations, as detailed in this table.

While these are more critical for sending queries to the server, it is essential to keep this discrepancy in mind when using the capabilities data to find if the server supports a specific operation. For example, if an application needs to send a GetFeatureInfo to a server which supports WMS 1.0.0 only, it must verify support for and use FeatureInfo instead.
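
One way to keep this discrepancy in a single place is a small lookup keyed on the modern operation names. The sketch below is in Python. The FeatureInfo mapping comes from the example above; the Map and Capabilities entries reflect my reading of the WMS 1.0.0 DTD and should be checked against the spec. The function names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Modern operation name -> element name assumed for WMS 1.0.0 capabilities.
LEGACY_NAMES = {
    "GetCapabilities": "Capabilities",
    "GetMap": "Map",
    "GetFeatureInfo": "FeatureInfo",
}


def operation_name(version, operation):
    """Return the name to look for (and use) for the given server version."""
    if version == "1.0.0":
        return LEGACY_NAMES.get(operation, operation)
    return operation


def supports_operation(root, operation):
    """Check the <Request> section, translating the name for 1.0.0 servers."""
    request = root.find("Capability/Request")
    if request is None:
        return False
    name = operation_name(root.get("version", ""), operation)
    return request.find(name) is not None


old_server = ET.fromstring(
    '<WMT_MS_Capabilities version="1.0.0"><Capability><Request>'
    "<Map/><Capabilities/><FeatureInfo/>"
    "</Request></Capability></WMT_MS_Capabilities>"
)
new_server = ET.fromstring(
    '<WMT_MS_Capabilities version="1.1.1"><Capability><Request>'
    "<GetCapabilities/><GetMap/>"
    "</Request></Capability></WMT_MS_Capabilities>"
)

print(supports_operation(old_server, "GetFeatureInfo"))
print(supports_operation(new_server, "GetFeatureInfo"))
```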

Information inheritance

The capabilities specifications describe the inheritance of element data in the description of layers and features. There are elements that have no inheritance, elements that are accumulated through inheritance, and elements that are either inherited or overridden with new values. While this makes the capabilities file a powerful and robust data structure, it can make the life of a programmer very difficult. I do not intend to turn this article into a course in software engineering or data structures, but I do want to share a couple of my favorite solutions as we implemented them in CarbonTools.

In CarbonTools we treat each layer as an object with its own data and API. In order to be able to detach a layer object from the capabilities and 'move it around,' we decided that each object would be self-sustaining and have its own complete dataset. This sacrifices memory efficiency for flexibility. If there is no need in your solution to detach the layer from the complete capabilities structure, you may consider a reference tree-type data structure. To illustrate that point, the following describes a parent object that contains data A and B, and two children that have C and D data. In both cases the child objects inherit A and B and add their own data. The first case inherits the data by pointing to the parent structure; the second variation copies the parent's data to the children.
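
The two variations can be sketched in a few lines of Python. This is only an illustration of the trade-off, not CarbonTools code. It assumes one child adds C and the other adds D, and the values are arbitrary.

```python
from collections import ChainMap

parent = {"A": 1, "B": 2}

# Variation 1: children point at the parent structure (reference tree).
# Lookups that miss in the child's own data fall through to the parent.
child_c_ref = ChainMap({"C": 3}, parent)
child_d_ref = ChainMap({"D": 4}, parent)

# Variation 2: children copy the parent's data (self-sustaining objects).
child_c_copy = {**parent, "C": 3}
child_d_copy = {**parent, "D": 4}

# A later change to the parent is visible only through the references.
parent["A"] = 100
print(child_c_ref["A"], child_c_copy["A"])
```

The copied children use more memory but stay valid after the parent is discarded, which is the property needed for a detachable layer object.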

In the WMS specifications there are two inheritance types described as 'add' and 'replace.' To demonstrate those I'll use the SRS tag (add) and the ScaleHint tag (replace).

The ScaleHint tag is an optional element that suggests the maximum and minimum scales at which it is appropriate to display the layer. Each layer can therefore have at most one ScaleHint tag. If ScaleHint is defined for the layer, it overrides any element previously defined by its ancestors. If the ScaleHint tag is not defined for the layer, you need to climb up the hierarchy and use the first one you find. It is also possible that neither a layer nor any of its ancestors has a ScaleHint defined.

In the next diagram layers A and B have no ScaleHint associated with them, while layers C, D and E all have <ScaleHint min=1 max=10> (D and E through inheritance).

While the concept of inheritance is very familiar to object-oriented programmers, this is more of a data storage issue, since all the objects are of the same type (layer). In the case of 'replace' type inheritance (such as ScaleHint), the easiest way to cope is to have a data store for the item in each layer object and, when creating a new object (at the beginning of parsing a <Layer> element), just copy the parent's data and override it if a new ScaleHint tag is encountered.
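
The copy-then-override approach can be sketched in Python as a recursive walk over nested <Layer> elements. The tree shape below (A at the root, B and C under A, D and E under C) is my assumption about the layer example described earlier, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def collect_scale_hints(layer, inherited=None, out=None):
    """Map each named layer to its effective (min, max) ScaleHint, or None."""
    if out is None:
        out = {}
    hint = inherited  # start from the parent's value ('replace' inheritance)
    own = layer.find("ScaleHint")  # direct child only, not descendants
    if own is not None:
        hint = (float(own.get("min")), float(own.get("max")))
    name = layer.findtext("Name")
    if name:
        out[name] = hint
    for child in layer.findall("Layer"):
        collect_scale_hints(child, hint, out)
    return out


LAYERS = ET.fromstring(
    "<Layer><Name>A</Name>"
    "<Layer><Name>B</Name></Layer>"
    '<Layer><Name>C</Name><ScaleHint min="1" max="10"/>'
    "<Layer><Name>D</Name></Layer>"
    "<Layer><Name>E</Name></Layer>"
    "</Layer>"
    "</Layer>"
)

hints = collect_scale_hints(LAYERS)
for layer_name, value in hints.items():
    print(layer_name, value)
```

Because each layer stores the resolved value, no climb up the hierarchy is needed after parsing.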

While that sounds easy enough, things get more complicated with the SRS tag. This tag is defined as an 'add' type, which means that data is accumulated through inheritance. There are three major issues here. First, you must maintain some sort of growing dynamic list of one or more SRS items for each layer. Second, each list is inherited by the children of the layer. And finally, get ready for this, there may be duplicate definitions due to the inheritance.

The following diagram explains the issue (through mock EPSG values).

The resulting SRS support for the layers will be:

The requirements we set in CarbonTools are that a layer object must be detachable (no pointer links to other layers) and optimized for performance. Our solution was to use hash tables to accumulate the collected SRS values. The reason we used hash tables is to provide a quick response to the SupportsSRS API command, which checks whether a specific SRS string is supported by the layer object. With hash tables, this check is a single lookup with no loops, i.e. O(1).
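
A comparable approach in Python uses the built-in hash-based set type, which removes duplicates and gives average constant-time membership tests. This is a sketch under my own assumptions, not the CarbonTools code: the layer names and EPSG codes are mock values, and the parser tolerates both one code per <SRS> element and a whitespace-separated list inside a single element.

```python
import xml.etree.ElementTree as ET


def collect_srs(layer, inherited=frozenset(), out=None):
    """Map each named layer to the full set of SRS codes it supports."""
    if out is None:
        out = {}
    own = set()
    for element in layer.findall("SRS"):
        own.update((element.text or "").split())
    # 'add' inheritance: union of inherited and own codes, duplicates merged.
    supported = frozenset(inherited | own)
    name = layer.findtext("Name")
    if name:
        out[name] = supported
    for child in layer.findall("Layer"):
        collect_srs(child, supported, out)
    return out


LAYERS = ET.fromstring(
    "<Layer><Name>root</Name><SRS>EPSG:1</SRS><SRS>EPSG:2</SRS>"
    "<Layer><Name>roads</Name><SRS>EPSG:2 EPSG:3</SRS></Layer>"
    "<Layer><Name>rivers</Name></Layer>"
    "</Layer>"
)

srs_by_layer = collect_srs(LAYERS)


def supports_srs(layer_name, code):
    """Hash lookup, no loop over the layer's SRS list."""
    return code in srs_by_layer[layer_name]


print(sorted(srs_by_layer["roads"]))
print(supports_srs("rivers", "EPSG:3"))
```

Each layer ends up with its own immutable set, so a layer can be detached from the tree without losing inherited values.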

There are numerous software solutions to this particular problem. When designing the application, I suggest thoroughly considering its requirements and weighing speed against memory needs. Some vendors include a very long list of SRS tags in their capabilities file, while others have only a few. A general solution should be ready to handle all cases quickly and efficiently.

Bounding boxes

The WMS specs define two elements for bounding boxes: LatLonBoundingBox and BoundingBox. The former is a mandatory element for each layer that describes the boundary in lat-long coordinates (EPSG:4326). The BoundingBox tags describe one or more regions, defined in the layer itself or through its parent layers (see Information inheritance). The WFS specs, however, define only one tag: LatLongBoundingBox.

The exact usage of these tags is described in the specs. Nevertheless, I would like to point out the following subtleties that have to be addressed when dealing with the capabilities.

Notice the different spelling of the WMS LatLonBoundingBox and the WFS LatLongBoundingBox. It is easy to confuse the two; in fact, vendors (especially newcomers to OGC) sometimes confuse them in the capabilities they publish.

The WFS LatLongBoundingBox represents a bounding box in the currently used SRS, which is not necessarily lat-long (EPSG:4326). This is a major source of confusion for both vendors and end users. One can only assume that the original intent was to keep WFS simple and use only lat-long values, but vendors could not handle such a requirement. Who knows why they kept the name. In any case, this tag's value is correlated with a local SRS tag.

Another difference between the WMS and the WFS is that the WMS can have only one BoundingBox tag per SRS. This is not the case for WFS, as the specs do not impose such a limitation, thus allowing multiple regions to be published under the same projection.
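
A tolerant reader can accept either spelling and, for WFS, keep the box together with the SRS it is expressed in. The Python sketch below assumes un-namespaced elements with minx, miny, maxx and maxy attributes and an SRS child element on the feature type. The function name and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Accept the WMS spelling, the WFS spelling, and vendors who mix them up.
BBOX_TAGS = ("LatLonBoundingBox", "LatLongBoundingBox")


def read_bbox(element):
    """Return (minx, miny, maxx, maxy) from either spelling, or None."""
    for tag in BBOX_TAGS:
        box = element.find(tag)
        if box is not None:
            return tuple(float(box.get(k)) for k in ("minx", "miny", "maxx", "maxy"))
    return None


wms_layer = ET.fromstring(
    '<Layer><LatLonBoundingBox minx="-180" miny="-90" maxx="180" maxy="90"/></Layer>'
)
wfs_type = ET.fromstring(
    "<FeatureType><SRS>EPSG:26986</SRS>"
    '<LatLongBoundingBox minx="33000" miny="777000" maxx="330000" maxy="959000"/>'
    "</FeatureType>"
)

print(read_bbox(wms_layer))
# For WFS, the box is only meaningful together with the feature type's SRS.
print(wfs_type.findtext("SRS"), read_bbox(wfs_type))
```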

Namespaces

While namespaces have a very important role in XML, vendors sometimes tend to abuse or misuse them in their WFS. In addition to the common ogc (xmlns:ogc="http://www.opengis.net/ogc") and wfs (xmlns:wfs="http://www.opengis.net/wfs") namespaces, usually defined in a WFS capabilities file, the specs allow the use of custom namespaces (e.g. xmlns:myns="http://www.someserver.com/").

When analyzing the capabilities file, it is important to parse the namespace definitions used in the layer. This namespace has a mandatory role when using the layer name in queries. For example, if a layer name is 'myns:somelayer', a GetFeature query must include the definition of the 'myns' namespace, and sometimes the only way to know that definition is through the capabilities.

In the case a FeatureType is described as follows:

The corresponding GetFeature query may look like this:
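
As a hedged illustration, here is a Python sketch that builds a minimal GetFeature request body for the 'myns:somelayer' example, declaring the prefix found in the capabilities. It puts the WFS namespace in the default namespace so that it cannot clash with a vendor-chosen prefix. The function name is hypothetical, and a real request would usually carry more attributes and a filter.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

WFS_URI = "http://www.opengis.net/wfs"


def build_get_feature(type_name, namespaces, version="1.0.0"):
    """Build a GetFeature body that declares the type name's prefix."""
    extra = ""
    if ":" in type_name:
        prefix = type_name.split(":", 1)[0]
        # A KeyError here means the capabilities never declared the prefix.
        extra = f" xmlns:{prefix}={quoteattr(namespaces[prefix])}"
    return (
        f"<GetFeature xmlns={quoteattr(WFS_URI)}{extra} "
        f'service="WFS" version={quoteattr(version)}>'
        f"<Query typeName={quoteattr(type_name)}/>"
        "</GetFeature>"
    )


# Prefix map as it would be collected from the capabilities document.
declared = {"myns": "http://www.someserver.com/"}
request_body = build_get_feature("myns:somelayer", declared)
print(request_body)

# Sanity check: the result is well-formed XML in the WFS namespace.
parsed = ET.fromstring(request_body)
```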

To complicate things, different vendors place the namespace definition in different places. Here are a couple of examples we encountered.

Namespace defined in the WFS_Capabilities root tag.

Namespace defined in the FeatureType definition.

Another important point I must make is not to rely on the wfs and ogc prefixes as fixed names, as some vendors change these names and even use these prefixes for their own namespaces. For example, a capabilities file may define 'xmlns:ogcwfs="http://www.opengis.net/wfs"' and 'xmlns:wfs="http://www.anotherserver.net/wfs"'. In this case, if you look for <wfs:Operations>, for example, you will get an XML error.
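
Both problems (declarations scattered across the document and unreliable prefixes) can be handled by collecting every declaration and matching elements by namespace URI rather than by prefix. The Python sketch below uses the standard library's ElementTree. The sample document is invented from the examples above, and the function name is hypothetical.

```python
import io
import xml.etree.ElementTree as ET

WFS_URI = "http://www.opengis.net/wfs"

CAPS = (
    '<ogcwfs:WFS_Capabilities version="1.0.0" '
    'xmlns:ogcwfs="http://www.opengis.net/wfs" '
    'xmlns:wfs="http://www.anotherserver.net/wfs">'
    "<ogcwfs:FeatureTypeList>"
    '<ogcwfs:FeatureType xmlns:myns="http://www.someserver.com/">'
    "<ogcwfs:Name>myns:somelayer</ogcwfs:Name>"
    "</ogcwfs:FeatureType>"
    "</ogcwfs:FeatureTypeList>"
    "</ogcwfs:WFS_Capabilities>"
)


def collect_namespaces(xml_text):
    """Return {prefix: uri} for every xmlns declaration, wherever it appears.

    If a prefix is declared more than once, the last declaration wins.
    """
    namespaces = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


namespaces = collect_namespaces(CAPS)

# Match by URI: ElementTree expands tags to '{uri}local', so the vendor's
# choice of prefix ('ogcwfs' here) does not matter.
root = ET.fromstring(CAPS)
type_names = [el.text for el in root.iter(f"{{{WFS_URI}}}Name")]

for type_name in type_names:
    prefix = type_name.split(":", 1)[0]
    print(type_name, "->", namespaces.get(prefix))
```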

Online resource

The online resource redirection is sometimes a neglected feature. Programmers who are used to working with services that have a single fixed URL may be tempted to use that URL for all their queries. This is a mistake, as some vendors split their queries among different CGI or ASP services, or even different servers, thus creating several addresses for different operations within the same service.

For example, a capabilities file can be read from www.someserver.com/wms/capabilities while the GetFeatureInfo query must be sent to www.someserver.com/wms/features. This is a common scenario, and the only way to ensure correct redirection is to analyze the online resource element in the capabilities file.
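
A per-operation lookup table built once from the capabilities makes this redirection hard to forget. The Python sketch below assumes a WMS 1.1.x-style document, where each operation under <Request> carries a DCPType/HTTP/Get/OnlineResource element with an xlink:href attribute, and uses GetFeatureInfo as the example operation. The function names and the fallback behavior are my own choices.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

CAPS = """
<WMT_MS_Capabilities version="1.1.1" xmlns:xlink="http://www.w3.org/1999/xlink">
  <Capability><Request>
    <GetCapabilities><DCPType><HTTP><Get>
      <OnlineResource xlink:href="http://www.someserver.com/wms/capabilities"/>
    </Get></HTTP></DCPType></GetCapabilities>
    <GetFeatureInfo><DCPType><HTTP><Get>
      <OnlineResource xlink:href="http://www.someserver.com/wms/features"/>
    </Get></HTTP></DCPType></GetFeatureInfo>
  </Request></Capability>
</WMT_MS_Capabilities>
""".strip()


def operation_urls(root, method="Get"):
    """Return {operation name: URL} for every operation that declares one."""
    urls = {}
    request = root.find("Capability/Request")
    if request is None:
        return urls
    for operation in request:
        resource = operation.find(f"DCPType/HTTP/{method}/OnlineResource")
        if resource is not None and resource.get(XLINK_HREF):
            urls[operation.tag] = resource.get(XLINK_HREF)
    return urls


def url_for(urls, operation, fallback):
    """Use the advertised URL, falling back to the capabilities URL."""
    return urls.get(operation, fallback)


urls = operation_urls(ET.fromstring(CAPS))
base = "http://www.someserver.com/wms/capabilities"
print(url_for(urls, "GetFeatureInfo", base))
print(url_for(urls, "GetMap", base))
```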