Informix Spatial Data Technology: Update and Positioning

Directions Magazine's Joe Francica discussed the new geospatial features included in IBM's Informix Dynamic Server (IDS) with IBM's Bernie Spang, director of Data Servers. IDS 11 came out earlier this summer. From the IBM website's page on IDS 11: "The new Web Feature Service DataBlade module implements an Open Geospatial Consortium(R) Web Feature Service (OGC WFS) in IDS to act as a presentation layer for the Spatial and Geodetic DataBlade modules. The OGC WFS interface allows requests for geographical features across the web using platform-independent calls. The XML-based Geography Markup Language (GML) is used as the encoding for transporting the geographic features."
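
As an editorial illustration of the "platform-independent calls" mentioned in the quote, the following Python sketch builds a WFS GetFeature request as a key-value URL and pulls point coordinates out of a GML response. The parameter names (service, version, request, typeName, bbox) come from the OGC WFS key-value encoding; the endpoint URL, the feature type name "roads", the sample GML document and the helper names are hypothetical, and nothing here is specific to the IDS Web Feature Service DataBlade.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

def build_getfeature_url(endpoint, type_name, bbox=None, version="1.1.0"):
    """Build a key-value-pair WFS GetFeature URL (endpoint is an assumption)."""
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        "typeName": type_name,
    }
    if bbox is not None:
        # bbox is (minx, miny, maxx, maxy), sent as a comma-separated list
        params["bbox"] = ",".join(str(v) for v in bbox)
    return endpoint + "?" + urlencode(params)

def extract_points(gml_text):
    """Return (x, y) tuples for every gml:pos element in a GML document."""
    root = ET.fromstring(gml_text)
    points = []
    for pos in root.iter("{%s}pos" % GML_NS):
        x, y = (float(v) for v in pos.text.split()[:2])
        points.append((x, y))
    return points

# Hand-written sample response, not output captured from a real server.
SAMPLE_GML = (
    '<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs" '
    'xmlns:gml="http://www.opengis.net/gml">'
    "<gml:featureMember><gml:Point>"
    "<gml:pos>-105.0 39.7</gml:pos>"
    "</gml:Point></gml:featureMember>"
    "</wfs:FeatureCollection>"
)

if __name__ == "__main__":
    # Hypothetical endpoint and feature type, for illustration only.
    print(build_getfeature_url("http://example.com/wfs", "roads",
                               bbox=(-106.0, 39.0, -104.0, 40.0)))
    print(extract_points(SAMPLE_GML))
```

Because the request is a plain HTTP URL and the response is XML, any client platform with an HTTP library and an XML parser can consume the service. The axis order of the coordinates in a real response depends on the coordinate reference system the server advertises.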

Joe Francica (JF): Will there be a map client offered with this version of IDS?

Bernie Spang (BS): No, there will be no map client offered with this version of IDS. We have always partnered with other vendors to provide this functionality, and some customers develop their own interface within their custom applications.

JF: Is the DataBlade technology, either Geodetic or Spatial, offered as a module or for an extra fee? What is the cost or how is the cost determined?

BS: Spatial is offered as a no-cost option to IDS 11 and is available for download on ibm.com. Geodetic is currently offered at a list price of $50K per CPU. The Geodetic DataBlade uses a round-earth model and provides additional value for customers who need very accurate representations of latitude and longitude, with calculations that account for the curvature of the earth.
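
To illustrate why a round-earth model matters, the sketch below compares a great-circle distance (the haversine formula on a sphere) with a naive planar calculation that treats degrees of latitude and longitude as flat Cartesian units. This is an editorial example under simplifying assumptions: the sphere with a 6371 km mean radius and both function names are chosen here for illustration, and the interview does not describe the model or formulas the Geodetic DataBlade actually uses.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two latitude/longitude points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def planar_km(lat1, lon1, lat2, lon2):
    """Flat-earth distance: degrees treated as Cartesian units, then scaled."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)

if __name__ == "__main__":
    cases = {
        "1 deg of longitude at the equator": (0, 0, 0, 1),
        "1 deg of longitude at 60 N": (60, 0, 60, 1),
        "across the 180th meridian": (0, 179, 0, -179),
    }
    for label, pts in cases.items():
        print(label, round(great_circle_km(*pts), 1), round(planar_km(*pts), 1))
```

The two methods agree at the equator, but the planar figure is roughly double the great-circle figure at 60 degrees north, and it is wildly wrong for two points on either side of the 180th meridian, where the flat calculation measures the long way around.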

JF: Are there plans to merge or migrate the technology available as the Informix DataBlades with DB2 extensions? Or, are the plans to keep these two database options separate?

BS: These will remain separate products. DB2's Spatial Extender and Geodetic Data Management Feature (an add-on to the Spatial Extender) are built on the same technology as the corresponding Informix DataBlades. While there are differences, including some significant ones, the similarities are great.

JF: What are the differentiating features between IDS Spatial DataBlades and the DB2 Spatial Extender?

BS: The Spatial DataBlade includes a true spatial aggregate and a high-performance, easy-to-use, R-tree-supported Nearest Neighbor predicate. (The OGC, in its publication of OpenGIS Features for ODBC (SQL) Implementation Specification, selected the term geometry to represent spatial features such as point locations and polygons. Typically, points represent an object at a single location, linestrings represent a linear characteristic, and polygons represent a spatial extent. An abstract definition of the OpenGIS noun geometry might be, "a point or aggregate of points symbolizing a feature on the ground.") In reality, these are minor functional differences; both are fully compliant with the OGC Simple Features Specification for SQL (with Types and Functions) and both internally use spatial technology from ESRI. The real differences stem from the differences between the DB2 and IDS database products [discussed below].

JF: You mention that the R-tree data structure is faster than the quad-tree database structures. Do you have benchmarks to back this up?

BS: As a five-year-old scientific paper by Oracle engineers showed, the difference between R-tree and quad-tree is not so much that the R-tree is inherently faster, but that the quad-tree takes a great deal more tuning, which makes it less suitable for a general, spatially enabled database. While quad-trees, R-trees and grids have their inherent strengths and weaknesses, the specific implementation matters a great deal more. Oracle and Informix both sport R-tree support, but while Informix's is implemented using the same low-level code libraries as the built-in access methods, and integrated at the kernel level through targeted extensibility interfaces, Oracle had to emulate the index structure using SQL-level tables and recursive queries. This has deleterious effects on performance (a much higher-level code path) and concurrency (a row in the index-emulation table corresponds to a node or page in the index structure, causing row-level locking effectively to turn into page-level locking for the index).
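
For readers unfamiliar with the structure under discussion, the following Python sketch shows the core idea of an R-tree: geometries are indexed by their minimum bounding rectangles, which are grouped into a hierarchy so that a search can skip whole subtrees whose rectangle does not intersect the query. It is a toy, bulk-loaded, in-memory version written for illustration. It makes no claim about how Informix or Oracle implement their indexes, and all names in it are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def intersects(a: Box, b: Box) -> bool:
    """True when two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def union(boxes: List[Box]) -> Box:
    """Smallest rectangle that encloses every box in the list."""
    xmin, ymin, xmax, ymax = zip(*boxes)
    return (min(xmin), min(ymin), max(xmax), max(ymax))

@dataclass
class Node:
    mbr: Box                                   # minimum bounding rectangle
    children: List["Node"] = field(default_factory=list)
    item: object = None                        # payload, leaf entries only

def build(entries, fanout=4):
    """Bulk-load a tree from a non-empty list of (item, box) pairs.

    Entries are sorted by xmin and packed into groups of `fanout`; the
    grouping is repeated level by level until a single root remains.
    """
    level = [Node(mbr=box, item=item)
             for item, box in sorted(entries, key=lambda e: e[1][0])]
    while len(level) > 1:
        level = [Node(mbr=union([n.mbr for n in level[i:i + fanout]]),
                      children=level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]

def search(node, query, visited=None):
    """Return items whose box intersects `query`, pruning by rectangle."""
    if visited is not None:
        visited.append(node)       # record nodes examined, for inspection
    if not intersects(node.mbr, query):
        return []                  # the whole subtree is skipped
    if not node.children:
        return [node.item]
    hits = []
    for child in node.children:
        hits.extend(search(child, query, visited))
    return hits

if __name__ == "__main__":
    entries = [(i, (float(i), 0.0, i + 0.5, 1.0)) for i in range(16)]
    root = build(entries)
    seen = []
    print(sorted(search(root, (2.1, 0.0, 3.2, 1.0), seen)), len(seen))
```

In this toy version the only parameter is the fanout, and the search examines fewer nodes than the tree contains because non-intersecting subtrees are never entered. A production index adds insertion, node splitting, on-disk paging and locking, which is where the implementation differences described in the answer above arise.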

JF: Have you benchmarked Informix Spatial DataBlade vs. Oracle Spatial?

BS: Oracle prohibits its customers from publishing any comparative benchmark results, and there are no standard, public spatial benchmarks.

JF: What are the capabilities for user-defined functions of IDS?

BS: User-defined routines can provide support for new data types, new access methods (via VII, the Virtual Index Interface), new table types (via VTI, the Virtual Table Interface), comparison operations and aggregate functions. They can be written in C, Java or SPL (the Stored Procedure Language used with IDS). User-defined routines can allocate and manage database memory, as well as attach to memory outside the IDS engine. A rich, well-documented and stable API gives user-defined routines written in C (as in the Spatial and Geodetic DataBlades) enormous power, and helps guarantee that any given DataBlade keeps working as the server itself evolves.
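
As a rough picture of what a user-defined aggregate involves, the Python sketch below splits a spatial "extent" aggregate (the bounding rectangle of all input rectangles) into four small functions: initialize, iterate, combine and finalize. This split is a common pattern for extensible aggregates and is assumed here for illustration; the interview does not show how IDS registers or invokes aggregate routines, and every name in the example is hypothetical.

```python
from functools import reduce

def merge(a, b):
    """Bounding rectangle of two (xmin, ymin, xmax, ymax) rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def extent_init():
    """Starting state: no rectangles seen yet."""
    return None

def extent_iter(state, box):
    """Fold one input row into the running state."""
    return box if state is None else merge(state, box)

def extent_combine(left, right):
    """Merge two partial states, e.g. from rows scanned in parallel."""
    if left is None:
        return right
    if right is None:
        return left
    return merge(left, right)

def extent_final(state):
    """Convert the final state into the aggregate's result."""
    return state

def run_aggregate(partitions):
    """Drive the four steps over rows that arrive in separate partitions."""
    partials = []
    for rows in partitions:
        state = extent_init()
        for box in rows:
            state = extent_iter(state, box)
        partials.append(state)
    return extent_final(reduce(extent_combine, partials, extent_init()))

if __name__ == "__main__":
    boxes = [(0, 0, 1, 1), (2, -1, 3, 0.5), (-4, 2, -3, 5)]
    print(run_aggregate([boxes]))
    print(run_aggregate([boxes[:1], boxes[1:]]))
```

The combine step is what lets an engine scan partitions independently and still produce the same answer as a single pass, which is why the two calls in the usage example return the same rectangle.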

JF: Are the features of the IDS Spatial DataBlade targeted at any particular industry?

BS: The IDS Spatial DataBlade is targeted at traditional GIS applications. It does not provide any special support for topology applications or telecommunications networks. In addition, some enhancements have been applied to prepare the product for the expected onslaught of streaming spatial location data (tracking moving objects) in support of location-based services. This, too, is not specific to any particular industry, but is thought to have the greatest potential in the telecom, retail and government/military sectors.

JF: Can you provide a detailed explanation of the Index Advisor and how a user would go about tuning a typical IDS spatial database?

BS: The Index Advisor is a DB2 feature. IDS, in particular the upcoming IDS 11 release, does include great support for analyzing the performance and load characteristics of a given database, including identifying queries that can benefit from additional indexes, but these capabilities are part of a more general vocabulary of administrative interfaces and are not called out as a separate "advisor." For spatial indexes specifically, DB2 has a Spatial Index Advisor to help configure the Grid index. The Informix R-tree, however, requires no configuration (other than the usual storage specifications that are available for all tables and indexes) and has no tuning "knobs." IDS generally requires much less tuning than other database servers (this is what makes it so powerful as a zero-administration, "hidden" database integrated with a packaged solution), and this is true for the spatial capabilities as well. Other than the resources required for the increased data volumes and transfers caused by spatial data's bulk, there is no difference in tuning practices between a spatial database and a non-spatial one.