The Road Taken: The Evolution of GIS Data Clearinghouses From FTP to Map Services
At the 2005 American Library Association Conference
in Chicago, I sat in on a session focusing on GIS data and data
clearinghouses. One presenter used the phrase "the road not taken" to
describe the path he had decided, obviously, not to take in the
development of their clearinghouse.
The presentation made me contemplate the road taken by clearinghouses
in general, and in particular, the road taken by the one I'm involved
in here in Pennsylvania, Pennsylvania Spatial Data Access (PASDA). What has changed, what's
new, and what is on the horizon? But first, a word about PASDA.
PASDA is Pennsylvania's official public access
geospatial information clearinghouse. It was developed as a service to
the citizens, governments and businesses of the Commonwealth. PASDA is
a cooperative project of the Governor's Office of Administration,
Office for Information Technology, Bureau of Geospatial Technologies
and Penn State Institutes for the Environment of the Pennsylvania
State University. Funding and support are provided by the Pennsylvania
Office for Information Technology, Bureau of Geospatial Technologies.
Figure 1. The Pennsylvania Spatial Data Access Clearinghouse.
Background and Basis for the Clearinghouse Movement
The impetus behind the clearinghouse movement, and still a driving
force in it, is the Federal Geographic Data Committee (FGDC), which helped push the National
Spatial Data Infrastructure (NSDI) forward. In addition to promoting and supporting
the development of the clearinghouse concept, the FGDC and its staff
have been brave enough to tackle the most dreaded of all subjects to a
data creator - metadata. Metadata is the NSDI's backbone, as it is for
clearinghouses everywhere. Without the FGDC's consistent
efforts, including its early funding of clearinghouse and metadata
development, many clearinghouses would not exist.
Figure 2. The Federal Geographic Data Committee.
The FGDC continues to foster clearinghouse development and
has also, in recent years, supported Web-based mapping initiatives such
as those promoted by the Open Geospatial Consortium. The US Department of the Interior and the
FGDC have also been major proponents and partners in Geospatial
One-Stop - the latest cooperative initiative
involving many Federal, state and non-governmental partners.
Figure 3. The Geospatial One-Stop Portal.
Like many of the clearinghouses in the US, PASDA began in 1996 as a
simple FTP site with some GIS data and corresponding metadata records.
PASDA was originally funded by the PA Department of Environmental
Protection and is currently funded by the Pennsylvania Bureau of
Geospatial Technologies. The clearinghouse is housed at the Pennsylvania
State University Institutes of the Environment. Initially, PASDA
was connected to the NSDI
and searchable through the Z39.50 protocol and Isite software - an open
source program acquired from the FGDC. Within three years, Pennsylvania was
one of about 25 data clearinghouses to be searchable through the NSDI -
others included New York, Texas and Arkansas.
Figure 4. The New York State GIS Data Clearinghouse.
The Geography Network and Web GIS
As the clearinghouse movement grew, a new paradigm arose to reflect the
changing technology and needs of users. It was also at this time that
the horizon and benchmarks for clearinghouses began to change. As GIS
software became more widely used and users became more proficient,
clearinghouses began to expand their services beyond simple data
downloads. Web GIS, as it was called, began to drive clearinghouses to
push the envelope of access and technology. One of the first major
partnerships to promote Web-based GIS or map services was the Geography
Network (GN) developed by ESRI. The GN set the tone for cooperative efforts
across the country (including the current Geospatial One-Stop
initiative). In 2000, PASDA was one of the first organizations (others
included the Texas GIS data clearinghouse and National Geographic) to
be part of ESRI's GN.
Figure 5. The Geography Network as it looks today.
Another clearinghouse which embraced Web GIS in its early stages is the
Delaware GIS Data Clearinghouse - the home of the Delaware DataMIL.
Figure 6. The Delaware DataMIL Map Production Laboratory.
The evolution from FTP sites to map services has been fast and furious
- most, if not all, of the early participants in the NSDI have taken
this road and embraced new services and technology. PASDA is no
exception.
PASDA Background
Over the past six years, the growth of PASDA - both data and services -
has been exponential. From the original 35 data sets and metadata
records, PASDA has expanded to include thousands of data sets and
evolved from an FTP site to a User Centered Interface (UCI), accessing
data via database queries and state-of-the-art map and Web services.
The data made available through PASDA are provided by data partners to
encourage the widespread sharing of geospatial data, eliminate the
creation of redundant data sets, and further build an inventory
(through development and hosting of metadata) of available data
relevant to the Commonwealth. PASDA serves as the public resource for
locating data throughout the Commonwealth through its data storage,
interactive mapping/Web GIS applications, and metadata/documentation
efforts.
PASDA services are provided free of charge to all users. The data on
PASDA are provided by federal, state, local and regional government
agencies, non-profit organizations, and academic institutions
throughout the region.
PASDA Partners
So much of the focus in GIS is on the technology - software, hardware,
databases and functionality - that the human factor is often
forgotten. But no article on clearinghouses would be complete without a
discussion of the importance of partnerships and positive relationships
with data creators and developers. A clearinghouse's existence and its
ability to succeed rest squarely on
the shoulders of its data partners and funders. The PASDA model is to
provide data free to anyone with an Internet connection; the New York
model, by contrast, is a data cooperative requiring membership to
access data. PASDA has no data licensing agreements, membership
or login requirements.
To participate and share data with a clearinghouse takes time and
effort on the part of the data creator - even with support from
clearinghouse staff to create metadata and transfer data. This time is
well spent because it benefits hundreds of thousands of users every
year.
Developing a data partnership involves more than simply acquiring data.
In order to keep relationships strong and growing over time there must
be trust among the stakeholders, and responsiveness from the
clearinghouse.
Longtime PASDA state data partners include the PA Department of
Environmental Protection (DEP), the PA Department of Conservation and
Natural Resources (DCNR), the DCNR Bureau of Topographic and Geologic
Survey (PAGS) and the PA Department of Transportation (PennDOT).
Other notable partners include the City of Philadelphia, Lancaster
County and Chester County. (A full list of PASDA data partners is
available here.)
PASDA offers many services to the Commonwealth. Maintaining metadata is
one of the primary services provided by clearinghouse staff. Metadata
is the basis for all of the capabilities of the clearinghouse - from
searching to data customization. PASDA staff provides free metadata
training and development for all of the data partners.
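To make the role of metadata in searching concrete, here is a minimal sketch of how a record could be matched against a keyword and an area of interest. It assumes records follow the FGDC metadata standard's XML layout (title, abstract and bounding coordinates); the sample record, the function names and the matching rule are hypothetical illustrations, not PASDA's actual search code.

```python
import xml.etree.ElementTree as ET

# Hypothetical record using FGDC-style element names (illustrative values only).
SAMPLE_RECORD = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Hypothetical county streams</title></citeinfo></citation>
    <descript><abstract>Illustrative stream centerlines for one county.</abstract></descript>
    <spdom><bounding>
      <westbc>-78.4</westbc><eastbc>-77.1</eastbc>
      <northbc>41.3</northbc><southbc>40.7</southbc>
    </bounding></spdom>
  </idinfo>
</metadata>"""


def parse_record(xml_text):
    """Pull the searchable fields out of one metadata record."""
    root = ET.fromstring(xml_text)
    bounds = root.find("idinfo/spdom/bounding")
    return {
        "title": root.findtext("idinfo/citation/citeinfo/title", default=""),
        "abstract": root.findtext("idinfo/descript/abstract", default=""),
        # Stored as (west, south, east, north) in decimal degrees.
        "bbox": tuple(
            float(bounds.findtext(tag))
            for tag in ("westbc", "southbc", "eastbc", "northbc")
        ),
    }


def matches(record, keyword, search_bbox):
    """True if the keyword appears in the text and the extents overlap."""
    text = (record["title"] + " " + record["abstract"]).lower()
    west, south, east, north = record["bbox"]
    s_west, s_south, s_east, s_north = search_bbox
    overlaps = (
        west <= s_east and east >= s_west
        and south <= s_north and north >= s_south
    )
    return keyword.lower() in text and overlaps
```

A search for "streams" over a box that overlaps the record's extent would return it, while the same keyword over a box elsewhere in the state would not. This is why complete, accurate metadata matters: a data set with a missing abstract or bounding box is effectively invisible to this kind of query.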
Data Customization and Web GIS
As PASDA evolved, like many of the state data clearinghouses, out of
the FTP mold and into the world of relational databases and Web GIS,
there were a number of significant challenges. Most important: what
services should be developed, and how? To answer these
questions, PASDA has undertaken a program of scheduled strategic
planning sessions. The first occurred in 2000/2001, and sessions have been held
approximately every two years since. The result
of these strategic planning sessions, which included input from users
and data partners throughout the state, was the development of
customization tools and Web mapping applications.
One of the first applications developed that moved beyond simple
display of data was the Pennsylvania Atlas.
Ryan Baxter, primary ArcIMS developer and technology coordinator for
PASDA, incorporated clip and reproject functions into the atlas.
Figure 7. The Pennsylvania Atlas.
As an alternative to using a map interface, the Data Wiz was developed.
David Walrath, the primary Web GIS programmer for PASDA, developed
download, clip and reproject capabilities using ESRI software. The Data
Wiz has since morphed into what is referred to as the UCI. The UCI,
which was made available to the public just this past July 1st, will be
the transition vehicle for PASDA to integrate data search, display and
customization capabilities. The UCI project involved all aspects and
talents of the PASDA staff - from the data manager and metadata
coordinator to the technology coordinator and the Web GIS programmers.
Figure 8. The PASDA User Centered Interface.
Data are loaded into ArcSDE and accessed via database queries. Users can
select data, which are then added to a virtual "Data Cart," and then, if
necessary, customize them: clip (to county, watershed, municipality, etc.),
reproject (users can select predefined or custom projections) and
download all of the items in the data cart as a single .zip file. This interface
saves users time and effort.
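The cart workflow described above (select, clip, reproject, download as one .zip) can be sketched in a few lines. This is only an illustration under simplifying assumptions: layers are lists of longitude/latitude points, clipping is done against a rectangle rather than a county or watershed boundary, and the only projection offered is spherical Web Mercator. The `Layer` and `DataCart` names are hypothetical, and PASDA's own implementation runs on ArcSDE and ESRI software, not on this code.

```python
import io
import math
import zipfile
from dataclasses import dataclass, field

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


@dataclass
class Layer:
    name: str
    points: list  # (lon, lat) pairs in decimal degrees


def clip_to_bbox(layer, bbox):
    """Keep only points inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    kept = [
        (lon, lat) for lon, lat in layer.points
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]
    return Layer(layer.name, kept)


def to_web_mercator(layer):
    """Reproject lon/lat degrees to spherical Web Mercator metres."""
    projected = []
    for lon, lat in layer.points:
        x = EARTH_RADIUS_M * math.radians(lon)
        y = EARTH_RADIUS_M * math.log(
            math.tan(math.pi / 4 + math.radians(lat) / 2)
        )
        projected.append((x, y))
    return Layer(layer.name, projected)


@dataclass
class DataCart:
    layers: list = field(default_factory=list)

    def add(self, layer):
        self.layers.append(layer)

    def download(self, bbox=None, reproject=False):
        """Return zip bytes holding one CSV per layer in the cart."""
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            for layer in self.layers:
                if bbox is not None:
                    layer = clip_to_bbox(layer, bbox)
                if reproject:
                    layer = to_web_mercator(layer)
                rows = ["x,y"] + [f"{x:.3f},{y:.3f}" for x, y in layer.points]
                archive.writestr(f"{layer.name}.csv", "\n".join(rows) + "\n")
        return buffer.getvalue()
```

The design point the sketch tries to capture is that customization happens on the server at download time: the user chooses options once, and every item in the cart comes back already clipped and projected in a single file.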
Figure 9. The PASDA Data Cart.
Several other specialized applications have been developed to help
users access and use some of the datasets available through PASDA.
The Census Mapper, developed by Ryan Baxter, allows users to select
Census 2000 data from SF1 or SF3, display those data, and clip and
download them. There are approximately 1 billion potential combinations
of data and geography available through the Census Mapper.
Figure 10. The PASDA Census Mapper.
Another issue that arose during the last five years was how to provide
access to Web GIS applications and services that are related to
Pennsylvania but were not developed or hosted by PASDA. PASDA serves as
the GIS data inventory source for the Commonwealth. Whether an
organization's data are hosted on PASDA or hosted locally, PASDA still
works to develop metadata for the data and integrate them into the
overall data inventory. This effort to capture and inventory the
existence of data should be maintained, even in a distributed Web GIS
environment, so that information is still easy to find through a
central portal. Ryan Baxter has observed, "potentially, as more
organizations become proficient with Web GIS, they are more likely to
serve out Web-based versions of their own data." To address this,
PennCat, the Pennsylvania Catalog of Internet Map Services and
Applications, was
developed to serve as a catalog of Web GIS applications and map
services hosted across the region, and to provide a seamless way to
integrate these data into desktop GIS applications.
Figure 11. The Pennsylvania Catalog of Internet Map Services and Applications.
Another primary goal of PennCat is to provide access to Web mapping
image services, such as a statewide mosaic of aerial photography and
USGS DRGs (digital raster graphics, scanned quad sheets). Through
these image services, users can access the data from their desktops via
the Internet, use them, and save them as part of their project -
without ever having to download the data. This saves time, money and
effort for users, and has become one of the most frequently used
services we have developed.
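To show what "accessing imagery without downloading the data" looks like from the client side, here is a minimal sketch of a map image request. It uses the Open Geospatial Consortium's Web Map Service (WMS 1.1.1) GetMap parameters as a stand-in; the PennCat services described here are ArcIMS-based, so this is an assumption chosen for illustration rather than PennCat's actual interface. The endpoint URL and layer name are hypothetical.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=800, height=600):
    """Build a WMS 1.1.1 GetMap URL for one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326 degrees.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/jpeg",
    }
    return base_url + "?" + urlencode(params)


url = build_getmap_url(
    "https://example.org/wms",   # hypothetical endpoint
    "statewide_aerials",         # hypothetical layer name
    (-80.6, 39.7, -74.6, 42.3),  # rough bounding box around Pennsylvania
)
```

The client asks only for a picture of the area currently on screen, so a statewide mosaic never has to be copied to the user's machine; a desktop GIS repeats a request like this each time the user pans or zooms.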
PASDA also works to support special initiatives by developing resources
and providing access to data that go with them. There are several
examples of this. The most recent is our work with the Bureau of
Geospatial Technologies, the Pennsylvania Emergency Management Agency
(PEMA) and PennDOT to develop flood
mapping applications. The first application developed was the Tropical
Depression Ivan Imagery Viewer.
Figure 12. The Tropical Depression Ivan Imagery Viewer.
Flooding is the primary natural disaster in Pennsylvania, and this project
enabled users to access aerial photography, view pre-flood and
post-flood imagery and also view on-site photos of flood damage. A
second, similar application focused on spring 2005 Delaware River
flooding.
Another project for which PASDA provides support - in terms of data
storage and metadata development - is the PAMAP program. PAMAP
is the Pennsylvania version of the USGS' National Map program. PAMAP is
a DCNR program run by the Bureau of Topographic and Geologic Survey in
partnership with the Governor's Office of Administration, the Office
for Information Technology, the Bureau of Geospatial Technologies, the
USGS and Pennsylvania county governments.
On the Horizon
Like other clearinghouses in the country, PASDA must continue to grow
and meet the needs of its users. With significant increases in the
volume of data available, particularly imagery, enhanced and new map
services and image services are on the immediate horizon. Some
clearinghouses are also developing Web services for users - address
geocoding/matching, for example. Most important for PASDA,
and a key component of any clearinghouse, metadata development will
continue to be the linchpin of our growth and success. Continued vision
and leadership from the state and input from users will help guide us
through the next few years and help direct the next segment of our
travels down the GIS clearinghouse road in Pennsylvania.