Making Sense of Census data: A case study from the Kingdom of Saudi Arabia

By Aymen A. Solyman

A census is a very important event in the history of a nation. The operation covers every bit of land and property in the rural and urban areas of the country. In the Kingdom of Saudi Arabia (KSA), the Central Department of Statistics (CDS) conducts a census every 10 years. The last census was conducted in 1413H (1992). Preparation for the next census started one and a half years ago, in 1423H (2003), and is expected to be finished by the end of 1425H (2005). Census information must be shared by all divisions within the organization to support mandated functions, and it must also be supplied to other ministries and organizations in KSA. The key factors in implementing an Enterprise GIS solution are cooperation, management, knowledge, funding and experience. This article describes a system developed for distributing KSA census data.

GIS and census data
When we look at the major forces of the 21st century, such as population growth, economic development and the consumption of natural resources, we see that all of them have a spatial component. Data produced by the census are a primary source of the information needed for effective development, planning and monitoring of population, services distribution and socio-economics.

The census database contains a large amount of attribute information that can be linked to spatial data by spatial referencing. Relating the spatial component to the non-spatial attributes of existing organizational data enhances users' understanding, gives new insights into patterns and relationships in the data that would otherwise not be found, and improves the decision-making process. Because many potential users of spatial information are not familiar with GIS technology, the development of an Enterprise GIS solution would serve two crucial purposes. First, it would allow users to operate the system without having to know the underlying intricacies of GIS and RDBMS technology. Second, it would allow information and technical experience to be shared among a wide range of users.
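
As a minimal sketch of this kind of linking, assume each province polygon and each census record carry a shared province code. The field names, codes and figures below are hypothetical placeholders, not values from the KSA census.

```python
def join_attributes(features, records, key="province_code"):
    """Attach census attributes to the spatial features that share the same key."""
    by_key = {record[key]: record for record in records}
    joined = []
    for feature in features:
        attributes = by_key.get(feature[key], {})
        merged = dict(feature)
        # Copy every attribute except the join key, which the feature already has.
        merged.update({k: v for k, v in attributes.items() if k != key})
        joined.append(merged)
    return joined


# Hypothetical spatial layer: each feature has a code and a simple polygon ring.
features = [
    {"province_code": "P01", "geometry": [(0, 0), (0, 1), (1, 1), (1, 0)]},
    {"province_code": "P02", "geometry": [(1, 0), (1, 1), (2, 1), (2, 0)]},
]
# Hypothetical attribute table keyed by the same code.
records = [
    {"province_code": "P01", "total_population": 1000},
    {"province_code": "P02", "total_population": 2500},
]

joined = join_attributes(features, records)
```

Each joined feature now carries both its geometry and its census attributes, which is what allows a query result to be drawn as a map.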

Enterprise GIS
Over the past few years, the terms Enterprise GIS and Enterprise-wide GIS have become more common in the GIS community, in organizational strategic plans and in requests for services. Enterprise GIS is not a method of providing automated map plotting capabilities to the entire organization. It is the sum of a coordinated GIS effort and its federated systems working together to support and promote coordinated geospatial data development and access across jurisdictional and organizational boundaries.

The vision and promise of Enterprise GIS is its ability to be an essential or core technology that integrates information across the many "islands of computer information systems" that exist in any organization.

The critical capabilities associated with Enterprise GIS include:
  • integrating geographic data across multiple departments and serving the entire organization;
  • providing desktop access via LAN and WAN connections to anyone who needs access, i.e., Enterprise GIS fully supports client-server operability;
  • providing access to other information systems in the organization by using a map or application as the integrator of the organization's information; and
  • communicating with other entities outside of the enterprise itself.
System Architecture Design
When designing any Enterprise GIS solution around the client's requirements, the most important factors to consider are the functional modules, the graphical user interface, download time, system performance and the cost of the solution.

The system must be portable and extensible to accommodate future changes in hardware, software and networking. With these factors in view, the solution KSA has developed is based on two components: the server side and the web client side, which runs in the browser. The server is a set of applications that serve the requests made by a client. It is divided into two sections: the map server, which creates maps from the spatial database based on client requests and sends them back to the client, and the data server, which manages tabular data on the server side and sends the information to the client. The web client is partitioned into two parts: the map console (map area, map tools, dynamic legend and query items) and the side information table.
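
The division of work between the two server sections can be sketched as follows. This is only an illustration of the request flow described above; the class names, the request shape and the sample figures are assumptions, and the map server is reduced to a placeholder that names the image it would produce.

```python
from dataclasses import dataclass


@dataclass
class ClientRequest:
    """A query sent by the web client (hypothetical shape)."""
    indicator: str


class DataServer:
    """Manages tabular data and returns the values for a requested indicator."""

    def __init__(self, table):
        self.table = table  # {province: {indicator: value}}

    def query(self, indicator):
        return {province: row[indicator] for province, row in self.table.items()}


class MapServer:
    """Stands in for the component that draws maps from the spatial database."""

    def render(self, values):
        # A real map server would rasterize the province polygons; this
        # placeholder only returns the name of the image it would produce.
        return "map_{}_provinces.png".format(len(values))


def handle_request(request, map_server, data_server):
    """Serve one client request by combining both server sections."""
    values = data_server.query(request.indicator)
    return {"map_image": map_server.render(values), "table": values}


table = {
    "Province A": {"total_population": 1000},
    "Province B": {"total_population": 2500},
}
response = handle_request(
    ClientRequest("total_population"), MapServer(), DataServer(table)
)
```

The response bundles a map for the map console and the values for the side information table, mirroring the two parts of the web client.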

In this solution, the client is a thin client, i.e., the analysis procedures are processed on demand on the server side. Data transmitted to the web client are in standard HTML format with embedded 8-bit PNG (Portable Network Graphics) images (35K-45K) that can be viewed in any standard web browser. The advantage of this approach is that there is no need to download any plug-ins on the client side.
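
To illustrate what a thin-client response might look like, the sketch below assembles a plain HTML page that references a server-rendered PNG and lists the query values in a table. The function name, the image URL and the sample row are hypothetical; the article does not describe the actual page markup.

```python
from html import escape


def build_page(title, map_url, rows):
    """Return a plain HTML page with a map image and an information table."""
    table_rows = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(escape(str(name)), escape(str(value)))
        for name, value in rows
    )
    return (
        "<html><head><title>{title}</title></head><body>"
        '<img src="{src}" alt="{title}">'
        "<table>{rows}</table>"
        "</body></html>"
    ).format(title=escape(title), src=escape(map_url), rows=table_rows)


# The browser needs nothing beyond HTML and PNG support to display this page.
page = build_page("Total population", "/maps/result.png", [("Province A", 1000)])
```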

The performance of the system is measured by its ability to respond to requests quickly and by its reliability. Typically, the response to a query must not take longer than 25 seconds. The overall system performance depends on the combination of client, server and networking performance, not on the individual components, and is controlled by the weakest component within the total solution.
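
A simple way to reason about this budget is to add up the time spent in each component and identify the slowest one. The sketch below assumes the component times simply add together, which is a simplification, and the timings are invented for illustration; only the 25-second limit comes from the text above.

```python
MAX_RESPONSE_SECONDS = 25.0  # response limit stated in the article


def check_response(timings):
    """Return (total time, slowest component, whether the budget is met)."""
    total = sum(timings.values())
    bottleneck = max(timings, key=timings.get)
    return total, bottleneck, total <= MAX_RESPONSE_SECONDS


# Hypothetical timings in seconds for one query.
timings = {"client": 1.5, "network": 6.0, "server": 12.0}
total, bottleneck, within_budget = check_response(timings)
```

In this made-up case the server is the weakest component, so tuning the client or the network alone would do little for the overall response time.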

In this phase of the project, a single server machine hosts the web server, map server and data server, which are installed together. In phase two of the project, after the completion of the census data project, it is planned that the web server, application server and data server will run on separate machines.

The diagram below illustrates a general view of the system design architecture.


Figure 1. General view of the system design architecture.


Database Design
The database design is the foundation for building any Enterprise GIS solution. The design must be thorough and well documented, permit modifications and allow continual updates. Many Enterprise GIS efforts suffer either from the complete lack of a database design or from one that is so rigid that it cannot grow. It must be noted that the database design also affects the performance and download time of the solution. The database design defines the following components.
  1. Spatial database content, structure and format.
  2. SQL server attribute data content, structure, format and constraints.
  3. Relationships between spatial data and census data.
  4. Relationships between attribute data tables in the existing RDBMS.
  5. Updating processes for both spatial and census data.
  6. Data dictionary for both spatial and census data.
The design of the database includes three major elements.
  1. Conceptual design, which is independent of hardware and software and could be a list of utilization goals.
  2. Logical design, which is the specification of the database with respect to a particular software package.
  3. Physical design, which pertains to the hardware and software characteristics and requires consideration of file structure, memory, disk space, access and speed.
The database in the solution can be classified into two categories: tabular data and geospatial data.
Tabular Data (attribute data)
The database containing the KSA census information is classified into four main categories.
  • Population: The database stores data about the population in each province, categorized into Saudi, non-Saudi and total population. In each table in the population database, the data are classified by age range for both males and females.
  • Education: The database stores data about the number of educated people according to the level of education in each province.
  • Social: The database stores data about the marital status of the Saudi, non-Saudi and total population in each province.
  • Services: The database stores data about the different services available in each province, including agriculture, education (primary, intermediate, secondary, high and others), health, administration, public and social services.
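
A population table of this kind could be laid out as in the sketch below, which uses Python's built-in sqlite3 module purely for illustration. The table names, column names and figures are assumptions, not the actual CDS schema. One deliberate difference is that totals are computed by a query here rather than stored, whereas the database described above holds Saudi, non-Saudi and total figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE province (
    province_code TEXT PRIMARY KEY,   -- also the link to the spatial layer
    name          TEXT NOT NULL
);
CREATE TABLE population (
    province_code TEXT NOT NULL REFERENCES province(province_code),
    nationality   TEXT NOT NULL CHECK (nationality IN ('Saudi', 'non-Saudi')),
    sex           TEXT NOT NULL CHECK (sex IN ('male', 'female')),
    age_range     TEXT NOT NULL,
    persons       INTEGER NOT NULL CHECK (persons >= 0),
    PRIMARY KEY (province_code, nationality, sex, age_range)
);
""")

# Hypothetical rows, not census figures.
conn.execute("INSERT INTO province VALUES ('P01', 'Province A')")
conn.executemany(
    "INSERT INTO population VALUES (?, ?, ?, ?, ?)",
    [
        ("P01", "Saudi", "male", "0-4", 120),
        ("P01", "Saudi", "female", "0-4", 110),
        ("P01", "non-Saudi", "male", "0-4", 30),
    ],
)

total = conn.execute(
    "SELECT SUM(persons) FROM population WHERE province_code = 'P01'"
).fetchone()[0]
saudi_total = conn.execute(
    "SELECT SUM(persons) FROM population "
    "WHERE province_code = 'P01' AND nationality = 'Saudi'"
).fetchone()[0]
```

The shared province code plays the role of the relationship between spatial data and census data listed among the design components.
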
Geospatial Data
Three initial data layers were deemed necessary to build the foundation for the census data solution.
  • International Boundaries Layer: This polygon layer contains the international boundaries of KSA. The source of this layer is the General Directorate of Military Survey.
  • Province Boundaries Layer: This polygon layer contains the boundaries of the provinces in KSA, which consists of 13 provinces and 118 governorates. The source of this layer is the CDS. The mapping unit was produced based on the settlement GPS points.
  • Grids Layer: This polygon layer contains the grids of the 1:250,000 maps produced by the Aerial Survey Department of the Ministry of Petroleum and Mineral Resources.
The design specifications are documented in a data dictionary that contains a description of each data layer, the data types used to model geographic features (point, line, polygon), table structures, field definitions, coding schemes and other information.
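
One possible shape for a data dictionary entry is sketched below. The layer names, geometry types and sources are taken from the list above; the class names and the single field definition are hypothetical and only show how such an entry could be recorded.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDef:
    """Definition of one attribute field in a layer table."""
    name: str
    data_type: str
    description: str


@dataclass
class LayerEntry:
    """Data dictionary entry describing one spatial layer."""
    layer: str
    geometry_type: str  # "point", "line" or "polygon"
    source: str
    fields: list = field(default_factory=list)


data_dictionary = [
    LayerEntry(
        "International Boundaries", "polygon",
        "General Directorate of Military Survey",
    ),
    LayerEntry(
        "Province Boundaries", "polygon", "CDS",
        # Hypothetical field used to link the layer to the census tables.
        fields=[FieldDef("province_code", "text", "Key shared with census tables")],
    ),
    LayerEntry(
        "Grids", "polygon",
        "Ministry of Petroleum and Mineral Resources, Aerial Survey Department",
    ),
]

polygon_layers = [entry.layer for entry in data_dictionary
                  if entry.geometry_type == "polygon"]
```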

System Implementation
Based on the user needs assessment and database contents, the census indicators are classified into nine categories.
  1. Population (distribution, density), with each sub-category classified into Saudi, non-Saudi and total
  2. Education status (Saudi, non-Saudi and total)
  3. Social status (Saudi, non-Saudi and total)
  4. Education services (primary, intermediate, secondary, high and others)
  5. Health services
  6. Agriculture services
  7. Social services
  8. Administrative services
  9. Public services.
The results of any query can be visualized in a number of ways that enhance the user's understanding and interpretation of the data, some of which are:
  1. Compare multiple attributes of a feature by depicting the attributes as elements of a bar or pie chart;
  2. Compare one feature to another by the relative size of each feature's chart; and
  3. Shade each category in a graded sequence of a user-defined color ramp.
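
The graded shading in the third option can be sketched as a linear interpolation between two user-chosen colors. This is one common way to build a color ramp, not necessarily the method used in the system, and the start and end colors below are arbitrary examples.

```python
def color_ramp(start, end, steps):
    """Return `steps` RGB colors graded linearly from `start` to `end`."""
    if steps < 2:
        raise ValueError("a ramp needs at least two steps")
    ramp = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the first class, 1.0 at the last
        ramp.append(tuple(round(s + (e - s) * t) for s, e in zip(start, end)))
    return ramp


def to_hex(rgb):
    """Format an (r, g, b) tuple as a web color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Five shades from a light yellow to a dark red, one per map category.
ramp = color_ramp((255, 255, 204), (128, 0, 38), 5)
hex_ramp = [to_hex(color) for color in ramp]
```
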
As shown in Figure 2, the user can choose a specific category, such as the distribution of the Saudi population for a specific age range. Figure 2 shows:
  • a thematic map classified into five equal categories according to the values of the query item (1);
  • a dynamic legend at the right side of the map console indicating the color and the value of each category (2);
  • a dynamic bar chart showing the percentage of values to the total in each province (3);
  • a table containing the values of the selected query item in each province (4); and
  • a table containing the provinces with the maximum and minimum values, the mean value and the total value (5).

Figure 2. Census Indicators.
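
The numbers behind such a display can be sketched as follows. The code assumes that "five equal categories" means five equal-width value intervals; it could also mean classes with equal counts, which the article does not specify. The province names and values are invented for illustration.

```python
def equal_interval_class(value, lowest, highest, classes=5):
    """Return the 0-based class index of `value` using equal-width intervals."""
    if highest == lowest:
        return 0
    width = (highest - lowest) / classes
    # The maximum value would fall just past the last interval, so clamp it.
    return min(int((value - lowest) / width), classes - 1)


# Hypothetical query results per province.
values = {"Province A": 100, "Province B": 200, "Province C": 300, "Province D": 400}

lowest, highest = min(values.values()), max(values.values())
total = sum(values.values())

classes = {p: equal_interval_class(v, lowest, highest) for p, v in values.items()}
percentages = {p: 100 * v / total for p, v in values.items()}
summary = {
    "max_province": max(values, key=values.get),
    "min_province": min(values, key=values.get),
    "mean": total / len(values),
    "total": total,
}
```

The class index selects the map color and legend entry, the percentages feed the bar chart, and the summary corresponds to the second table.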

For statistical operations, the user can choose any two items from the population data and compare them by the relative size of each feature's chart, as shown in Figure 3. For example, the user can choose to compare the number of Saudi males (query item1) to the total population (query item2). Figure 3 shows:
  • a chart map displaying a relative bar or pie chart of query item1 to query item2 in each province (the user can choose to display the results as a pie or bar chart); and
  • a table containing the values of query item1, query item2 and percentage of query item1 to query item2 in each province.

Figure 3. Statistical Operations.
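
The comparison table amounts to a per-province percentage of one query item relative to another. A minimal sketch, with invented figures and a hypothetical function name, might look like this:

```python
def percent_of(item1, item2):
    """Return item1 as a percentage of item2 for each province.

    Provinces where item2 is missing or zero get None instead of a percentage.
    """
    result = {}
    for province, value in item1.items():
        denominator = item2.get(province, 0)
        result[province] = round(100 * value / denominator, 1) if denominator else None
    return result


# Hypothetical figures: Saudi males (query item1) and total population (query item2).
saudi_males = {"Province A": 300, "Province B": 450}
total_population = {"Province A": 1000, "Province B": 1200}

comparison = percent_of(saudi_males, total_population)
```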

The solution allows users to print the map results in the organization's standard template. The map can also be exported as an image for use in other applications.

Published Thursday, January 6th, 2005



© 2016 Directions Media. All Rights Reserved.