Interview: Dr. Paul Torrens of Arizona State University on Process Models and Dynamic GIS... What GIS Needs Next

In the summer issue of ESRI's ArcNews, Dr. Paul Torrens, director of the Geosimulation Research Laboratory and associate professor at the School of Geographical Sciences and Urban Planning at Arizona State University, penned an article titled "Process Models and Next-Generation Geographic Information Technology." The article provides a truly unique vision on how GIS should work by incorporating more dynamic data and having users develop a better understanding of how geospatial phenomena really work, specifically those "processes" which control complex spatial situations. Editor in Chief Joe Francica interviewed Dr. Torrens about his ideas, space, time, cellular automata, Web architecture, "spimes" and the "Internet of things." Because of the length of the interview, it has been divided into two parts. This is part one. Part two will appear on Wednesday.

Directions Magazine (DM): In your article, you discuss the need to "consociate" geospatial data models with more dynamic and adaptive processes. Can these "process models" be viewed as a more natural way of thinking about and rationalizing geospatial phenomena to allow users to achieve everyday business objectives? Please define and give examples.

Photo: Tom Story

Paul Torrens (PT): Traditionally, geographic information systems have emphasized patterns and GIS developers have paid comparatively little attention to the role of processes in representing or analyzing spatial phenomena. The premium placed on pattern is pretty obvious in GIS: static maps dominate as a user interface to underlying spatial databases and we interact with that interface using patterns such as buffers, filters and so on, or by literally employing click-patterns with a mouse. Similarly, much of the spatial analysis that we perform on spatial data is pattern-based: point-pattern analysis and measures of spatial structure and spatial composition are common examples. These functionalities are incredibly useful, but the emphasis that GIS places on pattern belies the fact that spatial phenomena are composed of both patterns and processes and that the two really work together in synergy. GIS are abstract models of the real world and so the absence of a rich set of process tools to describe that world is troublesome when we try to ally our GIS to real-world systems. This is beginning to change, as GIS become more tightly coupled with dynamic simulation models, which may serve as a dynamic engine for a GIS, and as these process models begin to be encapsulated as data models in the core of a GIS.

One can, for example, store snapshots of hurricane patterns in a GIS, and these may even be updated regularly through what we call comparative statics (i.e., a series of snapshots tied together in a temporal sequence, like frames in a movie). But these are really just freeze-frame samples of a complex, dynamic and adaptive system that is driven by processes that control evaporation and condensation, wind movement, shifts in atmospheric pressure and so on, through very complex mechanisms of non-linear interaction, path-dependency, positive and negative feedback, and phase transitions. A GIS generally has no native underlying model of how those processes operate over space and time and so, despite its wonderful capabilities for parsimoniously representing the geography of complicated systems, without being coupled to a process model, the GIS cannot tell us where a hurricane is likely to go next or what the dynamics of its interactions will be once it makes landfall. Yet, these are the details that are of critical interest for citizens, emergency management and first responders, policymakers and even insurance companies.

Retail site selection and marketing are other examples where process models could have a dramatic impact. Many retail businesses currently use GIS to map their customer base and the integration of point-of-sale geo-referencing, such as tagging cash register transactions with customers' ZIP Codes or phone number area codes to determine their home location, for example, has automated much of this and has allied GIS to mail-marketing and consumer preference analysis. The use of GIS-enabled geodemographics in this way yields a relatively rich description of who your customers are, where they come from, and what their transactional activity is and this information can easily be cross-linked to a wealth of demographic and economic detail that adds additional value to the data points that are harvested at a point of sale. Demographic details can easily be pulled per ZIP Code or area code to classify consumers, negating some of the need for revealed and stated preference surveys. But none of this information really tells you why the customer chose a particular store or transaction in a particular location at a particular time. Moreover, looking to the long-term (which is the time horizon for store location decisions, for example), as customers transition through the lifecycle, their value platforms for consumer goods and their preferences for services change, and the relationship between these attributes and the geography of their homes and workplaces shifts. Quite often, people change location when key phase shifts occur in the lifecycle: moving to a university, starting in their first jobs, getting married or starting a family, retiring and so on. The urban geography of American cities is in a continuous flux and the demographics of urban populations are changing constantly; this is especially true in the current housing and employment market. Static geodemographic systems lack a sophisticated scheme to preemptively estimate or forecast those changes; to do this, they need to be coupled with models of population dynamics, demographics, regional migration, intra-urban migration, even to models of the urban economy, land development and transportation dynamics.

The adaptive nature of consumer trends is even more complex if we consider the near-term, which can influence product selection in retail stores and management of the related supply chain. In the near-term, many fashions, fads and subcultures emerge and diffuse very rapidly and very widely, sometimes on the order of a day, temporally, and globally in terms of spatial reach. The pace of adaptation is accelerating as new technologies are adopted (Twitter tweets, flash mobbing, crowd-sourcing of online consumer review and reputation systems). Because of an essential resolution mismatch, static geodemographics contextualized in an ecology of ZIP Codes and decennial census information miss many of these trends, which is why there is growing interest in the use of higher-resolution geographic information technologies to fill in these gaps. This can be accomplished using location-based services triangulated by cell phone position and allied to micro-transactions such as swipes of consumer loyalty cards and credit cards, and also coupled to individual products using RFID tags, for example. Again, process models can take GIS out of a static information ecology and can animate the dynamics between the "dots" of these data points, potentially to the scale of the choreography of customers within a store and the individual selection of a product on the shelf. The implications of, say, shifting product displays could be assessed on the order of a day and, from the bottom up, the knock-on implications for the supply chain could be managed on the order of weeks.

DM: You talk about the need to fuse space and time into GIS because today much of it is used only for simple visualization. How would you expect the next generation of GIS solutions to better integrate the time dimension for spatial analysis?

PT: This is already happening, although much of the innovation is taking place outside traditional GIST research and development, albeit in ways that are very closely allied with geographic information systems. The rapid proliferation of mobile Internet and communications technologies (cell phones, hand-held gaming devices, Wi-Fi enabled laptops, Bluetooth peripherals, RFID inventory systems, in-car navigation devices) has necessitated development of entirely new database systems for handling moving objects and for identifying and classifying events as interactions of those objects with each other and socio-technical systems (humans and the cell phone network, employees and products in a stock room, for example). We see this in our everyday lives: when we turn on our cell phones we expect them to work, regardless of location. Similarly, we take it for granted that a company like UPS can deliver a package anywhere in the United States (maybe even in the world, although there are obviously locations that are beyond its reach) and that we can track the movement of that package through way-stations in near real-time.
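
To make the idea of classifying events as interactions between moving objects more concrete, here is a minimal Python sketch. It is an illustration only and does not reproduce any system named in the interview: the function name `proximity_events`, the dictionary layout of the tracks, the planar coordinates and the sample data are all assumptions. It flags an "event" whenever two tracked objects are within a chosen radius of each other at the same timestamp.

```python
from itertools import combinations
from math import hypot


def proximity_events(tracks, radius):
    """Return (time, id_a, id_b) for every pair of objects within `radius`.

    `tracks` is a hypothetical structure: {object_id: {timestamp: (x, y)}}.
    Only timestamps that both objects share are compared.
    """
    events = []
    for id_a, id_b in combinations(sorted(tracks), 2):
        shared_times = tracks[id_a].keys() & tracks[id_b].keys()
        for t in sorted(shared_times):
            xa, ya = tracks[id_a][t]
            xb, yb = tracks[id_b][t]
            if hypot(xa - xb, ya - yb) <= radius:
                events.append((t, id_a, id_b))
    return sorted(events)


# Made-up sample data: a delivery van and a parcel scanned at a way-station.
sample_tracks = {
    "van": {0: (0.0, 0.0), 1: (5.0, 0.0), 2: (10.0, 0.0)},
    "parcel": {1: (5.0, 1.0), 2: (20.0, 0.0)},
}
sample_events = proximity_events(sample_tracks, radius=2.0)
```

A production moving-objects database would index space and time rather than compare every pair, but the sketch shows the basic step from raw positions to interpreted events.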

Georeferencing moving objects in this way constitutes a first layer of a larger possible information architecture; semantic analysis, which would "make sense" of these vast stores of data in the context of movement within a complex socio-technical system, constitutes the second layer. Development of this second layer is already underway: John Krumm and Eric Horvitz at Microsoft Research have a project to machine-learn driver trajectories from in-car GPS data, as a "predestination" system that would forecast likely activity patterns. Similarly, various efforts have been developed to couple car position data (from GPS devices, drivers' cell phones or through Wi-Fi networking of in-car devices on the road) with process models to generate real-time traffic reports. Kai Nagel and his group at the Technical University of Berlin have been working for some time on linking intelligent transport systems to dynamic agent-based traffic models that will perform near real-time forecasting of likely traffic outcomes from congestion, for example. It seems to be only a matter of time before these now disjointed information ecosystems align and allow for a many-system viewpoint (driver behavior, ramp-to-ramp traffic conditions, construction projects, electronic billboard notifications, staffing at toll booths) to unfold. Once again, process models are the catalyst that allows a system-of-systems architecture to unfold in a way that accommodates the complexity of interactions between diverse system elements. GIS is a likely candidate for organizing data for those systems, because of the widespread ability of location tags to couple diverse data sources.

In terms of visualization, or geovisualization if we consider human-computer interaction in a digital context, developments in space-time GIS are also ushering in new forms of interfacing for GIS, challenging the dominance of the map as a portal to spatial data. In academic GIS research and development, many researchers (including my own group) have turned to artificial intelligence in search of tools that can better model real-world processes and their evolution dynamically, whether this is through agent-based modeling, machine-learning or semantic analysis. The dynamics of these processes need to be visualized in ways that are not traditionally catered to in a typical GIS and so researchers have turned to computer-animated design and to commercial game engines for methodology that can push beyond the limitations of static cartographic interfaces to spatial data. In turn, this has led to the emergence of "immersive" virtual globe cartographic interfaces, such as NASA World Wind and Keyhole's Earth Viewer (now Google Earth), which work with standard GIS data formats by essentially visualizing maps in 2.5D (2D with extrusion) and 3D. In each case, these products can be visualized on a desktop or any client device using browser-based visualization schemes such as Silverlight and Flash.

These interfaces are beginning to be standardized in GIS: ESRI's products now maintain interoperability with COLLADA (Collaborative Design Activity), for example, which is a format for exchanging 3D assets in CAD using standard XML schemes. Because of their foundation in XML, COLLADA assets can co-exist with similar mark-up schemes for GIS data. This allows for geographic objects to be exchanged between CAD and GIS (as well as virtual globes) rather seamlessly. Other examples, such as Microsoft's Virtual Earth (now "Bing Maps for Enterprise") are coupled to dynamic visual simulation software (ESP, for example), which will allow for the display of dynamic processes. In my group we have developed immersive, 3D animated "peoplescapes" that are interoperable with agent-based models, network graphs, GIS and spatial databases.

A further advantage of the growing synergy between GIS, visualization and animation is the ability to move beyond the mouse as an interface device and toward recent-generation controllers (Nintendo's Wii remote or data gloves, for example). These permit greater flexibility and degrees of freedom in manipulating geographic objects, whether they are on a screen or in a CAVE (Cave Automatic Virtual Environment), and allow for additional senses to be used, such as tactile feedback, which can assist in developing GIS for populations that may benefit from non-visual interactivity with geographic data. The proliferation of touch-screen technologies, whether on cell phones, tablet PCs, touch-screens or on dedicated "surface computers" (some of which are capable of visualizing many layers of information simultaneously: see Microsoft's Second Light project), is ushering in entirely new forms of gesture-based interaction with spatial data through GIS. Recent advances in using hands, feet and bodies - via 3D camera and imaging - as the actual controller for gaming (see 3DV Systems and Microsoft's Project Natal) will likely accelerate these developments even further. These require processes to model body movement and gesturing and to classify those movements as space-time processes and events that can be interpreted in what is, essentially, a specialized GIS. They also need models that can then translate those data in a meaningful geographic context on-screen.

DM: Many of our readers may not be familiar with the use of cellular and agent-based automata in simulations. Can you explain the concept and provide some examples of how it would be used to simulate geospatial processes?

PT: Cellular and agent-based automata are tools that can be used to build very detailed and complex simulations of spatial entities and processes. They are, fundamentally, media for processing and computation. Each automaton works as a computer, just like a PC's CPU, and automata actually work in the same ways that our brains function to process information. An automaton is capable of storing a finite amount of data and these data can take on any digital form: they could be spatial data, numerical data or text, for example. In addition to its own store of data, the automaton is able to accept data from other entities, from other automata, from software or from a computer user's mouse activities, for example. The automaton can share these data, or its own data, with anything that it interacts with and in this way we can think of an automaton as being capable of communication. So essentially, we have an automaton that works as a database. In addition, the automaton is endowed with a set of processes, which can be considered to work as rules, heuristics, algorithms, methods and so on. The role of these processes is to perform operations on data, to conflate data, search through data, to prune data and so on. Usually, the processes are designed to contextualize the data and to thereby convert the data into information. The results of this processing can also be shared with other automata or digital entities. Processes can also be introduced to describe how an automaton moves through an environment, thereby controlling how its information is exchanged and what its interactions might be. By adding processing to the automaton's database functionality, the automaton gains a level of intelligence, in so much as it is capable of (often proactively) extracting knowledge and meaning from data.
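
The description above (a finite data store, the ability to exchange data, and a set of rules that process those data) can be sketched in a few lines of Python. This is only an illustrative reading of the interview, not Dr. Torrens's software: the class name `Automaton`, the `inbox` mechanism and the `average_with_neighbors` rule are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]
# A rule takes the automaton's own state plus received messages
# and returns the automaton's next state.
Rule = Callable[[State, List[State]], State]


@dataclass
class Automaton:
    """A finite data store plus the rules that operate on it."""
    state: State
    rules: List[Rule] = field(default_factory=list)
    inbox: List[State] = field(default_factory=list)

    def receive(self, message: State) -> None:
        """Accept data from another automaton, software or a user."""
        self.inbox.append(dict(message))

    def share(self) -> State:
        """Expose a copy of the automaton's own data to others."""
        return dict(self.state)

    def step(self) -> None:
        """Apply every rule in order, then discard processed messages."""
        for rule in self.rules:
            self.state = rule(self.state, self.inbox)
        self.inbox.clear()


def average_with_neighbors(state: State, inbox: List[State]) -> State:
    """Hypothetical rule: set 'temp' to the mean of own and received values."""
    temps = [m["temp"] for m in inbox if "temp" in m]
    if not temps:
        return state
    new_state = dict(state)
    new_state["temp"] = (state["temp"] + sum(temps)) / (1 + len(temps))
    return new_state


# Two automata exchange data, then each processes what it received.
a = Automaton({"temp": 10.0}, [average_with_neighbors])
b = Automaton({"temp": 30.0}, [average_with_neighbors])
a.receive(b.share())
b.receive(a.share())
a.step()
b.step()
```

After one exchange both automata hold the same averaged value, which is the simplest case of data being turned into shared information through rules.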

Agent-based automata use agency to shape their processing abilities and there are a variety of schemes in which agency can be interpreted: human agency, collective agency, emotional behavior, utility maximization and so on. When agents apply their agency introspectively, we usually refer to them as "individual-based models." When agents work with other agents (or with an environment, whether that is a technical environment, a social environment or a physical environment) to accomplish some goal, we usually refer to the automata as a "multi-agent system." When agents are used to represent nodes in a network (neurons in the brain, for example) and the exchange of information between nodes, we commonly build them as artificial neural networks. Artificial neural networks can be used to model real neural networks (decision making, for example) or they can be used to distribute processing of large tasks, when we use them in image processing for analyzing satellite imagery, for example. When the automaton is considered as residing in a discrete spatial unit (perhaps within a lattice of related spatial units), say within an individual parcel boundary within a city, for example, we regard the automata as "cellular automata" and information exchange takes place between cells based on diffusion.
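
As a rough illustration of cells on a lattice exchanging information by diffusion, consider the following Python sketch. The update rule, the `rate` parameter and the four-neighbor (von Neumann) neighborhood are assumptions chosen for simplicity; they are not taken from any particular model discussed in the interview.

```python
def diffuse(grid, rate=0.2):
    """One synchronous update of a lattice of cells.

    Each cell moves a fraction `rate` of the way toward the mean value of
    its up/down/left/right neighbors. All new values are computed from the
    old grid, so the order in which cells are visited does not matter.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            if not neighbors:  # a 1x1 lattice has nothing to exchange with
                continue
            mean = sum(neighbors) / len(neighbors)
            new_grid[r][c] = grid[r][c] + rate * (mean - grid[r][c])
    return new_grid


# A single "hot" parcel in the middle of an otherwise empty 3x3 lattice.
lattice = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
after_one_step = diffuse(lattice)
```

After one step the central value has dropped and its four direct neighbors have risen, while the corners, which do not touch the center in this neighborhood scheme, are still at zero.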

The use of agent-automata in geographic modeling (what is termed geosimulation) is motivated by a number of factors. Agent-automata are completely malleable to parameterization, such that an automaton can really represent anything that can be expressed digitally. They constitute what are known as universal computers. This means that an agent-based modeling architecture can be used to build simulations of lots of different things, which has advantages over more traditional forms of modeling that might fit one or two purposes only. Multi-agent systems can also be very efficient in solving problems that lend themselves to distributed computing, i.e., a problem is broken down into lots of little pieces and each piece is handed off to a dedicated agent, with some system-level rules for reconstituting the results. So, we can use agent-automata to model traffic on highways, for example, at the scale of every single car on the road in an entire city, and we can examine how system-level effects emerge from the one-to-one interactions of drivers. This relates to the use of agent-automata to model complex adaptive systems, which are systems in which things like emergence, feedback, bifurcations, and fractal scaling are important. Most traditional models are not suitable for modeling complex systems because they are constrained by limits of independence, linearity and rationality, for example. Automata also have a close affinity to GIS data models, such as networks, rasters and vectors. In a way, a lattice of automata with many states is the same concept as cross-indexing many raster-layers in a GIS, but with the added advantage of embedded process models. Unlike in GIS, automata can easily be extended to three dimensions (even four dimensions if you consider space and time as separate).
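
The per-car traffic example can be illustrated with the Nagel-Schreckenberg model, a well-known textbook cellular automaton for single-lane traffic. It is swapped in here purely as an illustration and is not claimed to be the model used by Dr. Torrens or by any group named in this interview. The function name `nasch_step` and the default parameter values are assumptions.

```python
import random


def nasch_step(positions, speeds, road_length, v_max=5, p_slow=0.3, rng=random):
    """Advance every car one time step on a circular single-lane road.

    `positions` lists each car's cell index in driving order around the
    ring (each car is followed in the list by the car ahead of it), and
    `speeds` lists the matching speeds in cells per step. Cars cannot
    overtake, so this order is preserved from step to step.
    """
    n = len(positions)
    new_speeds = []
    for i in range(n):
        car_ahead = positions[(i + 1) % n]
        # Number of empty cells between this car and the one ahead.
        gap = (car_ahead - positions[i] - 1) % road_length
        v = min(speeds[i] + 1, v_max)   # 1. accelerate
        v = min(v, gap)                 # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_slow:
            v -= 1                      # 3. random slowdown
        new_speeds.append(v)
    # 4. move all cars simultaneously
    new_positions = [(x + v) % road_length for x, v in zip(positions, new_speeds)]
    return new_positions, new_speeds


# Three stopped cars on a 10-cell ring; p_slow=0 makes the step deterministic.
demo_positions, demo_speeds = nasch_step([0, 2, 3], [0, 0, 0], road_length=10, p_slow=0.0)
```

Each car follows only local rules about its own speed and the gap ahead, yet with `p_slow` above zero and enough cars on the road, repeated steps are known to produce stop-and-go jams that no individual rule describes. That is the kind of system-level effect emerging from one-to-one interactions that the passage refers to.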

I have worked extensively with automata as the engine for processes in urban simulations. I have built, over the years, models of suburban sprawl in which the automata are patches of land in an urbanizing landscape and agency is used to model how developers and settlers might build and populate that landscape. In this case, I am essentially using the model to build a synthetic city in a computer, so that I can experiment with possible trajectories of urbanization (smart growth, sprawl, green belts, edge cities, exurbanization and so on) in ways that are impossible to do in the real world. I have also built agent-based models in which automata are used to represent individual householders, property developers, their decisions to move within a city as they transition through the lifecycle, decisions to develop or redevelop land at particular densities and for particular activities, and the community- and city-level outcomes of their interactions as social neighborhoods, gentrification phenomena and so on. In the last few years, I have been building very detailed introspective automata to model real people. Here, my focus has been on infusing realistic behavioral geography and spatial cognition into their agency so that they can behave realistically, doing the right things in the right places and at the right times, so that they can move through their environments using spatial thinking, and so that they can structure their activities and interactions with each other and their surroundings using spatial cognition.
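
For readers who want a feel for how patches of land can act as automata in an urbanizing landscape, the toy Python sketch below grows a "city" on a small grid. It is a deliberately simple assumption-laden example, not a reconstruction of Dr. Torrens's sprawl models: the two land states, the eight-cell neighborhood and the `threshold` rule are all hypothetical.

```python
VACANT, URBAN = 0, 1


def grow(land, threshold=2):
    """One synchronous growth step on a grid of land patches.

    A vacant patch becomes urban when at least `threshold` of its
    surrounding patches (up to eight, fewer at the edges) are already
    urban. Urban patches stay urban.
    """
    rows, cols = len(land), len(land[0])
    new_land = [row[:] for row in land]
    for r in range(rows):
        for c in range(cols):
            if land[r][c] == URBAN:
                continue
            urban_neighbors = sum(
                land[rr][cc] == URBAN
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            )
            if urban_neighbors >= threshold:
                new_land[r][c] = URBAN
    return new_land


# Two developed patches in the top-left corner of a 3x3 landscape.
seed_city = [
    [URBAN, URBAN, VACANT],
    [VACANT, VACANT, VACANT],
    [VACANT, VACANT, VACANT],
]
grown_city = grow(seed_city)
```

Changing `threshold` or the seed pattern and re-running the step is a crude analog of experimenting with alternative urbanization trajectories in a synthetic city.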

One of the benefits of agent-based modeling is its almost limitless extensibility, but this poses problems, particularly when agents are modeled at high resolutions, because the models approximate the level of detail that is present in the real-world analog of the thing you are trying to simulate. There is a potentially insatiable level of uniqueness that can be built into the agents and in this sense they are the perfect tool for postmodern science. Unlike in other geographic modeling fields, such as climate modeling, there has not really been a concerted effort to build fundamental laws for agent-based models, partly because for the systems that they are being built to explore - human systems, for example - no universal laws exist; geography is always a unique qualifier in these cases. So, models are often built anew per-application, which slows the pace of development in this area as an academic field. I have been working to circumvent these issues, to some extent, by building standardized and reusable schemes for agent-based modeling. Some years ago, when I was studying with Michael Batty's group at the Centre for Advanced Spatial Analysis, I partnered with Itzhak Benenson at Tel Aviv University to build what we called a "Geographic Automata System" that would standardize the automata components needed in urban simulation, and we actually borrowed from GIS design principles to do that. Now, I am working on a similar scheme for agent-based models that involve human movement (path- and way-finding, locomotion, steering, collective motion and so on).