Mapping the World ... One Neighborhood at a Time - Part Two

By Bernt Wahl and the Neighborhood Team

Using Neighborhood Data - Neighborhood Data Analysis
In the 1950s, mathematical researchers at the University of Paris, Ithiel de Sola Pool, Stanley Milgram, Benoit Mandelbrot and Manfred Kochen, examined the interconnectivity of social networks in the manuscript "Contacts and Influence" [i]. The researchers' primary interest was the role social networks played in converting "social capital" into economic productivity. The study tried to measure the levels of interconnectedness that would produce optimal results for various tasks.

Later Milgram, in his seminal 1967 work, "The Small World Problem" [ii], proposed the concept of six degrees of separation [iii]. Milgram devised an experiment that measured the degree of contact between people based on distance - research inspired by another social physiologist, Frigyes Karinthy. Karinthy's 1920s observations in Budapest had calculated the degree of social interaction of Eastern European cities during the Statist period. These studies showed semi-consistent ratios between varying subdivided, location-defined units.

As divisions are broken down into smaller regions, a greater intimacy evolves. While the term "a fellow countryman" denotes a certain closeness, referring to a neighbor or a family member brings an even closer sense of community. A city's identity is often manifested through a local sports team name or a local school identity. In what seems to be a refinement of local identity, sweatshirts are now appearing that display neighborhood names [iv].

Project History
In 1994, Ron Eglash and Bernt Wahl gave a description, in the book Exploring Fractals, of how repetitive patterns were found in Cairo, Egypt neighborhoods and African villages. Their work showed that new neighborhoods replicate the structure found in old neighborhoods while incorporating a larger self-similar structure that encompasses both levels.

The initial commercial application in defining neighborhood boundaries provided local information for commercial services via Internet Search. It was felt that customizable information services that provided updated data at a granular level could be superior to standard "yellow pages." In 1995, the "Worldwide Yellow Page" site was launched that contained both U.S. and international online directory information. The first location selected to create detailed information was the Solano Avenue Neighborhood (in Albany-Berkeley, CA). Solano Avenue stores were presented graphically with shops on both sides of the streets with a brief description of services provided and hours of operation. For a fee, businesses could post additional information.

In 1999, YellowGiant was founded to provide local yellow page services online. In the early days localized information was sparse and data still had to be aggregated using U.S. Census designations. It was not until around 2003-2004, with the advent of the Factle Search Engine project, that enough neighborhood data could be collected to provide meaningful localized results. The Factle Engine used a weighted aggregation system that leveraged the superior capacities of different search engines to give optimal results that were also less susceptible to "spam." Initially an automated iterative data collection system was conceived that would collect initial neighborhood information; then the data parameters would be expanded upon until a region's neighborhood data collection was exhausted. This process proved to be too difficult to automate, so it was carried out by manually entering seed neighborhood information and iterating the search process for the data found. Over time a great deal was learned about what constitutes a neighborhood definition, though neighborhood boundary data interpretation still needed human analysis.
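The two ideas in this history, weighted aggregation across search engines and iterative expansion from seed neighborhoods, can be sketched in a few lines of Python. This is only an illustration: the reciprocal-rank scoring rule, the engine names and weights, and the `find_related` lookup function are assumptions made for the example, not the Factle Engine's documented method.

```python
from collections import defaultdict, deque


def aggregate_results(engine_results, engine_weights):
    """Merge ranked result lists from several engines into one ranking.

    Each result earns weight / rank from every engine that returned it.
    This scoring rule is an assumption chosen for illustration.
    """
    scores = defaultdict(float)
    for engine, results in engine_results.items():
        weight = engine_weights.get(engine, 1.0)
        for rank, item in enumerate(results, start=1):
            scores[item] += weight / rank
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


def expand_seeds(seeds, find_related):
    """Start from seed neighborhoods and keep searching until nothing new appears.

    find_related is a hypothetical callable that returns the neighborhood
    names mentioned in the search results for a given neighborhood.
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for name in find_related(current):
            if name not in seen:
                seen.add(name)
                queue.append(name)
    return seen
```

A result that only one low-weight engine returns near the bottom of its list ends up with a small combined score, which is one plausible reason an aggregated ranking would be less exposed to spam. The expansion loop stops on its own once a pass turns up no unseen names, matching the idea of iterating until a region's data are exhausted.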

Neighborhood Identities
Neighborhood identities can be broken down into five key elements: physical, functional, political, economic and cultural identity.

Figure: Shows five neighborhood identities:
1. Physical – provides the spatial boundaries in which a neighborhood resides.
2. Functional – serves to provide an infrastructure in which community services are coordinated.
3. Political – provides the mechanism in which a voice is given to the community in governmental affairs.
4. Economic – provides the base in which wealth is accumulated and distributed in a community.
5. Cultural Identity – the soul of a community, where heritage, ethnicity and other distinguishing features establish characteristics of the community.

These identities are often interrelated and, when combined, help form the fabric of a neighborhood community. Below are examples of how Neighborhood Based Boundaries and Demographic Data may prove useful in addressing the issues of neighborhood communities: public health, political districting, education, public safety, city planning and commercial marketing.

Neighborhood Map in Search
As localized search capabilities are expanding, organizations' and individuals' capabilities to gain access to more detailed local information increase. By supplying data based on defined locations with common characteristics, local residents, companies and service organizations are able to search for online information and provide community services more effectively.

Public Health - Healthy Places: Exploring the Evidence [vi] reminds us of the old adage: there is no place like home. In its idealized form, home is a familiar place where you can find comfort in familiar surroundings. This world is comprised of "place effects," most notably stated by the author as: 1) nature contact, 2) buildings, 3) public places, and 4) urban forms that fit together in a neighborhood community. Though community characteristics have long been a staple of urban planning, only recently has their impact on people's health been given much consideration. The article goes on to say that designing areas and buildings with open spaces in a natural environment with plenty of light can have a measurable impact on keeping residents happy and healthy.

Neighborhood data provide a useful resource for analysis in public health. In Urban as a Determinant of Health, The Healthy City: Its Function and Its Future [vii] (Vlahov, D. et al. 2007) and An Enemy of the People [viii], the common theme is urbanization and the effect it will have on people's lives. In public health, spatial locality affects health risks, and the correlation between ethnic groups and related problems could offer a beneficial insight to policymakers seeking to manage these issues.

Diseases are often clustered [ix] by neighborhoods [x]. Neighborhood data might allow health officials to measure the likelihood that contact with someone in a neighborhood, versus someone in an adjoining area, will speed the spread of a disease, based on the assumption that people within a neighborhood have a higher likelihood of interaction. Also, health agencies that want to place local health clinics strategically can use neighborhood demographic data to provide cost-efficient services that target specific needs.

Rates of Asthma in San Francisco, California based on neighborhoods.

Another article, Many Pathways from Land Use to Health: Associations between Neighborhood Walkability and Active Transportation, Body Mass Index, and Air Quality [xi] [xii] (Frank, L.D., Sallis, J.F., Conway, T. L. et al. 2006), shows the correlation between an environment, mobility and health. These key factors include the availability of bicycle paths, walkways and other physical means of getting around in an environment with clean air and limited traffic interference. It shows that if cities and towns provide convenient ways to get around on foot, people will tend to use these options and reap the health benefits that come from increased physical mobility. This, in turn, may also reduce dependence on cars and buses that in a congested environment cause high levels of air pollution, which may hamper respiratory systems. In a cited Neighborhood Quality of Life Study (NQLS), data show a link between walkability of an area and residents' Body Mass Index (BMI). Physical activity keeps us healthy and city design can be used to promote that activity. In order to optimize health all these opportunities have to be available to all residents.

One interesting use of neighborhood data came from the Alameda County Public Health Department [xiii]. In the 2008 PBS series "Unnatural Causes: Is Inequality Making Us Sick?", the Alameda County Public Health Director, Dr. Anthony Iton, used neighborhood data and corresponding attributes of race, economic status, pollution, education, living conditions and other social conditions to make predictions about county residents' health and life expectancy. The difference in longevity can be over 10 years, based strictly on residence in a poorer neighborhood [xiv]. Dr. Iton's study stressed many factors including the amount of education a person had, which was one of the greatest factors in determining longevity in California. Other issues included environment, stress levels, access to healthcare and good nutrition.

Source: Moore, American Journal of Public Health © 2006.

In a 2006 article in the American Journal of Public Health, Dr. Moore studied the density of grocery stores versus liquor stores per 100,000 people based on a neighborhood's level of income.
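The "per 100,000 people" measure is a simple normalization: the store count is divided by the population and multiplied by 100,000, so that neighborhoods of different sizes can be compared. A minimal sketch follows, assuming a hypothetical record layout (`income_level`, `population`, `grocery_stores`, `liquor_stores`); the field names are illustrative and are not taken from the study.

```python
from collections import defaultdict


def per_100k(count, population):
    """Return the number of stores per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def store_density_by_income(records):
    """Pool neighborhoods by income level and compute store densities.

    Each record is a dict with the hypothetical keys 'income_level',
    'population', 'grocery_stores' and 'liquor_stores'.
    """
    totals = defaultdict(lambda: {"population": 0, "grocery": 0, "liquor": 0})
    for rec in records:
        group = totals[rec["income_level"]]
        group["population"] += rec["population"]
        group["grocery"] += rec["grocery_stores"]
        group["liquor"] += rec["liquor_stores"]
    # Rates are computed on pooled totals, not by averaging per-neighborhood rates.
    return {
        level: {
            "grocery_per_100k": per_100k(group["grocery"], group["population"]),
            "liquor_per_100k": per_100k(group["liquor"], group["population"]),
        }
        for level, group in totals.items()
    }
```

Pooling counts and populations before dividing keeps a very small neighborhood from dominating the rate for its income group.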

In the study Neighborhood Characteristics Associated with the Location of Food Stores and Food Service Places [xv], analyses of the collected data showed that people in different class neighborhoods had varying access to quality food and might need to travel longer distances to obtain a healthy diet. These factors, based on neighborhood affluence, were reflected in the health of its residents. The data showed strong correlations between diet and disease (e.g. poorer people tended to consume higher levels of starches and lower levels of fresh vegetables). It found that people in poor, racially segregated neighborhoods consumed three times the amount of alcoholic beverages per person of those in the wealthier neighborhoods. With this level of consumption came a considerably higher level of certain health risks and social problems. Public agencies and private groups can analyze neighborhood patterns to propose solutions for their localized region. In some cases nutrition might be limited where grocery markets are sparse and people are afraid to venture longer distances to obtain fresh fruits and vegetables. Other times there might be issues of open space, where congested streets do not allow for adequate places to walk or exercise.

Currently, neighborhood analysis relies heavily on demographic data. In the future, topographic data may also play a major role in analyzing neighborhood characteristics. Algorithms will be able to scan populated locations and determine neighborhoods by common area characteristics. Once neighborhood boundaries have been identified, OIM (object image recognition) will be able to make an assessment of the neighborhood region. Scanned satellite and aerial photos will be able to identify and catalog walkways, bicycle pathways, parks, the density of open space, building density, school locations, market types and many more items that make up a neighborhood. Granular data may also include the type of cars people own, how much time they spend at home, use of facilities by inhabitants, condition of foliage, type and condition of buildings and the traffic density of different locations. Data analysis could be used to classify neighborhood types and conditions that could then be used to help people find specific areas to accommodate preferences. These data could be used by organizations or citizens to help address deficiencies or underlying needs of a community (e.g. areas with too much street exposure might plant trees or add a park). Similar demographic techniques are used by stores to identify opportune areas for new business development or to look at where city services may be provided. The process might provide information such as which areas may be extremely problematic for asthmatics or which areas might be ideal for residents to get around in by foot.

Public Safety - Neighborhood boundaries can emphasize stark delineations for rates of crime. A detailed knowledge of these relationships could prove to be effective in the deployment of crime prevention resources. Below are two maps, one based on neighborhood boundaries and the other on crime statistics for that area (Berkeley, California). Notice the correlation between crime rates and different neighborhoods.

Berkeley Crime Rates on left, Berkeley Neighborhood Boundaries on right. (2005)

Commercial Market Opportunities - Neighborhood datasets can provide meaningful granularity for understanding locally correlated attributes, with a significant improvement over current systems. Neighborhood data can intelligently classify groups, providing partitions that embody the strong associations between neighbors. With greater granularity and refined localized data, consumers will be able to obtain content that is more targeted to them.

Neighborhood data have the potential to more efficiently target an intended population segment while decreasing the cost to market products or services to a desired local audience. An advertisement may seem more personal if it is addressed to a member of a particular neighborhood or community, rather than a more generic mailing that only targets residents of a city. This could prove to be particularly relevant in real estate.

Today over 70% of homebuyers in the U.S. residential real estate market search online before they buy a home, according to a 2006 National Association of Realtors survey. Neighborhoods are a convenient way to identify an area. While many leading real estate sites provide search listings by city or ZIP Code, many people might ideally prefer to search by neighborhood names [xvi]. Not only does a specific ZIP Code often correspond to multiple (2 to 5) neighborhoods, but organizing listings by ZIP Code is non-intuitive because often it is hard for people to associate a given area with a ZIP Code [xvii]. Furthermore, home purchasing decisions are often based on local attributes that closely correlate to neighborhoods, including home value, school district and crime rate. As a result of these strong associations, neighborhoods create a natural partition for real estate companies to present their available listings [xviii].
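To present listings by neighborhood rather than by ZIP Code, each listing's location has to be matched to the boundary polygon that contains it. The sketch below uses the standard ray-casting (even-odd) point-in-polygon test. It is an illustration under simplifying assumptions: boundaries are given as lists of (x, y) vertices in a flat coordinate system, polygons do not overlap, and the function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def neighborhood_for(x, y, boundaries):
    """Return the name of the first neighborhood containing the point, or None.

    boundaries maps neighborhood names to vertex lists.
    """
    for name, polygon in boundaries.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None
```

For real boundary files in .shp or .kml format, a GIS library would normally handle the parsing and the geometry; the plain-Python version is shown only to make the matching step concrete.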

Political Polling and Campaigning - The socioeconomic and ideological similarities among neighborhood citizens could permit political groups to employ neighborhood datasets to target populations with specific viewpoints and values.

In J. Morgan Kousser's article, Redistricting California 1971-2001 [xix], the author gives a historical account of how redistricting in California shaped the state's political landscape, along with an analysis of its potential future effects. He highlights how partisan bickering in district reapportionment undermines the state's ability - in both the legislature and courts - to provide equitable representation. The three key results of partisan redistricting that he emphasizes are increased ethnic representation, a greater ability to block legislation and a lower willingness for partisan compromise, leading to policies that are potentially contrary to California's overall interests.

The general consensus by voters - drawn from the report - is that a district should be drawn up to give the greatest voice to its constituents. A good way to cluster districts would be by neighborhood boundaries. Demographic data technology that delineates neighborhood boundaries is emerging with advances in GIS software and Internet data collection techniques. Neighborhood boundary clustering could accommodate local and regional districts, as well as congressional districts that have an "exact population equality" provision.
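As a rough illustration of clustering neighborhoods into districts of near-equal population, the sketch below uses a simple greedy rule: take neighborhoods from largest to smallest and always add the next one to the district with the smallest running total. This is an assumption-laden toy, not a redistricting method. It ignores geographic contiguity, which any real district plan must respect, and the function and parameter names are hypothetical.

```python
import heapq


def group_by_population(neighborhood_pops, district_count):
    """Greedily assign neighborhoods to districts to balance population.

    neighborhood_pops maps neighborhood names to populations.
    Returns (districts, totals), where districts is a list of name lists
    and totals holds each district's summed population.
    Contiguity is NOT considered in this sketch.
    """
    if district_count <= 0:
        raise ValueError("district_count must be positive")
    # Min-heap of (current population, district index).
    heap = [(0, index) for index in range(district_count)]
    heapq.heapify(heap)
    districts = [[] for _ in range(district_count)]
    # Largest neighborhoods first; names break ties for repeatable output.
    ordered = sorted(neighborhood_pops.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, population in ordered:
        total, index = heapq.heappop(heap)
        districts[index].append(name)
        heapq.heappush(heap, (total + population, index))
    totals = [sum(neighborhood_pops[name] for name in names) for names in districts]
    return districts, totals
```

Because whole neighborhoods are never split, the totals are only approximately equal; meeting an exact population equality provision would require dividing some neighborhoods or adjusting boundaries at a finer level.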

At U.C. Berkeley, Prof. Bruce Cain [xx] and his team of political researchers have been working on ways to create systems for the State of California to divide districts equitably based on a given set of criteria. Neighborhood data sets seem an ideal data source for grouping constituencies to elect a representative.

A major issue in resolving district discrepancies is finding agreement from different parties on how representative districts should be drawn. Historically, political redistricting relied on educated guesswork, with varying degrees of success. Today, however, GIS computer algorithms are able to scrutinize voters' demographic data with exacting precision, leaving political experts able to engineer election district outcomes for their candidates. It would seem prudent for GIS technologies based on natural neighborhood boundaries to be incorporated in creating "fair" and "unbiased" boundaries for districts.

Neighborhood Social Networks - Neighborhood citizens often share socioeconomic and ideological similarities, along with common viewpoints and values.

Neighborhood Activism - Neighborhood activism has proliferated over the last half century. As with other forms of citizen participation, local organizations are formed to address issues for local constituents. Often these organizations focus on local issues: development, crime, education or entitlements, issues that neighborhood committees can address on a local level to bring to the attention of city officials. Here common agendas are proposed by people based on similar identity (location), often banding with neighbors to put forth their concerns.

In the 1950s the "neighborhood movement" [xxii] was instituted by San Francisco citizens in underrepresented areas to collectively voice opposition to massive freeway expansion. In New York City, it was downtown Manhattan neighborhood groups that were credited with preventing Robert Moses' planned freeway across Manhattan. By the 1960s "Freeway Revolts" by neighborhood groups were staged in Boston; San Francisco; Milwaukee; Portland, OR; Seattle; Washington, DC; Cleveland and Baltimore. Neighborhood organizations became very active in this struggle because they were fighting for their community, while many outside the community saw new freeways as a way to alleviate traffic congestion in their commute to work [xxiii].

In Environmental Justice: Human Health and Environmental Inequalities, Robert J. Brulle and David N. Pellow discussed the origins of the social relationship to environmental inequalities for different socio-ethnic groups. Based in part on these obvious inequalities, departments - such as the EPA - were formed to help monitor and enforce laws to reduce the effects of pollution and toxicity for all, with a special focus on the disenfranchised. A substantial portion of the environmental movement's origins can be traced to the late 1960s, when political unrest attempted to correct past damage caused by minimally monitored industries. In the 1970s, this movement spread down to the community level based on the effects seen in poorer communities (e.g. infant mortality, increased exposure to pollution).

Other Applications
Demographic Research - By monitoring neighborhood changes, analysts may be able to predict future trends and gain valuable marketing data. Neighborhood clusters offer marketers a way to home in on a specific area with a distinct name. Corporations can track their consumer preferences and strategically target areas that have high interest for their product or service.

Urban Planning - Since neighborhoods are often based on infrastructure, once established, a neighborhood tends to keep its original size, though its identity may change through gentrification or emigration [xxiv]. Our observations show higher population density in older established areas where there is a high desirability to live (e.g. SoHo in New York City) [xxv].

A community's size is often formed by its residents' ability to reach essential services in a given amount of time [xxvi]. Cities built before the 1900s tended to have neighborhoods with higher density in their inner core. By the 1920s, as roadways and subways proliferated, many city residents were able to move away from the highly congested city to surrounding areas. This migration - encouraged by greater mobility - resulted in new housing developments that were often inhabited by people with similar interests and incomes.

Based on collected neighborhood data [xxvii] (including 148 U.S. cities, all with population 150,000 or greater), preliminary observations show that spatial correlations still exist for residents within these neighborhood communities.

Neighborhood Data courtesy Factle Map Co. (2007)

Map of London Neighborhoods Showing Expansion of Newer Neighborhoods
Factle © 2008

Economic Modeling in Neighborhoods
City Scalability - In the development of a metropolitan area or city there are often benefits to growth. In 1965, W.R. Thompson argued, in Urban Economic Growth and Development in a National System of Cities [xxix], that the rate at which a city will continue to grow is based on its "economy of scale" capabilities. Key elements presented are its ability to attract resources to build infrastructure from private investments as well as acquiring funds from state and federal agencies. This continues until an optimum size is reached and the infrastructure efficiencies of scale taper off (e.g. the costs of rent and skilled labor are no longer offset by the rewards). In a production economy, the greater the scale, the lower the production cost. While some aspects of this model may vary in our modern day service-based economy, many of these principles still hold true.

Since a neighborhood typically functions as a small city with its own ecosystem, it, too, might have an optimal size (e.g. a 5-minute travel time to go to the market). Neighborhood efficiency is often in opposition to the quality of life for its residents. As a neighborhood's population and services increase, other events tend to occur: rents increase, competition for customers intensifies, crime grows and a decrease in the quality of life may occur. In an efficient market economy, supply will follow demand and the neighborhood will seek equilibrium. The ability to measure that unit of equilibrium called the "neighborhood" could prove to be an insightful way of understanding what is going on at a local level.

As demand for localized services increases - as seen by the explosion in mobile devices, GPS units and online maps - so, too, does the likelihood that users will want to reference information by easy-to-recognize terms like neighborhood names. These familiar terms often can provide useful insights by providing detailed information at the local level. Technology is now emerging through GIS aggregation, online group participation and other means that will allow enhancement of neighborhood information. As new neighborhood boundaries and neighborhood information become available it will be interesting to see how they will be integrated into accessing local information.

If you are interested in contributing a neighborhood map of your town or city, the author would be glad to generate the boundaries - or you could send a GIS file in .shp, .tab or .kml file format and it can be added to the collection using the author's methodology. You will be sent the demographic data for your neighborhood, if interested. The data will be made freely available for research, academic or public health work. Contact the author if you are interested.

• School of Information
• Richard Dorall, Factle Maps & University of Malaya
• Cheng Ming Yu, Multimedia University
• Michael Cho, UC Berkeley iMap co-author
• MOT Business Team
• UC Berkeley Interns and Kathy Dombrowski
• University of Malaya GIS Team
• UC Berkeley SSIP
• Shawn Newsam, UC Merced - Image Technology
• Information Access Seminar Group
• Center of Entrepreneurship and Technology
• Google, Yahoo & Microsoft (GYM)

[i] 1978. "Contacts and Influence." Social Networks 1:5-51. British Journal of Mathematical and Statistical Psychology. 52, 169-193
[ii] Stanley Milgram, "The Small World Problem", Psychology Today, 1967, Vol. 2, 60-67
[iii] Frigyes Karinthy (June 25, 1887 - August 29, 1938)
[iv] A New York City firm, "Neighborhoodies Inc.", produces sweatshirts with neighborhood names, with sales in the millions of dollars.
[v] Identifying Urban Neighborhoods: An Annotated Bibliography. Thomas F. Broden, Ronn Lirkwood, Susan Roberts, John Roos, Thomas Swartz. Council of Planning Librarians, Institute for Urban Studies, Notre Dame (1980)
[vi] Healthy Places: Exploring the Evidence, Howard Frumkin, MD, MPH, DrPH. The author is with the Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, Atlanta, Ga.
[vii] Vlahov, D. et al. 2007. Urban as a determinant of health. Journal of Urban Health. 84:16-26.
[viii] Duhl, L. 1986. The healthy city: Its function & its future. Health Promotion. 1(1):55-60.
[ix] An Enemy of the People. Henrik Ibsen
[x] Alameda County Health Services (2006 request of neighborhood boundary data)
[xi] Bernt Wahl, Exploring Fractals (1995), Chapter 2, section 3, 'models on the spread of disease'
[xii] Frank, L.D., Sallis, J.F., Conway, T. L. et al. 2006. Many Pathways from Land Use to Health
[xiii] Alameda County Public Health was one of the first users of our neighborhood data (2006).
[xiv] San Francisco Chronicle, March 27, 2008, Business Section, "Life Span Linked to Your Wealth" by Victoria Colliver
[xv] Morland K, Wing S, Diez Roux A, Poole C. 2002. Neighborhood characteristics associated with the location of food stores and food service places. Am J Prev Med. 22:23-29
[xvi] Based on HomeGain search term traffic (2006)
[xvii] HomeGain's internal user analysis on online website navigation (2001)
[xviii] Human abilities to process data [add references]
[xix] J. Morgan Kousser's Redistricting California 1971-2001
[xx] Correspondence between Bruce Cain and Bernt Wahl. From Cain, "… consensus on the values to be optimized. but if you can get a consensus on that, then automation might be a tool for the future."
[xxi] Sallie A. Marston and Richard Meadows, Citizens in Conflict: Neighborhood Politics and Urban Growth in Tucson, The City of the 21st Century p. 265
[xxii] San Francisco Chronicle 1956 excerpts and 1948 San Francisco's Planning Department maps. Courtesy SPUR
[xxiii] In studies done in Baltimore in the early 1980s it was shown that community activism is often related to a participant's socioeconomic class. Neighborhood Politics, Matthew Crenson (1983), Harvard Press.
[xxv] Bernt Wahl neighborhood data analysis of 150 U.S. cities with 7000 neighborhoods (size, population, density, and demographic statistics)
[xxvi] M. R. Wolfe, 1988 Keynote Address, The City of the 21st Century p. 5
[xxvii] Based on Kathleen Dombrowski and Bernt Wahl raw neighborhood data, of 150 U.S. cities with 7000 neighborhoods (size, population, density, and demographic statistics) provided by Factle Map Co. based on 2006 MapInfo data.
[xxviii] Bernt Wahl, Data Maps of San Francisco Demographics (2007). Appendix A
[xxix] W.R. Thompson (1965). "Urban Economic Growth and Development in a National System of Cities." The Study of Urbanization. New York: Wiley, 431-490

Published Thursday, December 4th, 2008

© 2017 Directions Media. All Rights Reserved.