Mapping the World ... One Neighborhood at a Time - Part Two
Using Neighborhood Data - Neighborhood Data Analysis
In the 1950s, mathematical researchers at the University of Paris -
Ithiel de Sola Pool, Stanley Milgram, Benoit Mandelbrot and Manfred
Kochen - examined the interconnectivity of social networks in the
manuscript "Contacts and Influence"[i].
The researchers' primary
interest was the role social networks play in turning "social
capital" into economic productivity. The study tried to measure
the level of interconnectedness that would produce optimal results
for various tasks.
Later, in his seminal 1967 work "The Small World Problem"[ii], Milgram
proposed the concept of six degrees of separation[iii].
Milgram devised an experiment that measured the degree of contact
between people based on distance - research inspired by the Hungarian
writer Frigyes Karinthy. Karinthy's 1920s observations in
Budapest had calculated the degree of social interaction of Eastern
European cities during the Statist period. These studies showed
roughly consistent ratios between location-defined units at different
levels of subdivision.
As divisions are broken down into smaller regions, a greater intimacy
evolves. While the term "a fellow countryman" denotes a certain
closeness, referring to a neighbor or a family member brings an even
closer sense of community. A city's identity is often manifested
through a local sports team name or a local school identity. In what
seems to be a refinement of local identity, sweatshirts are now
appearing that display neighborhood names[iv].
Project History
In 1994, in the book Exploring Fractals, Ron Eglash and Bernt Wahl
described how repetitive patterns appear in the neighborhoods of
Cairo, Egypt and in African villages. Their work showed that new
neighborhoods replicate the structure found in old neighborhoods while
forming part of a larger self-similar structure that encompasses both
levels.
The initial commercial application of defined neighborhood boundaries
was to provide local information for commercial services via Internet
search. It was felt that customizable information services providing
updated data at a granular level could be superior to standard "yellow
pages." In 1995, the "Worldwide Yellow Page" (WYP.net) site was
launched, containing both U.S. and international online directory
information. The first location selected for detailed coverage
was the Solano Avenue Neighborhood (in Albany-Berkeley, CA). Solano
Avenue stores were presented graphically, with shops shown on both
sides of the street along with a brief description of services
provided and hours of operation. For a fee, businesses could post
additional information.
In 1999, YellowGiant was founded to provide local yellow page services
online. In the early days the amount of localized information was
sparse and data still had to be aggregated using U.S. Census
designations. It was not until around 2003-2004, with the advent of the
Factle Search Engine project, that enough neighborhood data could be
collected to provide meaningful localized results. The Factle Engine
used a weighted aggregation system that leveraged the superior
capacities of different search engines to give optimal results that
were also less susceptible to "spam." Initially, an automated iterative
data collection system was conceived that would collect initial
neighborhood information and then expand the data parameters until a
region's neighborhood data were exhausted. This process proved too
difficult to automate, so it was carried out by manually entering seed
neighborhood information and repeating the search on the data found.
Over time a great deal was learned about what constitutes a
neighborhood definition, though interpreting neighborhood boundary
data still needed human analysis.
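The two ideas in this paragraph, weighted aggregation of several search engines and repeating a search from seed neighborhoods until nothing new turns up, can be sketched roughly as follows. This is only an illustration, not the Factle Engine's actual code: the scoring rule (weight divided by rank), the function names, the engine names and the sample data are all assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, Iterable, List, Set


def aggregate_results(ranked_by_engine: Dict[str, List[str]],
                      engine_weights: Dict[str, float]) -> List[str]:
    """Merge ranked result lists from several engines into one ranking.

    Assumed scoring rule: each result earns weight / rank from every
    engine that returned it, so results that well-weighted engines
    agree on rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for engine, results in ranked_by_engine.items():
        weight = engine_weights.get(engine, 0.0)
        for rank, item in enumerate(results, start=1):
            scores[item] += weight / rank
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


def expand_from_seeds(seeds: Iterable[str],
                      find_related: Callable[[str], Iterable[str]]) -> Set[str]:
    """Repeat the search on every newly found name until nothing new appears."""
    found: Set[str] = set()
    queue = deque(seeds)
    while queue:
        name = queue.popleft()
        if name in found:
            continue
        found.add(name)
        queue.extend(n for n in find_related(name) if n not in found)
    return found


# Toy data standing in for real search results (illustrative only).
engines = {"engine_a": ["Elmwood", "Rockridge", "Spamville"],
           "engine_b": ["Rockridge", "Elmwood"]}
weights = {"engine_a": 0.4, "engine_b": 0.6}
ranking = aggregate_results(engines, weights)

related = {"Elmwood": ["Rockridge"],
           "Rockridge": ["Temescal", "Elmwood"],
           "Temescal": []}
region = expand_from_seeds(["Elmwood"], lambda name: related.get(name, []))
```

In this toy run the entry returned by only one engine ends up last in the ranking, and the seed "Elmwood" expands to all three names. In practice the lookup passed as `find_related` is the step that, according to the text, still needed human judgment.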
Neighborhood Identities
Neighborhood identities can be broken down into five key elements:
physical, functional, political, economic and cultural identity[v].
Figure: The five neighborhood identities:
1. Physical identity provides the spatial boundaries in which a
neighborhood resides.
2. Functional identity provides an infrastructure in which community
services are coordinated.
3. Political identity provides the mechanism by which the community is
given a voice in governmental affairs.
4. Economic identity provides the base on which wealth is accumulated
and distributed in a community.
5. Cultural identity is the soul of a community, where heritage,
ethnicity and other distinguishing features establish characteristics
of the community.
These identities are often interrelated and, when combined, help
form the fabric of a neighborhood community. Below are examples of how
Neighborhood Based Boundaries and Demographic Data may prove useful in
addressing the issues of neighborhood communities: public health,
political districting, education, public safety, city planning and
commercial marketing.
Neighborhood Map in Search
As localized search capabilities are expanding, organizations' and
individuals' capabilities to gain access to more detailed local
information increase. By supplying data based on defined locations with
common characteristics, local residents, companies and service
organizations are able to search for online information and provide
community services more effectively.
Public Health - Healthy Places: Exploring the Evidence[vi] reminds us
of the old adage: there is no place like home. In its idealized form,
home is a familiar place where you can find comfort in familiar
surroundings. This world is composed of "place effects," which the
author identifies as: 1) nature contact, 2) buildings, 3) public
places, and 4) urban forms that fit together in a neighborhood
community. Though community characteristics have long been a staple of
urban planning, only recently has their impact on people's health been
given much consideration. The article goes on to say that designing
areas and buildings with open spaces in a natural environment with
plenty of light can have a measurable impact on keeping residents
happy and healthy.
Neighborhood data provide a useful resource for public health
analysis. In Urban as a Determinant of Health (Vlahov, D. et al.
2007), The Healthy City: Its Function and Its Future[vii] and An Enemy
of the People[viii], the common theme is urbanization and the effect it
will have on people's lives. In public health, spatial locality
affects health risks, and the correlation between ethnic groups and
related problems could offer beneficial insight to policymakers
seeking to manage these issues.
Diseases are often clustered[ix] by neighborhoods[x]. Neighborhood data
might allow health officials to measure the likelihood that contact
with someone in the same neighborhood, versus someone in an adjoining
area, will speed the spread of a disease, based on the assumption that
people within a neighborhood have a higher likelihood of interaction.
Also, health agencies that want to place local health clinics
strategically can use neighborhood demographic data to target specific
needs and provide services cost-efficiently.
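The assumption that residents interact more within their own neighborhood than across its boundary can be made concrete with a small contact-matrix calculation. The sketch below uses a simplified textbook "force of infection" formula, which is not a method described in the article, and the populations, contact rates and transmission probability are all invented for illustration.

```python
from typing import List


def infection_pressure(infected: List[int], population: List[int],
                       contact: List[List[float]],
                       transmission: float) -> List[float]:
    """Per-person daily infection risk in each neighborhood.

    contact[i][j] is the assumed average number of daily contacts a
    resident of neighborhood i has with residents of neighborhood j.
    """
    pressure = []
    for i in range(len(population)):
        risk = 0.0
        for j in range(len(population)):
            prevalence = infected[j] / population[j]
            risk += transmission * contact[i][j] * prevalence
        pressure.append(risk)
    return pressure


# Hypothetical inputs: two adjoining neighborhoods, cases only in the first.
population = [1000, 1000]
infected = [10, 0]
contact = [[8.0, 2.0],   # residents of A: 8 contacts inside A, 2 in B
           [2.0, 8.0]]   # residents of B: 2 contacts in A, 8 inside B
risk_a, risk_b = infection_pressure(infected, population, contact,
                                    transmission=0.05)
```

With these made-up numbers a resident of the affected neighborhood faces four times the daily risk of a resident next door, which is the kind of difference neighborhood boundary data would let health officials estimate from real contact and case data.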
Figure: Rates of asthma in San Francisco, California, by neighborhood.
Another article, Many Pathways from Land Use to Health: Associations
between Neighborhood Walkability and Active Transportation, Body Mass
Index, and Air Quality[xi][xii] (Frank, L.D., Sallis, J.F., Conway,
T. L. et al. 2006), shows the correlation between environment,
mobility and health. Key factors include the availability of bicycle
paths, walkways and other physical means of getting around in an
environment with clean air and limited traffic interference. It shows
that if cities and towns provide convenient ways to get around on
foot, people will tend to use these options and reap the health
benefits that come from increased physical mobility. This, in turn,
may also reduce dependence on cars and buses, which in a congested
environment cause high levels of air pollution that may harm
respiratory systems. In a cited Neighborhood Quality of Life Study
(NQLS), data show a link between the walkability of an area and
residents' Body Mass Index (BMI). Physical activity keeps us healthy
and city design can be used to promote that activity. To optimize
health, all these opportunities have to be available to all residents.
One interesting use of neighborhood data came from the Alameda County
Public Health Department[xiii]. In the 2008 PBS series "Unnatural
Causes: Is Inequality Making Us Sick?", the Alameda County Public
Health Director, Dr. Anthony Iton, used neighborhood data and
corresponding attributes of race, economic status, pollution,
education, living conditions and other social conditions to make
predictions about county residents' health and life expectancy. The
difference in life expectancy can be over 10 years, based strictly on
living in a poorer neighborhood[xiv]. Dr. Iton's study stressed many
factors, including the amount of education a person had, which was one
of the greatest factors in determining longevity in California. Other
issues included environment, stress levels, access to healthcare and
good nutrition.
Figure source: Moore, American Journal of Public Health © 2006.
In a 2006 article in the American Journal of Public Health, Dr. Moore
studied the density of grocery stores versus liquor stores per 100,000
people based on a neighborhood's level of income.
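A "stores per 100,000 people" figure is a simple normalization that lets neighborhoods of different sizes be compared. The sketch below shows the arithmetic only. The store counts, populations and income labels are invented for the example and are not figures from the study.

```python
def stores_per_100k(store_count: int, population: int) -> float:
    """Number of stores per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return store_count / population * 100_000


# Hypothetical neighborhoods:
# (income level, grocery stores, liquor stores, residents)
neighborhoods = [
    ("low income", 3, 12, 40_000),
    ("high income", 9, 4, 50_000),
]

rates = {
    level: {
        "grocery": stores_per_100k(grocery, residents),
        "liquor": stores_per_100k(liquor, residents),
    }
    for level, grocery, liquor, residents in neighborhoods
}
```

Here `rates["low income"]["liquor"]` works out to about 30 stores per 100,000 residents. Dividing by population before comparing keeps a large neighborhood from looking better served simply because it has more stores in total.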
In the study Neighborhood Characteristics Associated with the Location
of Food Stores and Food Service Places[xv], analyses of the collected
data showed that people in neighborhoods of different socioeconomic
classes had varying access to quality food and might need to travel
longer distances to obtain a healthy diet. These factors, based on
neighborhood affluence, were reflected in the health of residents. The
data showed strong
correlations between diet and disease (e.g. poorer people tended to
consume higher levels of starches and lower levels of fresh
vegetables). It found that people in poor, racially segregated
neighborhoods consumed three times the amount of alcoholic beverages
per person of those in the wealthier neighborhoods. With this level of
consumption came a considerably higher level of certain health risks
and social problems. Public agencies and private groups can analyze
neighborhood patterns to propose solutions for their localized region.
In some cases nutrition might be limited where grocery markets are
sparse and people are afraid to venture longer distances to obtain
fresh fruits and vegetables. Other times there might be issues of open
space, where congested streets do not allow for adequate places to walk
or exercise.
Currently, neighborhood analysis relies heavily on demographic data. In
the future, topographic data may also play a major role in analyzing
neighborhood characteristics. Algorithms will be able to scan populated
locations and determine neighborhoods by common area characteristics.
Once neighborhood boundaries have been identified, OIM (object image
recognition) will be able to make an assessment of the neighborhood
region. Scanned satellite and aerial photos will be able to identify
and catalog walkways, bicycle pathways, parks, the density of open
space, building density, school locations, market types and many more
items that make up a neighborhood. Granular data may also include
the type of cars people own, how much time they spend at home, use of
facilities by inhabitants, condition of foliage, type and condition of
buildings and the traffic density of different locations. Data analysis
could be used to classify neighborhood types and conditions that could
then be used to help people find specific areas to accommodate
preferences. These data could be used by organizations or citizens to
help address deficiencies or underlying needs of a community (e.g.
areas with too much street exposure might plant trees or add a park).
Similar demographic techniques are used by stores to identify opportune
areas for new business development or to look at where city services
may be provided. The process might provide information such as which
areas may be extremely problematic for asthmatics or which areas might
be ideal for residents to get around in by foot.
Public Safety - Neighborhood boundaries can emphasize stark
delineations for rates of crime. A detailed knowledge of these
relationships could prove to be effective in the deployment of crime
prevention resources. Below are two maps, one based on neighborhood
boundaries and the other on crime statistics for that area (Berkeley,
California). Notice the correlation between crime rates and different
neighborhoods.
Figure: Berkeley crime rates (left) and Berkeley neighborhood
boundaries (right), 2005.
Commercial Market Opportunities - Neighborhood datasets can provide
meaningful granularity for understanding locally correlated attributes,
with a significant improvement over current systems. Neighborhood data
can intelligently classify groups, providing partitions that embody the
strong associations between neighbors. With greater granularity and
refined localized data, consumers will be able to obtain content that
is more targeted to them.
Neighborhood data have the potential to more efficiently target an
intended population segment while decreasing the cost to market
products or services to a desired local audience. An advertisement may
seem more personal if it is addressed to a member of a particular
neighborhood or community, rather than a more generic mailing that only
targets residents of a city. This could prove to be particularly
relevant in real estate.
Today over 70% of homebuyers in the U.S. residential real estate market
search online before they buy a home, according to a 2006 National
Association of Realtors survey. Neighborhoods are a convenient way to
identify an area. While many leading real estate sites provide search
listings by city or ZIP Code, many people might prefer to search by
neighborhood name[xvi]. Not only does a specific ZIP Code often
correspond to multiple (2 to 5) neighborhoods, but organizing listings
by ZIP Code is non-intuitive because it is often hard for people to
associate a given area with a ZIP Code[xvii]. Furthermore, home
purchasing decisions are often based on local attributes that closely
correlate to neighborhoods, including home value, school district and
crime rate. As a result of these strong associations, neighborhoods
create a natural partition for real estate companies to present their
available listings[xviii].
Political Polling and Campaigning - The socioeconomic and ideological
similarities among neighborhood citizens could permit political groups
to employ neighborhood datasets to target populations with specific
viewpoints and values.
In J. Morgan Kousser's article, Redistricting California 1971-2001[xix],
the author gives a historical account of how redistricting in
California shaped the state's political landscape, along with an
analysis of its potential future effects. He highlights how partisan
bickering in district reapportionment undermines the state's ability -
in both the legislature and courts - to provide equitable
representation. He emphasizes three key results of partisan
redistricting: increased ethnic representation, a greater ability to
block legislation and a lower willingness to compromise across party
lines, outcomes that are potentially contrary to California's overall
interests.
The general consensus by voters - drawn from the report - is that a
district should be drawn up to give the greatest voice to its
constituents. A good way to cluster districts would be by neighborhood
boundaries. Demographic data technology that delineates neighborhood
boundaries is emerging with advances in GIS software and Internet data
collection techniques. Neighborhood boundary clustering could
accommodate local and regional districts, as well as congressional
districts that have an "exact population equality" provision.
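As a rough illustration of clustering by neighborhood, the sketch below assigns whole neighborhoods to districts while keeping district populations as even as possible. It is a simple greedy heuristic chosen for the example, not a method proposed in the report. It ignores contiguity, which any real plan would have to enforce, and the neighborhood names and populations are hypothetical.

```python
from typing import Dict, List


def group_neighborhoods(populations: Dict[str, int],
                        district_count: int) -> List[List[str]]:
    """Assign whole neighborhoods to districts, balancing total population.

    Greedy heuristic: take neighborhoods from largest to smallest and
    place each one in the district that currently has the fewest
    residents. Contiguity is not checked.
    """
    districts: List[List[str]] = [[] for _ in range(district_count)]
    totals = [0] * district_count
    for name in sorted(populations, key=lambda n: (-populations[n], n)):
        smallest = totals.index(min(totals))
        districts[smallest].append(name)
        totals[smallest] += populations[name]
    return districts


# Hypothetical neighborhood populations.
populations = {"A": 5000, "B": 4000, "C": 3000, "D": 3000, "E": 1000}
districts = group_neighborhoods(populations, district_count=2)
district_totals = [sum(populations[n] for n in d) for d in districts]
```

With these numbers the two districts happen to come out exactly equal at 8,000 residents each. In general, keeping neighborhoods whole only approximates equality, so a strict "exact population equality" requirement would likely mean splitting at least some neighborhoods along their edges.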
At U.C. Berkeley, Prof. Bruce Cain[xx] and his team of political
researchers have been working on ways to create systems for the State
of California to divide districts equitably based on a given set of
criteria. Neighborhood data sets seem an ideal data source for grouping
constituencies to elect a representative.
A major issue in resolving district discrepancies is finding agreement
from different parties on how representative districts should be drawn.
Historically, political redistricting relied on educated guesswork,
with varying degrees of success. Today, however, GIS computer
algorithms are able to scrutinize voters' demographic data with
exacting precision, leaving political experts able to engineer election
district outcomes for their candidates. It would seem prudent for GIS
technologies based on natural neighborhood boundaries to be
incorporated in creating "fair" and "unbiased" boundaries for districts.
Neighborhood Social Networks - The socioeconomic and ideological
similarities among neighborhood citizens often reflect common
viewpoints and values.
Neighborhood Activism - Neighborhood activism has proliferated over
the last half century. As with other forms of citizen participation,
local organizations are formed to address issues for local
constituents. Often these organizations focus on local issues:
development, crime, education or entitlements, issues that neighborhood
committees can address on a local level to bring to the attention of
city officials. Here common agendas are proposed by people based on
similar identity (location), often banding with neighbors to put forth
their concerns.
In the 1950s the "neighborhood movement"[xxii] was instituted by San
Francisco citizens in underrepresented areas to collectively voice
opposition to massive freeway expansion. In New York City, it was
downtown Manhattan neighborhood groups that were credited with
preventing Robert Moses' planned freeway across Manhattan. By the 1960s
"Freeway Revolts" by neighborhood groups were staged in Boston; San
Francisco; Milwaukee; Portland, OR; Seattle; Washington, DC; Cleveland
and Baltimore. Neighborhood organizations became very active in this
struggle because they were fighting for their community, while many
outside the community saw new freeways as a way to alleviate traffic
congestion in their commute to work[xxiii].
In Environmental Justice: Human Health and Environmental Inequalities,
Robert J. Brulle and David N. Pellow discussed the social origins of
the environmental inequalities experienced by different socio-ethnic
groups. Based in part on these obvious inequalities, agencies - such
as the EPA - were formed to help monitor and enforce laws to reduce
the effects of pollution and toxicity for all, with a special focus on
the disenfranchised. A substantial portion of the environmental
movement's origins can be traced to the late 1960s, when political
unrest led to attempts to correct past damage caused by minimally
monitored industries. In the 1970s, this movement spread down to the
community level, driven by the effects seen in poorer communities
(e.g. infant mortality, increased exposure to pollution).
Other Applications
Demographic Research - By monitoring neighborhood changes, analysts may
be able to predict future trends and gain valuable marketing data.
Neighborhood clusters offer marketers a way to home in on a specific
area with a distinct name. Corporations can track their consumers'
preferences and strategically target areas that have high interest in
their product or service.
Urban Planning - Since neighborhoods are often based on infrastructure,
once established, a neighborhood tends to keep its original size,
though its identity may change through gentrification or
emigration[xxiv]. Our observations show higher population density in
older established areas where there is a high desirability to live
(e.g. SoHo in New York City)[xxv].
A community's size is often formed by its residents' ability to reach
essential services in a given amount of time[xxvi]. Cities built before
the 1900s tended to have neighborhoods with higher density in their
inner core. By the 1920s, as roadways and subways proliferated, many
city residents were able to move away from the highly congested city to
surrounding areas. This migration - encouraged by greater mobility -
resulted in new housing developments that were often inhabited by
people with similar interests and incomes.
Based on collected neighborhood data[xxvii] (including 148 U.S. cities,
all with population 150,000 or greater), preliminary observations show
that spatial correlations still exist for residents within these
neighborhood communities.
Figure: Neighborhood data courtesy Factle Map Co. (2007).
Figure: Map of London neighborhoods showing expansion of newer
neighborhoods. Factle © 2008.
Economic Modeling in Neighborhoods
City Scalability - In the development of a metropolitan area or
city
there are often benefits to growth. In 1965, W.R. Thompson argued, in
Urban Economic Growth and Development in a National System of Cities,
that the rate that a cityxxix will continue to grow is based on its
"economy of scale" capabilities. Key elements presented are its ability
to attract resources to build infrastructure from private investments
as well as acquiring funds from state and federal agencies. This
continues until an optimum size is reached and the infrastructure
efficiencies of scale taper off (e.g. cost for rent, skilled labor do
not compensate for rewards). In a production economy, the greater the
scale, the lower the production cost. While some aspects of this model
may vary in our modern day service-based economy, many of these
principles still hold true.
Since a neighborhood typically functions as a small city with its own
ecosystem, it, too, might have an optimal size (e.g. a 5-minute travel
time to go to the market). Neighborhood efficiency is often in
opposition to the quality of life for its residents. As a
neighborhood's population and services increase, other events tend to
occur: rents increase, competition for customers intensifies, crime
grows and a decrease in the quality of life may occur. In an efficient
market economy, supply will follow demand and the neighborhood will
seek equilibrium. The ability to measure that unit of equilibrium
called the "neighborhood" could prove to be an insightful way of
understanding what is going on at a local level.
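The idea of an optimal size set by travel time can be turned into a back-of-the-envelope estimate. The sketch below treats the area reachable within a time limit as a circle and multiplies it by a uniform population density. The walking speed, the density and the straight-line travel assumption are all illustrative choices, not values from the article.

```python
import math


def reachable_area_km2(speed_kmh: float, minutes: float) -> float:
    """Area of a circle whose radius is the distance covered in the given time."""
    radius_km = speed_kmh * minutes / 60.0
    return math.pi * radius_km ** 2


def resident_capacity(speed_kmh: float, minutes: float,
                      density_per_km2: float) -> float:
    """Residents living within the travel-time radius at a uniform density."""
    return reachable_area_km2(speed_kmh, minutes) * density_per_km2


# Assumed inputs: 5 km/h walking pace, 5-minute trip,
# 6,000 residents per square kilometer.
area = reachable_area_km2(speed_kmh=5.0, minutes=5.0)
capacity = resident_capacity(speed_kmh=5.0, minutes=5.0,
                             density_per_km2=6000.0)
```

Under these assumptions a 5-minute walk covers roughly 0.55 square kilometers, home to about 3,270 residents. Real street networks lengthen trips, so the reachable area would be smaller than this straight-line circle suggests.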
Conclusion
As demand for localized services increases - as seen by the explosion
in mobile devices, GPS units and online maps - so, too, does the
likelihood that users will want to reference information by
easy-to-recognize terms like neighborhood names. These familiar terms
can often provide useful insights by giving access to detailed
information at the local level. Technology that will allow enhancement
of neighborhood information is now emerging through GIS aggregation,
online group participation and other means. As new neighborhood
boundaries and neighborhood information become available, it will be
interesting to see how they are integrated into accessing local
information.
If you are interested in contributing a neighborhood map of your town
or city, the author would be glad to generate the boundaries - or you
could send a GIS file in .shp, .tab or .kml format and it can be added
to the collection using the author's methodology. You will be sent the
demographic data for your neighborhood, if interested. The data will
be made freely available for research, academic or public health work.
Contact the author at bernt@factle.com.
Acknowledgments
School of Information
Richard Dorall, Factle Maps & University of
Malaya
Cheng Ming Yu, Multimedia University
Michael Cho, UC Berkeley iMap co-author
MOT Business Team
UC Berkeley Interns and Kathy Dombrowski
University of Malaya GIS Team
UC Berkeley SSIP
Shawn Newsam, UC Merced - Image Technology
Information Access Seminar Group
Center of Entrepreneurship and Technology
Google, Yahoo & Microsoft (GYM)
[i] 1978. "Contacts and Influence." Social Networks 1:5-51.
British Journal of Mathematical and Statistical Psychology. 52, 169-193
[ii] Stanley Milgram, "The Small World Problem", Psychology
Today, 1967, Vol. 2, 60-67
[iii] Frigyes Karinthy (June 25, 1887 - August 29, 1938)
[iv] A New York City firm, "Neighborhoodies Inc.", produces
sweatshirts with neighborhood names, with sales in the millions of
dollars.
[v] Identifying Urban Neighborhoods: An Annotated Bibliography.
Thomas F. Broden, Ronn Lirkwood, Susan Roberts, John Roos, Thomas
Swartz. Council of Planning Librarians, Institute for Urban Studies,
Notre Dame (1980)
[vi] Healthy Places: Exploring the Evidence, Howard Frumkin, MD,
MPH, DrPH. The author is with the Department of Environmental and
Occupational Health, Rollins School of Public Health, Emory University,
Atlanta, Ga. http://www.ajph.org/cgi/content/full/93/9/145
[vii] Vlahov, D. et al. 2007. Urban as a determinant of health.
Journal of Urban Health. 84:16-26.
[viii] Duhl, L. 1986. The healthy city: Its function & its
future. Health Promotion. 1(1):55-60.
[ix] An Enemy of the People. Henrik Ibsen,
http://www.classicreader.com/booktoc.php/sid.7/bookid.1535/
[x] Alameda County Health Services (2006 request of neighborhood
boundary data)
[xi] Bernt Wahl, Exploring Fractals (1995), Chapter 2, section 3,
models on the spread of disease
[xii] Frank, L.D., Sallis, J.F., Conway, T. L. et al. 2006. Many
Pathways from Land Use to Health
[xiii] Alameda County Public Health was one of the first users of
our neighborhood data (2006).
[xiv] San Francisco Chronicle, March 27, 2008, Business Section, "Life
Span Linked to Your Wealth" by Victoria Colliver
[xv] Morland K, Wing S, Diez Roux A, Poole C. 2002. Neighborhood
characteristics associated with the location of food stores and food
service places. Am J Prev Med. 22:23-29
[xvi] Based on HomeGain search term traffic (2006)
[xvii] HomeGain's internal user analysis on online website
navigation (2001)
[xviii] Human abilities to process data [add references]
[xix] J. Morgan Kousser's Redistricting California 1971-2001
[xx] Correspondence between Bruce Cain and Bernt Wahl. From Cain, "...
consensus on the values to be optimized. but if you can get a
consensus on that, then automation might be a tool for the future."
[xxi] Sallie A. Marston and Richard Meadows, Citizens in Conflict:
Neighborhood Politics and Urban Growth in Tucson, The City of the 21st
Century p. 265
[xxii] http://www.bikesummer.org/1999/zine/freewayRevolt.htm, San
Francisco Chronicle 1956 excerpts and 1948 San Francisco's Planning
Department maps. Courtesy SPUR
[xxiii] In studies done in Baltimore in the early 1980s it was shown
that community activism is often related to a participant's
socioeconomic class. Neighborhood Politics, Matthew Crenson (1983),
Harvard University Press.
[xxiv]
[xxv] Bernt Wahl neighborhood data analysis of 150 U.S. cities
with 7000 neighborhoods (size, population, density, and demographic
statistics)
[xxvi] M. R. Wolfe, 1988 Keynote Address, The City of the 21st
Century p. 5
[xxvii] Based on Kathleen Dombrowski and Bernt Wahl raw neighborhood
data, of 150 U.S. cities with 7000 neighborhoods (size, population,
density, and demographic statistics) provided by Factle Map Co. based
on 2006 MapInfo data.
[xxviii] Bernt Wahl, Data Maps of San Francisco Demographics (2007).
Appendix A
[xxix] W.R. Thompson (1965). "Urban Economic Growth and Development
in a National System of Cities." The Study of Urbanization. New York:
Wiley, 431-490