When you hear people talk about R, is your first thought that it might be International Talk-Like-a-Pirate Day? More likely, they’re using R, a programming language and software environment that even a decade ago was already being noted for its expanding popularity, a pattern that has continued unabated. The language was created by two statisticians in the early 1990s. Its name comes from the first letter of their first names (Ross Ihaka and Robert Gentleman) and is also a play on the S programming language, on which R is based.
Both push and pull factors are contributing to the increased usage of R. Graphical user interfaces such as RStudio make the underlying statistical functionality more accessible. Meanwhile, data science is permeating business, government, industry, and research applications, and R is versatile enough to meet the myriad demands of diverse data. That versatility and extensibility are key to its success. As the user community builds new or enhanced packages and produces support and training materials, others discover the environment and begin exploring it. The Comprehensive R Archive Network (CRAN), from which one can freely download R and its code, has over 12,000 packages of user-community contributed code available, designed to do everything from “Calculating Spatial Synchrony Between Tree-Ring Chronologies” to “Easily Send HTML Email Messages.” Fortunately, we can benefit from curated lists of worthwhile packages that others have compiled.
Not surprisingly, geospatial data and spatial analysis feature prominently in the list of packages and code. Because R can read shapefiles and other GDAL-supported formats, it’s easy for people to dabble in spatial analysis. Helpful overviews of a range of spatial tasks and workflows are available through Roger Bivand’s CRAN Task View for analyzing spatial data or Spatial Data Analysis and Modeling with R. Simple Features for R (the sf package) makes a lot of sense if you’re already familiar with object-oriented GIS.
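For readers who want to see what "handling a shapefile" looks like in practice, here is a minimal sketch. It assumes the sf package is installed and uses the North Carolina counties shapefile that ships with sf as sample data, so it does not reflect any particular dataset mentioned in this article.

```r
library(sf)

# Path to the sample shapefile bundled with the sf package
nc_path <- system.file("shape/nc.shp", package = "sf")

# st_read() goes through GDAL, so the same call works for other supported formats
nc <- st_read(nc_path, quiet = TRUE)

class(nc)          # an "sf" object is also a data.frame
nrow(nc)           # one row per county polygon
st_crs(nc)$input   # coordinate reference system recorded in the file

# Vector and raster drivers (formats) available in this GDAL build
head(st_drivers()[, c("name", "long_name")])
```

Because the result is an ordinary data frame with a geometry column attached, the usual R tools for subsetting and summarizing tables apply to it directly.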
If you are completely unfamiliar with the experience of typing commands when prompted with a blinking cursor, the R environment will seem opaque to you, but that may be a fairly low bar for people who spend their lives on computers. Numerous guides and tutorials are readily available, including Udemy’s Core Spatial Data Analysis: Introductory GIS with R and QGIS and DataCamp’s Working with Geospatial Data in R. Academics have embraced the use of R in teaching and research, and several have shared their instructional materials online, such as those by Alex Singleton at the University of Liverpool, David Rossiter at Cornell, or Pat Bartlein at the University of Oregon. Guy Lansley and James Cheshire have produced An Introduction to Spatial Data Analysis and Visualization in R, which is available (after a free registration) via the U.K.’s Consumer Data Research Centre. Esri has created an R-ArcGIS Bridge to help the two software packages interact, and Learn ArcGIS now offers an R-ArcGIS Bridge lesson for practice.
On the other hand, using R isn’t a trivial enterprise for many, including GIS specialists. One can be an expert in the geometry of vector data or map projections without having a clue how to apply a function to an indexed matrix. It’s humbling. Not for me, of course. But let’s just say I had a friend who knew something about GIS, and she was curious to see what it would take to install R and its GUI, read a flat data table containing information on about 50 trees in a forested stand, add a shapefile of polygons from the NRCS Soil Survey, display them together, and have the system run some basic geospatial processing, such as counting the trees of a particular species located within soil areas of a certain depth to bedrock. Let’s imagine that she was already committed to a regular workday, with a regular number of previously scheduled obligations and a typical number of interruptions. Suppose she began these R tasks shortly after 8 a.m. and concluded shortly before 6 p.m., by which point she had produced a small “map” with points color-coded by species, along with a numerical answer from a spatial join that was identical to one she had calculated previously with other GIS software. I’d say she had a reasonably successful day of experimenting with R, particularly when you consider how much more efficiently she might learn if she were focusing on R alone.
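A workflow along those lines can be sketched in a few lines of R. This is only an illustration under stated assumptions: it uses the sf package, and the tree table, the two soil polygons, the column names (species, bedrock_depth_cm), the 100 cm threshold, and the coordinate reference system are all made up so that the example runs without any files. With real data, the table would come from read.csv() and the polygons from st_read().

```r
library(sf)

# Hypothetical tree table; with real data this would be read.csv("...")
trees_df <- data.frame(
  tree_id = 1:6,
  species = factor(c("oak", "oak", "maple", "oak", "pine", "maple")),
  x = c(1, 2, 3, 6, 7, 8),
  y = c(1, 3, 2, 1, 3, 2)
)

# Turn the flat table into point features (CRS is an assumption: EPSG:26915)
trees <- st_as_sf(trees_df, coords = c("x", "y"), crs = 26915)

# Two hypothetical soil polygons; with real data this would be st_read("...")
shallow <- st_polygon(list(rbind(c(0, 0), c(5, 0), c(5, 5), c(0, 5), c(0, 0))))
deep    <- st_polygon(list(rbind(c(5, 0), c(10, 0), c(10, 5), c(5, 5), c(5, 0))))
soils <- st_sf(
  soil_id = c("A", "B"),
  bedrock_depth_cm = c(40, 150),
  geometry = st_sfc(shallow, deep, crs = 26915)
)

# Display both layers together, coloring points by species
plot(st_geometry(soils), border = "grey40")
plot(st_geometry(trees), add = TRUE, pch = 16, col = as.integer(trees$species))

# Spatial join: attach to each tree the attributes of the polygon it falls within
joined <- st_join(trees, soils, join = st_within)

# Count oaks on soils shallower than 100 cm to bedrock
n_oak_shallow <- sum(joined$species == "oak" & joined$bedrock_depth_cm < 100)
n_oak_shallow  # 2 for this toy data
```

With real layers, the main extra step is usually making sure both share a coordinate reference system, for example by calling st_transform() on one of them before the join.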
Micro-investigations like these provide insights about the R user community. Statisticians and other mathematicians who are already confident and competent with R are probably enjoying the boon of new tutorials, datasets, and guides for getting more – correctly and thoroughly – from their forays with geospatial data. Data scientists who are already fluent in their targeted programs are increasingly likely to find compatibility with R, such as that which Tableau now provides. R is already widely deployed and successful across communities of scientists and business professionals alike, especially internationally. While it won’t supplant GIS in any comprehensive way, R enables studies that extend the reach of spatial analysis across new audiences. It’s undoubtedly a worthwhile addition to the skillset of forward-thinking geospatial data scientists.