MetaCarta Users Tap Unstructured Data for New Geographic Uses

By Adena Schutzberg

The MetaCarta Public Sector User Group Meeting, held in Tysons Corner, Virginia, last week, brought together some 50 people, including users (mostly from federal and military agencies), integrators and MetaCarta partners. Recall that MetaCarta is the company with technology to tease location information out of unstructured text, geocode those locations and thus organize documents geographically. I presented the opening keynote, exploring where the geospatial marketplace is, sharing indicators that might predict where we are going, and offering some predictions for what the future might hold.

David Sonnen of IDC presented a second keynote titled "Enterprise Spatial Information Management: Current State and Future Outlook." While he focused on the enterprise and I did not, we shared many of the same ideas and conclusions. Sonnen, MetaCarta Founder and CTO John Frank, and I made up a panel and responded to questions in the afternoon. The most interesting topic, I think, related to the idea that we need to take the data off the computer monitor and interact with them in new ways. We spoke of "walking through" data and using devices like the Wii game controller to interact with and move data elements. Maybe it's not that far off?

I'd argue that the most interesting presentations were from the users.

Fusion Center
Bari Lee, an intelligence analyst from the North Central Texas Fusion Center, spoke of the Center's use of MetaCarta in its analytical toolkit. North Central Texas, the region that includes Dallas/Ft. Worth, is important due to its large airport, population center and significant growth. The Fusion Center, one of some 140 planned nationwide, became operational in February 2006.

The Center focuses on prevention and early warnings for the region. It gets involved with issues related to gangs, crime and border security in connection with local and federal authorities. Of particular interest to many at the meeting was the way the Center accesses and uses data from local agencies; it does not host the data, but rather refreshes them regularly. That means analysts are not subject to Freedom of Information Act (FOIA) requests or to being dragged into court.

Lee described the Center as a sort of "wild west" for analysts in that they can use a variety of technologies before "politics" catches up and limits options. He came to the Center from a senior level position in intelligence in Europe and is a proponent of analysis being done locally. He argues that the intelligence community needs to get away from its "Washington-centric" vision.

Lee's vision of the future for analytical tools includes the ability to do a single query (with lots of smarts behind it) that will yield the right results with little digging. At present, he said, analytical tools often produce a "ball of yarn," an intertwined tangle of related results through which analysts must puzzle. He pointed out that with only four analysts, the Center hopes to use technology to help limit the amount of reading the analysts do! Lee can't say enough about how MetaCarta's tool helps narrow down the Center's research by bounding queries geographically. He pointed to the data density tool as being particularly useful to "find surprises."
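To make "bounding queries geographically" concrete, here is a minimal Python sketch. It is not MetaCarta's API: the `Doc` record, the `bbox_search` function and the sample coordinates are all hypothetical, and it assumes each document has already been geocoded to a single point and that the box does not cross the 180-degree meridian.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical record: one geocoded point per document
    title: str
    text: str
    lat: float
    lon: float


def bbox_search(docs, keyword, south, west, north, east):
    """Return docs that mention keyword and whose point lies inside the box."""
    kw = keyword.lower()
    return [
        d for d in docs
        if kw in d.text.lower()
        and south <= d.lat <= north
        and west <= d.lon <= east
    ]


# Invented sample documents for illustration only
docs = [
    Doc("Report A", "Gang activity reported near the airport", 32.90, -97.04),
    Doc("Report B", "Gang activity reported downtown", 29.76, -95.37),
    Doc("Report C", "Road closure near the airport", 32.78, -96.80),
]

# Rough, illustrative box around the Dallas/Ft. Worth area
hits = bbox_search(docs, "gang", south=32.0, west=-98.0, north=33.5, east=-96.0)
print([d.title for d in hits])
```

Only the document that matches both the keyword and the box is returned, which is the sense in which a geographic bound can cut down an analyst's reading list.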

Smart News Search
William Bevington is Executive Director at the Parsons Institute for Information Mapping (PIIM) at the New School in New York City. He highlighted work done under government contract to map current news feeds.

He noted that when the project, called Geospace & Media Tool (GMT), began about three years ago, his engineering research team identified about six tools that claimed to do the sort of document geocoding MetaCarta does. Of the six, only three performed at all, and only MetaCarta's solution stood out as doing it "well."

To get a sense of the GMT application, forget any mashups you've seen that post news headlines to maps. This solution is far smarter. It first uses a clustering engine to group stories into "buckets." (These are groups of like stories, determined by the algorithms of a service engine, that may hold one story or hundreds.) These buckets of stories on the same topic are then geocoded and symbolized on a map.

A simple three-letter code is used to label the location: CAN, for example, for Canberra, Australia. Additional "timing" smarts help get the "new" news on the map even as "old news" lingers on feeds for several days.

The tool processes 100,000 stories a day, which are reduced to 50,000. The default timeframe is about two weeks (700,000 stories), but users could theoretically go back as far as they desired. Every search can be topic driven, location driven or both.
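The bucket, location-code and timeframe ideas can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the GMT implementation: the real system uses a clustering engine, whereas here each story simply carries a hand-assigned topic label, and the headlines, the "SYD" code and the `bucket_stories` function are invented.

```python
from collections import defaultdict
from datetime import date, timedelta

# (headline, topic label, location code, publication date), all invented
stories = [
    ("Budget vote delayed", "budget", "CAN", date(2007, 5, 30)),
    ("Parliament debates budget", "budget", "CAN", date(2007, 5, 29)),
    ("Budget vote delayed again", "budget", "CAN", date(2007, 5, 10)),
    ("Port strike begins", "strike", "SYD", date(2007, 5, 28)),
]


def bucket_stories(stories, today, window_days=14):
    """Group stories inside the time window by (topic, location code)."""
    cutoff = today - timedelta(days=window_days)
    buckets = defaultdict(list)
    for headline, topic, code, published in stories:
        if published >= cutoff:  # drop stories older than the window
            buckets[(topic, code)].append(headline)
    return dict(buckets)


buckets = bucket_stories(stories, today=date(2007, 5, 31))
for (topic, code), headlines in buckets.items():
    # One map symbol per bucket, labeled with its location code
    print(code, topic, len(headlines))
```

Each bucket, rather than each story, becomes one symbol on the map, and widening `window_days` is the equivalent of going further back in time.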

PIIM focuses on information design to offer up better and more intuitive presentations, and Bevington stated that there are four basic ways to present data: pictures (as in maps), quantitatively, symbolically and relationally. Further, which one is used depends very much on the user and what the user is trying to do. As you might imagine, that sort of thought may yield some elegant interfaces designed around the cognitive strengths of the user.

The technology and data providers include newsfeeds from Factiva, MetaCarta's tools to extract locations and geocode the news stories, a clustering engine built by PIIM and Siemens, and base data and mapping from ESRI. One additional database was woven in: listings and bios for political and business leaders, called Leadership Directories. The articles returned from a search included links to biographical and organization information for any individuals who were included in the Leadership Directories. An article about presidential candidates, for example, included Rudy Giuliani. His name, and those of the other candidates found in the database, were listed alongside the full text of the article. Then the fun began - it was possible to read Giuliani's bio and then see his (organizational) connections to the other candidates. Ideally, it's possible to "cross" the resulting text documents with any sort of directory to provide more information and/or connections. There are lots of great images here.
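"Crossing" article text with a directory might look something like the following Python sketch. It assumes simple exact-name matching and a toy in-memory directory; the names, organizations and both functions are placeholders invented for illustration, and nothing here reflects how Leadership Directories or GMT actually store or match their data.

```python
# Toy directory: person name -> set of organizations (all placeholders)
directory = {
    "Candidate A": {"Org 1", "Org 2"},
    "Candidate B": {"Org 2"},
    "Candidate C": {"Org 3"},
}


def people_in(article, directory):
    """Return directory names that appear verbatim in the article text."""
    return [name for name in directory if name in article]


def shared_organizations(a, b, directory):
    """Return the organizations two directory entries have in common."""
    return directory[a] & directory[b]


article = "Candidate A and Candidate B spoke at the debate."
found = people_in(article, directory)
print(found)
print(shared_organizations(found[0], found[1], directory))
```

The same pattern would work with any directory keyed on names, which is the point of the "cross with any sort of directory" idea.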

The good news is that the tool is very slick. The bad news is that it will likely be a for-fee sort of solution since so many of the tools and the data require subscriptions.

Supporting the Megacommunity
USAID is the agency behind work being done at Booz Allen Hamilton to build a tool to help link those in need in developing countries with those who provide services. The idea focuses on the concept of a megacommunity, explained Susan Kalweit of Booz Allen. A megacommunity is a collaborative environment based on common interests (poverty, say) that supports the unique priorities of its members, with no single dominant player. The Emerging Marketplace Tool (EMT) will enable this collaboration, allowing players to find, learn about and, ultimately, leverage one another. A prototype uses MetaCarta tools (GTS and OpenLayers, though it is moving to ESRI ArcGIS 9.2, and from .NET to J2EE) to search documents (235,000 documents from 39 English-language, international development-related sites) in a database of specific topics of interest in specific geographies. The vision is a solution that will be free on the Web, but will also have a subscription component including more detailed information, tools and data. USAID's hope is that enabling these connections between players will initiate action on many global challenges.

Ag Data for the World
The United States Foreign Agricultural Service's (FAS) goal is to, among other things, represent U.S. agricultural interests abroad, monitor trade agreements, and foster long-term economic growth and food security for nations in need. Information needed to support that work is available to both staff and the public via the Web-based Crop Explorer. It's updated regularly and provides crop condition information and forecasts of production. Its users include farmers and commodities traders in addition to the public and government employees.

The current iteration does not use MetaCarta, but the implementation with those tools is nearly complete. The upgrade also includes looking to build a link from Crop Explorer to ArcGIS, since that's the FAS analysts' tool of choice. Bob Baldwin of FAS told me later that his staff spent quite a lot of time in years past developing categories to allow users to query data based on geography. "That was before MetaCarta." Now, he said, "that sort of coding is automated." The MetaCarta implementation at FAS will be the third on a publicly available website. The other two sites are the Society of Petroleum Engineers' map-based e-Library Search and, soon, the EPA's Window to My Environment (WME). Dave Smith notes that this was demoed at last year's meeting.

Solutions for Analysts Who Don't Speak [Enter Language Here]

Raytheon's Dave Waldrup showed off the company's product, called Expres'Sense. The name doesn't do it justice in my mind, so let's jump right to a use case. Say you have an analyst who is looking into terrorist threats in the Middle East. Suppose he or she is a great analyst but does not know Arabic or any other languages of the region. Expres'Sense solves that problem by linking some very clever language tools (Language Weaver) to MetaCarta location extraction and tagging.

Here's the outline of the brief demo we saw that walked through how an analyst might search for documents with the term "bomb" in them in Arabic. First the analyst would key in the term "bomb." Then, he'd have to pick through some meanings in English to ensure the "right" meaning of bomb was used (not "bomb" meaning a really bad movie, for example). The system offered back some "kinds" of bombs - actually a very long list including "package bomb" and "car bomb" and "shoe bomb" - from which the analyst could select one or more terms. At that point the language of interest was identified (I think in the demo about five were offered). Then the query was built and run against a database of documents. The results came back in both Arabic and translated into English. Locations in each article could then be extracted, geocoded and presented on a map next to the article.

MetaCarta is Growing
I've been following MetaCarta since 2002 and things are coming along nicely. This intimate user conference was, I suspect, like those from the early days of other software companies. There is a lot of prototyping going on and a limited number of systems in use. Still, there's quite a lot of excitement as users see how their peers are using the new tools in different disciplines, for differently skilled users, and with different technology platforms. I expect to see this event, and this company, grow steadily in the coming years.


Published Thursday, May 31st, 2007

