There's been a lot of coverage of Google's recent announcement, via a blog post, of a KML search capability in Google Earth and Google Search. Michael Jones, Google's Chief Technologist for Google Earth, Maps, and Local, answered some questions to clarify what it does and how it works, and explored some of its implications for searching for geodata.
DM: Are all publicly accessible KML files on the Web indexed by Google? Do their creators have to do something for them to be in the index?
MJ: Every KML & KMZ file on the web that is found by the Google web crawl is noted and indexed. The crawl honors include/exclude guidance from robots.txt files and is educated by site maps to find content that would otherwise be difficult to locate. Every resulting KML & KMZ file found by the crawl is indexed by its name, location, and by the contents of the KML description. Through KML Search, all of these files are now searched by the text string entered in the Google Earth search box.
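As an editorial illustration of the robots.txt guidance mentioned above, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the example.org URLs, and the crawlable helper are hypothetical, and this checks only what a robots.txt file permits; it says nothing about how Google's crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (not from the interview).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawlable(url: str, robots_txt: str, agent: str = "Googlebot") -> bool:
    """Return True if robots_txt allows the given agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A KML file in a public directory is allowed; one under /private/ is not.
public_ok = crawlable("http://example.org/data/fireplugs.kml", ROBOTS_TXT)
private_ok = crawlable("http://example.org/private/draft.kmz", ROBOTS_TXT)
print(public_ok, private_ok)
```

A publisher who wants a KML file indexed would keep it outside any disallowed path, and could also list it in a site map.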
Creators need only place their KML/KMZ on a publicly accessible web site and their geospatial data will be universally discoverable.
People and program agents can also search directly using Google Web Search. For example, visit http://www.google.com and try the following search:
This will show you all seven (do not suppress duplicates) of the KMZ files containing 'adena' in their descriptions.
DM: Does the search have a geographic part and a text part? How do those work? Based on where you are in GE? Based on text in KML?
MJ: We show the 'best' result subset of all the results. The details are subtle, but the idea is that the list of textual matches is also scored geospatially to produce a conflated score representing a good match. A perfect text match right where you are looking is a perfect score, a great match nearby or a so-so match on screen would be next, followed by great matches far away and poor matches on-screen. Then the best 'N' of these are selected and presented as the results in such a way that the Google Earth client zooms in/over/out to encompass the set of selected results. Users can explore these or follow the provided "more..." link to get more results, which is just like going to page 2, 3, and subsequent pages in Google Web search results.
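The ranking idea described above can be sketched as follows. This is an editorial illustration only: the blend formula, the 0.3 weight, the 50 km distance scale, and all names (Hit, geo_score, combined_score, best_n) are assumptions, since the interview does not disclose how Google actually computes its score.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    text_score: float  # 0..1, assumed to come from some text matcher
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def geo_score(hit, view_lat, view_lon, scale_km=50.0):
    """1.0 at the view centre, decaying with distance (assumed exponential)."""
    return math.exp(-haversine_km(hit.lat, hit.lon, view_lat, view_lon) / scale_km)


def combined_score(hit, view_lat, view_lon, floor=0.3):
    """Blend text and location; `floor` keeps far-away great matches above zero."""
    return hit.text_score * (floor + (1 - floor) * geo_score(hit, view_lat, view_lon))


def best_n(hits, view_lat, view_lon, n):
    """Pick the n best hits and the (south, west, north, east) box enclosing them."""
    top = sorted(hits, key=lambda h: combined_score(h, view_lat, view_lon), reverse=True)[:n]
    lats = [h.lat for h in top]
    lons = [h.lon for h in top]
    return top, (min(lats), min(lons), max(lats), max(lons))


hits = [
    Hit("great-far", 1.0, 34.05, -118.24),
    Hit("poor-here", 0.2, 39.96, -82.99),
    Hit("soso-here", 0.5, 39.96, -82.99),
    Hit("great-nearby", 0.9, 40.23, -82.99),
    Hit("exact-here", 1.0, 39.96, -82.99),
]
top, bbox = best_n(hits, 39.96, -82.99, n=3)
print([h.name for h in top], bbox)
```

The bounding box stands in for the view that the client zooms to so that all selected results are visible.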
DM: Might this be a way for all geo data to be found both for advertising needs and for the sort of geodata search folks might currently do at GOS, etc? I'm thinking a small bit of KML in a page could make it geosearchable in a way "local searches" are not today.
Could this be the answer to the old .geo idea?
MJ: yes, Yes, YES!
You are right on target with the "small bit of KML" comment.
If you want your county's fire plug Shape file to be findable on the WEB OF PAGES, you would have made an HTML reference page and decorated that with text that made searchers notice it when traversing your website, text that made it findable by web search tools like http://www.google.com, and added a hyperlink on the page referencing the Shape-file collection.
Now you have an additional choice. If you want your county's fire plug Shape file to be findable on the WEB OF PLACES (using an Earth browser such as Google Earth), then you make a KML reference placemark and load its description with text so that searchers notice it when looking at the placemark (even when it is part of a collection) and find it when using tools like Google Earth Search (aka KML Search), and you add a hyperlink in the description of the placemark that references the Shape-file collection.
This simple step of creating a KML placemark (and waiting for the next web crawl) is all you need to let every one of the 200+ million users of Google Earth who flies nearby and types "fire plug" into the search box find your KML and be presented with the hyperlink to the Shape file (and by extension, MapInfo TAB files, Autodesk formats, NITFs, etc., all based on desired audience.)
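For readers who want to see what such a "small bit of KML" might look like, here is a minimal editorial sketch that builds a reference placemark with Python's standard xml.etree.ElementTree. The county name, coordinates, download URL, and the reference_placemark helper are hypothetical; the OGC KML 2.2 namespace is assumed, although files from the period of this interview may use an earlier Google namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: the OGC KML 2.2 namespace.
KML_NS = "http://www.opengis.net/kml/2.2"


def reference_placemark(name, description_html, lon, lat):
    """Return a KML document with one placemark pointing at external data."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    # Searchable text plus the hyperlink; ElementTree escapes the HTML markup.
    ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description_html
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinate order is longitude,latitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


doc = reference_placemark(
    "Example County fire plugs",
    'Fire plug (hydrant) locations. Download the Shape file collection: '
    '<a href="http://example.org/gis/fireplugs.zip">fireplugs.zip</a>',
    lon=-82.99,
    lat=39.96,
)
print(doc)
```

Saving the output as a .kml file on a publicly accessible web site is the publishing step described above; the referenced data itself can stay in its original format.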
Note that the author also has the option of converting the referenced data into KML. They would do this if their goal is to have those who browse, search, and explore the planet using Google Earth see the results (such as the fire plug locations) right there in Google Earth. This is an option, but is separate from using what you correctly describe as a small bit of KML to make the original data discoverable. This is the application of the world's most popular search technique to the task of finding data on a geospatial, view-based basis, addressing in many ways the goals of GOS and SDI efforts both past and present.
DM: How does standard geo metadata play into such a search? I'm thinking not at all now, but maybe in the future?
MJ: Everything in the KML is indexed. If the metadata are placed into the KML description, then they are searchable. However, this is not a smart search in the sense of "select fire plugs painted more than 6 years ago", so there is much more to be done in this area. You'll note that Google started out indexing page-describing HTML, and then moved to index other popular document formats such as PDF and Word's .DOC; likewise, we're indexing place-describing KML and may later understand a larger collection of geospatial formats. If so, we'll be in a better position to deal structurally with important metadata at that time.
DM: So this is part of Google's larger search vision?
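The difference between text search and a "smart" search can be illustrated with a small editorial sketch. The description text, the "Painted:" field convention, and both helper functions are hypothetical; they contrast keyword matching with a structured query and do not represent Google's indexing.

```python
import re
from datetime import date

# Hypothetical placemark description carrying a bit of metadata as plain text.
description = "Fire plug 114. Painted: 2001-05-14. Source: County GIS."


def text_match(query, text):
    """Keyword search: every query word must appear somewhere in the text."""
    lowered = text.lower()
    return all(word in lowered for word in query.lower().split())


def painted_before(text, cutoff):
    """Structured query: parse the assumed 'Painted:' field and compare dates."""
    m = re.search(r"Painted:\s*(\d{4})-(\d{2})-(\d{2})", text)
    if m is None:
        return None  # no parseable date, so the question cannot be answered
    return date(*map(int, m.groups())) < cutoff


found = text_match("fire plug", description)
keyword_only = text_match("painted more than 6 years ago", description)
structured = painted_before(description, date(2004, 1, 1))
print(found, keyword_only, structured)
```

The keyword search finds the placemark for "fire plug" but cannot evaluate the date condition; answering that requires software that understands the metadata's structure.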
MJ: When I present a slide with the web browser on one side and Google Earth and Maps on the other, and say "everything you can do on the web of pages you will be able to do on the web of places (via a browser such as Google Maps or Google Earth)", the launch of KML Search is what has been on my mind as the most significant move in that direction.
The Google Earth and Maps teams work to geolocate all information and help users find that information geospatially. While users need both halves, the finding part is a core Google skill and one that is very useful even when what is found is not hosted at Google, as is famously the case with Google Web Search. The launch of Google KML Search initiates this Google Earth Search capability for all of the world's spatially organizable data.