The question was, "Can you hide a message in a typical commercial GIS data set so that it does not noticeably affect the accuracy of the map and thus reveal itself?" With the help of Bill Huber, a well-known expert in ArcView and the moderator of the public ArcView mailing list, we set out to explore the problem.
Working in MapInfo, I took a route that exploits the (usually) unused precision in a typical MapInfo data set. It turns out that MapInfo stores its coordinate information as long integers, and lat/lon coordinates are typically converted to these integers by simply multiplying them by a million (1.0E+6). This means that in a global MapInfo data set, coordinates are rounded to six decimal places, or about 4 inches.
However, if your data covers a smaller area than the whole globe, you can increase MapInfo's coordinate precision over the extent of your data by specifying coordinate bounds. This causes MapInfo to use up to 2 billion (2.0E+9) units of precision between the explicit limits of your data, providing much higher coordinate precision while still packing coordinates into integers. If your original data was digitized to only six decimal places, then increasing the resolution gives you some free bits to use for steganography.
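As a rough illustration of the arithmetic, the sketch below compares the size of one integer step under the default million-per-degree scaling with the step size when 2 billion units are spread over a narrower window. The 8-degree span and the metres-per-degree figure are assumptions chosen for the example, not values read from MapInfo.

```python
import math

# Approximate length of one degree at the equator, in metres (assumption).
METERS_PER_DEGREE = 40_075_017 / 360.0

def step_in_degrees(span_degrees, units=2.0e9):
    """Size of one integer step when `units` steps cover `span_degrees`."""
    return span_degrees / units

default_step = 1.0 / 1.0e6            # degrees multiplied by a million
bounded_step = step_in_degrees(8.0)   # illustrative 8-degree window

# How many bounded steps fit inside one default step, and how many
# whole bits that leaves free in each coordinate.
gain = default_step / bounded_step
spare_bits = int(math.log2(gain))

print(f"default step: {default_step * METERS_PER_DEGREE:.3f} m")
print(f"bounded step: {bounded_step * METERS_PER_DEGREE:.6f} m")
print(f"gain: {gain:.0f}x, spare whole bits per coordinate: {spare_bits}")
```

The number of spare bits depends only on the ratio of the two step sizes, so a narrower window leaves more room and a wider one leaves less.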
With this in mind, I took a data set of the New England states downloaded from the US Census Bureau's FTP site ftp://ftp.census.gov and set the bounds so they spanned 8 degrees in both directions around the data. This gave me about three more decimals of precision in which to hide a message in the coordinates. I calculated that each X-Y pair could safely hold 14 bits of message, so by slicing the message into pieces I could pack it into successive coordinates without affecting the positional accuracy of the original data at all.
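One possible way to do the packing is sketched below. It is not necessarily the scheme I used in MapInfo: the helper names, the lower bounds, and the sample vertices are all made up for illustration. Each coordinate carries 7 bits (14 per X-Y pair) as a small offset around its true position, small enough that rounding back to six decimals recovers the original value exactly.

```python
STEP = 8.0 / 2.0e9       # degrees per integer unit for 8-degree bounds (assumed)
BITS = 7                 # bits hidden per coordinate, 14 per X-Y pair
HALF = 1 << (BITS - 1)   # 64: centres the offset on the true position

def to_units(value, lower):
    """Convert a coordinate to integer units above the lower bound."""
    return round((value - lower) / STEP)

def hide(value, lower, chunk):
    """Stored integer for a 6-decimal coordinate carrying a 7-bit chunk."""
    return to_units(round(value, 6), lower) + chunk - HALF

def reveal(stored, lower):
    """Return (original 6-decimal coordinate, hidden 7-bit chunk)."""
    original = round(lower + stored * STEP, 6)
    return original, stored - to_units(original, lower) + HALF

def chunks(message):
    """Slice a bytes message into 7-bit integers, zero-padded at the end."""
    bits = "".join(f"{byte:08b}" for byte in message)
    bits += "0" * (-len(bits) % BITS)
    return [int(bits[i:i + BITS], 2) for i in range(0, len(bits), BITS)]

def unchunk(values, length):
    """Rebuild `length` bytes from a list of 7-bit integers."""
    bits = "".join(f"{value:0{BITS}b}" for value in values)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))

# Hypothetical lower bounds and sample vertices, not real Census data.
LOWER_X, LOWER_Y = -75.0, 40.0
points = [(-71.058300, 42.360100), (-72.685100, 41.763700)]
message = b"GIS"

payload = chunks(message)  # 24 bits padded to 28: four chunks, two pairs
stored = [
    (hide(x, LOWER_X, payload[2 * i]), hide(y, LOWER_Y, payload[2 * i + 1]))
    for i, (x, y) in enumerate(points)
]

recovered = [(reveal(sx, LOWER_X), reveal(sy, LOWER_Y)) for sx, sy in stored]
originals = [(rx[0], ry[0]) for rx, ry in recovered]
hidden = [chunk for rx, ry in recovered for chunk in (rx[1], ry[1])]

print(originals)
print(unchunk(hidden, len(message)))
```

The offset never exceeds 64 units, which at this step size is well under half of the sixth decimal place, so the visible coordinates are unchanged after rounding.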
Though this method proves the concept, it is quite fragile. Simply generalizing the data again by removing the bounds destroys the message completely. Reprojecting the coordinate system would also have an undesirable effect on the message. But it meets two criteria important to steganography and GIS: it is hidden from the casual observer, and it does not alter the original data content.
Working with ArcView, Bill Huber took a different approach. Unlike MapInfo, ArcView stores coordinates as double-precision floating-point numbers, and there is no way to alter that. So he elected to add false, closely spaced coordinates along the longest edge in a polygon. The information was stored in the relative distances between the false vertices; a long space represented a 1 bit and a short space a 0 bit. Sentinel bytes with distinctive bit patterns surrounded the message so that its start and end could be recognized by a decoding program no matter which way it searched through the polygon.
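The idea can be sketched in a few lines. This is not Bill's ArcView code: the sentinel patterns, the gap lengths, and the function names are assumptions made up for the example, and the decoder is handed the false vertices directly rather than having to find them in a polygon.

```python
import math

START, END = "10100111", "00011101"   # hypothetical sentinel bit patterns
SHORT, LONG = 1.0e-5, 2.0e-5          # assumed gap lengths in map units

def encode(a, b, payload_bits):
    """Place false vertices on segment a-b whose gaps spell out the bits."""
    bits = START + payload_bits + END
    length = math.dist(a, b)
    ux, uy = (b[0] - a[0]) / length, (b[1] - a[1]) / length
    d = SHORT  # distance of the first false vertex from a
    vertices = [(a[0] + ux * d, a[1] + uy * d)]
    for bit in bits:
        d += LONG if bit == "1" else SHORT
        if d >= length:
            raise ValueError("edge too short for this message")
        vertices.append((a[0] + ux * d, a[1] + uy * d))
    return vertices

def decode(vertices):
    """Read the gaps back as bits, trying both directions along the edge."""
    gaps = [math.dist(p, q) for p, q in zip(vertices, vertices[1:])]
    # The sentinels guarantee both gap lengths occur, so the midpoint
    # between the smallest and largest gap separates 0s from 1s.
    threshold = (min(gaps) + max(gaps)) / 2
    bits = "".join("1" if gap > threshold else "0" for gap in gaps)
    for candidate in (bits, bits[::-1]):
        if candidate.startswith(START) and candidate.endswith(END):
            return candidate[len(START):-len(END)]
    raise ValueError("sentinels not found")

payload = "0100100001101001"   # the ASCII bits of "Hi"
false_vertices = encode((0.0, 0.0), (1.0, 0.5), payload)

print(decode(false_vertices))
print(decode(false_vertices[::-1]))   # same edge walked the other way
```

Because the decoder compares gaps with each other rather than with a fixed length, a uniform rescaling of the coordinates leaves the decoded bits unchanged.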
This technique also creates messages invisible to the naked eye, and does not affect the positional accuracy of the real coordinates. In addition, it can withstand being subjected to alternate coordinate projections. However, one must be careful in choosing the distance between the false vertices. In our original experiment, Bill encoded a message in a map of Texas and sent it to me as a Shape file. I ran it through MapInfo's Universal Translator (which is really a subset of Safe Software's FME translator), but the process used MapInfo's default six-decimal precision, which wasn't high enough, and most of the message was destroyed. In a second attempt, Bill used larger spacings between the false vertices, and I was able to decode that one perfectly, proving that the technique could be used across different GIS platforms.
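The effect of the six-decimal rounding can be reproduced with a toy example. The spacings below are invented for illustration and are not the values Bill used: one set is finer than the sixth decimal place, the other comfortably coarser.

```python
def gaps_after_rounding(gaps, decimals=6):
    """Lay vertices along one axis, round each position, return the new gaps."""
    positions, x = [0.0], 0.0
    for gap in gaps:
        x += gap
        positions.append(round(x, decimals))
    # Round the differences too, only to suppress floating-point noise.
    return [round(b - a, decimals + 3) for a, b in zip(positions, positions[1:])]

pattern = [1, 2, 2, 1, 2, 1, 1, 2]   # short = 1, long = 2

fine = gaps_after_rounding([p * 2.0e-7 for p in pattern])
coarse = gaps_after_rounding([p * 1.0e-5 for p in pattern])

print(fine)     # several vertices collapse onto the same rounded position
print(coarse)   # the short/long pattern is still readable
```

With the fine spacing, neighbouring vertices round to the same position and the short/long distinction is lost; with the coarse spacing every gap survives intact.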