Analysis and Forecast of USGS GeoData Demands on the Web

By Bill Huber

The USGS recently switched to a private partner for serving its data on the Web. (See our article.) However, the private vendor will be able to meet only a tiny fraction of the demand for these data.

The USGS publishes "GeoData download statistics" and related pages. These statistics give the monthly volume of GeoData files downloaded, in megabytes, by data product: digital elevation models (DEMs) and digital line graphs (DLGs).

This chart displays all the USGS statistics.

USGS GeoData Monthly Download Volumes By Data Type

These download volumes show consistent exponential growth over the entire time period. The average continuous growth rate is 45 percent.
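One common way to estimate an average continuous growth rate is to fit a straight line to the logarithm of the volumes; the slope is the rate. The sketch below illustrates that idea only. It is not necessarily the method used for the figure above, the sample series is invented, and the period is left abstract because the article does not say what period the 45 percent applies to.

```python
import math

def continuous_growth_rate(volumes):
    """Least-squares slope of ln(volume) against the period index.

    The slope is the average continuous growth rate per period.
    """
    n = len(volumes)
    xs = list(range(n))
    ys = [math.log(v) for v in volumes]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical series (not USGS data): exact growth at a continuous rate of 0.45 per period
sample = [100.0 * math.exp(0.45 * t) for t in range(6)]
rate = continuous_growth_rate(sample)

print(f"continuous rate per period: {rate:.3f}")
# A continuous rate r corresponds to a percentage increase of exp(r) - 1 per period
print(f"equivalent percentage increase: {math.expm1(rate):.1%}")
```

A continuous rate is always somewhat smaller than the equivalent percentage increase, which is worth remembering when comparing the two kinds of figures.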

There are hints of seasonality: download volumes seem to drop in the summer months. We can account for that by looking at year-over-year growth rates, which compare each month with the same month a year earlier.
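As a rough illustration of why the year-over-year comparison removes seasonality, the sketch below computes ln(V[t] / V[t-12]) on an invented monthly series with a built-in summer dip. The data, the size of the dip, and the function names are all assumptions made for the example.

```python
import math

def year_over_year_rates(monthly_volumes):
    """Return ln(V[t] / V[t-12]) for every month that has a counterpart a year earlier."""
    return [
        math.log(monthly_volumes[t] / monthly_volumes[t - 12])
        for t in range(12, len(monthly_volumes))
    ]

def sample_volume(month_index):
    """Hypothetical volume: steady exponential trend with a 20% summer dip."""
    trend = 100.0 * math.exp(0.05 * month_index)
    # Indices 5, 6, 7 are June to August if index 0 is January
    summer_factor = 0.8 if month_index % 12 in (5, 6, 7) else 1.0
    return trend * summer_factor

volumes = [sample_volume(m) for m in range(24)]
rates = year_over_year_rates(volumes)

for month, r in enumerate(rates, start=1):
    print(f"month {month:2d}: year-over-year continuous rate = {r:.3f}")
```

Because the same seasonal factor appears in both the numerator and the denominator, it cancels, and every month in this made-up series shows the same underlying rate.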

Year-over-year continuous growth rate in total download demand for USGS GeoData

There is not much of a pattern here. What is clear, though, is that if growth continues (and there is no sign that it will slow down), then sometime during this year demand will be 90% higher than in the same month of the previous year. Sustained demand for these data will soon exceed 500 gigabytes per month.

The vendor uses T1 lines for serving the data. A T1 line delivers 1.5 Mbps (megabits per second), which equates to about 150 kilobytes per second under ideal conditions. Used continuously, one such line could deliver 420 gigabytes (GB) in one month.
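The monthly capacity can be checked with a few lines of arithmetic. The sketch below assumes a 30-day month and decimal units (1 GB = 1,000,000 KB); both are assumptions, since the article does not say how its figure was rounded.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def monthly_capacity_gb(kilobytes_per_second, days=30):
    """Gigabytes delivered in `days` of continuous use (1 GB = 1,000,000 KB)."""
    return kilobytes_per_second * SECONDS_PER_DAY * days / 1_000_000

# 1.5 megabits per second is 187.5 kilobytes per second with no overhead at all
raw_kb_per_s = 1_500 / 8

practical = monthly_capacity_gb(150)           # the article's "ideal conditions" rate
ceiling = monthly_capacity_gb(raw_kb_per_s)    # theoretical upper bound

print(f"at 150 KB/s:   {practical:.1f} GB per month")
print(f"at 187.5 KB/s: {ceiling:.1f} GB per month")
```

Under these assumptions the 420 GB figure falls between the two results, so it is best read as a round estimate of what one line can deliver.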

The vendor originally acquired one T1 line to serve these data. Reacting to users' complaints about impossibly slow download speeds, the vendor has ordered a second line. It looks as though two T1 lines could handle the projected demand, but that is an illusion, for several reasons.

First, demand is not even. Most demand occurs during working hours, plus a little in the evening. With typical usage patterns, at best only half of a server's bandwidth is actually utilized. That brings the capacity of one T1 line down to a more moderate 200 GB per month.

Second, there is going to be a sustained spike in download volumes because of an "overhang" in demand. The USGS has withdrawn earlier versions of its DEMs, which now account for half of all the download volume. DEM users will need to download the new, improved DEMs. We cannot predict how much of an increase this will be, but it should be substantial: total DEM downloads to date exceed 2,100 GB. It will take a long while for people to download all that again.

Third, the SDTS "decimal meters" format, used for many of the new DEMs, uses 32-bit numbers to represent elevations rather than 16-bit numbers. That will almost double the sizes of the DEMs in this format. Because DEMs account for about half of the download volume, this could instantly increase download needs by 50%, to about 750 GB per month.
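The 50% figure follows from doubling the DEM half of the demand while leaving the rest unchanged. A minimal sketch of that arithmetic, using the article's round numbers and treating "almost double" as exactly double (a simplifying assumption):

```python
def demand_after_format_change(total_gb, dem_share, dem_size_factor):
    """Scale only the DEM portion of demand by `dem_size_factor`."""
    dem_gb = total_gb * dem_share
    other_gb = total_gb - dem_gb
    return other_gb + dem_gb * dem_size_factor

current_gb = 500    # projected monthly demand from the article
new_total = demand_after_format_change(current_gb, dem_share=0.5, dem_size_factor=2.0)
increase = new_total / current_gb - 1

print(f"new monthly demand: {new_total:.0f} GB ({increase:.0%} increase)")
```

This is an upper estimate, since only some of the new DEMs use the 32-bit format.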

Fourth, the average size of the DEMs recently downloaded is only 150 kilobytes. These must be compressed versions of the 30-meter resolution DEMs. As more and more of the 10-meter DEMs, which are nine times the size, come on-line, there will be an enormous increase in bandwidth needs.
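The factor of nine comes from the grid spacing: cutting the spacing to one third gives three times as many elevation values in each direction. The sketch below works that out and estimates a download size, assuming (hypothetically) that the 10-meter files compress as well as the 150-kilobyte files do.

```python
def relative_size(old_spacing_m, new_spacing_m):
    """Ratio of grid cell counts for the same ground area at two grid spacings."""
    return (old_spacing_m / new_spacing_m) ** 2

ratio = relative_size(30, 10)     # 3 x 3 times as many cells
estimated_kb = 150 * ratio        # assumes the same compression ratio as the 150 KB files

print(f"size ratio: {ratio:.0f}x, estimated download: {estimated_kb:.0f} KB")
```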

Without even considering the overhang, it is clear the vendor needs at least four T1 lines just to handle DEM and DLG downloads this year. The overhang could increase that need to six lines or more.
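The line count can be reproduced by dividing monthly demand by the usable capacity of one line and rounding up. The sketch below uses the article's round figures of 750 GB of demand and 200 GB of usable capacity per line; the function name is just for illustration.

```python
import math

def lines_needed(monthly_demand_gb, usable_gb_per_line):
    """Smallest whole number of lines whose combined usable capacity covers demand."""
    return math.ceil(monthly_demand_gb / usable_gb_per_line)

USABLE_GB_PER_T1 = 200    # about half of the continuous-use capacity, per the article

with_32_bit_dems = lines_needed(750, USABLE_GB_PER_T1)
without_format_change = lines_needed(500, USABLE_GB_PER_T1)

print(f"lines for 750 GB per month: {with_32_bit_dems}")
print(f"lines for 500 GB per month: {without_format_change}")
```

Even at 500 GB per month, before the larger file format is counted, two lines fall short.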

Things get worse when you consider what is happening at the vendor's end. The vendor also serves digital raster graphics (DRGs). These are enormous files, typically ten times the size of a DEM covering the same area. When we contacted the USGS to talk about these issues, they speculated that DRG demand on the vendor's site could be the cause of the bottlenecks.

Now for the kicker. The vendor is offering the data for free (which is part of its obligation to the USGS) only at restricted download speeds of 2.5 kilobytes per second. To do this, it has to segment its bandwidth into "premium" pipes and free pipes. This means it will have to add even more T1 lines (or else turn away non-premium customers).
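To get a feel for what the throttled rate means in practice, the sketch below combines figures quoted in this article: the 2.5 KB/s free rate, the 150 KB average DEM download, and the 150 KB/s ideal T1 throughput. Dedicating a whole line to free downloads is a hypothetical scenario for illustration, not something the vendor has announced.

```python
FREE_RATE_KB_S = 2.5      # throttled free download speed
LINE_RATE_KB_S = 150      # ideal T1 throughput
AVERAGE_DEM_KB = 150      # average size of recently downloaded DEMs

# Time for one average DEM at the throttled rate
seconds_per_dem = AVERAGE_DEM_KB / FREE_RATE_KB_S

# Number of throttled downloads one fully dedicated line could carry at once
streams_per_line = int(LINE_RATE_KB_S // FREE_RATE_KB_S)

print(f"one average DEM takes {seconds_per_dem:.0f} seconds at the free rate")
print(f"one T1 line carries at most {streams_per_line} free downloads at a time")
```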

So the vendor will need at least four T1 lines for free DEM and DLG downloads, another for DRGs, another for its Web pages and advertisements, another for "premium" customers, and an unknown number of temporary lines for the DEM overhang.

That is not going to happen: T1 lines are expensive. Instead, the limited bandwidth will frustrate users and push them toward alternatives, such as paying for the data and having them delivered on CD-ROM.

Published Friday, August 10th, 2001

