Multicasting and cache systems ease bandwidth pressures
The government's busiest World Wide Web servers, like other hot spots on the Internet,
constantly struggle to add fatter hardware and pipes for better response and more
connections. But supply never quite meets demand.
It may be time to concede that the bigger-hammer approach isn't always the answer.
Could you rethink the way you move your data and strip out some of the redundancy instead?
Recent National Science Foundation "incubator" grants for new network
technologies focus on two potential bandwidth saviors--multicasting and distributed cache
systems. Both are in limited use on the Internet.
Multicasting has been available for the past few years on the MBONE, a core group of
high-end Internet servers that route IP multicast packets. They establish point-to-point
links called tunnels that feed machines running an "mrouted" multicast routing
daemon. The IP multicast packets are encapsulated for transmission through the tunnels.
As for caches, your desktop machine already uses them to speed up local loading of
graphics on Web pages you visit often. And the Internet's domain name servers keep
addresses in cache for several days to accelerate repeat connections.
The NSF-sponsored research could push multicasting and caching in new directions, so we
can all use our limited bandwidth more efficiently.
Multicasting promises to deliver often-requested data in a one-to-many fashion that
eliminates the need for each user to set up a separate connection. Unlike conventional
downloading, this data streaming lets you view or listen to the data as it arrives rather
than after complete download.
Right now, multicasting is used mostly for audio or video delivery. But its real
potential lies in letting sites capture broadcasts of any type of data. Regional servers
could be set to listen for updates from a central address at certain times of day--crop
forecasts, weather updates or public service information. That way, the regional servers
wouldn't have to waste bandwidth requesting separate copies.
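To make the regional-server idea concrete, here is a minimal Python sketch of a listener that joins an IP multicast group and collects whatever is sent to it. The group address 224.1.1.1, port 5007 and the function names are assumptions chosen for illustration; they do not come from any of the services mentioned in this column.

```python
import ipaddress
import socket
import struct

# Hypothetical group and port, for illustration only.
GROUP = "224.1.1.1"
PORT = 5007


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure used to join an IPv4 multicast group."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # Group address followed by the local interface (0.0.0.0 lets the OS choose).
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_listener(group: str, port: int) -> socket.socket:
    """Return a UDP socket subscribed to the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group asks the network to deliver the shared stream here.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def capture(sock: socket.socket, max_packets: int) -> list[bytes]:
    """Collect up to max_packets datagrams sent to the group."""
    return [sock.recvfrom(65535)[0] for _ in range(max_packets)]
```

A regional server would call open_listener(GROUP, PORT) shortly before the scheduled broadcast and pass the socket to capture(). Every listener that joins the same group receives the same packets, so the sender transmits each update once no matter how many sites are listening.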
The downside is that people who tap into a live data stream miss whatever was sent
before they connected or after they disconnected. To capture a full broadcast, you must
know ahead of time when it's being sent and have a system set up to listen for it.
Government uses for multicasting include multimedia training and press conferences. For
example, the Washington-based Internet Multicasting Service at http://town.hall.org/radio/index.htm
has experimented with live congressional audio broadcasts.
The Stern School of Business at New York University has developed a Java application
that acts as an interface to the Securities and Exchange Commission's EDGAR database. You
can test it at http://allan.stern.nyu.edu/cgi-bin/JavaTicker/ticker_pref.pl,
but you'll see only 2-day-old SEC filings. The Java applet saves bandwidth by
streaming data like ticker tape on your screen, so the server needn't field multiple,
Netscape Communications Corp. is gambling that data streaming will become a popular
solution. It has an audio-streaming server, code-named Salmon, and a browser plug-in,
code-named Trout. Microsoft Corp. is developing a streaming server called NetShow, which
will challenge the small market built by Progressive Networks Inc.'s RealAudio Server and
Xing Technology Corp.'s StreamWorks server.
For a good introduction to multicasting and data streaming, visit http://www.mediadesign.co.at/newmedia/more/mbone-faq.html.
The cache solution takes a different approach: Files that become popular download
targets get duplicated to several regional locations. Today, if a California user tries to
download from a Web page hosted in the nation's capital, the files must travel across the
country even if another user in the same California office just downloaded identical files.
In a distributed cache system, the second user would receive the files from proxy
caches on a much closer regional hub. Copies would carry expiration times to prevent
server overload, and a parity or date check could be used to see if a file has been
updated since it was mirrored.
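The expiration-and-date-check idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real proxy product: the RegionalCache class, the one-hour default lifetime, and the fetch and get_modified callbacks (which stand in for requests to the origin server) are all hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    body: bytes
    last_modified: float  # origin's modification time when the copy was made
    expires_at: float     # local time after which the copy must be rechecked


class RegionalCache:
    """Hypothetical proxy cache: serve local copies, recheck them on expiry."""

    def __init__(self, fetch, get_modified, ttl=3600.0, clock=time.time):
        self.fetch = fetch                # url -> (body, last_modified)
        self.get_modified = get_modified  # url -> last_modified (cheap date check)
        self.ttl = ttl
        self.clock = clock
        self.entries = {}

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now < entry.expires_at:
            return entry.body  # still fresh: no traffic to the origin
        if entry is not None and self.get_modified(url) <= entry.last_modified:
            entry.expires_at = now + self.ttl  # unchanged: renew the local copy
            return entry.body
        # Missing or out of date: fetch a full copy across the long-haul link.
        body, modified = self.fetch(url)
        self.entries[url] = Entry(body, modified, now + self.ttl)
        return body
```

In this sketch the second California user is served from the regional copy. The full file crosses the country again only when the date check shows the original has changed.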
For a good description of the different forms of caches in use today, and the legal
issues surrounding the distribution of mirrored copies, see Lisa Sanger's "Caching on the Internet."
Hey, fed webmasters who run Microsoft Windows NT servers: Don't put your perl
executable programs in your cgi-bin. Most people know this, but the Computer Emergency
Response Team in Pittsburgh has noticed that it's an ongoing security problem.
A program called Latro, freely available on the Internet, lets hackers probe your site
to see whether you've misconfigured your server and placed perl scripts for sorting and
storing forms information in the wrong place. Intruders can get in that way and run
programs already on your server--or their own dangerous ones. Details are available at http://w4.lns.cornell.edu/~pvhp/perl/ntperl.html.
Shawn P. McCarthy is a computer journalist, webmaster and Internet programmer for
GCN's parent, Cahners Publishing Co. E-mail him at firstname.lastname@example.org.