
Web Caching: Ray of Hope

DQI Bureau

The growing popularity of the Internet is straining already overtaxed
bandwidth. Numerous efforts to address Net congestion are under way. But while
solutions such as adding bandwidth, digital subscriber lines or optical fiber
networks aim to improve the transfer of information over the Internet, the root
cause of Net congestion remains ignored.


The repeated transfer of frequently requested information constitutes the bulk
of Internet traffic. This being the case, no amount of added bandwidth alone
will be enough. Installing faster circuits, routers or switches can reduce
congestion, but will not reduce the round-trip time between two nodes.
Trans-oceanic Internet links have round-trip delays in the range of 100–300
milliseconds. The speed of light also imposes a fundamental limit on network
delays: when the transmission medium is fiber, data travels at around 60% of
the speed of light, or roughly 180,000 kilometers per second. With raw
bandwidth unable to solve the problem, a different technology, Web caching, is
attracting attention.
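
As a rough illustration of the speed-of-light limit, the sketch below estimates
the minimum round-trip time over fiber. The 60% figure comes from the text; the
two distances are assumed round numbers chosen only for illustration, and real
links add router and queuing delays on top of this lower bound.

```python
# Illustrative arithmetic only: the distances are assumptions, not measurements.
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FRACTION = 0.60              # fraction of c quoted in the article


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, ignoring routers and queues."""
    fiber_speed_km_s = SPEED_OF_LIGHT_KM_S * FIBER_FRACTION
    one_way_s = distance_km / fiber_speed_km_s
    return 2 * one_way_s * 1000.0


# Hypothetical distances: a cache at a local ISP versus a server overseas.
for label, km in [("nearby ISP cache", 50), ("trans-oceanic server", 12_000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.1f} ms")
```

Under these assumptions the overseas round trip costs over a hundred
milliseconds before any equipment delay, while the nearby cache costs under a
millisecond. No amount of extra bandwidth changes that gap.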

What’s Web caching?

Caching, in general terms, is a technique that speeds up access to data by
storing frequently requested data close to where it is needed. CPUs use caches
to speed up access to memory, and the technique is also used in operating
systems and satellite communications. In Web caching, high-demand content is
sent from a server to an ISP’s cache via satellite. Because the data needed
most often is stored near its users, it does not have to be transferred
repeatedly over long stretches of network. Put simply, it is like keeping a
pile of folders on your desk to save yourself umpteen trips to the filing
cabinet across the room. Web caching may hence be one of the easiest ways to
ease Net congestion.


How does it work?

When a user makes a request, it is first processed by the ISP’s cache. The
cache checks whether it holds a valid reply for the request. If it does, it
performs a second check, this time on the freshness of the information. If the
information is fresh, it is served to the user. If the information is
unavailable or stale, the cache makes a request on the user’s behalf. This
request is routed through the satellite to the cache service’s ground station.
Fresh, updated information is transmitted back to the ISP’s cache, which then
passes it on to the user.
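
The lookup steps above can be sketched in a few lines. This is a minimal
illustration, not how any particular cache product works: the class name
`IspCache`, the fixed time-to-live used as the freshness test, and the
`fetch_from_origin` callback (standing in for the trip through the satellite
link) are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Entry:
    body: str
    stored_at: float  # when the cache stored this copy
    ttl_s: float      # assumed freshness lifetime in seconds

    def is_fresh(self, now: float) -> bool:
        return (now - self.stored_at) < self.ttl_s


class IspCache:
    """Hypothetical cache: serve fresh copies, refetch missing or stale ones."""

    def __init__(self, fetch_from_origin: Callable[[str], str],
                 ttl_s: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch_from_origin
        self._ttl_s = ttl_s
        self._clock = clock
        self._store: Dict[str, Entry] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        # First check: is there a stored reply? Second check: is it fresh?
        if entry is not None and entry.is_fresh(now):
            return entry.body
        # Missing or stale: request it on the user's behalf and store the copy.
        body = self._fetch(url)
        self._store[url] = Entry(body, now, self._ttl_s)
        return body
```

A call such as `IspCache(my_fetch).get("http://example.com/")` would go to the
origin once and serve later requests locally until the assumed lifetime runs
out.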

Apart from serving the most recent versions of Web pages, an ISP’s cache needs
to be able to communicate in various protocols such as FTP, Gopher and HTTP.
When the user makes an FTP request, the cache should be able to use FTP when
requesting the file from the FTP server and, in turn, translate the FTP reply
into an HTTP one for the user.


A cache is more than just a local storage depot. When the cache is full,
existing objects must be removed to make room for new ones. Which objects are
removed is decided by a replacement algorithm.
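
The article does not say which replacement algorithms are used. As one common
example, the sketch below shows least-recently-used (LRU) replacement, where
the object that has gone longest without a request is dropped first. The class
name `LruCache` and the count-based capacity are assumptions for illustration;
a real Web cache would more likely budget by bytes.

```python
from collections import OrderedDict


class LruCache:
    """Illustrative LRU replacement: evict the least recently used object."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used
```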

Speedier access

Caching has several advantages: it reduces bandwidth consumption, server load
and latency. The greatest advantage of Web caching is the reduction in backbone
network traffic. It can reduce the demand for bandwidth by at least 35% and
enable better use of the existing bandwidth. Since the load on content servers
is also reduced, more users can reach a server’s documents without extra
bandwidth and without crashing the server. Storing data at local ISPs means
shorter distances for data to travel, and thereby lower latency. A cache can
also isolate end-users from network failures.


Big gainers: Multimedia and e-com

Caching is best suited to static pages and large files that are downloaded
often. Multimedia files such as movie trailers or music tracks benefit most
from caching. Besides being large, these files are prone to jitter, that is,
random variation in the delivery of individual packets. Since caching shortens
the distance between the user and the information, such distortions can be
greatly reduced.

Caching can also aid e-commerce. According to a report by Zona Research, as
much as $4.4 billion per year in potential B2C revenues might be lost because
of slow download speeds, as shoppers faced with slow pages migrate to other Web
sites. Web caching can help e-commerce sites retain shoppers by caching
portions of their Web pages. Static pages such as home pages, and fixed
elements like the company logo, copyright information and navigational buttons,
usually remain unchanged. With these portions cached, only the new elements of
a page need to be transmitted.


The flip side: not too bad

Caching, as of today, has some limitations. The biggest issue is the risk of
serving stale information. How recent are the pages in the local cache? How
often can these caches refresh their data? These questions form the basis of
any discussion on the implementation of caches.

Existing HTTP servers are incapable of informing caches about updated objects.
However, newer standards like HTTP 1.1 include features that allow users to
specify freshness parameters and allow page authors to decide which parts of a
page should be cached. HTTP 1.1 introduces the Cache-Control header, an
improvement over the Expires header of the earlier HTTP standard.
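
To make the two headers concrete, the sketch below shows one way a cache might
work out how long a response stays fresh. It is a simplified reading of HTTP
1.1, not a complete implementation: the function name `freshness_lifetime_s` is
made up for this example, header names are assumed to arrive with exactly the
capitalization shown, and `no-cache` is treated as "zero lifetime" although the
standard strictly requires revalidation instead.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_lifetime_s(headers: Mapping[str, str],
                         response_time: datetime) -> Optional[float]:
    """Seconds a response may be served from cache, or None if unspecified.

    response_time must be timezone-aware (for example, in UTC).
    """
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]
    if "no-store" in directives or "no-cache" in directives:
        return 0.0  # simplification: never serve without contacting the server
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                # In HTTP 1.1, max-age takes precedence over Expires.
                return float(int(directive.split("=", 1)[1]))
            except ValueError:
                break  # malformed max-age: fall back to Expires
    expires = headers.get("Expires")
    if expires:
        try:
            delta = parsedate_to_datetime(expires) - response_time
            return max(0.0, delta.total_seconds())
        except (TypeError, ValueError):
            return 0.0  # an unparsable date is treated as already expired
    return None


now = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(freshness_lifetime_s({"Cache-Control": "public, max-age=300"}, now))
```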


Active caching addresses the freshness issue more directly. Rather than waiting
for a page request before checking a Web object’s freshness, active caching
identifies objects that are likely to go stale and ‘pre-fetches’ them. The
algorithms behind it weigh factors such as how often the object is requested,
how often it changes and the bandwidth cost of retrieving it.
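
The article names the factors but gives no formula, so the scoring below is
purely an assumption meant to show how the three factors might be combined. It
favors objects that are requested often and change often, and penalizes objects
that are expensive to fetch. The names `CachedObject`, `prefetch_score` and
`pick_prefetch_candidates`, and the use of size as a stand-in for bandwidth
cost, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedObject:
    url: str
    requests_per_hour: float  # how often users ask for it
    changes_per_hour: float   # how often the origin copy changes
    size_kb: float            # stand-in for the bandwidth cost of a refetch


def prefetch_score(obj: CachedObject) -> float:
    """Assumed heuristic: popular, fast-changing, cheap objects score highest."""
    return obj.requests_per_hour * obj.changes_per_hour / max(obj.size_kb, 1.0)


def pick_prefetch_candidates(objects: List[CachedObject],
                             limit: int) -> List[CachedObject]:
    """Return up to `limit` objects, highest score first."""
    return sorted(objects, key=prefetch_score, reverse=True)[:limit]
```

With this weighting, a frequently updated news page would be refreshed ahead of
a company logo that never changes, however popular the logo is.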

Caching also has some undesirable side effects. One is the possibility of
undetected content modification and violation of copyrights. With cyber crime
on the rise, there are concerns that caches could become targets for hacking.
The fear that content could be altered without the content provider’s
permission also deters the implementation of caching. Moreover, access counts
will lose their meaning as Web caching becomes widespread, because requests
served from a cache never reach the origin server. This takes away from content
providers a vital tool for judging the performance of a site. Additionally,
operating a cache requires new equipment and personnel.

The benefits of caching, however, will be too strong an incentive to resist. In
India, Web caching can considerably ease the pressure on the existing
bandwidth, even as an expedited effort is made to meet Nasscom’s projected
bandwidth requirement of 300 Gbps by 2005.

Priya Sivakumaran in New Delhi
