
STORAGE: Your Data is Your Business

DQI Bureau

25 Jan 2002 00:00 IST

'640K ought to be enough for anybody.'

Microsoft chairman Bill Gates, 1981

Information overload

A call center company began operations three years ago with 50 employees and
five large MNC clients. When the IT-enabled services sector boomed, its client
list shot up to 70 and its headcount to 1,000. With that growth came terabytes
of customer data to manage across remote locations. The company was totally
unprepared: it had neither a suitable infrastructure for storing this data nor
the means to handle an emergency such as a virus attack or a network breakdown.
Before things could be organized, the servers crashed under the load. The
company not only lost a huge amount of data but also faced considerable
embarrassment, and its business suffered.

Data dropout

M&P Financials, a ten-year-old finance company dealing in stocks and
trading, was relatively more IT-savvy. It maintained its records on various
storage media such as disks and tapes, and employees drew on this rich
repository of information while dealing with clients, planning investments and
running other operations. Although the company had taken basic security
measures, it had no recovery mechanisms in place. As it expanded operations and
shifted to new IT systems, about 20% of its data was lost in the migration. By
the time the loss was discovered, it was too late to do anything about it.

Best Practices

Affordability and RoI: A cost-benefit analysis is essential. Even when
you have decided to implement network storage, the return on investment
may not justify the costs. A comprehensive SAN could cost crores of
rupees, putting it beyond the reach of most small enterprises.
Calculating RoI is a prerequisite.

Ease of Configuration and Interoperability: Companies implementing
network storage often buy hardware from one company and software from
another, while a third supplies the components needed to connect
everything together. This can pose serious interoperability problems if
you don’t check the compatibility of these solutions.

Scalability, Performance and Upgrade: Your requirements are expanding at
an unpredictable pace. It is therefore essential to build your systems
to be highly scalable and to allow easy, non-disruptive upgrades.

Data Integrity and Security: Data is a company’s most valuable asset,
and any loss of critical information can cripple its operations. The
lack of industry standards also heightens concerns about security. All
necessary safeguards must be built into your network architecture.

Vendor Support and Services: If you do not have in-house resources to
manage your storage, vendor support becomes all the more critical. As
the solution combines hardware and software, look for services that
include applications support.

Don’t bank on backups

A Web design agency religiously backed up its work every week. Then one day
last November, it urgently needed to recover some jobs that were five months
old. It found that its copy of the restore tool in Central Point Backup, which
it used for the backups, was corrupted; that version was no longer available,
and newer versions didn’t support the archived data format. The agency couldn’t
recover the data in time and failed to deliver the project. It had backed up by
the book, but it had never tested recovery…
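
The lesson here is to test recovery, not just backup. As a minimal sketch of what such a test could look like, the Python below compares a trial restore against the original files using SHA-256 checksums. The function names and the two-directory layout are assumptions made for illustration; they are not part of any backup product mentioned in this article.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> List[str]:
    """List files that are missing or differ after a trial restore.

    Hypothetical helper: assumes the backup tool has already restored
    its archive into restored_dir with the same relative layout.
    """
    problems = []
    for original in sorted(source_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(original) != sha256_of(restored):
            problems.append(f"corrupt: {relative.as_posix()}")
    return problems
```

An empty list means every file came back intact. Running a check like this on a schedule, against a scratch directory, would have exposed a broken restore path long before a deadline depended on it.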

The case for storage is quite straightforward. With business requirements
dictating ever more storage space, the shortcomings of managing discrete
islands of storage have become more pronounced. Backing up isolated storage
devices takes additional manpower and cost. And because such storage cannot be
easily scaled or even reallocated, organizations are forced to over-buy, yet
they cannot keep enlarging their IT budgets to match the growth of storage
requirements. The traditional direct-attached storage mechanism might be fine
for individual needs or for a small setup. But it is difficult to manage over a
large network and leaves the IT manager with many problems: data replication, a
lack of cost-effective data-sharing technologies, and an abundance of
unattended files and stale data that their creators use, forget about and never
delete from their disks.

What to buy?

Before you make a buying decision, it is important to carefully review your
company’s needs and resources. Once you have your requirements clearly laid
out, you can determine which type of storage solution is best for you. Very
small organizations depend on simple hard disks and CD rewriters on PCs to
satisfy their storage needs. As the organization grows and the need to share
stored data increases, the primary storage option moves to the network, onto
network-attached servers. As the organization scales up even further, the sheer
processing power required to pull the data out makes it worthwhile to separate
storage from the application server and network server. At this stage, you are
ready to implement a SAN, or storage area network.

Topologies

A look at DAS, NAS, and SAN, the three main topologies used in enterprise
storage, and their pros and cons.

DAS (Direct Attached Storage): This is the most basic and widely used method
of storage, in which storage devices such as hard disks or tapes are directly
connected to the server through either a SCSI (Small Computer System Interface)
or a Fiber Channel link. They can sit inside the same box as the server or
reside outside it, connected by a cable. DAS is easy to implement and gives
reasonably good performance at low acquisition and administrative costs. A DAS
setup can have a single hard disk, multiple independent hard disks called JBOD
(Just a Bunch of Disks), or an array of hard disks configured for fault
tolerance, known as RAID (Redundant Array of Independent Disks). The most
common interfaces and protocols used in DAS are SCSI, Fiber Channel, and SSA
(Serial Storage Architecture).

The traditional DAS model has its limitations and no longer suffices for
growing business needs. Its capabilities are limited to the server it is
connected to, so if the server is down, users can’t access their data. It can’t
be connected over long distances, and its storage capacity is limited compared
with NAS and SAN. DAS devices also pose management problems, as you have to
manage the data on a server-by-server basis.

Forecast

The quick acceptance and adoption of storage software solutions during
2001 can be expected to continue at an accelerated pace over the next
four years. Enterprises will keep demanding more sophisticated software
to help manage increased storage capacities, driven in part by business
and disaster recovery requirements. With this growth comes increased
complexity in underlying storage architectures such as SAN. Moreover,
with companies requiring 100% uptime and data availability, taking
systems offline to back up data or add capacity is simply not an option.

With enterprise business applications widening their scope, newer
space-eating apps will come up every day, pushing for more storage and
requiring high-availability solutions. In step with global trends, more
and more companies are looking for location-independent storage, which
means network storage options such as SAN or NAS. The need for data
protection, backups and disaster recovery mechanisms is becoming more
pronounced with rising volumes of data. Other concepts coming up include
storage over IP, IP-SAN, end-to-end second-generation switched fiber
products, virtualization of storage, the competing iSCSI and InfiniBand
technologies, Virtual Interface (VI), and consolidation of storage,
servers and applications. All these will lead to the evolution of more
open standards, which can enhance interoperability among products from
various vendors.

TEAM DQ

NAS (Network Attached Storage): Unlike DAS devices, which connect to a
particular server, NAS devices connect directly to your existing Ethernet
network and are independent of the server. So if the server fails for some
reason, data is still available to users on the network. NAS devices can be
placed anywhere on your network and can be accessed by any user, much like a
network printer that anyone can use when required. NAS devices support most
network protocols such as NFS for UNIX, CIFS for Microsoft, FTP, and HTTP. This
makes them useful in heterogeneous networking environments. NAS devices
comprise a number of hard disk drives and come with their own OS and management
software.

They can be used for a variety of applications, including Web caching for
proxy servers, backup, databases, print spoolers, or simply as file servers.
Storage capacities for NAS devices can range from 2 GB to over 2 TB. A benefit
of NAS is that it allows organizations to set up a storage solution using their
existing Ethernet backbone, without investing in a separate network. On the flip
side, since NAS operates on networks primarily designed for general data
transmission, performance issues such as network congestion arise. Also, if you
exceed your storage capacity, then you must add another NAS device.

SAN (Storage Area Network): The latest technology in the area of storage is
the SAN, or storage area network. As the name suggests, a SAN is a separate
network linked to your company’s main network via a high-speed interface such
as Fiber Channel or SCSI. This kind of solution is useful for companies with
high transaction volumes, such as banks and customer-service-oriented
organizations, that need quick access to data at any time.

A SAN consists of multiple storage systems and servers and is much faster than
a NAS system. That’s because, unlike NAS, the various storage devices in a SAN
are connected through a high-speed interface such as Fiber Channel or SCSI.
Fiber Channel typically has a data transfer rate of 1 Gbit/sec, or roughly
100 MB/sec. SCSI transfer rates are lower, about 80 MB/sec. In future, however,
these devices are likely to communicate over Gigabit Ethernet using iSCSI or
Fiber Channel over TCP/IP (FCIP). The various storage devices in a SAN
interface with the company’s main network via switches and hubs, and can be
simultaneously accessed by multiple servers and computers. All SAN components
are controlled using SAN software, which allows users and system administrators
to remotely control its functioning.

Prices

Storage devices can set you back by...

Hard-disk drive: The Maxtor 100 GB drive costs around Rs 12,800. An
18 GB Ultra 160 SCSI drive would cost you around Rs 16,720; SCSI drives
are commonly used in high-end machines such as servers.

CD-Rs/CD-RWs: Branded CD-R and CD-RW discs cost around Rs 40 and Rs 200
each, respectively. A CD-ReWriter can cost Rs 8,000-14,000. The Aopen
CD-ReWriter costs Rs 8,000.

Removable storage devices: Portable, removable storage devices are not
so expensive anymore. A 100 MB Zip drive costs Rs 5,000-7,000, while a
Jaz drive or mobile drive costs Rs 20,000-25,000.

Tape drives: HP’s SureStore DLT1e drive, backing up 80 GB of data, costs
Rs 80,000.

Calculating TCO

Various factors such as hardware cost, software, implementation and related
services have to be taken into consideration. For a mid-size growing
enterprise, a typical basic storage solution backing up about 50 to 60 machines
needs a tape library (Rs 3-20 lakh), a backup server machine (Rs 2-4 lakh) and
backup software (Rs 15-20 lakh). The remaining cost depends on the databases
you are using (SQL Server, Oracle Server etc.), the applications (SAP, CRM, MS
Exchange, ASP applications and other e-biz applications), implementation and
training (Rs 1-4 lakh), and how much is outsourced.
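
To see what these ranges add up to, here is a small Python sketch that totals the low and high ends of the four items priced above. It is only a rough baseline: database, application and outsourcing costs are left out because the article gives no figures for them, and the dictionary and function names are made up for illustration.

```python
# Cost ranges in lakh of rupees, as quoted in the paragraph above.
COST_RANGES_LAKH = {
    "tape library": (3, 20),
    "backup server": (2, 4),
    "backup software": (15, 20),
    "implementation and training": (1, 4),
}


def tco_range(cost_ranges):
    """Return (low, high) totals for a dict of (low, high) cost ranges."""
    low = sum(lo for lo, _ in cost_ranges.values())
    high = sum(hi for _, hi in cost_ranges.values())
    return low, high


low, high = tco_range(COST_RANGES_LAKH)
print(f"Baseline outlay: Rs {low} to {high} lakh")  # Rs 21 to 48 lakh
```

The spread between the two totals comes mostly from the tape library, so that is the line item worth sizing carefully before asking for quotes.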

Ensuring 24x7 availability

There are many elements in the storage infrastructure that need to be
considered in order to achieve true high availability levels. The storage
subsystem itself must be designed for high availability. Most advanced models
use RAID techniques to avoid the impact of media failures. They have redundant
components to ensure that access to data can be maintained during any
maintenance, upgrade or failure conditions, and provide facilities to allow all
microcode or firmware updates to take place without interrupting the
applications using the storage subsystem. The next consideration is the
connection from the server to the storage device. A switched Fiber Channel
infrastructure will provide a highly flexible and available connection. When
used in conjunction with path management middleware, multiple paths can be
provided, removing the impact of a failure in a connection path.
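
The value of multiple paths can be put in rough numbers. The Python sketch below assumes that paths fail independently of one another, which real installations only approximate, and uses a purely hypothetical 99% availability for a single path. It illustrates the arithmetic and is not a measurement of any product.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(path_availability: float, paths: int) -> float:
    """Probability that at least one of several independent paths is up."""
    if not 0.0 <= path_availability <= 1.0:
        raise ValueError("path_availability must be between 0 and 1")
    if paths < 1:
        raise ValueError("at least one path is required")
    # All paths must be down at once for access to be lost.
    return 1.0 - (1.0 - path_availability) ** paths


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours per year without access at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical figure: each path is up 99% of the time.
for paths in (1, 2):
    availability = combined_availability(0.99, paths)
    hours = downtime_hours_per_year(availability)
    print(f"{paths} path(s): {availability:.4%} available, "
          f"about {hours:.1f} hours down per year")
```

Under these assumptions a single path loses roughly 87.6 hours of access a year, while two paths bring that to under an hour, which is why path management middleware is worth considering alongside a switched fabric.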

Benefits of storage consolidation

Most organizations look to consolidate the storage requirements of several
servers to ease the task of allocating new storage to application servers,
especially as their requirement for storage grows. Consolidating the requirement
of these servers into one storage pool enables companies to build in spare
storage that can be made available to the right server at the right time. The
result is a more available set of applications and reduced storage management
costs.

Team DQ

Products and Tools

Some commonly used storage devices and how you can assess them

Hard-disk drive: The most common storage device, the hard-disk drive can
now store up to 100 GB of data. And with prices of hard disks coming
down, you can easily get a fast, high-capacity drive at a decent price;
Maxtor’s 100 GB drive is one example. Various factors affect the
performance of a drive, such as seek time, spindle speed, data transfer
rate, CPU utilization and latency. Disk drives are available with two
different interfaces: IDE (Integrated Device Electronics) and SCSI
(Small Computer System Interface). An interface is basically the channel
over which data flows to and from the hard disk. SCSI drives are faster
and more expensive than IDE drives, and are found in servers. In
mid-range servers, you are likely to find redundant arrays of disk
drives. For instance, an array of three 36 GB SCSI drives in a RAID-5
configuration (under Rs 1 lakh, when bought with a server) will give you
72 GB of fault-tolerant storage, because one drive’s worth of capacity
goes to parity. That’s food for thought, and a good place for you to
start.
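
The RAID-5 capacity figure above follows from simple arithmetic, sketched here in Python for a few common array types built from identical drives. The function name and the level labels are illustrative choices, and the sketch ignores formatting overhead and hot spares.

```python
def usable_capacity_gb(level: str, drives: int, drive_size_gb: float) -> float:
    """Usable capacity in GB of an array of identical drives."""
    if drives < 1 or drive_size_gb <= 0:
        raise ValueError("need at least one drive of positive size")
    if level in ("JBOD", "RAID-0"):
        # No redundancy: every drive contributes its full capacity.
        return drives * drive_size_gb
    if level == "RAID-1":
        if drives < 2:
            raise ValueError("RAID-1 needs at least two drives")
        # Mirroring: all drives hold the same data.
        return drive_size_gb
    if level == "RAID-5":
        if drives < 3:
            raise ValueError("RAID-5 needs at least three drives")
        # One drive's worth of space is used for parity.
        return (drives - 1) * drive_size_gb
    raise ValueError(f"unsupported level: {level}")


print(usable_capacity_gb("RAID-5", 3, 36))  # 72, matching the figure above
```

Adding a fourth 36 GB drive to the same RAID-5 array would raise usable space to 108 GB, so the share of capacity spent on parity shrinks as the array grows.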

Removable storage devices: Another backup option for mobile
professionals is portable, removable storage devices. A range of devices
is available in this segment, from Zip/Jaz drives, pen drives and mobile
disk drives to flash cards. These devices come in different capacities
at different prices, so if you are a frequent traveler you can choose
one that meets your requirement and suits your pocket too.

CD-Rs/CD-RWs: You can minimize downtime and protect your critical data
through various backup devices. Among the cheapest options to back up
data are CD-Rs and CD-RWs.

Tape drives: The other option for backing up your servers and
workstations is a DLT or DAT tape drive. DLT (Digital Linear Tape) and
DAT (Digital Audio Tape) differ in the recording technologies they use.
These drives can back up large volumes of data but are quite expensive.
For example, HP’s SureStore DLT1e drive can back up 80 GB of data.
Various NAS devices from top vendors like IBM, Compaq, HP, Sun, Network
Appliance and Snap Server are also available, as are storage arrays from
Compaq, Sun and EMC, and other SAN hardware such as switches and Host
Bus Adapter cards.

Software: On the software side, storage management software is available
in three categories: information protection, application availability
and management tools. Major software vendors are Legato, Veritas, IBM,
HP and CA.