Features

Business Continuity at NSE

DQI Bureau

18 Sep 2002 00:00 IST

One of the earliest organisations to set up a disaster recovery site in 1998,

the National Stock Exchange has further improved its robustness by implementing

a business continuity plan. The Rs 500,000-crore Mumbai exchange now has a

tested and functional continuity plan to ensure uninterrupted operations in the

event of any dislocating emergency.

Under such circumstances, all operations, including IT support and

infrastructure, business support and administration, would shift to the recovery

site and remain there till the primary site is restored. While the recovery site

was first located at Pune, it has now been shifted to Chennai. Comments C

Kajwadkar, vice-president of NSE.iT, the IT wing of the exchange, "We

wanted to move the site to another seismic zone", and Chennai was as remote

as the exchange could get within the country.

Simulating disaster situations that involve mock trading has been a key initiative for NSE.iT vice-president C Kajwadkar

The exchange’s ability to continue working through a dislocating situation

hinges on two significant capabilities. The first one is dependent on creating

redundant IT and networking infrastructure at a backup recovery site. The second

involves a well-rehearsed continuity plan where employees and other office

support equipment move to the same recovery site. With regard to the first

initiative, the exchange has the distinction of not having compromised on the

level of investments in mission critical servers, whether at the primary or

backup site. It has invested in fault-tolerant Stratus servers running the VOS

operating system, which are considerably more expensive than alternatives such as

Compaq’s highly available NonStop Himalaya servers in use at the Bombay Stock

Exchange. Points

out Satish Naralkar, CEO of NSE.iT, "Systems do fail once in two years. The

damage can be very high. Are you going to take a chance there? That’s a call

somebody has to take."

That’s a risk the National Stock Exchange has never been ready to take. It

has exactly replicated the same server infrastructure at the recovery site that

it uses to support mission critical applications at the exchange at Mumbai.

According to Kajwadkar, this is now a policy decision where the merits of

parallel capital investments at both the primary and recovery site are not

debated further. The critical business processes running on Stratus servers at

both locations include trading applications for capital market, wholesale debt

market and derivatives market. Other applications are less critical and

supported on less expensive platforms. For example, the applications for clearing

and settlement are supported on Compaq Alpha 8000, 4000 and 2000 series and the

market surveillance applications run on Sun UltraSPARC servers. Other back

office applications run on Hewlett-Packard 9000 servers. Explains Naralkar,

"We have seven years of experience on actual failure rates of equipment,

not theoretical figures", implying that the exchange’s loyalty to Stratus servers

for supporting core trading applications has been built through experience and

not from vendor sales jargon.

The cost of not having an adequate IT recovery solution in place is a risk that Satish Naralkar, CEO of NSE.iT, has never been ready to take

While investing in the recovery site, NSE.iT also had the option of

collocating these servers with data centre vendors. This would have

been a faster and less expensive option, but it was ruled out. "The exchange

would lose control of the site", explains Kajwadkar. The full capability

of the exchange to ride out a disaster is dependent on effective coordination

between the IT recovery and the continuity plans. With the IT recovery

capability assigned to an external vendor, coordination would have become

another obstacle.

A key challenge in the recovery operations has been to replicate the Gilat

VSAT hub centre at the backup site, an expensive investment. While the number of

active brokers has dropped considerably from its peak of 3,000 in the year 2000,

the profiles of all active brokers at any time also have to be maintained at the

recovery site. These are updated every day. The transaction data from daily

trading is replicated in batch mode to the recovery site using a combination

of 2 Mbps leased lines and backup 64 Kbps ISDN and VSAT links. Because replication

runs in batches, the recovery site can lag the primary site by the interval since

the last batch. The exchange may therefore have to decide whether to reconstruct

the transactions during that interval from data logs, if available from the

primary site, or roll back transactions by that interval. At the end of it all,

the exchange has a guideline

that decides whether operations should continue or not. Says Naralkar, "If

more than 30% of the brokers are affected for any reason we discontinue

trading". In other words, if that share of brokers cannot get online,

either under normal circumstances or during a disaster situation, the exchange

first has to restore their connectivity. Trading can continue only after this

situation has been rectified.
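
The guideline Naralkar describes can be sketched as a simple threshold check. This is an illustration only, not the exchange's actual system: the `BrokerStatus` record, the `should_halt_trading` function and the way a broker is marked "affected" are all assumptions made for the example. Only the 30% figure comes from the quote above.

```python
from dataclasses import dataclass

# Figure quoted by Naralkar: trading stops if MORE than 30% of brokers are affected.
HALT_THRESHOLD_PERCENT = 30


@dataclass
class BrokerStatus:
    broker_id: str   # hypothetical identifier, for illustration only
    connected: bool  # assumption: "affected" means the broker cannot get online


def should_halt_trading(statuses, threshold_percent=HALT_THRESHOLD_PERCENT):
    """Return True when the share of affected brokers exceeds the threshold."""
    if not statuses:
        # Assumption: an empty broker list is an input error, not a trading decision.
        raise ValueError("no active brokers supplied")
    affected = sum(1 for s in statuses if not s.connected)
    # Integer arithmetic avoids floating-point edge cases at exactly 30%.
    return affected * 100 > threshold_percent * len(statuses)
```

With ten active brokers, three offline brokers sit exactly at 30% and trading continues, while a fourth offline broker tips the check into a halt.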

High Points of Continuity at NSE
	Policy decision to maintain parallel investments in critical servers and applications at two sites
	Investment in expensive fault-tolerant servers
	Four years of learning experience in recovery and continuity efforts
	Planned exercises to simulate disaster situations with mock trading
	Remote location of recovery site ensuring capability to function through regional disasters

The second initiative, which ensures that the exchange can work through a

dislocating situation, concerns its capability in continuity operations. At the

exchange there are three teams that help to run operations. These include the

analyst team or business users, the clearing and settlement team, and the IT

team. Explains Kajwadkar, "All the three teams have individual members and

need to work together". That creates a human resource issue, since lack of

role clarity and interpersonal conflicts can arise in times of operational

dislocation and stress.

The continuity exercises therefore involve simulating a disaster in which members

of all three teams move to the recovery site and attempt to establish mock

trading. These exercises are usually conducted on pre-determined Saturdays,

when brokers are informed beforehand and encouraged to participate in the mock

trading. Says Kajwadkar, "These mock trading sessions are useful to the

brokers as well since they get a feel of what they have to do in the event of a

disaster". This may involve establishing connectivity with the recovery

site using a number of network options. Brokers get familiar with what to do in

the event of failure of a VSAT terminal or any other primary connectivity

device. In recent months there hasn’t been an occasion to test the robustness

of the exchange’s continuity plans. The last flashpoint was in 1999 when the

INSAT satellite providing VSAT connectivity failed. That happened twice, and both

times it was on a weekend. On the other hand, the exchange has been

regularly testing its recovery capability. In late 2001, the exchange rolled out

a plan to shift its primary site within Mumbai. To make this happen it first

shifted its recovery site from Pune to the new primary location within Mumbai.

The equipment was stabilized at this new primary site and then the equipment

from the old primary site was shifted to the new recovery site at Chennai. With

daily transactions hovering at Rs 2,000 crore, an uptime of even half a day is

sufficient payback for all recovery and continuity investments made over the

last seven years. "The biggest risk for an organisation is the belief that

the risk doesn’t exist" is Kajwadkar’s advice to other businesses.

Arun Shankar is a contributor to DQ
