CITIZEN DATABASES: Taming the Billion

DQI Bureau

A population of one billion may be difficult to manage, but it offers a

market that is unique in terms of opportunity and challenge–especially if the

job at hand is to create a database and issue an ID card to each member therein.

Just consider the kind of wide area IP network that the election commission

(EC) has in place to manage a database almost half the size of the present

Indian population: 1,500 centers, a mix of leased lines, VSATs, dial-up RAS and

modem arrays across 32 states and union territories terminating in a GIS system

and a Web server in Delhi. However, the union ministry of home plans to achieve

much more than the EC–not only to capture information about each of the one

billion citizens, but also to create a mechanism that would enable regular

updating of the data. The ministry also realizes that in order to achieve 100%

results the country needs to complete the task in less than five years.

In technology terms, according to the TCS feasibility report on the citizen

database project submitted to the ministry of home affairs, this translates into

a massive five-tier network comprising 15,000 collection points, 462 distribution

layer nodes, 43 access layer nodes, five backbone nodes and one central data warehouse. The TCS report also

recommends use of optical character recognition techniques, as the proposed ID

card in plastic will have 2-D bar coding and store biometrics of the cardholder.

"In fact, the use of biometrics for a population of one billion itself

warrants R&D effort as no company in the world has algorithm to match one in

a billion biometrics, irrespective of whether it is a fingerprint or image of an

iris," says Phiroz Vandrevala, executive vice-president, TCS. The TCS

report has also christened the project Nishan, or National Identity System

Home Affairs Network.

Technology apart, experts feel that the ambition to accurately capture

complete citizenry data in less than five years is too big a task considering

that globally a pre-implementation six-year phase is common, with exceptions

such as Poland, where it stretched to 10 years. However, the TCS report

suggests that if India adopts a market-based data capture model, it will not

only be able to speed up the process and complete it within the desired time

frame, but also help reduce the overall project cost.

The Nishan mechanism

While the present census exercise could have been the best vehicle to

simultaneously capture data for Nishan, the government somehow missed the bus.

Nevertheless, TCS suggests that the country can still make up for the time lost

by deciding to capture data through a flat structure based on private initiative

and a market model, where the data would be owned and secured by the government.

However, the plan suggests use of government machinery in sensitive areas, like

the North-East and Jammu and Kashmir and also in non-sensitive yet remote or

sparsely inhabited areas.

To make things easier, TCS applied the RK Swamy/BBDO analysis to the

habitat-dispersion matrix derived from the 1991 census. This enabled it to

classify the population into urban, sub-urban and rural. According to the report,

while 82.4% of the Indian population resides in non-sensitive urban and

sub-urban locations, 13.1% of people live in non-sensitive but remote rural

villages. Also, of the 4.5% of the population in the sensitive areas, 1.8% live

in urban and sub-urban habitats whereas the remaining 2.7% reside in remote

rural locations. Says Viraj R Chopra, consultant at TCS and the man heading the

team responsible for preparing the Nishan feasibility report, "The exercise

helped us filter out habitation that would require a special and possibly a

mobile data capture mechanism, that in all likelihood will be beyond the ambit

of the market model."

However, what this also means is that the market-based data capture model

would be good enough to cover the majority–more than 80%–of the Indian

population. The report recommends a network of 15,000 franchisees to cover this

population. The TCS report suggests 60 enrollments per day, per workstation in

non-sensitive areas. In sensitive areas, however, TCS suggests an enrollment

rate of 75 per day per workstation. This, according to the report, would be

enabled by the comparatively more managed process in these areas. The speed of

enrollment also needs to be increased keeping in mind the climatic constraints

that exist in the sensitive areas.
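As a rough check on the five-year target, the enrollment rate quoted for non-sensitive areas can be turned into a back-of-the-envelope estimate. The sketch below assumes one workstation per franchisee and a population of exactly one billion, neither of which is stated in the report as summarized here; the function name is illustrative only.

```python
import math

POPULATION = 1_000_000_000       # "one billion", taken as a round figure
MARKET_SHARE_PER_MILLE = 824     # 82.4% in non-sensitive urban and sub-urban areas
FRANCHISEES = 15_000             # franchisees recommended by the report
RATE_PER_WORKSTATION = 60        # enrollments per day in non-sensitive areas


def working_days_needed(people, workstations, rate_per_day):
    """Return the whole number of working days needed to enroll `people`."""
    daily_capacity = workstations * rate_per_day
    return math.ceil(people / daily_capacity)


# Assumption: one workstation per franchisee (not stated in the source).
people = POPULATION * MARKET_SHARE_PER_MILLE // 1000
days = working_days_needed(people, FRANCHISEES, RATE_PER_WORKSTATION)
print(days)  # 916
```

Under these assumptions the market model would need about 916 working days, or roughly three years at 300 working days a year. More workstations per franchisee would shorten this, while slower rollout or repeat visits would lengthen it.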

Says Chopra, "The TCS plan is very much like that of the DoT’s

decision to allow private operators to facilitate STD and ISD calls through

public call office (PCO) booths. Prior to its liberalization decision, anybody

willing to make STD or ISD calls had only two options–they could either book a

trunk call, that took ages to materialize, or go to the nearest DoT center to

speak with their dear ones. The process was not only cumbersome, it was also not

customer friendly. However, consumer is rightly the king now as one can just

walk into any of the many PCO booths or even call them for arranging a

conference call. While the DoT’s business has gone up, consumers have

convenience and several thousands have got a means to support their

families."

Chopra and his team have also divided the information to be captured from an

individual into three types–personal details, biometrics and verification

references and documents. Based on these details, TCS has also drawn up a list of

applications, the database and application architecture, the network topology

and the cost involved.

"A lot needs to be done before Nishan becomes a reality"

The Nishan network

According to the TCS plan, the system would require 462 distribution layer

nodes supporting a store-and-forward feature, download of the data capture

application, upload of captured data to access layer nodes and cache engines to

enhance query responses. The consultant suggests that these centers should be

located in each district headquarters or areas having good optic fibre

connectivity as the bandwidth required between distribution and access layer

nodes will range from 8 Mbps to 30 Mbps. Similarly, the 43 access layer nodes

need to be located at major urban centers on the DoT’s synchronous

transmission module (STM) rings, with a bandwidth requirement of 30 Mbps to 60 Mbps

between access and the backbone layer nodes.

Nishan’s backbone layer nodes need to be located in Kolkata, Chennai,

Delhi, Hyderabad and Mumbai requiring bandwidth between 100 Mbps and 170 Mbps to

transfer data to the central servers. The backbone nodes will be capable of

hosting the captured data, supporting application for managing applicants’

queries and replicating the data to the central location server. The central

location server, to be placed in Delhi, will comprise all Nishan databases

replicated from backbone nodes. It will also store external databases for

reference and backup of all captured Nishan data. The central server will not

only process and verify an application, it will also initiate the card

production process, including generation of PIN, card issuance and dispatch.
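The tiers described above can be laid out as a small data structure to see how traffic would concentrate on the way to Delhi. This is only a sketch: the class and field names are invented for illustration, and the fan-in figures assume that nodes are spread evenly under each parent, which the report as summarized here does not claim.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    nodes: int
    # Uplink bandwidth range in Mbps towards the next tier, where the article gives one.
    uplink_mbps: Optional[Tuple[int, int]]


TIERS = [
    Tier("collection point", 15_000, None),
    Tier("distribution", 462, (8, 30)),
    Tier("access", 43, (30, 60)),
    Tier("backbone", 5, (100, 170)),
    Tier("central", 1, None),
]


def fan_in(tiers):
    """Average number of lower-tier nodes per node one tier up (even spread assumed)."""
    return {
        f"{lower.name} -> {upper.name}": round(lower.nodes / upper.nodes, 1)
        for lower, upper in zip(tiers, tiers[1:])
    }


for link, ratio in fan_in(TIERS).items():
    print(f"{link}: {ratio}")
```

On these figures each distribution node would serve about 32 collection points, each access node about 11 distribution nodes, and each backbone node about 9 access nodes.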

The application architecture

The Nishan application will be a component-based, multi-tier

application that uses a distributed object architecture, which encapsulates

the data and the business logic within the object and allows them to be located

anywhere within the distributed system. Typically the components of Nishan

application architecture would be data capture, data verification and the back

office component. While the data capture component would have sub-components

relating to capture of demographic information, biometrics, digital photograph

and scanned images, the data verification component will interface with

other business components like IT PAN, passport and electoral roll. The back

office will contain core components like PIN generation, card production and

dispatch components, as also account and inventory management.

The distributed object based architecture will also empower creation of a

multi-tier application. The TCS report has divided Nishan’s multi-tier

application architecture into three tiers–client tier, mid-tier and data

source-tier.

According to the report, the client tier would display content delivered by

the mid-tier. Client application for Nishan would be the data capture

application, including user authentication and data uploading components. The

client application would also contain user interface logic and screens that

signal completion of an enrollment process prompting the user to upload data.

Besides, it would also have menu-driven screens for editing, printing and help.

As the tier would act merely as the interface between the user and the supported

functionality, it will be ultra thin. In other words all processing of the

applications would be done at the application server.

The mid-tier, on the other hand, represents a logical layer between the Web

client and the database. The mid-tier for Nishan would be the application server

that encompasses the process logic, which is the core of any application. The

mid-tier application component would receive and process data and requests

submitted by the client application. While the mid-tier application components

are classified as core and business components, Nishan’s core application

would be the biometrics component that processes received data as per the rules

and functions and remains independent of other components. The business

application component would be data verification, which will interface with

external components like IT PAN, passport and electoral databases.
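One way to picture the data verification component is as a function that checks an applicant's references against whichever external databases are available. The sketch below is hypothetical throughout: the `ExternalRegistry` interface, the in-memory stand-in and the reference numbers are invented for illustration and do not describe the interfaces of the real IT PAN, passport or electoral systems.

```python
from typing import Dict, Protocol


class ExternalRegistry(Protocol):
    """Hypothetical interface to an external database such as IT PAN or passport."""

    def has_record(self, reference: str) -> bool: ...


class InMemoryRegistry:
    """Stand-in for a real registry, backed by a set of known reference numbers."""

    def __init__(self, known):
        self._known = set(known)

    def has_record(self, reference: str) -> bool:
        return reference in self._known


def verify_application(references: Dict[str, str],
                       registries: Dict[str, ExternalRegistry]) -> Dict[str, bool]:
    """Check each supplied reference against the registry of the same name."""
    results = {}
    for name, reference in references.items():
        registry = registries.get(name)
        # A reference with no matching registry is treated as unverified.
        results[name] = registry is not None and registry.has_record(reference)
    return results


# Placeholder reference numbers, not real formats.
registries = {
    "it_pan": InMemoryRegistry({"PAN-0001"}),
    "passport": InMemoryRegistry({"P-12345"}),
}
application = {"it_pan": "PAN-0001", "passport": "P-99999", "electoral_roll": "E-1"}
print(verify_application(application, registries))
```

Keeping each registry behind the same small interface is what would let the business component stay separate from the core biometrics component, as the report describes.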

The third and last logical layer, the data source-tier, represents the RDBMS

or other proprietary data source that provides the critical database. The Nishan database

containing biometrics, digital photograph, scanned documents and applicants’

demographic information would belong to this tier. To avoid technical and

administrative complexity and reduce cost, TCS recommends a single RDBMS

platform. According to it, applications would run on a homogeneous database

environment, though it could be on either an ‘enterprise server’ or an ‘intermediate

server’.

Splitting of the database would also be possible within the RDBMS platform.

Biometrics, textual, digital photograph and scanned document data would be split

across several databases that in turn would be installed in different or the

same location depending upon requirements. The splits would run horizontally

through the tables wherein each database would contain a subset of the rows of a

table.
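A horizontal split of this kind can be illustrated with a simple routing function that decides which database holds a given applicant's rows. The report as summarized here does not say how rows would be assigned, so the hash-based rule, the database names and the `applicant_id` field below are all assumptions; splitting by state or district would be an equally plausible choice.

```python
import hashlib

# Hypothetical database names; the report summary does not name them.
SHARDS = ["nishan_db_0", "nishan_db_1", "nishan_db_2", "nishan_db_3"]


def shard_for(applicant_id: str, shards=SHARDS) -> str:
    """Pick the database holding this applicant's rows using a stable hash."""
    digest = hashlib.sha256(applicant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def partition(rows, shards=SHARDS):
    """Split the rows of one logical table into per-database subsets of rows."""
    placed = {name: [] for name in shards}
    for row in rows:
        placed[shard_for(row["applicant_id"], shards)].append(row)
    return placed


rows = [{"applicant_id": f"A{i:04d}", "name": f"Applicant {i}"} for i in range(10)]
for name, subset in partition(rows).items():
    print(name, len(subset))
```

Every database keeps the same table definition and holds only some of the rows, so a lookup for one applicant touches a single database.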

The service provided by mid-tier application components will access and

manage the data through the data management layer. This layer would be

independent of the function logic and hold all data access routines, minimizing

the application’s dependence on the underlying database management system and

physical storage, and would help determine the data’s location within the network.

Opportunity galore

According to the TCS report, the total project capital expenditure for Nishan

over 10 years works out to Rs 1,584 crore. This excludes the Rs 600 crore

required for data collection by the franchisees. In addition, the recurring

expenses for cards and consumables, network and application management costs

have been estimated at around Rs 1,800 crore. Apart from calculating the cost of

plastic cards at Rs 15 each, the TCS estimate has also taken into account Rs

25,000 per machine per month as card printing supervision cost. Provision has

also been made for card dispatch from the centralized card production location

to the respective collection centers at Rs 2 per card.

What’s more, the provisioning for network management reflects TCS’ and

the government’s awareness of and requirement for top-notch technical skills

in implementing Nishan. Check these estimates for network management of the

system: Rs 20,000 per month for each distribution node, Rs 90,000 per month for

each access node, Rs 2.5 lakh per month for each backbone node and Rs 5 lakh

per month for the central node. The report also projects application management

cost at Rs 17.5 lakh per month during the initial phases. This is, however,

expected to touch Rs 25 lakh per month with the increase in the number of

professionals required for manual biometrics verification. In all, TCS

estimates a total project cost of Rs 4,000 crore over a period of 10 years. This

excludes establishment, management, legal and fibre connectivity cost, as well

as the cost of capital.
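The figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the Rs 4,000 crore total is the sum of the three headline figures, which is a reading of the article rather than something the report states outright. It also uses the node counts given in the network section, 462 distribution, 43 access, five backbone and one central node, to work out the monthly network management bill.

```python
CRORE = 10_000_000   # 1 crore = 10 million rupees
LAKH = 100_000       # 1 lakh = 100 thousand rupees

# Headline figures quoted from the TCS report, in rupees.
capital_expenditure = 1_584 * CRORE
franchisee_data_collection = 600 * CRORE
recurring_expenses = 1_800 * CRORE

total = capital_expenditure + franchisee_data_collection + recurring_expenses
print(total // CRORE)  # 3984

# Monthly network management: (node count, rupees per node per month).
network_nodes = {
    "distribution": (462, 20_000),
    "access": (43, 90_000),
    "backbone": (5, 250_000),   # Rs 2.5 lakh
    "central": (1, 5 * LAKH),
}
monthly_network = sum(count * cost for count, cost in network_nodes.values())
print(monthly_network)  # 14860000
```

The three headline figures add up to Rs 3,984 crore, close to the rounded Rs 4,000 crore. Network management works out to about Rs 1.49 crore a month, which the article counts within the recurring expenses.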

While the projections are many and the TCS report is very optimistic about

the success of the project, it does add a cautionary note: ‘To succeed Nishan

needs a change in mindset–of those responsible for implementing it and of the

citizens. It also needs to assure that while the project uses the database of

the legacy systems–electoral roll, IT PAN, ration card, passport et al, it

has to deliberately avoid being inflicted by the maladies that ail these

institutions.’ It also suggests that the project should be extremely citizen

friendly and hence argues for implementation of Nishan through the market-based

model. "This", remarks Chopra, "also represents new economy

employment opportunities for thousands across the country".

For now, while the home ministry is still evaluating TCS’

recommendations, it remains to be seen whether officialdom will ‘permit’

information technology to serve the national interest or make the Nishan initiative

another bothersome ritual for Indian citizens.

SHUBHENDU PARTH in New Delhi
