'India Could Leapfrog Using the Semantic Web'

DQI Bureau
'Quite an ordinary person' is how Sir Tim Berners-Lee describes himself.

Nonetheless, the world that calls him the Father of the Web chooses to

disagree. It was in the early nineties, while working at CERN, that Sir Tim

proposed a project based on the concept of hypertext, to facilitate sharing and

updating information among researchers. The project is now known as the World

Wide Web (WWW), or simply the Web. Sir Tim did not patent his invention, and

made it available freely so that it could be adopted and spread rapidly.

According to Internet World Stats, there are an estimated 1.2 bn Internet users

spread across different continents. And the usage is growing at an astounding

244.7%, especially in countries like India and China.


Currently, Sir Tim is the director of the World Wide Web Consortium, or W3C,

an international standards organization that oversees the evolution of the Web.

The Father of the Web has now taken on the role of caretaker and often

speaks about issues that hinder or benefit the Web; for instance, the

semantic Web, net neutrality, or the introduction of domains like .mobi or .xxx. He

is also very excited about the prospects of the Mobile Web and hopes that countries

like India could benefit from it. In an extensive interaction with Dataquest,

Sir Tim talks about different issues, be it India's limited role on a global

scale or what he thinks of Web 3.0. Excerpts:

Let's start the discussion with India; with 3.6% of Web users (some 40 mn), India

is the fourth largest country in terms of numbers. Yet, the nation and its

people have so little say on how the Web is run, moderated or will evolve. What

do you make of that?

When you say that India has little say on how the Web is run, moderated or

will evolve, there are numerous aspects to it. The first thing is content. On

that front, the Web is a very open space where anybody can publish what they want.

Basically, it all comes down to the enterprise and creativity of individuals

to put up a website or a blog and, of course, there is an increasing number of

Indian websites with local and global content. And, one of the most important

features of the Web is its diversity, not only in terms of languages and

culture, but also in social terms: you don't have to put

up really professional things; you could just put up amateur things. The bar has

been set pretty low. So, if people think that some subject or certain languages

are under-represented on the Web, I would encourage them to fix that.

The second thing is standards. Traditionally, hypermedia has been the crux of

the Web, namely interlinked text documents with pictures, but now we are seeing

a lot of audio and video on the Web. Another interesting area is the publication

of data on the Web, data about all kinds of people, about the way they are

connected, about products, about weather and so on. Things keep evolving over

the Internet, and standards are a part of this evolution.


Personally, I want to have participation in the standard setting process from

every part of the globe and I would encourage people in India to get involved in the

process. There is a W3C office in India that we set up as a helpdesk to help

small local groups; it can be approached for guidance. The standard setting

process is basically an international activity, so any W3C working group that

has an inclination or an idea can be involved. One way in which people

in India can choose and direct the further evolution of the Web is by getting

involved in the standard setting process. And, finally comes the infrastructure.

There is little governance of the underlying infrastructure like domain names,

etc. But that is a relatively small part of social governance. What really drives

or regulates the Web is more the laws of the land, laws regarding

copyright and libel and contracts, and these differ from nation to nation. India

has always been a part of the Web, and I expect it to play a bigger role in the

coming years.

How do you feel about the fact that millions of people in Sub-Saharan

Africa, Latin America or even in rural parts of India and China are oblivious

and untouched by the wonders of the Web?

I have often pondered upon this, and feel that a great many things are

inter-linked here. First and foremost, the Web is not the be-all and end-all of

everything and I don't think that it should be forced on anybody. Many countries

in Africa or Latin America already have a long list of things including clean

water, healthcare, peace, etc that are a priority. History tells us that many of

these social development things have been achieved in the past without the

Internet. So, we have to make sure that while we are very excited, it is

important that we don't get distracted by it. The rush for fiber optics should

not come at the cost of clean water and healthcare.

Yet, I do feel that it is the duty of the developed countries to help the

developing countries in as many ways as possible.

There is also this view that the Web is basically a tool for the educated

and the elite, what is your take?

Well, to be honest, if one looks randomly, there seems to be a bias on the

Web. The bias comes from two ends. One is language: there is a

disproportionate amount of content in English. The second is the type of content:

a large amount of this content is rather technical in nature. So, you can easily

find content on technical topics using the search engines. For instance, if the

same term is used to describe something technical, or musical, or historical,

you would be more likely to find the technical paper. That is simply because

technical people are more apt to use the Internet and, thus, more apt to put

things on the Web. Hence, there has been a skew that has existed from the



You have often spoken very strongly about the universality of the Web.

But is it really universal in a manner of speaking, with numerous governments

monitoring and blocking the flow of information, for instance, in China, Saudi

Arabia and even, to some extent, in India?

That is an interesting thought. I grew up in the West, and I believe that

openness is very important. I believe there are a very small number of dangerous

things that should be really banned. Certain things are just illegal, like child

pornography, communal incitement, criminal activity, etc. But, I also think that

free speech is very important. I do feel that anonymous free speech can

sometimes be dangerous because it can be used to spread lies. I think the

ability to blog and be frank is a great tool and medium, but bloggers should

bear certain responsibilities. I feel bloggers sometimes do not realize that

they have major force; if they misrepresent things, it can have a very negative

effect. In the days to come, general openness will increase inexorably because

people understand what they are missing and will demand that. However, I realize

that for countries that are used to having very strong control over information flow,

it is impossible to change instantly. So, I think these changes will happen over

time; at times there might be a few setbacks, but the opening up of the Web

and free flow of information are inevitable.

What do you think of the enterprises colluding with repressive regimes for

commercial gains, like Yahoo that helped in the prosecution of a blogger or

Google filtering the search results in China?

I am really not in the position to comment on individual cases, as I do not

know them well enough. It is a very tricky decision. I know that the companies

have stated that they were forced into areas of compromise. I think compromises

can sometimes be very essential for progress and can, at times, be very

disastrous. I am in no position to really weigh whether these compromises were

fair enough, or wise; history will be the best judge.

You have been talking extensively about Semantic Web or Data Web. When do

you think it will be a reality?

It is evolving at the moment. The data Web is in its early stages, but it is a

reality. For instance, there is a Web of data about all kinds of things, like

there is a Web of data about proteins; it is in a very early stage. When it

comes to publicly accessible data, there is an explosion of data Web in the life

sciences community.


When you look at data about proteins and genes, and cell biology and

biological pathways, lots of companies are very excited. Meanwhile, there are

various data projects to create linked data, i.e. data that you can browse,

unlike the browsing that we normally do on the Web. With linked data, you can do

things like produce tables and maps and put them in a spreadsheet. The

possibilities are endless. So, the data Web is starting to catch on and people

are gradually understanding how to use it as a data integration system. Under this new

term, linked data, which has only been around for a year or so, there is a growing

amount of data on the Web that allows you to start exploring one

piece of data, pull in other related data and process it all together.

Do you think developing countries that have relatively less Internet

penetration can leapfrog to Web 3.0 or Semantic Web?

I believe that is always the case. A developing country tends to leapfrog

over its developed peers in terms of technology. So, for instance, I would

expect developing countries (especially their governments), when they put data

on the Web, to publish it in RDF (Resource Description

Framework). RDF integrates a variety of applications using XML. This is a great

way to disseminate data. For example, if the Indian government has census data,

or rainfall data, or even train timings; and, if they put the data on the Web

using the semantic data standards, then anybody can write a website that can

use that train timings data and display it in their own language, as data is

global and does not have a language.

And that is one of the exciting things about semantic Web, when you put the

data out there, you are not putting the data in English or in Hindi, you are

putting it up just as data. Essentially, data is numbers, and these numbers can

be displayed in different languages. So, the train names, station names, etc can

be converted into multiple languages without human intervention. In the West,

governments are putting up data and other websites are picking up data from

these government websites, reusing it and making their own websites. So, there

are websites that track the US government by

taking the data from US government websites. Anybody can use this data and

generate websites automatically in different localized languages. In these ways

and more, I think the semantic Web is more accessible and more international; you

could produce a Braille version, you could produce a speaking version based on

the same data. I am very excited about the prospects and possibilities presented

by semantic Web.
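
The train timings example in the answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain tuples stand in for RDF triples, and every identifier, label, and time below (train:12001, station:NDLS, departsFrom, and so on) is invented for the example rather than taken from any real government dataset or RDF vocabulary. It shows the idea that the published data carries no language, while names are attached separately per language.

```python
# Hypothetical data: (subject, predicate, object) tuples standing in for RDF triples.
# None of these identifiers or times come from a real dataset.
TRIPLES = [
    ("train:12001", "departsFrom", "station:NDLS"),
    ("train:12001", "departureTime", "06:00"),
]

# Human-readable names are kept apart from the data, keyed by language code.
LABELS = {
    ("train:12001", "en"): "Shatabdi Express",
    ("train:12001", "hi"): "शताब्दी एक्सप्रेस",
    ("station:NDLS", "en"): "New Delhi",
    ("station:NDLS", "hi"): "नई दिल्ली",
}

# One sentence template per display language (illustrative only).
TEMPLATES = {
    "en": "{train} leaves {station} at {time}",
    "hi": "{train} {station} से {time} बजे रवाना होती है",
}


def value_of(subject, predicate):
    """Return the object of the first triple matching subject and predicate."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    raise KeyError((subject, predicate))


def label(resource, lang):
    """Return the name of a resource in the given language, or its identifier if none exists."""
    return LABELS.get((resource, lang), resource)


def describe_departure(train, lang):
    """Render the same language-neutral data as a sentence in the requested language."""
    station = value_of(train, "departsFrom")
    time = value_of(train, "departureTime")
    return TEMPLATES[lang].format(
        train=label(train, lang),
        station=label(station, lang),
        time=time,
    )


if __name__ == "__main__":
    for lang in ("en", "hi"):
        print(describe_departure("train:12001", lang))
```

Supporting another language in this sketch means adding labels and a template; the triples themselves stay untouched, which is the point the answer makes about data not having a language.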


Shashwat DC