‘India Could Leapfrog Using the Semantic Web’



‘Quite an ordinary person’ is a term Sir Tim Berners-Lee uses for himself.
Nonetheless, the world that calls him the Father of the Web chooses to
disagree. It was in the early nineties, while working at CERN, that Sir Tim
proposed a project based on the concept of hypertext, to facilitate sharing and
updating information among researchers. The project is now known as the World
Wide Web (WWW), or simply the Web. Sir Tim did not patent his invention, and
made it available freely so that it could be adopted and spread rapidly.
According to Internet World Stats, there are an estimated 1.2 bn Internet users
spread across different continents. And the usage is growing at an astounding
244.7%, especially in countries like India and China.

Currently, Sir Tim is the director of the World Wide Web Consortium, or W3C,
an international standards organization that oversees the evolution of the Web.
The Father of the Web has now taken on the role of caretaker and often speaks
about issues that hinder or help the Web; for instance, the semantic Web, net
neutrality, or the introduction of domains like .mobi or .xxx. He is also very
excited about the prospects of the Mobile Web and hopes that countries like
India could benefit from it. In an extensive interaction with Dataquest,
Sir Tim talks about different issues, be it India's limited role on a global
scale or what he thinks of Web 3.0. Excerpts:

Let's start the discussion with India; with 3.6% of Web users (some 40 mn),
India is the fourth largest country in terms of numbers. Yet, the nation and its
people have so little say on how the Web is run, moderated or will evolve. What
do you make of that?
When you say that India has little say on how the Web is run, moderated or
will evolve, there are numerous aspects to it. The first thing is content. On
that front, the Web is a very open space where anybody can publish what they want.
Basically, it all comes down to having enterprise and creativity of individuals
to put up a website or a blog and, of course, there is an increasing number of
Indian websites with local and global content. And, one of the most important
features of the Web is its diversity, not only in terms of languages and
culture, but also in terms of social things, the fact that you don't have to put
up really professional things, you could just put up amateur things. The bar has
been set pretty low. So, if people think that some subject or certain languages
are under-represented on the Web, I would encourage them to fix that.

The second thing is standards. Traditionally, hypermedia has been the crux of
the Web, namely interlinked text documents with pictures, but now we are seeing
a lot of audio and video on the Web. Another interesting area is the publication
of data on the Web, data about all kinds of people, about the way they are
connected, about products, about weather and so on. Things keep evolving over
the Internet and standards are a process of this evolution.

Personally, I want to have participation in the standard setting process from
every part of the globe and I would encourage people in India to get involved in
the process. There is a W3C office in India that we set up as a helpdesk to help
small local groups; it can be approached for guidance. The standard setting
process is basically an international activity, so any W3C working group that
has an inclination or an idea can be involved. One possibility in which people
in India can choose and direct further evolution of the Web is by getting
involved in the standard setting process. And, finally, comes the infrastructure.
There is little governance of the underlying infrastructure like domain names,
etc. But that is a relatively small part of social governance. What really drives
or regulates the Web is more the social laws of the land, laws regarding
copyright and libel and contracts, and these differ from nation to nation. India
has always been a part of the Web, and I expect it to play a bigger role in the
coming years.

How do you feel about the fact that millions of people in Sub-Saharan
Africa, Latin America or even in rural parts of India and China are oblivious
and untouched by the wonders of the Web?

I have often pondered upon this, and feel that a great many things are
interlinked here. First and foremost, the Web is not the be-all and end-all of
everything and I don't think that it should be forced on anybody. Many countries
in Africa or Latin America already have a long list of things, including clean
water, healthcare, peace, etc., that are a priority. History tells us that many of
these social development goals have been achieved in the past without the
Internet. So, while we are very excited about the Web, it is important that we
don't get distracted by it. The rush for fiber optics should
not come at the cost of clean water and healthcare.

Yet, I do feel that it is the duty of the developed countries to help the
developing countries in as many ways as possible.

There is also this view that the Web is basically a tool for the educated
and the elite, what is your take?
Well, to be honest, if one looks randomly, there seems to be a bias on the
Web. The bias comes from two ends: one is the language, as there is a
disproportionate amount of content in English; the second is the type of content,
as a large amount of it is rather technical in nature. So, you can easily
find content on technical topics using the search engines. For instance, if the
same term is used to describe something technical, or musical, or historical,
you would be more likely to find the technical paper. That is simply because
technical people are more apt to use the Internet and, thus, more apt to put
things on the Web. Hence, there has been a skew that has existed from the
beginning.

You have often spoken very strongly about the universality of the Web.
But is it really universal in a manner of speaking, with numerous governments
monitoring and blocking the flow of information, for instance, in China, Saudi
Arabia and even, to some extent, in India?
That is an interesting thought. I grew up in the West, and I believe that
openness is very important. I believe there are a very small number of dangerous
things that should be really banned. Certain things are just illegal, like child
pornography, communal incitement, criminal activity, etc. But, I also think that
free speech is very important. I do feel that anonymous free speech can
sometimes be dangerous because it can be used to spread lies. I think the
ability to blog and be frank is a great tool and medium, but bloggers should
bear certain responsibilities. I feel bloggers sometimes do not realize that
they have major force; if they misrepresent things, it can have a very negative
effect. In the days to come, general openness will increase inexorably because
people understand what they are missing and will demand it. However, I realize
that for countries that are used to having very strong control over information
flow, it is impossible to change instantly. So, I think these changes will happen
over time; at times there might be a few setbacks, but the opening up of the Web
and the free flow of information are inevitable.

What do you think of enterprises colluding with repressive regimes for
commercial gains, like Yahoo, which helped in the prosecution of a blogger, or
Google filtering search results in China?
I am really not in the position to comment on individual cases, as I do not
know them well enough. It is a very tricky decision. I know that the companies
have stated that they were forced into areas of compromise. I think compromises
can sometimes be very essential for progress and can, at times, be very
disastrous. I am in no position to really weigh whether these compromises were
fair enough, or wise; history will be the best judge.

You have been talking extensively about Semantic Web or Data Web. When do
you think it will be a reality?
It is evolving at the moment. The data Web is in its early stages, but it is a
reality. For instance, there is a Web of data about all kinds of things, like
there is a Web of data about proteins; it is in a very early stage. When it
comes to publicly accessible data, there is an explosion of data Web in the life
sciences community.

When you look at data about proteins and genes, cell biology and
biological pathways, lots of companies are very excited. Meanwhile, there are
various data projects to create linked data, i.e. data that you can browse in a
way unlike the browsing that we normally do on the Web. With linked data, you can
do things like produce tables and maps and put them in a spreadsheet. The
possibilities are endless. So, the data Web is starting to catch on and people
gradually understand how to use it as a data integration system. Under this new
term, linked data, which has only been around for a year or so, a growing amount
of data is actually on the Web that allows you to start exploring one
piece of data, pull in other related data and process it all together.

Do you think developing countries that have relatively less Internet
penetration can leapfrog to Web 3.0 or Semantic Web?
I believe that is always the case. A developing country tends to leapfrog
over its developed peers in terms of technology. So, for instance, I would
expect developing countries (especially their governments), when they put data
on the Web, to publish it in RDF (Resource Description
Framework). RDF integrates a variety of applications using XML. This is a great
way to disseminate data. For example, if the Indian government has census data,
or rainfall data, or even train timings, and it puts the data on the Web
using the semantic data standards, then anybody can write a website that
uses that train timings data and displays it in their own language, as data is
global and does not have a language.

And that is one of the exciting things about the semantic Web: when you put the
data out there, you are not putting the data in English or in Hindi, you are
putting it up just as data. Essentially, data is numbers, and these numbers can
be displayed in different languages. So, the train names, station names, etc. can
be converted into multiple languages without human intervention. In the West,
governments are putting up data and other websites are picking up data from
these government websites, reusing it and making their own websites. So,
Mysociety.org or Govtrack.org are websites that track the US government by
taking the data from US government websites. Anybody can use this data and
generate websites automatically in different localized languages. In these ways
and more, I think the semantic Web is more accessible and more international: you
could produce a Braille version, you could produce a speaking version based on
the same data. I am very excited about the prospects and possibilities presented
by semantic Web.
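
The train timings example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any railway or government actually publishes data: the example.org identifiers, the train, the station, the departure time and the predicate names ("label", "departsFrom", "departureTime") are all invented, and plain tuples stand in for real RDF triples. The point it shows is that the facts are stored once, independent of language, and the language is chosen only at display time.

```python
# All identifiers and values below are hypothetical, invented for illustration.
EX = "http://example.org/rail/"

# Each fact is a (subject, predicate, object) triple, in the spirit of RDF.
# Labels carry a language tag; the timetable facts themselves carry none.
TRIPLES = [
    (EX + "train/101", "label", ("Morning Express", "en")),
    (EX + "train/101", "label", ("सुबह एक्सप्रेस", "hi")),
    (EX + "train/101", "departsFrom", EX + "station/A"),
    (EX + "train/101", "departureTime", "06:15"),
    (EX + "station/A", "label", ("Central Station", "en")),
    (EX + "station/A", "label", ("केंद्रीय स्टेशन", "hi")),
]


def value(subject, predicate):
    """Return the first object stored for (subject, predicate), or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def label(subject, lang):
    """Return the label of a resource in the given language.

    Falls back to the bare identifier when no label exists in that language.
    """
    for s, p, o in TRIPLES:
        if s == subject and p == "label" and o[1] == lang:
            return o[0]
    return subject


def describe(train, lang):
    """Render one timetable row in the requested language."""
    station = value(train, "departsFrom")
    time = value(train, "departureTime")
    return f"{label(train, lang)} | {label(station, lang)} | {time}"


if __name__ == "__main__":
    for lang in ("en", "hi"):
        print(describe(EX + "train/101", lang))
```

The same row could just as well be sent to a speech synthesizer or a Braille printer, since only the final rendering step depends on the reader.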

Shashwat DC
shashwatc@cybermedia.co.in
