The Importance of IDNs in Local Languages in India: Nitin Wali, Neustar

IDNs are incredibly complicated, and there are many reasons why implementing and adopting them has proven difficult in the past.

In this era of digital transformation, Voice, Video and Vernacular have clearly emerged as the three most important pillars of Digital India. With vernacular being one of the components of digitisation in India, there is a need to take the Internet beyond India’s English speakers by supporting all 22 official Indian languages. In an interview with DataQuest, Mr Nitin Wali, Regional Director Technical Services, Neustar, talks about the need for local-language domain names and the challenges associated with them.

Need for IDNs in vernacular languages

The .IN namespace – and the Indian digital economy in general – is in a unique position, poised for growth and ripe for innovation.

According to a recent report released by India’s Ministry of Electronics and Information Technology (MeitY – India’s Trillion-Dollar Digital Opportunity Report), “India is among the top three global economies in terms of the number of digital consumers.” MeitY also reports that “the existing digital ecosystem could contribute up to $500 billion of economic value, but the potential economic value for India could be as much as double that amount if digital technologies are used to unlock productivity, savings, and efficiency.”

Through the flagship Digital India Program, the Government of India is fully committed to the transformation of India into a digitally empowered society and knowledge economy.

However, India cannot be deemed a ‘digitally empowered society’ if only a segment of its population is able to participate. Historically, there has been only limited opportunity for Indian consumers and business owners to navigate the Internet without using the English language – particularly when it comes to registering and using local-language domain names.

This means India’s significant non-English speaking population must either use English or be excluded. According to the 2011 Census of India, Hindi is by far the most-spoken first language in India – yet the Internet still largely functions in English. In fact, some reports have suggested that there is a class distinction to this, with urban regions and wealthier groups more likely to speak English than lower-income or rural populations. Unless the Internet is made more accessible to India’s non-English speaking population, it can’t deliver the social and economic benefit to the country that it is predicted to have.

In working to solve this challenge, one of our primary goals with .IN is to support all 22 official Indian languages. As such, we’re upgrading .IN to directly improve the Internationalised Domain Name (IDN) experience.

An IDN is a domain name that contains characters in a language-specific script or alphabet. Our new Registry web portal will allow domain name Registrars to provide IDNs in their native script from registration all the way to management and reporting of IDNs. This end-to-end local language support makes registering IDNs – and therefore, engaging with .IN – a more inclusive experience for all consumers.
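To make the definition concrete, here is a minimal Python sketch of how a native-script label is commonly carried in the DNS as an ASCII string: the non-ASCII characters are Punycode-encoded and the label is given an `xn--` prefix. The helper names `to_ascii_label` and `to_unicode_label` are hypothetical, and the sketch deliberately skips the normalisation and validation steps a real Registry would apply; it is not a description of the .IN Registry's implementation.

```python
def to_ascii_label(label: str) -> str:
    """Return an ASCII-compatible form of a single domain label (no validation)."""
    if label.isascii():
        # Plain ASCII labels are stored as-is, lower-cased.
        return label.lower()
    # Punycode (RFC 3492) encodes the non-ASCII characters;
    # the "xn--" prefix marks the label as an encoded IDN label.
    return "xn--" + label.encode("punycode").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Reverse of to_ascii_label for labels carrying the "xn--" prefix."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


hindi_label = "भारत"  # "Bharat" written in Devanagari
ascii_label = to_ascii_label(hindi_label)
print(ascii_label)                                   # ASCII form used on the wire
print(to_unicode_label(ascii_label) == hindi_label)  # round trip restores the label
```

The user sees and types the native-script form; only the encoded form needs to pass through systems that expect ASCII.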

The internet can be an incredible equalizer and an opportunity for economic growth, personal development and community building – but only if everyone is able to take part in it.

How the .IN reach can be expanded in India

Currently there are just over 2 million .IN domain names under management in the Registry. When we consider this against the MeitY report I mentioned earlier, which states there are 560 million internet users in India, this equates to roughly four domain names per thousand internet users. By comparison, countries such as China (.cn), Brazil (.br), Australia (.au), and the United Kingdom (.uk) have approximately 29, 31, 146, and 192 domain names per thousand internet users respectively. If you apply these market penetration rates to India’s 560 million internet users, .IN should have anywhere from 16 million to 107 million domains under management.
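The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers quoted above, rounding "just over 2 million" down to exactly 2 million as a simplifying assumption; the function names are illustrative.

```python
INTERNET_USERS = 560_000_000  # MeitY figure quoted above
IN_DOMAINS = 2_000_000        # "just over 2 million", rounded down (assumption)

# Domains per thousand internet users for the comparison countries, as quoted.
BENCHMARKS = {".cn": 29, ".br": 31, ".au": 146, ".uk": 192}


def domains_per_thousand(domains: int, users: int) -> float:
    """Domain names under management per thousand internet users."""
    return domains / users * 1000


def projected_domains(per_thousand: float, users: int) -> float:
    """Domains a namespace would hold at the given penetration rate."""
    return per_thousand * users / 1000


current_rate = domains_per_thousand(IN_DOMAINS, INTERNET_USERS)
print(f".in today: {current_rate:.1f} domains per thousand users")

for tld, rate in BENCHMARKS.items():
    projection = projected_domains(rate, INTERNET_USERS)
    print(f"at the {tld} rate: {projection / 1e6:.1f} million domains")
```

The lowest benchmark (.cn) gives about 16.2 million domains and the highest (.uk) about 107.5 million, which matches the range quoted above.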

The growth will come primarily from new Indian businesses and individuals getting online who are proud of their online connection to India. Supporting NIXI as Technical Service Provider for .IN, our focus is on protecting, strengthening and innovating the .IN domain namespace in order to cement its place as a national legacy of the Indian digital economy. We’ll achieve this through ongoing investment and product innovation, so that we can continue to increase performance and security, and by communicating these benefits to the Indian people.

Countries that have done this before, and challenges associated with the same

IDNs are incredibly complicated and there are many reasons why implementing and adopting them has proven difficult in the past. Two of the main challenges are capturing the intention, complexity and rules of a language in a digital format, and encouraging adoption of IDNs to facilitate further improvement.

The first point is the crux of the challenge from a technical perspective. With the Internet currently functioning largely in English, the ASCII script/alphabet doesn’t contain many restrictions on how characters can be used, sequenced or combined. If you want to be creative in your branding and call your business “Cars 4 You”, you can theoretically register that as a domain – cars4you.com, for example. It’s grammatically incorrect, but an English reader will understand the intent. We learn in English spelling that “i comes before e, except after c”, but if you misspell ‘friend’ as ‘freind’, there’s nothing in the language rules that will stop you from doing that.

Other global languages and scripts are significantly more complex than English. The ‘Cars 4 You’ example wouldn’t work in some Arabic languages, for instance, as the language table has rules around mixing Arabic and ASCII numerals. The more rules are baked into a language, the more factors need to be coded into an IDN to ensure it reflects the true intention and usage, and protects the integrity of the language.
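As a rough illustration of what coding a language rule into an IDN can look like, the Python sketch below checks one hypothetical rule: a label may not combine characters from the Arabic Unicode block with ASCII digits. The rule, the function names and the sample labels are assumptions made for illustration; real language tables are considerably more detailed and differ from one language to another.

```python
ASCII_DIGITS = set("0123456789")


def in_arabic_block(ch: str) -> bool:
    """True if the character lies in the basic Arabic block (U+0600 to U+06FF)."""
    return 0x0600 <= ord(ch) <= 0x06FF


def violates_digit_rule(label: str) -> bool:
    """Hypothetical rule: Arabic-script characters must not be mixed with ASCII digits."""
    has_arabic = any(in_arabic_block(ch) for ch in label)
    has_ascii_digit = any(ch in ASCII_DIGITS for ch in label)
    return has_arabic and has_ascii_digit


# The Arabic word for "cars", written with escapes to keep the source unambiguous.
CARS_AR = "\u0633\u064A\u0627\u0631\u0627\u062A"

print(violates_digit_rule("cars4you"))          # False: ASCII only
print(violates_digit_rule(CARS_AR + "4"))       # True: Arabic letters with an ASCII digit
print(violates_digit_rule(CARS_AR + "\u0664"))  # False: Arabic-Indic digit four instead
```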

Additionally, ‘Variant Rules’ are something we take very seriously. Put simply, this means ensuring words and characters that have the same meaning (are ‘variants’ of each other) can’t be registered as domains independently of each other, even if the characters are different.

For example, a Traditional Chinese character and a Simplified Chinese character can mean the same thing and anyone who reads the language would understand this. But without building this into the code behind an IDN, someone could register two separate domains that are variants, just because the characters don’t physically match. The German word ‘gross’ can also be written as ‘groß’ – these are variants as well.

This is more than just having multiple spellings of a word or multiple words that mean similar things. In English, it would be like allowing someone to register ‘google.com’ and ‘GOOGLE.com’ separately – we know these are the same thing, but at a computer programming level the characters aren’t the same, so without enforcing the rule of ‘case-insensitive’ domain names, the Registry could consider them different names.
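The idea can be sketched in Python as a toy registry that reduces every label to a canonical key and refuses a registration whose key is already taken. Everything here is an assumption made for illustration: the `Registry` class, the `variant_key` function and the one-entry `VARIANT_TABLE` are hypothetical, Unicode case folding stands in for the ‘gross’/‘groß’ and upper/lower-case examples, and real variant tables are defined per language and are far larger.

```python
# Illustrative one-entry table: Traditional form -> Simplified form.
VARIANT_TABLE = {"國": "国"}


def variant_key(label: str) -> str:
    """Reduce a label to a canonical key shared by all of its variants."""
    # casefold() lower-cases the label and maps 'ß' to 'ss'.
    folded = label.casefold()
    # Replace each character with its canonical variant where one is listed.
    return "".join(VARIANT_TABLE.get(ch, ch) for ch in folded)


class Registry:
    """Toy registry that blocks a label if any variant of it is already registered."""

    def __init__(self) -> None:
        self._taken: dict[str, str] = {}

    def register(self, label: str) -> bool:
        key = variant_key(label)
        if key in self._taken:
            return False  # a variant of this label already exists
        self._taken[key] = label
        return True


registry = Registry()
for label in ["google", "GOOGLE", "gross", "groß", "中国", "中國"]:
    print(label, registry.register(label))
```

In each pair the first label registers and the second is refused, because both reduce to the same key even though their characters differ.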

These subtleties are complex to understand and to implement – but they are so important to the culture and intent of language that it’s vital we get them correct. It’s a challenge for domain name Registries but one we are working through and are committed to fully overcoming.

Beyond this, ‘Universal Acceptance’ remains a challenge for IDNs. Some email servers, software and websites still won’t recognise an IDN as a valid domain name – which in itself discourages people from adopting them. However, we truly believe that IDNs have such an important role in increasing participation and inclusion online that we advocate people using them as much as possible.

The greater the usage and demand, the more pressure there is on technology providers and platforms to keep up and evolve their systems to accommodate these tools – and ultimately, this all serves to support the digital transformation that we’re striving for.
