Follow the Mantras

DQI Bureau

17 Oct 2013 06:43 IST

When it comes to programming languages, there is never a dearth of them. Take any letter of the alphabet (A, B, C, D, E, F, G...) and a corresponding language exists. It would not be incorrect to say that we might soon have more computer programming languages than spoken languages.

Nowadays, one important category of languages, called Domain Specific Languages (DSLs), is on the rise. They are created to solve the problems of a specific domain. These languages have a few characteristics that set them apart from general-purpose languages: rapid development, a very high level of abstraction, a high degree of specialization, etc.

A few examples of such languages are YACC, HTML, shell scripts, TDL, ABAP, etc.

Here, I will talk about a few of the most important ways, or rather mantras, to be considered while creating a text-based domain specific language. Apart from being simple to learn and apply, they are also very powerful.

MAKE A NON-STRICT SYNTAX AND SEMANTICS

Long ago, when I was learning the C language, I wrote my first 'Hello World' program. When I ran the compiler on it, it reported an error: 'statement missing semicolon'. I started wondering: if the compiler was so sure that a semicolon was missing at the end of the statement, why didn't it insert the semicolon itself and continue?

If your boss sent you a mail about your promotion with a grammatical mistake in it, would you reject the note? Yet correcting such slips is exactly what programmers spend most of their time on. The syntax and semantics of the language need to be non-strict wherever possible. The language should not ask programmers to correct something unless there is an ambiguity. It should allow programmers to write in their own way, with the compiler/interpreter having the intelligence to derive the meaning.

A few things that can make a language flexible, or non-strict, are:

It must be insensitive to case and to white space characters. For example, MYVAR is the same as My Var.
Nouns and keywords in the language should be accepted in singular or plural form. For example, Variable My Var, or Variables My Var, Second Var.
It must support multiple aliases for the language keywords. Different people use different terminology for the same thing, so give them the leeway. For example, the logical keyword TRUE could also be written as YES or the number 1, and FALSE as NO or the number 0. Similarly, both Color and Colour should be accepted.

An alignment keyword such as "Left" could have the alias "Top". It makes sense to use Top for a vertical layout and Left for a horizontal layout.
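The points above can be sketched in a few lines of Python. This is only an illustration under assumptions: the alias table, the canonical keyword names and the helper functions are hypothetical and do not come from any particular DSL.

```python
import re

# Hypothetical alias table for an imaginary DSL: each canonical keyword
# lists the spellings the parser is willing to accept for it.
ALIASES = {
    "TRUE": ["true", "yes", "1"],
    "FALSE": ["false", "no", "0"],
    "VARIABLE": ["variable", "variables"],  # singular or plural
    "COLOR": ["color", "colour"],
    "LEFT": ["left", "top"],
}


def normalize(text: str) -> str:
    """Drop all white space and ignore case, so 'My Var' equals 'MYVAR'."""
    return re.sub(r"\s+", "", text).casefold()


# Invert the table once so that every accepted spelling maps to its keyword.
_LOOKUP = {
    normalize(spelling): keyword
    for keyword, spellings in ALIASES.items()
    for spelling in spellings
}


def canonical_keyword(word: str):
    """Return the canonical keyword for any accepted spelling, else None."""
    return _LOOKUP.get(normalize(word))


def same_identifier(a: str, b: str) -> bool:
    """Two names refer to the same thing if they match after normalizing."""
    return normalize(a) == normalize(b)
```

With this in place, same_identifier("MYVAR", "My Var") holds, and "Yes", "Variables" and "Colour" all resolve to a single canonical keyword each.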

Any language with minimal rules is simpler to learn and use. If it has lots of rules, the programmer becomes an expert in adhering to rules and gains expertise in the language, not the domain.

DESIGN FOR CODE BREVITY

Code brevity means needing fewer lines of code to achieve a task. The less code there is, the easier the maintenance and the faster the development process.

From the assembly age, when a specific problem took millions of lines of code, to the well-known high-level languages that take thousands, brevity has made a big difference to how rapidly solutions can be developed. Each application programming interface (API) or syntax element of the language should be designed with this in mind.

Most APIs that take parameters should define default values for them. Similarly, most properties of objects must have a default behaviour. These defaults must be context specific and based on the most commonly used behaviour. This lets the programmer write nothing most of the time and still get what he wants; code is needed only for the exceptional cases.
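As a minimal sketch of this idea, assume a hypothetical Report object whose properties and default values are invented here purely for illustration. The common case needs only a title; anything else is written only when it differs from the default.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical report definition; every property except the title
    has a default chosen to match the most common usage."""
    title: str
    orientation: str = "portrait"
    page_size: str = "A4"
    show_totals: bool = True


# The common case: the programmer writes nothing beyond the title.
default_report = Report("Monthly Sales")

# The exceptional case: only the property that differs is specified.
wide_report = Report("Stock Summary", orientation="landscape")
```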

Most languages are written using various special symbols, each with its own meaning. Some languages have exhausted all the special symbols available on the keyboard and have had to find some intuitive way, such as combinations of symbols, to provide new capabilities.

Prefixing symbols to various constructs of the language (identifiers, etc.) is another typical idea. It certainly benefits the speed and performance of interpreted execution, but it leaves the programmer confused: if one symbol prefix does not work, they simply try other prefixes. Symbols add to brevity but take away simplicity. Prefer keyword-based syntax to symbols wherever a symbol does not come naturally. For example, rather than '++', provide a keyword such as 'Increment'.

A LANGUAGE THAT IS MADE FOR A DOMAIN MUST BE MADE FOR A DOMAIN EXPERT NOT A TECHNOLOGY GEEK

Terminologies

Terminology is one of the most important things to keep in mind while designing a language, and a must for DSLs. The constructs of the language should use terms that a domain expert can easily relate to rather than technical jargon. For example, a domain expert would prefer 'single language' and 'multi-language' to ASCII and Unicode.

Specific constructs

It should provide more specific constructs rather than a generic construct to solve a problem. Generic constructs solve more problems and also provide more ways to solve the same problem.

Each approach will have pros and cons compared with another. Do not give the programmer a choice of approach. Give a single approach, but the best one, wherever possible.

Focus on the domain! Not on technology

I have seen many programmers (of low-level programming languages) spend most of their time fixing crashes, optimizing memory and CPU utilization, etc., rather than actually solving the problem that the program is intended to solve.

Think about how much time the developers using your language should spend on profilers, debuggers, diagnostics, logging, security, scalability, etc., rather than on understanding the domain and the problem that they are solving.

High level data constructs

Provide data types and data structures that are at a very high level and domain specific.

Provide built-in complex data types wherever necessary. For example, in a business DSL, currency could be a data type in its own right rather than a float and a string used together to capture currency information.
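To make the currency example concrete, here is a rough sketch of what such a built-in type might look like. The Money class, its fields and its rules are assumptions made for illustration; a real business DSL would define its own.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Hypothetical currency type: an exact decimal amount plus a currency
    code, instead of a loose float and a separate string."""
    amount: Decimal
    currency: str

    def __add__(self, other: "Money") -> "Money":
        # The type itself guards against mixing currencies by accident.
        if self.currency != other.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)

    def __str__(self) -> str:
        return f"{self.currency} {self.amount:.2f}"


price = Money(Decimal("199.90"), "INR")
tax = Money(Decimal("35.98"), "INR")
total = price + tax
```

The domain expert then works with amounts of money, while rounding and currency checks stay inside the type.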

The data structures provided by the language play an important role. The more data structures there are, the greater the complexity.

Give a simple, single, out-of-the-box data structure that handles internally the complexity of the various algorithms, based on the data.

Trial approach of learning

Another critical difference between a techie and a domain expert is that a techie understands a language by its concepts, while domain experts understand it by solving problems of their domain. I have seen programmers of high-level DSLs create great solutions without understanding the concepts of the language. What works for them is a trial and error approach.

For a domain expert, the first code he writes, within the first two minutes, should work!
That is what interests him: solving the problem instantly, not the technology.
I, for instance, love C/C++, but domain experts do not even like the language.

A HIGHLY REUSABLE LANGUAGE SPECIFICATION IS GREAT

The language should be designed to provide high reusability, so that programmers do not need to repeat any steps that are common. They should be able to create a template and reuse it as and when required.

Some of the important factors to consider are the reusability, maintainability, and runtime modifiability of the code.
Templatization is necessary wherever consistency is required, and specifying only what needs to change is a powerful way to program.
Modifiability of the program (even of the base code distributed) by others at runtime is another powerful idea. It allows existing features of your product to be customized and new features to be added easily.
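One way to picture templatization is a base definition from which variants are derived by stating only the differences. The sketch below uses Python's dataclasses.replace for this; the InvoiceTemplate class and its fields are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InvoiceTemplate:
    """Hypothetical template holding everything that should stay consistent."""
    title: str = "Invoice"
    font: str = "Arial"
    show_logo: bool = True
    copies: int = 1


BASE = InvoiceTemplate()

# A variant states only what changes; the rest is inherited from the template.
export_invoice = replace(BASE, title="Export Invoice", copies=3)
```

Because the template is frozen, deriving a variant never alters the base, so every other user of the template stays consistent.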

HIDE THE TECHNOLOGY BEHIND IT

Any API that the language provides should hide the technology behind it. Nowadays most fourth-generation languages are designed this way. Everything else is taken care of by the system, be it memory management, resource management, encoding, protocols or formats.
For example:

The language processor should provide out of the box support for various operating systems, environments, devices, processor architectures (x86/x64), etc.
One should be able to use a web service without understanding the ABCD of communication, protocols, and formats.
One should be able to schedule tasks and launch them in 'parallel' (not 'in a separate thread') without understanding the nuances of threads, synchronization, race conditions, etc.
Import data from various formats without understanding their technology.
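The 'parallel' item can be illustrated with a small helper that hides the threading machinery. This is a sketch only: the in_parallel name is hypothetical, and using a thread pool underneath is just one assumption about how a language runtime might do it.

```python
from concurrent.futures import ThreadPoolExecutor


def in_parallel(*tasks):
    """Run zero-argument callables concurrently and return their results
    in the order the tasks were given. The caller never sees a thread."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() waits for each task, so no manual synchronization is needed.
        return [future.result() for future in futures]


results = in_parallel(
    lambda: sum(range(1000)),
    lambda: "report ready",
)
```

The person writing the tasks only says that they may run in parallel; pools, joins and ordering of results stay behind the API.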

SINGLE CODE, MULTIPLE BENEFITS

The language should allow code written once to serve multiple requirements in different ways. If I define a report for display on a screen, the same code should also permit printing it and exporting it to PDF, Excel, etc. The code should be adaptable to various formats and protocols.

Say I write code to save some data in the database. What if the same code worked out of the box to export the data in various formats, or even to send it to a web service?

This also improves the brevity of the code and increases its maintainability. All too often, programmers fix a bug in the one place where it was reported and forget to fix it in another place that serves a different output or format.
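A rough sketch of 'single code, multiple benefits' follows, assuming a hypothetical Report class: the data is defined once and every output format is derived from that single definition. Plain text, CSV and JSON stand in here for the screen, Excel and web-service outputs mentioned above.

```python
import csv
import io
import json


class Report:
    """Hypothetical report: columns and rows are defined once, and each
    output format is produced from that one definition."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(row) for row in rows]

    def to_text(self) -> str:
        # Tab-separated lines, e.g. for showing on a screen.
        lines = ["\t".join(self.columns)]
        lines += ["\t".join(str(cell) for cell in row) for row in self.rows]
        return "\n".join(lines)

    def to_csv(self) -> str:
        # CSV, e.g. for opening in a spreadsheet.
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(self.columns)
        writer.writerows(self.rows)
        return buffer.getvalue()

    def to_json(self) -> str:
        # JSON, e.g. for sending to a web service.
        return json.dumps([dict(zip(self.columns, row)) for row in self.rows])


stock = Report(["item", "qty"], [["pen", 10], ["ink", 2]])
```

A fix made to the report definition now reaches every output at once, which is exactly the maintenance benefit described above.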

MULTI-TECHNOLOGY, MULTI-SYNTAX MAKES IT WORSE

Imagine writing a web application for business needs, where you need to learn multiple technologies such as web servers, database systems and web browsers, which come from different vendors. A programmer has to learn multiple languages with extremely different syntax and semantics, such as SQL, HTML, PHP and JavaScript. Programmers spend most of their time understanding these syntaxes and converting data between the data structures of different technologies.

It would make the programmer's life much simpler if a single, consistent syntax with proper semantics were provided across the different uses of the language/technology.

Presently, integrated languages are on the rise. LINQ (Language Integrated Query) is one example, where querying is natively integrated into the language. This has performance benefits and also reduces the learning curve.
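Python has no LINQ, but its comprehensions give a loose feel for what a query written in the host language looks like. The sketch below is only an analogy, with a hypothetical Order type and sample data invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer: str
    amount: int


orders = [Order("Asha", 1200), Order("Ravi", 300), Order("Meena", 800)]

# Roughly: SELECT customer FROM orders WHERE amount > 500 ORDER BY amount DESC
# The query uses the same syntax, types and variables as the rest of the code.
big_spenders = [
    order.customer
    for order in sorted(orders, key=lambda o: o.amount, reverse=True)
    if order.amount > 500
]
```

There is no second language to learn and no conversion between a result set and the program's own data structures.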

What if you could define, manipulate and query data, integrate with multiple applications, design user interfaces, talk to web services, etc., all just by learning one language?

Today one needs to learn a DDL (Data Definition Language), a DML (Data Manipulation Language), a DQL (Data Query Language), integration and communication protocols and formats, mark-up languages, WSDL (Web Services Description Language), and so on.

SIMPLE YET POWERFUL LANGUAGE COMES WITH SIMPLE YET POWERFUL DEVELOPMENT ENVIRONMENT

Creating a development environment with an editor and a debugger for the language is equally important. It should support syntax colouring, quickly highlight errors made by users, provide suggestions, and so on. If a few things can be specified visually using drag and drop, that would be even better.

Most ambiguities in the syntax should be caught as the code is typed. An interpreted language, in particular, should give the developer the assurance that 'if it is fine here, it will be fine there'. There should be no anxiety about runtime errors or exception handling. The editor should also provide powerful navigation across the code, let developers create bookmarks and use them wherever and whenever required, and let them create and reuse ready-made templates.

If you plan to allow third parties to write code in this language and earn revenue from it, then IP protection becomes another important criterion. Compiling the code into some form of encrypted code or bytecode would be the preferred option, as it gives both faster loading and IP protection at deployment.

A debugger is a must. If my code is not working the way I intended, how fast I can detect this is a critical factor. The solution is to create a debug mode in the product/interpreter that can report such problems to the developer.

LAST BUT NOT LEAST... NO POINTERS PLEASE!!!

A programming language becomes as powerful as a sharp knife when you add things like pointers, references, object ids, primary keys, and so on. But they kill the simplicity of the language and make it unsafe. People like me love the C language because of pointers; at the same time, other people hate the C language because of pointers, too.

There is a well-known joke about the C/C++ programming languages.

"C/C++ programming is a lot like sex, where you have flexibility to access private members of friend class. But one mistake and you could be supporting it for the rest of your life."

Ensure that your language is an exception to this.