
Software Component Architectures

DQI Bureau

A few years ago, distributed computing was a cutting-edge technology used only by early adopters or IT groups with very specific technological needs. Today, due to the increasing demand for internet computing and the enormous complexity of enterprise-wide computing infrastructure, distributed computing has become very much an accepted solution. Distributed computing is, however, still in its infancy when it comes to the technology that supports it. This status is evident in the lack of mature standards and uniform approaches to common issues.






In terms of distributed software components, the competing standards are the Object Management Group's Common Object Request Broker Architecture (CORBA) 2.1, Microsoft's ActiveX, and Sun's JavaBeans. No other standards for distributed components are likely to be successful in today's market. Previous technology proposals such as OpenDoc, Taligent Frameworks, and NeXT Objects have come and gone.






Until the 1990s, multi-user computing consisted primarily of a single large processor (generally a mainframe or a midrange system) supporting multiple users through a time-sharing scheme. Users interacted with the processor through a terminal and a keyboard. Terminals were called dumb terminals then because they performed no computing. Instead, they simply displayed character output generated by activities performed on the main processor.



This style of computing was known as host-based computing, but it frequently is referred to as single-tier computing because computation occurs in only one location: the main processor.






Initially, networking between PCs followed a peer-to-peer model: PCs could access data and documents on other PCs, but there was as yet no centralized repository for all a company's data. Although this process avoided some of the problems of data duplication, it did not solve them, and it created unwanted complexity in managing data that was distributed among large numbers of PCs at an IT site.






Eventually, the model that emerged was to place all data on a central server and have the PC clients access the data as needed. Processing still would be performed locally on the PCs, but all documents and data would reside on a central file server. Eventually, this file-sharing model evolved into a system in which the file servers (now known as database servers) also performed computational work. This model came to be known as client-server computing and constituted the advent of two-tier architectures. The client tier consisted of PCs, and the server tier included servers of all types.






Eventually, client-server computing evolved to perform many data management tasks on the server itself. This transition conserved network bandwidth and, for certain applications, freed client PCs to do other work while waiting for the results of a query or a report.






Predictably, this approach encountered the problem of overloading servers, which now were saddled with the task of providing database access and performing much of the enterprise computation. (Interestingly, the now-freed resources of the PC were quickly put to use by new versions of the operating system and by business applications such as PowerBuilder that required significant processing power from the client.) Eventually, a third tier appeared in enterprise computing: the application server. In this scheme, the processing logic for business applications was performed on an application server that interacted with a separate database server; meanwhile, the client handled data presentation and some data validation. There were now three distinct tiers: client PCs, application servers, and database servers. This model proved successful and frequently is found today at businesses of all sizes.






Three-tier computing quickly lost its conceptual simplicity, however. The two driving causes of this change were the increasing heterogeneity of computing in IT departments and the advent of large, multi-site enterprise computing architectures.






To address the difficulties of developing applications for such a widely disparate architecture and to handle the management of so many tiers, two different technologies evolved. The older technology is messaging middleware, which arose from the success of TP monitors. The newer technology, which is seen as the next wave of enterprise computing, is distributed objects.



Middleware originally was designed as a solution to the problem of developing software for heterogeneous and n-tier computing infrastructures. Specifically, the question was how to write software that could access databases such as Microsoft SQL Server, Oracle, and Sybase while supporting clients running Windows, MacOS, and Unix and while interfacing with application servers that ran on Unix boxes. Beyond the problem of writing the application for such a heterogeneous environment was the question of maintaining the application if a new database server was added or a new client operating system was introduced.






The solution middleware offers is to provide developers with a uniform interface through which their programs can access enterprise resources. The developers interact with only the middleware layer, and the middleware then performs the necessary translations for the respective databases, operating systems, and clients. In this sense, the middleware presents the developer with a single, consistent interface that masks the complex computing infrastructure. The middleware package (generally a large, intricate, and expensive piece of software) is installed in the enterprise between the application servers and the clients on one side and the database servers on the other.
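
To make the idea concrete, here is a minimal sketch in Java; the class names and vendor adapters are illustrative, not part of any actual middleware product. The application codes to one neutral interface, and per-vendor adapters behind it perform the translations.

    import java.util.List;

    // The single, consistent interface that developers see.
    interface DataAccess {
        List<String> query(String request);
    }

    // Hypothetical adapter translating the neutral request into Oracle's dialect.
    class OracleAdapter implements DataAccess {
        public List<String> query(String request) {
            return List.of("result translated from Oracle");
        }
    }

    // Hypothetical adapter doing the same for Sybase.
    class SybaseAdapter implements DataAccess {
        public List<String> query(String request) {
            return List.of("result translated from Sybase");
        }
    }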






Of the many categories of middleware, modern TP monitors come closest to realizing the original goal of the solution. TP monitors initially were devised to ensure the integrity of transactions in high-volume, on-line transaction processing (OLTP) environments. Since then, they have been extended to be effective tools for masking client and database heterogeneity. However, TP monitors are primarily transaction-oriented and rarely are used as a pure communications medium across different processes in the enterprise. This latter goal (the ability to communicate data across disparate platforms and multiple sites, and through or within firewalls) has become the main driver for generic middleware. Pure middleware products now confine themselves to the primary task of providing the communications infrastructure for data processing within the enterprise.






Middleware originally was built on a technology called Remote Procedure Calls (RPCs), but it has evolved to a message-oriented approach, which is the most common implementation today.






Remote procedure calls


RPCs allow a program running on one computer to call a procedure (similar to a function or subroutine) that executes on a remote machine and performs a single, discrete task. Generally, the RPC sends a request in the form of data to the remote machine and receives a result in return. RPCs originated in the Unix operating system, and today they function in many high-end client-server operating systems such as Windows NT.






This approach is conceptually simple and can be learned by any experienced programmer. However, it suffers from several drawbacks. First, the program containing the RPC call must wait for a response to the call before continuing. This approach (known as request-reply) depends on synchronous operation: The request must be responded to before processing can continue. This process requires guaranteed communication and suffers from difficulties if the remote machine is busy or offline or if it cannot return a response quickly.
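
The blocking behavior is easy to see in a sketch. The following Java fragment (host, port, and line-oriented protocol are placeholders) makes a synchronous request over a socket; the caller can do nothing else until the reply arrives or the connection fails.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.Socket;

    class BlockingCall {
        static String call(String host, int port, String request) throws IOException {
            try (Socket socket = new Socket(host, port);
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(request);  // send the request...
                return in.readLine();  // ...then block until the reply arrives
            }
        }
    }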





Second, RPCs are very tedious to write. Due to their conceptual simplicity, numerous RPCs must be generated for even straightforward work to be done between computers. Each of these RPCs must be written individually, requiring considerable effort on the part of developers. Finally, RPCs allow point-to-point communication between machines but do very little to mask the details of implementation from the developers. Therefore, although RPCs are effective communication devices, they are not suited to serve as the principal form of middleware.






RPCs generally are not sold as stand-alone packages. Unix provides one model of RPC, Windows NT another (although similar) model. The primary third-party RPC implementation comes from The Open Group as part of the Distributed Computing Environment (DCE). DCE is a massive, enterprise-wide package that is supposed to provide a comprehensive set of services for distributed computing specifically optimized for business processing. It includes modules for security, clock synchronization, distributed file systems, and many other purposes. However, the lack of commercial success of DCE has thwarted attempts to assess its capability to deliver on these claims. A notable exception to the absence of third parties is NobleNet, whose product RPC 3.0 provides an RPC infrastructure and numerous software development tools that ease programming for RPCs.






Message-oriented middleware


Message-Oriented Middleware (MOM) uses a different approach from RPCs. Messages are sent asynchronously to a destination; that is, they are sent there, but the sending program does not require a reply before continuing operations. This approach generally is implemented using one of two mechanisms: publish-and-subscribe or message queuing.






In publish-and-subscribe, the message is sent (or published) and the target destination awaits receipt of messages specifically addressed to it or to a destination in which it has expressed interest (subscribed). In this model, many different targets can subscribe to the same data. For example, several programs could be waiting for real-time data to come in from the factory floor. Rather than having the real-time data collection process figure out who is interested in its data, the program simply publishes the data to a predetermined target, whereupon all subscribers receive the data. Another example would be sending the same HTML page to numerous intranet users.
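
A minimal in-memory sketch in Java shows the shape of the mechanism; the broker, topic name, and subscribers here are illustrative, and a real MOM provides this machinery across the network.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Consumer;

    class Broker {
        private final Map<String, List<Consumer<String>>> topics = new HashMap<>();

        // A target expresses interest in a destination.
        void subscribe(String topic, Consumer<String> subscriber) {
            topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(subscriber);
        }

        // The publisher never tracks who is listening; every subscriber gets the data.
        void publish(String topic, String data) {
            for (Consumer<String> s : topics.getOrDefault(topic, List.of()))
                s.accept(data);
        }
    }

    class FactoryFloorDemo {
        public static void main(String[] args) {
            Broker broker = new Broker();
            broker.subscribe("factory-floor", d -> System.out.println("monitor: " + d));
            broker.subscribe("factory-floor", d -> System.out.println("logger:  " + d));
            broker.publish("factory-floor", "sensor reading 42");  // both receive it
        }
    }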






Publish-and-subscribe does not work well in 'store-and-forward' situations, where data must pass over several network segments and be stored safely at each segment before being sent on. Under the message queuing approach, the message is sent to the target's queue. Then the target, on its own schedule, can check the queue for messages it needs. This approach is suited to communication with devices that might be off-line, such as the computers of mobile users. The messages pile up in the queue and await connection with the target to begin delivery.
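
The queuing mechanism can be sketched just as simply, again in Java with illustrative names. The point is that send and receive are decoupled, so the target can drain its queue whenever it reconnects.

    import java.util.ArrayDeque;
    import java.util.Queue;

    class MessageQueue {
        private final Queue<String> pending = new ArrayDeque<>();

        // Messages pile up here even while the target is off-line.
        synchronized void send(String message) {
            pending.add(message);
        }

        // Called by the target on its own schedule; returns null if the queue is empty.
        synchronized String receive() {
            return pending.poll();
        }
    }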






In both cases, the middleware package manages the messages, making sure they are delivered to the correct queue or target in real time and that they are sent only once.






How does the original program obtain an answer if its computation is predicated on a return message? It checks its own queue for messages from the original destination system. If it cannot proceed without the specific reply, it continues checking the queue until the reply is received (or until some preset interval has elapsed, whereupon it treats the delay as an error condition). In this manner, if the program needs the reliability of a request-reply model as provided by RPCs, it can implement it with MOM, but the program has the additional asynchronous capability of sending messages that do not require a response.
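
A sketch of this polling loop, reusing the MessageQueue class from the queuing sketch above (the timeout value and error handling are illustrative):

    class RequestReply {
        static String awaitReply(MessageQueue replyQueue, long timeoutMillis)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (System.currentTimeMillis() < deadline) {
                String reply = replyQueue.receive();  // check our own queue
                if (reply != null)
                    return reply;
                Thread.sleep(50);                     // no reply yet; try again shortly
            }
            // The preset interval has elapsed, so the delay is treated as an error.
            throw new IllegalStateException("no reply received before the deadline");
        }
    }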






The absence of an immediate response does not create a reliability issue for asynchronous operations based on MOM. If for some reason the MOM cannot deliver a message, it will perform some predetermined action. The two most popular options involve use of a callback: Either an error function is called by the MOM to inform the sending program of the error, or the connection uses multiple threads for communication, one of which is used to signal the error. Therefore, a built-in mechanism ensures notification if a message cannot be delivered or if some other error occurs.






In practice, MOM has not been able to fulfill the original mission of providing a uniform interface across heterogeneous clients, operating systems, and databases. The main reason for this failure has been the near-exponential growth in complexity of most enterprise computing infrastructures. Instead, MOM has emerged as a messaging infrastructure: it provides enterprise-wide communication of data, but third-party products then perform the translation to the individual databases or applications as needed. An example of this arrangement is TSI International's Mercator translator package, which plugs into IBM's MQSeries MOM. Many sites choose to write their own custom interfaces to the MOM.






This problem highlights one key drawback of messaging middleware: No standards for message formats exist. Every vendor has its own format and design for messages. As a result, a key criterion when assessing a specific middleware package is the extent of available third-party software that can plug directly into the MOM.






Transaction processing monitors


Modern TP monitors provide the same transactional integrity across numerous client platforms and servers and interface with most leading databases. In addition to verifying the transaction, they enforce numerous other data-processing policies, such as security, transaction and error logging, data replication, and so on.






Traditionally, TP monitors have used synchronous communication: For every action they perform, they require verification that the action has occurred correctly. This requirement is inherent in the need to provide data and transactional integrity. Although this behavior is suitable for OLTP, the response delay built into this model means that TP monitors have not functioned well as pure message-passing tools (such as for real-time data feeds) in the same manner that MOM currently is used. Recently, however, TP monitors have added asynchronous communications to their feature list and now provide message queuing similar to that of most MOM products. This functionality is provided separately from the TP monitoring activity. However, this development underscores the increasing convergence of MOM and TP monitors.






TP monitors have retained a singular commitment to making it as easy as possible for programmers to achieve transaction integrity in a heterogeneous environment. Most TP monitors, despite supporting a wide variety of clients and servers, use only 30 or 40 application programming interfaces (APIs), so most developers can easily avail themselves of TP monitors' capabilities. In addition, The Open Group has published a set of standard APIs for TP monitors, making their appeal to developers even stronger.






Distributed computing


Although RPCs, messaging middleware, and TP monitors provide a communications mechanism by which business processes can be performed across a wide variety of platforms, they share a common drawback in that the middleware layer typically is proprietary. This layer is defined and maintained by one vendor, and all tools and applications must integrate with the layer to be able to derive its benefits.






This model may work well within a single enterprise, where a uniform environment can be created; however, it begins to break down if internet computing and substantial remote access are necessary. Dial-in customers, extranets, and Web site visitors may need to perform business transactions with a company, but they cannot be expected to have installed the necessary software to plug into the messaging middleware used by the vendor site. Therefore, companies with these needs increasingly are looking to distributed objects to allow global access to resources while hiding platform and database heterogeneity.






A method gaining increasing popularity in this context is based on Object Request Brokers (ORBs). These software products allow software components to interact across the enterprise. To understand how ORBs work requires examining the nature of objects and components.






How components work


In their simplest form, components are parts of a program that can be called by the main program to perform certain specific tasks. Some of the earliest component-like technologies were Dynamic Link Libraries (DLLs). These libraries, common in Unix for more than 10 years and in the Windows family of operating systems since the early 1990s, work in the following manner. A DLL consists of a series of related functions, for example, functions that render three-dimensional (3-D) graphics on the screen. The DLL is known to exist on the system at some predefined location (on Unix, this location is almost always /usr/lib; on Windows systems, the location can change, but it must be somewhere on the execution path as specified in the user's environment). When a function in the DLL is called, the DLL is read into memory and linked into the main program as if the DLL always had been part of the original program. Once linked this way, any and all functions in the DLL can be called and executed without further steps.
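
The same late-binding idea can be sketched in Java terms; the class name below is hypothetical and assumed to be on the classpath. Code living outside the program is located and linked into the running process only when first needed, much as a DLL is.

    class DynamicLink {
        public static void main(String[] args) throws Exception {
            // Locate and load the code at run time rather than at build time
            // (com.example.Graphics3D is hypothetical and must be on the classpath).
            Class<?> library = Class.forName("com.example.Graphics3D");
            Object renderer = library.getDeclaredConstructor().newInstance();
            // Once linked this way, its functions can be called as if they had
            // always been part of the program.
            library.getMethod("render").invoke(renderer);
        }
    }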






In this context, components can be defined as a group of related functions encapsulated with data into an object that can be called by a running program to perform a specific task. The calling program can be on the same machine, on a different machine within the enterprise, or on a completely different continent connecting to the calling program across the internet. Examples of objects typically found on the same machine are ActiveX components and JavaBeans; objects found across the enterprise would be CORBA objects plus the previous two; objects found across the Internet would be CORBA objects and Java applets. These are just the typical settings, however; any one of these component types can be found anywhere in the enterprise.






CORBA and ORBs


For components to work successfully across the enterprise, many vendors felt it was necessary to band together and form standards by which components could be registered with a central authority and invoked with the knowledge that any standard-compliant component could be found and activated successfully. A group of vendors came together in the early 1990s as the Object Management Group (OMG) to fashion a set of guidelines for distributed computing with components.






Unfortunately, the group chose to use the term "object" rather than "component," forever causing confusion. OMG-defined objects are really components. Components tend to be larger than simple objects (often, they are built from objects), and they tend to lack some characteristics peculiar to objects. In addition, components need not be written in object-oriented programming languages. Nonetheless, because the term "object" is used by the OMG to refer to components, discussions of distributed computing often use the two terms interchangeably.






The first document produced by the OMG was CORBA Version 1.0. Since its first release, the document has been updated several times. CORBA 2.1 was released in late 1997.






The CORBA model uses ORBs to handle requests from programs to components and from components to each other. ORBs are programs that serve as traffic cops: They handle the requests, find the requested components, call them, and return the results. To enable these operations, CORBA defines a standard way for components to specify their interfaces: the kind of data they expect to be handed and the kind of data they expect to return. The standard way of defining interfaces is known as the CORBA Interface Definition Language (IDL). Using IDL, a programmer can specify that a component will accept an integer (for example, a diameter in centimeters) and return an integer (the circumference, rounded to the nearest centimeter). These interfaces are written in IDL (the syntax of which looks similar to C++). A translator program then generates code for the component: client stub code and server skeleton code. This code is the basis on which objects can speak to each other. Once the IDL definition is written, the interface definition can be placed in the Interface Repository, a database where all object interfaces that a given ORB can recognize are stored. For an ORB to interact with a component, the component's interface information (technically, its metadata) must first exist in the Interface Repository.
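
As a rough sketch of the circle example (the interface is illustrative, and the Java shown is hand-written, not actual generated output), the IDL and its approximate Java mapping might look like this:

    // The IDL definition (illustrative) might read:
    //
    //     interface Circle {
    //         long circumference(in long diameter);  // centimeters in, centimeters out
    //     };
    //
    // An IDL-to-Java translator would generate client stub and server skeleton
    // code around an interface roughly like this (CORBA IDL long maps to Java int):
    interface Circle {
        int circumference(int diameter);  // accepts an integer, returns an integer
    }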






IIOP


Introduced with the CORBA 2.0 specification in 1996, the Internet Inter-ORB Protocol (IIOP) is a key extension to the CORBA environment. It solves the problem of how ORBs communicate with each other. Prior to CORBA 2.0, the OMG had begun designing a protocol for inter-ORB communication. This protocol came to be known as the General Inter-ORB Protocol (GIOP). It specified the format of seven key messages ORBs would need to share data, and it detailed how data should be sent between ORBs. Specifically, the data needed to follow the formatting rules of the Common Data Representation (CDR). CDR rules compensate for the fact that different processors store numbers in different formats (for example, on Intel machines, a two-byte number is stored with the low-value byte first; on RISC machines, the high-value byte generally is stored first).
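
A short Java illustration of the byte-order problem CDR compensates for: the same two-byte value serializes to different byte sequences under the two orderings.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    class ByteOrderDemo {
        public static void main(String[] args) {
            short value = 0x0102;
            byte[] little = ByteBuffer.allocate(2)
                    .order(ByteOrder.LITTLE_ENDIAN).putShort(value).array();
            byte[] big = ByteBuffer.allocate(2)
                    .order(ByteOrder.BIG_ENDIAN).putShort(value).array();
            System.out.printf("low-value byte first:  %02x %02x%n", little[0], little[1]);
            System.out.printf("high-value byte first: %02x %02x%n", big[0], big[1]);
        }
    }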






Java-based component architecture


JavaBeans is a specification for components that can be deployed on any Java platform. Like Java applets, JavaBeans components (Beans) can be used to give Web pages or other applications interactive capabilities such as computing interest rates or varying page content based on user or browser characteristics. JavaBeans provides a component development environment to extend Java's cross-platform "write once, run anywhere" capability to the development of enterprise-class applications with the features of component reusability. JavaBeans has emerged as a component definition to create distributable, reusable objects within the Java environment. The Java-based component model has followed a slightly different evolutionary path than Microsoft's Distributed COM (DCOM), although both had their foundations in client-side GUI component development.






The Java model did not emerge from an RPC-based component definition that grew to include distribution and object interoperation. Instead, the Java Virtual Machine (JVM) concept and the associated Java programming language created an environment where applets could be written and then fetched across the internet and run anywhere. As the applet concept grew and Java began to take hold as a cross-platform development language, JavaSoft (a division of Sun) sought to offer a component model by providing JavaBeans as a component definition. The actual component architecture model is a compilation of facilities from both SunSoft (a subsidiary of Sun) and JavaSoft, the key to which is Java. Fundamental to a working Java-based component architecture model are the Java-based APIs that provide the operating services on top of the Java language environment. The collective component model includes Java, Java-based APIs, and JavaBeans.






In essence, JavaBeans is a component specification for the Java language. Initially intended for the Sun-originated Java and aimed at NEO (Sun's first commercial ORB, which Sun abandoned in January 1998), it has developed greater compliance with CORBA standards and broadened its interface capability to become a true, standalone component framework.






JavaBeans


A Bean is a piece of Java-based software that can range from a small visual control such as a progress indicator to a full-fledged application such as a spreadsheet. A Bean is a specialized, primarily visual component Java class that can be added to any application development project and manipulated by a Java builder tool. Beans can be combined and interrelated to build Java applets or applications, and as such they must be executed within a larger application (the hosting application is called a container). Beans also can be used to create new, more comprehensive, or specialized Beans.






Beans can be manipulated flexibly without a dedicated object container, but they cannot be executed without one. The properties of a Bean are its attributes, which describe the state of the Bean such as color, position, and title, and its methods, which are the actual functions to be performed by the Bean. Bean methods include functions that can be performed on the Bean itself to manipulate its internal state or to interact externally with other objects. Methods also can respond to events, which are actions performed when special internal or external signals occur, such as a mouse click. Beans also can implement conceptual and non-visual objects such as business rules. In these respects, JavaBeans do not differ much from other components, except for their requirement of a Java implementation.
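
A minimal sketch of a Bean (the class and property names are illustrative): state is exposed as properties through get/set pairs following the naming conventions that builder tools use to discover them, and behavior through ordinary methods.

    import java.io.Serializable;

    public class ProgressBean implements Serializable {
        // Properties describing the Bean's state.
        private int value;
        private String title = "";

        // Builder tools discover properties through these get/set conventions.
        public int getValue() { return value; }
        public void setValue(int value) { this.value = value; }

        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }

        // A method performed on the Bean itself to manipulate its internal state.
        public void reset() { value = 0; }
    }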






Enterprise JavaBeans


EJBs are specialized, non-visual JavaBeans that run on a server. The EJB 1.0 specification defines a component model to support multi-tier, distributed object applications. EJBs extend JavaBeans from its origins as a client-side GUI component set to a server-side component model, thus simplifying the process of moving application logic to the server by automating services to manage components.






Server components execute within a component execution system, which provides a set of run-time services for the server components, such as threading, transaction state management, and resource sharing. This arrangement means that EJBs do not need the standard JavaBeans container context to execute.






EJBs provide hooks for enterprise-class application developers to build reusable software components that can be combined to create full-function applications. The hooks include interfaces to common development environments and richer class libraries geared toward the enterprise.






The EJB model supports implicit transactions. To participate in distributed transactions, individual EJBs do not need special code to distinguish individual transactions and their operations. The EJB execution environment automatically manages the start, commit, and rollback of transactions on behalf of the EJBs. Transaction policies can be defined during the deployment process using declarative statements. Optionally, transactions can be controlled by the client application.
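
A minimal sketch of a session EJB under the 1.x model (the class and business method are illustrative, and the javax.ejb types come from a J2EE environment): the bean contains only business logic, while the container manages transaction boundaries around it according to the declared policy.

    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;

    public class AccountBean implements SessionBean {
        private SessionContext context;

        // Business logic only: nothing here starts, commits, or rolls back a
        // transaction; the container does so per the declarative policy.
        public void debit(String account, long amount) {
            // ... update the account ...
        }

        // Life-cycle callbacks the container requires of a session bean.
        public void ejbCreate() {}
        public void setSessionContext(SessionContext context) { this.context = context; }
        public void ejbActivate() {}
        public void ejbPassivate() {}
        public void ejbRemove() {}
    }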






Microsoft's COM


Microsoft's component architecture is based on the Component Object Model (COM), which originally was designed to provide interoperability between Windows applications (Microsoft Office in particular). COM supports development of components in the form of ActiveX controls; compound document services and frameworks in the form of DocObjects; and service components in the form of COM objects (OLE) and their automation.






The COM model is extended by Distributed COM (DCOM), allowing clients to access components across a network and supporting client-to-server and server-to-server connections between Windows 95 and Windows NT systems. The COM model is extended further by Component Object Model Services (COM+), which supports platform-independent development of componentized applications over standard protocols such as TCP/IP.






Microsoft bundles COM/DCOM along with a series of other framework facilities and development tools into Windows NT Enterprise Edition, taking the view that component models and frameworks are integral elements of the operating system. Like many component architectures, COM/DCOM depends on object communications brokering. However, unlike other component architecture implementations, Microsoft's approach must communicate across three networking protocols (NetBIOS, Novell IPX/SPX, and TCP/IP), which reflects the mixed heritage of PC component technologies and the lack of industry standards for PC desktops. Microsoft is using Microsoft Transaction Server (MTS) as an abstraction layer behind which the three protocols can be bridged. Although not explicitly part of DCOM, MTS is an important element in the Microsoft view of distributed applications and comes bundled in Windows NT Server, Enterprise Edition.






COM and ActiveX


COM principally provides the architectural abstraction for OLE objects. COM supports an implementation typing mechanism centered around the concept of a COM class. A COM class has a well-defined identity, and a repository (the system registry) maps implementations (identified by class IDs) to specific executable code in the form of ActiveX controls. (ActiveX controls were formerly known as OLE controls, or OCXs.) COM automation allows applications such as Microsoft Internet Explorer and Microsoft Office, and development tools such as Visual Basic, to access COM objects.






From a COM point of view, an object typically is a subcomponent of an application that provides a public interface to other parts of an application or to other applications. OLE objects usually are document-oriented and are tied to some visual presentation metaphor. The typical domain of a COM object is a single-user, multitasking, GUI-based desktop running Microsoft Windows. The primary goal of COM and OLE is to expedite collaboration and information sharing among applications using the same desktop, as in the case of the Microsoft Office suite. This view of COM differs from the CORBA perspective, in which an object is an independent component providing services to client applications.






DCOM




To extend COM's reach into distributed environments, Microsoft extended the COM model with DCOM, allowing clients to access components across a network. To access a COM object on some other machine, the client relies on DCOM, which transparently transfers a local object request to the remote object running on a different system.






DCOM provides functions similar to the naming, security, persistence, and versioning services found in CORBA. For directory services, DCOM relies on Active Directory, essentially a combination of Domain Name System (DNS) and Lightweight Directory Access Protocol (LDAP).






Because DCOM uses a proprietary implementation of the DCE RPC for its component architecture, COM/DCOM is limited to use within a homogeneous Microsoft environment. In fact, a major stumbling block in distributed computing is the absence of real interoperability between COM/DCOM objects and CORBA components. Even getting these two component models to communicate with each other has proven difficult. Limited communication between these component types is possible through some third-party products. The notable exception to this rule is Object Bridge from Visual Edge Software, which provides true interoperability between CORBA and COM/DCOM (most other packages simply allow CORBA-to-OLE Automation interoperability). The difference is that the latter approach requires use of CORBA's dynamic invocation facility, which is significantly slower than the more common static invocation used among CORBA components.






Microsoft Transaction Server


MTS is a component-based TP system for developing and deploying transaction-oriented processing on Windows NT Server. MTS also defines an application programming model for developing distributed, component-based applications, and it provides a runtime infrastructure for deploying and managing these applications. MTS was written using COM and uses DCOM to communicate with resource managers and other MTS servers on a network. When COM components are installed into MTS, client and server components that are distributed across more than one node and that must use DCOM to communicate are managed automatically.






MTS is bundled into Windows NT Enterprise Edition and acts as a container for COM service components. It provides classic services such as multithreading, concurrency control, and shared resource utilization, and it relies on native Windows NT services to provide security, naming, and systems management services. Other MTS services include connection pooling, queuing, and security, all architected within the COM/DCOM model.



The MTS protocol is designed to accept COM/DCOM-based applications automatically, with some additional configuration and DLL packaging. Therefore, it acts as the primary bridging protocol for applications written directly to WinSock, Remote Automation, direct MS-RPC, or COM/DCOM specifications. The MTS protocol specifically allows asynchronous application operation, using MSMQ technology as an enabling protocol.






MTS is enabled by Microsoft Message Queue Server (MSMQ), an operating system service providing store-and-forward connectivity between two application programs. MSMQ enables two programs to communicate with one another using queues and includes a queue manager responsible for routing messages to their destination. It also supports transactions, namely transactional delivery using the Distributed Transaction Coordinator (DTC) as the transaction manager.






DTC is a general-purpose transaction manager originally architected to supply two-phase commit to Microsoft's SQL Server relational database. DTC has been extended to provide general-purpose interfaces capable of providing transaction management for program-to-program interactions under MTS. Operating as a runtime middleware environment within DCOM, DTC hosts application program modules and endows them with transaction semantics.






COM+ and Active Server


COM+ supports the development of platform-independent advanced applications over standard protocols such as TCP/IP. The primary goal of COM+ is to provide additional tools to the COM/DCOM infrastructure to ease management of object technology development, further advancing the COM/DCOM model into a model more generically aligned with other software component models. COM+ provides security and memory management as well as a data-binding feature that allows binding between object fields and specific database fields. COM+ enables components to redirect their service requests dynamically by calling various services at runtime rather than being bound to a single implementation of a service. In addition to receiving and processing events related to instance creation, calls and returns, errors, and instance deletion, COM+ interceptors also provide the mechanism to implement transactions and system monitoring, further intertwining the COM/DCOM and MTS architectures.






Excerpted with permission from Technology Forecast: 1998.

Courtesy: PricewaterhouseCoopers.
