http://www.dmst.aueb.gr/dds/pubs/jrnl/1995-JOOP-Multipar/html/paper.html
This is an HTML rendering of a working paper draft that
led to a publication.
The publication should always be cited in preference to this
draft using the following reference:
- Diomidis Spinellis,
Sophia Drossopoulou, and Susan Eisenbach.
Object-oriented
technology in multiparadigm language implementation.
Journal of Object-Oriented Programming, 8(1):33–38, March/April
1995.
This material is presented to ensure timely dissemination of
scholarly and technical work. Copyright and all rights therein are
retained by authors or by other copyright holders. All persons
copying this information are expected to adhere to the terms and
constraints invoked by each author's copyright. In most cases, these
works may not be reposted without the explicit permission of the
copyright holder.
Object-Oriented Technology in Multiparadigm Language Implementation
Diomidis Spinellis
Department of Computing
Imperial College of Science, Technology and Medicine
dds@doc.ic.ac.uk
Abstract:
Implementing multi-paradigm programming environments is difficult,
as the complexities of language implementation are combined with the
interactions between the different languages. Research has so far
centered on judiciously choosing the best language features to
include in a multi-paradigm language.
Object-oriented technology
can be used as a structuring mechanism for multi-paradigm systems allowing
arbitrary languages to be mixed in a single system.
Objects are used to encapsulate modules written in different languages
and inheritance to model language extensions.
Using this
approach, we have implemented a multiparadigm programming environment
generator and used it to create a five-language multiparadigm programming
environment.
Multiparadigm programming languages and systems allow the implementation
of applications in a number of different programming paradigms.
This makes it possible for the application developer to implement each
part of the application in the best suited paradigm.
Thus, for example, in a stock market decision support system the
market heuristics could be expressed in the logic programming paradigm,
the capture of incoming data could be implemented in the imperative
paradigm, and the analysis of that data could be based on the functional paradigm.
Although multiparadigm systems can offer a number of advantages, their
use is not widespread.
An important obstacle in their acceptance is the lack
of a suitable structure that allows the efficient and
flexible combination of different paradigms,
yet isolates them from unwanted interactions.
In the following sections we will examine how object-oriented
technology can be used as a paradigm encapsulation mechanism for
the design of multiparadigm programming environments.
Adopting this approach addresses the following problems:
- the accommodation of different syntactic notations and execution models
associated with different paradigms,
- the use of existing compilers and interpreters through their encapsulation
as class methods,
- the extension of given programming paradigms by the use of subclassing, and
- paradigm inter-operation via the tree class structure.
In section 2 we
describe work related to multiparadigm programming and the
use of object-oriented technology in this field.
In section 3 we outline how multiparadigm programming
systems can be modeled and implemented using our approach, and
in section 4 we present a system
built to support this development process.
Finally, section 5 contains an example of an
existing system and possible new applications of this approach.
In a survey of multiparadigm languages and systems we identified more
than 90 systems that allow programming in more than one paradigm.
Some prevalent paradigm combinations are the following:
- functional and logic programming [Bon91,Coh86,YS86],
- imperative and logic programming [MS90,Mul86],
- functional and imperative [Pra83,WW88],
- functional and object-oriented [SBK86,BDG+88,GM87],
- logic and object-oriented [Zan84,McC92], and
- imperative and object-oriented [Str86,Mey92].
Although some systems incorporate object-oriented technology,
its use is limited to the linguistic level, i.e. the expression of
an implementation using the multiparadigm system; our
approach proposes the use of object-oriented technology at
a meta-linguistic level, i.e. the
design and implementation of the multiparadigm system itself.
A structured approach towards multiparadigm programming that is not
based on specific programming paradigms is described in [Zav89].
The subject of that research is however the semantic and verification
issues related to multiparadigm programming, and not the use of
object-oriented technology as a paradigm encapsulation mechanism.
Objects evaluated in different paradigm contexts using different
resolvers are discussed in [TOO86].
Under that approach unification and non-deterministic choice are
provided as resolvers (similar to the functional programming lambda
operator) that can be used to evaluate objects.
The approach does not cater for different syntactic notations between
paradigms, nor does it fully
address the incorporation of additional paradigms.
Object-oriented technology can be used to model and structure
multiparadigm programming systems by regarding a set of programming
language implementations as a class structure.
In the following paragraphs we will examine the roles of inheritance,
classes and objects in a multiparadigm programming system.
A class structure is evident (though not
expressly used) even in existing language implementations.
At the root of such a class structure lies the machine language of
the target computer architecture.
All libraries and operating system software are available at that level.
Subclassing the machine language level
to provide a more readable description
yields assembly language.
All language tools that emit assembly language are
subclasses of that level.
However, a number of compilers use a still higher level language
such as C for their portable output representation, so these are
subclasses of the C language which in turn is a subclass
of the assembly language.
We stress that the above example is simply an object-oriented
view of the existing state of affairs,
used here to illustrate a
tree class structure.
Other possible examples are the subclassing of a logic programming
language to obtain constraint logic programming, or the subclassing
of both functional and logic programming languages from a language
that provides the term data structure and garbage collection
of dynamically allocated objects.
Every language in the structure described above forms a class.
The single class variable contains the data needed for the runtime
implementation of that language, such as the machine registers,
the stack etc.
In addition, each language class contains methods for compiling
objects into a suitable representation, initialising its runtime
machinery and executing the compiled object code.
The compilation and execution can be either implemented at the
class level, or delegated to the superclass, by compiling code
to the superclass language and invoking its execution method
at runtime.
Having defined languages as classes, we can now see that objects
are the modules that are expressed in a given language.
Every such module consists of its source and compiled code, as well
as its internal state (e.g. static variable values) in the form
of instance variables.
In addition every module can have an instance initialisation method
to initialise the module state variables, or perform language
specific initialisation (e.g. invoke static variable
class constructors in an object-oriented paradigm implementation).
Figure 1: Programming paradigm classes and objects
Summarising, a multiparadigm programming system can be modeled using
object-oriented technology by expressing every paradigm
as a class, using objects to isolate program modules,
and inheritance to capture language relationships.
Figure 1 illustrates this relationship.
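The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way MPSS is implemented: the class names `Target`, `Term` and `Btrack`, the `Module` class, and the bracketed strings standing in for translated code are all hypothetical. The sketch shows each paradigm class delegating compilation to its superclass, and each module (object) keeping its own state.

```python
class Target:
    """Root of the class tree: stands in for the target architecture."""
    name = "target"

    @classmethod
    def compile(cls, source):
        # The root class compiles directly to (pretend) object code.
        return f"obj[{source}]"


class Term(Target):
    """A paradigm implemented by translating into its superclass language."""
    name = "term"

    @classmethod
    def compile(cls, source):
        # Translate to the superclass language, then delegate compilation.
        return super().compile(f"term->target<{source}>")


class Btrack(Term):
    """A paradigm that extends Term, again by delegation."""
    name = "btrack"

    @classmethod
    def compile(cls, source):
        return super().compile(f"btrack->term<{source}>")


class Module:
    """An object: one program module written in a given paradigm."""

    def __init__(self, paradigm, source):
        self.paradigm = paradigm
        self.source = source
        self.code = paradigm.compile(source)  # compiled representation
        self.state = {}                       # per-module instance variables


queens = Module(Btrack, "queens")
puzzle = Module(Btrack, "puzzle")
queens.state["solutions"] = 92  # does not affect the other module
print(queens.code)
```

Each module's compiled code passes through every level between its paradigm and the root, which mirrors the delegation of compilation to the superclass described above.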
Multiparadigm programming environments are complex collections of
a number of systems, such as compilers, run-time support libraries
and the associated utility tools.
In order to organise the task of creating such environments we
envisage the realisation of meta-environments: multiparadigm
environment generators.
These offer facilities for making the task of developing
multiparadigm programming environments as easy as possible.
Multiparadigm programming environment generators are based on
the object-oriented approach we described in the previous section.
Their existence ensures that the approach is followed, and ad-hocery avoided.
Furthermore, the development time and implementation errors can be reduced.
We hope that such systems can even be used to create specialised paradigms
for one specific application.
In this way we can provide
a solid software engineering foundation for the concept of
``little languages'' [Ben88, pp. 83-100, 128-131] [Spi93].
In order to demonstrate this approach we implemented
MPSS, a multiparadigm programming environment generator.
In the following paragraphs we present its design and implementation.
We chose to implement MPSS as a tool suite, as this
would offer us
flexibility, extensibility, integration with the development
platform, and ease of implementation.
Our design, however, is weak in
process enforcement and user friendliness.
Implementing multiparadigm programming environments is not a trivial
task;
we expect the users of our generator to know what they are
doing.
A multiparadigm programming environment generator must support
the design based around objects.
We consider the following services to be essential:
- allow the high level description of paradigms as classes,
- convert a system described by paradigm classes into a multiparadigm
programming environment, and
- provide support for using existing tools.
The functions of the resulting multiparadigm programming environment can be
divided into three areas:
1. paradigm-specific compilers,
2. paradigm combination support, and
3. run-time support.
Ideally support for all these areas should be provided by the generator.
Based on the requirements outlined in the previous section we can now
describe the structure of a multiparadigm programming environment
generator.
Such a generator consists of a paradigm compiler, which takes the
object description of a paradigm class and creates its compiler
and run-time support.
Other, specialised tools are used before the paradigm compiler to
assist the process of creating paradigm implementations
using existing compilers or tools.
Generic run-time support must be provided where possible, to
implement functionality that will be required for all multiparadigm
programming environments: this avoids duplication of effort among
multiparadigm programming environment implementors.
Finally a `system wrapper' combines all the tools, documentation and
run-time support objects into a single distributable system.
The combination of these components can be seen in figure 2.
In the following sections we will describe the functionality of
each of these components in more detail.
Figure 2: Structure of a multiparadigm environment generator
For every paradigm, the implementor provides a paradigm class description
file that defines the paradigm class.
This is then compiled by
the paradigm description compiler provided by MPSS into a compiler
for that paradigm.
When the
source code of all paradigms has been compiled, a special link editor, the
multiparadigm link editor, is invoked to link all the paradigm
objects and associated support libraries together into a runnable system.
Two additional tools detect and protect class private variables.
The paradigm description compiler compiles a paradigm class
description file
into a compiler for that paradigm.
Paradigm description files are text files containing
definitions for class variables and methods.
Possible variable and method definitions are:
the instructions for compiling paradigm code,
variables that need to be protected, and the run-time support library name.
Some of the variables must be defined for every paradigm,
others can be defined if relevant.
The class variables currently supported are listed in table
1.
In addition, the multiparadigm
system builder can introduce more variables according to the
structure of the system.
For example, typechecking support can be added in the form of additional
class methods.
Variables and methods are defined by the method or variable name
starting at the beginning of a line, followed by a colon, followed by
the value.
Within the file, the programmer can refer to a variable value
by using its name mapped to uppercase characters within
curly brackets, preceded by a dollar sign, e.g. ${SELF.TOOL}.
The variable SOURCE is automatically set within the code
generated by the paradigm compiler to reflect the
source filename parameter given by the multiparadigm programming
environment user.
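The definition syntax just described can be read with a small parser. The following Python sketch is an illustration only, not the MPSS paradigm description compiler. It assumes two conventions the text does not state: lines starting with `#` are comments, and an indented line continues the value of the preceding definition. The function name `parse_description` and the sample text are hypothetical.

```python
import re

# A definition: a name at the start of a line, a colon, then the value.
DEFINITION = re.compile(r"^([A-Za-z_]\w*):\s*(.*)$")


def parse_description(text):
    """Return a dict mapping upper-cased names to their values."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # assumed: comments and blank lines are ignored
        match = DEFINITION.match(line)
        if match:
            current = match.group(1).upper()
            fields[current] = match.group(2).strip()
        elif current is not None and line[0].isspace():
            # Assumed continuation rule: indented lines extend the value.
            fields[current] = (fields[current] + "\n" + line.strip()).strip()
        else:
            raise ValueError(f"cannot parse line: {line!r}")
    return fields


sample = """\
# Superclass name
super: term
extension: pb
tool: btrackpc
compile:
    rm -f ${SOURCE}.${SUPER.EXTENSION}
    ${SUPER.TOOL} ${SOURCE}.${SUPER.EXTENSION}
"""

fields = parse_description(sample)
print(fields["TOOL"])
```

Names are stored in uppercase so that they match the form used in `${...}` references.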
Table 1: Class variables supported by the paradigm compiler
Variable name | Meaning
SUPER | Superclass name (TARGET for root class)
TOOL | Name of the compiler to generate
EXTENSION | Paradigm source filename extension
COMPILE | Compilation instructions
INSTANCEV | Class member instance variables
SYSTEM | Multiparadigm environment name (e.g. blueprint)
SYNOPSIS | Summary of the paradigm's operation
SOURCE | User filename parameter
The paradigm description compiler performs variable substitution for
the variables of the class and its superclass using the usual
SELF. and SUPER. variable prefixes.
Using an undefined variable
within a class description will produce a compilation error.
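The substitution rule can be sketched as follows. This is a minimal Python illustration, not the MPSS implementation: the function name `substitute`, the use of `KeyError` for the compilation error, and the values given for the superclass (`pt`, `termpc`) are assumptions made for the example.

```python
import re

# References look like ${SELF.TOOL}, ${SUPER.EXTENSION} or ${SOURCE}.
REFERENCE = re.compile(r"\$\{([A-Z_.]+)\}")


def substitute(text, self_vars, super_vars, source):
    """Expand ${...} references; undefined variables raise KeyError."""
    tables = {"SELF": self_vars, "SUPER": super_vars}

    def lookup(match):
        name = match.group(1)
        if name == "SOURCE":
            return source
        prefix, _, key = name.partition(".")
        table = tables.get(prefix)
        if table is None or key not in table:
            raise KeyError(f"undefined variable ${{{name}}}")
        return table[key]

    return REFERENCE.sub(lookup, text)


btrack = {"EXTENSION": "pb", "TOOL": "btrackpc"}
term = {"EXTENSION": "pt", "TOOL": "termpc"}  # hypothetical superclass values

line = "bt2term ${SOURCE}.${SELF.EXTENSION} >${SOURCE}.${SUPER.EXTENSION}"
print(substitute(line, btrack, term, "queens"))
```

With these assumed values the line expands to `bt2term queens.pb >queens.pt`, and a reference such as `${SUPER.RUNTIME}` is rejected because the superclass table does not define it.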
A sample paradigm description file is listed in figure 3.
Figure 3:
Sample paradigm description file
# Paradigm description file for the backtracking paradigm
#
# $Id: btrack.pdef,v 1.2 1993/05/03 15:33:55 dds Exp dds $
# Environment name
system: blueprint
# Paradigm name
name: btrack
# Superclass name
super: term
# Source file extension
extension: pb
# Target tool name
tool: btrackpc
# Instance variables and functions. These variables and functions are
# duplicated across instances of the paradigms runtime machinery
instancev: rules_1 mpss_needlib_lbtrack_0 btrack_3
tryall_5 solve_4 import_unify_3
# Compilation instructions
compile:
rm -f ${SOURCE}.${SUPER.EXTENSION}
bt2term ${SOURCE}.${SELF.EXTENSION} >${SOURCE}.${SUPER.EXTENSION}
${SUPER.TOOL} ${SOURCE}.${SUPER.EXTENSION}
# Runtime support modules
runtime: lbtrack.o
# Paradigm description for automatic manual creation
synopsis: backtracking and unification
description:
\fIBtrack\fP is a paradigm for handling problems that can be
solved using backtracking and unification. The \fIbtrack\fP
programming style intentionally resembles that of Prolog. Many
pure Prolog programs can be ported to \fIbtrack\fP without change.
High quality and efficient implementations of paradigms are costly
to implement.
For this reason a multiparadigm environment generator should allow
existing systems to be used wherever possible.
This means that mechanisms must be provided to make such systems conform
to the conventions expected by the overall system design.
These mechanisms must transform independent systems into encapsulated
paradigm objects.
Specifically, they should identify and deal with
potential resource requirement clashes in areas such as name-space,
memory usage, and non-reentrant functions.
One major problem when using existing tools is that of
name-space pollution.
Some tools create code with global identifier names (function or variable
identifiers) that cannot be specified or changed by the user.
Such code can be used only once within a given program.
This is, for example,
the case with the lex and yacc utilities.
Having such tools as part of a multiparadigm environment
is not possible without further processing, as they can only be used to create a single
instance of a code object.
MPSS provides an instance variable detection tool,
instancev, which when run on the object code that such a
utility generates, prints a list of the global variables that are defined
in it.
These can then be listed in the instancev section of
the paradigm class description file to be automatically converted to
private instance variables.
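One way to approximate what such a detection tool reports is to filter a symbol table listing. The Python sketch below is not how instancev itself works, which the text does not describe. It assumes the listing is in the default three-column format printed by the Unix `nm` utility (value, type letter, name), where an uppercase type letter marks a global symbol and `U` marks an undefined one. The function name `defined_globals` and the sample listing are hypothetical.

```python
# Type letters for global symbols defined in the object file:
# B = uninitialised data, C = common, D = initialised data, T = code.
DEFINED_GLOBAL = {"B", "C", "D", "T"}


def defined_globals(nm_output):
    """Return the names of global symbols defined in an nm-style listing."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        # Undefined symbols have no value column, so index from the end.
        kind, name = parts[-2], parts[-1]
        if kind in DEFINED_GLOBAL:
            names.append(name)
    return names


listing = """\
0000000000000000 T yyparse
0000000000000004 C yylval
0000000000000000 D yydebug
0000000000000010 b buffer
                 U printf
"""

print(" ".join(defined_globals(listing)))
```

The resulting names are candidates for the instancev section; local symbols (lowercase type letters) and undefined references are left out because they cannot clash between modules in the same way.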
Many of the paradigm translators, whether because they are built
on existing utilities,
because of the translator design, or due to features of
the implemented language, will contain global variables or procedures
that should be private to the class instance.
Protect is an MPSS tool
that generates unique identifier names.
The names of the private
variables and procedures are listed in the instancev section
of the paradigm class description file.
Given this name list, protect creates regular
expressions that, when
applied to the assembly language output of the target architecture
paradigm, automatically convert the listed names to class private variables
by prepending to them the name of the module in which they occur.
A limited version of this tool, yyhide [Gli91],
is part of the Andrew Toolkit [PHS+88].
Yyhide only deals with output generated from yacc and lex,
while our tool can handle output from any program generator.
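The renaming idea can be illustrated with a short Python sketch. It is not the protect tool: the function names, the underscore used to join the module name and the identifier, and the sample assembly text are assumptions. The sketch builds one regular expression from the name list and uses it to prefix whole identifiers only.

```python
import re


def protect_pattern(names):
    """Build a regular expression matching any listed name as a whole identifier."""
    if not names:
        raise ValueError("the name list must not be empty")
    # Longest names first, so a name is never shadowed by its own prefix.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in ordered)
    # The lookarounds reject matches inside longer identifiers.
    return re.compile(rf"(?<![\w.$])({alternatives})(?![\w.$])")


def protect(assembly, names, module):
    """Prefix every listed name with the module name (separator assumed)."""
    pattern = protect_pattern(names)
    return pattern.sub(lambda m: f"{module}_{m.group(1)}", assembly)


assembly = "\t.globl yylval\nyylval:\n\tmovl yylval, %eax\n\tcall yylval_helper\n"
print(protect(assembly, ["yylval"], "parser"))
```

Because each module's copy of a name receives a different prefix, two modules generated by the same tool can be linked into one program without their global identifiers clashing.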
Many paradigms need run-time support.
All classes contain a class
instance initialisation method and all objects an object instance
initialisation method.
These methods must be called in the appropriate order during the
startup sequence of a multiparadigm application.
This can be complicated (as it deals with system-dependent process
initialisation code).
Furthermore this is functionality that will be needed by all
multiparadigm programming environments.
In order to avoid duplication of effort among the multiparadigm environment
implementors, to ensure a correct and portable implementation, and
to simplify the task of implementing a multiparadigm programming environment,
this functionality can be provided by the environment generator.
Environment implementors can then plug it directly into their systems
with minimum additional effort.
The design and implementation of these functions is highly dependent
on the target system.
In an open object-oriented integrated environment (such as Smalltalk)
it could be as easy as binding an additional method
to the process class constructor.
In a file-based system using a link-editor it could require modifying
the link-editor, the system startup code, or providing a kernel within
which multiparadigm applications are run.
MPSS contains support for the linking phase in the form
of mpld: the multiparadigm link editor.
In addition to the duties of the system link editor (resolving
references between the modules, and creating executable code) [PW72],
the multiparadigm link editor has three additional duties to perform:
-
initialise every paradigm class
by calling its class initialisation method,
-
initialise every separate module in each paradigm,
by calling its instance initialisation method, and
-
link in the runtime machinery of each paradigm,
by instructing the system linker to include the appropriate libraries.
Class initialisation is performed before instance
initialisation, as the latter might depend on an initialised class instance.
Furthermore, initialisation proceeds from the top to the bottom of the
class hierarchy, as a class might
depend on its superclass for its initialisation.
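The ordering constraints can be sketched in Python as follows. This is an illustration, not the mpld implementation. Only the `btrack` to `term` relationship comes from the sample description file; the other class names, the module names, and the function names are hypothetical, and the root class is marked with the superclass name `TARGET` as in table 1.

```python
ROOT = "TARGET"  # superclass name used for the root class


def class_init_order(supers):
    """Order classes so that every class follows its superclass.

    `supers` maps each class name to the name of its superclass.
    """
    def depth(name):
        levels = 0
        while supers[name] != ROOT:
            name = supers[name]
            levels += 1
        return levels

    # Shallower classes first; ties are broken alphabetically.
    return sorted(supers, key=lambda name: (depth(name), name))


def startup_sequence(supers, modules):
    """All class initialisations first, then all instance initialisations."""
    steps = [("class", name) for name in class_init_order(supers)]
    steps += [("instance", module) for module, _paradigm in modules]
    return steps


supers = {"root": ROOT, "term": "root", "btrack": "term", "other": "root"}
modules = [("queens", "btrack"), ("draw", "other")]

for kind, name in startup_sequence(supers, modules):
    print(kind, name)
```

Sorting by depth in the tree is one simple way to guarantee that a superclass is initialised before any of its subclasses.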
One further area where the multiparadigm environment implementor can be helped
is that of the final packaging.
Multiparadigm environments can consist of tens if not hundreds of separate
elements, such as specialised tools, libraries, compilers,
run-time support modules, documentation files etc.
The process of organising these into a system distribution can be
automated, minimising potential errors and relieving the implementor
of this mundane and error-prone activity.
The actual design and implementation depends very much on
the target system, the distribution method and the range of
activities that will be automated.
These can range from a simple system installation procedure
in specified directories, to the creation of a distribution tape
with installation instructions and printed manuals.
Some explorative, rapid-prototyping development environments
(such as Smalltalk) make it
difficult to separate a product from the environment where development
has taken place.
In such cases additional support will be required in order to make
this distinction and create an isolated product.
MPSS provides the system wrapper,
a tool that by reading the paradigm description files
locates the parts of the programming environment and organises
them into a suitable distribution format.
This ensures that the system distribution and its updates can be created
reliably and with the minimum effort.
The following is a sample approach for
building a multiparadigm programming environment using MPSS:
1. Decide the programming paradigms that will be supported, and
   the names that will be given to their classes.
2. Organise the paradigms into a tree class structure. Paradigms
   that are extensions of another paradigm should have that
   other paradigm as their superclass. The target architecture
   paradigm must be at the root of the class tree.
   If two paradigms share an important characteristic, try to abstract that
   characteristic into a separate class
   which will be the common superclass of
   the two other classes.
3. Every paradigm needs a
   class definition file.
   This, as a minimum, must define:
   - the name of the paradigm,
   - the name of its superclass,
   - the extension of the filenames containing source code for that
     paradigm,
   - the name of the compiler for that paradigm, and
   - instructions for compiling code written in that paradigm into
     code of the target architecture.
4. For every paradigm create a translator that transforms the code
   of that paradigm into code of the target architecture. This
   is normally done by translating into the code of its superclass
   and invoking the superclass compiler via SUPER.COMPILE.
   The invocation code
   for that translator is given in the compile section of
   the paradigm class definition file.
5. If an existing tool is used for translating the paradigm code,
   use instancev to find out which variables must be made
   private for every class member instance.
6. If the code generated by the translation process contains some
   global variables that should be private to a class member instance,
   list those in the instancev section of
   the paradigm class definition file.
7. Compile all class definition files to create the compilers
   for all paradigms.
8. Wrap up the system, using the system wrapper, for distribution or
   installation.
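The requirements in steps 2 and 3 lend themselves to a mechanical check before any compilers are generated. The Python sketch below is an illustration only and is not part of MPSS: the function name `check_environment`, the dictionary representation of a class definition file, and the sample paradigm names and values are all hypothetical. It assumes that the root class is the one whose superclass is `TARGET`.

```python
# Minimum definitions required of every class definition file (step 3).
REQUIRED = ("NAME", "SUPER", "EXTENSION", "TOOL", "COMPILE")


def check_environment(descriptions):
    """Return a list of problems found in a set of paradigm descriptions."""
    problems = []
    names = {d.get("NAME") for d in descriptions}
    for d in descriptions:
        label = d.get("NAME", "<unnamed>")
        for field in REQUIRED:
            if not d.get(field):
                problems.append(f"{label}: missing {field}")
        parent = d.get("SUPER")
        if parent and parent != "TARGET" and parent not in names:
            problems.append(f"{label}: unknown superclass {parent}")
    # Step 2: the classes must form a tree with a single root.
    roots = [d.get("NAME") for d in descriptions if d.get("SUPER") == "TARGET"]
    if len(roots) != 1:
        problems.append(f"expected exactly one root class, found {len(roots)}")
    return problems


root = {"NAME": "root", "SUPER": "TARGET", "EXTENSION": "pr",
        "TOOL": "rootpc", "COMPILE": "cc -c ${SOURCE}.pr"}
child = {"NAME": "child", "SUPER": "root", "EXTENSION": "pc",
         "TOOL": "childpc", "COMPILE": "${SUPER.TOOL} ${SOURCE}.pr"}
broken = {"NAME": "child", "SUPER": "nowhere", "EXTENSION": "pc",
          "TOOL": "childpc"}

print(check_environment([root, child]))
print(check_environment([root, broken]))
```

The check does not detect cycles among superclass references; a fuller version would also walk each class up to the root.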
We used MPSS in order to implement blueprint,
a multiparadigm system
that offers the six paradigms listed in table 2.
Its class structure is illustrated in figure 4.
The system was implemented using paradigm description files,
compiled by MPSS, existing tools (to implement the
imper, bnf and regex paradigms),
as well as compilers (term) and interpreters (fun, btrack)
written in blueprint.
Of the 4260 lines that comprise the implementation, 787 (18%) are part
of the paradigm description files, with the rest divided among the
other paradigms.
Blueprint was used to implement an application that manipulates definite
and indefinite integrals.
The lexical analysis and parsing were performed in regex and bnf,
the symbolic evaluation in btrack, the numeric evaluation in fun,
and user interface and graphing functions in term and imper.
The use of the different paradigms available under blueprint made
the system a lot more concise and readable than a similar older system
implemented in Modula-2.
Table 2: Blueprint paradigm classes
Class name | Description
imper | Imperative constructs
term | Term handling
btrack | Logic programming
fun | Functional programming
bnf | BNF grammars
regex | Regular expressions
Figure 4: Blueprint class hierarchy
The use of object-oriented technology for structuring multiparadigm
systems is not limited to the domain of programming language paradigms.
One other possible application domain is that of multiparadigm document
preparation systems.
A sample class structure of such a system is shown
in figure 5.
The root class in such a configuration would be a low level
page description language, similar to the one used for
describing the end product.
A possible example would be the PostScript language.
The other paradigms would be organised as a class
tree in the usual fashion, with lower level languages
such as TeX, troff and RTF [*]
acting as the superclasses
of higher level languages such as LaTeX, mm, pic, and
Microsoft Word.
Paradigm communication in this case means the integration of
the output of the different paradigms into the final document.
Many of the paradigms described offer a facility for
incorporating PostScript code; this is one more
reason for choosing PostScript as the common superclass.
Figure 5: Multiparadigm document processing
We have demonstrated that object-oriented technology offers
a viable structuring mechanism for multiparadigm systems.
Inheritance is used to abstract common features between languages,
classes are used to encapsulate different languages,
and objects are used to define program modules.
This structure allows the implementation of multiparadigm programming
system generators: toolkits that assist the creation process of
multiparadigm programming systems.
Using this technology we implemented a sample multiparadigm programming
environment, and outlined one other possible application in a
different domain.
We have not addressed the areas of paradigm inter-operation,
type-checking and the efficiency of the resulting implementations.
These are areas that need further research, in order to find
how object-oriented technologies can best be applied to them.
I would like to thank Sophia Drossopoulou and Susan Eisenbach for
their help during the conduct of this research.
Support from the British Science and Engineering Research Council is
gratefully acknowledged.
- BDG+88
-
Daniel G. Bobrow, L. G. DeMichiel, R. P. Gabriel, S. E. Keene, G. Kiczales, and
D. A. Moon.
Common Lisp object system specification, X3J13 document
88-002R.
ACM SIGPLAN Notices, 23, September 1988.
Special Issue.
- Ben88
-
Jon Louis Bentley.
More Programming Pearls: Confessions of a Coder.
Addison-Wesley, 1988.
- Bon91
-
Pierre Bonzon.
Processing functional definitions as declarative knowledge: A reduced
bytecode implementation of a functional logic machine.
In H. Boley and M. M. Richter, editors, Processing Declarative
Knowledge: International Workshop PDK '91 Proceedings, pages 271-278,
Kaiserslautern, Germany, July 1991. Springer-Verlag.
Lecture Notes in Computer Science 567.
- Coh86
-
Shimon Cohen.
The APPLOG language.
In Doug DeGroot and Gary Lindstrom, editors, Logic Programming,
Functions, Relations and Equations, pages 239-276. Prentice Hall, Englewood
Cliffs, NJ, USA, 1986.
- Gli91
-
Bob Glickstein.
YYHIDE: A post-processor for Yacc and Lex files.
Part of the Andrew System, 1991.
- GM87
-
Joseph A. Goguen and José Meseguer.
Unifying functional, object-oriented and relational programming with
logical semantics.
In Bruce Shriver and Peter Wegner, editors, Research Directions
in Object-Oriented Programming, pages 417-477. MIT Press, 1987.
- McC92
-
Francis G. McCabe.
Logic and Objects.
Prentice Hall, 1992.
- Mey92
-
Bertrand Meyer.
Eiffel: The Language.
Prentice Hall, 1992.
- MS90
-
Gerhard Mehldau and Robert A. Schowengerdt.
A C-extension for rule-based image classification systems.
Photogrammetric Engineering and Remote Sensing,
LVI(6):887-892, June 1990.
- Mul86
-
Carlo Muller.
Modula -- Prolog: A software development tool.
IEEE Software, 3(6):39-45, November 1986.
- PHS+88
-
Andrew J. Palay, Wilfred J. Hansen, Mark Sherman, Maria G. Wadlow, Thomas P.
Neuendorffer, Zalman Stern, Miles Bader, and Thom Peters.
The Andrew toolkit -- an overview.
In Proceedings of the Winter 1988 USENIX Conference, pages
9-22, Dallas, TX, USA, January 1988. Usenix Association.
- Pra83
-
Vaughan Pratt.
Five paradigm shifts in programming language design and their
realization in Viron, a dataflow programming environment.
In Conference Record of the Tenth Annual ACM Symposium on
Principles of Programming Languages, pages 1-9, Austin, TX, USA,
January 1983.
- PW72
-
Leon Presser and John R. White.
Linkers and loaders.
ACM Computing Surveys, 4(3):149-167, September 1972.
- SBK86
-
Mark J. Stefik, Daniel G. Bobrow, and Kenneth M. Kahn.
Integrating access-oriented programming into a multiparadigm
environment.
IEEE Software, 3(1):10-18, January 1986.
- Spi93
-
Diomidis Spinellis.
Implementing Haskell: Language implementation as a tool building
exercise.
Structured Programming, 14:37-48, 1993.
- Str86
-
Bjarne Stroustrup.
The C++ Programming Language.
Addison-Wesley, 1986.
- TOO86
-
Ikuo Takeuchi, Hiroshi Okuno, and Nobuyasu Ohsato.
A list processing language TAO with multiple programming paradigms.
New Generation Computing, 4(4):401-444, 1986.
- WW88
-
J. H. Williams and E. L. Wimmers.
Sacrificing simplicity for convenience: Where do you draw the line.
In Conference Record of the 15th Annual ACM Symposium on
Principles of Programming Languages, pages 169-179, January 1988.
- YS86
-
Jia-Huai You and P. A. Subrahmanyam.
Equational logic programming: An extension to equational programming.
In Conference Record of the 13th Annual ACM Symposium on
Principles of Programming Languages, pages 209-218, St. Petersburg Beach,
Florida, USA, January 1986. Association for Computing Machinery.
- Zan84
-
Carlo Zaniolo.
Object-oriented programming in Prolog.
In 1984 International Symposium on Logic Programming, pages
265-270, Atlantic City, New Jersey, USA, February 1984. The Computer Society
of the IEEE, IEEE Computer Society Press.
- Zav89
-
Pamela Zave.
A compositional approach to multiparadigm programming.
IEEE Software, 6(5):15-25, September 1989.
Footnotes
[*] RTF: Rich Text Format, a common inter-application document exchange format.