Course Tutor
Luca Aceto,
Department of Computer Science,
Aalborg University
Course Description
The goal of this course is to introduce the main computational models
and techniques that underlie the syntax and semantics of programming
languages. As you will discover during the coming three weeks, and in
the remainder of your studies, the theory that we shall cover in this
course has important applications in, e.g., compiler design, various
design methodologies and notations (for instance, the UML notation),
speech recognition, text processing, and many facets of program
analysis and implementation. At the end of the course, you will be
familiar with the basic computational models of Finite State and
Pushdown Automata, with the classes of grammars that generate the
languages recognized by these abstract computational devices, and with
basic operational and axiomatic semantics of programming languages.
We shall try to emphasize the role played by semantics in compiler and
language design, program analysis, and good programming practice. (The
notions of pre- and post-conditions, loop invariants and the like have
their root in a formalism for reasoning about programs called Hoare
logic, which is the foundation of axiomatic semantics.)
Syntax
Programming languages, like all other natural or formal languages,
have a syntax and a semantics. Their syntax
describes the collection of legal programs by means of a set of rules
that allow one to construct new syntactically correct programs from
smaller ones. This set of program formation rules gives a
formal definition of the syntax of a programming
language. Such descriptions are now quite common, and are often given
in BNF
notation.
The development of high level programming languages --- like FORTRAN,
COBOL and LISP --- was one of the major advances in Computer Science
in the 1950s. These languages allowed programmers to specify commands
in mnemonic code and with high level constructs such as loops, arrays
and procedures. With the development of these languages, and of the
more sophisticated ones that followed, it became important to
understand their expressiveness (that is, what programs can be written
using them) as well as the characteristics of the simplest computing
machines that can translate them into machine language. This brought
about the study of formal languages and the automata that recognize
them. The goal of the first part of the course is to introduce some of
these classes of formal languages, their grammars, and the simplest
machines that recognize them.
Semantics
To quote from the introduction of Matthew
Hennessy's book The Semantics of Programming Languages: an
Elementary Introduction using Structural Operational Semantics
(John Wiley and Sons, New York, N.Y., 1990):
It is not possible to have a true understanding of a programming language without a mental model of its semantics, i.e., "how the language works".
Programmers usually develop this mental picture by day-to-day use of
the language via a compiler and/or an interpreter. This empirically
obtained knowledge of "what programs do" is imprecise, haphazard and
hard, if not impossible, to work with.
In the second part of this course, we shall see that the mental
picture of "how programs should work" can be developed and understood
in a machine independent way by using the tools of the formal
semantics of programming languages. More specifically, we shall
focus on the techniques of Structural Operational Semantics (SOS), and
will have a brief look at Axiomatic Semantics.
Textbooks
The main textbook for the syntax part of the course is Introduction to
the Theory of Computation by Michael Sipser. The author
maintains a list of errata
for the current edition of this book.
For the semantics part of the course, I shall use a textbook that
is available on line. This is:
- Hanne Riis Nielson,
Flemming Nielson:
Semantics with Applications: A Formal Introduction.
Wiley Professional Computing,
(240 pages, ISBN 0 471 92980 8),
Wiley, 1992.
In July 1999, a revised edition has been made available for download,
in
gzip'ed postscript,
postscript (recommended), or
pdf formats.
An excellent textbook focussing on Structural Operational Semantics is
The
Semantics of Programming Languages: An Elementary Introduction Using
Structural Operational Semantics (postscript file) by Matthew
Hennessy. You are warmly encouraged to read it.
Whenever necessary I shall supply notes and chapters from other
references to supplement the material in these books.
Prerequisites
The course assumes basic knowledge of basic discrete mathematics, such
as the one that you have been using in your algorithms and data
structures course, and a willingness to think about the material that
we cover. (In fact, the latter is much more important than
the former.) Those of you who would like to refresh their memory, or
who are not confident about their understanding of the basic
mathematics used in this course, are strongly encouraged to read,
e.g., Chapter 0 of Sipser's book. As practice makes perfect, I also
recommend that you work out some of the exercises to be found in those
references. Try, for example, exercises 0.2-0.5, 0.7, 0.10-0.11 in
Sipser's book.
Note: The above exercises are not really part of this
course. How many you attempt to solve depends only upon your level of
confidence with the mathematics that will be used as the course
progresses.
Course Material and Lecture Plan
The course will consist of a series of 15 lectures. Our working
language for the course will be English.
Below, I am attempting to give you an a priori description of parts of
the plan of the course. You are invited to take it with a pinch of
salt. The topic of each actual lecture may vary, depending on how fast
the course progresses, and on how receptive you are to the topics of
the course.
- Syntax
- Lecture 1 (Wednesday, 27 November 2002):
Introduction to the course; Mathematical Preliminaries:
Strings and languages; A taste of finite automata.
- Lecture 2 (Thursday, 28 November 2002):
Nondeterministic automata and their relationships to
deterministic ones.
- Lecture 3 (Friday, 29 November
2002): Regular expressions and finite automata.
- Lecture 4 (Monday, 2 December
2002): Languages that are not regular.
- Lecture 5 (Tuesday, 3 December 2002):
Context-free grammars.
- Lecture 6 (Wednesday, 4 December 2002):
Pushdown automata and their relationships to context-free
grammars.
- Lecture 7 (Thursday, 5 December 2002):
Languages that are not context-free.
- Semantics
- Lecture 8 (Friday, 6 December 2002):
Introduction to the formal semantics of programming
languages.
- Lecture 9 (Monday, 9 December 2002):
Natural and structural operational semantics for the language
While.
- Lecture 10 (Tuesday, 10 December 2002):
Relationships between the natural and structural operational
semantics for the language While.
- Lecture 11 (Wednesday, 11 December
2002): Operational semantics for extensions of the language
While.
- Lecture 12 (Thursday, 12 December 2002):
Natural semantics for blocks and procedures.
- Lecture 13 (Friday, 13 December 2002):
Natural semantics for blocks and procedures
(continued).
- Lecture 14 (Monday, 16 December
2002): A taste of program verification - partial and total correctness
of programs.
- Lecture 15 (Tuesday, 17 December
2002): All questions answered.
Exercise Classes and Advice on Modus Operandi
Each lecture will have an associated exercise "class" that follows it
immediately. I encourage you to work on the exercises in small groups, and to
discuss your solutions to them amongst yourselves. If you can explain
the solutions to one another, then you really understand what
is going on, and you will realize where there are gaps in your
mastering of the course material. Should you get really stuck
(see below), feel free to come and talk to me on the fifth floor,
where I have an office and will be available every day from 10:15
until 15:30. Alternatively, you can send me an email with your
questions, which I shall answer as soon as possible. Try to formulate
your answers in writing before you come to see me, as this will help
you clarify what the problems are.
The exercises will mostly be "pen and paper" ones, but I'll also
suggest programming projects. (The programming projects will most
likely span several exercise sessions.) The "pen and paper" exercises
marked with a star are those that I consider more important for your
understanding. All the exercises will be "doable", and working them
out will greatly increase your understanding of the topics covered in
the course. The best advice I can give you is to spend some time on
working them all out by yourselves, and to make sure you understand
the solutions if the other members of your group (or me) give you the
solutions on a golden plate. Above all, don't give up if you cannot
find the key to the solutions right away. Problem solving is often a
matter of mental stamina as much as creativity.
For further advice on how to learn this material (and, in fact, the
material in any course) I strongly recommend that you look at the
slides for the talk Psychologists'
tips on how and how not to learn by Wilfrid Hodges. In
particular, try to reflect upon the hints he gives, and ask yourselves
how much you practice what he preaches. You might also wish to read How
to Read Mathematics by Shai Simonson and
Fernando Gouveau --- a collection of useful, down-to-earth tips on how
to read, and learn from, mathematical texts.
Exam
The exam will be individual and written. Further details will be
posted on this page in due course. Thirty percent of the final mark for the
course will be determined by the course work that you will hand in during
the course.
Links of Course Related Interest
Software
- Syntax:
- The Java Computability Toolkit.
The Java Computability Toolkit (JCT) contains a pair of graphical
environments for creating and simulating NFAs, DFAs, and high level
Turing Machines.
- For an in-house simulator for Finite Automata, look at the DAT2 project developed in 1999
by some of your Aalborg
colleagues!
- For an in-house simulator for Pushdown Automata, look at the DAT2 project developed by
some of your Aalborg colleagues!
- Semantics for the language While:
- Miranda implementations of evaluation
of expressions, and of the natural
and structural operational semantics. (Courtesy of Hanne Riis Nielson and Flemming Nielson.)
- LETOS -- A Lightweight Execution Tool for Operational Semantics.
This is a lightweight tool proposed by Pieter Hartel to aid in the
development of operational semantics. To use letos, an operational
semantics must be expressed in its meta-language, which itself is a
superset of Miranda. The letos compiler is smaller than comparable
tools, yet letos is powerful enough to support publication quality
rendering using LaTeX, fast enough to provide competitive execution
and tracing using Miranda or Haskell, and versatile enough to support
derivation tree browsing using Netscape. Letos has been applied to a
Java Secure Processor, which is a version of the Java Virtual Machine
intended to run on smart cards. Letos has also been used with subsets
of various programming languages.
The latest version of letos is described in detail in a paper, which is included with the program and some sample inputs in a
gzipped tar file.
This page will be actively modified over the period 27
November-18 December 2002, and is currently undergoing heavy
restructuring. You are invited to check it regularly as the course
progresses. The page will be dormant after the end of the course.
Let me know of any error you find in
the course web pages.