Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
WG2.8 Technical Meeting

Sixteenth WG2.8 Technical Meeting
Packwood, WA, USA
July 2000

John Launchbury, Andy Moran

These were notes taken during the working group talks. Speakers had some opportunity to edit them afterwards. Much of what is described here is work in progress.

Mark Jones. 2020 Vision.
John Launchbury. FP application at Motorola.
Olivier Danvy. Lambda lifting, dropping and expressiveness.
Dick Kieburtz. Logic for Strategies.
Jacques Garrigue. Design Space of Polymorphic Variants.
Francois Pottier. Typing by Encoding: A Reductionistic Approach to Building Type Systems.
Martin Odersky. Functional Nets.
Xavier Leroy. Fun with IA64.
Warren Burton. Count your Garbage.
Daan Leijen. Haskell/DB.
Tim Sheard. Challenges in Representing Programs as Data.
Peter Thiemann. Enforcing Safety Properties by Program Specialization.
Conal Elliott. Compiling Embedded Languages.
Bob Harper. PSciCo.
Andy Moran. Unique Fixed-Point Induction for (Almost) All the Family.
Manuel Chakravarty. iHaskell -- A state-based GUI Library without Mutable State.
Joe Stoy. Proving Cache Protocols.
Rishiyur Nikhil. Experience Advocating Formal Methods at Digital/Compaq.
Jack Dennis. Computer Architecture for Functionally Programmed Computations.
Olivier Danvy. Syntactic Accidents in Program Analysis: On the Impact of the CPS Transformation.
Paul Hudak. Functional Reactive Programming from First Principles.
Satnam Singh. Formal Verification of Xilinx' Circuit Cores.
5 Minute Topics.
Discussion: Host Languages for (Embedded) DSLs.

Mark Jones. 2020 Vision.

Remember the days long ago, when the millennium was new, when Programatica was an exciting step forward in programming environments for any language, let alone for functional language... ??

The Programatica vision: build a programming environment that supports and encourages its users in thinking about, stating and validating key properties; enabling programming and validation to proceed hand in hand, using properties to link the two; "programming as though properties matter".

Properties should be included as declarations within the program, using essentially the same syntax as the programming language. Properties will be checked with different degrees of confidence, from using something like quickcheck (generating automatic test cases) to exporting the information to theorem proving tools. This "knob" concept allows for variation in confidence on asserted properties, and the level of proof is captured by interlinked certificates providing an audit trail. Always the emphasis is on encouraging programmers to express the properties they expect, paired with assistance in checking whether the properties hold or not.

John Launchbury. FP application at Motorola.

A few weeks ago, we presented some early work on describing and combining permutations for use in the APFU (AIM Permutation Function Unit) to cryptoprocessor designers at Motorola. The permutations involved have historically been calculated by hand, and doing so can take weeks for a single one (with a lot of time spent testing and debugging). By representing permutations with lists (where the value of the ith element denotes which element of the input the ith output element comes from), we were able to express all of the high-level operations on permutations that are used to build the desired complex permutations as simple list operations. For example, overlaying two permutations is list append. This was seen as radical by the Motorola engineers, because the code describing the final permutations looks very similar to the specifications that they work from. From an FP point of view, the specification tool-set was simply a carefully designed combinator library.

Olivier Danvy. Lambda lifting, dropping and expressiveness.

Lambda lifting: from block structured programs to recursive equations, with application to compiling, partial evaluation and program transformation. Lambda dropping is the reverse, where parameters which do not change in recursive calls are lifted out of the recursion. Examples: map, foldr, CPS versions of iterative functions.

Ideally, the programmer should not have to care whether the function is written in either style. In reality, different code gets compiled with implementation tradeoffs. For example, SML/NJ and GHC perform lambda dropping for single recursive functions only (not mutual recursion), and some of their algorithms are quadratic in the number of parameters. The output of partial evaluators can have hundreds of parameters, so an efficient lambda-dropping algorithm is needed to enables the compilers to cope. Fortunately, such an algorithm can be built on top of Harel's linear dominator algorithm.

As a puzzle: (a) implement lifted append in terms of dropped fold; (b) implement dropped append in terms of lifted fold.

Dick Kieburtz. Logic for Strategies.

How do we reason about strategies for transforming terms? Strategies include conditions (analyzing the structure of terms), context (where a term occurs and its bindings) and control (order of application). Stratego is a language for programming transformation strategies: how do we reason about Stratego programs?

Try weakest precondition: what property must a term have for the strategy to succeed and produce a term satisfying the target property? A healthy set of structural rules can be constructed, but to cope with Stratego patterns and instantiation, the weakest precondition rules need to be parameterized by an environment. When handling committed choice (i.e. not full backtracking), the weakest precondition laws have been chosen to assume a demonic non-determinism.

To characterize the non-local properties of a term, we need to quantify over term geography. We can reapply CTL to this problem. Quantifiers A and E describe paths, and G, F and X describe depths, and these quantifiers are used in pairs. Some limits in expressibility can be overcome by modal mu-calculus, allowing G and F to be described by combining recursion with the X modality. We now connect this with the Stratego strategies "all", "one", and "some" and other recursive forms. "All" corresponds with AX, "some" with EX, "bottomup" with AG, "topdown" with AG, and the weakest precondition rules can be described accordingly.

Jacques Garrigue. Design Space of Polymorphic Variants.

Structural polymorphism (over records or variants) is in between parametric and adhoc polymorphism, but the variant part was first implemented only a couple of years ago. The basic concept behind variant polymorphism is to keep track of which constructors can arise as inputs or outputs of functions. For example, the type for

    let f = function `Apple as x -> x

    val f : [< `Apple] -> [> `Apple]

meaning that the argument to f is a subtype of `Apple, but the result is a supertype of 'Apple. So, the input and output types may be completely decoupled.

A problem with early work (implemented in Objective Label) was the fact that unification of variant types did not commute. The Objective CAML implementation uses conjunctive types for argument types in matching types. Then failure of unification can be delayed until one tries to unify with a conflicting value type, by using the function, for example. (Conjunction is used to construct a principle type allowing each of the conflicting types; the constraints are then said to be grafted together.)

As often occurs when exploring a design-space, this exploration seems not to be terribly useful (the Objective Label approach seems to be sufficient), BUT the grafted constraints are interesting, necessary for principality, and may be applied to objects (allowing unification of objects with incompatible methods).

The Expression Problem: extend a recursive variant type with new constructors without recompilation. There is a solution in an untyped framework, and a failed attempt using virtual types. It can be done with polymorphic variants, but the solution presented here uses multiple inheritance. By keeping the recursion open in the type until it is closed in the expression, the recursive function gets a nicely extensible (recursive) type:

    let rec eval1 subst = eval_lambda eval1 subst
    val eval1: (string * ('a lambda as 'a)) list -> 'a -> 'a

(Here, eval_lambda expresses a single step of the recursion.) Then, given eval_expr, which is an evaluator for a simple arithmetic expression language (say), we can combine the two evaluators thus:

let eval_lexpr eval_rec subst : 'a lexpr -> 'a = function
  #lambda as x -> eval_lambda eval_rec subst x
| #expr as x   -> eval_expr eval_rec subst x

let rec eval3 subst = eval_lexpr eval3 subst

Polymorphic variants can be used for function parameters, fixed-world subtyping, and inheritance and extension. However, they are often used with the wrong expectations, type-errors are very poor, and there is a lack of good design patterns.

Francois Pottier. Typing by Encoding: A Reductionistic Approach to Building Type Systems.

The central motivation is to avoid duplication of proof obligations/work when extending/building type systems in a systematic way. The method relies on an untyped encoding of the typed calculus into an untyped calculus. Three examples: encoding exceptions into sums, encoding Java-like stack inspection into sets, and encoding information flow into labels. In each example, the source language inherits its type system from the target language. The translations from source to target are essentially monadic.

We begin with a source calculus, a target calculus, and a sound and complete translation between them (expressed as reasonable properties relating the translation and the source reduction relations and the two target reduction relations; the second is an "administrative" reduction). Then any type system for the target calculus which satisfies some simple well-behavedness conditions gives rise to a type system for the source calculus. Source and target language share essentially the same type language (modulo a simple translation) and therefore the source type systems inherits the nice properties of the target system.

Martin Odersky. Functional Nets.

Functional Nets arise out of a fusion of FP and Petri Nets (Aside: Martin is unclear whether or not this is still FP). FP gives us a rewrite-based semantics with function application as fundamental computation step. PN gives us synchronization mechanism. So, a functional net is a higher-order concurrent functional language with PN-style synchronization. The foundations already exist in the join calculus. The language presented here is called Funnel.

Stage 1: simple (higher-order) function definitions; call-by-value. Stage 2: objects (identify record = object, function = method). Permit parameterless functions, and functions + objects give algebraic types (via encoding). For example, datatypes have a single method match, which defines methods for each constructor (choosing the appropriate method, i.e. implements shallow pattern-matching).

Stage 3: Concurrency. Function calls act as a model for events with operators '&' for conjunction of events and '=' for rewrites. If '&' appears on the rhs of '=', then it is fork; on the lhs, it is 'join'. Petri Nets can easily be expressed as Functional Nets , but Functional Nets are strictly more powerful: "tokens" can have parameters, be defined by nested definitions, or be higher-order.

Rewriting semantics: a set of calls matching a lhs rewrites to an instantiated version of the rhs. Calls which do not match block until they form part of a set that does match. Combining objects and joins naively is cumbersome (especially as number of method grows) due to inadequacy of record syntax (need to hide function symbols, but need access to them for initialization). The solution is use qualifiers not only to access records but also to define them:

def newBuffer = {
   def this.put x & empty  = () & full x,
       this.get   & full x =  x & empty,
   this & empty
}

Here, this refers to the record being defined, and it is initialized to empty.

Mutable state is realized via reference cells (aka variables) with functions read, write (external), state (internal). This leads to stateful objects: an object with many methods and instances can be expressed via the mutable state quite simply, enforcing mutual exclusion (monitor like: mutex on entry to object, i.e. use of a method). Many other kind of mutex easily expressed; readers/writers nicely done. Channel-based mechanisms (CCS, CSP, asynchronous pi-calculus) can be easily encoded.

The formal model is treated in two stages: sequential and concurrent. The core calculus for functions and objects is a name-passing, continuation passing style calculus; then to add concurrency, we add '&' as an expression and as a definition (reduction is easily extended). The formulation is equivalent to the original join calculus (that was a reflexive CHAM, this is a rewrite system).

First implementation runs on top of JVM. It is an interpreter, statically-typed, and provides colored local type inference. It was taught in a class on foundations of programming.

Xavier Leroy. Fun with IA64.

The IA64 achieves instruction level parallelism by grouping instructions to be executed in parallel. The instructions in a group must be independent (no RAW or WAW dependencies), allowing the processor to execute the instructions in a group sequentially, in parallel, or split into subgroups, without changing the semantics. To generate code, we need to track the cycle latencies between dependent instructions, and then use a greedy algorithm to choose a subset of instructions to be emitted in one cycle (with heuristics favoring the critical path etc.). More advanced scheduling techniques include moving code across branches, unfolding loop iterations, and even rescheduling recursive functions.

Branch prediction attenuates the branch penalty that would otherwise arise. This works well even for function returns, but little support is provided for the computed addresses that arise from higher order functions. The IA64 introduces predicated instructions as an alternative for avoiding some conditional branches, but they don't seem helpful for recursive functions or pattern matching.

The real challenge is with memory loads (even for programs in C or Fortran), especially as the delay is not known statically. A priori, we could not lift loads prior to branches, because the load may be invalid. However, the IA64 has speculative load instructions which permit an early load without invoking an error. The validity of the load is checked only when the value is needed. The compiler must trade the expected value of the load with the likely interference with needed loads. One place where the benefit seems significant is in preloading the subcomponents from a pattern match.

Warren Burton. Count your Garbage.

Our intent is to instrument programs to know the maximum space required throughout, and also the space used between garbage collections. Could do this by forcing GC after each reduction, but the cost is too high. Instead we adapt a mechanism designed for studying parallel programs. We add an extra field to each block of memory containing a time stamp. When a GC is performed, we generate a history of the how much garbage was generated between separate collections. When combined with allocation information we can calculate the high tide mark on the memory.

How do we know when each block becomes garbage? We keep track of time globally: an integer that is incremented each time a block of memory is allocated. Then, whenever we drop a pointer to a memory block we write the last allocation time into that block. If the block turns out later to be garbage, then the time field will contain the time the block became garbage. To propagate this information to other blocks that also became garbage at this time, we use a union-find algorithm.

Chris Okasaki. Simplified persistent arrays.

Naive persistent arrays are O(1) read O(n) write; balanced trees are O(log n) read and write; version arrays, difference lists etc. provide O(1) read and write in special cases, but O(log n) or worse in general; Dietz' method is O(log log n) expected amortized for read and write, but the scheme is incredibly complicated and the constant is large (note that monadic or uniqueness approaches give up on persistence).

There are results that demonstrate that without true random access a log factor is unavoidable. We base the structure on a trie (a tree indexed by array addresses). How many bits at a time should we use? If 1, then reads and writes are O(log n). If log n bits, then reads are O(1) but writes are O(n). If we use (log n)/k bits, reads are O(1) and writes are approx. O(root(k,n)). Better results can be gained by allowing the depth and width to grow (slowly) with n. Take sqrt(log n) bits at a time. The width is 2^sqrt(log n) and the depth is sqrt (log n). This provides reads at O(sqrt (log n)), and writes at O(sqrt (log n)* 2^sqrt(log n)) (essentially O(log n)).

Implementation is easy: store the children of each node in a naive array. The only tricky operation is initializing a tree where we take care to save time and space by sharing identical subtrees. This means that the arrays work well for sparse structures as well.

Daan Leijen. Haskell/DB.

SQL is the standard access language for databases, providing selection, restriction, join, aggregates etc. and is very widely used in web systems. Drivers for the SQL server are usually written in PERL or Visual Basic, but these provide no static checking on the structure of the generated commands. A recent ebay collapse (costing $5M) was due to an uninitialized variable.

Haskell/DB is based around a monad for generating SQL commands. The monad code is typechecked and is guaranteed not to generate run-time errors. Haskell/DB is able to do a lot of relational optimization not available to SQL, and allow the SQL engine to do the low-level optimization it is good at.

This work raises questions about how to embed a language in Haskell. We need to address structure, syntax and types. The underlying structure can often be expressed using an algebraic datatype, prettied up with operator declarations. To add additional type information, we use phantom types. These are data types with unused type arguments where it is the type declarations on the operators that restrict which values can be combined.

Tim Sheard. Challenges in Representing Programs as Data.

Metaprogramming systems manipulate program fragments. Using algebraic datatypes for this purpose produces ugly and unclear programs. Furthermore, object-level bindings force us to worry about escaping scope, and force us to use gensym et al for generating new object variables. And then there's the whole problem of typechecking these generated programs...

We use annotations to distinguish metacode from object code, e.g. in the variable f is object level, whereas x and y are meta level. In general, the user should provide a parser for the object parts, switching back to the meta level for the escaped parts. The annotations may then indicate which sublanguage is expected, as many may be used at the same time.

Lambda bindings are not quite right to handle object-level binding as evaluation ceases at the lambda. Four problems arise: opaqueness (the structure of abstractions cannot be observed), loss of expressivity (constructing terms from object variables), junk (the ability to write functions that do not correspond to any object-level code), and latent effects (computational effects in the metalanguage are introduced into the pure syntax of the object language). One solution is to introduce a new lambda which evaluates its body and that is subject to higher-order pattern matching.

Indexed types can be used to keep track of features such as the length of a list. Similarly, we use indexed types to track object types. That is, the type languages of object languages are represented as algebraic datatypes. The relationship between source and target programs is maintained by explicit type transformation functions, that are used by the type checker to ensure type correctness.

Peter Thiemann. Enforcing Safety Properties by Program Specialization.

Imagine a monitoring execution for applets which terminates the program if it attempts to perform a "bad" operation (such as a read followed by a send). The underlying model is based on security automata. In the above example, the state space of the automata could be {before-read, after-read}. The effect of the monitoring execution can be achieved by translation into the lambda calculus (in CPS) passing the security state around. Unfortunately, this threads the security state, and requires every primitive operation to perform a run-time check.

To optimize these inefficiencies, we translate to a two-level term, and then attempt to apply program specialization. We make the security state a compile-time value, and assume that the calculation of the state does not depend on run-time values. In addition to using standard tricks, the specialization needs a couple of non-standard tricks including, surprisingly enough, some type specialization. Actually, the key point could be captured using first-class, unifiable memoization tables that capture local polyvariance.

An alternative is to keep the security state as a run-time value, but to try to get rid of as many run-time checks as possible. To do this, we keep track of the state-transformation function that would be applied up to any given point, and insert a test only at those points for which the state transformation function can generate "bad".

Conal Elliott. Compiling Embedded Languages.

Modern functional languages are good hosts for embedded domain specific languages. Can we make optimizing compilers that respond in a domain specific way? There are several examples of functional geometry that enable the creation of some images, but what about general images, such as the set of all functions (Point->Color)? Useful Point transformations include functions like translate, scale, rotate, and swirling. For images we have functions like over, fade, move, and swirl. These combine to give a very wide variety of image families

Pan programs are written in Haskell. When run, they produce expressions in a simple first order language that is compiled and executed to produce the image. In effect, Haskell acts as a powerful macro language for the simple image description language, but also includes some image optimization in the process through bottom-up simplification and sub-image sharing. Type safety is achieved by phantom type variables.

Compilation includes code motion, sharing via pointers and structural equality, loop hoisting, and array synthesis, and care is needed regarding termination given that all definitions are unfolded completely.

Bob Harper. PSciCo.

The goal of this project is to use high-level languages for scientific computing and to achieve a smooth integration of prototyping with production code. How can it win? By using methods that are "unthinkable" in current languages, and by changing the terms of the debate to include issues other than simply efficiency. The starting point is ML with library access to NESL (a first order functional language with excellent support for implicit parallelism). We build shared libraries of scientific code, applied to a range of application problems. Our language base includes a new distributed, real-time garbage collector. ML supports "coherent integration" of abstractions better than Java or C++ do, in that it confirms coherence at compile time rather than dynamically.

The arithmetic required boils down to line-side and in-circle tests, and floats are not sufficient for this. We use a "lazy" interval arithmetic, with a history of the computation. If the interval is not sufficient to answer the question, we redo with exact arithmetic. However, can we extend exact arithmetic to cope with more than polynomial operations? Maintenance of rounding mode introduces some interesting challenges, simplified by the insight that if the lower bound is recorded in negative form then only one mode is required.

Triangulated surfaces occur again and again. We have an abstract description that corresponds directly to the text book accounts. A simplex, for example, is a sequence of vertices, and a complex has a log-time traversal. The cost of persistence (pure functionality) is logarithmic. Can we afford this? Fortunately, there is an optimal randomized algorithm for the persistent case that only has to touch the surface O(n) times, keeping us within the existing O(n log n) bound.

Dealing with skinny triangles can work well in the pure setting. Rather than perform a forward search every time a skinny triangle is handled (to check no new problems are being introduced), we simply backtrack to a dynamically generated exception whenever a problem is discovered. The old data structures still exist. This is much cleaner than the imperative model, though some rework of computations may arise in some situations. Furthermore, this method applies well to the 3D case whereas the traditional methods do not.

Andy Moran. Unique Fixed-Point Induction for (Almost) All the Family.

How do we reason about programs that contain non-determinism? The programs could be CBV, CBN, CBNeed; the non-determinism could be erratic (choose one and make a commitment), countable (pick a natural number) or based on McCarthy's amb operator (evaluate in parallel and accept the first that finishes). We provide an operational semantics and derive operational equivalences, cost equivalences and a unique fixed-point induction for cost equivalence.

One way of showing equivalences between terms is to show that the defining equations have unique fixed points, and to use unique fixed-point induction. To classify unique fixed points we introduce a cost calculation, typically based on the occurrences of beta and the uses of the non-determinism operator, and augment the operational semantics with costs. We define two terms to be cost equivalent iff they cost the same to converge in all contexts (written here =.=). To regain the original equivalence, we introduce a "tick" (a dummy operation that costs one computation step, written here as /).

The unique fixed point induction law is as follows:

  M =.= /C[Ms]   N =.= /C[Ns]
  ---------------------------
           M =.= N

where C[Ms] is a context C[] and Ms is a (language specific) substitution of M. In practice, the proofs are very calculational. The rule is valid for all non-deterministic functional languages except CBNeed + countable choice and CBNeed + Amb.

Manuel Chakravarty. iHaskell -- A state-based GUI Library without Mutable State.

We want to be able to use the Widget set GTK+, but the problem is that it (like the other such Widget sets) is inherently stateful, but we would prefer to avoid mutable state. Another goal is to separate front-end and back-end. We use a two-step process: build a Haskell binding to the GTK+ set, and then build a functional interface to that binding. We want to have a smooth integration with the IO monad (since "real" programs need the other features of the IO monad, which is hard to achieve with a separate GUI monad), the library should be easily extensible, applications should be distributable, and asynchronous (multi-threaded) if necessary.

The main problem remains: how to remove the state. State in GUIs comes in two flavors: that encapsulated in GUI Widgets (usually part of non-Haskell land, so all we can do is make it easy to access/use, aka "view model"), and that required by the logic of the application (independent of UI, aka "domain model"). The latter can be affected asynchronously by the GUI.

There are three basic interactions between view and domain models: streams (one-way synchronous), links (translating from one model to the other; when one changes, so must the other), and state transformers (affecting the domain model's state asynchronously). We model a time series instead of a snapshot, so mutable state becomes a series (stream) of values. The central abstraction, which allows the expression of each of the three basic interactions, is the port: a port accepts incoming events and serializes (outputting a stream). (This acts as a merge, but without the need for reconfiguration when adding new wires.)

The library contains the Ports interface and many examples. One can in addition filter (by adjusting the value, not discarding) the input to a port, and distribution is coming soon. One can also create proxy ports and linked ports (feedback loops are avoided).

Joe Stoy. Proving Cache Protocols.

The goal is to use a term-rewriting system to describe various cache protocols, and have different back-ends, allowing, for example, for verification (both by hand and machine-assisted via PVS), and compilation to Verilog (thereby allowing hardware synthesis). The application is a memory model with an associated cache protocol.

The CRF memory model defines a number of sites, each of which is a processor-cache pair, with a simple interface between. This is described by a restricted term-rewriting systems. Rewrites must preserve the structure of the term rewritten (this restriction must be guaranteed by the designer). A cache cell is an address-value-cache-state triple, and a cache is a set of such cells; a memory cell is an address-value pair, and memory is a set of such cells. Allowed processor actions are defined by rewrites, as is the behavior of caches. In addition, we need some background rules which may fire at any time. These describe the interaction between the cache and the memory (cache load, writeback, purge).

The protocol consists of three micro-protocols, known as the base protocol, Writer-Push, and Migratory. The base protocol is a simple naive caching protocol. Writer-Push is designed for the situation where many sites are reading an address, but writes are relatively rare. Migratory corresponds to the situation where only site is interested in an address. These protocols can coexist in the same system (although Writer-Push and Migratory cannot of course be in effect for the same address). Soundness has been shown for the base protocol, and soundness and liveness for the Writer-Push protocol.

Soundness is shown via simulation: one defines and establishes an invariant, defines an abstraction function, and then show that the abstraction of each implementation action is a valid specification action (or sequence, thereof). Liveness says that it's always true that every operand is eventually ready.

A design methodology called "imperative/directive" is used which guarantees the soundness of the system designed by identifying an imperative subset (with imperative and directive messages) for which soundness is easy to show, and then describe the complete system on top of that. Rules in the complete system are either special cases of the imperative subset, or correspond to no-ops in the imperative subset, so the early soundness result follows easily. It is important to achieve an orthogonal split between mandatory and voluntary rules.

This was proved in the PVS system. PVS has a powerful logic and lots of arithmetic theory built-in, which means that the tedious elements of formal proof (which constitute a very large part of such work) can be handled automatically. PVS is a strongly-typed (almost) FL, and has a very rich subtype structure. We embedded Lamport's TLA into PVS, but not as an embedded PVS datatype, because we wanted to bring the full power of PVS' prover to bear on the TLA assertions. TLA's existential quantifier was a challenge, since it requires a new kind of binding. Another problem was that the soundness of TLA depends upon stutter independence, whereas no such guarantee can be given for general PVS assertions. Both of these problems were overcome, more or less satisfactorily.

Rishiyur Nikhil. Experience Advocating Formal Methods at Digital/Compaq.

The Wildfire server is a major part of Compaq's business, having up to 32 processors and "scalable shared memory". Each processor has its own cache, and accesses the memory via a switched interconnection network (rather than a shared bus). The switched structure requires an exceedingly complex cache coherence protocol based on directory structures, that is built directly into the hardware.

Using TLA+ (Lamport), the Alpha memory model and the Wildfire protocols were specified, and then showed that the protocol was a refinement of the memory model (the verification concentrated just on the anticipated difficult cases and used the TLA+ model checker). Two minor bugs (unexpected behaviors) were discovered (one in the protocol, one in the Alpha memory model). In addition, the exercise provided new insight, that the large number of messages could be decomposed into a few basic "message quarks".

TLA+ has been used elsewhere within Compaq since then. For example, the Alpha EV7 team adopted TLA+ for verification of their protocols with research personnel acting a consultants only. Again, a standards battle between FIO and SIO was resolved when TLA+ was used to exhibit deadlock problems, leading to a new industry standard. However, still an uphill battle to get TLA+ accepted as a fundamental part of the design process.

Jack Dennis. Computer Architecture for Functionally Programmed Computations.

From a 40 year perspective on architecture for functionally programmed computations there seem to be two worlds: one driven by ideas of lambda-calculus (expressiveness, HO, sequential), the other by parallelism (first order, data driven, no side effects). The 60's saw the invention of LISP, including a parallel interpreter and observations that asynchronous execution can still deliver determinacy. These were refined in the 70s with data flow models and implementations, some of which were built by industry, including an image processing chip. The 80s saw the invention of waiting/matching stores, as well as combined Von Neumann / Data Flow architectures. Many of these ideas are still being attempted in industry today, though major successes have been limited to date.

The features of an ideal architecture: global virtual memory (across all hierarchies and levels), memory management in hardware (GC), fine grain parallelism, honor principles of modular programming including data-structure persistence.

Olivier Danvy. Syntactic Accidents in Program Analysis: On the Impact of the CPS Transformation.

The results of program analysis are not necessarily meant to be preserved under meaning-preserving program transformations of the source code. Not much is known about the effect of syntactic change on program analyses. We focus here on CPS.

CPS provides binding-time improvement by patching the loophole in the let rule. On the other hand, a function D->S is necessarily constant, but the corresponding CPS equivalent (D->(S->A)->A) is not a constant CPS function. Up to the mid-90s, CPS was commonly seen as a good thing until Sabry and Felleisen exhibited an analysis for which CPS both increased and decreased the precision of dataflow analysis, so yielding incomparable results. Since then no work on CPS analysis has appeared.

We took an off-the-shelf monovariant, constraint-based analysis, applied to a language in A-normal form, both before and after applying the CPS transformation. First we tried 0-CFA. Given a solution of the analysis of a program p, we want a solution of the analysis of p' (the CPS counterpart of p). We build a relation between the results of the analysis in direct style and CPS. We are able to prove that the least solutions of the analyses of p and p' are related, and that the judgments made for variables are consistent with each other.

Moving to BTA (again, off-the-shelf), we see that the same relationship holds in the forward direction, but not in reverse, because of the let-rule. If we enhance the let-rule (to match the CPS behavior, i.e. continuation-based PE), we regain the equivalence theorem.

Our conclusion is that (in contrast to Sabry and Felleisen) some very standard analyses have results that are preserved or improved under CPS. Furthermore, the framework provides a proof of the folklore result that CPS provides a global binding time improvement.

Paul Hudak. Functional Reactive Programming from First Principles.

Functional reactive programming (FRP) is a general declarative framework for reactive hybrid systems, with continuous time. Components are either behaviors (reactive, continuous-time-varying values), or events (streams of discrete event occurrences). Behaviors and events are both first class, and there are a rich set of operators over each type and for allowing events to switch between behaviors. FRP has been applied to animation, robotics, vision and GUIs, and has also been experimentally reimplemented in Java.

There is an ideal semantics for FRP based on continuous time, but the implementation only approximates the semantics. In what sense is the implementation correct? Is there a convergence property between the semantics and implementation (even under the assumption that we have exact real number arithmetic)? We can show that under an appropriate set of assumptions, the discrete implementation is faithful to the semantics, in the limit (when the interval size tends to zero).

The model for a behavior of type a (i.e. Behavior a) is the function (StartTime->CurrentTime->a). The presence of start time is to handle integration cleanly. An event of type a (i.e. Event a) is similarly modelled by the function (StartTime->CurrentTime->[(Time,a)]). The basic FRP operators then work out fairly easily.

The discrete implementation models signals as stream transformers type Behavior a = [Time] -> [a] type Event a = [Time] -> [Maybe a] To connect the implementation to the semantics we introduce an abstract implementation function whose result is the value of a behavior at the end of a series of sampling, and then define the limit function to be the limit obtained when the sample size tends to zero. However, simple convergence by itself is not enough to guarantee that integration behaves well. Instead we use the standard notion of uniform convergence, which we can show guarantees that integration behaves well. Happily, uniform convergence is preserved by all the operators including lifting, switching and integration. Moreover, uniform convergence rules out infinitely frequently events, so eliminating Zeno's paradox.

Satnam Singh. Formal Verification of Xilinx' Circuit Cores.

Core generators produce parameterized circuits that map onto their FPGAs. Correctness is an issue, as it provides significant value to the customer that the circuit can relied upon. Of the FV methods available, model checking is most likely to find support within the company. Our goal was to develop a custom formal verification system that understands Xilinx' library components, to deploy the system for the verification of the BaseBLOX collection of Xilinx cores, and evaluate the uses and limits of Prover's model checker.

Xilinx core generators have a structural description. This contains significant intellectual property, so needs to be kept secret, but users need to have a behavioral model available for simulation. ensuring that these match is vital. We compare the EDIF output of the core against EDIF code generated from the behavioral description. The tool (Argus) generates custom Lustre scripts for Prover's model checker L4. The results are built into a set of web pages that the engineers can browse.

Xilinx has placed an upper limit of 64 input bits to gates, which provides a bound for model checking. Xor implementations were a challenge, and needed access to a custom tool from Prover. Times were very reasonable (e.g. 96 node adder tree of 8-bit adders takes 21 seconds, an 8-bit multiplier takes 3 minutes). A number of bugs in the cores were uncovered in this process, and moreover the verification ran much faster than conventional simulation. Xilinx is planning a press release announcing a new verified version of BaseBLOX.

5 Minute Topics

Tim Sheard. Experience with constructing a source-to-source transformation for C.

Credence manufactures "chip testers" (sells 10-25 a year, each costs 5-10 million dollars, embedded Sun workstation, all configured identically, service is the name of the game). Each "test" is encoded as a C program. They're upgrading from SunOs 4 to Solaris, and want to change their library organization. Key idea: PINLIST. Used to be an array of unsigned short; want to replace it with an ADT. Want to identify programs that break the abstraction, fix programs that break the abstraction.

Strategy: use off the shelf FP tools. Choices: Alex Aiken's tools, or Ckit (from SML/NJ people). Chose Ckit.

Bad Experiences: macros and conditional compilation, #include chasing. Ckit makes some bad choices for program representation, esp. if doing transformation rather than analysis (i.e. different programs can be represented by same AST).

Good Experiences: Both tools work pretty well. Parsers easy to use and amazingly fast (can parse hundreds of large programs in 3 or 4 seconds).

Conclusions: the tools were good for what they were designed for. But if stretched beyond that, then things get a little hairy. The tools were designed for analysis, but used for source-to-source transformation.

John Hughes. The Random Monad.

A little about the random test-case generation mechanism in quickcheck.

      class RandomGen a where
          next  :: a -> (Int, a)
          split :: a -> (a, a)
      
      newtype Gen a = Gen (Seed -> a)
  
      Gen m >>= f =
          Gen (\s -> let (s1, s2) = split s
                         a        = m s1
                         Gen m'   = f a
                     in m' s2)
  
      random :: Gen Int

The problem is that this is not actually a monad. Viz.:

      random >>= return   ===   random

random

Chris Okasaki. One hierarchy or two?.

In the Edison library, there are two distinct hierarchies, one for sets and one for finite maps. The hierarchies are very similar, and have very similar names. Furthermore, because sets and finite maps are interdefinable, it is natural to wonder whether we may want to have a single hierarchy, and implement sets as finite maps, for example. On the other hand, this would restrict implementors perhaps unnecessarily. The tradeoff is simplicity of concept for the programmer versus possible implementation costs.

Conal Elliott. Some Functional Images.

Images are functions from points to colors. Lots of code examples were presented. Even though we are manipulating expressions rather than actual floats, the code deliberately obscures this and looks just like floating point code. Representing transforms as functions rather than matrices is more general, and provides even more opportunities for precomputational optimization than matrices do.

Jacques Garrigue. The 3 Layers of LablGTK.

LablGTK is a GTK interface for OCaml (originally for Objective Label). There are 3 layers:

C stubs and support functions. OCaml has good support for C stubs; mainly one expresses a the type information for each function.
Low-level ML interface; as little actual code as possible, concentrating upon making a well-typed layer. This is error prone, since there is no consistency check possible with the C side.
Object-oriented interface, building upon the well-typed layer. Indeed, the well-typedness of the second layer is crucial to this layer. Each GTK class is mapped to a OCaml class, although we restrict the methods to those that are useful (the GTK hierarchy has some odd features). Some tricks were needed to get around the lack of polymorphic methods in OCaml.

Manuel Chakravarty. The Marshalling Problem.

Marshalling is needed whenever we want to call a foreign function: data types in the local language must be converted to data types in the foreign language. But where do we write the conversion function? Two approaches: marshall on the foreign side, or marshall locally. The problem with the former is that it may corrupt the local heap and we need both foreign and local code for binding a single function. And it is quite tedious. So we prefer marshalling locally, mainly because corruption of the local heap is impossible (unlikely?) and it's just nicer! But it certainly isn't bulletproof. typedefs and the difficulty of knowing exactly how big datatypes on the foreign side, #ifdefs, and lack of complete local versions of #includes make local marshalling difficult. However, if the local language has a good low-level interface to the foreign language, then one can do all the low-level bit/byte manipulation in the local language, while retaining all of the nice capabilities of the local language.

Martin Odersky. Lightweight Monads (and other bindings).

Monads are pretty heavy machinery for introducing operator bindings. Is there a simpler way? Introduce a binding form (val v op e1; e2) to mean (e1 op (\v.e2)). This is sufficient to cover monads, and more.

Martin Odersky. Classes and Laziness.

A class fulfils two roles: it defined a type of objects with a common interface, and defines a way to create objects of the class. We model this with a type definition (giving the structure) and a value definition (with a "new" method). Static class members correspond to new methods in the value definition.

The translation of inheritance is expressed by inclusion using "with", having the super-class as an explicit object. But, to achieve dynamic binding of methods under inheritance, we introduce "this". The translation passes the current value of "this" as a parameter to the object creator method. But to create an object, we need to pass the reference to the created object to the "new" method that is performing the creation. We use a "lazy" let-construct to achieve this, that builds the record in a thunk-like fashion. A complete version of laziness is not required, merely delaying evaluation until the end of the recursive definition.

Chris Okasaki. Allocating Arrays.

Some algorithms make an assumption that arrays can be allocated in constant time (i.e. without initializing all the elements). How can this be done in a managed-heap setting? We reapply an old trick. Start by building an association table of index-value pairs (including a count of valid elements), and construct a reference array which indicates where in the association table the binding is. If the reference is valid, the association table will point right back at the reference. Moreover, the garbage collector can be made aware of the tables, and simply not follow pointers in the reference array.

Daan Leijen. Visual Haskell.

Our goal is to enable the integration of languages into visual studios, when minimal resources are available. Visual studio provides a graphical environment, a customizable editor, syntax highlighting, dynamic syntax checking, integrated compilation with some debugging, project management etc. Other feature like data visualization tools etc. are beyond the scope of the project. The Visual Haskell environment has integrated these features including todo lists. The lexical analysis is incremental during typing, but parsing is refreshed from the top down, whenever a line is completed. This approach is sufficiently fast on files even up to 10,000 lines.

Discussion: Host Languages for (Embedded) DSLs

Is Haskell a good "host" for a DSL?

Haskell has many good properties:

Higher-order functions. Having higher-order combinators is very useful, since they yield a natural way of representing the semantics of the DSL.
Lazy evaluation. For example, defining a combinator like a conditional, one doesn't want the branches to be evaluated until needed. Other things like parsers or lexers rely upon laziness.
Lazy definition. One needs to be able to define datatypes of the DSL lazily.
Overloading.
Types.
Infix syntax.

Shortcomings of Haskell:

Lack of mechanisms for introducing "new" binding constructs.
Lack of mechanisms for syntactic extension (i.e. defining new analogues to the "do" notation).
Limitations of the type system. This arises when one wants more from the overloading mechanism. For example, overloading (+) and (-) is easy, but it's hard to overload other prelude operations. If you want to represent terms as data instead of functions (in order to allow manipulation of the DSL terms), it can be difficult to achieve the desired level of polymorphism in that data. Especially difficult is representing substitution in the DSL in a type-safe way. Often the only way to have collections of differently typed DSL terms with differently typed keys is to use IORefs.
Sharing at the DSL level.

What I Want.

Problem domain: designing FPGAs with (Xilinx) Lava. Lava is good for rapid-prototyping, which allows one to explore the design space much faster. It's less useful when one wants to move to a production quality design. But the fact that the alternative tools are so poor is a great boon.

Why I make circuits in Haskell:

Powerful combinators for composing circuits.
Higher-order functions for generic circuits.
Lazy evaluation for nice simulation.
DSL that captures the right abstractions.
Overloading/non-standard interpretations.
Blind prejudice.

So Haskell/Lava allows us to manipulate circuits, provides a nice framework for analyses, and provides support for formal verification. But we need:

Types of composite circuits.
Sharing (forks in wires). Observable sharing is one approach (cf. Chalmers Lava).
Variable names/debugging.
Recursive do.
Support for integer subranges.
Consistent overloading of bool, e.g. if .. then .. else. This seems to require some kind of mixfix declaration plus extensions to class hierarchies. Proposal: form a working group to really look at this problem.

DSLs for designing/generating circuits are very popular at the moment. Other groups have been better promoting/publishing their work, and have been perhaps less forthcoming about the limitations.

One of the biggest is adoption. A starting point would be distributing Lava with large sets of real circuits.

Reflections on Haskell as a host language.

The discussion should be more about experience with embedding DSLs in general, rather than focussing on Haskell. So, though the following points are about my experience in using Haskell as a host language, they should be interpreted more generally.

So, what do we need in Haskell? Apart from the above points, more flexibility in the Prelude for adding literals (i.e. fromInteger for other user-defined literals) would be very useful.

Different approach is an embedded compiler, DSEL, DSEC. Pan is a DSEC; Another example is GHC's rewrite rule mechanism. We need to be able to help the DSEL/C implementor understand the host language compilation. Also, we want a computer algebra system; that is, support for simplifying/optimizing expressions in common and useful datatypes, like integers and booleans, etc. All this requires a more complete understanding of how the host compiler performs its own transformations.

The ML Approach: DSL Compilers/Preprocessors.

Most of the discussion thus far has to do with the fact hat the DSL is embedded, rather than being a new language, so the issues involved are to do with how to use the host language to express the embedding. This is more difficult in OCaml, or ML; we tend to use the host language to generate an AST (more like a preprocessor or simple compiler). So, Conal's wish to have optimizations work upon the DSL terms seems to be best met by writing a compiler for DSL (giving total control over the optimizations applied).

The advantage of using the Haskell approach is that it is very easy to embed a DSL in Haskell, since one doesn't have to reimplement basic compiler features (like parsing, type checking, etc.). And one also has access to all of the normal Haskell constructs, and, in an open embedding, the DSL terms can even use Haskell. Additionally, one can easily combine different embedded DSLs (e.g. Haskell/DB + PrettyPrint + CGI) in the same program.

But perhaps the DSL user has no Haskell knowledge, and would prefer to have a limited language to work with. And one needs to be careful about distinguishing between the embedded language and the host. The first could perhaps be dealt with by defining a standard syntax for DSLs. And the second seems to cry out for meta-programming, but much work is required (but Tim is optimistic).

Sixteenth WG2.8 Technical MeetingPackwood, WA, USAJuly 2000