The Semantics of Multiprocessor Machine Code: x86, Power, and ARM

Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
The Semantics of Multiprocessor Machine Code: x86, Power, and ARM

The Semantics of Multiprocessor Machine Code: x86, Power, and ARM

Multiprocessors are now dominant, but real multiprocessors do not provide the sequentially consistent memory that is assumed by most work on semantics and verification. Instead, they have subtle relaxed (or weak) memory models, usually described only in ambiguous prose, leading to widespread confusion.

We have developed rigorous semantics for multiprocessor programs above three architectures: x86, Power, and ARM. Each covers the relaxed memory model, instruction semantics, and instruction decoding, for fragments of the instruction sets, and is mechanized in HOL. Such models should provide a solid intuition for low-level programming, and a sound foundation for future work on verification, static analysis, and compilation of low-level concurrent code, and for the design of high-level concurrent programming languages.

For x86, we have produced two models: x86-CC and x86-TSO.

The first, x86-CC, was based on the then-current Intel and AMD documentation: the Intel 64 and IA-32 Architectures Software Developer's Manual [up to revision 28, Sept. 2008] and AMD64 Architecture Programmer's Manual [vol 2, rev. 3.14, Sept. 2007], and also with the earlier Intel 64 Architecture Memory Ordering White Paper [Aug. 2007]. We tested the semantics against actual processors and the vendor litmus-test examples, and gave an equivalent abstract-machine characterisation of our axiomatic memory model. We proved that the memory model is equivalent to a simpler (`nice execution') definition. For programs that are, in some precise sense, data-race free, we proved in HOL that their behaviour is sequentially consistent. However, that vendor documentation, and hence x86-CC, turned out to be unsound with respect to the behaviour of actual x86 processors: see the Addendum to the x86-CC paper below. More recent vendor documentation (revision 29 of the Intel SDM) fixes that unsoundness but gives unreasonably weak guarantees to programmers.

Accordingly, we have now produced what we believe to be a better model, x86-TSO. To the best of our knowledge this is sound with respect to actual processor behaviour, matches the current vendor intentions, and is a better model to program above. We give two equivalent definitions of x86-TSO: an intuitive operational model based on local write buffers, and an axiomatic total store ordering model, similar to that of the SPARCv8. Both are adapted to handle x86-specific features. We have implemented the axiomatic model in our memevents tool, which calculates the set of all valid executions of test programs, and, for greater confidence, verify the witnesses of such executions directly, with code extracted from a third, more algorithmic, equivalent version of the definition.

The Power and ARM architectures have a very different memory model to x86, but are broadly similar to each other. Our preliminary DAMP 09 semantics described below is based on the Power 2.05 specification and the ARM Architecture Reference Manual v7. Note that this model has some major flaws; further work on the semantics is needed. Here the core memory model was also formalised in Coq.

Papers

A Better x86 Memory Model: x86-TSO (Extended Version). Technical Report UCAM-CL-TR-745. Scott Owens, Susmit Sarkar, and Peter Sewell. See also these Errata. A short version of this, below, is in TPHOLs 2009.
A Better x86 Memory Model: x86-TSO. Scott Owens, Susmit Sarkar, and Peter Sewell. In TPHOLs 2009.
A picture of the x86-TSO abstract machine
The Semantics of x86-CC Multiprocessor Machine Code (in POPL 2009). Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus O. Myreen, and Jade Alglave.
Supplementary Examples (additional illustrations and discussion for the examples in the x86-CC paper above, and a brief informal-mathematics summary of the x86-CC axiomatic memory model)
The Semantics of Power and ARM Multiprocessor Machine Code ( in DAMP 2009). Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. Note that the model presented here has some major flaws; further work is in progress.

Code

The x86-TSO HOL sources: to browse and download (gzipped tarball).
The x86-TSO HOL definitions, typeset: in ps and in pdf
The x86-CC HOL sources: to browse and download (gzipped tarball).
The x86-CC HOL definitions, typeset: in ps and in pdf
The DAMP'09 Power and ARM HOL sources: to browse and download (gzipped tarball).
The DAMP'09 Power and ARM HOL definitions, typeset: in ps and in pdf
The DAMP'09 Power and ARM Coq sources: to browse and download (gzipped tarball).

Tools for the x86-CC paper

memevents
x86 litmus
x86sem (gzipped tarball)

Experimental data for the x86-CC paper

From the x86 litmus memory model testing: summary and details.
From the x86sem sequential testing: results

People

(Computer Laboratory, University of Cambridge)

(INRIA Paris-Rocquencourt)

Samin Ishtiaq

(Microsoft Research Cambridge)

[validate]