Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
The Semantics of Multiprocessor Machine Code: x86, Power, and ARM
[go: Go Back, main page]

The Semantics of Multiprocessor Machine Code: x86, Power, and ARM

Multiprocessors are now dominant, but real multiprocessors do not provide the sequentially consistent memory that is assumed by most work on semantics and verification. Instead, they have subtle relaxed (or weak) memory models, usually described only in ambiguous prose, leading to widespread confusion.

We have developed rigorous semantics for multiprocessor programs above three architectures: x86, Power, and ARM. Each covers the relaxed memory model, instruction semantics, and instruction decoding, for fragments of the instruction sets, and is mechanized in HOL. Such models should provide a solid intuition for low-level programming, and a sound foundation for future work on verification, static analysis, and compilation of low-level concurrent code, and for the design of high-level concurrent programming languages.

For x86, we have produced two models: x86-CC and x86-TSO.

The first, x86-CC, was based on the then-current Intel and AMD documentation: the Intel 64 and IA-32 Architectures Software Developer's Manual [up to revision 28, Sept. 2008] and AMD64 Architecture Programmer's Manual [vol 2, rev. 3.14, Sept. 2007], and also with the earlier Intel 64 Architecture Memory Ordering White Paper [Aug. 2007]. We tested the semantics against actual processors and the vendor litmus-test examples, and gave an equivalent abstract-machine characterisation of our axiomatic memory model. We proved that the memory model is equivalent to a simpler (`nice execution') definition. For programs that are, in some precise sense, data-race free, we proved in HOL that their behaviour is sequentially consistent. However, that vendor documentation, and hence x86-CC, turned out to be unsound with respect to the behaviour of actual x86 processors: see the Addendum to the x86-CC paper below. More recent vendor documentation (revision 29 of the Intel SDM) fixes that unsoundness but gives unreasonably weak guarantees to programmers.

Accordingly, we have now produced what we believe to be a better model, x86-TSO. To the best of our knowledge this is sound with respect to actual processor behaviour, matches the current vendor intentions, and is a better model to program above. We give two equivalent definitions of x86-TSO: an intuitive operational model based on local write buffers, and an axiomatic total store ordering model, similar to that of the SPARCv8. Both are adapted to handle x86-specific features. We have implemented the axiomatic model in our memevents tool, which calculates the set of all valid executions of test programs, and, for greater confidence, verify the witnesses of such executions directly, with code extracted from a third, more algorithmic, equivalent version of the definition.

The Power and ARM architectures have a very different memory model to x86, but are broadly similar to each other. Our preliminary DAMP 09 semantics described below is based on the Power 2.05 specification and the ARM Architecture Reference Manual v7. Note that this model has some major flaws; further work on the semantics is needed. Here the core memory model was also formalised in Coq.

Papers

Code

Tools for the x86-CC paper

Experimental data for the x86-CC paper

  • From the x86 litmus memory model testing: summary and details.
  • From the x86sem sequential testing: results

People

(Computer Laboratory, University of Cambridge)

(INRIA Paris-Rocquencourt)

  • Samin Ishtiaq

(Microsoft Research Cambridge)

example execution

[validate]