Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Relaxed-Memory Concurrency

Relaxed-Memory Concurrency

Multiprocessors are now pervasive and concurrent programming is becoming mainstream, but typical multiprocessors (x86, Sparc, Power, ARM, Itanium) and programming languages (C, C++, Java) do not provide the sequentially consistent shared memory that has been assumed by most work on semantics and verification. Instead, they have subtle relaxed (or weak) memory models, exposing behaviour that arises from hardware and compiler optimisations to the programmer. Moreover, these memory models have usually described only in ambiguous (and sometimes flawed) prose, leading to widespread confusion. We are working to develop mathematically rigorous and usable semantics for multiprocessor programs, focussing on three processor architectures (x86, Power, and ARM); on the upcoming revisions of the C++ and C languages (C++0x and C1x); and on reasoning and verification using these models.

x86
Power
C++0x and C1x
Reasoning about x86 concurrency: TRF
Compiler Verification: CompCertTSO, from Concurrent Clight (with TSO semantics) to x86-TSO
Correctness of Compiler Optimisations for DRF
Program Logics for x86-TSO

Position Papers, Tutorials and Invited Talks
People
Funding

x86

For x86, we have produced the x86-TSO model. To the best of our knowledge, x86-TSO is sound with respect to actual processor behaviour, matches the current vendor intentions, and is a good model to program above. The model is described in the CACM paper below, written to be accessible for a broad audience, and formally defined in the TPHOLs 2009 paper below. The POPL 2009 paper gives a rather different model, x86-CC, based on the misleading vendor documentation of the time, that is not sound with respect to observable behaviour. See also more details.

x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors. Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli, Magnus O. Myreen. In Communications of the ACM (Research Highlights) 2010 No.7.
A Better x86 Memory Model: x86-TSO. Scott Owens, Susmit Sarkar, and Peter Sewell. In TPHOLs 2009.
The Semantics of x86-CC Multiprocessor Machine Code (in POPL 2009). Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus O. Myreen, and Jade Alglave.

Power

For Power, we have produced several models. The PLDI 2011 paper below gives an abstract-machine model that, to the best of our knowledge, accurately captures the architectural intent and observable processor behaviour for a wide range of subtle examples. The TACAS 2011 tool paper describes our Litmus experimental testing tool. The CAV 2010 paper gives an earlier axiomatic model that is sound with respect to current POWER processor behaviour, as far as we know, but does not match the architectural intent for some examples (as described in the PLDI 2011 paper). It is based on a general framework developed in Alglave's thesis. The DAMP 2009 paper gives a very preliminary axiomatic model, based on a naive reading of the vendor documentation. See also more details of the PLDI model and more details of the CAV and DAMP work.

Understanding POWER Multiprocessors. Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, Derek Williams. In PLDI 2011
Litmus: Running Tests Against Hardware. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell. Tool Demonstration Paper. In TACAS 2011.
Fences in Weak Memory Models. Jade Alglave, Luc Maranget, Susmit Sarkar and Peter Sewell In CAV 2010.
The Semantics of Power and ARM Multiprocessor Machine Code. Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. In DAMP 2009.

C++0x and C1x

C and C++ are defined by standards, but those standards have historically not covered the behaviour of concurrent programs, motivating an ongoing effort to specify concurrent behaviour in a forthcoming revision of C++ (unofficially, C++0x) [WG21, Boehm and Adve, PLDI 2008]. The next C standard (unofficially, C1X) [WG14] is expected to follow suit. The key issue here is the multiprocessor relaxed-memory behaviour induced by hardware and compiler optimisations. The proposals aim to provide strong guarantees for race-free programs, together with new (but subtle) relaxed-memory atomic primitives for high-performance concurrent code. However, the draft standards (e.g. the Final Committee Draft (N3092), are prose documents: while the result of careful deliberation, they have almost inevitably been unclear on some points, and have been subject to some subtle semantic flaws. The POPL 2011 paper below gives a mathematically rigorous semantics for C++ concurrency. To the best of our knowledge, this captures the intent of the Final Committee Draft (N3092) text, but with changes we suggested at the Rapperswil meeting of the C++ Standards Committee in July 2010, which have been incorporated into the current draft. Some further changes are currently under discussion. We hope that this will aid discussion of any further changes to the draft standard, provide an unambiguous correctness condition for compilers, and give a basis for analysis and verification of concurrent C and C++ programs. As a starting point, the paper also proves correctness of a compilation scheme from C++0x concurrency to x86-TSO. See also more details.

Mathematizing C++ Concurrency. Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. In POPL 2011

Reasoning about x86 concurrency: TRF

The low-level assembly-language implementations of the abstractions that support language-level concurrency, such as locks and concurrent datastructures, are particularly interesting and challenging to reason about because they invariably contain data races. The ECOOP 2010 paper below develops a novel principle for reasoning about assembly programs on our previous x86-TSO memory model, uses it to analyze concurrency abstraction implementations of two spinlocks (from Linux), a non-blocking write protocol, the double-checked locking idiom; and java.util.concurrent's Parker. This triangular race freedom principle strengthens the usual data-race freedom style of reasoning. See also more details.

Reasoning about the Implementation of Concurrency Abstractions on x86-TSO. Scott Owens. In ECOOP 2010.

Compiler Verification: CompCertTSO, from Concurrent Clight (with TSO semantics) to x86-TSO

CompCertTSO is a compiler that generates x86 assembly code from ClightTSO, a large subset of the C programming language enhanced with concurrency primitives for thread management and synchronisation, and with a TSO relaxed memory model. The development is based on the CompCert compiler from sequential Clight to PowerPC and ARM (developed by Leroy et al., INRIA). The CompCertTSO compiler is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that, for well-defined source programs, any behaviour of the generated code (with x86-TSO semantics) is permitted by the source-language semantics --- has been proved within Coq. The POPL 2011 paper describes some of this work, and the second paper below describes further work on fence elimination; see also more details.

Relaxed-Memory Concurrency and Verified Compilation. Jaroslav Sevcik, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, Peter Sewell. In POPL 2011
Verifying Fence Elimination Optimisations. Viktor Vafeiadis, Francesco Zappa Nardelli. Draft.

Correctness of Compiler Optimisations for DRF

Current proposals for concurrent shared-memory languages, including C++ and C, give well-defined semantics (in fact, sequential consistency) only for programs without data races; this is known as the DRF guarantee. The correctness of compiler optimisations under the DRF guarantee is not completely clear, and experience with Java shows that this area is error-prone. The PLDI 2011 paper below gives a rigorous study of optimisations that involve both reordering and elimination of memory reads and writes, covering many practically important optimisations. It also discusses some surprising limitations of the DRF guarantee. Earlier work by Ševčík and Aspinall, in ECOOP 2008, showed that the Java Memory Model disallows some standard optimisations, including some performed by the Hotspot JVM. See also more details.

Safe Optimisations for Shared-Memory Concurrent Programs. Jaroslav Ševčík. To appear in PLDI 2011.

Program Logics for x86-TSO

The paper below develops a program logic for x86-TSO assembly programs, extending rely-guarantee reasoning. It is formalised in HOL4 and illustrated with Simpson's 4-slot algorithm.