Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456 REMS
REMS: Rigorous Engineering for Mainstream Systems
Project Outline
There are four interlinked Tasks:
Task 1: Semantic tools
The mathematical models of mainstream-system abstractions that we need
have to be expressed in well-defined formal languages, for the sake of
mathematical precision; and those languages need good software tool
support, to let us handle large and complex definitions. In this task
we address the foundational question of what formal languages are
suitable for this purpose; the pragmatic question of what facilities
good tool support should provide; and the engineering question of
actually building these tools to a sufficient quality to be usable by
other researchers and by industry staff.
In REMS we are developing and using
the Lem tool for
large-scale portable semantic specification (supporting a typed
higher-order logic and compiling to OCaml, LaTeX, Coq, HOL4, and
Isabelle/HOL), and the L3 and Sail domain-specific languages for processor
instruction-set semantic definition; these are used in much of the
rest of the project.
We also continue to develop the Ott tool for
programming language semantic definition (supporting user-defined
syntax and inductive relations over it, and compiling to LaTeX, Coq,
HOL4, and Isabelle/HOL), and to study the foundations needed for future
semantic tools.
Our first key interface is the architectural one between hardware and
software, specifying the multiprocessor behaviours that processor
implementations guarantee and that systems code relies on. Normal
industrial practice is to specify this interface in informal-prose
architecture reference manuals, but these prose descriptions cannot be
used directly to test either the hardware below the interface or
software above it, or to build verification tools, let alone as a basis
for proof. Compounding the problem, architectures must be nondeterministic loose
specifications, to accommodate microarchitectural variation,
so one cannot simply characterise them with reference implementations.
We are pursuing several directions here:
Our collaboration between semantic modelling
researchers and systems researchers working on CHERI (a multiprocessor
hardware design with explicit capabilities) has developed into a
central tool for the latter, for testing and debugging CHERI hardware
and software and for exploring new design options.
Semantics-driven random testing has been developed both for CHERI
and for an L3 model of ARMv6, and ARM Cortex-M0 implementations,
equipped with accurate cycle counts.
Turning to production multiprocessor semantics, for IBM POWER
we have integrated
an operational concurrency model with an ISA model for the sequential
behaviour of all the non-floating-point non-vector user-mode instruction set
(largely automatically derived from the vendor pseudocode, and
expressed in our Sail ISA description language).
For ARMv8 we have developed related models, and user-mode ISA
description, that have a clear relationship to the
intuitions of the ARM designers.
We also continue to develop and maintain sequential L3 models for ARMv6-A,
ARMv8 and x86-64, to support the work on
CakeML described later, and work research groups elsewhere.
We have also studied the concurrency behaviour of GPUs, for NVIDIA
and AMD hardware.
The next key interface is the programming language. Systems
programming is almost entirely carried out using low-level languages
such as C/C++. These are efficient and relatively easy to interface
with hardware, but they are difficult to use, even more difficult to
use correctly, and incredibly difficult to reason about, especially in the concurrent setting. This has led,
directly or indirectly, to the vast majority of bugs and
vulnerabilities observed in modern operating systems.
Work proceeds on Cerberus, a semantic model for a significant fragment
of C as it exists in current practice.
Its thread-local semantics is factored via an elaboration into a
simpler Core language, to make it conceptually and mathematically
tractable. This is integrated with an operational model for C11
concurrency, with a mechanised proof of equivalence to the axiomatic
C11 model of Batty et al. The front-end includes a parser that
closely follows the C11 standard grammar and a typechecker. Cerberus
is executable, to explore all behaviours or single paths of test
programs, and it supports proof, as shown by a preliminary experiment
in translation validation for the front-end of Clang, for very simple
programs.
Another view of the C de facto standard is provided by our work
on how to
make existing systems software run on the CHERI experimental
processor, identifying the tensions between the additional safety
guarantees that CHERI's capability system enforces vs the more
freewheeling assumptions of some existing software.
From a more systems-oriented perspective, we continue to develop our
libcrunch tool to dynamically detect type errors in C
code, and to use the underlying machinery for metadata to evolve the
runtime services provided by Unix processes towards those of VMs.
We explored the use of coroutines for
concurrency in C, based on a continuation-passing style translator.
We have
studied C compilation under GCC to the ARMv6-M
instruction set of the Cortex-M0 embedded processor, as in Task 2. Results suggest that this preserves enough code structure
to support using the translation validation approach of Thomas Sewell, Myreen and
Klein (see Task 4) to lift execution costs from machine instructions up to
C~source.
The C/C++11 concurrency model (as formalised by Batty et al. and
standardised by the ISO committee) is the best general-purpose
programming-language concurrency model we have, but it is far from
perfect. We have characterised a number of problems in the model
and suggested future directions.
Linking with our GPU architectural
modelling described in Task 2, we have also worked on precise
specification of (and corrections to) the OpenCL 2.0 concurrency
memory model, which incorporates features similar to those of the
C/C++11 concurrency model we studied earlier.
We have also worked on concurrency optimisation, adding optimisations to remove redundant memory
barriers in the ARM backend of LLVM, in work incorporated into the
main LLVM codebase.
Task 4: Concurrent Software Interfaces and Verification
Our final task develops reasoning methods and examines some particular
higher-level software interfaces.
Work on compositional verification has
provided new reasoning
principles about abstract atomicity, extended this to support
termination proofs for non-blocking algorithms,
developed more modular verification techniques, addressed vertical
compositionality, and developed a proof system for reasoning about
concurrent algorithms above the x86-TSO architecture model.
New tool support for Hybrid Separation Logic supports verification of
some simple programs operating on graphs and acyclic graphs (whereas
separation logic based approaches have up to now mostly only been able
to work on lists and trees).
We have also continued to work on linearisability-based
approaches to concurrent algorithm verification, and developed new
consistency conditions weaker than linearisability.
Looking in detail at the POSIX filesystem interface, we
developed a program
logic characterisation for a core of POSIX, and
a theoretical and empirical investigation into the
behaviour of filesystem implementations has produced an extensively tested model of POSIX,
Linux, and MacOSX filesystems.
Our compiler verification work has contributed to the CakeML verified
ML compiler project, and extended the seL4 verification down to the
binary level.
We also contributed to the CerCo verified compiler.
Finally, we
have developed a clean-slate specification and implementation of the
Transport Layer Security (TLS) protocol. This is fundamental to
internet use, but its implementations have a history of
security flaws.