GC6: Dependable Systems Evolution

 

Home
Up

The work of six of the groups is recorded in a special issue of the Formal Aspects of Computing Journal, and the following sections, which are based on the special issue's editorial,  summarise these papers.

 

The FLASH filestore Mini-Challenge

 

This page is based on a talk given at the ICFEM 2007 FLASH Workshop in Boca Raton on 13th November 2007.

 

A verification challenge: the POSIX-compliant flash-memory filestore

 

In the first conference on Verified Software: Theories, Tools, and Experiments, held in Zurich in 2005, Joshi and Holzmann suggested a new pilot project for the grand challenge: the specification and verification of the POSIX filestore interface of Unix. It involves a small subset of POSIX (which is detailed below) suitable for flash-memory hardware with strict fault-tolerant requirements to be used by forthcoming NASA missions. 

 

The challenge proposes the following questions:

  • Is the mechanical verification of a practical filestore within the state of the art? 

  • Can high levels of automation be achieved in the verification? 

  • Can useful guarantees be given about the dependability of the system across different hardware platforms? 

  • How do different theories, tools, and technique compare?

A series of informal development workshops are being organised during 2008 for those who are working on the problem. This is inspired by the success of the Mondex project, where workshops gave useful deadlines for research groups to make progress and to collaborate. It may even be the case that some groups will agree on a common notation and toolset.

 

Why POSIX?

 

The POSIX interface is mature, stable, clean, well-defined, and standardised.  It can be implemented using well-understood data structures and algorithms.  Although it is reasonably small, it is still complex enough to provide a real challenge to verification tools and techniques.  It has a wide application beyond JPL/NASA. 

 

Why flash memory?

 

Flash memory is chosen for mission-critical filestores, including space flight, because it is non-volatile and relatively insensitive to shock and changes in pressure and temperature.  Most importantly, there are no moving parts, and in particular, nothing rotates.

 

POSIX History

 

The POSIX API was specified in Z by Patrick Place (SEI, CMU) in 1995, who used the formal specification to analyse the requirements for the POSIX standard, highlighting ambiguities and contradictions.  This resulted in some guidelines for the actual POSIX standard that was developed in 1998.  Place was inspired by Morgan & Sufrin’s Z specification.

 

Intel has proposed an architecture for flash-based file systems that explains how to separate and then orthogonally combine the following aspects:

  • Functional requirements.

  • Different underlying hardware.

  • Fault-tolerance issues.

This architecture is amenable to generalising hardware.

 

Project overview

 

The POSIX interface was chosen for the following reasons:

  • It is a clean, well-defined, and standard interface that has been stable for many years.

  • Underlying data structures and algorithms are well understood.

  • It is complex enough to be interesting, in terms of reliability guarantees like unexpected power loss, concurrent access, or data corruption.

  • Modern information technology is massively dependent on reliable, safe, and secure information availability.

This architecture is particularly useful for NAND flash-memory devices, the technology JPL intend to use. It is also amenable to generalisation to other types of hardware, as well as varied fault-tolerance concerns. As a scientific (rather than commercial) challenge, this is important in order to move away from NASA's specific hardware and fault-tolerant requirements. Thus, following this path still keeps us to the Grand Challenge's ideal of scientific purity, generality, and accuracy.

 

The Modelling Goal

 

The project is going to build an initial formal model of NAND Flash memory based on the Open NAND Flash Interface specification (ONFI).  It will focus on structural aspects and internal organisation of the file system, giving an abstract view of each mandatory operation’s behaviour.  The ultimate goal is a mechanically verified NAND-flash filestore.

 

Functional requirements

 

The functional requirements specification for POSIX file systems are hosted by the Open Group, and was agreed as a standard for Unix implementations in 2003. Before this, an early attempt at producing this requirements document used the Z Notation to straighten out unresolved ambiguities (see the resulting Z document). The Open Group requirements document was originally inspired by available Unix file systems formal specifications, such as Morgan and Sufrin's paper on the Unix filing system.

 

Initial subset of POSIX

 

This minimal POSIX interface focuses on the basic functionality of files and directories, where we want to support at least the following API:

    create open close unlink read
    write truncate ftruncate stat fstat
    mkdir rmdir rename opendir readdir
    rewinddir closedir format mount unmount

The API commands are documented in the POSIX standard. There is no requirement to provide initial support for:

  • File permissions.

  • Hard or symbolic-links.

  • Encryption, directory listing, regular expressions, and other utility operations.

  • Entities other than traditional files and directories (e.g., no pipes, sockets, etc.).

Popular flash-memory file systems, such as YAFFS2, do not support these features, since they are not typically needed for the embedded environments in which flash devices usually reside. But this should not stop interested researchers tackling these problems: we are just prioritising the work, which must eventually be tackled.

 

The JPL proposal includes a conservative guarantee about concurrent behaviour: 

  • The result of executing concurrent operations is equivalent to executing them in some serial order. 

Hardware requirements

 

Although a general Intel architecture integrating the file system functionality with varied hardware is possible, we again turn our attention to one particular target: NAND flash-memory. This type of hardware has been standardised quite recently by ONFI

 

Handling Failure

 

There are two principal requirements for fault tolerance:

  • No corruption in the presence of unexpected power loss.

  • Recovery from faults specific to flash hardware..

In recovery from power loss, the file system is required to be reset-reliable in the following sense:

  • If an operation A is in progress at the time of a power loss, then on reboot, the file system state will be as if A either has successfully completed or has never started.

This guarantee entails a performance cost, which is being traded for simplicity. But this may be too strict a requirement for a general embedded file system, so this issue remains open in the context of the grand challenge, and so presents yet another research opportunity to those who want to get involved.

 

Flash devices are also prone to unrecoverable block failure due to workload-related aging, and wear-levelling algorithms attempt to minimise the failure rate.  Fault tolerance must be organised by host, with a need to present a fault-free view to higher levels.

 

Progress so far

 

The results obtained between May 2006 and December 2007 are:

  • Mechanisation of half of Morgan and Sufrin's Z specification.

  • Refinement proof to a Z data structure resembling Java's HashMap class.

  • Mechanisation of half of Open Group's Z formalisation of the requirements document for Real-time Distributed Systems Communication.

Plans for project publication

 

It is expected that each group working on the problem will submit papers to the various grand challenge workshops and conferences. They will also be asked to deposit their work in the Verified Software Repository, and to submit a longer paper to a special issue of a leading journal.

Contact: jim@cs.york.ac.uk

Last updated: 6th December 2007