Research Statement for David Coppit

David Coppit

Research Statement

Introduction

My approach to research is to develop a portfolio of opportunities. I invest my resources into those areas that are likely to yield good short-term research results, while also balancing the need to invest in longer-term, more risky activities that may have revolutionary impact. This strategy has proven to be a smart one, allowing me to more rapidly adjust to research opportunities and difficulties.

In the next two sections, I describe two directions of my research. The first, more ambitious direction is in software plans, an approach to structuring software that is orthogonal to traditional modularization techniques. The second is bounded exhaustive testing, an approach to testing that seeks to leverage automated testing technologies to improve the trustworthiness and reliability of software.

Software Plans

For thirty years computer scientists have struggled with methods for the design of software. To date this effort has focused primarily on the development of new programming language abstractions for separating and encapsulating concerns in software. Language developments such as procedures, abstract data types, and classes have made great strides in addressing this problem. And yet there are some concerns such as logging and debugging that remain difficult to encapsulate, resulting in additional complexity and its concomitant consequences on software quality.

Recent work in hyperspaces, traits, and aspect-oriented programming are attempts to address this problem. But each takes the traditional research approach, attempting to create new programming language abstractions to modularize inherently non-modular concerns. My research on software plans instead investigates an alternative approach that separates concerns using advanced editor technology.

I seek to develop a new abstraction for software, inspired by the notion of plans in traditional architectural design. A software plan is an editable view of a module that presents the code for the concern being implemented along with the code for the concerns upon which it depends but not code for independent concerns. The overall implementation of the module is a set of plans.

This is longer-term, higher-risk research project that presents an alternative to the current approaches. Key challenges include:

Impact

This work has the potential to create a new research direction in the aspect-oriented software development community. To date, researchers have focused on language-based approaches. In contrast, software plans is a language-independent approach that treats cross-cutting concerns as inherently different than the functionality that has been traditionally encapsulated using functional and object-oriented language abstractions.

Software plans have significant practical impact. Like program slicing, software plans can provide a reduced view of the code, but based on cross-cutting features rather that data flow. This means that the cost of many program maintenance activities can be reduced. It also provides a framework where a new feature can be integrated with each existing feature in a separate plan, before it is integrated into the complete system. This reduces the cost of development. We have submitted three papers to top conferences on our approach and prototype tool.

Current Status

Research in this area is progressing on several fronts. Undergraduates Benjamin Cox and Justin Manweiler were instrumental in the implementation of prototype editors that allowed us to experiment with different approaches to software plans [1]. Separately Robert Painter and I have developed a new model for code that allows us to explicitly represent concerns and their dependencies in code segments [6]. Robert has also developed heuristics for composing separately developed software plans that appear to work quite well.

Separately, Meghan Revelle has developed a technique for reverse-engineering software plans from legacy code. Her approach uses test cases to exercise the code for different concerns. Trace data is collected and then analyzed using formal concept analysis. The resulting concept lattice is then used to construct the software plan.

I currently have two proposals on this research under consideration. In January I plan to also submit a proposal to the NSF Science of Design program that seeks to investigate the relationship between software plans and object-oriented programming. In the meantime, my students continue to develop techniques, models, and tools to support the approach.

Bounded Exhaustive Testing

Traditional testing involves partitioning the input space, selecting a representative, and then deriving the expected output or verifying the actual output. As effective as this approach can be, it suffers from a key problem: faults involving rare combinations or sequences of events are unlikely to be revealed. Manual test selection is unlikely to reveal such faults because they are, by nature, hidden from human understanding by the complexities of the domain or the subtleties of low-level software issues. Automated methods, such as those involving code coverage analysis, operate under the unproven assumption that such measures can reveal faults involving both common and rare circumstances.

In comparison, model checking has been used successfully to identify subtle errors in specifications by verifying that desired properties hold in an exhaustive but limited segment of the state space. Jackson's "small scope hypothesis" states that a high proportion of faults can be found by exhaustively testing all inputs within some small scope [5]. This avenue of my research applies the bounded exhaustive verification approach used in model checking to the verification of software.

There are a number of key challenges. First, while exhaustive generation of simple test data is trivial, exhaustive generation of structurally complex test data is difficult. For example, it is a nontrivial task to generate all syntactically correct C++ programs up to a given size to test a compiler. Second, the relationship between exhaustive, random, and traditional testing approaches remains unknown. Without some guidance, software developers lack the information needed to wisely invest testing resources in each of the approaches. Third, the "oracle problem" is a looming issue—without some way to automatically validate test results, exhaustive or random testing remains infeasible.

The approach I take in this research is to combine automated input generation with the use of the design by contract methodology. Design by contract addresses the oracle problem by embedding the specification of the software into the software itself as self-checking assertions. This sidesteps the need to develop a separate oracle, using the observation is that it is usually easier to check a solution than it is to compute one.

Impact

Exhaustive testing was long ago discarded as an approach that is effective in theory, but infeasible in practice. One outcome of this research is to demonstrate that bounded exhaustive testing can be both feasible and effective for real systems, mirroring the success of bounded exhaustive verification in model checking. Specific contributions are:

Current Status

Recent work has shifted from exhaustive generation of small inputs to generation of random inputs. The hypothesis is that random testing can be as effective as exhaustive testing. However, random testing of real systems is nontrivial in practice. First, inputs to the software under test can have structure that makes truly random generation difficult. Traditional approaches are based on randomizing alternatives in a grammar, and fail to take into account bias resulting from the structure of the grammar. Justin Manweiler is investigating techniques for unbiased generation of structured test data from grammars.

Another problem is the size of the input space. Although a program can accept a wide range of input, it is often the case that only a subset of that input space will exercise the software in a way that is worth testing. To be effective, random input generation techniques must automatically identify this interesting region, then generate test cases randomly within it. Jackie Murphy is developing an algorithm for random generation of test inputs in an implicitly defined input space.

Work is also ongoing to demonstrate the efficacy of combining bounded exhaustive test data generation with specification-based contracts. our previous work used my formally-based Nova implementation as an oracle for the Galileo software. The goal of the new research to apply the contract-based approach to a sizeable system and demonstrate that it can reveal subtle faults not revealed by traditional testing techniques.

References

[1] David Coppit and Benjamin Cox. Software plans for separation of concerns. In Proceedings of the Third AOSD Workshop on Aspects, Components, and Patterns for Infrastructure Software, Lancaster, UK, 22 March 2004. IEEE.

[2] David Coppit and Jennifer Haddox-Schatz. On the use of specification-based assertions as test oracles. In Proceedings of the 29th Annual IEEE/NASA Software Engineering Workshop. Greenbelt, Maryland, 6-7 April 2005.

[3] David Coppit and Jeixin Lian. yagg: An Easy-To-Use Generator for Structured Test Inputs. In 20th IEEE/ACM International Conference on Automated Software Engineering, pages 356-359, Long Beach, California, 7-11 November 2005.

[4] David Coppit, Jinlin Yang, Sarfraz Khurshid, Wei Le, and Kevin Sullivan. Software assurance by bounded exhaustive testing. IEEE Transactions on Software Engineering, 31(4):328-39, April 2005.

[5] Daniel Jackson and Craig A. Damon. Elements of style: Analyzing a software design feature with a counterexample detector. IEEE Transactions on Software Engineering, 22(7): 484-495, July 1996.

[6] Robert R. Painter and David Coppit. A Model for Software Plans. In Proceedings of the First International Workshop on the Modeling and Analysis of Concerns in Software, pages 14-18, St. Louis, Missouri, 16 May 2005.

[7] Kevin Sullivan, Jinlin Yang, David Coppit, Sarfraz Khurshid, and Daniel Jackson. Software assurance by bounded exhaustive testing. Proceedings of the International Symposium on Software Testing and Analysis. Boston, Massachusetts, 11-14 July 2004.

Last changed November 06 2006 13:14:17. David Coppit, coppit@cs.wm.edu

There have been 86751 hits since Thu Jun 9 14:49:55 2005