Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Home Page of Prof. Thomas W. Reps
[go: Go Back, main page]

University of Wisconsin-Madison Skip navigationUW-Madison Home PageMy UW-MadisonSearch UW
 

 

UW-Madison
Computer Sciences Dept.

Thomas W. Reps

Professor
Computer Sciences Department
University of Wisconsin-Madison
1210 West Dayton Street
Madison, WI 53706-1685
USA

Picture of Thomas W. Reps
Contact Information:

E-mail: reps at cs.wisc.edu
Telephone: (608) 262-2091
Secretary: (608) 262-6616
Department: (608) 262-1204
Fax: (608) 262-9777
Finger information

Some Maps: Campus Map | Campus-to-Capitol Map | City Map

Ph.D., Cornell University, 1982 (CV, Biography, Research Summary, Past Research)

Research Interests:

See also Citeseer, Google Scholar, and DB&LP;.

Projects:


Research Summary

The goal of my research is to create tools that support the development of complex software systems. The aim is to create tools that help programmers develop correct, reliable, and secure software by providing powerful operations for manipulating programs and analyzing their properties.

The tools that we are developing are based on static analysis: static analysis concerns techniques for obtaining information about the possible states that a program passes through during execution, without actually running the program on specific inputs. Instead, static-analysis techniques explore a program's behavior for all possible inputs and all possible states that the program can reach. To make this feasible, the program is ``run in the aggregate''---i.e., on abstract descriptors that represent collections of many states.

Most of our effort is focused on three topics:

(A summary of some of my past research work can be found here.)

Static Program Analysis via 3-Valued Logic

The aim of this work [SRW02, RSW04] has been to create a parametric framework for program analysis that is capable of expressing rich properties of memory configurations, and confirming such behavioral properties as

  • when the input to a list-insert program is an acyclic list, the output is an acyclic list, and
  • when the input to a list-reversal program that uses destructive-update operations is an acyclic list, the output is an acyclic list.
A parametric analysis framework is one that can be instantiated in different ways to create different program-analysis algorithms that provide answers to different questions, with varying degrees of efficiency and precision.

This work was originally motivated by the problem of shape analysis for programs written in languages that permit destructive updating of dynamically allocated storage. The aim of shape analysis is to discover information (in a conservative way) about the possible ``shapes'' of the dynamically allocated data structures to which a program's pointer variables can point. As we investigated this problems more deeply, we discovered methods that allowed us to deal with properties of memory configurations other than just pure shape properties. For instance, recent work has concerned numeric properties [GDDRS04].

The key aspect of our approach is the way in which it makes use of 2-valued and 3-valued logic: 2-valued and 3-valued logical structures---i.e., collections of relations---are used to represent concrete and abstract stores, respectively; individuals represent entities such as memory cells, threads, locks, etc.; unary and binary relations encode the contents of variables, pointer-valued structure fields, and other aspects of memory states; and first-order formulas with transitive closure are used to specify properties such as sharing, cyclicity, reachability, etc. Formulas are also used to specify how the store is affected by the execution of the different kinds of statements in the programming language.

The analysis framework can be instantiated in different ways by varying the relation symbols of the logic, and, in particular, by varying which of the unary relations control how nodes are folded together. The specified set of relations determines the set of properties that will be tracked, and consequently what properties of stores can be discovered to hold at different points in the program by the corresponding instance of the analysis.

As a methodology for verifying properties of programs, the advantages of the 3-valued-logic approach are:

  1. No loop invariants are required.
  2. No theorem provers are involved, and thus every abstract execution step must terminate.
  3. The method is based on abstract interpretation and satisfies conditions that guarantee that the entire process always terminates.
  4. The method applies to programs that manipulate pointers and heap-allocated data structures.
  5. The method eliminates the need for the user to write the usual proofs required with abstract interpretation---i.e., to demonstrate that the abstract descriptors that the analyzer manipulates correctly model the actual heap-allocated data structures that the program manipulates.
A PowerPoint presentation on some of this material can be found here.

A prototype implementation that implements this approach has been created, called TVLA (Three-Valued-Logic Analyzer).

Static Analysis of Executables

The aim of this research is to recover intermediate representations that are similar to those that can be created for a program written in a high-level language. Our initial work has concerned x86 executables. The goal is to provide a platform that an analyst can use to understand the workings of potentially malicious code, such as COTS components, plugins, mobile code, worms, Trojans, and virus-infected code. In this context, the major challenge is that we must assume that the executable is untrustworthy, and hence symbol-table and debugging information cannot be relied upon (even if it is present). Instead, an analysis---called value-set analysis (VSA)---is carried out to recover information about the contents of memory locations and how they are manipulated by the executable [BR04].

A key feature of VSA is that it tracks address-valued and integer-valued quantities simultaneously. VSA is related to pointer-analysis algorithms that have been developed for programs written in high-level languages, which determine an over-approximation of the set of variables whose addresses each pointer variable can hold:

VSA determines an over-approximation of the set of addresses that each data object can hold at each program point.
At the same time, VSA is similar to range analysis and other numeric static-analysis algorithms that over-approximate the integer values that each variable can hold:
VSA determines an over-approximation of the set of integer values that each data object can hold at each program point.

Value-set analysis has been incorporated into a tool called CodeSurfer/x86. The facilities of CodeSurfer/x86 provide an analyst with a powerful and flexible platform for investigating the properties and behaviors of an x86 executable, using either (i) CodeSurfer/x86's GUI, (ii) CodeSurfer/x86's scripting language, which provides access to all of the intermediate representations that CodeSurfer/x86 builds for the executable, or (iii) GrammaTech's Path Inspector, which is a tool that uses a sophisticated pattern-matching engine to answer questions about the flow of execution in a program.

CodeSurfer/x86 is the outcome of a joint project between the Univ. of Wisconsin and GrammaTech, Inc. CodeSurfer/x86 makes use of both IDAPro, a disassembly toolkit, and GrammaTech's CodeSurfer system, a toolkit for building program-analysis and inspection tools.

CodeSurfer/x86 is a tool for code understanding and code inspection that supports both a graphical user interface (GUI) and a scripting language for accessing a program's system dependence graph (SDG) [HRB90], as well as other information stored in CodeSurfer's intermediate representations (IRs). The GUI supports browsing (``surfing'') of an SDG, along with a variety of operations for making queries about the SDG---such as slicing [HRB90] and chopping [RR95]. The GUI allows a user to navigate through a program's source code using these dependences in a manner analogous to navigating the World Wide Web.

CodeSurfer's scripting language provides a programmatic interface to these operations, as well as to lower-level information, such as the individual nodes and edges of the program's SDG, call graph, and control-flow graph, and a node's sets of used, killed, and possibly-killed variables. By writing programs that traverse CodeSurfer's IRs to implement additional program analyses, the scripting language can be used to extend CodeSurfer's capabilities.

Weighted Pushdown Systems and their Application to Interprocedural Dataflow Analysis

This work has established new connections between interprocedural dataflow analysis and model checking of pushdown systems (PDSs). Various connections between dataflow analysis and model checking have been established in past work, e.g., [S91, S93, S98, EK99, CC00]; however, with one exception ([EK99]), past work has shed light only on the relationship between model checking and bit-vector dataflow-analysis problems, such as live-variable analysis and partial-redundancy elimination. In contrast, the results obtained in our work [RSJ03, RSJM] apply to (i) bit-vector problems, (ii) the one non-bit-vector problem addressed in [EK99], as well as (iii) certain dataflow-analysis problems that cannot be expressed as bit-vector problems, such as linear constant propagation SRH96 and affine-relation analysis MOS04. In general, the approach can be applied to any distributive dataflow-analysis problem for which the domain of transfer functions has no infinite descending chains. (Safe solutions are also obtained for problems that are monotonic but not distributive.)

The contributions of the work can be summarized as follows:

  • Conventional dataflow-analysis algorithms merge together the values for all states associated with the same program point, regardless of the states' calling context. With the dataflow-analysis algorithm obtained via weighted PDSs, dataflow queries can be posed with respect to a regular language of stack configurations. Conventional merged dataflow information can also be obtained by issuing appropriate queries; thus, the new approach provides a strictly richer framework for interprocedural dataflow analysis than is provided by conventional interprocedural dataflow-analysis algorithms.
  • Because the algorithm for solving path problems in weighted PDSs can provide a witness set of paths, it is possible to provide an explanation of why the answer to a dataflow query has the value reported.

The algorithms described in [RSJ03, RSJM] have been implemented in two libraries that solve reachability problems on weighted PDSs: WPDS and WPDS++ (available soon). These libraries have been used to create prototype implementations of context-sensitive interprocedural dataflow analyses for uninitialized variables, live variables, linear constant propagation, and the detection of affine relationships.

[Back to the top]


Recent Items of Note*new*

Recent Papers

[Back to the top]

Miscellaneous

The paper

  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. In Proceedings of the ACM SIGPLAN 88 Conference on Programming Language Design and Implementation, (Atlanta, GA, June 22-24, 1988), ACM SIGPLAN Notices 23, 7 (July 1988), pp. 35-46. [abstract]
was selected for inclusion in a special SIGPLAN collection of the 50 most influential papers from the SIGPLAN Conference on Programming Language Design and Implementation from 1979 to 1999:
  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979 - 1999): A Selection, K.S. McKinley, ed., ACM SIGPLAN Notices 39, 4 (April 2004), 232-243.

A retrospective on the paper was published as

  • Horwitz, S., Reps, T., and Binkley, D., Retrospective: Interprocedural slicing using dependence graphs. 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979 - 1999): A Selection, K.S. McKinley, ed., ACM SIGPLAN Notices 39, 4 (April 2004), 229-231. [retrospective (PS); retrospective (PDF)]

An extended version of the PLDI 88 paper later appeared as the following journal paper:

  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. ACM Transactions on Programming Languages and Systems 12, 1 (January 1990), 26-60. [abstract; paper]

[Back to the top]

Availability of Program-Slicing Tools

The Wisconsin Program-Slicing Tool

The The Wisconsin Program-Slicing Tool is a software system that supports operations on C programs, including backward slicing, forward slicing, and chopping [TOPLAS90, FSE94, FSE95b], which can help the user gain an understanding of what a program does and how it works. The Wisconsin Program-Slicing Tool consists of a package for building and manipulating control-flow graphs and program dependence graphs, as well as a front-end that parses C programs and translates them to the internal representations used for slicing.

Although the Wisconsin Program-Slicing Tool was distributed from 1996-2000 for not-for-profit research purposes under license from the University of Wisconsin, it is no longer being distributed. However, there is a commercial tool available, named CodeSurfer, that is derived from the Wisconsin implementation and is available from GrammaTech, Inc. GrammaTech has improved on the Wisconsin implementation considerably (both in terms of speed and space efficiency). GrammaTech also provides CodeSurfer to academic researchers on very favorable terms.

CodeSurfer

The Wisconsin program-slicing technology is now available in a commercial product, the CodeSurfertm tool available from GrammaTech, Inc.

CodeSurfer builds a dependence-graph program representation and provides a GUI for exploring this web. The dependence graph includes forward and backward links between each assignment statement and possible uses of the values stored by that assignment. Pointer analysis is used so that indirect loads and stores through pointers are taken into account, as well as indirect function calls. Dataflow analysis is used so that links between unrelated assignments and uses are excluded. Operations that highlight forward and backward slices show the impact of a given statement on the rest of the program (forward slicing), and the impact of the rest of a program on a given statement (backward slicing) [TOPLAS90, FSE94]. Operations that highlight paths between nodes in the dependence graph (chops) show ways in which the program points are interdependent (or independent) [FSE95b]. CodeSurfer's scripting language, which provides access to the dependence-graph program representation and the Tk widget set, is used for extensibility and for batch applications.

Full information about CodeSurfer is available here, where you can find

  • full documentation,
  • whitepapers,
  • pricing information,
  • information about use by academic researchers,
  • the site for downloading an evaluation copy.
You should be aware that the most important current limitations of CodeSurfer are as follows:
  • Language: C (full support), C++ (alpha support)
  • Program length: < 200K SLOC
CodeSurfer is currently available for Windows NT, Windows 2000, Windows XP, Linux, and Solaris (2.5.1 or later).

[Back to the top]

Categorized Index to Publications

Program Slicing, Differencing, Merging, etc.

Overview

The Wisconsin Program-Slicing Tool

CodeSurfer System

Slicing

Chopping

Differencing

Merging

Algebra of slices (and applications to program merging)

Semantics and slicing

Other applications of slicing

Implemented integration system (for a small Pascal subset)

Miscellaneous

Ph.D. Dissertations

[Back to the top]

Interprocedural Dataflow Analysis

Demand IDFA via bottom-up logic programming and the magic-sets transformation

Exhaustive and Demand IDFA via graph reachability

IDFA using more than graph reachability

PTIME completeness of IDFA

[Back to the top]

Alias Analysis, Pointer Analysis, and Shape Analysis

[Back to the top]

Other Program-Analysis Problems

Ph.D. Dissertations

[Back to the top]

Path Problems

Context-Free-Language Reachability

Other Path Problems

Ph.D. Dissertations

[Back to the top]

Model Checking

[Back to the top]

Computer Security

Ph.D. Dissertations

[Back to the top]

Code Instrumentation

Ph.D. Dissertations

[Back to the top]

Computational Differentiation and Computational Divided Differencing

[Back to the top]

Language-Based Program-Development Environments

Ph.D. Dissertations

[Back to the top]

Incremental Computing

Ph.D. Dissertations

[Back to the top]

Attribute Grammars

Ph.D. Dissertations

[Back to the top]

Miscellaneous

[Back to the top]


Disclaimer

This web page contains links to PostScript files of articles that may be covered by copyright. You may browse the articles at your convenience (in the same spirit that you read a journal article or an article from a conference proceedings in a public library). Retrieving, copying, or distributing these files may violate copyright law.

Please note that the definitive versions of these papers are the published versions. The PostScript versions are provided here as a courtesy, and, in some cases, there may be differences between the PostScript provided here and the published version. We believe that all of the differences are either formatting differences or copy-editing changes. If you cite these papers, please cite the published version rather than giving a URL.

[Back to the top]


List of Publications

See also Citeseer, Google Scholar, and DB&LP;.

Books

[Back to the top]

Journal Publications

[Back to the top]

Invited Papers

  • Reps, T., Sagiv, M., and Wilhelm, R., Static program analysis via 3-valued logic. In Proc. Int. Conf. on Computer-Aided Verification, 2004, 15-30. [abstract; PostScript; PDF; (c) Springer-Verlag; talk: Powerpoint.]

  • Rall, L.B. and Reps, T.W., Algorithmic differencing. In Perspectives on Enclosure Methods, U. Kulisch, R. Lohner, and A. Faciush (eds.), Springer-Verlag, Vienna, 2001, 133-147. (Invited paper presented at SCAN 2000: 9th GAMM-IMACS Int. Symp. on Sci. Comput., Comp. Arith., and Validated Numerics, (Karlsruhe, Ger., Sept. 19-22, 2000).) [abstract]

  • Wilhelm, R., Sagiv, M., and Reps, T., Shape analysis. In Proc. of CC 2000: 9th Int. Conf. on Compiler Construction, (Berlin, Ger., Mar. 27 - Apr. 2, 2000). [abstract; PostScript; PDF; (c) Springer-Verlag]

  • Reps, T., Program analysis via graph reachability. In Proc. of ILPS '97: International Logic Programming Symposium, (Port Jefferson, NY, Oct. 12-16, 1997), J. Maluszynski (ed.), The M.I.T. Press, Cambridge, MA, 1997, pp. 5-19. [abstract; paper, (c) Springer-Verlag]

  • Reps, T., The use of program profiling in software testing. In Proc. of Informatik '97 (Aachen, Germany, Sept. 24-27, 1997), M. Jarke, K. Pasedach, and K. Pohl (eds.), Springer-Verlag, Berlin, Ger., 1997, pp. 4-16. [abstract; paper, (c) Springer-Verlag]

  • Horwitz, S. and Reps, T., The use of program dependence graphs in software engineering. In Proceedings of the Fourteenth International Conference on Software Engineering, (May 11-15, 1992, Melbourne, Australia), ACM, New York, NY, 1992, pp. 392-411. [abstract; paper]

  • Reps, T. and Horwitz, S., Semantics-based program integration. In Proceedings of the Second European Symposium on Programming, (Nancy, France, March 21-25, 1988), Lecture Notes in Computer Science, Vol. 300, H. Ganzinger (ed.), Springer-Verlag, New York, NY, 1988, pp. 1-20.

[Back to the top]

Book Chapters

  • Sagiv, M., Reps, T.W., Wilhelm, R., and Yahav, E., On the utility of canonical abstraction. In Engineering Theories of Software Intensive Systems, M. Broy, J. Gruenbauer, T. Hoare, and D. Harel (eds.), Kluwer Academic Publishers, Dordrecht, The Netherlands, 2005, 215-256.

  • Reps, T., Sagiv, M., and Wilhelm, R., Shape analysis and applications. In The Compiler Design Handbook: Optimizations and Machine Code Generation, CRC Press, 2002, 175-217.

  • Reps, T., Demand interprocedural program analysis using logic databases. In Applications of Logic Databases, R. Ramakrishnan (ed.), Kluwer Academic Publishers, Boston, MA, 1994, pp. 163-196. [abstract; paper]

[Back to the top]

Reprinted in Collections

  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979 - 1999): A Selection, K.S. McKinley, ed., ACM SIGPLAN Notices 39, 4 (April 2004), 232-243. [abstract]

    Reprinted from Proceedings of the ACM SIGPLAN 88 Conference on Programming Language Design and Implementation, (Atlanta, GA, June 22-24, 1988), ACM SIGPLAN Notices 23, 7 (July 1988), pp. 35-46.

    A retrospective on the paper was published as

    Horwitz, S., Reps, T., and Binkley, D., Retrospective: Interprocedural slicing using dependence graphs. 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979 - 1999): A Selection, K.S. McKinley, ed., ACM SIGPLAN Notices 39, 4 (April 2004), 229-231. [retrospective (PS); retrospective (PDF)]

  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. In Software Change Impact Analysis, S.A. Bohner and R.S. Arnold (eds.), IEEE Computer Society, Los Alamitos, CA, 1996.

    Reprinted from ACM Transactions on Programming Languages and Systems 12, 1 (January 1990), 26-60. [abstract; paper]

  • Horwitz, S., Reps, T., and Binkley, D., Interprocedural slicing using dependence graphs. In Software Merging and Slicing, V. Berzins (ed.), IEEE Computer Society, Los Alamitos, CA, pp. 10-44.

    Reprinted from ACM Transactions on Programming Languages and Systems 12, 1 (January 1990), 26-60. [abstract; paper]

  • Horwitz, S., Prins, J., and Reps, T., Integrating non-interfering versions of programs. In Software Merging and Slicing, V. Berzins (ed.), IEEE Computer Society, Los Alamitos, CA, pp. 137-179.

    Reprinted from ACM Transactions on Programming Languages and Systems 11, 3 (July 1989), 345-387. [abstract; paper]

  • Ramalingam, G. and Reps, T., A theory of program modifications. In Software Merging and Slicing, V. Berzins (ed.), IEEE Computer Society, Los Alamitos, CA, 90-105.

    Reprinted from Proceedings of the Colloquium on Combining Paradigms for Software Development, (Brighton, UK, April 8-12, 1991), Lecture Notes in Computer Science, Vol. 494, S. Abramsky and T.S.E. Maibaum (eds.), Springer-Verlag, New York, NY, 1991, pp. 137-152.

  • Reps, T. and Teitelbaum, T., Language processing in program editors. In Language Architectures and Programming Environments, T. Ichikawa and H. Tsubotani (eds.), The World Scientific Publishing Company, Singapore, 1992, pp. 146-169.

    Reprinted from IEEE Computer 20, 11 (November 1987), 29-40.

  • Teitelbaum, T. and Reps, T., The Cornell Program Synthesizer: A syntax-directed programming environment. In Interactive Programming Environments, D. Barstow, E. Sandewall, and H. Shrobe (eds.), McGraw-Hill, 1984, pp. 97-116.

    Reprinted from Communications of the ACM 24, 9 (September 1981), 563-573. [abstract; paper, via ACM Digital Library]

  • Teitelbaum, T., Reps, T., and Horwitz, S., The why and wherefore of the Cornell Program Synthesizer. In Software Development Environments, A.I. Wasserman (ed.), IEEE Computer Society, Washington, DC, 1981, 64-72.

    Reprinted from Proceedings of the ACM SIGPLAN/SIGOA Symposium on Text Manipulation, (Portland, OR, June 8-10, 1981), ACM SIGPLAN Notices 16, 6 (June 1981), pp. 8-16.

[Back to the top]

Magazine Articles

[Back to the top]

Conference Publications

[Back to the top]

Software

[Back to the top]

Patents

[Back to the top]

Pending Submissions

[Back to the top]

Other Publications and Reports

[Back to the top]


Visitors, Post-Docs, and Students

Visitors and Post-Doctoral Associates

  • Neil Immerman (U. Mass.), 2004-05.

  • Bertrand Jeannet (IRISA), Mar. - June, 2003.

  • Stefan Schwoon (Univ. of Stuttgart), Feb. - Mar., 2003.

  • David Melski, 2002. (Currently with GrammaTech, Inc., Ithaca, NY.)

  • Mooly Sagiv, 1994-95. (Currently Senior Lecturer (tenured), School of Computer Science, Tel-Aviv University, Israel.)

  • Robert Paige (NYU), 1990-91.

  • Jiazhen Cai (NYU), 1990-91.

  • Wuu Yang, 1990-91. (Currently Professor, Department of Computer and Information Science, National Chiao-Tung University, Taiwan.)

  • Jan Prins , 1986-87. (Currently Professor, Department of Computer Science, University of North Carolina, Chapel Hill.)

Former Students

Current Students

[Back to the top]

 
Computer Sciences | UW Home