Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
Explicit Multi-Threading (XMT) - Home Page
[go: Go Back, main page]


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Explicit Multi-Threading (XMT): A PRAM-On-Chip Vision

A Desktop Supercomputer


The XMT vision was introduced in 1997.
Click for an updated introduction to XMT, June 2006 .
Click for the most recent XMT presentation. .
Click for an older version of this page .


Navigate this page:


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

The XMT programming model, in a nutshell

The XMT programming model is quite simple. See figure below. Its primary example is a standard C programming language variant called XMT-C that adds only two basic commands. The main added command is spawn. The program alternates between serial mode and parallel mode. The spawn command can declare any number of concurrent threads and causes a switch from serial mode to parallel mode. Each thread advances through its program at its own speed till termination. When all threads terminate (depicted as Join in the figure), the program switches back to serial mode. Successive spawn commands can each declare a different number of threads. This brief description is not meant to replace a fuller description that can be found in technical presentations and papers below. For example, this brief description suppresses: (i) the second added command, (ii) the classification of XMT-C as a so-called single-program multiple-data (SPMD) language, (iii) the so-called independence-of-order (IOS) semantics of XMT-C, and, of course, (iv) how the program is compiled and implemented in hardware.


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

New:

On-Line Tutorial on Parallel Programming through Parallel Algorithms

The following material is now available for download following a full day tutorial to high-school students at the University of Maryland on September 15, 2007:
(1)
Announcement of the tutorial.
(2) Short description of the tutorial, same as for the June 17 Seattle tutorial, noted below.
(3) 72 slides.
(4) Video recording of the September 15 tutorial in 4 parts. The morning session is in Part 1A (100 minutes), and Part 1B (95 minutes). Then the students had a 30 minutes hands-on session on the XMT desktop supercomputer (1st picture and 2nd picture), followed by heading back to the lecture room (3rd picture), Part 2A (109 minutes), and Part 2B (10 minutes).
In addition, the following 3 documents provide a basis for parallel algorithms, as well as for parallel programming in XMT-C:
(5) U. Vishkin. Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques, 104 pages. These class notes are meant to make the fundamental theory of parallel algorithms accessible to computer science and engineering graduate students. They have also been tested successfully for upper division undergraduate students. Download class notes.
(6) XMT-C Manual and
(7) XMT-C Tutorial

Past tutorial: Tutorial on "How to Think Algorithmically in Parallel", Seattle, WA, Sunday, June 17, 2007, held in conjunction with the 21st ACM International Conference on Supercomputing (ICS).



Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Talks

·  (General talk:) Explicit Multi-Threading (XMT): A Theorist's Proposal for a Big Little Computer System. Download talk (27 slides). Versions of the SPAA'98 talk were given at IBM Haifa Research Laboratory (8/98), University of Maryland Computer Engineering Colloquium (9/98), The Computer Architecture and Parallel System Laboratory, University of Delaware (10/98), The Center for Research on Parallel Computation, Rice University (10/98), CASES98: Workshop on Compiler and Architecture Support for Embedded Systems, Washington, DC (12/98), New York University (12/98), IBM T.J. Watson (12/98), the Workshop on Parallel Algorithms (WOPA'99), Atlanta (5/99), Technion - Israel Institute of Technology (5/99), and Georgia Institute of Technology (3/00).

·  (Specialized talk:) Experiments with list ranking for Explicit Multi-Threaded (XMT) instruction parallelism (by S. Dascal and U. Vishkin). 3rd Workshop on Algorithms Engineering (WAE'99), London, U.K., July 1999.
Download talk (13 slides).

·  (General talk) What to do with all this hardware? Could the PRAM-On-Chip architecture lead to upgrading the WINTEL performance-to-productivity platform? Download abstract. Download talk (postscript, slides in 6 quads) or (pdf, 6 quads) Invited talks were given at the Intel Microprocessor Research Lab (MRL), Santa Clara, CA, and at Microsoft Research, Redmond, WA, in January 2002, and at IBM T.J. Watson Research Center, Yorktown Heights, NY, in June 2002.
Link to the Multi-University Research Lab Seminar Series for a recording of the Microsoft Research talk (based on the 6 quads of slides above).

·  (Specialized talk) Bending Light for Multi-Chip Virtual PRAMs? (12 slides), 3rd Workshop on Non-Silicon Computation, held in conjunction with the 31st International Symposium on Computer Architecture (ISCA 2004), June 19-23, 2004, Munich, Germany.

·  (Specialized talk) Pilot development time productivity study on the Spring 2004 course Parallel Algorithmics (12 slides), DARPA's High Productivity Computing Systems Program and Productivity Summit, Fairfax, VA June 29-30, 2004. For context, see http://www.highproductivity.org/

·  (General talk) PRAM-On-Chip: A Quest for Not-So-Obvious Non-Obviousness. Download abstract . Download talk (pdf, 33 slides) . Invited talks were given at Intel Research Lab, Petach Tiqwa (Israel), 8/2004, and at MFCS'04, Prague, 8/2004.

·  (General talk) Can Parallel Computing Finally Impact Mainstream Computing? Download abstract . Download talk (pdf, 40 slides) . Invited talks were given at Microsoft Research and Intel Research Pittsburgh, 3/2005 and 4/2005.

·  (General talk) PRAM on Chip: Advancing from Parallel Algorithms to a Parallel System. Download talk (pdf, 46 slides) . Talks were given at Sandia National Lab 3/2006 and the UMIACS Workshop on Parallelism in Algorithms and Architectures, May 12, 2006.

·  (General talk) PRAM-On-Chip Vision: Explicit Multi-Threading (XMT) Technology. Download talk (pdf, 53 slides) . Distinguished EECS Colloquium, University of Central Florida, March 2007.

·  (Panel discussion position) For General-Purpose Parallel Computing: It is PRAM or never

(pdf, 17 slides) . Microsoft Workshop on Manycore Computing, Seattle, Washington, June 20-21, 2007 .

·  (General talk) Towards Realizing a PRAM-On-Chip Vision. Download talk (pdf, 60 slides) . Keynote, Workshop on Highly Parallel Processing On Chip (HPPC) , August 28, 2007, in conjuction with Euro-Par 2007, Rennes, France.


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Publications

·  U. Vishkin, S. Dascal, E. Berkovich and J. Nuzman. Explicit Multi-Threading (XMT) Bridging Models for Instruction Parallelism (Extended Abstract). In Proc. 10th ACM Symposium on Parallel Algorithms and Architectures (SPAA), 1998.
Note: This paper introduces XMT to readers whose background includes reasonable understanding of algorithms and theory. The presentation is by way of a bridging model. Among other things, a bridging model provides a design space for algorithm designers and programmers, as well as a design space for computer architects. It is convenient to describe our wider vision regarding ``parallel-computing-on-a-chip'' as a two-stage development and therefore two bridging models are presented: Spawn-based multi-threading (Spawn-MT) and Elastic multi-threading (EMT).
Download TR (12 pages).

·  U. Vishkin, S. Dascal, E. Berkovich and J. Nuzman. Explicit Multi-Threading (XMT) Bridging Models for Instruction Parallelism (Extended Summary and Working Document). Current version of UMIACS TR-98-05 . First version: January 1998.
Note: This long paper is really the MAIN publication for the XMT framework. The SPAA'98 paper gives a principled presentation of XMT. It also introduces XMT to readers with algorithms and theory background. The next central presentation for the project is the journal version of the SPAA'01 paper. In contrast, the September 1999 TR UMIACS99-55 by Berkovich, Nuzman, Franklin, Jacob and Vishkin is not a good single representative of the XMT project, as a whole; while it describes possible architecture choices for the XMT framework, and studies their microarchitecture performance implications, some part of it are misleading when it comes to understanding choices actually made for XMT. The paper emphasizes an important ``decentralization'' property of the architecture. Its performance hardly degrades even when the number of cycles for traversing wires (interconnects) increases. The introduction that the latter paper provides to XMT is less comprehensive and less direct than the SPAA98 paper, as some aspects of XMT are not addressed, and some architecture choices were made for the sake of completeness of the architecture and not because they are required by the XMT paradigm.
Download TR (47 pages).

·  S. Dascal and U. Vishkin. Experiments with list ranking for Explicit Multi-Threaded (XMT) instruction parallelism (extended abstract). In Proc. 3rd Workshop on Algorithms Engineering (WAE'99), London, U.K., July 1999.
Download extended abstract (17 pages).
Download talk (13 slides).
An extended summary of this work is available as UMIACS-TR-99-33. Download extended summary (27 pages).
Invited to the Special Issue for WAE99: Journal version, from the ACM Journal of Experimental Algorithmics, Volume 5, 2000.

·  E. Berkovich, J. Nuzman, M. Franklin, B. Jacob, U. Vishkin. XMT-M: A scalable decentralized processor. UMIACS TR 99-55, September 1999.
Download TR (postscript, 18 pages). Note: as the above comment following UMIACS TR-98-05 states, UMIACS TR 99-55 could be misleading in understanding architecture choices made for XMT and is not a good single representative publication of the XMT project.

·  U. Vishkin. A No-Busy-Wait balanced tree parallel algorithmic paradigm. In Proc. 12th ACM Symposium on Parallel Algorithms and Architectures (SPAA), 2000.
Download TR (postscript, 9 pages). pdf.

·  D. Naishlos, J. Nuzman, C-W. Tseng, and U. Vishkin. Evaluating multi-threading in the prototype XMT environment. In Proc. 4th Workshop on Multi-Threaded Execution, Architecture and Compilation (MTEAC2000), December 2000 (held in conjunction with the 33rd Int. Symp. on Microarchitecture MICRO-33). Best Paper Award for MTEAC2000.
Download TR (pdf, 9 pages).

·  D. Naishlos, J. Nuzman, C-W. Tseng, and U. Vishkin. Evaluating the XMT Parallel Programming Model. In Proc. of the 6th Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS-6), April 2001.
Download TR (pdf, 6 pages).

·  U. Vishkin. Two techniques for reconciling algorithm parallelism with memory constraints (extended abstract). In Proc. 14th ACM Symposium on Parallel Algorithms and Architectures (SPAA), 2002.
Download TR (postscript, 4 pages). Download talk (postscript, 13 slides).

·  D. Naishlos, J. Nuzman, C-W. Tseng, and U. Vishkin. Towards a First Vertical Prototyping of an Extremely Fine-Grained Parallel Programming Approach. The conference version appeared in Proc. 13th ACM Symposium on Parallel Algorithms and Architectures (SPAA-01)), July 2001.
Download TR (pdf, 10 pages).
The journal version appeared in the invited Special Issue for SPAA01: TOCS 36,5 pages 521-552, Springer-Verlag, 2003.
Download the TR that led to the journal version (pdf, 26 pages).

·  A. Balkan, G. Qu, and U. Vishkin. Arbitrate-and-Move primitives for high throughput on-chip interconnection networks. In Proc. IEEE International Symposium on Circuits and Systems (ISCAS), SoC Design Technology lecture session , Vancouver, May 23-26, 2004.
Download TR (pdf, 4 pages).

·  U. Vishkin. Can Optical Interconnection Networks Lead to Cheaper High-Performance Multiprocessors? Proc. Optics East, S.P.I.E. - The International Society for Optical Engineering, Active and Passive Optical Components for WDM Communications IV, Volume 5595, Philadelphia, October 26-28, 2004, 162-166. Download TR (pdf, 5 pages).
The computer science and engineering side of this work was presented in the paper:

·  U. Vishkin and I. Smolyaninov. Bending Light for Multi-Chip Virtual PRAMs? Proc. 3rd Workshop on Non-Slicon Computation, held in conjunction with the 31st International Symposium on Computer Architecture (ISCA 2004), June 19-23, 2004, Munich, Germany, 32-38.
Download TR (pdf, 7 pages).

·  P. Gu and U. Vishkin. Case Study of Gate-level Logic Simulation on an Extremely Fine-Grained Chip Multiprocessor Journal of Embedded Computing, Special Issue on Embedded Single-Chip Multicore Architectures and related research - from System Design to Application Support, 2, 2 (2006), 181-190.
The test environment including the XMT machine, parallel logic simulation program and benchmark circuits .

·  A. Balkan, G. Qu and U. Vishkin. Mesh-of-Trees and Alternative Interconnection Networks for Single-Chip Parallel Processing. In Proc. 17th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2006), Steamboat Springs, Colorado, September 11-13, 2006, 73-80. Best Paper Award.

·  U. Vishkin, I. Smolyaninov and C. Davis. Plasmonics and the parallel programming problem. Silicon Photonics Conference, SPIE Symposium on Integrated Optoelectronic Devices 2007, January 20-25, 2007, San Jose, CA.
Download TR (pdf, 10 pages).

·  X. Wen and U. Vishkin. PRAM-On-Chip: First Commitment to Silicon. Brief announcement in Proc. 19th ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 301-302, June 9-11, 2007.
Download TR (pdf, 2 pages). Download talk (pdf, 17 slides).

·  A. Balkan, M. Horak, G. Qu, and U. Vishkin. Layout-Accurate Design and Implementation of a High-Throughput Interconnection Network for Single-Chip Parallel Processing. Hot Interconnects 2007, Stanford University, CA, August 22-24, 2007. Download TR (pdf, 8 pages). Download talk (pdf, 21 slides).

·  U. Vishkin, G. Caragea and B. Lee. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform. Accepted to Handbook on Parallel Computing: Models, Algorithms, and Applications, Editors: S. Rajasekaran and J. Reif, CRC Press. To appear in 12/2007. This paper provides models for developing performance-tuned XMT programs. There is no conflict between this work and the quest for ease-of-programming using the PRAM model. The implied methodology (for the XMT system developers) is as follows. Given a simple PRAM-like program manually performance-tune it using the performance models in this paper. Then teach the compiler to produce the performance-tuned program.


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Course on Parallel Algorithms Including XMT Programming Assignments, Tutorial and Software Tools

·  Spring 2007: ENEE759K/CMSC751 Parallel Algorithmics.

Below please find older software tools as well as older supplemental information to XMT publications. More recent material is available through the Parallel Algorithms course web site above.

Prior Generation XMT Tools & Empirical Results


Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Students in the Research Team

  • Aydin Balkan, graduate student, Electrical and Computer Engineering, University of Maryland.
  • Xingzhi Wen, graduate student, Electrical and Computer Engineering, University of Maryland.
  • Fuat Keceli, graduate student, Electrical and Computer Engineering, University of Maryland.
  • George Caragea, graduate student, Computer Science, University of Maryland.
  • Alex Tzannes, graduate student, Computer Science, University of Maryland.
  • Mike Detwiler, undergraduate student, Computer Science, University of Maryland.
  • Mike Horak, graduate student, Electrical and Computer Engineering, University of Maryland.
  • Tom Dubois, graduate student, Computer Science, University of Maryland.
  • Mary Kiemb, graduate student, Electrical and Computer Engineering, University of Maryland.

Faculty in the Research Team

  • Professor Rajeev Barua, Electrical and Computer Engineering, University of Maryland.
  • Professor Gang Qu, Electrical and Computer Engineering, University of Maryland.

This research project is directed by Professor Uzi Vishkin.

Former Students

  • Efraim Berkovich, MS, Electrical and Computer Engineering, University of Maryland.
  • Shlomit Dascal, MS, Computer Science, Tel Aviv University; was visiting graduate student at UMIACS.
  • Dorit Naishlos, MS, Computer Science, University of Maryland.
  • Joseph Nuzman, MS, Electrical and Computer Engineering, University of Maryland. Joe started working on XMT while still an Honors ECE undergraduate student.
  • Pei Gu, MS, Electrical and Computer Engineering, University of Maryland.
  • Roni Kupershtok, post-doc, Electrical and Computer Engineering, University of Maryland.
  • Adam Beytin, undergraduate student, Computer Engineering, University of Maryland.
  • Nghi Nguen Ba, undergraduate student, Computer Engineering, University of Maryland.
  • Paul Mazzucco, undergraduate student, Computer Engineering, University of Maryland.
  • Yoni Ashar, undergraduate student, Computer Science, University of Maryland.
  • Darya Fillipova, undergraduate student, Computer Science, University of Maryland.

Top | Programming Model | NEW On-Line Tutorial | Talks | Publications | Course & Tools | Team

Funding by the National Science Foundation and the Department of Defense is graciously acknowledged. Some of the material on this page is based upon work supported by the National Science Foundation under Grant No. 0325393. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Department of Defense.