JPS599944B2 - data processing equipment

Info

Publication number: JPS599944B2
Application number: JP50020823A
Authority: JP
Inventors: ラ−ロンハツトソンモ−リス; リ−ブスベザニイルイス; イアベンカ−ト
Original assignee: Control Data Corp
Current assignee: Control Data Corp
Priority date: 1974-03-13
Filing date: 1975-02-19
Publication date: 1984-03-06
Also published as: AU7854475A; NL181055C; DE2505518A1; DE2505518C2; FR2264323A1; JPS50123242A; NL7502309A; NL181055B; GB1479404A; NL8602882A; FR2264323B1

Description

DETAILED DESCRIPTION OF THE INVENTION This invention relates to data processing equipment, and more particularly to apparatus for handling data and information as they move between the various parts of a device such as a computer.

In the data processing arts, it is often desirable to move a relatively large amount of data between two parts of a computer in the shortest possible time.

For example, data stored in memory is normally read out successively from the memory banks for successive operations in the arithmetic section of the computer. However, it is sometimes desirable to operate on data representing different operands that reside in the same or adjacent memory banks. Moreover, it is often desirable to store the data representing the resultants in the same or adjacent memory banks. A vector operation is an operation executed by a computer in which the individual operands of a plurality of operands representing one vector are processed with the individual operands of a plurality of operands representing a second vector to obtain a plurality of resultants representing a third vector.

For example, the simple vector operation A + B = C consists of the successive operations A1 + B1 = C1, A2 + B2 = C2, A3 + B3 = C3, ..., An + Bn = Cn. Here A1, A2, A3, ..., An make up vector A; B1, B2, B3, ..., Bn make up vector B; and C1, C2, C3, ..., Cn make up resultant vector C. Normally, before a vector problem is computed, vector A is stored in consecutive banks of the memory, vector B is likewise stored in consecutive banks, and resultant vector C is to be stored in a plurality of banks of the memory. Frequently the banks holding vectors A and B are the same or overlap, and resultant vector C is, like the operand vectors, stored in the same or overlapping memory banks. Typically, the read/write circuitry of the memory can access only one block of data from a memory bank at any particular time.

Therefore, during one main cycle of a memory bank it is impossible to read the A and B operands simultaneously from that one bank, and equally impossible to store the C resultant into the same bank. The memory is assumed to be, for example, a large-capacity core memory. The data lines connecting the memory and the central processing unit are assumed to be fewer than the number of data bits in one memory bank, so that one bank's worth of data is transferred between the memory and the central processing unit in several installments.

If two or more operands were read or written in the same memory bank simultaneously, there would be doubt as to whether the correct data had been read or written; accordingly, two or more operands are never accessed (read or written) in the same memory bank at the same time. Previously, when the A and B operands fell in the same memory bank, the operand that came first had to be delayed by a set time until the operand that came second had been read.

For example, referring to FIG. 1, the A and B vectors are stored in the same memory banks. A1 through A8 are stored in bank 1 as a superword (hereinafter abbreviated "sword"), and A9 through A16 are stored in bank 2 as a sword; B1 through B5 are stored in bank 1 as a sword, and B6 through B13 are stored in bank 2 as a sword. In earlier computers it was customary first to read operand A1 from bank 1. Operand A1 was delayed until B1 had been read, and A1 and B1 were then sent to the arithmetic section of the computer to obtain C1. The B operand was then delayed until A2 had been read, and the A operand was delayed until B2 had been read. The process continued with a delay after the successive reading of each operand, and the accumulated delay was serious. In addition, as will be more fully understood from the description below, a delay also arises when the resultant C data is stored into the same memory banks.

Most memories can access operands successively from several banks during one main cycle of the memory.

For example, referring to FIG. 1, up to eight consecutive banks can be accessed during one main cycle. It is an object of the present invention to provide apparatus in which only a single delay is incurred for an entire vector, rather than a successive delay for each operand. Another object of the invention is to provide apparatus for reading data from, and writing data to, the memory of a computer in the shortest possible time.

It is well known that computers perform vector operations in which the individual operands of a plurality of operands representing one operand vector are processed successively with the individual operands of a plurality of operands representing a second operand vector.

Each operand vector consists of many individual operands, and it is not unusual for one operand vector to contain several thousand of them. It often happens that some operands of such a vector have a predetermined value, for example zero. The present invention is directed to apparatus in which operands having such a predetermined value are omitted, so that considerable memory space is not spent storing them. Furthermore, when a vector has many zero-valued operands, computation time is shortened by not operating on those operands.

As used herein, the term "operand vector" means a vector consisting of consecutively arranged operands. A "resultant vector" is a vector consisting of a plurality of consecutively arranged resultants. An "operand sparse vector" and a "resultant sparse vector" are likewise pluralities of consecutively arranged operands or resultants, but they differ from an "operand vector" or "resultant vector" in that every operand or resultant having the predetermined value, for example zero, is omitted. An "operand order vector" and a "resultant order vector" are strings of consecutively arranged bits: a 1 in the string represents one predetermined state of the corresponding operand or resultant in the operand vector or resultant vector, and a 0 represents the other predetermined state. In the example given here, the sparse vector contains all the non-zero terms of the corresponding operand vector or resultant vector, and the order vector has a 1 for each non-zero term and a 0 for each zero-valued term of that vector.

Previously, vector operations in a computer were carried out by storing each operand vector in its entirety in the memory of the computer (including the operands having the predetermined value, for example zero) and processing every operand of the operand vectors in the arithmetic and control sections (and usually also the data exchange section) of the computer.
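The relationship between a full vector, its sparse vector, and its order vector can be illustrated in software. The following is only a minimal sketch, assuming zero is the omitted value; the function names `compress` and `expand` are hypothetical and do not come from the specification, which describes hardware rather than a program.

```python
from typing import List, Tuple


def compress(full: List[float], omitted: float = 0.0) -> Tuple[List[int], List[float]]:
    """Split a full vector into (order vector, sparse vector).

    The order vector holds one bit per term: 1 if the term is kept,
    0 if it has the omitted value. The sparse vector holds only the kept terms.
    """
    order = [0 if x == omitted else 1 for x in full]
    sparse = [x for x in full if x != omitted]
    return order, sparse


def expand(order: List[int], sparse: List[float], omitted: float = 0.0) -> List[float]:
    """Rebuild the full vector by consuming one sparse term per 1 bit."""
    values = iter(sparse)
    return [next(values) if bit else omitted for bit in order]


full = [0.0, 3.5, 0.0, 0.0, 7.0, 1.25]
order, sparse = compress(full)
```

Expanding `order` and `sparse` again reproduces `full`, which is the property that lets the sparse form stand in for the full vector in memory.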

It will be appreciated that where operands have the predetermined value (for example zero), considerable memory space can be saved by not storing them. Instead of storing and processing such operands, an order bit indicating the state of each operand's value is stored, and the order bits are processed. For example, with 64-bit operands, memory space is saved by storing an order vector even if only 5% of the operands in the operand vector have the predetermined value, for example zero. Thus, if an operand vector has 10,000 operands and 5% of them (500 operands, or 32,000 bits) are zero, a 10,000-bit order vector is stored in place of the 500 zero-valued operands, that is, in place of 32,000 bits, and memory is saved.
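The arithmetic of this example can be checked with a short calculation. This is a sketch under the figures given in the text (64-bit operands, 10,000 operands, 500 of them zero); the function names are illustrative only.

```python
def bits_full(n_operands: int, word_bits: int = 64) -> int:
    """Storage for the full operand vector."""
    return n_operands * word_bits


def bits_sparse(n_operands: int, n_omitted: int, word_bits: int = 64) -> int:
    """Storage for the sparse vector plus one order bit per original operand."""
    return (n_operands - n_omitted) * word_bits + n_operands


n, zeros = 10_000, 500
saving = bits_full(n) - bits_sparse(n, zeros)  # 32,000 bits dropped, 10,000 order bits added
```

Here `saving` is 22,000 bits. More generally, the sparse form is smaller whenever the omitted operands outnumber one in 64 of the total, since each omitted 64-bit operand must pay for 64 order bits.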

Furthermore, considerable computation time is needlessly wasted in processing vectors with many zero-valued operands, because the resultants are derived, at least in part, from zero-valued operands. The present invention relates in particular to apparatus for handling sparse vectors so that sparse vectors can be stored in place of full vectors, and for manipulating sparse vectors for arithmetic and control purposes. Another object of the invention is to provide apparatus for handling sparse vectors between the memory and the arithmetic section of a computer.

Another object of the invention is to provide apparatus for handling sparse vectors in which order vectors are supplied to control the computer's operations on the sparse vectors.

In accordance with the present invention, data representing at least two pluralities of operands to be sent to the arithmetic section of a computer is passed through selective buffers. The data that arrives first is delayed until the second data becomes available for reading. The buffer successively stores the data of the first vector as it arrives and reads that data out on a first-in, first-out basis, in step with the corresponding data from the second vector. For example, referring to the example above in which each sword (superword) in a memory bank contains up to eight operands, the operands read first (A1 onward) are stored in the operand buffer. Because the first B operand, B1, is stored in the same bank as A1 through A8, B1 cannot be read until all of the A operands A1 through A64 have been read from the first eight banks during the first main memory cycle. The read circuitry then scans the first and ninth banks, reading operands A65 through A72 from the ninth bank and operands B1 through B5 (or through B8) from the first bank. Meanwhile the buffer sends out A1 and so on as B1 and so on are read, while continuing to store A65 and so on. Thus A1 and B1 are sent out together, followed by A2 and B2. One feature of the invention is the provision of control apparatus that responds to the data arriving first by storing that data temporarily.
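The first-in, first-out pairing described above can be modeled in a few lines. This is a software sketch only, assuming operands arrive as a single ordered stream of tagged events; `pair_streams` and the event format are hypothetical, and the hardware described in the specification uses registers and buffers rather than queues in a program.

```python
from collections import deque


def pair_streams(events):
    """Pair A and B operands in order of operand number.

    `events` is an iterable of ("A", value) or ("B", value) tuples in arrival
    order. Whichever stream is ahead waits in its FIFO until the other catches up.
    Returns the list of (a, b) pairs and the largest FIFO depth observed.
    """
    fifo_a, fifo_b = deque(), deque()
    pairs, max_depth = [], 0
    for stream, value in events:
        (fifo_a if stream == "A" else fifo_b).append(value)
        max_depth = max(max_depth, len(fifo_a), len(fifo_b))
        while fifo_a and fifo_b:  # release a pair as soon as both sides are present
            pairs.append((fifo_a.popleft(), fifo_b.popleft()))
    return pairs, max_depth


# A1..A64 arrive during the first main cycle; afterwards A and B arrive together.
events = [("A", f"A{i}") for i in range(1, 65)]
for i in range(1, 9):
    events += [("A", f"A{64 + i}"), ("B", f"B{i}")]
pairs, max_depth = pair_streams(events)
```

In this run the first pair released is `("A1", "B1")`, and the A side never needs to hold more than 65 operands, which fits within the 128-operand buffer described later in the specification.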

Another feature of the invention is the provision of control apparatus for selectively delaying resultant data and writing that data into memory.

Yet another feature of the invention is the provision of control apparatus between the two buffers that delays the data streams by just the optimum amount, so that operands from memory and resultants to memory are transferred in the shortest time.

One aspect of the invention resides in providing an order vector whose bits represent the predetermined value state of each term of an operand vector, and which is processed to selectively control the registers through which the sparse vector passes.

In another aspect of the invention, a sparse vector is sent to a register, and its operands are gated out successively under the control supplied by the order vector. The operand sparse vectors are processed by the computer to produce a resultant sparse vector. According to this latter aspect of the invention, the order vectors selectively control the registers so as to align the operands of two operand sparse vectors for subsequent processing by the computer.

Apparatus is provided for processing the order vectors in such a way that the resultant sparse vector produced from the operand sparse vectors is selectively gated under the logical control of the order vectors, and apparatus is provided for generating a resultant order vector so that the resultant order vector and the resultant sparse vector can be stored compactly in memory. The above and other features of the invention will be more fully understood from the following description and the accompanying drawings.
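How two order vectors can align two operand sparse vectors, and how a resultant order vector can be generated, may be sketched for the case of addition. This is an illustrative model, not the circuit of the specification: the name `sparse_add` is hypothetical, zero is assumed to be the omitted value, and dropping resultants that happen to sum to zero is an assumption made here so that the output stays in sparse form.

```python
def sparse_add(order_a, sparse_a, order_b, sparse_b):
    """Add two sparse vectors under control of their order vectors.

    Each order bit decides whether the next term is taken from the matching
    sparse vector (bit 1) or treated as zero (bit 0). Returns the resultant
    order vector and the resultant sparse vector.
    """
    if len(order_a) != len(order_b):
        raise ValueError("order vectors must have the same length")
    it_a, it_b = iter(sparse_a), iter(sparse_b)
    order_c, sparse_c = [], []
    for bit_a, bit_b in zip(order_a, order_b):
        if not (bit_a or bit_b):
            order_c.append(0)  # both terms omitted: no arithmetic is performed
            continue
        a = next(it_a) if bit_a else 0
        b = next(it_b) if bit_b else 0
        c = a + b
        if c != 0:
            order_c.append(1)
            sparse_c.append(c)
        else:
            order_c.append(0)  # assumption: a zero resultant is also omitted
    return order_c, sparse_c


# Full vectors A = [0, 3, 0, 5, 2] and B = [1, 0, 0, -5, 4]
order_c, sparse_c = sparse_add([0, 1, 0, 1, 1], [3, 5, 2], [1, 0, 0, 1, 1], [1, -5, 4])
```

The result corresponds to the full vector [1, 3, 0, 0, 6]. For addition a position needs arithmetic when either order bit is 1; an operation such as multiplication would need it only when both are 1.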

Referring to the drawings, and in particular to FIG. 1, there is shown a typical arrangement of stored data in a plurality of banks of a computer memory.

FIG. 1 shows the physical layout of the memory. The vertical axis gives the bank number within the memory, the horizontal direction gives the storage position of data within each bank, and for each bank the operand data and resultant data stored in that bank are laid out horizontally. As shown in FIG. 1, there are a plurality of vectors, labeled A and B, consisting respectively of operands A1, A2, A3, ..., An and B1, B2, B3, ..., Bn. Each operand is sometimes called a "word", and each group of operands is sometimes called a "superword" or "sword". For purposes of explanation, it is assumed that successive swords of each vector are stored in consecutive locations in memory and that each bank of memory can store one sword of a given vector. Thus, as shown in FIG. 1, bank 1 of the memory contains operands A1 through A8 and bank 2 contains operands A9 through A16. Bank 1 also contains operands B1 through B5 (the remaining three locations being empty), and bank 2 contains operands B6 through B13. As will be better understood later, resultant vector C is to be stored with resultants C1 through C6 in bank 1 (the remaining two locations being empty) and C7 through C14 in bank 2. The empty regions are shown only to illustrate that such empty regions can occur; in this example the number of empty locations may be any number from 0 to 7. Apparatus for transferring such data in an optimal manner is shown in FIG. 2.
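The layout of FIG. 1 can be expressed as a mapping from operand number to bank and position. The sketch below assumes eight operand positions per bank and assumes that the empty locations precede the first operand of a vector within its first bank, which is consistent with the vector being stored contiguously; the function name `locate` is hypothetical.

```python
OPERANDS_PER_BANK = 8  # one sword (eight operands) of a vector per bank


def locate(index: int, lead_empty: int = 0, first_bank: int = 1):
    """Return (bank, slot) for the operand numbered `index` (1-based).

    `lead_empty` is the number of empty positions before the first operand
    in the first bank (0 to 7). Slots are numbered 0 to 7 within a bank.
    """
    if index < 1:
        raise ValueError("operand numbers start at 1")
    if not 0 <= lead_empty < OPERANDS_PER_BANK:
        raise ValueError("lead_empty must be between 0 and 7")
    position = lead_empty + index - 1
    return first_bank + position // OPERANDS_PER_BANK, position % OPERANDS_PER_BANK


a8, a9 = locate(8), locate(9)                             # vector A: no empty locations
b5, b6 = locate(5, lead_empty=3), locate(6, lead_empty=3)  # vector B: three empty locations
c6, c7 = locate(6, lead_empty=2), locate(7, lead_empty=2)  # vector C: two empty locations
```

With these assumptions A8, B5, and C6 are each the last operand in bank 1, and A9, B6, and C7 are each the first operand in bank 2, matching the arrangement described for FIG. 1.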

FIG. 2 is a block diagram of a buffer and control system in accordance with a preferred embodiment of the present invention. The apparatus shown in FIG. 2 includes a read register 10 whose input is channel 11 from the storage access control of the computer memory; a similar read register 12 has as its input channel 13 from the storage access control. Registers 10 and 12 receive the separate operands A and B. Read register 10 has an output channel 14 to fan-in circuit 15, an output channel 16 to two-operand (quarter-sword) buffer 17, and an output channel 18 to fan-in circuit 19. Similarly, read register 12 has an output channel 20 to fan-in circuit 19, an output channel 21 to two-operand (quarter-sword) buffer 22, and an output channel 23 to fan-in circuit 24. Fan-in circuit 15 supplies its output over channel 25 to two-operand (quarter-sword) register 26, which in turn supplies its output over channel 27 to two-operand (quarter-sword) register 28. Register 28 supplies its output over channel 29 to the data exchange and arithmetic sections of the computer. Similarly, fan-in circuit 24 supplies its output over channel 30 to two-operand (quarter-sword) register 31, which supplies its output over channel 32 to two-operand (quarter-sword) register 33. The output of register 33 is sent over channel 34 to the data exchange and arithmetic sections of the computer. Buffer 17 supplies its output over channel 35 to fan-in circuit 15, and buffer 22 supplies its output over channel 36 to fan-in circuit 24. Fan-in circuit 19 supplies its output over channel 37 to buffer 38, a 128-operand (16-bank) buffer that supplies outputs to fan-in circuits 15 and 24 over channels 39 and 40, respectively. In FIG. 2 the heavy lines show the paths that data representing the A and B operands can take, and the light lines show control lines.

As shown in FIG. 2, overload/enable control 41 is provided between registers 28 and 26, overload control 42 is provided between register 26 and buffer 17, and overload control 43 is provided between buffer 17 and control unit 44. Similarly, overload/enable control 45 is provided between registers 33 and 31, overload control 46 between register 31 and buffer 22, and overload control 47 between buffer 22 and control unit 44. Control unit 44 supplies control output 48 to the fan-in circuits. The apparatus described so far is the part that buffers data read from memory.

In particular, and as the operation will be explained in more detail later, if the A vector data begins to arrive first, the first few operands are sent toward the data exchange section by way of channel 14. Because the data exchange section has not yet received the first operand B1 of the B vector, up to the first two operands (A1 and A2) are stored in register 28. If operand B1 still has not arrived, the data exchange section is not ready to accept the A operands. An overload signal is therefore sent through overload/enable control 41 to register 26, instructing it not to send data to register 28, and up to the next two operands (A3 and A4) are stored in register 26. If operand B1 has still not arrived, register 26 is then full, and an overload signal is sent through overload control 42 to buffer 17. Buffer 17 is thereby activated, and the next two operands (A5 and A6) are sent to buffer 17. When buffer 17 is full, control unit 44 is driven by way of overload control 43 and activates fan-in circuit 19 to accept all of the remaining A data, which passes through fan-in circuit 19 to buffer 38. When the first B operand arrives, it is sent over channel 23 to fan-in circuit 24 and on to register 33. The data exchange section, now able to accept operands A1 and B1, allows them to be output from registers 28 and 33. As each operand is transferred to the data exchange section, the space vacated by the read-out is filled by the next operand.
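The order in which the storage stages fill while the A operands wait for B1 can be summarized as a simple capacity chain. This is a sketch only: the capacities are those stated in the description (two operands each for registers 28 and 26 and buffer 17, and 128 operands for buffer 38), while the function `fill_stages` and its dictionary output are illustrative inventions.

```python
# Stages in the order they fill, with capacity in operands.
STAGES = [
    ("register 28", 2),
    ("register 26", 2),
    ("buffer 17", 2),
    ("buffer 38", 128),
]


def fill_stages(n_waiting: int) -> dict:
    """Distribute `n_waiting` A operands over the stages, nearest the output first.

    Each stage takes operands until it is full, which is the point at which the
    hardware would raise its overload signal and the next stage starts filling.
    """
    placement, remaining = {}, n_waiting
    for name, capacity in STAGES:
        taken = min(remaining, capacity)
        placement[name] = taken
        remaining -= taken
    if remaining:
        raise OverflowError(f"{remaining} operands exceed the total buffer capacity")
    return placement


after_one_main_cycle = fill_stages(64)  # A1..A64 read before B1 becomes available
```

For 64 waiting operands the chain holds two in each of the three small stages and 58 in buffer 38, well within its 128-operand capacity.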

At the same time, A operands continue to be sent to buffer 38, and buffer 17 is emptied of operands. This process continues until all of the A and B operands have been read from memory. The term "main cycle" as used herein means the time, associated with one memory bank, during which that bank completely satisfies a request to deliver operands or to store resultants.

Once a request has been made to a bank, that bank cannot be accessed again until its main cycle is complete. With particular reference to FIG. 1, several banks are accessed in succession, so that the required operands of the vector associated with the access are read or written in succession. For purposes of this explanation, up to eight banks can be accessed in succession in a time equivalent to the one main cycle needed to access any one of them. Clearly, then, the bank accessed first can be accessed again only when its main cycle is complete, and one main cycle is the time needed to access eight banks. Accordingly, when a bank containing both A and B operands is accessed for the A operands, the B operands cannot be read until the main cycle of the A operand access is complete; the B operands are read after A operands A1 through A64 have been read from banks 1 through 8. A operands A1 through A64 are therefore stored in the buffer. The next A operands, A65 and so on, are then read from the ninth bank of the memory and stored in buffer 38, while at the same time the B operands, B1 and so on, are read from the first bank. The first B operand passes through the buffers to the data exchange and arithmetic sections of the computer, and the first A operand is read out of the buffer. The foregoing has been described for the case in which the A operands arrive first, but it will be apparent that the same applies if the B operands arrive first, except that the B operands are then held in buffer 38 until the A operands arrive and are processed.

In this regard, control unit 44 controls fan-in circuit 19 and the input and output of buffer 38 according to which vector arrives first. As shown in FIG. 2, resultant data from the data exchange section is received by register 50 over data channel 51 and sent over channel 52 to gate 53 and also to buffer 61.

If the memory bank in which the resultant is to be stored is free, the storage access control sends an enable signal over control channel 54, and the data passes through fan-in circuit 56, over channel 57 to write register 58, and over channel 59 to the storage access control. If, however, the first memory bank is still busy reading data (for example, reading A or B operands) and fan-in circuit 56 is not enabled, the resultant C data is stored temporarily in buffer 61. Buffer 61, which can hold up to 128 resultants (16 banks' worth), drives counter 62 over control channel 63 so that the counter counts the number of resultant swords (superwords) held in buffer 61.
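The behavior of the resultant path, with its buffer and sword counter, can be approximated by a small queue model. This is a hedged sketch rather than a description of the circuit: the class `ResultantBuffer`, its method names, and the `bank_free` flag standing in for the enable signal are all assumptions, and only the 16-sword capacity (128 resultants at eight per sword) is taken from the text.

```python
from collections import deque


class ResultantBuffer:
    """Model of buffer 61 with a count of the resultant swords it holds."""

    CAPACITY_SWORDS = 16  # 128 resultants at eight resultants per sword

    def __init__(self):
        self.fifo = deque()  # swords waiting for their memory bank
        self.memory = []     # swords written to memory, in write order

    @property
    def count(self) -> int:
        """Number of swords currently buffered (the role played by counter 62)."""
        return len(self.fifo)

    def accept(self, sword, bank_free: bool) -> None:
        """Take one resultant sword; write it through only if nothing is queued."""
        if bank_free and not self.fifo:
            self.memory.append(sword)
        elif len(self.fifo) >= self.CAPACITY_SWORDS:
            raise OverflowError("resultant buffer is full")
        else:
            self.fifo.append(sword)  # queue behind earlier swords to keep order

    def drain(self, bank_free: bool) -> None:
        """Write the oldest buffered sword once its bank can accept data."""
        if bank_free and self.fifo:
            self.memory.append(self.fifo.popleft())


buf = ResultantBuffer()
buf.accept("C1-C8", bank_free=False)   # bank still busy with operand reads
buf.accept("C9-C16", bank_free=False)
buf.drain(bank_free=True)              # bank becomes free: oldest sword is written
```

After these steps one sword has reached memory and one remains buffered, so the count reported to the control logic is 1.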

The output of counter 62 is sent over control channel 64 to control unit 44, for a purpose described later. Buffer 61 supplies its output over data channel 65 to fan-in circuit 56, and the data in buffer 61 is written into memory when the memory bank for the resultants becomes able to accept data. FIGS. 3A through 3C are diagrams illustrating the state of buffer operation in relation to the data read from memory.

Referring to FIG. 3A, assume that the A vector arrives first. FIG. 3A shows a circle divided into 32 equal segments, labeled consecutively from A+0 to A+248. The area of each segment represents the operands (one sword, or superword) held in one bank of the memory. Each quadrant of the circle represents eight banks of the memory, which can all be accessed during one main cycle.

For purposes of explanation, one sword (eight operands) is read during four minor cycles of the computer, and eight swords are read during one main cycle. Thirty-two minor cycles therefore correspond to one quadrant of FIG. 3A and make up one main cycle of the memory. Although it is not essential to this explanation, the thirty-second, or last, minor cycle of a given main cycle is used to address the same memory bank for another operand or for a resultant.

In practice, therefore, only 31 minor cycles need be used for one quadrant; the remaining minor cycle is used simultaneously for both reading and addressing. In FIG. 3A, the line marked A is in a fixed position, and as memory cycles proceed the circle representing the memory locations rotates clockwise in the direction of arrow 70.

Thus, at the completion of four minor cycles of the computer, the circle of FIG. 3A has rotated so that the A+8 position is adjacent to line A at the top of the circle. If the A operands (from A1 onward) and the B operands (from B1 onward) are both located starting at bank 1, and access for the A operands is given priority, memory bank 1 is not accessed to read B operands until reading of the A operands from memory bank 9 begins, that is, until the first operands A1 through A64 have been completely read from banks 1 through 8.

In the ideal case, shown in FIG. 3B, the B operands are located starting at bank 9 of memory. Referring to FIG. 3B, A operands A1 through A8 are read from bank 1 while B operands B1 through B8 are simultaneously read from bank 9. At the completion of the first major cycle, operands B1 through B64 have been read from banks 9 through 16, and the read circuits can then read A operands (for example, A65) from bank 9 and B operands (for example, B65) from bank 17. If, however, the A and B operands occupy the same memory banks, one of the operand vectors is not read, and the operands of, for example, the A operand vector are stored successively in the apparatus shown in FIG. 2.

Accordingly, if the A operands arrive first, operands A1 and A2 are stored in register 28, operands A3 and A4 in register 26, operands A5 and A6 in buffer 17, and the remaining operands A7 through A64 in buffer 38. (From the foregoing it will be understood that, although operands are initially stored in buffer 17, all remaining operands are temporarily stored in buffer 38 once they have been processed by fan-in circuit 15.) Thereafter, as shown particularly in FIG. 3C, when the first major memory cycle is complete and up to eight A swords (operands A1 through A64) have been stored in the buffers, the read circuits continue by reading the next A operands (A65 and so on) from memory bank 9, while bank 1 is simultaneously accessed to read the B operands. Thus, at the completion of the next four minor cycles of the computer memory, nine swords (operands A1 through A72) have been read from banks 1 through 9, and one sword (operands B1 through B8) has been read from bank 1. When operand B1 is read, operands A1 and B1 are sent to the arithmetic section. After the first A operand has been sent to the arithmetic section, the following operands each advance one position in the buffer.

Referring next to FIG. 3D, the A and B operand vectors are here assumed to begin in different memory banks, with the A vector lagging the B vector by two banks.

Specifically, if the B operands begin at bank 3, the A operands begin at bank 1. In that case the first A operands (A1 through A16) are read from memory banks 1 and 2, and B operands B1 through B16 are read from banks 3 and 4. During the next four minor cycles one sword (operands B17 through B24) is read, but the read circuits cannot read A17, because memory bank 3 has already been read for B operands in the major cycle that began when reading of both the A and B operands started. Operand A17 and those following it are therefore temporarily skipped, and the B sword (B17 through B24) is stored in the buffers. As already described, this continues until B17 and B18 occupy register 33, B19 and B20 occupy register 31, and B21 and B22 occupy buffer 22. Operands B23 through B64 are read during the first major cycle beginning at bank 3 and are sent through fan-in circuit 19 to buffer 38. Line B is thus shifted to position B′, which gives essentially the same situation as described in connection with FIGS. 3B and 3C.

During the next four minor cycles, the read circuits associated with memory continue reading B operands from bank 11 while simultaneously reading A operands (A17 and so on) from bank 3. Operands A17 through A24 are thus read from memory and sent successively through channel 14 and register 28 to the data exchange section, while B operands B65 through B72 are read from memory and stored in buffer 38. The data exchange section, connected to channels 29 and 34, is ready to receive operands A17 and B17 held in registers 28 and 33 respectively; it processes those data, after which operands A18 and B18 are processed. This continues during the next four minor cycles until the operands have been cleared from buffer 17. Thereafter, data are read from buffer 38 as already described. The A and B operands are processed in the arithmetic section of the computer, and the C resultants thus produced pass from the data exchange section through channel 51 into register 50.

If the first resultant C arrives while A or B operands are being read from the same group of memory banks as the bank into which the C resultant is to be written, a conflict arises, so it is desirable to hold the C resultants temporarily in a buffer before memory is accessed again through the storage access control. The C resultants are therefore sent from the data exchange section to register 50 and then to gate 53.

FIG. 4A illustrates the simplest case of writing C resultants to memory. In the situation shown schematically in FIG. 4A, A operands A1 through A144 have been read from banks 1 through 18, and B operands B1 through B80 have been read from banks 1 through 10. When the first C resultant arrives at register 50, the first eight memory banks are therefore free, because the A operands are being read from bank 19 and the B operands from bank 11. Moreover, because the C resultants arrive during the third quadrant shown in FIG. 4A, the conflict described later does not occur even though the bank being read for A operands moves on during that major cycle. That conflict is one in which, within a single major cycle, a C resultant write to a bank whose number differs by 32 from that of the bank being read for A operands would use the same read/write section of memory. As a result, the C resultant data are sent to buffer 61, and a signal from the storage access control indicating that bank 1 is free to accept resultant data is sent through channel 54 to fan-in circuit 56. This enables fan-in circuit 56 and, in turn, write register 58, so that the resultant data are written to memory through channel 59 and the storage access control.

If, however, the first C resultant arrives while operands are still being read from the first memory bank in which the C resultants are to be stored, a conflict arises.

For example, as shown in FIG. 4B, A operands A1 through A88 have been read from banks 1 through 11, and B operands B1 through B24 have been read from banks 1 through 3. Referring to FIG. 1, memory bank 1 therefore cannot be accessed until the major cycle is complete, that is, until B operands B25 through B64 have been read from banks 4 through 8. As a result, the signal sent from the storage access control through channel 54 to fan-in circuit 56 informs that circuit that banks 1 through 8 cannot be accessed and that the C resultant data should therefore be stored temporarily in buffer 61. Buffer 61 can store up to 16 swords of data (128 resultants). In this case, if the C resultants arrive during the second quadrant of the diagram shown in FIG. 4B, buffer 61 stores eight swords of C resultants (64 resultants). At the same time, counter 62 counts the number of resultant swords in buffer 61, for a purpose described later. When reading of the B operands from memory bank 1 is complete, up to eight C resultant swords (64 resultants) are held in buffer 61. (In the case shown in FIG. 4B, the C resultants arrive when three B operand swords have been read from banks 1 through 3, so only five C resultant swords are stored in buffer 61.) When memory bank 1 becomes free, the storage access control enables fan-in circuit 56 through channel 54, so that the resultant data are moved from buffer 61 to write register 58 and written successively into memory. Meanwhile, further C resultants continue to be stored temporarily in buffer 61.

FIGS. 4C and 4D show the worst case for returning C resultant data to memory.
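The sword counts in the FIG. 4B example follow from simple arithmetic, sketched below. The sketch assumes that one C resultant sword arrives in each sword period (four minor cycles) and that the blocked group of eight banks frees up at the end of the current major cycle; both are simplifying assumptions made for illustration. The names are hypothetical.

```python
SWORDS_PER_MAJOR_CYCLE = 8    # eight swords are read per major cycle
BUFFER_61_CAPACITY = 16       # buffer 61 holds up to 16 swords
RESULTANTS_PER_SWORD = 8      # one sword = 8 resultants

def swords_buffered(operand_swords_already_read):
    """C swords parked in buffer 61 until the bank group becomes free.

    operand_swords_already_read: swords of the conflicting operand vector
    already read in the current major cycle when the first C sword arrives.
    """
    if not 0 <= operand_swords_already_read <= SWORDS_PER_MAJOR_CYCLE:
        raise ValueError("expected a value between 0 and 8")
    return SWORDS_PER_MAJOR_CYCLE - operand_swords_already_read

# FIG. 4B case: three B swords (banks 1 through 3) were already read.
parked = swords_buffered(3)
print(parked, parked * RESULTANTS_PER_SWORD)
```

With three B swords already read, five C swords (40 resultants) are parked, well within the 16-sword capacity of buffer 61.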

In this case, A operands A1 through A200 have been read from the first 25 banks of memory, and B operands B1 through B136 from the first 17 banks. Bank 1 of memory would therefore appear to be available to accept C resultant data. However, the addressing circuits of the read/write section of memory are arranged so that every 32nd bank uses the same addressing circuit. Because the C resultant data arrive during the fourth quadrant (major cycle), as shown in FIG. 4C, a conflict arises when the read circuits read A operands from memory bank 33 in the next major cycle. The reason is that reading A operands from banks 33 through 40 uses the addressing circuits of the read/write sections of banks 1 through 8, while the addressing circuit of the read/write section of bank 8 is needed to write C resultants to bank 8; the A operand read and the C resultant write cannot both be carried out in that major cycle. As a result, the storage access control sends a signal through channel 54 to fan-in circuit 56, and the first C resultant data are stored in buffer 61. Thus, as in the case shown in FIG. 4B, the C resultant data are stored successively in buffer 61.

As described above, counter 62 indicates the number of resultant swords stored in buffer 61. When the count reaches 8, indicating that eight swords of resultants are stored in buffer 61 (shown by dashed line C″ in FIG. 4C), counter 62 sends an enable signal through channel 64 to control device 44, which causes the A operands to be buffered further. The reason is that, if writing of the C resultants were to begin at the point shown by dashed line C′ in FIG. 4C, a conflict would arise because the B operands are being read from the third quadrant of memory; reading of the B operands must be suspended temporarily if writing of the C resultants is to begin. Furthermore, if the C resultants were buffered for another eight swords, a conflict would again arise with the reading of B operands from the fourth quadrant. Two approaches are possible.
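The sharing rule stated above, that every 32nd bank uses the same addressing circuit, can be expressed as a congruence test. The following is a minimal sketch, assuming 1-based bank numbers and treating any shared circuit between the banks read and the banks written in one major cycle as a conflict; the real arbitration in the storage access control is likely more detailed. Function names are hypothetical.

```python
ADDRESSING_PERIOD = 32   # every 32nd bank shares one addressing circuit

def shares_addressing_circuit(bank_x, bank_y):
    """True if two banks use the same read/write addressing circuit."""
    return (bank_x - bank_y) % ADDRESSING_PERIOD == 0

def group_conflict(read_banks, write_banks):
    """True if any bank being read shares a circuit with a bank being written."""
    return any(shares_addressing_circuit(r, w)
               for r in read_banks for w in write_banks)

# Reading A operands from banks 33-40 while writing C resultants to banks 1-8:
print(group_conflict(range(33, 41), range(1, 9)))   # circuits are shared
# Reading from banks 41-48 while writing to banks 1-8:
print(group_conflict(range(41, 49), range(1, 9)))   # no shared circuit
```

Under this model, delaying the C write until the A read has moved past bank 40 removes the conflict, which is the effect both approaches aim for.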

One is to buffer the C resultants for three major memory cycles (24 banks), so as to reach the situation shown in FIG. 4A.

This solution is not very desirable, however, because it delays the availability of the resultants unnecessarily and requires buffer 61 to be made larger. The second solution is therefore preferred. In it, when the count reaches 8 and the resultants have not yet been sent to the write register, counter 62 enables control device 44 to buffer a further eight swords of A operands (8 banks, that is, 64 operands). Thus, as shown in FIG. 4D, the A operands continue to be read from memory, but 16 swords (128 operands) of A operands are stored in buffer 38. Meanwhile, reading of the B operands is suspended temporarily for one major cycle (64 operands), and a further seven resultant swords (56 resultants) are stored temporarily in buffer 61 until the situation of FIG. 4D is reached. The A operands are then read at line A, the B operands at line B′, and the C resultants are handled at line C′. The C resultants are thus buffered for up to 16 memory banks, and the A operands are likewise buffered for up to 16 memory banks. In this case, A operands A1 through A320 have been read from the first 40 memory banks (banks 1 through 40), and B operands B1 through B196 from the first 24 memory banks. In the next four minor cycles, therefore, A operands A321 through A328 are read from memory bank 41, B operands B197 through B204 are read from memory bank 25, and C resultants C1 through C8 are written to memory bank 1 without conflict. Most arithmetic operations on the operands are completed by the arithmetic section of the computer within a time corresponding to a few minor cycles of a bank.

Accordingly, if the resultants are stored in the same banks as the operands, for example starting at bank 1, the situation described in connection with FIG. 4B is likely to occur. The resultants may, however, be stored in subsequent banks, in which case the situations described in connection with FIGS. 4A, 4C and 4D occur. Although not essential to an understanding of the invention, it will be seen from FIG. 4D that lines A, B and C fall, as dividing lines, on three of the four quadrant boundaries.

With a suitable buffering technique that places I/O accesses at the fourth of the four quadrant dividing lines (A+218 in FIG. 4D, A+152 in FIG. 4B), the input/output channels can access memory without conflict.

FIG. 5 shows apparatus according to the invention for processing operand sparse vectors.

The apparatus shown in FIG. 5 has a read register 10, which receives the A operand sparse vector from the computer memory and storage access control through channel 11. Read register 10 sends its output to buffer 102, which sends its output through channel 103 to operand shift register 104. Similarly, read register 12 receives the B operand sparse vector from the computer memory and storage access control through channel 13 and sends its output to buffer 107, which sends its output through channel 108 to operand shift register 109. Buffers 102 and 107 preferably each have a capacity of 32 operands and, on a first-in, first-out basis, arrange the operands arriving on channels 11 and 13 into a continuous sequence of operands. Buffers 102 and 107 consist of the apparatus shown in FIG. 2 and operate as previously described. Read register 110 receives its input from the computer memory and storage access control through channel 111 and sends its output to buffer 112.

Similarly, read register 113 receives its input from the computer memory and storage access control through channel 114 and sends its output to buffer 115. As will be understood more fully below, the A operand sparse vector enters read register 10 from memory through channel 11, the B operand sparse vector enters read register 12 from memory through channel 13, the X order vector enters read register 110 from memory through channel 111, and the Y order vector enters read register 113 from memory through channel 114. As will also be explained more fully below, the X and Y order vectors have as many bits as there are operands in the A and B operand vectors, respectively. Each "1" bit in the X or Y order vector corresponds to a non-zero operand in the A or B operand vector, respectively, and each "0" bit corresponds to a zero-valued operand. It should be understood, however, that the inputs received by read registers 10 and 12 contain no zero-valued operands (because they are sparse vectors). The zero-valued operands are instead identified by the "0" bits of the corresponding order vector. Buffers 112 and 115 can each store up to 1024 bits.
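The relationship between a full operand vector, its sparse vector, and its order vector can be sketched as follows. This is an illustrative sketch only: the function name `compress` and the sample values are hypothetical, and the patent's hardware receives the two vectors already separated in memory rather than computing them this way.

```python
def compress(dense):
    """Split a full operand vector into (sparse vector, order vector).

    The order vector has one bit per operand of the full vector:
    1 for a non-zero operand, 0 for a zero-valued operand.
    The sparse vector keeps only the non-zero operands, in order.
    """
    order_bits = [1 if value != 0 else 0 for value in dense]
    sparse = [value for value in dense if value != 0]
    return sparse, order_bits

# Hypothetical sample data, not taken from the patent.
sparse, order_bits = compress([5.0, 0.0, 0.0, 2.5, 0.0, 7.0])
print(sparse)       # non-zero operands only
print(order_bits)   # one bit per original operand
```

The number of "1" bits always equals the length of the sparse vector, which is what lets the order vector stand in for the omitted zeros.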

The output of buffer 112 is sent through channel 116 to X scale register 117, which sends its output to X left-shift network 118, which in turn sends its output through channel 119 to X flip-flop 120. Similarly, buffer 115 sends its output through channel 121 to Y scale register 122, which sends its output to Y left-shift network 123, which sends its output through channel 124 to Y flip-flop 125. The outputs of registers 117 and 122 are also sent to OR gate 126, which sends its output to normalize count network 127, which in turn sends its output to shift count register 128. The output of shift count register 128 is sent to both left-shift networks 118 and 123. X left-shift network 118 sends an output through channel 119a to X scale register 117, and Y left-shift network 123 sends an output through channel 124a to Y scale register 122. The output of X flip-flop 120 is sent through channel 130 to operand shift register 104, and the output of Y flip-flop 125 is sent through channel 131 to operand shift register 109. The output of X flip-flop 120 is also sent through channel 130a to the apparatus shown in FIG. 2, and the output of Y flip-flop 125 is likewise sent through channel 131a to the apparatus shown in FIG. 2. Multiplexer 132 receives inputs from buffer 102 through channel 133, from buffer 107 through channel 134, from buffer 112 through channel 135, and from buffer 115 through channel 136. Multiplexer 132 sends its output through channel 137 to the storage access control, for a purpose explained in detail later.

With the apparatus of FIG. 5 thus described, its operation is as follows. The A operands of the A operand sparse vector are transferred from memory through channel 11 and stored in buffer 102.

The A sparse vector consists of a plurality of A operands A1, A2, A7, A8, ..., An, each having a non-zero value. Similarly, the B operands of the B operand sparse vector are transferred from memory and stored in buffer 107. Buffers 102 and 107 arrange the operands they receive in consecutive order, as explained more fully in connection with the apparatus shown in FIG. 2. As explained above, an operand sparse vector contains all the non-zero operands of the original operand vector.

For purposes of explanation, the following description assumes an example in which each operand vector contains 16 operands (A1 through A16 and B1 through B16, respectively) and in which A operands A3, A4, A5, A6, A9, A11, A12 and A13 are zero-valued (so that A1, A2, A7, A8, A10, A14, A15 and A16 remain in the A sparse vector). Likewise, B operands B1, B4, B5, B6, B7, B8, B12, B13 and B14 are each zero-valued (so that B2, B3, B9, B10, B11, B15 and B16 remain in the B sparse vector). As already explained, buffers 102 and 107 arrange the A and B operands so that the first A operand to arrive, A1, is aligned with the first B operand to arrive, B2; the second A operand to arrive, A2, with the second B operand to arrive, B3; and the third A operand to arrive, A7, with the third B operand to arrive, B9. As explained above, registers 104 and 109 can each store four operands, so operands A1, A2, A7 and A8 are stored in register 104 and operands B2, B3, B9 and B10 in register 109, while the remaining operands are held in buffers 102 and 107. The A and B sparse vectors are thus arranged as shown in FIG. 8. This example concerns operand vectors of 16 terms (the A operand sparse vector having eight terms and the B operand sparse vector seven), but it will be appreciated that it is a very simple example chosen to illustrate the principle of operation, and that vectors ordinarily contain several thousand operands.
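The 16-operand example above can be reproduced with a short sketch that lists the surviving operands and pairs them in order of arrival. The zero positions and the four-operand register capacity come from the text; the helper name `sparse_labels` and the use of string labels in place of operand values are assumptions made for illustration.

```python
A_ZERO = {3, 4, 5, 6, 9, 11, 12, 13}       # zero-valued A operands (from the text)
B_ZERO = {1, 4, 5, 6, 7, 8, 12, 13, 14}    # zero-valued B operands (from the text)
REGISTER_CAPACITY = 4                       # registers 104 and 109 hold four operands

def sparse_labels(prefix, length, zero_positions):
    """Labels of the operands that survive into the sparse vector."""
    return [f"{prefix}{i}" for i in range(1, length + 1)
            if i not in zero_positions]

a_sparse = sparse_labels("A", 16, A_ZERO)
b_sparse = sparse_labels("B", 16, B_ZERO)

# Buffers 102 and 107 line the operands up purely by order of arrival.
aligned = list(zip(a_sparse, b_sparse))

# The first four of each stream sit in the operand shift registers;
# the remainder wait in the buffers.
reg_104, buf_102 = a_sparse[:REGISTER_CAPACITY], a_sparse[REGISTER_CAPACITY:]
reg_109, buf_107 = b_sparse[:REGISTER_CAPACITY], b_sparse[REGISTER_CAPACITY:]

print(aligned[:3])
print(reg_104, reg_109)
```

Note that arrival-order alignment pairs operands with different indices (A1 with B2, for example); it is the order vectors that later restore the correct positions.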

For small vectors such as those in this example, it would in practice be more convenient to handle the operation by some method other than the one described above. Moreover, the 16-operand example omits the details of handling considerably longer vectors (that is, vectors longer than 32 terms); the handling of such long vectors will become clearer below. It should therefore be noted that the simple 16-operand example is given for purposes of illustration only and for convenience of explanation. The detailed principles of operation for handling long vectors will become apparent later. The X order vector contains a number of bits equal to the number of operands in the full A operand vector.

A "1" bit in the X order vector indicates that the corresponding A operand has a non-zero value, and a "0" bit indicates that the corresponding A operand has a zero value (and is therefore absent from the A sparse vector). Similarly, the Y order vector contains a plurality of bits, each "1" or "0" according to the value of the corresponding B operand. Thus, in the example in which the A operand vector contains 16 terms and the B operand vector contains 16 terms, the X and Y order vectors each have 16 bits, arranged as shown in FIG. 8. In FIG. 8, the X order vector has a binary "1" in the 1st, 2nd, 7th, 8th, 10th, 14th, 15th and 16th positions (reading from left to right), corresponding to the non-zero A operands, and the Y order vector has a binary "1" in the 2nd, 3rd, 9th, 10th, 11th, 15th and 16th positions (reading from left to right), corresponding to the non-zero B operands. The X and Y order vectors are read from memory through channels 111 and 114, respectively, and stored in buffers 112 and 115.
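The bit patterns described for FIG. 8 can be written out explicitly. The sketch below builds each 16-bit order vector as a string read from left to right from the listed non-zero positions; the function name `order_vector` is hypothetical, and the string representation is chosen only for readability.

```python
def order_vector(nonzero_positions, length=16):
    """Build an order vector as a bit string, leftmost bit = position 1."""
    return "".join("1" if i in nonzero_positions else "0"
                   for i in range(1, length + 1))

# Non-zero positions as listed in the text for the 16-operand example.
X = order_vector({1, 2, 7, 8, 10, 14, 15, 16})
Y = order_vector({2, 3, 9, 10, 11, 15, 16})

print(X)
print(Y)
print(X.count("1"), Y.count("1"))   # lengths of the A and B sparse vectors
```

The counts of "1" bits (eight for X, seven for Y) match the numbers of terms in the A and B sparse vectors given for this example.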

It will be appreciated that the X order vector and the A operand sparse vector may be read through the same data channel using the same read register, and likewise the Y order vector and the B operand sparse vector. In that case, a switching device (not shown) routes the A operands to buffer 102 and the X bits to buffer 112, and the B operands to buffer 107 and the Y bits to buffer 115. (It will be appreciated that buffers 102 and 107 can each store up to 32 operands, and buffers 112 and 115 can each store up to 1024 bits.)

Scale registers 117 and 122 are 16-bit registers. The first 16 bits of the X and Y order vectors (in this case X1-X16 and Y1-Y16) are sent to scale registers 117 and 122, respectively. Scale register 117 sends its entire 16-bit contents to X left shift network 118, which outputs the first bit, X1, to set (or reset) flip-flop 120, shifts the remaining 15 bits one place to the left, and returns them to scale register 117. Those bits then occupy the first 15 bit positions (the 16th position remains empty). If the first bit of the X order vector is a binary "1" (as it is in the example), X flip-flop 120 is set and sends a gate signal through channel 130 to operand shift register 104. On the second pass of the order vector through the left shift network, the second bit, X2, now in the first bit position, sets or resets flip-flop 120, and the remaining bits are shifted left so that the third bit occupies the first position of register 117.

Operand shift registers 104 and 109 normally output zeros on channels 29 and 34, except when gated by flip-flops 120 and 125, respectively. Thus, when one of registers 104 and 109 is gated and the other is not, the gated register outputs an operand while the other continues to output zero. The process continues until all the bits have been processed and register 117 is empty. Buffer 112 then sends the next 16 bits of the X order vector to register 117 and the process is repeated. Similarly, scale register 122 and left shift network 123 process the Y order bits one at a time to set (or reset) Y flip-flop 125, whose gate signal is sent through channel 131 to operand shift register 109.
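The bit-at-a-time behaviour of a scale register, its left shift network and the associated flip-flop can be sketched in software as follows. This is an illustrative model only: the function name `gate_operands` and the operand labels are hypothetical, and the sketch covers a single order vector without the zero-skipping performed by the normalize count network.

```python
def gate_operands(order_bits, operands, width=16):
    """Emit one value per order bit: the next operand on a 1, zero on a 0."""
    pending = iter(operands)
    emitted = []
    for start in range(0, len(order_bits), width):
        # Load the next group of up to `width` order bits into the scale register.
        register = list(order_bits[start:start + width])
        while register:
            lead = register[0]        # leading bit sets (1) or resets (0) the flip-flop
            register = register[1:]   # remaining bits move one place to the left
            # A set flip-flop gates the operand register; otherwise zero is output.
            emitted.append(next(pending) if lead else 0)
    return emitted


x_bits = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
a_sparse = ["A1", "A2", "A7", "A8", "A10", "A14", "A15", "A16"]
stream = gate_operands(x_bits, a_sparse)
```

Each pass consumes the leading bit of the register and shifts the rest left, so the register empties after `width` passes and is then refilled from the buffer.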

By way of example, X and Y flip-flops 120 and 125 may be monostable multivibrators that are set, and so issue a gate signal, on receiving a binary "1" bit and are reset on receiving a binary "0" bit. OR gate 126 passes the logical OR of the X and Y order bits.
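The OR of the two order vectors can be checked with a few lines of code. In this sketch the X pattern follows the positions given for FIG. 8; the Y pattern is not spelled out in the text and is an assumption, inferred from the OR output quoted in the description and from the resultants said to pass in multiply mode (C2, C10, C15 and C16). The function name is hypothetical.

```python
def or_order_bits(x_bits: str, y_bits: str) -> str:
    """Bitwise OR of two order vectors written as strings of '0' and '1'."""
    return "".join("1" if "1" in (x, y) else "0" for x, y in zip(x_bits, y_bits))


X_BITS = "1100001101000111"  # 1s at positions 1, 2, 7, 8, 10, 14, 15, 16
Y_BITS = "0110000011100011"  # assumed: 1s at positions 2, 3, 9, 10, 11, 15, 16

combined = or_order_bits(X_BITS, Y_BITS)
print(combined)  # 1110001111100111
```

A "0" in the combined stream marks a position where neither vector holds an operand, which is the case the normalize count network is designed to skip.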

In the example, the OR gate passes 1110001111100111.

(The 4th, 5th, 6th, 12th and 13th bits are zero.) When the leading bit is zero (which occurs when zeros coincide in the X and Y order vectors), normalize count network 127 counts the number of zeros preceding the next binary "1" and drives shift count register 128, which sends its output to both left shift networks 118 and 123 so that the bits in both networks are shifted by that same count. Thus, on the fourth pass (where the leading bit is zero), the normalize count network counts the number of zero bits preceding the next "1" bit (three in this case) and sends that count to register 128. As a result, the bits in networks 118 and 123 are shifted three positions, so that when they are returned to registers 117 and 122 the X and Y order vectors are positioned to deliver the next bits, X7 and Y7. Similarly, at X12 and Y12, network 127 counts two and causes shift networks 118 and 123 to shift left again, skipping X12, X13, Y12 and Y13. From the foregoing, X and Y flip-flops 120 and 125 are set and reset according to the following table.
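The set and reset sequence that results from this zero-skipping can be reproduced with a short sketch. The function name `gate_schedule` is hypothetical, and the Y bit pattern is an assumption inferred from the OR output and the multiply-mode resultants quoted in the description; the X pattern follows FIG. 8 as described.

```python
def gate_schedule(x_bits, y_bits):
    """Return (schedule, skips).

    schedule lists (position, x_bit, y_bit) for every pass that reaches the
    flip-flops; skips lists each non-zero normalize count that was applied.
    """
    schedule, skips = [], []
    pos, n = 0, len(x_bits)
    while pos < n:
        # Normalize count: zeros in the OR-ed stream ahead of the next 1.
        count = 0
        while pos + count < n and not (x_bits[pos + count] or y_bits[pos + count]):
            count += 1
        if count:
            skips.append(count)
            pos += count              # both shift networks shift by the same count
        if pos == n:
            break
        schedule.append((pos + 1, x_bits[pos], y_bits[pos]))
        pos += 1
    return schedule, skips


x_bits = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
y_bits = [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]  # assumed pattern
schedule, skips = gate_schedule(x_bits, y_bits)
```

With these inputs the skips come out as three and then two, matching the counts given in the text, and eleven passes reach the flip-flops instead of sixteen.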

The gate signals from flip-flops 120 and 125 gate registers 104 and 109 to send one operand at a time through channels 29 and 34, respectively, to the data exchange section and arithmetic section of the computer.

As already explained, registers 104 and 109 (and buffers 102 and 107) contain the operands shown in FIG. 8. Referring to the foregoing table and FIG. 8, during the first minor cycle of the computer, flip-flop 120 sends a gate signal to register 104, which places the first A operand on channel 29. Flip-flop 125, however, is reset, so no gate signal is sent to register 109 and the first B operand is not output; instead, register 109 outputs zero, as already explained. The next A operand, A2, is advanced to the output position of register 104, and during the next minor cycle both flip-flops 120 and 125 gate registers 104 and 109 to send out A2 and B2. During the following minor cycle, flip-flop 125 gates register 109 to output B3 (register 104 outputs zero because flip-flop 120 is reset). During the next minor cycle, flip-flop 120 gates register 104 to output A7. Processing continues in the same way through the whole of each vector. The data on channels 29 and 34 are therefore output as follows.

Multiplexer 132 receives signals from buffers 102, 107, 112 and 115 and issues requests for further A, B, X and Y vector data to be sent to the appropriate buffer.
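The cycle-by-cycle operand pairing on channels 29 and 34 described above can be sketched as follows. The function name `channel_outputs` and the string labels are hypothetical, and the Y bit pattern is an assumption inferred from the OR output and multiply-mode resultants quoted in the description.

```python
def channel_outputs(a_sparse, b_sparse, x_bits, y_bits):
    """Pair up sparse operands cycle by cycle, as (channel 29, channel 34)."""
    a_iter, b_iter = iter(a_sparse), iter(b_sparse)
    pairs = []
    for x, y in zip(x_bits, y_bits):
        if not (x or y):
            continue                      # position skipped by the normalize count
        a = next(a_iter) if x else 0      # register 104 outputs zero unless gated
        b = next(b_iter) if y else 0      # register 109 outputs zero unless gated
        pairs.append((a, b))
    return pairs


x_bits = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
y_bits = [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]  # assumed pattern
a_sparse = ["A1", "A2", "A7", "A8", "A10", "A14", "A15", "A16"]
b_sparse = ["B2", "B3", "B9", "B10", "B11", "B15", "B16"]
pairs = channel_outputs(a_sparse, b_sparse, x_bits, y_bits)
```

The first four pairs are `("A1", 0)`, `("A2", "B2")`, `(0, "B3")` and `("A7", 0)`, which is the sequence walked through in the text.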

Thus, if the contents of one buffer run low, a signal indicating this is sent through multiplexer 132 to the storage access control (not shown), requesting that more data be sent to that buffer. The storage access control can send data from one group of memory banks to any one of read registers 10, 12, 110 and 113 during any one memory cycle. The amounts of vector data held in buffers 102, 107, 112 and 115 will therefore probably differ at any given time, so the multiplexer will usually request the different vectors at different, non-conflicting times. If, however, two or more buffers request data at once, the multiplexer selects one of them, and the process described above is temporarily halted when a required vector runs out.

The arithmetic section of the computer uses the operands appearing on channels 29 and 34 to generate resultants in accordance with the selected arithmetic function. In the add mode, for example, the arithmetic section performs A1+0=C1, A2+B2=C2, 0+B3=C3, A7+0=C7, and so on. In the subtract mode it performs A1-0=C1, A2-B2=C2, 0-B3=C3, A7-0=C7, and so on. In the multiply mode it performs A1·0=C1, A2·B2=C2, 0·B3=C3, A7·0=C7, and so on. In the divide mode it performs A1/0=C1, A2/B2=C2, 0/B3=C3, A7/0=C7, and so on. It will be appreciated that a multiplication involving a zero operand yields a zero resultant, a division by a zero divisor yields an indeterminate resultant, and a division of a zero dividend yields a zero resultant. As explained in more detail below with reference to FIG. 6, such operations are not permitted, and their resultants are taken to be zero regardless of what the arithmetic section computes.

Referring in particular to FIG. 6, there is shown apparatus for sending resultants from the data exchange section and arithmetic section of the computer to the memory and storage access control of the computer.

C resultants are sent from the data exchange section through channel 51 and received by register 141. They are then passed to buffer 142 (which can store up to 128 resultants) and on to write register 143, from which they are sent through channel 59 to the memory and storage access control of the computer for processing. The details of the buffering and timing control have already been described with reference to FIG. 2.

The X and Y gate signals from flip-flops 120 and 125 of FIG. 5 are sent through channels 130 and 131 to AND gates 145, 146 and 147. Specifically, the gate signal from X flip-flop 120 is applied through channel 130 to one input of each of AND gates 145 and 147, and the gate signal from Y flip-flop 125 is applied through channel 131 to one input of each of AND gates 145 and 146. Function control 148 supplies gate outputs on channels 149 and 150. For multiplication or division, a gate output is applied through channel 149 to the third input of AND gate 145. For addition or subtraction, a gate signal is applied through channel 150 to AND gates 146 and 147. Thus, in the multiply or divide mode, gate outputs from both flip-flops 120 and 125 of FIG. 5 are needed to drive AND gate 145 and obtain a signal output from it. In the add or subtract mode, a gate output from flip-flop 120 and/or flip-flop 125 selectively drives either or both of AND gates 146 and 147.

The outputs of AND gates 145, 146 and 147 are the inputs of OR gate 151. OR gate 151 therefore places an output on channel 152 to register 141 only when either operand is present and function control 148 is in the add or subtract mode, or when both operands are present and the function control is in the multiply or divide mode. In other words, OR gate 151 produces an output only for operations permitted in the arithmetic section; where a zero-valued multiplier, divisor or dividend is involved, no output from OR gate 151 appears on channel 152. The gate signal on channel 152 gates register 141 so that the proper resultant is passed to buffer 142 and then stored in memory. If a non-permitted operation is performed in the arithmetic section, register 141 is not gated and the non-permitted resultant is not stored; it is simply ignored. From the example given above and the foregoing description, for the add or subtract function register 141 is gated to pass resultants C1, C2, C3, C7, C8, C9, C10, C11, C14, C15 and C16.
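The gating of register 141 by AND gates 145, 146 and 147 and OR gate 151 reduces to a small Boolean function, sketched below. The function name `store_enable` and the mode strings are hypothetical, and the Y bit pattern is an assumption inferred from the OR output and the multiply-mode resultants quoted in the description.

```python
def store_enable(x: int, y: int, mode: str) -> bool:
    """Model of OR gate 151: True when the resultant may be stored."""
    mul_div = mode in ("mul", "div")    # function control output on channel 149
    add_sub = mode in ("add", "sub")    # function control output on channel 150
    and_145 = bool(x and y and mul_div)  # both operands present, multiply/divide
    and_146 = bool(y and add_sub)        # B operand present, add/subtract
    and_147 = bool(x and add_sub)        # A operand present, add/subtract
    return and_145 or and_146 or and_147


def stored_positions(x_bits, y_bits, mode):
    """1-based positions whose resultants pass through register 141."""
    return [i + 1 for i, (x, y) in enumerate(zip(x_bits, y_bits))
            if store_enable(x, y, mode)]


x_bits = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
y_bits = [0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1]  # assumed pattern
```

With these inputs, `stored_positions(x_bits, y_bits, "add")` yields positions 1, 2, 3, 7, 8, 9, 10, 11, 14, 15 and 16, and `stored_positions(x_bits, y_bits, "mul")` yields positions 2, 10, 15 and 16.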

In the multiply or divide mode, however, register 141 passes only resultants C2, C10, C15 and C16; the other resultants are non-permitted and are discarded by register 141. So far it has been assumed that zero-valued or indeterminate resultants of a multiply or divide function are non-permitted resultants. It will be appreciated, however, that not storing such resultants amounts to assuming that they are zero. That assumption is valid in most cases, but if resultants having indeterminate values are to be identified, each divisor and dividend can be examined by a comparison circuit elsewhere in the computer and stored in memory.

The foregoing has described how sparse vectors are logically gated or blocked by order vectors, operated upon, and formed into a sparse resultant vector that is stored in memory. Such a sparse resultant vector, however, is not necessarily fully useful without a resultant order vector. FIG. 7 shows apparatus for generating the resultant order vector Z. The resultant order vector Z consists of a plurality of bits in which a binary "1" indicates a non-zero resultant in the resultant vector and a binary "0" indicates a zero-valued resultant. As shown in FIG. 7, read register 210 receives the X order vector from the memory and storage access control of the computer through channel 211 and sends the data to X buffer 212.

Similarly, read register 213 receives the Y order vector from the memory and storage access control of the computer through channel 214 and sends the data to buffer 215. The bits are sent successively from buffers 212 and 215 to AND gates 245, 246 and 247. Function control 148 (the same function control described with reference to FIG. 6) sends a gate signal indicating a multiply or divide function on channel 149 and a gate signal indicating an add or subtract function on channel 150. The output indicating a multiply or divide function is applied through channel 149 to AND gate 246, whose other inputs come from buffers 212 and 215. The gate signal indicating an add or subtract function is applied through channel 150 to AND gates 245 and 247. The outputs of AND gates 245, 246 and 247 are passed through OR gate 251 to set or reset Z flip-flop 260. The output of flip-flop 260 is connected to output register 261, which sends its output through channel 262 to the data exchange section. As the X and Y vectors are passed through buffers 212 and 215, respectively, their bits are applied to the respective AND gates and, through OR gate 251, set or reset flip-flop 260.
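The generation of the resultant order vector Z from the X and Y order vectors can be sketched as follows. The function name `z_order` and the mode strings are hypothetical, and the Y bit pattern is an assumption inferred from the OR output and the multiply-mode resultants quoted in the description.

```python
def z_order(x_bits: str, y_bits: str, mode: str) -> str:
    """Resultant order vector: OR of the inputs for add/subtract, AND for multiply/divide."""
    if mode in ("add", "sub"):
        # AND gates 245 and 247 each pass one input bit; OR gate 251 combines them.
        return "".join("1" if "1" in (x, y) else "0" for x, y in zip(x_bits, y_bits))
    if mode in ("mul", "div"):
        # AND gate 246 requires both bits to be 1.
        return "".join("1" if (x, y) == ("1", "1") else "0" for x, y in zip(x_bits, y_bits))
    raise ValueError(f"unknown mode: {mode}")


X_BITS = "1100001101000111"
Y_BITS = "0110000011100011"  # assumed pattern

z_add = z_order(X_BITS, Y_BITS, "add")
z_mul = z_order(X_BITS, Y_BITS, "mul")
print(z_add)  # 1110001111100111
print(z_mul)  # 0100000001000011
```

Note that Z is derived from the order bits alone, not from the computed values, so a "1" in Z marks a stored resultant rather than guaranteeing that its numeric value is non-zero.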

For example, in the add or subtract mode, for each binary "1" occurring in the X or Y vector, the respective AND gate 245 or 247 operates OR gate 251 to set flip-flop 260, which outputs a binary "1" to register 261. Conversely, in the multiply or divide mode, if the X and Y bits are both binary "1", AND gate 246 is driven to operate OR gate 251, setting flip-flop 260 and outputting a binary "1" to register 261. In all other cases flip-flop 260 is reset and outputs a binary "0" to register 261. In this way the Z order vector is generated, sent to the data exchange section, passed through channel 51 to register 141, and then sent to memory. FIG. 9 shows the C resultant sparse vector and the Z order vector for the addition and subtraction example, and FIG. 10 shows the C resultant sparse vector and the Z order vector for multiplication and division.

It will be appreciated that much of FIG. 7 is essentially identical to parts of FIGS. 5 and 6. To simplify the circuitry, therefore, read registers 210 and 213 may be read registers 110 and 113 of FIG. 5, and buffers 212 and 215 may be buffers 112 and 115 of FIG. 5. AND gates 245, 246 and 247 may be AND gates 145, 146 and 147 of FIG. 6, and OR gate 251 may be OR gate 151 of FIG. 6. For simplicity of operation, however, it is preferable to keep them separate, because the AND and OR gates of FIG. 7 are preferably 16-bit gates whereas those of FIG. 6 are preferably 1-bit gates. If the same circuits are used, the Z order vector must be generated on a second pass through the data.

As described above, the present invention provides apparatus for continuously reading data from the memory of a computer for operation by the arithmetic section of the computer, and for writing resultants from the arithmetic section to the memory. The apparatus operates with a minimal preset delay, and once buffering is established no further delay is needed. The invention thus eliminates the additional delays that would otherwise be unavoidable during continuous operation of the computer. Instead, a preset delay occurs at or near the beginning of each data stream, in accordance with optimal computer operation and system timing. One feature of the invention is that buffering is established at a rate that is independent of data access.

Thus, for "short" vectors (vectors of about 48 operands or fewer), buffering need not be established, because later operands arrive after the first operands have been processed. Moreover, if a gap occurs in the data, the buffer synchronizes itself so that the vector is handled optimally. For "medium" length vectors (vectors of about 320 operands or fewer), further optimization is achieved by releasing the control path as described earlier. The invention also provides apparatus for aligning sparse vectors so that operations can be performed on them.

The invention further provides apparatus for compressing resultant vectors so as to minimize the memory space needed to store the resultants. As shown in particular in FIGS. 9 and 10, a substantial saving in memory space is achieved by storing only the non-zero operands and resultants together with the necessary order vectors. When the resultants are used subsequently (for example, when they are operated on in turn as operands), the vectors are used in the same way. Apparatus according to the invention allows vectors having a very large number of elements to be stored in a minimum of space in the computer memory, and the use of sparse vectors can also shorten computation time, particularly when a vector contains a large number of elements having a predetermined value (for example, zero). With apparatus according to the invention, large vectors representing operands and resultants can be handled between the memory and the arithmetic section of a computer without incurring the serious delays known in the prior art. The invention is not limited to the embodiment described in this specification and shown in the drawings.

The embodiment is given by way of example and not of limitation, and the invention is limited only by the appended claims.

[Brief Description of the Drawings]

FIG. 1 shows the storage locations of operand vectors A and B and resultant vector C in a plurality of memory banks of a computer. FIG. 2 is a block circuit diagram of a buffering control apparatus according to a preferred embodiment of the invention. FIGS. 3A to 3D are graphical representations explaining the operation of the buffering control apparatus of FIG. 2 in sending operand vectors from the memory of the computer to the arithmetic section. FIGS. 4A to 4D are graphical representations explaining the operation of the buffering control apparatus of FIG. 2 in sending resultant vectors from the arithmetic section to the memory. FIG. 5 is a block circuit diagram of apparatus for processing operand sparse vectors and operand order vectors according to the invention. FIG. 6 is a block circuit diagram of apparatus for controlling resultant sparse vectors according to the invention. FIG. 7 is a block circuit diagram of apparatus for generating resultant order vectors according to the invention. FIGS. 8 to 10 show order vectors and sparse vectors useful in explaining the operation of the apparatus shown in FIGS. 5 to 7.

28, 26, 17, 33, 31, 22, 38: apparatus for storing operands; 41, 42, 44, 45, 46: control apparatus; 110: first register; 113: second register; 104, 109: shift apparatus; 120: first generator; 125: second generator.

Claims

[Claims] 1. A data processing device for simultaneously forwarding, from a memory to an arithmetic unit of a computer, operands of a first vector in the form A_1, A_2, A_3, ..., A_n and operands of a second vector in the form B_1, B_2, B_3, ..., B_n, wherein each of the first and second vectors includes a finite number of operands, A_1, A_2, A_3, ..., A_n each represents a respective consecutive first operand of said first vector, and B_1, B_2, B_3, ..., B_n each represents a respective consecutive second operand of said second vector, said data processing device comprising: an input device which converts said first and second vectors into first and second consecutively ordered operand sequences, respectively; a first register device connected to the input device for storing operands of the first vector; a second register device connected to the input device for storing operands of the second vector, said first and second register devices being arranged to send operand pairs A_1 and B_1, A_2 and B_2, A_3 and B_3, ..., A_n and B_n to an output device; a buffer device for storing operands; and a control device connected to said first and second register devices which, in response to one of the register devices becoming saturated with operands, selectively controls the buffer device to store further operands of the vector associated with the saturated register device and, in response to the arrival of the corresponding sequentially ordered operands of the other vector in its register device, controls the buffer device to supply the stored further operands to the saturated register device, thereby controlling the sending of operands; said output device being connected to the arithmetic unit and sequentially sending each pair of said operands to the arithmetic unit.

2. A data processing device for processing data between a memory and an arithmetic unit of a computer, wherein the memory holds a first sparse vector including a plurality of first operands and a second sparse vector including a plurality of second operands, each of the first operands of the first sparse vector representing a value different from a predetermined value of the corresponding first operand vector, and each of the second operands of the second sparse vector representing a value different from the predetermined value of the corresponding second operand vector; the memory further holds a first order vector including a first plurality of consecutive bits and a second order vector including a second plurality of consecutive bits, at least one bit of the first order vector corresponding to one term of the first operand vector and at least one bit of the second order vector corresponding to one term of the second operand vector, each bit of said first and second order vectors having a first binary value when that bit corresponds to an operand not representing said predetermined value and a second binary value when that bit corresponds to an operand representing said predetermined value; said data processing device comprising: a first register device for storing first operands of the first sparse vector and a second register device for storing second operands of the second sparse vector, said first and second register devices being arranged to align each successive first operand of said first sparse vector with each successive second operand of said second sparse vector; a buffer device for storing operands; a control device connected to the first and second register devices which, in response to one of the register devices being saturated with operands, selectively controls the buffer device to store further subsequent operands of the sparse vector associated with the saturated register device and, in response to the arrival of the corresponding operands of the other vector in its register device, causes the buffer device to supply the further operands successively to the saturated register device, thereby controlling the sending of operands to the arithmetic unit; a logic device that supplies first and second gate signals in response to the bits of the first and second order vectors; and a gate device that, in response to the first gate signal, sends an operand of the first sparse vector from the first register device to the arithmetic unit and, in response to the second gate signal, sends an operand of the second sparse vector from the second register device to the arithmetic unit.