JPH0361225B2

Info

Publication number: JPH0361225B2
Application number: JP26024485A
Authority: JP
Inventors: Nobuyuki Sugiura
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1985-11-20
Filing date: 1985-11-20
Publication date: 1991-09-19
Also published as: JPS62119674A

Description

[Detailed description of the invention]

[Table of contents]

Overview; Industrial application field; Conventional technology; Problem that the invention seeks to solve; Means for solving problems; Operation; Example; Effect of the invention

[Overview]

In a data processing device having a plurality of arithmetic processing units, or a plurality of arithmetic processing units including one or more memory access processing units, an issue standby register QX that stores a copy of the instruction selected by the instruction input section is provided near the instruction issue control section. The register is controlled so that it selects and stores the instruction whose issue conditions are to be judged in the cycle following each instruction's processing cycle. Issue control can then be performed using only the contents of the issue standby register QX.

[Industrial application field]

The present invention relates to an instruction control system for a data processing device, for example a vector data processing device, that allows a later instruction to be executed ahead of an earlier one.

With the recent remarkable progress in semiconductor technology, large functional blocks of a data processing device have come to be highly integrated within a single chip or a single printed circuit board.

In addition, machine cycles are being shortened to speed up data processing devices, so the logic delay between chips or printed circuit boards has come to affect the processing capability of the device.

Meanwhile, a vector data processing device equipped with a plurality of arithmetic pipelines, or with arithmetic pipelines that include one or more memory access pipelines, processes a large amount of data with a single vector instruction. If nothing requires the order of several vector instructions to be preserved, executing a subsequent instruction ahead of a preceding one can therefore greatly improve the processing capability of the whole device.

In such a vector data processing device, the instruction control section consists broadly of an instruction input section and an instruction issue section. An instruction in the waiting registers Q₁, Q₂ of the instruction input section is selected and sent to the instruction issue section, where it is compared with the preceding instructions to judge the issue conditions. If nothing prevents the issue, the vector instruction is taken into the instruction issue section and the relevant arithmetic unit is activated (see Japanese Unexamined Patent Publication No. 57-161938, discussed below).

When the instruction input section and the instruction issue section are housed in separate chips or printed circuit boards, issue control takes time for two reasons: the logic delay of transferring data from the input section to the issue section, and the logic delay of the selection circuit stages that choose between the waiting registers Q₁ and Q₂. As a result, instructions held in the instruction input section are issued late.

Under these circumstances, an instruction control system has been called for that keeps the instruction control timing between the instruction input section and the instruction issue section unchanged while making the inter-block transfer between them effectively invisible, thereby improving the processing capability of the vector data processing device.

[Conventional technology]

In general, a vector data processing device is a data processing device that takes a second operand $A = (a_0, a_1, \ldots, a_{n-1})$ and a third operand $B = (b_0, b_1, \ldots, b_{n-1})$, each having a plurality of elements, applies an operation to corresponding elements, and obtains the result as a first operand $C = (c_0, c_1, \ldots, c_{n-1})$. For the operation $C = A + B$, for example, $c_i = a_i + b_i$.

By contrast, a conventional general-purpose processing device, in which the number of elements is limited to one (n = 1), is called a scalar processing device.

FIG. 3 shows an outline of a vector data processing device. Solid arrows indicate the flow of data and dotted arrows indicate the flow of control signals.

The store processing unit 4 stores data from the vector registers 6 into the main storage MS 1, and the load processing unit 5 reads data from the main storage MS 1 and stores it in the vector registers 6.

The vector register unit 6 contains a plurality of vector registers, each holding vector data made up of a plurality of elements.

The store processing unit 4, load processing unit 5, multiplier MP 7, and adder AD 8 all have a pipeline structure.

The instruction control section 9 controls the vector registers 6, the store processing unit 4, the load processing unit 5, the multiplier MP 7, the adder AD 8, and so on.

The store processing unit 4, load processing unit 5, multiplier MP 7, and adder AD 8 are collectively referred to as the arithmetic processing units.

A vector instruction has an instruction code, a first operand specifier, a second operand specifier, and a third operand specifier. A vector multiply instruction, for example, is written as

VM 1,2,3

This multiplies the contents of vector register 2 by the contents of vector register 3 and stores the product in vector register 1.

A vector add instruction is written as

VA 4,5,1

This adds the contents of vector register 5 and the contents of vector register 1 and stores the sum in vector register 4.
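As a rough illustration of the three-operand form described above, the following Python sketch models the vector registers as a dictionary of lists and executes VM and VA element by element. The `execute` helper, the register contents, and the four-element vector length are assumptions made only for this illustration; the patent does not prescribe them.

```python
# Hypothetical model: each vector register is a list of elements.
def execute(op, dst, src2, src3, vr):
    """Apply a vector instruction 'op dst,src2,src3' to the register file vr."""
    if op == "VM":      # vector multiply
        vr[dst] = [a * b for a, b in zip(vr[src2], vr[src3])]
    elif op == "VA":    # vector add
        vr[dst] = [a + b for a, b in zip(vr[src2], vr[src3])]
    else:
        raise ValueError(f"unknown instruction code: {op}")

vr = {2: [1, 2, 3, 4], 3: [10, 20, 30, 40], 5: [1, 1, 1, 1]}
execute("VM", 1, 2, 3, vr)   # register 1 = register 2 * register 3
execute("VA", 4, 5, 1, vr)   # register 4 = register 5 + register 1
print(vr[1])  # [10, 40, 90, 160]
print(vr[4])  # [11, 41, 91, 161]
```

Note that the first operand is the destination, so VA 4,5,1 reads the result that VM 1,2,3 writes into register 1.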

In general, when vector instructions are processed, the multiplier MP 7, the adder AD 8, and so on are built as pipelines, and subsequent elements are fed into the arithmetic pipeline before processing of the preceding elements has completed.

FIG. 4 shows how a vector add instruction is processed in the adder AD. As shown in part (a), the adder is a six-stage pipeline: reading the data (READ), comparing the exponents of the two operands (COMPARE), shifting to align the exponents (PRE SHIFT), adding (ADD), shifting to normalize the sum (POST SHIFT), and writing the data (WRITE). Such instruction processing is drawn as a parallelogram, as in part (b), whose vertical direction represents the pipeline stages and whose horizontal direction represents the number of elements processed by the vector add instruction.
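The parallelogram can be reproduced with a few lines of Python. This sketch assumes that one new element enters the pipeline per cycle and that element and cycle numbering both start at 0; the function names are illustrative, and the 8-element vector length is borrowed from the example used later in the text.

```python
ADD_STAGES = ["READ", "COMPARE", "PRE SHIFT", "ADD", "POST SHIFT", "WRITE"]

def stage_at(element, cycle, stages=ADD_STAGES):
    """Stage occupied by `element` at `cycle`, or None if it is not in the pipe."""
    idx = cycle - element          # element e enters READ at cycle e (assumed)
    return stages[idx] if 0 <= idx < len(stages) else None

def total_cycles(n_stages, n_elements):
    """Cycles from the first READ to the last WRITE, inclusive."""
    return n_stages + n_elements - 1

print(stage_at(0, 3))                      # ADD
print(stage_at(2, 3))                      # COMPARE
print(total_cycles(len(ADD_STAGES), 8))    # 13
```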

Now suppose we have the following vector instruction sequence:

VM 1,2,3
VA 4,5,1
VA 7,8,9

In normal processing, this sequence is handled as shown in FIG. 5, which illustrates normal processing of the vector instruction sequence. Here the vector multiply instruction is assumed to use an 11-stage pipeline, and the number of elements is assumed to be 8. In the figure, VA 4,5,1 starts at time 11 because it uses the result data of VM 1,2,3: the addition cannot begin until at least the multiplication of the first element has completed.

VA 7,8,9 has no vector register interference with the preceding vector instructions, so executing it ahead of VA 4,5,1 shortens the instruction processing time.

FIG. 6 shows instruction processing when a later vector instruction is executed ahead of an earlier one, and clearly illustrates the reduction in instruction processing time.

Specifically, in the example of FIG. 6 the instruction processing time is 9 cycles shorter than in the normal processing of FIG. 5 (for example, from 37 cycles to 28 cycles).

Controlling execution so that the order of vector instructions specified by the program is changed in this way greatly improves the processing capability of the vector data processing device.
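The 37-cycle and 28-cycle totals can be reproduced with a small timing model. The sketch below is not taken from the patent: FIGS. 5 and 6 are not available here, so it assumes that the three instructions become available at cycles 0, 1, and 2, that a pipeline is not reused until the previous instruction on it has written its last element, and that a dependent instruction may start once the first element of its source has passed through every stage of the producing pipeline. All names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VectorInstr:
    name: str
    unit: str      # "MP" (multiplier) or "AD" (adder)
    stages: int    # pipeline depth of that unit
    dst: int
    srcs: tuple
    fetch: int     # assumed earliest start cycle (one fetch per cycle)

def total_time(order, n_elements=8):
    """Cycle at which the last instruction in `order` finishes (assumed model)."""
    unit_free = {}     # unit -> cycle at which it becomes free again
    first_result = {}  # register -> cycle at which its first element is ready
    end_times = []
    for ins in order:
        start = max(ins.fetch, unit_free.get(ins.unit, 0))
        for reg in ins.srcs:
            start = max(start, first_result.get(reg, 0))
        end = start + ins.stages + n_elements - 1
        unit_free[ins.unit] = end
        first_result[ins.dst] = start + ins.stages
        end_times.append(end)
    return max(end_times)

vm  = VectorInstr("VM 1,2,3", "MP", 11, 1, (2, 3), 0)
va1 = VectorInstr("VA 4,5,1", "AD", 6, 4, (5, 1), 1)
va2 = VectorInstr("VA 7,8,9", "AD", 6, 7, (8, 9), 2)

print(total_time([vm, va1, va2]))  # 37 (program order)
print(total_time([vm, va2, va1]))  # 28 (VA 7,8,9 overtakes VA 4,5,1)
```

Under these assumptions VA 4,5,1 starts at cycle 11 in program order, as the text states, and the reordered schedule saves 9 cycles. The actual figures may derive their timing differently.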

An improved instruction control system for a vector data processing device is disclosed in Japanese Unexamined Patent Publication No. 57-161938.

FIG. 7 is a block diagram of one embodiment of the improved instruction control section, and FIG. 8 is a time chart showing its operation when an instruction is overtaken.

First, instruction information fetched from the main storage MS 1 is set in the fetch register F 10. The Q₁, Q₂ input control circuit 11 moves the instruction information in the fetch register F 10 to waiting register Q₁ 12-1 or Q₂ 12-2, on condition that, among other things, the register interference check circuits 13-1 and 13-2 indicate no register interference.

The selector SEL 14 alternately selects waiting register Q₁ 12-1 or Q₂ 12-2 according to a control signal from the selection control circuit 19, which inverts the value of that signal every cycle.

Next, on condition that, among other things, both register interference check circuits 18-1 and 18-2 indicate no register interference, the instruction issue control circuit 15 moves the instruction information output by the selector SEL 14 to the add register AR 17 when it is a vector add instruction, or to the multiply register MR 16 when it is a vector multiply instruction. At the same time it sends out activation information for the arithmetic processing unit.

FIG. 8 is a time chart of the operation of this instruction control section 9 when a vector instruction is overtaken.

In the figure, the instruction information VM 1,2,3 is set in the fetch register F 10 at time T = 0, moved to waiting register Q₁ 12-1 at T = 1, and moved to the multiply register MR 16 at T = 2, whereupon multiplication starts. When multiplication starts, the before-WRITE-start flag is set to "on"; it is reset when writing of the results into the vector registers 6 begins.

The instruction information VA 4,5,1 is set in the fetch register F 10 at time T = 1 and in waiting register Q₂ 12-2 at T = 2.

The instruction issue control circuit 15 then checks the issue conditions. The first operand register of VM 1,2,3, held in the multiply register MR 16, interferes with the third operand register of VA 4,5,1, held in waiting register Q₂ 12-2, so issue of the VA 4,5,1 instruction held in Q₂ 12-2 is made to wait.

When the instruction VA 7,8,9 is set in the fetch register F 10 at time T = 2, there is no register interference between it and the VA 4,5,1 instruction held in waiting register Q₂ 12-2. VA 7,8,9 is therefore moved to waiting register Q₁ 12-1 at T = 3. Since nothing prevents its issue, it is moved to the add register AR 17 at T = 4 and the adder AD 8 is activated to start the addition.

In this way a later instruction overtakes an earlier one.
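The issue decisions in this walkthrough hinge on the register interference check. A minimal sketch of such a check follows. It assumes an instruction is a tuple `(code, dst, src2, src3)` and that interference means the later instruction reads or writes the destination of the earlier one, or overwrites a register the earlier one reads. The text does not spell out the exact rule used by check circuits 13-1, 13-2, 18-1, and 18-2, so this definition is an assumption.

```python
def interferes(earlier, later):
    """True if `later` must not start before `earlier` (assumed rule)."""
    _, e_dst, *e_srcs = earlier
    _, l_dst, *l_srcs = later
    return (
        e_dst in l_srcs      # later reads what earlier writes
        or l_dst == e_dst    # both write the same register
        or l_dst in e_srcs   # later overwrites what earlier reads
    )

vm  = ("VM", 1, 2, 3)
va1 = ("VA", 4, 5, 1)
va2 = ("VA", 7, 8, 9)

print(interferes(vm, va1))   # True:  VA 4,5,1 reads register 1, written by VM 1,2,3
print(interferes(vm, va2))   # False: VA 7,8,9 shares no register with VM 1,2,3
print(interferes(va1, va2))  # False: VA 7,8,9 may overtake VA 4,5,1
```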

Publication No. 57-161938 also discloses other overtaking schemes, but with respect to the problem described next they present the same conditions, so they are omitted here.

[Problem that the invention seeks to solve]

In the conventional instruction control system, issue control requires fairly complex logic, so the number of logic stages in the instruction issue control circuit 15 becomes large.

The amount of circuitry also grows, so the instruction input section and the instruction issue section shown in FIG. 7 have to be built as separate, physically distant blocks (for example, separate chips or printed circuit boards).

This lengthens the signal delay, making it difficult to complete issue control within the single machine cycle between instruction input and instruction issue, and in some cases the machine cycle itself has to be lengthened.

In particular, the bottleneck is the delay incurred, before the issue conditions are judged, in passing the instruction information in waiting registers Q₁ 12-1 and Q₂ 12-2 through the selector SEL 14, transferring it between blocks, and delivering it to the instruction issue control circuit 15.

In view of these drawbacks of the prior art, the object of the present invention is to provide a method that eliminates the logic-delay bottleneck between the waiting registers Q₁, Q₂ of the instruction input section and the instruction issue control circuit.

[Means for solving problems]

FIG. 1 illustrates the concept of the present invention, and FIG. 2 is a block diagram outlining one embodiment of the present invention.

In the present invention, an issue standby register QX 20 is provided near the instruction issue control circuit 15, for example immediately before it, and a route is provided that bypasses the output of the Q₁, Q₂ input control circuit 11 to the selector SEL 14. The QX input selection control circuit 21 makes the selector SEL 14 perform the following control for each condition.

(a) When, one cycle before the selection control circuit 19 selects waiting register Q₁ 12-1 (hereinafter the Q₁ register), the Q₁, Q₂ input control circuit 11 inputs an instruction to the Q₁ register 12-1, the instruction information in the fetch register 10 (hereinafter the F register) is selected, so that the issue standby register 20 (hereinafter the QX register) is set at the same time as the instruction is set in the Q₁ register 12-1.

However, when the instruction is not set in the Q₁ register 12-1, it is not set in the QX register 20 either.

(b) When, one cycle before the selection control circuit 19 selects the Q₁ register 12-1, the Q₁, Q₂ input control circuit 11 does not input an instruction to the Q₁ register 12-1, the contents of the Q₁ register 12-1 are set in the QX register 20.

(c) When, one cycle before the selection control circuit 19 selects waiting register Q₂ 12-2 (hereinafter the Q₂ register), the Q₁, Q₂ input control circuit 11 inputs an instruction to the Q₂ register 12-2, the instruction information in the F register 10 is selected, so that the QX register 20 is set at the same time as the instruction is set in the Q₂ register 12-2.

However, when the instruction is not set in the Q₂ register 12-2, it is not set in the QX register 20 either.

(d) When, one cycle before the selection control circuit 19 selects the Q₂ register 12-2, the Q₁, Q₂ input control circuit 11 does not input an instruction to the Q₂ register 12-2, the contents of the Q₂ register 12-2 are set in the QX register 20.

FIG. 1 illustrates the above operation conceptually. F, Q₂, Q₁, QX, MR, and AR along the horizontal axis denote the registers described above, T1 onward along the vertical axis denote successive times, the entries in the chart denote instruction information, and (a) onward in the remarks column correspond to the control conditions above.

As the figure shows, when the present invention is implemented, the contents of the QX register 20 as seen from the instruction issue section coincide exactly with the contents of the Q₁ register 12-1 or Q₂ register 12-2 selected by the selection control circuit 19, and no delay is seen in timing either.

With the control described above, a copy of the instruction in the Q₁ register 12-1 or the Q₂ register 12-2 is set alternately in the QX register 20 every machine cycle, in every case.

In other words, the output of the QX register 20 is logically identical to the output of the selector SEL 14, and identical in timing as well.

Instruction issue control can therefore be sped up by the amount of the selector SEL 14 delay and the inter-block transfer delay, which are no longer visible.
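Rules (a) through (d) reduce to a single choice: for whichever waiting register will be selected in the next cycle, QX takes the F register value if that waiting register is being loaded now, and otherwise takes the register's current contents. The following Python sketch simulates this cycle by cycle and checks that QX always equals what the selector would output in the conventional arrangement. The function and variable names, the trace of fetched instructions, and the omission of issue and register-clearing behaviour are all assumptions of this sketch, not details given in the patent.

```python
def qx_input(sel_now, load_target, f, q):
    """Value to latch into QX at the end of the current cycle.

    sel_now:     register selected in this cycle ("Q1" or "Q2")
    load_target: register being loaded from F in this cycle, or None
    f:           contents of the F register
    q:           dict with the current contents of "Q1" and "Q2"
    """
    upcoming = "Q2" if sel_now == "Q1" else "Q1"   # register selected next cycle
    if load_target == upcoming:
        return f              # cases (a) and (c): bypass from F
    return q[upcoming]        # cases (b) and (d): copy the waiting register

def simulate(trace):
    """trace: list of (f, load_target) per cycle. Returns (QX, SEL output) pairs."""
    q = {"Q1": None, "Q2": None}
    sel, qx, seen = "Q1", None, []
    for f, load_target in trace:
        seen.append((qx, q[sel]))            # what issue control sees this cycle
        next_qx = qx_input(sel, load_target, f, q)
        if load_target is not None:
            q[load_target] = f               # input control loads Q1 or Q2
        qx = next_qx
        sel = "Q2" if sel == "Q1" else "Q1"  # selection alternates every cycle
    return seen

# Hypothetical trace of (F contents, register loaded in this cycle).
trace = [("VM 1,2,3", "Q1"), ("VA 4,5,1", "Q2"), ("VA 7,8,9", None),
         ("VA 7,8,9", "Q1"), (None, None), (None, None)]
for qx, sel_out in simulate(trace):
    assert qx == sel_out   # QX matches the selector output in every cycle
```

Because QX is a register placed next to the issue control circuit, the comparison in the real device would start from a latched value rather than from a value that has just crossed the selector and the block boundary.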

[Operation]

That is, according to the present invention, in a data processing device having a plurality of arithmetic processing units, or a plurality of arithmetic processing units including one or more memory access processing units, an issue standby register QX that stores a copy of the instruction selected by the instruction input section is provided near the instruction issue control section, and the register is controlled so that it selects and stores the instruction whose issue conditions are to be judged in the cycle following each instruction's processing cycle. Because issue control is performed using only the contents of the issue standby register QX, the inter-block transfer from the instruction input section to the instruction issue section need not be taken into account, and instruction issue control can be sped up.

[Example]

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 2, referred to above, is a block diagram outlining one embodiment of the present invention. The bypass path for instruction information from the input control circuit 11 to the selector SEL 14, the QX input selection control circuit 21, and the issue standby register QX 20 are the functional blocks needed to implement the present invention. The same reference numerals denote the same items throughout the drawings. For convenience of explanation, the register interference check circuits, which are not directly related to the present invention, are omitted in this embodiment.

Implementing the present invention does not change the instruction issue control operation itself, so its description is omitted. The explanation here centers on the key point of the invention: how instruction information is set in the QX register 20.

First, the selection control circuit 19 generates, as in the prior art, a selection signal S that alternately selects the Q₁ register 12-1 and the Q₂ register 12-2. Based on this selection signal S and on a signal T from the Q₁, Q₂ input control circuit 11 that indicates into which of the Q₁ register 12-1 and the Q₂ register 12-2 the instruction information from the F register 10 has been input, the QX input selection control circuit 21 controls the selector SEL 14 as follows.

(a) When, one cycle before the selection control circuit 19 selects the Q₁ register 12-1, the Q₁, Q₂ input control circuit 11 inputs an instruction to the Q₁ register 12-1, the instruction information in the F register 10 is selected, so that the QX register 20 is set through the bypass route at the same time as the instruction is set in the Q₁ register 12-1.

(b) When, one cycle before the selection control circuit 19 selects the Q₁ register 12-1, the Q₁, Q₂ input control circuit 11 does not input an instruction to the Q₁ register 12-1, only the contents of the Q₁ register 12-1 are set in the QX register 20.

(c) When, one cycle before the selection control circuit 19 selects the Q₂ register 12-2, the Q₁, Q₂ input control circuit 11 inputs an instruction to the Q₂ register 12-2, the instruction information in the F register 10 is selected, so that the QX register 20 is set through the bypass route at the same time as the instruction is set in the Q₂ register 12-2.

(d) When, one cycle before the selection control circuit 19 selects the Q₂ register 12-2, the Q₁, Q₂ input control circuit 11 does not input an instruction to the Q₂ register 12-2, only the contents of the Q₂ register 12-2 are set in the QX register 20.

The timing one cycle before the Q₁ register 12-1 is selected, or one cycle before the Q₂ register 12-2 is selected, can be identified by taking the negation of the selection signal S.

The present invention is thus characterized in that the output of the QX register is made logically identical to the output of the selector SEL 14 and identical in timing as well.

[Effect of the invention]

As described in detail above, the instruction control system of the present invention applies to a data processing device having a plurality of arithmetic processing units, or a plurality of arithmetic processing units including one or more memory access processing units. An issue standby register QX that stores a copy of the instruction selected by the instruction input section is provided near the instruction issue control section, and the register is controlled so that it selects and stores the instruction whose issue conditions are to be judged in the cycle following each instruction's processing cycle. Because issue control is performed using only the contents of the issue standby register QX, the inter-block transfer from the instruction input section to the instruction issue section need not be taken into account, and instruction issue control can be sped up.

[Brief explanation of drawings]

FIG. 1 illustrates the concept of the present invention. FIG. 2 is a block diagram outlining one embodiment of the present invention. FIG. 3 shows an outline of a vector data processing device. FIG. 4 shows how a vector add instruction is processed in the adder AD. FIG. 5 shows normal processing of a vector instruction sequence. FIG. 6 shows instruction processing when a later vector instruction is executed ahead of an earlier one. FIG. 7 is a block diagram of one embodiment of the improved instruction control section. FIG. 8 is a time chart of the operation when an instruction is overtaken.

In the drawings, 1 is the main storage MS; 3 is an arithmetic processing unit; 4 is the store processing unit; 5 is the load processing unit; 6 is the vector registers; 7 is the multiplier MP; 8 is the adder AD; 9 is the instruction control section; 10 is the fetch register F; 11 is the Q₁, Q₂ input control circuit; 12-1 and 12-2 are the waiting registers Q₁ and Q₂; 13-1, 13-2, 18-1, and 18-2 are register interference check circuits; 14 is the selector SEL; 15 is the instruction issue control circuit; 16 is the multiply register MR; 17 is the add register AR; and VM 1,2,3, VA 4,5,1, and VA 7,8,9 are vector instructions.

Claims

[Claims]

1. An instruction control system for a data processing device having a plurality of arithmetic processing units, or a plurality of arithmetic processing units including one or more memory access processing units, the device comprising:

an instruction input section consisting of a plurality of waiting registers (Q₁, Q₂; 12-1, 12-2) that hold instruction information before execution of an operation, a Q₁, Q₂ input control circuit (11) that, on the basis of the plurality of waiting registers (Q₁, Q₂; 12-1, 12-2) and new instruction information, inputs the new instruction information into the waiting registers (Q₁, Q₂; 12-1, 12-2), and a selection circuit (SEL 14) and a selection control circuit (19) that select one item of instruction information from the group of instruction information held in the waiting registers (Q₁, Q₂; 12-1, 12-2); and

an instruction issue section consisting of a plurality of registers (MR, AR; 16, 17) that hold instruction information during execution of an operation, and comparison means (15) that compares the instruction information during execution of an operation with the instruction information in the waiting registers (Q₁, Q₂; 12-1, 12-2), the instruction issue section selecting one of the pre-execution instructions in the waiting registers (Q₁, Q₂; 12-1, 12-2), moving an instruction confirmed to have no factor preventing the start of its execution to an "empty" register among the plurality of registers (MR, AR; 16, 17) holding instruction information during execution, and activating the arithmetic processing unit corresponding to that instruction;

the instruction control system being characterized in that:

means is provided for inputting the instruction information (F 10) that is to be input to the waiting registers (Q₁, Q₂; 12-1, 12-2) to the selection circuit (SEL 14) of the instruction input section, bypassing those waiting registers;

an issue standby register (QX 20) is provided which receives the instruction information selected by the selection circuit (SEL 14), so that in every machine cycle it is set to the same instruction information as that of the selected one of the plurality of waiting registers (Q₁, Q₂; 12-1, 12-2); and

the instruction in the issue standby register (QX 20) is compared with the information on the instructions under execution and, when it is confirmed that no factor prevents the start of the operation, the instruction is moved to an "empty" register among the plurality of registers (MR, AR; 16, 17) holding information on the instructions under execution, and the arithmetic processing unit corresponding to the instruction is activated.