JP5013966B2 - Arithmetic processing unit

Info

Publication number: JP5013966B2
Application number: JP2007138047A
Authority: JP
Inventors: 輝明田中
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2007-05-24
Filing date: 2007-05-24
Publication date: 2012-08-29
Anticipated expiration: 2027-05-24
Also published as: JP2008293270A

Description

The present invention relates to an arithmetic processing device that executes control processing on a controller that drives equipment.

A hardware arithmetic unit (a dedicated arithmetic circuit or a processor) that executes control processing on a controller must operate at a low clock frequency because of heat generation and noise. It is therefore necessary to implement arithmetic circuits specialized for the processing content: several arithmetic circuits, each able to perform multiple operations in one clock cycle, are implemented and driven over multiple clock cycles to complete the desired computation as a whole.

When such a specialized arithmetic circuit is used, it becomes difficult to execute the processing if the processing content assumed at implementation time is later revised. For this reason, a conventional arithmetic processing device uses an arithmetic circuit whose processing content can be changed and, to improve arithmetic performance at a low operating frequency without raising that frequency, builds a fixed-connection data path from ALUs (arithmetic logic units), BRLs (barrel shifters), and multiplexers. Setting configuration data (opcode data) for the ALUs, BRLs, and multiplexers then configures an arithmetic data path specialized for the processing content (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2004-54335 (JP 2004-54335 A)

However, as an advance provision for the case where the processing content is revised, a conventional arithmetic processing device relies heavily on multiplexers that use selection signals to switch the inputs and outputs of arithmetic, logical, shift, and other operation circuits. As a result, it needs a longer delay time and more execution cycles than a dedicated arithmetic circuit specialized for the processing content, which does not assume any input/output switching for its operation circuits. There is also the problem that an operation cannot be executed at all if it involves operation circuits that were not connected in the arithmetic unit to begin with.
On the other hand, when a dedicated arithmetic circuit that does not assume switching of input/output paths is adopted, it is difficult to accommodate revisions to the processing content that arise after the circuit has been implemented.

The present invention has been made to solve the problems described above. Its object is to obtain an arithmetic processing device that enables high-speed execution of processing and prevents execution time from degrading when the processing content is revised.

The arithmetic processing device according to the present invention comprises: general arithmetic processing means that has operation means for performing arithmetic, logical, and shift operations and executes each operation when the instruction code defined for the corresponding operation means is issued; special arithmetic processing means that executes a specific arithmetic process in predetermined clock cycles; and special arithmetic control means that controls execution and suspension of the arithmetic process in the special arithmetic processing means, based on issuance of the instruction code defined for the specific arithmetic process. The number of clock cycles at which the special arithmetic processing means suspends is set to an arbitrary value by using a specific instruction code.

The arithmetic processing device of the present invention comprises general arithmetic processing means that has operation means for performing arithmetic, logical, and shift operations and executes each operation when the instruction code defined for the corresponding operation means is issued, special arithmetic processing means that executes a specific arithmetic process in predetermined clock cycles, and special arithmetic control means that controls execution and suspension of the arithmetic process in the special arithmetic processing means based on issuance of the instruction code defined for the specific arithmetic process. Because the number of clock cycles at which the special arithmetic processing means suspends is set to an arbitrary value by using a specific instruction code, the device enables high-speed execution of processing and prevents execution time from degrading when the processing content is revised.

Embodiment 1.
FIG. 1 is a block diagram showing an arithmetic processing device according to Embodiment 1 of the present invention. Before describing it, a general arithmetic processing device is described.

In a general microprocessor, the processor fetches data called instruction codes, interprets them, and performs the operation defined for each instruction code. The instruction codes reside in an instruction memory, and each one is assigned an address. Instruction codes are read in ascending address order (fetch), interpreted (decode), and the operation they denote is executed (execute). Instruction codes are also defined for reading data from a data memory (load) and writing data to it (store). When these are issued, the data memory is accessed and the result is written to a register or to the data memory.

FIG. 2 is a diagram showing the pipeline stages of a general microprocessor.
In a general microprocessor, the reading, interpretation, and execution of instruction codes are handled by independently operating functional blocks called "stages", and pipelining these stages allows processing to be executed efficiently. The microprocessor shown in FIG. 2 comprises an instruction fetch (IF) 1, an instruction decode (D) 2, an instruction execution (EX) 3, a memory access (MA) 4, a result store (S) 5, an instruction memory 6, a register block 7, and a data memory 8.

These functional blocks generally have the following functions.
Instruction memory 6: the memory in which instruction codes are stored.
Register block 7: a data storage area used temporarily during execution of operations.
Data memory 8: the memory that stores the input and output data of operations.
Instruction fetch 1: the stage that fetches an instruction from the instruction memory 6.
Instruction decode 2: the stage that decodes an instruction. For an arithmetic instruction, it checks whether the source and destination registers are available and reads them if so. If they are not available, it waits until they are.
Instruction execution 3: executes the operation.
Memory access 4: for a load or store instruction, accesses the data memory. For an arithmetic instruction, this stage does nothing.
Result store 5: stores the result and performs post-processing of the instruction.
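The way these five stages overlap can be illustrated with a small timing model. The following is a minimal sketch, assuming an ideal pipeline in which every stage takes exactly one cycle and no instruction ever stalls (the decode stage described above can in fact wait for registers, which this model ignores). The names `STAGES`, `pipeline_schedule`, and `total_cycles` are hypothetical and do not appear in the patent.

```python
# Idealized, stall-free model of the five-stage pipeline of FIG. 2.
# All names here are illustrative assumptions, not part of the patent.
STAGES = ["IF", "D", "EX", "MA", "S"]


def pipeline_schedule(n_instructions):
    """Return {cycle: {stage: instruction index}} for a stall-free pipeline."""
    schedule = {}
    for instr in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            # Instruction `instr` enters IF in cycle `instr` and advances
            # one stage per cycle.
            schedule.setdefault(instr + offset, {})[stage] = instr
    return schedule


def total_cycles(n_instructions):
    """Cycles needed until the last instruction leaves the S stage."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1


if __name__ == "__main__":
    for cycle, busy in sorted(pipeline_schedule(3).items()):
        print(cycle, busy)
```

Under these assumptions, once the pipeline is full all five stages work on different instructions in the same cycle, which is the parallel operation the text refers to.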

The instruction execution 3 (EX stage) contains the arithmetic circuits needed to execute operations. Examples of such circuits include:
Arithmetic operation circuits (arithmetic addition, subtraction, multiplication, division, and arithmetic shift)
Logical operation circuits (logical OR, logical AND, exclusive OR, and logical shift)
The register block 7 generally holds multiple registers. For the purposes of this description, 32 registers, r0 through r31, are assumed. Because these registers are used generally in computation, they are commonly called "general-purpose registers".
The microprocessor executes operations by running these blocks in parallel. The execution is described below using instruction codes.

FIG. 3 is an explanatory diagram of the instruction codes placed in the instruction memory 6.
FIG. 4 is an explanatory diagram of these instruction codes.
These are typical instruction codes used in MIPS (Microprocessor without Interlocked Pipeline Stages) processors. Each instruction code is stored in the instruction memory 6 at the location indicated by its instruction address. In FIG. 4, GPR stands for General Purpose Register.

A processor that interprets and executes the instruction codes shown in FIG. 3 proceeds as follows. First, the instruction code [LW r8, 0(r1)] at instruction address 0000 is read and executed. This instruction adds the offset 0 to the data memory address held in general-purpose register r1 to form an address in the data memory 8, and loads the 4 bytes of data stored at that address into general-purpose register r8.
Next, the instruction code [LW r9, 4(r1)] at address 0004 is executed, which loads the 4 bytes of data at address (r1 + 4) in the data memory 8 into general-purpose register r9.

Next, the instruction code [MUL r10, r8, r9] at instruction address 0008 is executed, and the arithmetic product of the values in general-purpose registers r8 and r9 is stored in general-purpose register r10.
Next, the instruction code [LW r11, 8(r1)] at instruction address 000c is executed, and the data at address (r1 + 8) in the data memory 8 is loaded into general-purpose register r11.
Next, [ADD r12, r10, r11] at instruction address 0010 is executed, and the arithmetic sum of the values in general-purpose registers r10 and r11 is stored in general-purpose register r12.

Next, [LW r13, 12(r1)] at instruction address 0014 is executed, and the data at address (r1 + 12) in the data memory 8 is loaded into general-purpose register r13.
Next, [SUB r14, r13, r12] at instruction address 0018 is executed. The arithmetic subtraction (r13 - r12) is performed on the values in general-purpose registers r13 and r12, and the result is stored in general-purpose register r14.
Next, [SW r14, 16(r1)] at instruction address 001c is executed.
Executing this instruction code stores the value of general-purpose register r14 in the area of the data memory 8 indicated by address (r1 + 16).
A general processor is thus configured to execute processing continuously in this manner: it performs operations according to the instruction codes in the instruction memory 6 and stores the results in the data memory 8.
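The instruction sequence walked through above can be traced with a short simulation. This is a minimal sketch, assuming the data memory is modeled as a dictionary keyed by byte address and that register values are unbounded Python integers (32-bit wraparound is not modeled). The function name `run_program` and the sample memory contents are hypothetical.

```python
# Illustrative trace of the eight instructions of FIG. 3.
# run_program, regs, and mem are hypothetical names; memory is a dict
# keyed by byte address, and 32-bit overflow is deliberately not modeled.

def run_program(regs, mem):
    """Execute the FIG. 3 instruction codes in ascending address order."""
    program = [
        ("LW", 8, 0, 1),      # 0000: r8  = mem[r1 + 0]
        ("LW", 9, 4, 1),      # 0004: r9  = mem[r1 + 4]
        ("MUL", 10, 8, 9),    # 0008: r10 = r8 * r9
        ("LW", 11, 8, 1),     # 000c: r11 = mem[r1 + 8]
        ("ADD", 12, 10, 11),  # 0010: r12 = r10 + r11
        ("LW", 13, 12, 1),    # 0014: r13 = mem[r1 + 12]
        ("SUB", 14, 13, 12),  # 0018: r14 = r13 - r12
        ("SW", 14, 16, 1),    # 001c: mem[r1 + 16] = r14
    ]
    for op, x, y, z in program:
        if op == "LW":        # x = target register, y = offset, z = base register
            regs[x] = mem[regs[z] + y]
        elif op == "SW":      # x = source register, y = offset, z = base register
            mem[regs[z] + y] = regs[x]
        elif op == "MUL":
            regs[x] = regs[y] * regs[z]
        elif op == "ADD":
            regs[x] = regs[y] + regs[z]
        elif op == "SUB":
            regs[x] = regs[y] - regs[z]
    return regs, mem


if __name__ == "__main__":
    regs = [0] * 32
    regs[1] = 0x100  # assumed base address of the input data
    mem = {0x100: 3, 0x104: 4, 0x108: 5, 0x10C: 20}
    run_program(regs, mem)
    print(regs[14], mem[0x110])
```

With the sample values, the program computes 20 - (3 × 4 + 5) and stores the result at address (r1 + 16).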

The arithmetic circuits implemented in the instruction execution 3 (EX stage) are generally designed so that each completes within one cycle of the operating clock that serves as the processor's timing reference. An operation whose delay exceeds one cycle is implemented as a circuit that executes over multiple cycles.
For example, in one processor, among the circuits that perform 64-bit floating-point arithmetic under the IEEE 754 standard, the multiplier (64-bit floating-point multiplication) outputs its result in 5 cycles, and the adder (64-bit floating-point addition) outputs its result in 4 cycles.

For example, consider an operation that performs a 64-bit floating-point multiplication followed by a 64-bit floating-point addition:
D = A × B + C
Using the multiplier and adder above, the result is obtained in 5 cycles + 4 cycles, 9 cycles in total. Adding to the result of a multiplication in this way is common; it is called a "product-sum operation" and is used frequently. Many processors implement an arithmetic circuit dedicated to it. In the processor whose cycle counts are cited above, for example, the product-sum circuit delivers its result in 5 cycles.
Thus, when a specific process must run at high speed, developing an arithmetic circuit dedicated to that operation and implementing it in the processor completes the operation in fewer cycles. For this reason, processors that mainly execute a specific process are often equipped with arithmetic circuits specialized for that process, so that it runs as fast as possible (in as few execution cycles as possible).

As an example, suppose the following set of operations is to be executed at high speed as a specific process.
List 1:
d = (a × b) + c
h = ((d > e) ? e : ((d < f) ? f : d)) >> g
k = (h + i) >> j
In that case, an arithmetic circuit that directly implements this entire data flow is considered. Dividing the delay time of that circuit by the clock period yields a circuit that operates over multiple cycles, which is then implemented on the processor.
In the above process:
× is arithmetic multiplication
+ is arithmetic addition
(A > B) is a comparison (true if A is greater than B, false otherwise)
(C < D) is a comparison (true if C is less than D, false otherwise)
((X) ? Y : Z) is a selection (returns Y if X is true, Z if X is false)
>> is an arithmetic right shift

FIG. 5 is an explanatory diagram showing the data flow that represents this computation.
In the figure, the blocks labeled a through j represent the input data needed to execute the computation, and the block labeled k represents the output data of the result. Each circle in the figure represents a circuit that performs the operation written inside it. The blocks labeled SEL are selectors, which choose between data values. A SEL block takes a truth value at the input labeled "B"; if that value is true (1), it outputs the value at the input labeled "T", and if it is false (0), it outputs the value at the input labeled "F". This corresponds to the selection operation in the process above.
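The data flow of List 1, including the behavior of the SEL blocks, can be written out as a short function. This is a minimal sketch, assuming integer inputs; Python's `>>` on integers is an arithmetic right shift, but fixed register width is not modeled. The names `select` and `list1` are hypothetical.

```python
# Illustrative model of the List 1 data flow shown in FIG. 5.
# select and list1 are hypothetical names, not taken from the patent.

def select(b, t, f):
    """SEL block: output the T input when B is true, otherwise the F input."""
    return t if b else f


def list1(a, b, c, e, f, g, i, j):
    """Compute k from the inputs of List 1 (d and h are intermediates)."""
    d = (a * b) + c
    # Clamp d to the upper bound e and the lower bound f, then shift right by g.
    h = select(d > e, e, select(d < f, f, d)) >> g
    k = (h + i) >> j
    return k


if __name__ == "__main__":
    print(list1(3, 4, 5, 100, 0, 1, 4, 2))   # d within [f, e]
    print(list1(50, 4, 0, 100, 0, 1, 4, 2))  # d above e, clamped to e
```

Written this way, the nested selection reads as a clamp of d between f and e before the shift, which is one plausible interpretation of the selectors in the figure.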

When the data flow of List 1 is executed with instruction codes for general operations, the full set of instruction codes required is as shown in FIG. 6. Each instruction code in FIG. 6 is as described in FIG. 4. In FIG. 6, operands of the LW and SW instructions such as [b] and [a] denote the data memory addresses at which data b and a are stored, respectively. General-purpose register r0 is assumed to hold the value 0. Executing the process with general instruction codes in this way takes 19 steps, which costs considerable computation time (execution cycles). The present invention therefore provides a circuit that performs several of these steps at once and uses it to execute the computation.

FIG. 7 is an explanatory diagram showing an example of dividing the data flow shown in FIG. 5.
Specifically, FIG. 7 shows that the delay time of the arithmetic circuit for the data flow of FIG. 5 is divided by the clock period (the reciprocal of the clock frequency) to form a special arithmetic circuit that operates over multiple cycles. In FIG. 7, sub-data flow 71, sub-data flow 72, and sub-data flow 73 are the circuit portions that result from this division: dividing the delay time of the original data flow by the clock period yields three arithmetic circuits.

Installing such a special processing circuit in the processor allows the process to complete in far less time than executing the same process with general instruction codes.
However, once an arithmetic circuit specialized for the processing content has been implemented in the processor, a revision to the processing content means the same circuit can no longer execute the process. If the implemented circuit becomes completely unusable, the process must be executed with the general instruction codes described earlier, and the processing time grows accordingly.
One way to prevent this is to implement the circuit with future revisions in mind. But it is hard to predict what revisions will occur, so it is hard to decide what circuit to build, and making the circuit more general lengthens its delay and degrades its computation time. The larger circuit also raises the cost of the processor.

To avoid these problems, the present invention implements application-specific special arithmetic processing means that executes over multiple cycles, and gives the special arithmetic control means that controls it a function for stopping the computation at an arbitrary, specified execution cycle. General processor instruction codes are then issued to modify the intermediate result of the special arithmetic processing means, and a function is defined for instructing the special arithmetic control means to resume execution of the special arithmetic processing means. FIG. 1 shows the overall configuration, which is described below with reference to that figure.
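The pause, patch, and resume idea can be sketched in software. The following is a minimal sketch under several assumptions that are not taken from the patent: the special circuit is modeled with one List 1 statement per cycle, intermediate values live in a dictionary standing in for registers, and the class `SpecialUnit` with its `set_pause` and `run` methods is a hypothetical name for the control function.

```python
# Illustrative model of pausing a multi-cycle special circuit, patching an
# intermediate value with ordinary code, and resuming. All names and the
# one-statement-per-cycle split are assumptions made for this sketch.

class SpecialUnit:
    def __init__(self):
        self.pause_cycle = None  # pause after this many cycles (None = never)
        self.exec_cycle = 0      # number of cycles completed so far
        self.stages = [self._stage1, self._stage2, self._stage3]

    @staticmethod
    def _stage1(v):
        v["d"] = v["a"] * v["b"] + v["c"]

    @staticmethod
    def _stage2(v):
        d = v["d"]
        clamped = v["e"] if d > v["e"] else (v["f"] if d < v["f"] else d)
        v["h"] = clamped >> v["g"]

    @staticmethod
    def _stage3(v):
        v["k"] = (v["h"] + v["i"]) >> v["j"]

    def set_pause(self, cycles):
        """Request a pause once `cycles` cycles have completed."""
        self.pause_cycle = cycles

    def run(self, v):
        """Run until the pause point or completion; return True when finished."""
        while self.exec_cycle < len(self.stages):
            self.stages[self.exec_cycle](v)
            self.exec_cycle += 1
            if self.exec_cycle == self.pause_cycle:
                self.pause_cycle = None  # the pause request is consumed
                return False
        self.exec_cycle = 0
        return True


if __name__ == "__main__":
    v = dict(a=3, b=4, c=5, e=100, f=0, g=1, i=4, j=1)
    unit = SpecialUnit()
    unit.set_pause(1)
    unit.run(v)     # stops after the first cycle with d computed
    v["d"] += 10    # stand-in for general instructions applying a revision
    unit.run(v)     # resumes with the remaining cycles
    print(v["k"])
```

In this sketch a revision that changes only the first step costs one extra ordinary operation between two runs of the special circuit, rather than a full fallback to general instruction codes.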

In FIG. 1, the special arithmetic circuit 9 is an arithmetic circuit that executes a specific arithmetic process, and comprises special arithmetic control means 9a and special arithmetic processing means 9b. The special arithmetic control means 9a controls the execution and stopping of the special arithmetic processing means 9b, and the special arithmetic processing means 9b executes the specific arithmetic process. Both are described in detail later. Dedicated communication paths 10, 11, and 12 are provided between the register block 7 and the special arithmetic circuit 9, between the data memory 8 and the special arithmetic circuit 9, and between the instruction decode (D) 2 and the special arithmetic circuit 9, respectively.
The instruction fetch (IF) 1 through the result store (S) 5 are the same as the functional blocks of the general microprocessor shown in FIG. 2, and together these functional blocks constitute the general arithmetic means.

FIG. 8 is a block diagram showing the connections to the special arithmetic circuit 9 in detail.
In the figure, communication path 101a carries a request signal to the register block 7 together with an index signal that designates a register, and communication path 101b carries the value read from the register designated via communication path 101a to the special arithmetic circuit 9. Communication paths 102a and 102b, and communication paths 103a and 103b, are paired in the same way as 101a and 101b, so that the special arithmetic circuit 9 can read three register values from the register block 7 simultaneously. Although the illustrated example provides three sets of communication paths for reading data from the register block 7, the number is not limited to three.

Communication path 104a carries a register write request signal to the register block 7 together with an index signal that designates a register, and communication path 104b carries the value to be written to the register designated via communication path 104a from the special arithmetic circuit 9 to the register block 7. Although the illustrated example provides one set of communication paths for writing data to the register block 7, the number is not limited to one.

Communication path 105a carries a data read request signal to the data memory 8 together with an address signal that designates an address in the data memory 8, and communication path 105b carries the value read from the address designated via communication path 105a to the special arithmetic circuit 9. Communication paths 106a and 106b, and communication paths 107a and 107b, are paired in the same way as communication paths 101a and 101b, so that the special arithmetic circuit 9 can read three data values from the data memory 8 simultaneously. Although the illustrated example provides three sets of communication paths for reading data from the data memory 8, the number is not limited to three.

Communication path 108a carries a data write request signal to the data memory 8 together with an address signal that designates where the data is written, and communication path 108b carries the value to be written to the address designated via communication path 108a from the special arithmetic circuit 9 to the data memory 8. Although the illustrated example provides one set of communication paths for writing data to the data memory 8, the number is not limited to one.

Further, communication path 109a carries the start signal sent from the instruction decode 2 to the special arithmetic circuit 9, communication path 109b carries the pause cycle count sent from the instruction decode 2 to the special arithmetic circuit 9, and communication path 109c carries the operation-in-progress signal sent from the special arithmetic circuit 9 to the instruction decode 2.

FIG. 9 is a block diagram showing the special arithmetic circuit 9 in detail.
The special arithmetic control means 9a is connected to the instruction decode 2 through communication paths 109a to 109c, to the register block 7 through communication paths 101a to 104a, and to the data memory 8 through communication paths 105a to 108a. The special arithmetic control means 9a also includes a pause cycle holding unit 901 and an execution cycle holding unit 902. The pause cycle holding unit 901 holds the pause cycle count designated by the instruction decode 2 via communication path 109b. The execution cycle holding unit 902 is a functional block that holds which cycle is currently being executed while the special arithmetic processing means 9b is running.

In the special arithmetic circuit 9, the special arithmetic processing means 9b consists of special arithmetic sub-circuits 911, 912, and 913. These sub-circuits realize the functions of sub-data flows 71, 72, and 73 in FIG. 7, respectively, and each is connected to communication paths 101b to 104b to the register block 7 and to communication paths 105b to 108b to the data memory 8. The special arithmetic sub-circuits 911, 912, and 913 are connected to the special arithmetic control means 9a by communication paths 110, 111, and 112, respectively, and receive their operation execution signals through these paths.

Next, the operation of the special arithmetic circuit 9 configured as described above is explained.
The special arithmetic control means 9a performs two independent operations. The first is to hold, in the pause cycle holding unit 901, the number of pause cycles designated by the instruction decode 2 via communication path 109b. The second is to respond to a start request from the instruction decode 2 on communication path 109a by operating the special arithmetic sub-circuits 911, 912, and 913 and driving the various signals needed for that operation. These operations are described below with reference to the flowchart shown in FIG. 10.

In this embodiment, each of the special arithmetic sub-circuits 911, 912, and 913 is assumed to be hard-wired, for each operation, as to which of the input-value communication paths from the register block 7 and the data memory 8 (communication paths 101b to 103b and 105b to 107b) it takes values from, and to which of the output-value communication paths to the register block 7 and the data memory 8 (communication paths 104b and 108b) it outputs values.
Likewise, the special arithmetic control means 9a is assumed to be hard-wired with the order in which the special arithmetic sub-circuits 911, 912, and 913 are executed. It is further assumed to be hard-wired with the index that designates each value to be fetched from the register block 7, and the address that designates each value to be fetched from the data memory 8, when those sub-circuits are executed.

In the flowchart of FIG. 10, when processing of the special arithmetic circuit 9 starts, the special arithmetic control means 9a first sets the cycle count of the execution cycle holding unit 902 to 0 (step ST1). In step ST2 it then waits until the operation start signal on communication path 109a turns ON. When the operation start signal turns ON, 1 is added to the cycle count of the execution cycle holding unit 902 (step ST3). Next, the value of the execution cycle holding unit 902 is compared with the value of the pause cycle holding unit 901 (step ST4). If the two are equal, the operation is paused (step ST5), and steps ST4 and ST5 are repeated until the value of the pause cycle holding unit 901 is changed.

On the other hand, if the value of the execution cycle holding unit 902 and the value of the pause cycle holding unit 901 differ in step ST4 (for example, because the value of the pause cycle holding unit 901 has been changed), processing moves to step ST6. Step ST6 determines whether the cycle count at which the operation completes has been reached. If it has, the pause cycle holding unit 901 is set to 0 in step ST7 and processing returns to step ST1. If it has not, processing moves to step ST8, the arithmetic circuit corresponding to the current execution cycle is executed, and processing returns to step ST3.
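The ST1 to ST8 loop can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit itself: the class and method names are hypothetical, the completion test in ST6 is assumed to be "the cycle counter has passed the last step", and a change to the pause value is assumed to re-evaluate ST4 immediately.

```python
class SpecialOpControl:
    """Illustrative model of the ST1-ST8 control loop (names are hypothetical)."""

    def __init__(self, steps):
        self.steps = steps        # steps[n-1] is the action for execution cycle n
        self.pause_cycle = 0      # pause cycle holding unit 901
        self.exec_cycle = 0       # execution cycle holding unit 902 (ST1)
        self.paused = False
        self.busy = False         # in-progress signal on communication path 109c

    def set_pause(self, n):
        """Store a new pause cycle; if currently paused, re-evaluate ST4."""
        self.pause_cycle = n
        return self._advance() if self.paused else "idle"

    def start(self):
        """Start signal on communication path 109a turns ON (leaves ST2)."""
        return self._advance()

    def _advance(self):
        self.busy = True
        while True:
            if not self.paused:
                self.exec_cycle += 1                  # ST3
            if self.exec_cycle == self.pause_cycle:   # ST4
                self.paused = True                    # ST5
                self.busy = False
                return "paused"
            self.paused = False
            if self.exec_cycle > len(self.steps):     # ST6 (assumed completion test)
                self.pause_cycle = 0                  # ST7
                self.exec_cycle = 0                   # back to ST1
                self.busy = False
                return "done"
            self.steps[self.exec_cycle - 1]()         # ST8


log = []
ctl = SpecialOpControl([lambda n=n: log.append(n) for n in (1, 2, 3, 4)])
ctl.set_pause(3)
first = ctl.start()          # executes cycles 1 and 2, then pauses at ST5
after_pause = list(log)
second = ctl.set_pause(0)    # ST4 no longer matches, so cycles 3 and 4 run
```

Because the counter is incremented in ST3 before the comparison in ST4, a pause value of 0 can never match, which is why setting the pause value to 0 lets the remaining cycles run to completion in this model.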

Next, the processing executed in each operation cycle is described in detail.
FIG. 11 is an explanatory diagram showing the state transitions of the operation cycles.
First, in cycle 1, the first cycle, the following are each issued:
on input communication path 105a from the data memory 8, the request and address signals needed to obtain the value of data "a";
on input communication path 106a from the data memory 8, the request and address signals needed to obtain the value of data "b";
on input communication path 107a from the data memory 8, the request and address signals needed to obtain the value of data "c".
It is assumed that each data value can be obtained from the data memory 8 in the following cycle.

In cycle 2, the values of "a", "b", and "c" are available, so the special arithmetic sub-circuit 911 executes its operation (the signal on communication path 110 is turned ON) and outputs the resulting value "d" to the register block 7 via communication path 104b. At the same time, the request signals and addresses for obtaining the values of "e", "f", and "g", which are needed in cycle 3, are issued to the data memory 8 via communication paths 105a to 107a.

In cycle 3, the values of "e", "f", and "g" and the value of "d" are available, so the special arithmetic sub-circuit 912 executes its operation (the signal on communication path 111 is turned ON) and outputs the resulting value "h" to the register block 7. At the same time, the request signals and addresses for obtaining the values of "i" and "j", which are needed in cycle 4, are issued to the data memory 8.

In cycle 4, the values of "i" and "j" and the value of "h" are available, so the special arithmetic sub-circuit 913 executes its operation (the signal on communication path 112 is turned ON) and outputs the resulting value "k" to the data memory 8 via communication path 108b.
In this way, the special arithmetic control means 9a controls the special arithmetic processing means 9b. While the special arithmetic processing means 9b is executing an operation, that is, when it is neither waiting for the start of an operation nor paused, the special-arithmetic-circuit-in-progress signal on communication path 109c is ON to indicate that an operation is in progress.
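The four-cycle schedule can be traced with a small sketch. It assumes that List 1 computes d = (a * b) + c, then h as d clamped between f and e and right-shifted by g, then k = (h + i) >> j; this form is inferred from the description of the sub-circuits and should be checked against List 1 itself. The function names and the dictionary-based memory and register models are illustrative only.

```python
def sub_911(a, b, c):
    return a * b + c                                  # produces d

def sub_912(d, e, f, g):
    clamped = e if d > e else (f if d < f else d)
    return clamped >> g                               # produces h (right shift assumed)

def sub_913(h, i, j):
    return (h + i) >> j                               # produces k

def run_cycles(mem):
    """Trace the schedule; returns (registers, memory after write-back, trace)."""
    mem = dict(mem)
    regs, trace = {}, []
    # Cycle 1: only read requests for a, b, c are issued; data arrives next cycle.
    fetched = {n: mem[n] for n in "abc"}
    trace.append((1, None))
    # Cycle 2: sub-circuit 911 runs; requests for e, f, g are issued.
    regs["d"] = sub_911(fetched["a"], fetched["b"], fetched["c"])
    fetched = {n: mem[n] for n in "efg"}
    trace.append((2, "d"))
    # Cycle 3: sub-circuit 912 runs; requests for i, j are issued.
    regs["h"] = sub_912(regs["d"], fetched["e"], fetched["f"], fetched["g"])
    fetched = {n: mem[n] for n in "ij"}
    trace.append((3, "h"))
    # Cycle 4: sub-circuit 913 runs and k is written back to data memory.
    mem["k"] = sub_913(regs["h"], fetched["i"], fetched["j"])
    trace.append((4, "k"))
    return regs, mem, trace


regs, mem_out, trace = run_cycles(dict(a=3, b=4, c=2, e=10, f=0, g=1, i=3, j=1))
```

Each cycle overlaps the computation of one result with the read requests for the next cycle's operands, which is what lets the three dependent operations finish in four cycles in this model.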

Next, how the operation of the special arithmetic circuit 9 is controlled, as seen from the instruction decode 2, is described, including the case where the processing content (List 1) has been modified.
To have the special arithmetic circuit 9 perform the ordinary processing corresponding to List 1, a corresponding instruction code, for example
SPMCI
is defined, and the processing is performed by issuing this instruction code.
Like an ordinary instruction code, this instruction code is held in the instruction memory 6. It is fetched by the instruction fetch 1 and decoded by the instruction decode 2, which interprets it as an operation execution request for the special arithmetic circuit 9 and issues a start signal to the special arithmetic circuit 9 via communication path 109a. The special arithmetic circuit 9 then starts execution.

As noted above, while the special arithmetic circuit 9 is executing an operation, the special-arithmetic-circuit-in-progress signal on communication path 109c is ON, and the instruction decode 2 remains stalled during that time.
Now suppose that the processing content shown in List 1 is to be modified as follows and executed.
List 2:
d = (a * b) + c
h = ((d > e) ? e : ((d < f) ? f : d)) << X
k = (h + i) >> j

Compared with List 1, this modification changes the arithmetic shift used to obtain h from a right shift by the value g to a left shift by the value X.
The special arithmetic circuit 9 was not originally configured to execute the processing of List 2, so the processing written in List 2 cannot be executed on the arithmetic circuit as it stands.
However, the modified processing can be rewritten as
d = (a * b) + c
h = ((d > e) ? e : ((d < f) ? f : d)) << 0
h = h << X
k = (h + i) >> j
That is, the modified processing can be executed by setting the value g to 0, arithmetically left-shifting the resulting h by the value X, writing the result back to h, and then executing the operation that derives k. Concretely, the value of "g" in the data memory 8 is set to 0, the special arithmetic circuit 9 is stopped at the third cycle (after the special arithmetic sub-circuit 912 has executed), the ordinary arithmetic left shift instruction (SLA) is issued, and the special arithmetic circuit 9 is then resumed (executing its fourth cycle, that is, the special arithmetic sub-circuit 913).
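The equivalence between List 2 and the paused sequence can be checked numerically. The sketch below is a software model only: the function names are hypothetical, and the fixed circuit is assumed to apply a right shift by g when computing h, so that g = 0 turns the shift into a no-op (a shift by 0 leaves the value unchanged in either direction).

```python
def clamp(d, e, f):
    # (d > e) ? e : ((d < f) ? f : d)
    return e if d > e else (f if d < f else d)

def list2_direct(a, b, c, e, f, X, i, j):
    """List 2 evaluated as written."""
    d = a * b + c
    h = clamp(d, e, f) << X
    return (h + i) >> j

def list2_via_fixed_circuit(a, b, c, e, f, X, i, j):
    """List 2 obtained from the unmodified circuit plus one ordinary shift."""
    g = 0                          # "g" in data memory is set to 0 beforehand
    d = a * b + c                  # sub-circuit 911
    h = clamp(d, e, f) >> g        # sub-circuit 912: shift by 0 changes nothing
    # -- special arithmetic circuit paused here --
    h = h << X                     # ordinary arithmetic left shift (SLA)
    # -- special arithmetic circuit resumed --
    return (h + i) >> j            # sub-circuit 913


cases = [
    (3, 4, 2, 10, 0, 2, 3, 1),     # d above the upper bound e
    (1, 1, 1, 10, 5, 3, 0, 2),     # d below the lower bound f
    (2, 3, 1, 10, 0, 1, 4, 0),     # d within bounds
]
results = [(list2_direct(*c), list2_via_fixed_circuit(*c)) for c in cases]
```

Only one extra ordinary instruction sits between the two halves of the fixed sequence, which is the reason the modified processing costs little additional time.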

To execute such processing, an instruction code that specifies the number of pause cycles for the special arithmetic circuit 9 is defined, for example
MCSTP n
Here n is immediate data, and the instruction directs the special arithmetic circuit 9 to pause processing at the n-th cycle.

This instruction code is interpreted by the instruction decode 2, and the specified value is set in the pause cycle holding unit 901 of the special arithmetic circuit 9 via communication path 109b. Issuing the SPMCI instruction code described above after this instruction code produces the desired operation. This is explained using the instruction code sequence of FIG. 12. Here it is assumed that "h", the result data after the end of cycle 2 of the special arithmetic circuit 9, is held in register r5 in the register block 7.

FIG. 12 is an explanatory diagram of an instruction code sequence in the arithmetic processing unit of this embodiment.
Ordinary execution of the special arithmetic processing by the SPMCI instruction code is performed by issuing the instruction at instruction address 0104. The "..." at instruction addresses 0100, 0108, and 0124 indicates the issuance of arbitrary ordinary instruction codes.

To execute the modified special processing, the instruction codes are written as shown from instruction address 010c onward.
First, the instruction code at instruction address 010c specifies 3 as the number of pause cycles for the special arithmetic circuit 9.
Next, the instruction code at instruction address 0110 loads the value of "X" into register r3.
Next, the instruction code at instruction address 0114 sets "g" to 0. Here it is assumed that the value of register r0 is 0.
Then the SPMCI instruction at instruction address 0118 starts the special arithmetic circuit 9. Because it has been set to stop at pause cycle count 3, the special arithmetic circuit 9 stops before issuing the processing of the third cycle, and the special-arithmetic-circuit-in-progress signal on communication path 109c changes from ON to OFF.

When the special-arithmetic-circuit-in-progress signal is OFF, the instruction decode 2 can issue the next instruction code, so the instruction at instruction address 011c is issued. As described above, the operation result "h" in List 2 is held in register r5, so that value is shifted left by "X" bits, where "X" is the value of register r3.
Next, the instruction code MCSTP 0 at instruction address 0120 is issued. As a result, step ST4 in the flowchart of FIG. 10 does not become YES until the special arithmetic circuit 9 completes its processing, so the remaining processing of the special arithmetic circuit 9 is executed.
After execution completes, the special-arithmetic-circuit-in-progress signal changes from ON to OFF, and the instruction codes from instruction address 0124 onward are executed.

Thus, in this embodiment, operations specialized for the processing content can be executed at high speed by the dedicated special arithmetic circuit 9. In addition, even when the processing content is modified, the modified processing can be executed using the special arithmetic circuit 9 as implemented, without a large increase in processing time.
Although this embodiment has been described using the specific processing contents of List 1 and List 2, the processing content implemented as the special arithmetic circuit is not limited to List 1 and List 2.
Also, although the special arithmetic processing means 9b in this embodiment consists of three special arithmetic sub-circuits 911, 912, and 913, the number is not limited to three and may be any number of one or more.

Furthermore, although this embodiment has described a single specific processing content executed by the special arithmetic circuit 9, the processing content implemented on the special arithmetic circuit 9 may combine a plurality of independent processing contents. In that case, an operand (argument) is given to the instruction code "SPMCI" that directs the special arithmetic circuit 9 to start execution, for example
SPMCI m
where the argument m is an ID number that selects one of the plurality of independent processing contents. Changing the value of m then selects any one of the processing contents. In that case, the special arithmetic control means 9a can be shared among them.
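Selection by the operand m can be pictured as a lookup keyed by ID. The sketch below is hypothetical throughout: the ID values, the two example processing contents, and the function names are illustrative and do not come from the patent.

```python
def make_spmci(processes):
    """Build a single start entry point that selects a process by ID m."""
    def spmci(m, *operands):
        if m not in processes:
            raise ValueError(f"no processing content registered for ID {m}")
        return processes[m](*operands)
    return spmci


# Two made-up processing contents sharing one start instruction.
spmci = make_spmci({
    0: lambda a, b, c: a * b + c,
    1: lambda h, i, j: (h + i) >> j,
})
r0 = spmci(0, 3, 4, 2)
r1 = spmci(1, 5, 3, 1)
```

One entry point with an ID operand mirrors the stated benefit: adding another processing content adds a table entry rather than a new instruction code.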

As described above, the arithmetic processing unit of Embodiment 1 includes: general arithmetic processing means that has arithmetic means for performing arithmetic, logical, and shift operations and that executes each operation when the instruction code defined for the corresponding arithmetic means is issued; special arithmetic processing means that executes specific arithmetic processing in a predetermined number of clock cycles; and special arithmetic control means that controls execution and pausing of the arithmetic processing in the special arithmetic processing means based on the issuance of instruction codes defined for the specific arithmetic processing. Processing can therefore be executed at high speed, and when a modification arises, the operation result can be corrected partially, which avoids a long execution time for the processing after its content has been modified.

Also, in the arithmetic processing unit of Embodiment 1, the clock cycle at which the special arithmetic processing means pauses is set to an arbitrary value using a specific instruction code. Regardless of how the special arithmetic processing means is implemented, the result after any cycle of the special arithmetic processing means can therefore be corrected with general instructions. This allows the processing content of the special arithmetic processing means and the special arithmetic control means to be implemented independently, which simplifies implementation and development of the arithmetic processing unit. In addition, a single pause instruction code suffices for any processing content of the special arithmetic processing means, so no extra load is placed on instruction decoding and the instruction code size is not increased unnecessarily.

Also, in the arithmetic processing unit of Embodiment 1, the special arithmetic processing means is configured to execute a plurality of independent specific arithmetic processes, and the selection among them is designated by the argument of the instruction code that specifies the start of the specific arithmetic processing. It is therefore unnecessary to add an instruction code for each independent processing content, and the instruction code size is not increased unnecessarily.

FIG. 1 is a block diagram showing the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 2 is an explanatory diagram of the pipeline stages of a general microprocessor.
FIG. 3 is an explanatory diagram of instruction codes in the instruction memory of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 4 is an explanatory diagram of the instruction codes of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 5 is an explanatory diagram showing the data flow of the processing content of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 6 is an explanatory diagram expressing the processing content of the arithmetic processing unit according to Embodiment 1 of the present invention in ordinary instruction codes.
FIG. 7 is an explanatory diagram showing the data flow of the processing content of the arithmetic processing unit according to Embodiment 1 of the present invention divided by cycle.
FIG. 8 is a block diagram showing details of the connections to the special arithmetic circuit of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 9 is a block diagram showing details of the special arithmetic circuit of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 10 is a flowchart showing the operation of the special arithmetic circuit of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 11 is an explanatory diagram showing the state transitions of the operation cycles of the special arithmetic circuit of the arithmetic processing unit according to Embodiment 1 of the present invention.
FIG. 12 is an explanatory diagram of an instruction code sequence for the special arithmetic circuit of the arithmetic processing unit according to Embodiment 1 of the present invention.

Explanation of symbols

1: instruction fetch, 2: instruction decode, 3: instruction execution, 4: memory access, 5: result store, 6: instruction memory, 7: register block, 8: data memory, 9: special arithmetic circuit, 9a: special arithmetic control means, 9b: special arithmetic processing means.

Claims

An arithmetic processing unit comprising:
general arithmetic processing means having arithmetic means for performing arithmetic, logical, and shift operations, and executing each operation upon issuance of an instruction code defined for the corresponding arithmetic means;
special arithmetic processing means for executing specific arithmetic processing in a predetermined number of clock cycles; and
special arithmetic control means for controlling execution and pausing of the arithmetic processing in the special arithmetic processing means based on the issuance of an instruction code defined for the specific arithmetic processing,
wherein the clock cycle at which the special arithmetic processing means pauses is set to an arbitrary value using a specific instruction code.

The arithmetic processing unit according to claim 1, wherein the special arithmetic processing means is configured to be able to execute a plurality of independent specific arithmetic processes, and
selection among the plurality of independent specific arithmetic processes is designated by an argument of the instruction code that specifies the start of the specific arithmetic processing.