JPH0326410B2

Info

Publication number: JPH0326410B2
Application number: JP59000753A
Authority: JP
Inventors: Toshio Yagihashi
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-01-09
Filing date: 1984-01-09
Publication date: 1991-04-10
Also published as: JPS60144827A

Description

[Detailed description of the invention]

[Technical field to which the invention pertains] The present invention relates to a pipelined multiplication circuit in a data processing device.

[Prior art] Conventionally, a pipelined floating-point multiplication circuit of this type, as shown in the block diagram of FIG. 1, consists of a multiplicand register 1 that stores the mantissa of the multiplicand, a multiplier register 2 that stores the mantissa of the multiplier, a partial product group generation circuit 3 that receives the multiplicand and the multiplier and generates multiples of the multiplicand, a multi-input carry-save adder 7 that adds the outputs of the partial product group generation circuit 3 and outputs a final sum and a final carry, a final sum register 8 and a final carry register 9 that store the final sum and the final carry, a carry look-ahead adder 10 that adds the two outputs of the final sum register 8 and the final carry register 9, an operation result register 11, and a control circuit 14. In addition, as check circuits, it has generally included modulo-3 residue generation circuits 4 and 5 for input operands 1 and 2, an input-operand-product modulo-3 residue generation circuit 6 that multiplies the outputs of those modulo-3 residue generation circuits, a modulo-3 residue generation circuit 12 for the operation result, and a comparator 13 that compares the output of the input-operand-product modulo-3 residue generation circuit 6 with the output of the operation-result modulo-3 residue generation circuit 12.

Vector floating-point multiplication in a vector processor requires high-speed processing: one element must be multiplied per machine cycle. For example, when the floating-point mantissa is 56 bits and the multiplication is implemented with a 2-bit recode multiplier, 29 partial products are required, as shown in the block diagram of FIG. 2. Adding these 29 multiples with the multi-input carry-save adder 7, as in the specific configuration example of FIG. 3, requires 27 three-input adders. The carry look-ahead adder 10 also needs a data width of 112 bits.
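The figures quoted for the conventional circuit (29 partial products, 27 three-input adders, a 112-bit final adder) and the modulo-3 check of FIG. 1 can be reproduced with a small sketch. This is an illustration only: the function names are hypothetical, the "one extra recoded digit for an unsigned mantissa" explanation of the count 29 is an assumption, and the check is modeled on plain integers rather than on the registers of the patent.

```python
def recoded_partial_products(mantissa_bits: int) -> int:
    """Number of partial products for 2-bit recoding of an unsigned mantissa."""
    # Two multiplier bits per recoded digit, plus one extra digit for an
    # unsigned operand (assumed reason the text gives 29 rather than 28).
    return mantissa_bits // 2 + 1


def carry_save_adders(operands: int) -> int:
    """Number of 3-input carry-save adders needed to reduce operands to two."""
    # Each 3-input adder replaces three operands with two (sum and carry),
    # so every adder lowers the operand count by exactly one.
    return max(operands - 2, 0)


def mod3_check(a: int, b: int, result: int) -> bool:
    """Conventional check: residue of the product equals product of residues."""
    return ((a % 3) * (b % 3)) % 3 == result % 3


M = 56                   # mantissa width used in the text
a = 0xF0E1D2C3B4A596     # arbitrary 56-bit test operands (not from the source)
b = 0x8123456789ABCD

print(recoded_partial_products(M), carry_save_adders(29), 2 * M)  # 29 27 112
print(mod3_check(a, b, a * b))               # correct product passes
print(mod3_check(a, b, (a * b) ^ (1 << 60)))  # a single flipped bit fails
```

A single flipped bit changes the result by a power of two, which is never a multiple of 3, so this kind of check catches every single-bit error in the product.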
Thus, a conventional pipelined multiplier needs a large amount of hardware to reach a throughput of one element per machine cycle, which raises its cost.

[Object of the invention] The object of the present invention is to provide a pipelined multiplication circuit that greatly reduces the amount of hardware, which is the drawback of the conventional circuit, and that improves processing speed and economy while maintaining relatively high precision.

[Structure of the invention] A pipelined multiplication circuit according to the present invention is a pipelined multiplication circuit that multiplies two floating-point inputs and is characterized by comprising: a partial product group generation circuit that generates only the upper M+N bits (N is an integer) of the partial product group of an M-bit multiplicand and an M-bit multiplier (M is the bit length of the mantissa); a multi-input carry-save adder that receives the output of the partial product group generation circuit and generates a sum and a carry; a carry look-ahead adder that adds the two outputs of the carry-save adder; a circuit that generates the product of the modulo-3 residues of the input operands; a circuit that generates the modulo-3 residue of the truncated partial product group from bit M+N+1 to bit 2M; a modulo-3 subtraction circuit that subtracts the output of the truncated-partial-product modulo-3 residue generation circuit from the output of the circuit that generates the product of the modulo-3 residues of the input operands; and means for detecting a multiplication error by comparing the output of the modulo-3 subtraction circuit with the modulo-3 residue of the output of the carry look-ahead adder.

[Embodiments of the invention] Next, an embodiment of the pipelined multiplication circuit according to the present invention is described with reference to the drawings.

FIG. 4 is a block diagram showing the configuration of an embodiment of the present invention. The floating-point data format handled in this example has the bit layout shown in FIG. 5a. In this layout, S is the sign of the mantissa M, E is the 7-bit exponent, and M is the 56-bit mantissa in absolute-value representation. In the embodiment of FIG. 4, 15 is a register that stores the 56-bit mantissa of the multiplicand, 16 is a register that stores the 56-bit mantissa of the multiplier, and 17 is a partial product group generation circuit that generates multiples of the multiplicand from the multiplicand and the multiplier. The multiples of the multiplicand are generated with the recode method. As shown in FIG. 5b, in the recode method the 56 multiplier bits a₁, a₂, ..., a₅₆ are divided into 3-bit groups M0, M1, ..., M28, with adjacent groups overlapping by one bit. The weights within each of M0 to M28 are (-2, 1, 1), and the multiple is selected from the 3-bit pattern as shown in Table 1.
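The recode step can be sketched as follows, assuming it corresponds to the usual radix-4 (2-bit) recoding in which each overlapping 3-bit group is evaluated with the weights (-2, 1, 1). Table 1 is not reproduced in this text, so the digit values below are derived only from those weights. The function name, the least-significant-first digit order, and the zero padding around the 56-bit multiplier are assumptions made for illustration.

```python
def recode_digits(multiplier: int, width: int = 56) -> list[int]:
    """Recode an unsigned multiplier into width // 2 + 1 digits in -2..2.

    Digit i is taken from the overlapping bit triple (2i+1, 2i, 2i-1),
    evaluated with the weights (-2, 1, 1). Bits outside the operand are
    treated as zero. Digits are returned least significant first.
    """
    if not 0 <= multiplier < (1 << width):
        raise ValueError("multiplier does not fit in the given width")

    def bit(k: int) -> int:
        # Bits below position 0 or above the operand width read as 0.
        return (multiplier >> k) & 1 if 0 <= k < width else 0

    return [
        -2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
        for i in range(width // 2 + 1)
    ]


b = 0x8123456789ABCD          # arbitrary 56-bit multiplier (not from the source)
digits = recode_digits(b)

# Each digit selects a multiple (-2, -1, 0, +1 or +2) of the multiplicand,
# and digit i carries the weight 4**i, so the digits rebuild the multiplier.
print(len(digits))                                         # 29
print(sum(d * 4**i for i, d in enumerate(digits)) == b)    # True
```

Because every digit lies between -2 and 2, each partial product is obtainable from the multiplicand by a shift and an optional negation, which is why the recode method halves the number of rows compared with one row per multiplier bit.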

[Effect of the invention]

In recent years, the multiplier bit width that a floating-point multiplier processes in one cycle has grown to 12, 16, 24 bits or more even in general-purpose machines, and in the field of vector processors there is now a demand for high-speed pipelined multipliers that process one element per cycle.

Multiplying a 56-bit mantissa multiplicand by a 56-bit mantissa multiplier at a rate of one element per cycle requires a large amount of hardware with the conventional technique shown in FIG. 1. According to the present invention, by contrast, the hardware for the part of the partial product group at bit positions 2^-73 and below is removed, as shown in FIG. 7, and a check circuit compares the result of subtracting the truncated part of the partial product group from the product of the modulo-3 residues of the input operands with the modulo-3 residue of the carry look-ahead adder output. This makes it possible to improve reliability while keeping the precision of the product high with relatively little hardware.
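The adjusted check can be illustrated with a simplified integer model. The sketch below uses plain, unrecoded partial products (one shifted copy of the multiplicand per set multiplier bit) instead of the recoded rows of the embodiment, and it assumes M = 56 and M+N = 72, the latter inferred from the 2^-73 cutoff and the 72-bit register 25. All names are hypothetical.

```python
M = 56                 # mantissa width
KEPT = 72              # assumed M + N: columns kept by the hardware
DROP = 2 * M - KEPT    # 40 low-order columns that are not built


def truncated_partial_products(a: int, b: int) -> tuple[int, int]:
    """Return (sum of kept columns, sum of dropped columns) of a * b."""
    kept_sum = 0
    dropped_sum = 0
    low_mask = (1 << DROP) - 1
    for j in range(M):
        if (b >> j) & 1:
            row = a << j                   # one unrecoded partial product
            kept_sum += row >> DROP        # columns the adders actually sum
            dropped_sum += row & low_mask  # columns removed from the adders
    return kept_sum, dropped_sum


def adjusted_mod3_check(a: int, b: int, kept_sum: int, dropped_sum: int) -> bool:
    """Residue product minus dropped residue must match the kept sum mod 3."""
    expected = ((a % 3) * (b % 3) - dropped_sum % 3) % 3
    # 2**DROP is a power of 4, so it is 1 modulo 3 and the shift of the
    # kept columns does not change their residue.
    return expected == kept_sum % 3


a = 0xF0E1D2C3B4A596   # arbitrary 56-bit test operands (not from the source)
b = 0x8123456789ABCD
kept, dropped = truncated_partial_products(a, b)

print(kept * 2**DROP + dropped == a * b)                  # True
print(adjusted_mod3_check(a, b, kept, dropped))            # True
print(adjusted_mod3_check(a, b, kept ^ (1 << 5), dropped))  # False
```

In this model the kept sum can be slightly smaller than the true upper 72 bits of the product, because carries out of the dropped columns are lost. That loss is the precision traded for hardware, while the subtraction of the dropped residue keeps the modulo-3 comparison exact.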

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing a configuration example of a conventional pipelined multiplication circuit. FIG. 2 is a block diagram showing a specific configuration of the partial product group generation circuit 3 in FIG. 1. FIG. 3 is a block diagram showing a specific configuration of the multi-input carry-save circuit 7 in FIG. 1. FIG. 4 is a block diagram showing the configuration of an embodiment of the present invention. FIGS. 5a and 5b are formats showing, respectively, the floating-point data format and the recode method applied to the embodiment of FIG. 4. FIG. 6 is a block diagram showing a specific configuration of the partial product group generation circuit 17 in FIG. 4. FIG. 7 is a block diagram showing a specific configuration of the carry-save adder 21 in FIG. 4. FIG. 8 is a diagram explaining the part of the partial product group of the multiplicand and the multiplier that is truncated in the embodiment of FIG. 4.

In the figures, 15 is a multiplicand register, 16 is a multiplier register, 17 is a partial product group generation circuit, 18 is a modulo-3 residue product generation circuit, 19 is a partial-product modulo-3 residue product generation circuit, 20 is a modulo-3 subtraction circuit, 21 is a multi-input carry-save adder, 22 is a final sum register, 23 is a final carry register, 24 is a carry look-ahead adder, 25 is a 72-bit register, 26 is a modulo-3 residue generation circuit, 27 is a comparator, and 28 is a control circuit.

Claims

[Claims]

1. A pipelined multiplication circuit that multiplies two floating-point inputs, characterized by comprising: a partial product group generation circuit that generates only the upper M+N bits (N is an integer) of the partial product group of an M-bit multiplicand and an M-bit multiplier (M is the bit length of the mantissa); a multi-input carry-save adder that receives the output of the partial product group generation circuit and generates a sum and a carry; a carry look-ahead adder that adds the two outputs of the carry-save adder; a circuit that generates the product of the modulo-3 residues of the input operands; a circuit that generates the modulo-3 residue of the truncated partial product group from bit M+N+1 to bit 2M; a modulo-3 subtraction circuit that subtracts the output of the truncated-partial-product modulo-3 residue generation circuit from the output of the circuit that generates the product of the modulo-3 residues of the input operands; and means for detecting a multiplication error by comparing the output of the modulo-3 subtraction circuit with the modulo-3 residue of the output of the carry look-ahead adder.