JP2864597B2

JP2864597B2 - Digital arithmetic circuit

Info

Publication number: JP2864597B2
Application number: JP33731889A
Authority: JP
Inventors: 清一郎岩瀬
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1989-12-26
Filing date: 1989-12-26
Publication date: 1999-03-03
Anticipated expiration: 2014-03-03
Also published as: JPH03196712A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a digital arithmetic circuit applicable to product-sum operations such as those performed in digital filters.

[Summary of the Invention]

The invention of claim (1) comprises: a full adder supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means, provided for the second input, for supplying a carry from the lower-order bit to the full adder and for passing a carry to the higher-order bit; means for holding the carry and the sum output from the full adder in synchronization with a clock; and a feedback path that supplies the carry from the hold means to the higher-order bit via the carry connection means and feeds the sum output back to the full adder. This yields a configuration with a small number of gates.

The invention of claim (2) is a digital arithmetic circuit comprising at least two multipliers, a first and a second, and an adder that adds the outputs of the first and second multipliers. The first and second multipliers each form partial products of the multiplicand and divided portions of the multiplier, and accumulate the partial products in a form split into a carry and a sum. The adder has a selector that sequentially selects the sum and the carry of each of the first and second multipliers, and accumulates the output signal of the selector. This yields a product-sum circuit with a small number of gates.

[Conventional technology]

An n-tap FIR digital filter, with input sequence x_i, output sequence y_i, and impulse response h_0 to h_(n-1), performs the operation $y_i = \sum_{k=0}^{n-1} h_k x_{i-k}$. In digital filtering of audio signals, a configuration (a so-called DSP) is used that has one multiplier and one accumulator and controls the above operation by program. In real-time processing of image data, however, where the sampling frequency is high, there is no time margin for time-division use of the multiplier and the accumulator. Therefore, as shown in FIG. 5, a configuration in which circuits are laid out in direct correspondence with the above operation has been used. FIG. 5 is an example for n = 4. The configuration of FIG. 5 can be represented as in FIG. 6; that is, it consists of a shift register section 31, a multiplication section 32, and an adder tree section 33.
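The n-tap FIR operation described above can be sketched in a few lines of Python. This is an illustrative direct-form model only, not the circuit of FIG. 5; the function name `fir` and the convention that samples before the start of the input are taken as 0 are assumptions made for the example.

```python
def fir(x, h):
    """Direct-form FIR: y[i] = sum over k of h[k] * x[i - k], with x[j] = 0 for j < 0."""
    n = len(h)
    y = []
    for i in range(len(x)):
        acc = 0
        for k in range(n):
            if i - k >= 0:
                acc += h[k] * x[i - k]  # one multiply-accumulate per tap
        y.append(acc)
    return y

# 4-tap example (n = 4); an impulse input reproduces the coefficients.
print(fir([1, 0, 0, 0, 0], [3, -1, 4, 2]))  # [3, -1, 4, 2, 0]
```

A DSP-style implementation would run the inner loop on one multiplier and one accumulator, whereas a fully parallel filter unrolls it into n multipliers and an adder tree.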

As the multiplier for such a digital filter, a parallel multiplier, in which partial-product adder circuits are arranged in parallel, is normally used. FIG. 7 shows an example of a parallel multiplier using the Booth multiplication algorithm, proposed earlier by the present applicant (see Japanese Patent Application Laid-Open No. 64-86270). FIG. 7 shows the circuit for one bit only, and is configured to add five partial products on the assumption of a 10-bit multiplier (coefficient).

One bit of the input data taken from a tap of the shift register section is supplied via a flip-flop 34 to selectors 35, 36, 37, 38, and 39 and to bit connection circuits 40, 41, 42, 43, and 44. The selectors 35 to 39 are supplied with 3-bit control signals, one formed for every two bits of the coefficient (that is, the multiplier), from a Booth decoder (not shown). The decoder receives three bits in total: the two coefficient bits of interest and the one bit below them. Based on the second-order (radix-4) Booth algorithm, the selectors 35 to 39 select ±2 times, ±1 times, or 0 times the one-bit input. The ±2 times data is obtained by selecting the lower-order bit through the bit connection circuits 40 to 44; that is, a one-bit shift forms the doubled value.
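The decoding step described above, in which each group of two coefficient bits plus the bit below it selects one of 0, ±1, or ±2 times the input, can be sketched as a lookup table. This is a minimal sketch of standard second-order (radix-4) Booth recoding; the names `BOOTH_TABLE` and `booth_digits` are hypothetical and do not come from the patent.

```python
# Window bits are (y[2j+1], y[2j], y[2j-1]); the digit is -2*y[2j+1] + y[2j] + y[2j-1].
BOOTH_TABLE = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}

def booth_digits(y, bits=10):
    """Recode a `bits`-bit two's-complement multiplier into bits // 2 digits, LSB first."""
    u = (y & ((1 << bits) - 1)) << 1   # bit pattern with the implicit 0 appended below the LSB
    return [BOOTH_TABLE[(u >> (2 * j)) & 0b111] for j in range(bits // 2)]

digits = booth_digits(-3)              # 10-bit coefficient -> five digits
print(digits)                          # [1, -1, 0, 0, 0]
print(sum(d * 4**j for j, d in enumerate(digits)))   # -3
```

A 10-bit coefficient yields five digits, which matches the five partial products mentioned for FIG. 7.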

Reference numerals 45, 46, 47, and 48 denote one-bit full adders, each of which receives three inputs and outputs two bits, a carry c and a sum s. The carry inputs of the full adders 46, 47, and 48 are supplied with carries from the lower-order bit via carry connection circuits 49, 50, and 51, respectively. The carry connection circuits 49, 50, and 51 indicate that the outgoing carry is connected to the circuit of the bit plane above and that the incoming carry from the circuit of the bit plane below is accepted.

An ordinary parallel multiplier adds the carries in at the end. The configuration of FIG. 7, however, assumes that carry addition is performed after the adder tree, and therefore outputs the result from flip-flops 52 and 53 as a redundant binary number of two bits, carry and sum. In this case the adder tree section can use the configuration shown in FIG. 8, in which six inputs passed through input-side flip-flops 54 are added in sequence by four stages of full adders 55, 56, 57, and 58 and output via output-side flip-flops 59 and 60. FIG. 8 also shows the configuration for one bit only. As is clear from FIG. 8, it differs from the ordinary configuration in that every one-bit datum is carried on two lines, carry c and sum s. FIG. 8 is an example of adding the partial products for three taps of an FIR digital filter.

The ordinary configuration means that, for a one-bit full adder as shown in FIG. 9A, inputs A and B and a carry input ci from the lower-order bit are supplied, and a sum output s and a carry output co to the higher-order bit are generated. By contrast, the configuration of FIG. 8 treats intermediate results as redundant binary numbers and converts the redundant binary number to an ordinary binary number only after all operations are finished. That is, as shown in FIG. 9B, three inputs A, B, and C of the same bit position are added to produce two outputs s1 and s2 of the same bit position. With this approach, any number of inputs of the same bit position can eventually be reduced to two lines per bit, but not to one. Therefore, as shown in FIG. 9C, a number of full adders must be connected in series to form an ordinary binary output of one line per bit.
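The difference between reducing three inputs to two lines per bit and producing a single ordinary binary output can be sketched at the bit level. The model below is a loose illustration and not a transcription of FIG. 9: `compress_3_to_2` applies one full adder per bit position with no carry rippling, and `ripple_add` is the serial chain that finally resolves the two lines into one. All function names are hypothetical, and bit lists are LSB first.

```python
def full_adder(a, b, c):
    """One-bit full adder: three input bits -> (sum bit, carry bit)."""
    s = a ^ b ^ c
    co = (a & b) | (b & c) | (a & c)
    return s, co

def ripple_add(x_bits, y_bits):
    """Serial chain of full adders: the carry ripples from the LSB upward."""
    out, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

def compress_3_to_2(x_bits, y_bits, z_bits):
    """Three words -> two words; every bit position works independently."""
    pairs = [full_adder(a, b, c) for a, b, c in zip(x_bits, y_bits, z_bits)]
    sums = [s for s, _ in pairs] + [0]
    carries = [0] + [co for _, co in pairs]   # each carry has the weight of the next bit
    return sums, carries

def to_bits(v, n):
    return [(v >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))

s, c = compress_3_to_2(to_bits(13, 5), to_bits(7, 5), to_bits(22, 5))
print(from_bits(s), from_bits(c))      # 28 14  (redundant form of 42)
print(from_bits(ripple_add(s, c)))     # 42
```

The compression step has a delay of one full adder regardless of word length, while the ripple chain has a delay proportional to the number of bits, which is why it pays to postpone it to the very end.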

However, the configuration of FIG. 9C is a slow circuit because the carry propagates through many stages. Carry look-ahead and carry select are known ways to speed up such an adder, but they have the drawback of increasing the gate count. Providing such an adder for every multiplier and every adder tree of the digital filter in FIG. 5 would therefore hinder speed-up. Accordingly, in the earlier application, each multiplier and each adder stops partway, at two lines per bit, and passes on to the next operation; only after all operations is the redundant binary number converted to an ordinary binary number, by a speeded-up version of the circuit of FIG. 7.

FIG. 7 shows one bit plane of the multiplication section; to form an n-bit multiplier, n planes must be stacked as in FIG. 10. For simplicity, FIG. 10 does not take into account the word-length growth caused by the multiplier. FIG. 8 likewise shows one bit plane of the adder tree section; for n bits, n planes must be stacked as in FIG. 10. In FIG. 10, MPY corresponds to the configuration of FIG. 7 and AT to that of FIG. 8. Connection lines are drawn in FIG. 10 only for the most-significant-bit plane for simplicity, but the other bit planes are connected in the same way.

[Problems to be solved by the invention]

In the previously proposed configuration, as in FIGS. 7 and 8, many gate circuits such as full adders are sandwiched between one flip-flop and the next. In other words, many stages of gates in series sit between pipeline registers. In such a configuration each gate is active only briefly and idle for most of the clock cycle, so the circuit is inefficient. Unless this inefficiency is remedied, high-speed arithmetic circuits for image signal processing become large, increasing power consumption and cost.

To remedy this inefficiency, the gate circuits should be placed between pipeline registers in units that are as small as possible; thus, as shown in FIG. 11, flip-flops 62 and 63 are provided on the input side and the output side, respectively, of a full adder 61. Pipelining per full adder or per Booth selector as in FIG. 11 allows the clock frequency to be tripled, but the added flip-flops increase the gate count.

Accordingly, an object of the present invention is to provide an improved digital arithmetic circuit that has a small number of gates and in which the gates do not sit idle.

[Means for solving the problem]

The invention of claim (1) is a digital arithmetic circuit comprising: a full adder (16) supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means (18), provided for the second input, for supplying a carry from the lower-order bit to the full adder (16) and for passing a carry to the higher-order bit; means (17) for holding the carry and the sum output from the full adder (16) in synchronization with a clock; and a feedback path that supplies the carry from the hold means (17) to the higher-order bit via the carry connection means (18) and feeds the sum output back to the full adder (16).

The invention of claim (2) is a digital arithmetic circuit comprising at least two multipliers, a first and a second (8A, 8B), and an adder (9) that adds the outputs of the first and second multipliers (8A, 8B). The first and second multipliers (8A, 8B) each form partial products of the multiplicand and divided portions of the multiplier, and accumulate the partial products in a form split into a carry and a sum. The adder (9) has a selector (20) that sequentially selects the sum and the carry of each of the first and second multipliers (8A, 8B), and accumulates the output signal of the selector (20).

[Action]

In the invention of claim (1), a single full adder 16 and the pipeline flip-flops 15 and 17 on its input and output sides obtain, by time-division operation, the sum of a number of partial products equal to half the number of bits of the multiplier.

In the invention of claim (2), when a product-sum operation such as that of a digital filter is performed, the multiplication units 8A and 8B and the adder tree 9 both process data in the form of two signals, carry and sum. The adder tree 9 selects the four inputs from the multiplication units 8A and 8B in turn with the selector 20 and accumulates them. The gate count can therefore be reduced, and circuits such as full adders are prevented from sitting idle.

[Embodiment]

An embodiment in which the invention is applied to a 4-tap FIR digital filter is described below with reference to the drawings. FIG. 1 shows the overall configuration of this embodiment. Each sample of the input data is, for example, 8 bits in parallel in two's-complement code. FIG. 1, however, shows the configuration for one bit plane only.

In FIG. 1, reference numerals 1, 2, 3, and 4 denote unit delay elements, for example flip-flops, each having a delay equal to the sampling period of the input data. The output data a of flip-flop 1 (first tap) is supplied to multiplication unit 8A. The output data of flip-flop 2 (second tap) and of flip-flop 3 (third tap) are supplied to delay circuits 5 and 6, respectively, each with a delay of 2τ (τ: clock period), and the output data b′ and c′ of delay circuits 5 and 6 are supplied to multiplication units 8B and 8C, respectively. The output data d of flip-flop 4 (fourth tap) is supplied to a delay circuit 7 with a delay of 4τ, and the output data d′ of delay circuit 7 is supplied to multiplication unit 8D.

The multiplication units 8A to 8D multiply the coefficient by the data of each tap using the second-order Booth algorithm. That is, in computing X × Y (X: multiplicand (data), Y: multiplier (coefficient)), the multiplication is carried out by performing one of the operations 0, +X, −X, +2X, or −2X according to the pattern of consecutive bits of the multiplier. Accordingly, the Booth selector in each of the multiplication units 8A to 8D is supplied with a control signal formed by feeding three consecutive bits of the coefficient to a Booth decoder. These values 0, ±X, and ±2X are called partial products.

For the input data from the shift register section made up of flip-flops 1, 2, 3, and 4, the multiplication units 8A to 8D accumulate, one per clock cycle, a number of partial products equal to half the coefficient word length. Successive partial products must be offset from one another by two bits. The output of the shift register section is therefore not only shifted right every four clock cycles but also shifted by two bits every clock cycle. This shift can be implemented, for example, by providing a selector on the input side of each of the multiplication units 8A to 8D, or by providing a shift register that stores the input of each unit and can shift it toward higher bit positions.
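The time-division multiplication described above, one partial product per clock cycle with a two-bit offset between successive partial products, can be sketched as a loop. This is a behavioural sketch only: it uses an ordinary integer accumulator rather than carry and sum registers, and the name `sequential_booth_multiply` and its default 8-bit coefficient width are assumptions for illustration.

```python
def sequential_booth_multiply(x, y, bits=8):
    """Multiply x by a `bits`-bit two's-complement coefficient y in bits // 2 cycles."""
    u = y & ((1 << bits) - 1)      # bit pattern of the coefficient
    acc = 0
    shifted_x = x                  # multiplicand as presented in the current cycle
    prev = 0                       # coefficient bit just below the current pair
    for cycle in range(bits // 2):
        b0 = (u >> (2 * cycle)) & 1
        b1 = (u >> (2 * cycle + 1)) & 1
        digit = -2 * b1 + b0 + prev        # selects 0, +X, -X, +2X or -2X
        acc += digit * shifted_x           # accumulate this cycle's partial product
        shifted_x <<= 2                    # two-bit offset for the next cycle
        prev = b1
    return acc

print(sequential_booth_multiply(-7, -3))   # 21
```

With an 8-bit coefficient the loop runs four times, which is why a clock of four times the sampling frequency lets one accumulator handle a whole multiplication per sample.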

The output e of multiplication unit 8A and the output f of multiplication unit 8B are supplied to adder tree 9A. The output g of multiplication unit 8C and the output h of multiplication unit 8D are supplied to adder tree 9B. The outputs i and j of adder trees 9A and 9B are supplied to adder tree 10. Each of the adder trees 9A, 9B, and 10 accumulates four bits, namely two carry-and-sum pairs.

The output k (carry and sum) of adder tree 10 is supplied to flip-flops 11 and 12, from which the output l is obtained. Although not shown, this output l is a redundant binary number and is converted by an accumulator-type configuration into an ordinary binary number with one line per bit.

FIG. 2 is a timing chart showing the operation of the circuit of FIG. 1. The operating clock of the multiplication units 8A to 8D and the adder trees 9A, 9B, and 10 has four times the sampling frequency of the input data. That is, with T the sampling period of the input data and τ the clock period, T = 4τ. x1, x2, ... each denote one bit at the same bit position (MSB, LSB, etc.) of the parallel input data.

Relative to the output data a of flip-flop 1, the output data b of flip-flop 2, c of flip-flop 3, and d of flip-flop 4 are delayed by T, 2T, and 3T, respectively. The output data b′ of delay circuit 5 lags b by 2τ, the output data c′ of delay circuit 6 lags c by 2τ, and the output data d′ of delay circuit 7 lags d by 4τ.

In the output e of multiplication unit 8A, the output f of multiplication unit 8B, the output g of multiplication unit 8C, and the output h of multiplication unit 8D, tn denotes the timing at which the product of input datum xn and the coefficient is obtained. For example, in the output e of multiplication unit 8A, t4 is the timing at which the product of x4 and the first-tap coefficient is obtained.

In the output i of adder tree 9A and the output j of adder tree 9B, tmn denotes the timing at which the sum of the product of a coefficient and xm and the product of a coefficient and xn is obtained. For example, in the output i of adder tree 9A, t54 is the timing at which the sum is obtained of the product of x5 and the first-tap coefficient (available at the timing t5 in multiplier output e) and the product of x4 and the second-tap coefficient (available at the timing t4 in multiplier output f).

Further, in the output k of adder tree 10, tmnop denotes the timing at which the sum of the sums obtained at tmn and top is obtained. For example, t4321 is the timing at which the sum is obtained of the sum produced at timing t43 in the output i of adder tree 9A and the sum produced at timing t21 in the output j of adder tree 9B. Flip-flops 11 and 12 sample the output of adder tree 10 with a clock at the original sampling frequency, and the filter output l is obtained from flip-flops 11 and 12.

The multiplication unit 8A has the configuration shown in FIG. 3. In FIG. 3, 13 denotes a Booth selector, which is controlled by a 3-bit control signal obtained by supplying the first-tap coefficient to a Booth decoder. According to the control signal for each two bits of the coefficient, selector 13 selectively outputs 0 times, ±1 times, or ±2 times the input data. Selector 13 receives two bits: the input data a and the carry from the lower-order bit passed through a carry connection circuit 14. Selecting the lower-order carry from carry connection circuit 14 means that a doubled output is produced. In two's-complement code, polarity inversion is achieved by inverting the "0"s and "1"s and adding "1" at the least significant bit.

The output of selector 13 is supplied to a full adder 16 via a flip-flop 15. The two bits output by full adder 16, carry c and sum s, are output via a flip-flop 17 and are also fed back to the input side of full adder 16. This feedback path forms an accumulator, with which the partial products are added by time-division processing. As noted above, selector 13, flip-flops 15 and 17, and full adder 16 operate on a clock of four times the sampling frequency.

When one bit of input data a, for example x4, is supplied, selector 13 produces a partial product for every two bits of the first-tap coefficient (8 bits). The four partial products in total are accumulated by full adder 16 over four clock periods, and the product of x4 and the coefficient is obtained at the timing t4 in the output e of multiplication unit 8A in FIG. 2. Before each accumulation, initialization is performed either by clearing flip-flop 17 or by an AND gate inserted in the feedback path. A carry generated during accumulation is supplied via a carry connection circuit 18 to the full adder of the next higher bit, and the carry from the next lower bit is supplied via carry connection circuit 18 to full adder 16.
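The accumulation described above, with one full adder per bit plane whose carry and sum are held in a register and fed back, can be modelled at word level. The sketch below is an assumption-laden illustration, not the circuit of FIG. 3: it treats all bit planes at once as a W-bit word, picks W = 16 arbitrarily, and takes the four partial products as ready-made signed integers. The names `accumulate_carry_save` and `to_signed` are hypothetical.

```python
W = 16                      # assumed number of bit planes (accumulator width)
MASK = (1 << W) - 1

def accumulate_carry_save(partial_products):
    """One full adder per bit plane plus a (carry, sum) register with feedback."""
    carry, s = 0, 0                          # register cleared before each accumulation
    for pp in partial_products:              # one partial product per clock cycle
        p = pp & MASK                        # two's-complement bit pattern
        new_s = p ^ carry ^ s                # sum bit of every plane
        new_c = ((p & carry) | (carry & s) | (p & s)) << 1   # carry passes to the next higher plane
        carry, s = new_c & MASK, new_s
    return carry, s

def to_signed(v):
    """Interpret a W-bit pattern as a two's-complement integer."""
    return v - (1 << W) if v & (1 << (W - 1)) else v

# x = 5 times coefficient 27 (0b00011011): the Booth digits, LSB first, are -1, -1, +2, 0.
pps = [-1 * 5, -1 * 5 * 4, 2 * 5 * 16, 0]
carry, s = accumulate_carry_save(pps)
print(to_signed((carry + s) & MASK))         # 135
```

Inside the loop no carry ever travels more than one bit position per clock, so the critical path is a single full adder; the ordinary addition `carry + s` is needed only once, when the redundant result is finally converted.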

The multiplication unit 8A accumulates with each bit of data represented by two bits, carry c and sum s. Multiplication units 8B, 8C, and 8D have the same configuration as multiplication unit 8A shown in FIG. 3.

FIG. 4 shows the details of adder tree 9A. As described above, the multiplication units 8A and 8B produce outputs e and f that each consist of two bits, carry and sum, so a configuration that can add these two-bit values is needed.

In FIG. 4, 20 denotes a selector that switches among the inputs e and f from multiplication units 8A and 8B (four inputs in total). Because delay circuit 5 is inserted, there is a delay of two clock periods between the inputs e and f to be added. Carry connection circuits 21 and 22 are inserted for the carries of these inputs. Flip-flops 23 and 24 are inserted only for the sum inputs, so that for each of inputs e and f the sum reaches selector 20 one clock period after the carry. Selector 20 therefore receives the carry of e, the sum of e, the carry of f, and the sum of f in that order over four clock periods, and selects and outputs these bits one at a time in order.

The input selected by selector 20 is supplied to a flip-flop 25. Flip-flop 25, a full adder 26, and a flip-flop 27 form an accumulator like those in multiplication units 8A to 8D, which accumulates the four inputs passed through flip-flop 25. The output i of adder tree 9A is therefore obtained from flip-flop 27 (4 + 1 + 1 = 6) clock periods after the input e is supplied. For example, the timing t43, six clock periods after the timing t4 of input e in FIG. 2, is the timing at which the output i of adder tree 9A is obtained.
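The adder tree described above, in which a selector feeds the carry and sum of two multiplier outputs one after another into a single accumulator, can be sketched end to end for a 4-tap filter. This is a word-level illustration under stated assumptions: the width W = 20, the example values, and the names `accumulate`, `adder_tree`, and `resolve` are all hypothetical, and the model ignores the clock-by-clock timing.

```python
W = 20                       # assumed word width; wide enough for the example values
MASK = (1 << W) - 1

def accumulate(words):
    """Carry-save accumulation of W-bit words, one word per clock; returns (carry, sum)."""
    carry, s = 0, 0
    for w in words:
        w &= MASK
        majority = (w & carry) | (carry & s) | (w & s)
        carry, s = (majority << 1) & MASK, w ^ carry ^ s
    return carry, s

def adder_tree(e, f):
    """Add two redundant (carry, sum) pairs without resolving either of them."""
    return accumulate([e[0], e[1], f[0], f[1]])   # selector order: e carry, e sum, f carry, f sum

def resolve(pair):
    """Final conversion to an ordinary signed integer (the only carry-propagating add)."""
    v = (pair[0] + pair[1]) & MASK
    return v - (1 << W) if v >> (W - 1) else v

# Four tap products, each left in redundant form by its multiplication unit.
e = accumulate([-5, -20, 160, 0])   # 135
f = accumulate([12, -48])           # -36
g = accumulate([7])                 # 7
h = accumulate([-100, 1])           # -99
k = adder_tree(adder_tree(e, f), adder_tree(g, h))   # first-level trees, then the final tree
print(resolve(k))                   # 7
```

Because every stage consumes and produces a (carry, sum) pair, the same accumulator structure serves both the multiplication units and the trees, and `resolve` is needed only once at the filter output.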

Adder trees 9B and 10 have the same configuration as that of FIG. 4. It is also possible to omit adder trees 9A and 9B and perform the addition with the single adder tree 10. In that case, however, the selector selects one of eight inputs in turn and the number of repeated additions rises to eight, so the circuit must operate at twice the speed of the configuration of FIG. 4.

The number of filter taps, the word length, the accumulator word length, and so on are not limited to those of the embodiment above. In particular, a one-bit accumulator word length is the fastest, but it may be extended to a word length of n bits. Furthermore, the invention is applicable not only to digital filters but also to other product-sum operations such as the FFT and the cosine transform.

[Effect of the invention]

In the present invention, the accumulator consisting of a full adder and a feedback path is pipelined, so that addition or multiplication can be performed with a small number of gates and the gates are prevented from sitting idle. In addition, when a product-sum operation such as a filter operation is performed, the processing in the multiplication units and adder trees can be carried out in redundant binary, which removes the need for carry-propagating addition in each multiplication unit and adder tree and prevents a drop in operating speed. Furthermore, the invention uses many circuits of identical structure, such as the accumulators, and is therefore well suited to IC implementation.

[Brief description of the drawings]

FIG. 1 is an overall block diagram of an embodiment of the invention; FIG. 2 is a timing chart showing the operation of the embodiment; FIG. 3 is a block diagram showing an example configuration of a multiplication unit; FIG. 4 is a block diagram showing an example configuration of an adder tree; FIGS. 5 and 6 are block diagrams used to explain an FIR digital filter; FIGS. 7 and 8 are block diagrams showing the previously proposed multiplier and adder tree, respectively; FIG. 9 is a schematic diagram for explaining the addition processing; FIG. 10 is a schematic diagram showing the connections between bit planes; and FIG. 11 is a block diagram showing a configuration pipelined per full adder.

Explanation of main reference numerals in the drawings: 8A to 8D: multiplication units; 9A, 9B, 10: adder trees; 13: Booth selector; 16, 26: full adders; 20: selector.

Claims

(57) [Claims]

1. A digital arithmetic circuit comprising: a full adder supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means, provided for the second input, for supplying a carry from a lower-order bit to the full adder and for passing a carry to a higher-order bit; means for holding, in synchronization with a clock, the carry and the sum output from the full adder; and a feedback path for supplying the carry from the holding means to the higher-order bit via the carry connection means and for feeding the sum output back to the full adder.

2. A digital arithmetic circuit comprising at least two multipliers, a first and a second, and an adder for adding the outputs of the first and second multipliers, wherein the first and second multipliers are each configured to form partial products of a multiplicand and divided portions of a multiplier and to accumulate the partial products in a form split into a carry and a sum, and wherein the adder has a selector for sequentially selecting the sum and the carry of each of the first and second multipliers and is configured to accumulate the output signal of the selector.