JPS5952840B2 - Speech synthesizer interpolation device

Info

Publication number: JPS5952840B2
Application number: JP2391780A
Authority: JP
Inventors: 格川崎; 友明入路; 靖文河野
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1980-02-27
Filing date: 1980-02-27
Publication date: 1984-12-21
Also published as: JPS56121099A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to an interpolation device for speech parameters, used in a speech synthesizer based on a model of human speech production.

Speech synthesizers of this type are generally known as terminal-analog synthesizers or linear prediction (LPC) synthesizers.

Parameters such as the formant frequencies, the pitch period of the voiced sound source, and the amplitude are supplied to the synthesizer to produce speech. Normally these parameters are held at the same value for the duration of a frame of about 20 ms or 10 ms. This frame length is insufficient, however. Setting the parameters at a still shorter frame length gives better sound quality, but naturally increases the number of bits. It is therefore generally known that high-quality speech can be obtained without increasing the number of bits by keeping the frame reasonably long and interpolating inside the speech synthesizer in units of about 2.5 ms. FIG. 1 is a block diagram of a commonly used interpolation circuit.

In the figure, data stored in memory 1 is processed by the interpolation calculation section 3 and stored in memory 4. In this case, two sets of memory, memory 1 and memory 2, are needed to hold the data that serve as the reference points for interpolation, and together with memory 3, which stores the interpolation results, three sets of memory are required.

Since the number of bits per frame used by a speech synthesizer is quite large, the required memory capacity becomes large. The present invention solves this problem and, by further refining the interpolation circuit, makes it easy to vary the speech rate of the speech synthesizer.

The present invention is explained below on the basis of the illustrated embodiments. FIG. 2 is a block diagram for explaining the basic principle of the present invention. In the figure, 5 is a first register, 6 is a second register, 7 is a subtracter, 8, 9, and 10 are dividers by 1/2, 1/4, and 1/8 respectively, and 11 is an adder. The parameters used for speech synthesis are input to the first register 5 every frame. At the end of a frame, the data in the first register 5 is sent directly to the second register 6. At the beginning of each frame, therefore, the second register 6 holds the data of the previous frame. At each interpolation point, the data of the first register 5 and the data of the second register 6 are subtracted in the subtracter 7, and the result passes through one of the dividers 8, 9, and 10. Since the divider produces the increment of the interpolated value, adding the data of the second register 6 and the output of the divider in the adder 11 and storing the sum back in the second register 6 allows the second register 6 to obtain interpolated data one after another. FIG. 3 is a block diagram showing an embodiment of the present invention.
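The register update described for FIG. 2 can be sketched in a few lines of Python. This is only an illustrative model, not the circuit itself: the function names are hypothetical, parameters are assumed to be integers, and each divider is assumed to behave like an arithmetic right shift (1, 2, or 3 bits for 1/2, 1/4, or 1/8).

```python
def interpolation_step(first_register, second_register, shift):
    """One interpolation update of the second register (FIG. 2 model).

    shift = 1, 2, 3 stands for the 1/2, 1/4, 1/8 divider (assumed to be
    an arithmetic right shift on integer data).
    """
    difference = first_register - second_register   # subtracter 7
    increment = difference >> shift                  # divider 8, 9 or 10
    return second_register + increment               # adder 11


def end_of_frame(first_register):
    """At the end of a frame the first register is copied to the second."""
    return first_register
```

For example, with the first register at 80 and the second at 0, `interpolation_step(80, 0, 3)` moves the second register one eighth of the way to the target. Only two stored values are involved, which is the point of the arrangement.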

In the figure, the first register 20 and the second register 21, which correspond to 5 and 6 of FIG. 2 respectively, are shift registers that send out their data starting from the least significant bit (LSB) in response to the clock. 29 is the interpolation calculation section, which takes in the data of the first register 20 and the second register 21 and sends the interpolation result to the second register 21.

22 is a subtracter, which operates on the data from the first register 20 and the second register 21 one bit at a time in step with the clock and inputs the result to the adder 25.

23 and 26 are delay circuits, each formed by connecting unit delay circuits in series and operating in synchronization with the clock.

24 and 27 are selectors that can output, as required, the output of any stage of the unit delay circuits in the delay circuits 23 and 26 respectively.

The output of the selector 24 is input to the adder 25. The output of the adder 25 is input to the delay circuit 26. 28 is a selector that selects between the output of the selector 27 and the output of the first register 20 and sends it to the second register 21. The delay circuit 23 delays the output of the second register 21.

This delay time is adjusted by the selector 24. The difference between the delay time of the subtracter 22 and the delay time of the delay circuit 23 determines the relative timing with which the subtraction result from the subtracter 22 and the data from the second register 21 arrive at the adder 25. Adjusting the delay time selects among the dividers 8, 9, and 10 of FIG. 2. The delay circuit 26 and the selector 27 are provided to compensate for the timing difference introduced by the delay circuit 23 and the selector 24. The selector 28 is provided to create the path by which data is sent directly from the first register 20 to the second register 21 at the end of a frame. The following table shows the selection of the dividers of FIG. 2. For both the 20 ms frame and the 10 ms frame, interpolation is performed at 2.5 ms intervals.
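The reason a delay can stand in for a divider is that, in LSB-first bit-serial arithmetic, delaying one stream by k clocks relative to another scales it by 2^k. The Python sketch below illustrates this idea only; it is one possible reading of the timing, not a cycle-accurate model of FIG. 3. All function names are hypothetical, values are assumed non-negative, and dropping the first k output bits is assumed to play the role of the re-alignment attributed to the delay circuit 26 and the selector 27.

```python
def to_lsb_first(value, width):
    """Return `value` as a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_lsb_first(bits):
    """Rebuild an integer from an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def unit_delays(bits, stages):
    """Model `stages` unit delays: the stream arrives `stages` clocks late."""
    return [0] * stages + bits


def serial_add(a_bits, b_bits):
    """Add two LSB-first streams one bit per clock, carrying between clocks."""
    length = max(len(a_bits), len(b_bits))
    a_bits = a_bits + [0] * (length - len(a_bits))
    b_bits = b_bits + [0] * (length - len(b_bits))
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        out.append(total & 1)
        carry = total >> 1
    out.append(carry)
    return out


def add_with_relative_delay(difference, register_value, stages, width=12):
    """Add `difference` to a register stream delayed by `stages` clocks.

    Discarding the first `stages` result bits re-aligns the sum, giving
    register_value + difference // 2**stages (non-negative inputs assumed).
    """
    diff_stream = to_lsb_first(difference, width)
    reg_stream = unit_delays(to_lsb_first(register_value, width), stages)
    total = serial_add(diff_stream, reg_stream)
    return from_lsb_first(total[stages:])
```

With a relative delay of 3 clocks, `add_with_relative_delay(80, 100, 3)` yields 100 plus one eighth of 80, so choosing 1, 2, or 3 stages corresponds to choosing the 1/2, 1/4, or 1/8 divider.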

Accordingly, interpolation is performed at 8 points for a 20 ms frame and at 4 points for a 10 ms frame. The normal-mode entries in the table represent this case. For a 20 ms frame, the 1/8 divider is used from the first to the third interpolation point, the 1/4 divider from the fourth to the sixth interpolation point, and the 1/2 divider at the seventh interpolation point.

FIG. 4 is a graph showing the interpolation results obtained with the embodiment of the present invention.

The horizontal axis is time, and the numbers shown are the indices of the interpolation points within the frame. The vertical axis is the interpolated value. The values at interpolation point 0 and interpolation point 8 are the data sent to the first register from outside. When interpolation is carried out with the above algorithm, the interpolated values from point a to point i are obtained in sequence. The case of a 10 ms frame is shown in FIG. 5, where the interpolated values from point p to point t are obtained.
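To get a feel for how such a sequence of values compares with a straight line, the sketch below runs one 20 ms frame in Python. The divider schedule (1/8 for points 1 to 3, 1/4 for points 4 to 6, 1/2 for point 7) is an assumption reconstructed from the three available divider ratios, and the integer right shift is likewise assumed; the names are hypothetical.

```python
# Assumed normal-mode schedule for a 20 ms frame, expressed as shift counts:
# 3 -> 1/8, 2 -> 1/4, 1 -> 1/2.
NORMAL_20MS_SHIFTS = [3, 3, 3, 2, 2, 2, 1]


def interpolate_frame(previous, target, shifts):
    """Return the values at every interpolation point of one frame."""
    value = previous
    points = [value]                       # interpolation point 0
    for shift in shifts:
        value += (target - value) >> shift
        points.append(value)
    points.append(target)                  # direct transfer at the frame end
    return points


print(interpolate_frame(0, 800, NORMAL_20MS_SHIFTS))
# [0, 100, 187, 263, 397, 497, 572, 686, 800]
```

A strictly linear ramp would give 0, 100, 200, ..., 800, so under these assumptions the result stays close to the line without following it exactly.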

The interpolated values cannot form a perfectly linear interpolation, but they are sufficient for purposes such as parameter interpolation in a speech synthesizer and have no effect on the sound quality of the synthesizer output. By refining the control of the interpolation circuit described above, the speech rate can be controlled easily.

This is explained below. Control of the speech rate is achieved by changing the rate at which the speech synthesis parameters are taken in. In a speech synthesizer based on a speech production model, the sound source and the vocal tract are separated, so the vocal tract parameters can be made to change faster or more slowly without changing the pitch of the voiced sound source. Fast or slow speech is therefore possible without degrading sound quality. For fast speech with a 20 ms frame, the table shows an example in which the sixth interpolation point is made the final interpolation point, the contents of the first register are transferred directly to the second register, and the next frame is entered. The interpolated values run from point a to point J in FIG. 4. The speech rate becomes 25% faster. For slow speech, on the other hand, the direct transfer from the first register to the second register at the end of the frame is postponed, and interpolation continues in the meantime. For slow speech with the 20 ms frame in the table, the frame is extended by two interpolation periods, and interpolation through the 1/2 divider continues at the eighth and ninth interpolation points. As a result, the interpolated values run from point a to point m in FIG. 4. Since the frame is lengthened by 25%, the speech is 25% slower. The table and FIG. 5 show the conditions and the resulting interpolated values for achieving the same 25% fast and slow speech with parameters set for a 10 ms frame. In general, for a frame of X ms, N interpolation points (N being a positive integer) are provided in normal mode and an interpolated value is computed every X/N ms. In the fast and slow modes, N±M interpolation points (M being a positive integer) are provided, interpolated values are computed at the same interval as above, and the frame period is changed to X(1 ± M/N) ms. In this way the speech rate can be adjusted easily.
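The general rule can be checked numerically with a short Python helper. The function name and the mode strings are hypothetical; the arithmetic follows the X/N interval and the X(1 ± M/N) frame period, with the minus sign taken for fast speech and the plus sign for slow speech.

```python
def frame_timing(frame_ms, normal_points, delta_points, mode):
    """Return (interpolation points per frame, frame period in ms).

    frame_ms      -- X, the frame length the parameters were set for
    normal_points -- N, interpolation points per frame in normal mode
    delta_points  -- M, points removed (fast) or added (slow)
    mode          -- "normal", "fast" or "slow"
    """
    interval_ms = frame_ms / normal_points      # X/N, unchanged in every mode
    if mode == "normal":
        points = normal_points
    elif mode == "fast":
        points = normal_points - delta_points
    elif mode == "slow":
        points = normal_points + delta_points
    else:
        raise ValueError(f"unknown mode: {mode}")
    return points, points * interval_ms         # period equals X(1 ± M/N)


print(frame_timing(20, 8, 2, "fast"))   # (6, 15.0)
print(frame_timing(20, 8, 2, "slow"))   # (10, 25.0)
```

With X = 20, N = 8, and M = 2 this reproduces the 25% shorter and 25% longer frames of the 20 ms example. The interpolation interval stays at 2.5 ms in every mode; only the number of points per frame changes.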

The preceding examples used a 20 ms frame and a 10 ms frame. If parameters set for 20 ms frames are fed to the interpolation circuit operating with 10 ms frames, speech twice as fast is obtained; in the opposite case, speech twice as slow is obtained.

As is clear from the above description, the present invention does not require three memories as in the conventional example of FIG. 1. Two memories, for example the first register and the second register, are sufficient, which is advantageous for cost reduction. In addition, the speech rate can easily be varied with a comparatively simple configuration. The invention thus provides excellent effects.

[Brief description of the drawings]

FIG. 1 is a block diagram of a conventional example, FIG. 2 is a block diagram for explaining the basic principle of the present invention, FIG. 3 is a block diagram of an embodiment of the present invention, and FIGS. 4 and 5 are graphs showing the progression of the interpolated values according to the embodiment of the present invention.

5, 20 ... first register; 6, 21 ... second register; 7, 22 ... subtracter; 8, 9, 10 ... dividers; 11, 25 ... adder; 23, 26 ... delay circuits; 24, 27, 28 ... selectors; 29 ... interpolation calculation section.

Claims

[Claims]

1. An interpolation device for a speech synthesizer, comprising: a subtracter and an adder, each of which operates one bit at a time starting from the least significant bit of the data in synchronization with a clock and outputs its results sequentially; a delay circuit formed by serially connecting unit delay circuits and operating in synchronization with the clock; and a first register and a second register; wherein the data in the first register is updated every frame period; the data of the first register and the second register are supplied to the subtracter starting from the least significant bit; the output of the subtracter and the output of one of the first register and the second register are supplied to the adder through the delay circuit, thereby forming an interpolation calculation section; the change in the interpolated value is produced by changing the delay time of the unit delay circuits in the delay circuit; the interpolation calculation section computes interpolated values for the number of interpolations within a frame period; and the interpolation results are transferred serially to the second register in sequence.

2. The interpolation device for a speech synthesizer according to claim 1, wherein the speech parameters of the speech synthesizer are set every x milliseconds; in a first mode, data is read into the first register every frame period of x milliseconds and N interpolation points (N being a positive integer) are obtained within one frame; in a second mode, data is read into the first register every frame period of x(1 ± M/N) milliseconds (M being a positive integer) and (N ± M) interpolation points are obtained within one frame; and the speech rate of the speech synthesizer can be adjusted by means of the first mode and the second mode.