JPS5912194B2

JPS5912194B2 - speech synthesizer

Info

Publication number: JPS5912194B2
Application number: JP12942981A
Authority: JP
Inventors: 誠森戸; 賢一郎細田; 隆矢頭
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1981-08-20
Filing date: 1981-08-20
Publication date: 1984-03-21
Also published as: JPS5831394A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a speech synthesizer based on linear prediction.

Various linear prediction methods, including more advanced variants, have been proposed.

Representative examples are the LPC (Linear Predictive Coding) method, the PARCOR (Partial Autocorrelation) method, and the LSP (Line Spectrum Pair) method. These methods perform well in terms of the compression ratio of speech information and the quality of the synthesized speech. All of them are based on modeling the human vocal mechanism as shown in FIG. 1. In FIG. 1, reference numeral 1 denotes the voice source, which represents vibration of the air by the vocal cords for voiced sounds and turbulent airflow driven by air pressure from the lungs for unvoiced sounds.

Reference numeral 2 denotes the resonant section, which physiologically corresponds to the resonant tube called the vocal tract, formed by the throat, nose, tongue, teeth, lips, and so on.

The vocal mechanism modeled in FIG. 1 can be replaced by the electrical circuit shown in FIG. 2. In FIG. 2, 3 is an excitation circuit corresponding to the voice source 1 in FIG. 1, and 4 is a resonance filter corresponding to the resonant section 2 in FIG. 1.

Next, speech analysis and synthesis methods based on the vocal mechanism model of FIGS. 1 and 2 are explained. FIG. 3 shows an example of a conventional speech analysis and synthesis apparatus of this type. In FIG. 3, 11 is a speech analyzer, 12 is a transmission path (or storage device) from the analysis system to the synthesis system, and 13 is a speech synthesizer.

In the analyzer 11, 14 is an analog speech signal input terminal; 15 is a low-pass filter; 16 is an AD converter that converts the analog signal into a digital signal; 17 is an autocorrelation circuit that calculates autocorrelation coefficients; 18 is a parameter calculation circuit that calculates parameters from the autocorrelation coefficients (α parameters in an LPC analysis-synthesis system, k parameters in a PARCOR system, and LSP parameters in an LSP system); 19 is a sound source extraction filter configured from these parameters (the inverse of the speech synthesis filter used for synthesis); 20 is a sound source determination circuit; 21 denotes the parameter output signals; 22 denotes the sound source information output signals; and 23 is an encoder. These elements constitute the analyzer 11.

In the synthesizer 13, 24 is a decoder that decodes the codes produced by the encoder 23 and delivered through the transmission path 12 (or storage device); 25 denotes the decoded parameters (α parameters, k parameters, or LSP parameters); 26 denotes the decoded sound source information signal; 27 is a sound source generation circuit that generates the sound source signal; 28 is a speech synthesis filter; 29 is a DA converter that converts the digital signal into an analog signal; 30 is a low-pass filter; and 31 is the analog output terminal for the synthesized speech.

These elements constitute the synthesizer 13. The operation is described in detail below with reference to FIG. 3.

The low-pass filter 15 removes high-frequency components from the speech signal applied to the analog input terminal 14. This filter reduces the distortion (aliasing) that the AD converter 16 would otherwise introduce, and its cutoff frequency ωc must be set so that components at or above one half of the sampling frequency of the AD converter are removed. For example, when the sampling frequency of the AD converter 16 is 8 kHz, components at 4 kHz and above must be removed, so the cutoff frequency of the low-pass filter 15 is chosen to be about 3.4 kHz. The output of the low-pass filter 15 is sampled by the AD converter 16 at every sampling period T. Let Xn denote the digital signal sampled at t = nT. Each sample Xn output by the AD converter 16 consists of, for example, 1 sign bit and 11 magnitude bits. These samples are fed to the autocorrelation circuit 17 and to the sound source extraction filter 19. The autocorrelation circuit 17 computes the autocorrelation coefficients ρ, which are given by equation (1), where N is the number of sample points used in the calculation.

The autocorrelation coefficients ρ calculated by equation (1) are input to the parameter calculation circuit 18, which computes the α parameters, k parameters, or LSP parameters according to the LPC, PARCOR, or LSP method, respectively.
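As an illustration of what the autocorrelation circuit 17 and the parameter calculation circuit 18 compute, the following Python sketch derives α and k parameters from a block of samples. The text does not give the form of equation (1) or the algorithm used by circuit 18, so the unnormalized short-time autocorrelation, the Levinson-Durbin recursion, the sign convention of the k parameters, and all function names are assumptions made for illustration.

```python
def autocorrelation(x, max_lag):
    """Assumed form: r[i] = sum over n of x[n] * x[n + i], for i = 0..max_lag."""
    n_samples = len(x)
    return [
        sum(x[n] * x[n + i] for n in range(n_samples - i))
        for i in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for predictor coefficients alpha[1..order] and reflection
    (PARCOR-style) coefficients k[1..order] from autocorrelation r.

    Assumed convention: x[n] is predicted as sum_i alpha[i] * x[n - i].
    Requires r[0] > 0.
    """
    alpha = [0.0] * (order + 1)   # alpha[0] is unused
    ks = []
    err = float(r[0])             # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] - sum(alpha[j] * r[i - j] for j in range(1, i))
        k_i = acc / err
        updated = alpha[:]
        updated[i] = k_i
        for j in range(1, i):
            updated[j] = alpha[j] - k_i * alpha[i - j]
        alpha = updated
        ks.append(k_i)
        err *= 1.0 - k_i * k_i
    return alpha[1:], ks, err
```

With the autocorrelation sequence of a first-order process, `levinson_durbin([1.0, 0.5, 0.25], 2)` yields a single non-zero predictor coefficient of 0.5, which shows the recursion recovering the model order.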

In every case, these parameters supply the filter coefficients for the resonant section 2 of the vocal mechanism model in FIG. 1, that is, the resonance filter 4 in FIG. 2. In other words, they determine the transfer function of the throat, nose, tongue, and so on. Let H(jω) be the vocal tract transfer function (the transfer function of the resonance filter in FIG. 2) determined by these parameters, and let Y(jω) be the spectrum of the speech output by the AD converter 16. Using the parameters calculated by the parameter calculation circuit 18, the sound source extraction filter 19 is configured with the inverse of the vocal tract characteristic, 1/H(jω), and takes the signal Y(jω) from the AD converter 16 as its input. The output of the sound source extraction filter 19 is generally called the residual signal and corresponds to the voice source 1 in FIG. 1, the excitation circuit 3 in FIG. 2, or, physiologically, the vocal cords. The residual signal is sent to the sound source determination circuit 20, where the sound source is analyzed. The circuit extracts three kinds of sound source information: the fundamental period Tp of voiced sounds (generally called the pitch period), the voiced/unvoiced decision V/UV, and the speech amplitude A.
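The inverse filtering performed by the sound source extraction filter 19 can be sketched as follows. This is a minimal time-domain version that assumes an all-pole vocal tract model with predictor coefficients `alphas`, so that 1/H becomes a simple FIR prediction-error filter; the function name and the sign convention are illustrative assumptions, not taken from the patent.

```python
def inverse_filter(x, alphas):
    """Residual e[n] = x[n] - sum_i alphas[i] * x[n - 1 - i].

    Samples before the start of x are treated as zero.
    """
    residual = []
    for n in range(len(x)):
        prediction = sum(
            a * x[n - 1 - i]
            for i, a in enumerate(alphas)
            if n - 1 - i >= 0
        )
        residual.append(x[n] - prediction)
    return residual
```

For a signal that the model predicts perfectly, such as `[1.0, 0.5, 0.25, 0.125]` with `alphas = [0.5]`, the residual collapses to a single impulse, which is the behaviour the voiced-sound model relies on.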

The outputs of the sound source determination circuit 20 are thus the three kinds of information above (pitch period Tp, voiced/unvoiced decision V/UV, and amplitude A). Schemes that transmit the residual signal from the sound source extraction filter 19 as it is, or that pass it through a low-pass filter and encode it with a separate coding method, have also been proposed, but they are not common because the amount of information passing through the transmission path 12 (or stored in the storage device) becomes very large. The three sound source information signals (Tp, V/UV, and A) are each encoded into a few bits by the encoder 23.

Similarly, the parameters calculated by the parameter calculation circuit 18 are encoded into a few bits each by the encoder 23 and sent to the synthesizer 13 through the transmission path 12 (or storage device) once per frame period, which is about 5 to 30 milliseconds. In the synthesizer 13, the decoder 24 decodes the encoded signals that arrive every frame period Tf of 5 to 30 milliseconds.

Of the signals decoded by the decoder 24, the sound source information signal 26 is input to the sound source generation circuit 27, and the parameters 25 (α parameters, k parameters, or LSP parameters) are input to the speech synthesis filter 28. The sound source generation circuit 27 produces the signal that excites the synthesis filter 28; this signal corresponds to the voice source 1 in FIG. 1 and the excitation circuit 3 in FIG. 2.

The sound source generation circuit 27 consists of a pulse generator and a noise generator, one of which is selected by the voiced/unvoiced decision V/UV in the sound source information signal 26. The pulse interval of the pulse generator is determined by the pitch period information Tp in the sound source information signal 26.

The selected signal is scaled to the amplitude level of the speech by the amplitude information A in the sound source information signal 26. The resulting sound source signal is input to the synthesis filter 28.

The synthesis filter 28 uses the parameters decoded by the decoder 24 to form the vocal tract filter for synthesizing speech. Its transfer function is the H(jω) described above. The coefficients of the synthesis filter 28 are updated at the interval at which parameters arrive through the transmission path 12 (or storage device), that is, every frame period Tf. The input from the sound source generation circuit 27, on the other hand, arrives every sampling period T corresponding to the sampling frequency of the AD converter 16 in the analyzer 11 (125 microseconds for a sampling frequency of 8 kHz). The synthesis filter 28 therefore performs its computation every T seconds, and its output is sent to the DA converter 29 at an output period equal to the sampling period T.
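The two update rates described above (coefficients once per frame period Tf, output once per sampling period T) can be sketched as follows. The direct-form all-pole recursion, the carrying of filter state across frame boundaries, and the names `synthesize`, `frame_alphas`, and `frame_len` are assumptions for illustration; the patent does not specify the filter structure here.

```python
def synthesize(excitation, frame_alphas, frame_len):
    """All-pole synthesis: y[n] = e[n] + sum_i alphas[i] * y[n - 1 - i].

    frame_alphas holds one coefficient list per frame (order >= 1); the list
    in use changes every frame_len samples, while the output history is kept.
    """
    order = max(len(a) for a in frame_alphas)
    history = [0.0] * order            # most recent output first
    output = []
    for n, e in enumerate(excitation):
        frame = min(n // frame_len, len(frame_alphas) - 1)
        alphas = frame_alphas[frame]   # coefficients updated once per frame
        y = e + sum(a * h for a, h in zip(alphas, history))
        history = ([y] + history)[:order]
        output.append(y)               # one output sample per period T
    return output
```

With T = 125 microseconds, a `frame_len` of 256 samples would correspond to a 32 millisecond frame.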

The DA converter 29 converts the digital signal from the speech synthesis filter into an analog signal and sends it to the low-pass filter 30, which removes the high-frequency components generated by the digital-to-analog conversion. The output of the low-pass filter 30 is delivered through the analog output terminal 31 to external audio equipment (amplifier and loudspeaker) for listening. This is how the speech synthesizer 13 operates.

In this analysis and synthesis scheme, the parameter signal 21 calculated by the parameter calculation circuit 18 and the sound source information signal 22 determined by the sound source determination circuit 20 are encoded and transferred to the synthesis system every frame period of 5 to 30 milliseconds. The synthesizer builds the synthesis filter from the decoded parameter signal 25, generates the excitation signal from the decoded sound source information signal 26, and synthesizes the speech. In such a system the analysis section and the synthesis section are inverses of each other. If all of the information obtained by the analysis section were transferred to the synthesis section, synthesized speech of quite high quality could be expected. However, transferring all of that information through the transmission path 12 (or storage device) contradicts the goals of compressing the speech information and reducing the storage required. For those goals, the amount of speech information in the transmission path or storage device should be made as small as possible.

For this reason, the speech information sent from the analyzer 11 to the synthesizer 13 in FIG. 3 is limited to the speech parameters (α parameters, k parameters, or LSP parameters) and the sound source information signals (pitch period Tp of voiced sounds, voiced/unvoiced decision V/UV, and speech amplitude A).

Note that the residual signal itself, the output of the sound source extraction filter 19 in FIG. 3, is not encoded and transmitted or stored as the sound source information. Instead, the residual is classified as voiced or unvoiced and modeled as an impulse train or as random noise, respectively, and only the model information is transmitted (or stored). FIG. 4 shows a detailed block diagram of the sound source generation circuit 27.

In FIG. 4, 40 is a noise generation circuit, 41 is an impulse generation circuit, 42 is the input terminal for the pitch period information Tp of voiced sounds, 43 is a changeover switch, 44 is the input terminal for the voiced/unvoiced decision V/UV, 45 is an amplitude multiplication circuit, 46 is the input terminal for the speech amplitude information A, and 47 is the output terminal for the sound source signal. The sound source generation circuit 27 is configured from these elements.

The function of the sound source generation circuit 27 is explained with reference to FIG. 4.

The noise generation circuit 40 generates a random noise sample every T seconds, with an amplitude of +1 or -1.

The impulse generation circuit 41 generates an impulse of amplitude 1 every Tp seconds, according to the pitch period information Tp supplied at the input terminal 42. The changeover switch 43 selects either the output of the noise generation circuit 40 or the output of the impulse generation circuit 41, using the voiced/unvoiced decision V/UV supplied at terminal 44. When V/UV is "1", the sound is judged to be voiced and the output of the impulse generation circuit 41 is selected. When V/UV is "0", the sound is judged to be unvoiced and the output of the random noise generation circuit 40 is selected. The selected signal is passed to the amplitude multiplication circuit 45, where it is multiplied by the speech amplitude information A supplied at the input terminal 46, and is output from the output terminal 47 as the sound source signal. This signal is the input to the synthesis filter 28. A scheme has also been proposed in which, to reduce the word length of the arithmetic in the synthesis filter 28, the output of the switch 43 is fed to the synthesis filter 28 before multiplication by A, and A is applied to the output of the synthesis filter 28 instead. Since this is not directly related to the present proposal, the amplitude information A is assumed here to be applied before the input to the synthesis filter 28.
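The behaviour of the sound source generation circuit of FIG. 4 can be sketched in a few lines. This is a simplified model under stated assumptions: the pitch period is given as a whole number of sampling periods, the first impulse falls on the first sample, and Python's `random` module stands in for the noise generation circuit 40. The function and parameter names are hypothetical.

```python
import random


def excitation_signal(num_samples, voiced, pitch_samples, amplitude, seed=0):
    """Generate num_samples of excitation.

    voiced=True  -> unit impulses every pitch_samples samples (circuit 41)
    voiced=False -> random +1 / -1 noise samples (circuit 40)
    The selected signal is then scaled by amplitude (circuit 45).
    """
    rng = random.Random(seed)          # seeded for repeatability
    samples = []
    for n in range(num_samples):
        if voiced:                     # switch 43 selects the impulse train
            s = 1.0 if n % pitch_samples == 0 else 0.0
        else:                          # switch 43 selects the noise source
            s = rng.choice((1.0, -1.0))
        samples.append(amplitude * s)
    return samples
```

For example, `excitation_signal(8, True, 4, 2.0)` places an impulse of height 2.0 on samples 0 and 4 and zeros elsewhere.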

Consider now the sound source signal for voiced sounds in particular. As described above, an impulse train is normally used as the sound source for voiced sounds, and its period is given by the pitch period information Tp in the sound source information signal 26.

The pitch period Tp is typically about 5 to 10 msec for male speakers and about 3 to 5 msec for female speakers. As noted earlier, the sound source information and the other data are updated once per frame period of 5 to 30 msec. When the frame period is long, that is, when the speech is compressed to a low bit rate and the frame period is set to around 30 msec, the voiced excitation contains several impulses within one frame.

Within one frame period, these pulses are all spaced at the same pitch period Tp.

FIG. 5a shows the sound source signal when the frame period Tf is 30 milliseconds, the pitch period Tp in the n-th frame is 8 milliseconds, and the pitch period Tp in the (n+1)-th frame is 10 milliseconds.

In FIG. 5a, the impulse generation circuit 41 in the sound source generation circuit 27 generates four impulses at 8-millisecond intervals within the n-th frame period and three impulses at 10-millisecond intervals within the (n+1)-th frame period.
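The impulse counts quoted for FIG. 5a follow from simple arithmetic, sketched below. The sketch assumes that the first impulse of each frame falls at the start of the frame, which the text does not state explicitly; `impulse_times_ms` is a hypothetical helper name.

```python
def impulse_times_ms(frame_ms, pitch_ms):
    """Times (in ms from the frame start) of impulses spaced pitch_ms apart."""
    return list(range(0, frame_ms, pitch_ms))


frame_n = impulse_times_ms(30, 8)        # 8 ms pitch period in frame n
frame_n_plus_1 = impulse_times_ms(30, 10)  # 10 ms pitch period in frame n+1
```

Under this assumption the 30 millisecond frame holds four impulses at an 8 millisecond spacing and three at a 10 millisecond spacing, matching the figure description.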

This impulse interval of 8 or 10 milliseconds, that is, the pitch period Tp, can be regarded as the average pulse interval of the residual signal in the analyzer 11 over the frame period. In contrast, the pulse interval of the excitation waveform actually produced by the vocal cords presumably changes slowly, as in FIG. 5b. Therefore, rather than having the impulse generation circuit 41 in the synthesizer 13 use a pulse train whose spacing changes discontinuously from frame to frame (all pulses 8 milliseconds apart in the n-th frame, all pulses 10 milliseconds apart in the (n+1)-th frame), driving the synthesis filter with an impulse train whose spacing changes gradually, as in FIG. 5b, should resemble the actual residual signal more closely and improve the sound quality.

The object of the present invention is to improve the quality of synthesized speech. The invention is characterized in that the spacing of the pulse train of the voiced sound source signal, which is the input to the synthesis filter 28, is changed gradually within the frame period Tf. This prevents the pulse spacing from becoming discontinuous between frames and provides a voiced excitation closer in form to the residual signal in the analyzer 11. The details are described below. As stated earlier, each item of information sent every frame period is an average value over that frame, so considerable discontinuities are likely to exist between frame periods.

The actual quantities, on the other hand, change slowly and continuously, and reducing the discontinuities would require a short frame period.

However, a short frame period increases the amount of information. For this reason, a process called "interpolation" is normally used to remove discontinuities in the various items of information when a long frame period is used. FIG. 6 illustrates the interpolation process.

In FIG. 6, 50 denotes the information received in each frame period, and 51 denotes the information after interpolation.

Interpolation is normally applied to the parameters 25 (α parameters of the LPC method, k parameters of the PARCOR method, or LSP parameters of the LSP method) and to the speech amplitude information A in the sound source information signal 26, but no synthesizer that interpolates the pitch period Tp has been proposed. FIG. 7 shows the pitch period interpolation circuit, which interpolates the pitch period information Tp used in the voiced sound source generator of the speech synthesizer of the present invention.

In FIG. 7, 60 is the input terminal for the pitch period information Tp, 61 is a subtracter, 62 is a shifter, 63 is an interpolation unit register, 64 is an adder, 65 is a pitch period register, 66 is a counter that provides the pulse generation timing, 67 is the pulse generation timing signal, 68 is a pulse or pulse train generation circuit, and 69 is the output terminal for the voiced sound source signal (the signal before multiplication by the speech amplitude information A).

FIG. 8a shows a time chart of the interpolation process, and FIG. 8b shows a partially enlarged view of FIG. 8a.

The first clock signal 70 is generated every frame period Tf. It provides the timing of the subtraction in the subtracter 61 and the load timing of the difference register 63 (the interpolation unit register).

The second clock signal 71 is generated every interpolation period T (equal to the output clock period of the synthesis filter and to the sampling period of the AD converter in the analyzer) and serves as the drive pulse for the interpolation unit register 63 and the pitch period register 65. To simplify the circuit, the period Tf of the first clock signal 70 is usually chosen to be a power-of-two multiple of the period T of the second clock signal 71. In this embodiment, as an example, the interpolation period is T = 125 microseconds and the frame period is Tf = 32 milliseconds, so that T : Tf = 1 : 256 = 1 : 2^8.

Accordingly, 256 pulses of the second clock signal 71 occur between pulses 70000 and 70001 of the first clock signal 70. These pulses are numbered 71000 to 71255. Pulse 70000 of the first clock signal 70 coincides with pulse 71000 of the second clock signal 71, and pulse 70001 of the first clock signal 70 coincides with pulse 71256 of the second clock signal 71. The stepped line 72 is an analog-style plot of the contents of the pitch period register 65 at each tick of the second clock signal, and the stepped line 73 is an analog-style plot of the contents of the counter 66. The pulse generation timing signal 74 shows the timing of the impulses issued by the counter 66. The operation of the pitch period interpolation circuit, which interpolates between the different pitch period values Tp of adjacent frame periods Tf, is now explained with reference to FIG. 7 and FIGS. 8a and 8b.

The pitch period information Tp of the voiced sound, supplied at the input terminal 60 at the start of each frame period, is a 7-bit digital code.

Note that this digital code is the output of the decoder 24 in the synthesizer 13 of FIG. 3, not the code as received through the transmission path 12 (or storage device). The 7-bit width was chosen to cover the range of human pitch periods, as briefly explained here. The pitch period Tp is expressed in units of the output period T of the synthesis filter: a code of 1 at terminal 60 means Tp = T, a code of 2 means Tp = 2T, a code of 3 means Tp = 3T, and so on.

Therefore, when T = 125 microseconds, the largest 7-bit code at terminal 60 is 127, which gives 127 × 125 microseconds = 15.875 milliseconds, so the human pitch period range of 3 to 10 milliseconds can be represented.
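The range check above can be written out as a short calculation. The constant and function names below are illustrative only; the values T = 125 microseconds and the 7-bit code width come from the text.

```python
T_US = 125          # output period T of the synthesis filter, in microseconds
CODE_BITS = 7       # width of the pitch period code at terminal 60
MAX_CODE = (1 << CODE_BITS) - 1   # largest 7-bit code


def pitch_period_us(code):
    """Pitch period in microseconds for a pitch code (Tp = code * T)."""
    if not 1 <= code <= MAX_CODE:
        raise ValueError("pitch code must fit in 7 bits and be at least 1")
    return code * T_US


def code_for_ms(period_ms):
    """Pitch code for a period given in whole milliseconds."""
    return (period_ms * 1000) // T_US
```

Pitch periods of 3 and 10 milliseconds correspond to codes 24 and 80, both well inside the 7-bit range.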

入力端子６０からのピツチ周期Ｔ，のデイジタル符号は
第８図ａにおけるパルス７００００のタイミングで入力
される。次に、減算器６１において、今回入力されたピ
ツチ周期Ｔ，のデイジタル符号とピツチ周期レジスタ６
５の内容との減算が行われる。The digital code of the pitch period Tp from the input terminal 60 is input at the timing of pulse 70000 in FIG. 8a. Next, the subtracter 61 takes the difference between the newly input digital code of the pitch period Tp and the contents of the pitch period register 65.

ピツチ周期６５には前のフレーム周期における補間処理
のピッチ周期情報の最終値が格納されている。ピツチ周
期レジスタ６５のビツト構成は小数点以上７ビツト、小
数点以下８ビツトの計１５ビツトである。小数点以上の
７ビツトは入力端子６０から入力されるピツチ周期情報
Ｔ，のデイジタルビツト構成の７ビツトに対応している
。減算器６１では、前フレーム周期における最終ピツチ
周期情報と今のフレーム周期情報との差分を演算してい
ることになる。減算器６１の出力は、シフタ６２によつ
て１／２５６される。このシフタ６２は、シフトレジス
タ等を用いてもよいが、配線を工夫することによつても
実現できる。すなわち、減算器６１の出力を８ビツト右
にシフトする結線を減算器６１と補間処理単位レジスタ
６３との間で行つている。従つて、シフタ６２に相当す
る素子的な構成要素は特に設けなくてもよい。シフタ６
２によつて１／２５６する理由は、減算器６１によつて
得られた隣接するフレーム周期間Ｔｆでのピツチ周期Ｔ
，の差を補間処理周期Ｔあたりの補間処理単位に変換す
るためで、１フレーム周期Ｔｆと補間処理周期Ｔとの比
を２のべき乗（本処理では２５６）に選んだ処理は、こ
の変換処理の構成を簡略化するためである。シフタ６２
によつて８ビツト右にシフトされた減算器６１の出力は
、補間処理単位レジスタ６３に補間処理単位として格納
される。補間処理単位レジスタ６３の内容は前述したよ
うに補間処理周期Ｔあたりのフレーム周期間での差を意
味している。補間処理単位レジスタ６３のデイジタル符
号構成は符号１ビツトと小数点以下８ビツトによつて構
成される。次に加算器６４において、補間処理単位レジ
スタ６３とピツチ周期レジスタ６５が加算され、その結
果はパルス７１００１のタイミングによつてピッチ周期
レジスタ６５に読み込まれる。The pitch period register 65 holds the final interpolated pitch period value of the previous frame period. The pitch period register 65 is 15 bits wide: 7 bits above the binary point and 8 bits below it. The 7 integer bits correspond to the 7 bits of the pitch period information Tp input from the input terminal 60. The subtracter 61 therefore computes the difference between the final pitch period value of the previous frame period and the pitch period information of the current frame period. The output of the subtracter 61 is multiplied by 1/256 by the shifter 62. The shifter 62 may be built from a shift register or the like, but it can also be realized purely by wiring: the connections between the subtracter 61 and the interpolation processing unit register 63 are arranged so that the output of the subtracter 61 is shifted right by 8 bits. No discrete component corresponding to the shifter 62 is therefore required. The shifter 62 scales by 1/256 in order to convert the difference in pitch period Tp between adjacent frame periods Tf, obtained by the subtracter 61, into an interpolation processing unit per interpolation processing period T. The ratio of one frame period Tf to the interpolation processing period T is chosen to be a power of 2 (256 in this case) to simplify this conversion. The output of the subtracter 61, shifted right by 8 bits by the shifter 62, is stored in the interpolation processing unit register 63 as the interpolation processing unit. As described above, the contents of the interpolation processing unit register 63 represent the inter-frame difference per interpolation processing period T. The interpolation processing unit register 63 consists of 1 sign bit and 8 bits below the binary point. Next, the adder 64 adds the contents of the interpolation processing unit register 63 to those of the pitch period register 65, and the result is read into the pitch period register 65 at the timing of pulse 71001.
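The subtract, shift, and accumulate sequence described above can be sketched in Python as fixed-point integer arithmetic. This is only an illustrative model, not the patented circuit: the function name `interpolate_frame` is hypothetical, the register value is held as an integer in units of 1/256, and Python's `>>` (which rounds toward negative infinity) is assumed to stand in for the wired 8-bit right shift, whose exact treatment of negative values the text does not specify.

```python
FRAC_BITS = 8                      # bits below the binary point in register 65
STEPS_PER_FRAME = 1 << FRAC_BITS   # 256 interpolation periods per frame

def interpolate_frame(reg65: int, new_code: int):
    """Return the 256 successive values of register 65 (units of 1/256).

    reg65    -- register 65 contents at the end of the previous frame
    new_code -- 7-bit pitch period code received for the current frame
    """
    diff = (new_code << FRAC_BITS) - reg65   # subtracter 61
    step = diff >> FRAC_BITS                 # shifter 62 -> register 63
    trajectory = []
    for _ in range(STEPS_PER_FRAME):         # one addition per clock 71 pulse
        reg65 += step                        # adder 64 writes back to register 65
        trajectory.append(reg65)
    return trajectory

traj = interpolate_frame(40 << FRAC_BITS, 72)
print(traj[0] / 256, traj[-1] / 256)  # 40.125 72.0
```

Because the shift discards the low 8 bits of the difference, the register can end a frame slightly short of the target when its starting value has a nonzero fraction. In this model the shortfall is less than one unit of the integer part and is folded into the next frame's difference.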

パルス７１００１からパルス７１００２までは、補間処
理単位レジスタ６３の内容とピツチ周期レジスタ６５の
内容を加算器６４で加算し、その結果をパルス７１００
２のタイミングによつてピッチ周期レジスタ６５に読み
込む処理を行う。Between pulse 71001 and pulse 71002, the adder 64 adds the contents of the interpolation processing unit register 63 to the contents of the pitch period register 65, and the result is read into the pitch period register 65 at the timing of pulse 71002.

このとき、補間処理単位レジスタ６３の内容は変化しな
い。パルス７１００２からパルス７１００３までも同様
の処理を行う。以下２５４回、同様の処理を行い、パル
ス７１２５６によつてフレーム周期内の最終の補間デー
タがピツチ周期レジスタ６５に読み込まれる。At this time, the contents of the interpolation processing unit register 63 do not change. Similar processing is performed for pulses 71002 to 71003. The same process is repeated 254 times, and the final interpolated data within the frame period is read into the pitch period register 65 by pulse 71256.

ピツチ周期レジスタ６５に格納される補間されたピツチ
周期情報の変化のようすを第８図Ａ，ｂの階段状線７２
で示す。ピツチ周期レジスタ６５のデイジタル出力のう
ち、小数点以上の７ビツトはカウンタ６６のプリセツト
入力となる。The way the interpolated pitch period information stored in the pitch period register 65 changes is shown by the stepped line 72 in FIGS. 8a and 8b. Of the digital output of the pitch period register 65, the 7 bits above the binary point serve as the preset input to the counter 66.

カウンタ６６はクロツク７１によつてダウンカウント動
作を行い、カウンタ出力値がＯになつたとき、レジスタ
６５の小数点以上の７ビツトの値をロードする。また、
このとき、出力端子６７からインパルスを発生する。第
８図ｂのタイムチヤートに従つて、カウンタ６６の動作
を説明する。The counter 66 counts down on the clock 71, and when its output value reaches 0 it loads the 7 integer bits of the register 65. At the same time, an impulse is generated at the output terminal 67. The operation of the counter 66 is explained below with reference to the time chart of FIG. 8b.

パルス７１０５０が入つてきたときのカウンタ６６の内
容を０００００１０１レジスタ６５の内容を０００１０
００．００００００００、差分レジスタ６３の内容を０
．０１００００００とする。Suppose that when pulse 71050 arrives, the counter 66 contains 00000101, the register 65 contains 0001000.00000000, and the difference register 63 contains 0.01000000.

すると、各パルス時におけるピツチ周期レジスタ６５の
内容、カウンタ６６の内容、パルス発生のタイミング信
号６７の動作は次の表１の如くなる。前記表１に従う動
作におけるカウンタ６６の内容の変化の様子を第８図ｂ
の階段状線７３で示し、カウンタ６６から得られるパル
ス発生タイミング信号６７を第８図ｂに示す。パルス発
生タイミング信号６７によつてパルス発生回路はパルス
又はパルス列を発生させ、音源信号出力端子６９から有
声音の音源信号を出力する。パルス発生回路６８によつ
て発生するパルスはインパルス、又はバーカ一系列信号
、正弦二乗波など、各信号が提案されているが、本発明
ではそれらのパルスの発生を与えるパルス発生タイミン
グ信号の間隔を補間することを特徴としており、パルス
又はパルス列の種類は問題ではない。以上の如く、有声
音源発生回路は動作する。本発明によるピツチ周期の補
間処理を行うことによつてフレーム間でのピツチ周期の
不連続性を軽減することができ、そのため合成音の抑揚
がなめらかになり、合成音の音質が向上する。With these values, the contents of the pitch period register 65, the contents of the counter 66, and the behavior of the pulse generation timing signal 67 at each pulse are as shown in Table 1 below. The stepped line 73 in FIG. 8b shows how the contents of the counter 66 change during the operation of Table 1, and FIG. 8b also shows the pulse generation timing signal 67 obtained from the counter 66. In response to the pulse generation timing signal 67, the pulse generation circuit 68 generates a pulse or pulse train, and the voiced sound source signal is output from the sound source signal output terminal 69. Various waveforms have been proposed for the pulses generated by the pulse generation circuit 68, such as an impulse, a Barker sequence, or a sine-squared wave; the present invention, however, is characterized by interpolating the intervals of the pulse generation timing signal that triggers those pulses, so the type of pulse or pulse train does not matter. The voiced sound source generation circuit operates as described above. The pitch period interpolation of the present invention reduces the discontinuity in pitch period between frames, so the intonation of the synthesized speech becomes smoother and its sound quality improves.
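The interplay of register 65 and counter 66 can be illustrated with a small Python simulation that starts from the example values given in the text (counter = 5, register 65 = 8.0, difference register 63 = 0.25). This is a sketch under stated assumptions and not a reconstruction of Table 1: the function name `run_counter` is hypothetical, the exact ordering of the register update, the decrement, and the reload within one clock period is assumed, the step is held constant (no frame boundary is crossed), and a guard against reloading zero is added for the model only.

```python
FRAC_BITS = 8  # bits below the binary point in register 65

def run_counter(counter: int, reg65: int, step: int, n_clocks: int):
    """Return the clock indices (0-based) at which an impulse is emitted.

    counter -- initial contents of counter 66
    reg65   -- initial contents of register 65, in units of 1/256
    step    -- contents of difference register 63, in units of 1/256
    """
    impulses = []
    for k in range(n_clocks):
        reg65 += step                 # adder 64 updates register 65
        counter -= 1                  # counter 66 counts down on clock 71
        if counter <= 0:
            # Reload with the integer bits of register 65 and fire an impulse.
            counter = max(1, reg65 >> FRAC_BITS)  # assumed guard against zero
            impulses.append(k)
    return impulses

print(run_counter(5, 8 << FRAC_BITS, 64, 40))  # [4, 13, 24, 38]
```

With these assumptions the impulse spacing grows from 9 to 11 to 14 clock periods as the register rises, rather than jumping once at a frame boundary, which is the smoothing effect the text describes.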

本発明は、フレーム周期間でのピツチ周期Ｔｐの不連続
性を解消する回路を有しているので、有声音時の音源と
してピツチ周期ごとのパルスもしくはパルス列を用いる
音声合成装置において利用することができる。Because the present invention includes a circuit that eliminates the discontinuity of the pitch period Tp between frame periods, it can be used in speech synthesis devices that use a pulse or pulse train at each pitch period as the sound source for voiced sounds.

[Brief explanation of drawings]

第１図は発生機構のモデル化の図、第２図は発声機構を
電気回路におきかえたモデル図、第３図は従来の音声の
分析合成装置、第４図は音源発生回路の説明図、第５図
ａはｎ番目とｎ＋１番目のフレームにおける有声音のピ
ツチ周期Ｔ，の不連続性を説明する図、第５図ｂは実際
の有声音時の励振波形を示した図、第６図は補間処理を
示した図、第７図はピツチ周期補間処理回路を示した図
、第８図ａはピツチ周期の補間処理を説明するためのタ
イムチヤート、第８図ｂは第８図ａを部分的に拡大して
示したタイムチヤートである。１１・・・・・・音声の分析装置、１２・・・・・・伝
送路又は記憶装置、１３・・・・・・音声の合成装置、
１４・・・・・・音声のアナログ信号入力端子、１５・
・・・・・ローパスフイルタ、１６・・・・・・ＡＤ変
換器、１７・・・・・泪已相関回路、１８・・・・・・
パラメータ算出回路、１９・・・・・・音源抽出フイル
タ、２０・・・・・・音源判定回路、２１・・・・・・
パラメータ（αパラメータ、又はｋパラメータ、又はＬ
ＳＰパラメータ）、２２・・・・・・音源情報信号（Ｔ
，、Ｖ／ＵＶ．Ａ）、２３・・・・・・符号器、２４・
・・・・・復号器、２５・・・・・・復号されたパラメ
ータ、２６・・・・・・復号された音源情報信号、２７
・・・・・・音源発生回路、２８・・・・・・合成フィ
ルタ、２９・・・・・・ＤＡ変換器、３０・・・・・・
ローパスフイルタ、３１・・・・・・合成音声出力端子
、４０・・・・・・雑音発生回路、４１・・・・・・イ
ンパルス発生回路、４２・・・・・・ピッチ周期情報Ｔ
入力端子、４３・・・・・・スイッチ、４４・・・・・
・有声・Ｐ・）
無声判定信号Ｖ／ＵＶ入力端子、４５・・・・・・振幅
乗算回路、４６・・・・・・振幅Ａの入力端子、４７・
・・・・・音源信号出力端子、６０・・・・・・ピツチ
周期情報Ｔのｐ入力端子、６１・・・・・・減算器、６
２・・・・・・シフタ、６３・・・・・・差分レジスタ
、６４・・・・・・加算器、６５・・・・・・レジスタ
、６６・・・・・・カウンタ、６７・・・・・・パルス
発生タイミング信号、６８・・・・・・パルス又はパル
ス列発生回路、６９・・・・・・有声音の音源信号出力
端子、７０・・・・・・フレーム周期Ｔｆる表わす第１
クロツク信号、７００００，７０００１・・・・・・第
１クロツク信号７０のパルス、７１・・・・・・補間処
理周期Ｔを表わす第２クロツク信号、７１０００〜７１
２５６・・・・・・第２クロツク信号７１のパルス、７
２・・・・・・レジスタ６５の内容を表わす階段状線、
７３・・・・・・カウンタ６６の内容を示す階段状線、
７４・・・・・・カウンタ６０より発せられるインパル
スのタイミング。FIG. 1 is a diagram modeling the vocal mechanism; FIG. 2 is a model in which the vocal mechanism is replaced by an electric circuit; FIG. 3 shows a conventional speech analysis and synthesis apparatus; FIG. 4 is an explanatory diagram of the sound source generation circuit; FIG. 5a illustrates the discontinuity of the voiced-sound pitch period Tp between the n-th and (n+1)-th frames; FIG. 5b shows the excitation waveform of an actual voiced sound; FIG. 6 illustrates the interpolation process; FIG. 7 shows the pitch period interpolation processing circuit; FIG. 8a is a time chart explaining the pitch period interpolation process; and FIG. 8b is a time chart showing part of FIG. 8a enlarged.

11 ... speech analysis device
12 ... transmission line or storage device
13 ... speech synthesis device
14 ... analog speech signal input terminal
15 ... low-pass filter
16 ... AD converter
17 ... autocorrelation circuit
18 ... parameter calculation circuit
19 ... sound source extraction filter
20 ... sound source determination circuit
21 ... parameters (α parameters, k parameters, or LSP parameters)
22 ... sound source information signal (Tp, V/UV, A)
23 ... encoder
24 ... decoder
25 ... decoded parameters
26 ... decoded sound source information signal
27 ... sound source generation circuit
28 ... synthesis filter
29 ... DA converter
30 ... low-pass filter
31 ... synthesized speech output terminal
40 ... noise generation circuit
41 ... impulse generation circuit
42 ... pitch period information Tp input terminal
43 ... switch
44 ... voiced/unvoiced decision signal V/UV input terminal
45 ... amplitude multiplier circuit
46 ... amplitude A input terminal
47 ... sound source signal output terminal
60 ... pitch period information Tp input terminal
61 ... subtracter
62 ... shifter
63 ... difference register
64 ... adder
65 ... register
66 ... counter
67 ... pulse generation timing signal
68 ... pulse or pulse train generation circuit
69 ... voiced sound source signal output terminal
70 ... first clock signal representing the frame period Tf
70000, 70001 ... pulses of the first clock signal 70
71 ... second clock signal representing the interpolation processing period T
71000 to 71256 ... pulses of the second clock signal 71
72 ... stepped line representing the contents of the register 65
73 ... stepped line representing the contents of the counter 66
74 ... timing of the impulses emitted by the counter 66

Claims

[Claims]

1. A speech synthesizer characterized in that the voiced sound source generation section of a linear-prediction speech synthesizer comprises a pitch period interpolation processing circuit having: a first register 65 that stores an interpolation processing result; a subtracter that takes the difference between the voiced-sound pitch period information sent every frame period and the interpolation processing result stored in the first register 65; a second register 63 that stores an interpolation processing unit obtained by converting that difference so as to correspond to the interpolation processing period; an adder that successively adds the interpolation processing unit stored in the second register 63 to the interpolation processing result stored in the first register 65 and stores the sum back in the first register 65; and a counter 66, connected to a pulse generation circuit, that counts down every interpolation processing period and, when its stored value reaches a specific value, updates its stored value with the contents of the first register 65 and generates a signal that activates the pulse generation circuit.