JPH0782358B2 - Speech analysis and synthesis method

Info

Publication number: JPH0782358B2
Application number: JP58225928A
Authority: JP
Inventors: 正久古屋; 康彦新居
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1983-11-30
Filing date: 1983-11-30
Publication date: 1995-09-06
Anticipated expiration: 2010-09-06
Also published as: JPS60118899A

Description

Field of Industrial Application

The present invention relates to a speech analysis and synthesis method that, in a speech analysis-synthesis system, reduces the amount of information used for speech synthesis without degrading the quality of the synthesized speech.

Configuration of the Conventional Example and Its Problems

In general, a speech analysis and synthesis method analyzes a speech signal at a frame period m, as shown in FIG. 1, and extracts the section type (silent section, unvoiced section, or voiced section), spectral parameters, a sound source amplitude control parameter, and a driving sound source period parameter. Speech is then synthesized from these parameters and a driving sound source by means of a digital filter. Each parameter is updated every frame period m.

FIG. 2 shows the sound source amplitude control parameter B and the driving sound source period parameter C obtained by analyzing speech, together with the original speech waveform A. As is clear from FIG. 2, the driving sound source period parameter (FIG. 2C) often has the same value in adjacent frames. Therefore, when the driving sound source periods of adjacent frames are equal, a repeat flag can be set to compress the information. This method removes about 60% of the driving sound source information. The sound source amplitude control parameter (FIG. 2B), however, almost never has the same value in adjacent frames, so no information compression is applied to it.
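The repeat-flag scheme described above can be illustrated with a minimal Python sketch. The patent does not give a bit-level format, so the function names and the use of `None` as the repeat flag are assumptions made only for illustration.

```python
from typing import List, Optional


def repeat_flag_encode(values: List[int]) -> List[Optional[int]]:
    """Store a value only when it differs from the previous frame.

    None stands in for the repeat flag (an assumption; the source does not
    specify how the flag is represented).
    """
    encoded: List[Optional[int]] = []
    for index, value in enumerate(values):
        if index > 0 and value == values[index - 1]:
            encoded.append(None)   # same as previous frame: flag only
        else:
            encoded.append(value)  # value changed: store it
    return encoded


def repeat_flag_decode(encoded: List[Optional[int]]) -> List[int]:
    """Rebuild the per-frame values by repeating the last stored value."""
    decoded: List[int] = []
    for item in encoded:
        decoded.append(decoded[-1] if item is None else item)
    return decoded


# Hypothetical period values: many adjacent frames are equal.
periods = [80, 80, 80, 82, 82, 90]
encoded_periods = repeat_flag_encode(periods)

# Hypothetical amplitude values: adjacent frames differ, so nothing is saved.
amplitudes = [10, 12, 15, 13]
encoded_amplitudes = repeat_flag_encode(amplitudes)
```

With these made-up inputs, half of the period frames collapse to a flag, while every amplitude frame still carries a value, which is the limitation the text attributes to the conventional method.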

FIG. 3 shows the frames that must carry a sound source amplitude control parameter value or a driving sound source period parameter value when speech is synthesized by the conventional method. Thanks to the repeat flag, only about 40% of all frames carry a value for the driving sound source period parameter, whereas the sound source amplitude control parameter carries a value in every frame. Thus the conventional speech analysis and synthesis method, which compresses information by setting a repeat flag only when the parameter values of adjacent frames are equal, has the drawback that it cannot compress the information of the sound source amplitude control parameter.

Object of the Invention

The object of the present invention is to eliminate the above drawback of the conventional example and to provide a speech analysis and synthesis method that compresses the information of the sound source amplitude control parameter and the driving sound source period parameter without degrading the quality of the synthesized speech.

Configuration of the Invention

To achieve the above object, the present invention approximates the sound source amplitude control parameter and the driving sound source period parameter, both obtained by analyzing speech, by polygonal lines across frames. Each parameter carries values only at the inflection point frames of its polygonal line. The sound source amplitude control parameter value and the driving sound source period parameter value required for each frame at synthesis time are obtained by interpolation from the parameter values of the adjacent inflection point frames. This makes it possible to compress the information of both parameters without degrading the quality of the synthesized speech.
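The synthesis-side interpolation can be sketched as follows. This is a minimal Python illustration that assumes linear interpolation between adjacent inflection point frames (the natural reading of a polygonal line, though the source only says "interpolation calculation"); the function name and the `(frame_index, value)` representation are hypothetical.

```python
from typing import List, Tuple


def interpolate_frames(breakpoints: List[Tuple[int, float]]) -> List[float]:
    """Recover one value per frame from inflection point frames.

    breakpoints: (frame_index, value) pairs with strictly increasing frame
    indices. The result starts at the first breakpoint's frame and ends at
    the last breakpoint's frame.
    """
    if not breakpoints:
        return []
    values = [breakpoints[0][1]]
    for (f0, v0), (f1, v1) in zip(breakpoints, breakpoints[1:]):
        if f1 <= f0:
            raise ValueError("frame indices must be strictly increasing")
        for frame in range(f0 + 1, f1 + 1):
            # Linear interpolation (assumed) between the two stored values.
            t = (frame - f0) / (f1 - f0)
            values.append(v0 + t * (v1 - v0))
    return values


# Three stored values are enough to rebuild seven frames.
stored = [(0, 0.0), (4, 8.0), (6, 4.0)]
per_frame = interpolate_frames(stored)
```

Here only frames 0, 4, and 6 carry stored values; frames 1 to 3 and frame 5 are computed at synthesis time.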

Description of the Embodiment

An embodiment of the present invention is described below with reference to the drawings. FIGS. 4A to 4E show the sound source amplitude control parameter (FIG. 4B) and the driving sound source period parameter (FIG. 4C) obtained by analyzing the same speech as in FIG. 2, their polygonal line approximations (FIGS. 4D and 4E), and the original speech waveform (FIG. 4A). The dots (・) in the figure mark the inflection points. The figure shows that, by approximating the sound source amplitude control parameter and the driving sound source period parameter by polygonal lines and keeping values only at the inflection point frames, the information of the driving sound source period parameter as well as that of the sound source amplitude control parameter can be compressed in the same way.
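The source does not say how the polygonal line is fitted or how the inflection point frames are chosen. The Python sketch below therefore uses one possible approach, labeled here as an assumption: a greedy scan that extends each straight segment for as long as every frame inside it stays within a chosen tolerance. The function names and the `tolerance` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def _segment_fits(values: Sequence[float], start: int, end: int,
                  tolerance: float) -> bool:
    """True if every frame strictly between start and end lies within
    tolerance of the straight line joining values[start] and values[end]."""
    v0, v1 = values[start], values[end]
    for frame in range(start + 1, end):
        on_line = v0 + (v1 - v0) * (frame - start) / (end - start)
        if abs(values[frame] - on_line) > tolerance:
            return False
    return True


def find_breakpoints(values: Sequence[float],
                     tolerance: float) -> List[Tuple[int, float]]:
    """Greedy polygonal line fit (an assumed method, not the patent's).

    Returns (frame_index, value) pairs for the frames that must be stored.
    """
    n = len(values)
    if n == 0:
        return []
    breakpoints = [(0, values[0])]
    start = 0
    while start < n - 1:
        end = start + 1
        # Extend the current segment while the fit stays within tolerance.
        while end + 1 < n and _segment_fits(values, start, end + 1, tolerance):
            end += 1
        breakpoints.append((end, values[end]))
        start = end
    return breakpoints


# An exactly piecewise-linear track: only the vertices need to be stored.
exact = find_breakpoints([0, 2, 4, 6, 8, 6, 4], tolerance=0.0)

# A slightly noisy rising track collapses to its two end frames.
noisy = find_breakpoints([0, 2.1, 4, 5.9, 8], tolerance=0.5)
```

A larger tolerance stores fewer frames at the cost of a coarser approximation; the tolerance that keeps synthesized speech quality intact is not stated in the source and would have to be determined separately.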

FIG. 5 shows the frames that must carry a sound source amplitude control parameter value or a driving sound source period parameter value in the synthesis method according to the present invention. In the figure, A denotes the frames whose parameter values are calculated by interpolation. It can be seen that the sound source amplitude control parameter needs a value in only about 30% of all frames, and the driving sound source period parameter likewise in only about 30% of all frames.
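As a rough check of these figures, the short Python sketch below counts stored parameter values per 100 frames, using the approximate percentages quoted in the text (100% and 40% for the conventional method, 30% and 30% for the present method). It counts values only and ignores any side information such as repeat flags or inflection point frame positions, whose cost the source does not state; the function name is hypothetical.

```python
def stored_values(frames: int, amplitude_percent: int, period_percent: int) -> int:
    """Number of amplitude plus period values stored for the given frames."""
    return frames * (amplitude_percent + period_percent) // 100


# Conventional method: amplitude in every frame, period in about 40% of frames.
conventional = stored_values(100, 100, 40)

# Present method: each parameter in about 30% of frames.
proposed = stored_values(100, 30, 30)
```

Under these assumptions the two parameters together need about 60 stored values per 100 frames instead of about 140.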

Effect of the Invention

Since the present invention is a speech analysis method as described above, it can compress the information of the sound source amplitude control parameter and the driving sound source period parameter without degrading the quality of the synthesized speech.

[Brief description of drawings]

FIG. 1 is a schematic diagram of a speech analysis and synthesis method. FIGS. 2A to 2C are waveform diagrams of the original speech waveform, the sound source amplitude control parameter, and the driving sound source period parameter in the conventional speech analysis and synthesis method. FIG. 3 is an explanatory diagram of the synthesis frame parameter values in the conventional method. FIGS. 4A to 4E are explanatory diagrams of the polygonal line approximation of the sound source amplitude control parameter and the driving sound source period parameter in the speech analysis and synthesis method according to one embodiment of the present invention. FIG. 5 is an explanatory diagram of the synthesis frame parameter values in the same method.

Claims

1. A speech analysis and synthesis method characterized in that a driving sound source amplitude control parameter obtained by analyzing speech is approximated by a polygonal line across frames, only the values at the inflection point frames of this polygonal line are retained for the driving sound source amplitude control parameter, and, at synthesis time, a value obtained by interpolation from the values at adjacent inflection point frames is used as the driving sound source amplitude control parameter value of each frame.

2. The speech analysis and synthesis method according to claim 1, wherein a driving sound source period parameter obtained by analyzing speech is approximated by a polygonal line across frames, only the values at the inflection point frames of this polygonal line are retained for the driving sound source period parameter, and, at synthesis time, a value obtained by interpolation from the values at adjacent inflection point frames is used as the driving sound source period parameter value of each frame.