JPS5912199B2 - Audio parameter modification method - Google Patents

Audio parameter modification method

Info

Publication number
JPS5912199B2
Authority
JP
Japan
Prior art keywords
speech
frequency
pitch frequency
bandwidth
polar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP56214568A
Other languages
Japanese (ja)
Other versions
JPS58111997A (en)
Inventor
亨 金盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP56214568A
Publication of JPS58111997A
Publication of JPS5912199B2
Expired

Landscapes

  • Details Of Television Systems (AREA)
  • Stereo-Broadcasting Methods (AREA)
  • Selective Calling Equipment (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Description

Detailed Description of the Invention

(1) Technical Field of the Invention

The present invention relates to a speech parameter modification method for a speech analysis-synthesis system. The method automatically suppresses sounds that are unpleasant to listen to, such as abnormal noises, in the synthesized speech.

(2) Background of the Technology

In linear prediction methods such as PARCOR and LSP, the spectral envelope of speech is generally approximated by an all-pole model

$$H(z) = \frac{1}{1 + \sum_{i=1}^{P} \alpha_i z^{-i}}$$

where $P$ is the prediction order and $\alpha_i$ are the linear prediction coefficients. Many sounds do not fit this model well, for example nasal consonants such as N and M and the moraic nasal. For such sounds, and for speech with a low first formant frequency (such as the vowel /i/) whose fundamental frequency nearly coincides with the first formant frequency, the bandwidth of the first formant may be analyzed as abnormally narrow.
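The effect of a narrow first-formant bandwidth on the all-pole model can be illustrated numerically. The sketch below is not taken from the patent: the function name `envelope_db`, the 8 kHz sampling rate, and the second-order example coefficients are assumptions chosen only to show how sharply the envelope peaks when a pole sits close to the unit circle.

```python
import cmath
import math


def envelope_db(alpha, freq_hz, sample_rate_hz):
    """Gain in dB of H(z) = 1 / (1 + sum_i alpha[i-1] * z**-i) at freq_hz."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    a = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(alpha))
    return -20.0 * math.log10(abs(a))


# Hypothetical 2nd-order model: one conjugate pole pair at 300 Hz with
# radius 0.995 (a very narrow resonance), sampled at an assumed 8 kHz.
SAMPLE_RATE_HZ = 8000.0
radius = 0.995
angle = 2.0 * math.pi * 300.0 / SAMPLE_RATE_HZ
example_alpha = [-2.0 * radius * math.cos(angle), radius * radius]

gain_at_pole = envelope_db(example_alpha, 300.0, SAMPLE_RATE_HZ)
gain_away = envelope_db(example_alpha, 600.0, SAMPLE_RATE_HZ)
```

With these assumed values the gain at 300 Hz is several tens of dB above the gain at 600 Hz, which is the kind of peak that a coinciding pitch harmonic would excite.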

In such cases, the synthesized speech often has a very large amplitude or contains abnormal sounds that are unpleasant to listen to.

(3) Prior Art and Problems

Conventionally, improving the quality of a particular piece of synthesized speech in speech analysis-synthesis required repeating the following work by hand: synthesizing and listening again and again to locate the points in time where a parameter abnormality occurs, and then trying suitable corrections to the parameters.

Moreover, it was not always possible to determine accurately which part of the speech parameters caused the abnormality in the synthesized speech. For example, when a short but harsh abnormal sound such as "gyon, gyon" was mixed into the synthesized speech, parameter correction became a matter of trial and error, and the work was inefficient.

(4) Object of the Invention

An object of the present invention is to improve the quality of synthesized speech by automatically detecting and correcting parameters of the kind described above, which are particularly likely to produce abnormal sounds.

(5) Structure of the Invention

To detect and correct abnormal parameters automatically, the present invention focuses on the following observation: when a pole frequency with a narrow bandwidth in the spectral envelope lies close to the pitch frequency, the power level of the synthesized speech near that pole frequency increases abnormally and causes abnormal sounds.

FIG. 1 is an explanatory diagram of the relationship between the pole frequency and the pitch frequency described above.

In the figure, F is a spectral envelope approximately represented by a plurality of pole frequencies Fi and bandwidths Bi. When the pitch frequency Pi lies close to a pole frequency Fi (for example, 300 Hz) whose bandwidth Bi is particularly narrow among the pole bandwidths, for example within a few Hz to about 30 Hz of it, the present invention judges the parameter to be abnormal and shifts the pitch frequency by a suitable amount.

Based on this principle, the present invention is configured as follows. In a speech analysis-synthesis system that analyzes a speech wave, extracts information on the pitch frequency and information on the spectral envelope, and performs speech synthesis using these as a parameter time series, the invention comprises: means for approximating the information on the spectral envelope by a function of pole frequencies and bandwidths; means for extracting, from among the pole frequencies of the function, those with a narrow bandwidth; means for determining the proximity between the pole frequencies of the function and the pitch frequency; and means for changing the pitch frequency when the extracting means and the determining means detect a pole frequency that has a narrow bandwidth and is close to the pitch frequency. By widening the interval between the narrow-bandwidth pole frequency and the pitch frequency, the invention suppresses the occurrence of abnormal sounds in the synthesized speech.

(6) Embodiment of the Invention

The present invention is described in detail below with reference to an embodiment.

FIG. 2 is a block diagram of a speech analyzer embodying the present invention. In the figure, 1 is a speech analysis processing unit and 2 is a speech parameter modification processing unit according to the present invention. In the speech analysis processing unit 1, 3 is a sound source information analysis unit that extracts pitch, amplitude, and voiced/unvoiced information from the input speech, and 4 is a linear prediction analysis unit that performs linear prediction analysis and PARCOR conversion in order to parameterize the spectral envelope information. The analysis outputs of units 3 and 4 are used for speech synthesis in a speech synthesizer (not shown). In the parameter modification processing unit 2, 5 is a conversion unit that converts the PARCOR coefficients output by the linear prediction analysis unit 4 back into linear prediction coefficients.
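The conversion performed by unit 5 is commonly done with the step-up recursion. The sketch below is one possible form, not the patent's own procedure: the function name `parcor_to_lpc` is hypothetical, and it assumes the sign convention in which the m-th PARCOR coefficient equals the last coefficient of the order-m predictor polynomial A(z) = 1 + sum of alpha_i z^-i. Some texts define PARCOR coefficients with the opposite sign, in which case each k must be negated first.

```python
def parcor_to_lpc(parcor):
    """Step-up recursion from PARCOR (reflection) coefficients to alpha_1..alpha_P.

    Assumed convention: A(z) = 1 + sum_i alpha_i * z**-i and alpha_m^(m) = k_m.
    """
    alpha = []
    for k in parcor:
        order = len(alpha)  # order of the polynomial built so far
        # alpha_i(new) = alpha_i(old) + k * alpha_(order+1-i)(old), for i = 1..order
        alpha = [alpha[i] + k * alpha[order - 1 - i] for i in range(order)] + [k]
    return alpha


# Usage with made-up reflection coefficients.
example_alpha = parcor_to_lpc([0.5, 0.2, 0.1])
```

Each pass raises the model order by one, so the cost grows with the square of the prediction order, which is small for typical orders of about 10.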

If linear prediction coefficients can be taken from the linear prediction analysis unit 4 as intermediate data, the conversion unit 5 is unnecessary. 6 is a root-finding unit. Letting A(z) denote the denominator of the transfer function H(z) of the all-pole model of the speech spectral envelope described above, it finds the roots on the z-plane that satisfy A(z) = 0, using the Newton-Raphson method or a similar technique.
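A minimal sketch of such a root-finding step follows, assuming A(z) = 1 + sum of alpha_i z^-i, so that A(z) = 0 is equivalent to the polynomial equation z^P + alpha_1 z^(P-1) + ... + alpha_P = 0. It applies Newton-Raphson from a fixed complex starting point and removes each found root by deflation. The function names, the starting point, and the tolerances are assumptions; a production implementation would need safeguards against non-convergence that are omitted here.

```python
import math


def poly_and_derivative(coeffs, z):
    """Evaluate a polynomial (highest power first) and its derivative at z."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def lpc_roots(alpha, max_iter=200, tol=1e-12):
    """Roots of z**P + alpha[0]*z**(P-1) + ... + alpha[P-1] by Newton-Raphson."""
    coeffs = [1 + 0j] + [complex(a) for a in alpha]
    roots = []
    while len(coeffs) > 2:
        z = 0.4 + 0.9j  # arbitrary complex starting guess (an assumption)
        for _ in range(max_iter):
            p, dp = poly_and_derivative(coeffs, z)
            if dp == 0:
                z += 0.1  # nudge away from a stationary point
                continue
            step = p / dp
            z -= step
            if abs(step) < tol:
                break
        roots.append(z)
        # Deflate: divide the polynomial by (x - z) using synthetic division.
        deflated = [coeffs[0]]
        for c in coeffs[1:-1]:
            deflated.append(c + deflated[-1] * z)
        coeffs = deflated
    if len(coeffs) == 2:
        roots.append(-coeffs[1] / coeffs[0])  # remaining linear factor
    return roots


# Usage: a made-up 2nd-order model with poles at radius 0.9, angle +/-0.5 rad.
example_alpha = [-2.0 * 0.9 * math.cos(0.5), 0.81]
example_roots = lpc_roots(example_alpha)
```

Deflation keeps the sketch short but accumulates rounding error for higher orders, so each root would normally be polished against the original polynomial.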

7 is a conversion unit that obtains the pole frequency and the bandwidth from each solution computed by the root-finding unit 6, with T as the sampling period. (The expressions for the solution, the pole frequency, and the bandwidth are missing from this text.)
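The expressions used by unit 7 did not survive in this text. A common textbook choice, assumed here and not confirmed by the source, writes each root as z = r e^{jθ} and takes the pole frequency as θ / (2πT) and the bandwidth as -ln(r) / (πT). The function name `pole_to_freq_bw` and the 8 kHz example are hypothetical.

```python
import cmath
import math


def pole_to_freq_bw(root, sampling_period_s):
    """Map a z-plane root to (pole frequency, bandwidth) in Hz.

    Assumed relations (not from the source): F = theta / (2*pi*T),
    B = -ln(r) / (pi*T), where root = r * exp(j*theta).
    """
    r, theta = cmath.polar(root)
    freq_hz = theta / (2.0 * math.pi * sampling_period_s)
    bandwidth_hz = -math.log(r) / (math.pi * sampling_period_s)
    return freq_hz, bandwidth_hz


# Usage: a root at radius 0.99 and 300 Hz, with an assumed 8 kHz sampling rate.
T = 1.0 / 8000.0
example_root = 0.99 * cmath.exp(1j * 2.0 * math.pi * 300.0 * T)
example_freq_hz, example_bw_hz = pole_to_freq_bw(example_root, T)
```

Only roots in the upper half-plane are needed, since each complex-conjugate pair describes a single resonance.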

8 is an extraction unit for pole frequencies Fi that have a narrow bandwidth Bi.

The narrowness of a bandwidth is judged, and the pole extracted, according to whether the value of Bi is at or below a certain threshold, or whether a second quantity (its expression is missing from this text) is at or above a certain threshold.
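The first of the two tests, comparing Bi with a fixed threshold, can be sketched as a simple filter. The function name `narrow_poles` and the 50 Hz threshold are assumptions, since the text does not give a value, and the second test is left out because its expression is missing.

```python
def narrow_poles(poles, max_bandwidth_hz=50.0):
    """Keep the (F_i, B_i) pairs whose bandwidth is at or below the threshold.

    The 50 Hz default is a placeholder, not a value from the source.
    """
    return [(f, b) for f, b in poles if b <= max_bandwidth_hz]


# Usage with made-up (frequency, bandwidth) pairs in Hz.
example_poles = [(300.0, 20.0), (1200.0, 150.0), (2500.0, 50.0)]
example_narrow = narrow_poles(example_poles)
```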

9 is a determination unit that judges the proximity between the pole frequency of each extracted narrow-bandwidth pole and the pitch information (frequency) Pi in the output of the sound source information analysis unit 3.

The proximity determination is based, for example, on the following inequality:

$$\frac{F_i}{\lvert F_i - P_i \rvert} > 100$$
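Reading the garbled criterion as Fi / |Fi - Pi| > 100, which is an assumption about the damaged expression, the check can be written without a division so that the case Fi = Pi needs no special handling. The function name `is_close` and its `ratio` parameter are hypothetical.

```python
def is_close(pole_hz, pitch_hz, ratio=100.0):
    """True when F_i / |F_i - P_i| > ratio, rearranged to avoid dividing by zero."""
    return pole_hz > ratio * abs(pole_hz - pitch_hz)


# Usage: with a 300 Hz pole, only pitches within 3 Hz satisfy the criterion.
close_case = is_close(300.0, 298.0)
far_case = is_close(300.0, 296.0)
```

Under this reading the tolerance scales with the pole frequency, being 1 percent of Fi.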

When a pole frequency Fi satisfying this criterion exists, it is judged to be an abnormal parameter and is output. 10 is a pitch frequency changing unit. It corrects the pitch frequency so that it is separated from the narrow-bandwidth pole frequency by at least a fixed amount, for example 30 Hz, and supplies the corrected value to the analysis output as Pi.
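One way the pitch frequency changing unit could behave is sketched below. The text does not say in which direction the pitch is moved, so this sketch assumes it is pushed further in the direction it already lies relative to the pole, and downward when the two coincide. The function name `separate_pitch` is hypothetical.

```python
def separate_pitch(pitch_hz, pole_hz, min_gap_hz=30.0):
    """Return a pitch at least min_gap_hz away from pole_hz.

    Direction of the shift is an assumption: away from the pole on the side
    where the pitch already is, and downward on an exact tie.
    """
    gap = pitch_hz - pole_hz
    if abs(gap) >= min_gap_hz:
        return pitch_hz  # already far enough, leave unchanged
    if gap > 0:
        return pole_hz + min_gap_hz
    return pole_hz - min_gap_hz


# Usage with a 300 Hz narrow-bandwidth pole.
shifted_down = separate_pitch(298.0, 300.0)
shifted_up = separate_pitch(305.0, 300.0)
unchanged = separate_pitch(250.0, 300.0)
```

Moving the pitch by up to 30 Hz changes the perceived intonation, so in practice the shift would probably be applied only to the flagged frames.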

Besides correcting the pitch frequency, other corrections are also possible, such as widening the bandwidth of the pole frequency or adjusting the time window.

(7) Effects of the Invention

Because the present invention can automatically correct, in advance, parameters that are likely to produce abnormal sounds, it makes the work of improving the quality of synthesized speech far easier.

Brief Description of the Drawings

FIG. 1 is an explanatory diagram of the principle of the present invention, and FIG. 2 is a block diagram of an embodiment of the present invention. In the figures, 1 is a speech analysis processing unit, 2 is a speech parameter modification processing unit, 3 is a sound source information analysis unit, 4 is a linear prediction analysis unit, 5 is a conversion unit from PARCOR coefficients to linear prediction coefficients, 6 is a root-finding unit, 7 is a conversion unit from roots to pole frequencies and bandwidths, 8 is a narrow-bandwidth extraction unit, 9 is a unit for determining the proximity between the pole frequency and the pitch frequency, and 10 is a pitch frequency changing unit.

Claims (1)

[Claims]

1. A speech parameter modification method for a speech analysis-synthesis system that analyzes a speech wave, extracts information on the pitch frequency and information on the spectral envelope, and performs speech synthesis using these as a parameter time series, comprising: means for approximating the information on the spectral envelope by a function of pole frequencies and bandwidths; means for extracting, from among the pole frequencies of the function, those with a narrow bandwidth; means for determining the proximity between the pole frequencies of the function and the pitch frequency; and means for changing the pitch frequency when the extracting means and the determining means detect a pole frequency that has a narrow bandwidth and is close to the pitch frequency, wherein the occurrence of abnormal sounds in the synthesized speech is suppressed by widening the interval between the narrow-bandwidth pole frequency and the pitch frequency.
JP56214568A 1981-12-25 1981-12-25 Audio parameter modification method Expired JPS5912199B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56214568A JPS5912199B2 (en) 1981-12-25 1981-12-25 Audio parameter modification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56214568A JPS5912199B2 (en) 1981-12-25 1981-12-25 Audio parameter modification method

Publications (2)

Publication Number Publication Date
JPS58111997A (en) 1983-07-04
JPS5912199B2 (en) 1984-03-21

Family

ID=16657867

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56214568A Expired JPS5912199B2 (en) 1981-12-25 1981-12-25 Audio parameter modification method

Country Status (1)

Country Link
JP (1) JPS5912199B2 (en)

Also Published As

Publication number Publication date
JPS58111997A (en) 1983-07-04

Similar Documents

Publication Publication Date Title
US6691083B1 (en) Wideband speech synthesis from a narrowband speech signal
McCree et al. A mixed excitation LPC vocoder model for low bit rate speech coding
US5455888A (en) Speech bandwidth extension method and apparatus
KR20040028932A (en) Speech bandwidth extension apparatus and speech bandwidth extension method
EP2096631A1 (en) Audio decoding device and power adjusting method
US20070061135A1 (en) Optimized windows and interpolation factors, and methods for optimizing windows, interpolation factors and linear prediction analysis in the ITU-T G.729 speech coding standard
JPH1097296A (en) Speech encoding method and apparatus, speech decoding method and apparatus
US8073687B2 (en) Audio regeneration method
US5812966A (en) Pitch searching time reducing method for code excited linear prediction vocoder using line spectral pair
JP3618217B2 (en) Audio pitch encoding method, audio pitch encoding device, and recording medium on which audio pitch encoding program is recorded
US6289305B1 (en) Method for analyzing speech involving detecting the formants by division into time frames using linear prediction
JPS5912199B2 (en) Audio parameter modification method
Wong On understanding the quality problems of LPC speech
JPS5912198B2 (en) Audio parameter abnormality detection method
JP3398968B2 (en) Speech analysis and synthesis method
JPH0650440B2 (en) LSP type pattern matching vocoder
Alku et al. A new linear predictive method for compression of speech signals.
Vogten et al. The formator: a speech analysis-synthesis system based on formant extraction from linear prediction coefficients
JP3063088B2 (en) Speech analysis and synthesis device, speech analysis device and speech synthesis device
JPH0141998B2 (en)
JPS599920B2 (en) Audio parameter modification method
JPH06202695A (en) Speech signal processor
JPS62278598A (en) Band division type vocoder
KR930011736B1 (en) Pitch Control Method of Waveform Coding and Hybrid Coding by Pitch Half Method of Speech Signal
Hosom F0 estimation for adult and children's speech.