JPH0632034B2

JPH0632034B2 - Speech coding method

Info

Publication number: JPH0632034B2
Application number: JP59105747A
Authority: JP
Inventors: 茂小野
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-05-25
Filing date: 1984-05-25
Publication date: 1994-04-27
Anticipated expiration: 2009-04-27
Also published as: JPS60249200A

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to a low-bit-rate waveform coding method for speech signals, and in particular to a coding method that keeps the amount of transmitted information at or below about 16 kbit/s.

(Prior Art) As an effective way of coding a speech signal at about 16 kbit/s or less, a method is known in which the excitation signal sequence of the speech signal is searched for at short time intervals, under the condition that the error between the signal reproduced from that sequence and the input signal is minimized. In particular, the method of B. S. Atal et al. of Bell Telephone Laboratories (USA) is effective: it represents the excitation signal sequence by a plurality of pulses and determines their amplitudes and phases at the encoder, at short time intervals, by analysis-by-synthesis (the A-b-S method). That method is described in "A new model of LPC excitation for producing natural-sounding speech at low bit rates," Proceedings of ICASSP 1982, pp. 614-617 (Reference 1), so a detailed description is omitted here. Because the conventional method of Reference 1 uses the A-b-S method to obtain the pulse sequence, it has the drawback of requiring a very large amount of computation. In contrast, Japanese Patent Application No. Sho 57-231603, "Speech coding method" (Reference 2), proposes a method that greatly reduces the amount of computation needed to obtain the pulse sequence. It has been reported that these methods give good reproduced speech quality at transmission rates of 16 kbit/s or less.

The conventional method of Reference 2 is briefly described here. The excitation sequence consisting of K pulses in one frame is expressed as follows.

$$ d(n) = \sum_{k=1}^{K} g_k\,\delta(n - l_k) \tag{1} $$

Here δ(·) is the Kronecker delta, N is the frame length, and g_k is the amplitude of the pulse located at position l_k. Let α_i (i = 1, ..., M, where M is the order of the synthesis filter) be the prediction coefficients of the synthesis filter. The reproduced signal x̃(n) obtained by feeding d(n) into the synthesis filter can then be written as follows.

$$ \tilde{x}(n) = d(n) + \sum_{i=1}^{M} \alpha_i\,\tilde{x}(n-i) \tag{2} $$
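The relation between the pulse parameters, the excitation d(n), and the reproduced signal can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and the sign convention of the recursion (the prediction terms are added to the excitation) is an assumption.

```python
def pulses_to_excitation(amps, positions, frame_len):
    """Build d(n) as a sum of scaled unit pulses (Kronecker deltas)."""
    d = [0.0] * frame_len
    for g, l in zip(amps, positions):
        d[l] += g  # pulse of amplitude g at sample position l
    return d


def synthesize(d, alpha):
    """All-pole synthesis with zero initial state (assumed sign convention):
    y(n) = d(n) + sum_i alpha[i-1] * y(n-i)."""
    y = []
    for n, dn in enumerate(d):
        acc = dn
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc += a * y[n - i]
        y.append(acc)
    return y


# Two pulses in a 6-sample frame, first-order synthesis filter.
d = pulses_to_excitation([1.0, -0.5], [0, 3], 6)
print(d)
print(synthesize(d, [0.5]))
```

A real coder would carry the filter state across frames; the zero initial state here is a simplification.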

The weighted squared error J within one frame between the input speech signal x(n) and the reproduced signal x̃(n) is

$$ J = \sum_{n} \Bigl[ \bigl( x(n) - \tilde{x}(n) \bigr) * w(n) \Bigr]^2 \tag{3} $$

where * denotes convolution and w(n) is the weighting function. The weighting function is introduced to minimize the perceived error between the input speech signal and the reproduced signal. Because of auditory masking, noise is perceptually suppressed in frequency bands where the speech energy is high. The weighting function weights the error so as to take this property of hearing into account. As the weighting function, one has been proposed (Reference 1) whose Z-transform W(Z) is expressed in terms of the prediction parameters α_i of the synthesis filter and a real constant γ satisfying 0 ≤ γ ≤ 1:

$$ W(Z) = \frac{1 - \sum_{i=1}^{M} \alpha_i Z^{-i}}{1 - \sum_{i=1}^{M} \alpha_i \gamma^{i} Z^{-i}} $$

Further, letting X(Z) and X̃(Z) be the Z-transforms of x(n) and x̃(n), respectively, equation (3) is expressed as follows.

$$ J = \bigl| X(Z)\,W(Z) - \tilde{X}(Z)\,W(Z) \bigr|^2 \tag{4} $$

Also, from the relation in equation (2), X̃(Z) is given as follows.

$$ \tilde{X}(Z) = H(Z)\,D(Z) \tag{5} $$

Here H(Z) is the Z-transform of the synthesis filter and D(Z) is the Z-transform of the excitation. Substituting (5) into (4) gives

$$ J = \bigl| X(Z)\,W(Z) - H(Z)\,W(Z)\,D(Z) \bigr|^2 \tag{6} $$

Therefore, writing the inverse Z-transforms of X(Z)W(Z) and H(Z)W(Z) as x_w(n) = x(n)*w(n) and h_w(n) = h(n)*w(n), respectively, equation (6) becomes

$$ J = \sum_{n} \Bigl[ x_w(n) - \sum_{k=1}^{K} g_k\,h_w(n - l_k) \Bigr]^2 \tag{7} $$
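As a small numerical illustration of the weighted squared error built from x_w(n) and delayed copies of h_w(n), the following Python sketch evaluates J for a given set of pulses. The function name and the toy signals are hypothetical, and h_w(n) is assumed to be causal and truncated to a finite length.

```python
def weighted_error(x_w, h_w, amps, positions):
    """J = sum_n [x_w(n) - sum_k g_k * h_w(n - l_k)]^2 over one frame."""
    total = 0.0
    for n in range(len(x_w)):
        approx = 0.0
        for g, l in zip(amps, positions):
            if 0 <= n - l < len(h_w):  # h_w is causal and truncated
                approx += g * h_w[n - l]
        total += (x_w[n] - approx) ** 2
    return total


h_w = [1.0, 0.5, 0.25]
x_w = [0.0, 2.0, 1.0, 0.5, 0.0]                # exactly 2 * h_w delayed by 1
print(weighted_error(x_w, h_w, [2.0], [1]))     # correct amplitude
print(weighted_error(x_w, h_w, [1.0], [1]))     # amplitude too small
```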

To find the amplitudes g_k and positions l_k of the excitation pulse sequence that minimize equation (7), the relation obtained by partially differentiating (7) with respect to g_k and setting the result to zero, namely

$$ g_k(l_k) = \frac{\psi_{xh}(l_k) - \sum_{i=1}^{k-1} g_i\,\phi_{hh}(l_i, l_k)}{\phi_{hh}(l_k, l_k)} \tag{8} $$

is used.

Here ψ_xh(·) denotes the cross-correlation function computed from x_w(n) and h_w(n), and φ_hh(·) denotes the autocorrelation function of h_w(n). They are expressed as follows. φ_hh(·) is also called the covariance function.

$$ \psi_{xh}(l_k) = \sum_{n} x_w(n)\,h_w(n - l_k) \tag{9} $$

$$ \phi_{hh}(l_i, l_j) = \sum_{n} h_w(n - l_i)\,h_w(n - l_j) \tag{10} $$

The conventional method determines the amplitude and position of the k-th pulse by regarding g_k in (8) as a function of l_k alone. That is, the l_k that maximizes |g_k| in (8) is taken as the position of the k-th pulse, and the value of g_k at that position is taken as its amplitude. If g_k were exactly a function of l_k alone, this method would yield the excitation pulse sequence that minimizes equation (7). For actual speech signals this is not the case, and g_k is in general a function of l_1, l_2, ..., l_k.
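The conventional search described above can be sketched in Python as follows. This is a hedged illustration rather than the circuit of Reference 2: the function names are hypothetical, the correlations are taken over one frame with a truncated h_w(n), and skipping positions that already carry a pulse is an assumption of this sketch.

```python
def cross_corr(x_w, h_w, l):
    """psi_xh(l) = sum_n x_w(n) * h_w(n - l) over the frame."""
    return sum(x_w[n] * h_w[n - l]
               for n in range(l, min(len(x_w), l + len(h_w))))


def covariance(h_w, li, lj, frame_len):
    """phi_hh(li, lj) = sum_n h_w(n - li) * h_w(n - lj) over the frame."""
    lo = max(li, lj)
    hi = min(frame_len, min(li, lj) + len(h_w))
    return sum(h_w[n - li] * h_w[n - lj] for n in range(lo, hi))


def conventional_search(x_w, h_w, n_pulses):
    """Greedy search: for each new pulse, pick the position maximizing |g_k|."""
    frame_len = len(x_w)
    amps, positions = [], []
    for _ in range(n_pulses):
        best = None
        for l in range(frame_len):
            if l in positions:
                continue  # assumption: one pulse per position
            den = covariance(h_w, l, l, frame_len)
            if den == 0.0:
                continue
            num = cross_corr(x_w, h_w, l) - sum(
                g * covariance(h_w, li, l, frame_len)
                for g, li in zip(amps, positions))
            g = num / den
            if best is None or abs(g) > abs(best[0]):
                best = (g, l)
        if best is None:
            break
        amps.append(best[0])
        positions.append(best[1])
    return amps, positions


h_w = [1.0, 0.5, 0.25]
x_w = [0.0, 0.0, 3.0, 1.5, 0.75, 0.0, 0.0, 0.0]  # 3 * h_w delayed by 2
print(conventional_search(x_w, h_w, 1))
```

Each amplitude is fixed as soon as its pulse is placed and is never revised when later pulses are added, which is the limitation discussed in the text.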

FIG. 1 is a block diagram showing the conventional method of Reference 2. FIG. 2 is a flowchart showing the procedure by which the excitation pulse sequence calculation circuit 140 obtains the amplitudes g_k and positions l_k of the excitation pulse sequence according to the conventional method of Reference 2. The components of the embodiment of Reference 2 shown in FIG. 1, and the excitation pulse sequence search algorithm of Reference 2 shown in FIG. 2, are described in detail below.

In FIG. 1, each component operates frame by frame. Reference numeral 100 denotes the encoder input terminal, to which the A/D-converted speech signal sequence x(n) is input. Reference numeral 110 denotes a buffer memory circuit, which stores one frame of the speech signal sequence. The K-parameter calculation circuit 180 receives the speech signal x(n) stored in the buffer memory circuit 110 and calculates a predetermined number of K parameters K_i (1 ≤ i ≤ M). These values are output to the K-parameter coding circuit 190. The K-parameter coding circuit 190 codes K_i based on, for example, a predetermined number of quantization bits and outputs the code I_ki to the multiplexer 160. It also decodes I_ki and outputs the decoded values K'_i (1 ≤ i ≤ M) to the impulse response calculation circuit 120 and the weighting circuit 200. The weighting circuit 200 receives the input speech signal x(n) and the decoded K parameters K'_i, calculates the aforementioned x_w(n) using the weighting function w(n), which depends on the frequency characteristics of the synthesis filter, and outputs x_w(n) to the cross-correlation function calculation circuit 135. The impulse response calculation circuit 120 receives K'_i, calculates the aforementioned h_w(n) (the convolution of the impulse response with the weighting function) for a predetermined number of samples, and outputs h_w(n) to the covariance function calculation circuit 130 and the cross-correlation function calculation circuit 135. The covariance function calculation circuit 130 receives the predetermined number of samples of h_w(n), calculates φ_hh(l_i, l_j) (0 ≤ l_i, l_j ≤ N−1) according to equation (10), and outputs it to the excitation pulse sequence calculation circuit 140. The cross-correlation function calculation circuit 135 calculates the cross-correlation function between the input x_w(n) and h_w(n) and outputs it to the excitation pulse sequence calculation circuit 140.

Next, the excitation pulse sequence calculation circuit is described. The excitation pulse sequence calculation circuit 140 receives ψ_xh(l_k) (0 ≤ l_k ≤ N−1) from the cross-correlation function calculation circuit 135 and φ_hh(l_i, l_j) (0 ≤ l_i, l_j ≤ N−1) from the covariance function calculation circuit 130, and calculates the amplitudes g_k and positions l_k of the excitation pulse sequence using the pulse calculation algorithm of equation (8). For the first pulse, k = 1 is set in equation (8) and the amplitude g_1 is expressed as a function of the position l_1: g_1 = ψ_xh(l_1) / φ_hh(l_1, l_1). The l_1 that maximizes |g_1| is then chosen, and that l_1 and the corresponding g_1 are taken as the position and amplitude of the first pulse. For the second pulse, k = 2 is set in equation (8), the l_2 that maximizes |g_2| is chosen, and that l_2 and the corresponding g_2 are taken as the position and amplitude of the second pulse. The third and subsequent pulses are calculated in the same way, continuing until the predetermined number of pulses is reached.

In FIG. 2, step 1 initializes to 1 the counter that counts the number of pulses. Step 2 is a comparison: it determines whether the number of pulses exceeds the predetermined number, and if it does, the pulse sequence calculation ends. Step 3 evaluates equation (8): with l_1, ..., l_{k-1} and g_1, ..., g_{k-1} known, it finds the l_k that maximizes |g_k| and outputs the resulting g_k and l_k as the amplitude and position of the k-th pulse. Step 4 is an adder that increments the pulse counter by one. This completes the description of the excitation pulse calculation circuit 140.

Returning to FIG. 1, the coding circuit 150 receives the amplitudes g_k and positions l_k of the pulse sequence output by the excitation pulse calculation circuit 140 and codes them. Well-known conventional methods can be used to code g_k and l_k. For the amplitudes g_k, for example, the maximum amplitude of the pulse sequence within one frame may be used as a normalization coefficient, with each pulse amplitude normalized by this value and then quantized and coded. For the positions l_k, run-length coding, which is well known in the field of facsimile signal coding, may be used, for example. Run-length coding represents the length of each run of "0" symbols by a predetermined code sequence. The multiplexer 160 receives the output code of the K-parameter coding circuit 190 and the output code of the coding circuit 150, combines them, and outputs them from the transmitting-side output terminal 170 to the transmission channel.
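The two coding steps mentioned above, peak normalization of the amplitudes and run-length representation of the positions, can be sketched in Python as follows. The function names, the uniform signed quantizer, and the bit allocation are assumptions of this sketch. The text does not specify the quantizer or the run-length code table, so only the run lengths themselves are computed.

```python
def quantize_amplitudes(amps, bits):
    """Normalize by the frame peak, then quantize uniformly to signed integers."""
    peak = max((abs(g) for g in amps), default=0.0)
    levels = (1 << (bits - 1)) - 1          # e.g. 7 for 4 bits
    if peak == 0.0:
        return 0.0, [0] * len(amps)
    codes = [round(g / peak * levels) for g in amps]
    return peak, codes


def run_lengths(positions):
    """Number of empty ("0") samples before each pulse, in position order."""
    runs, prev = [], -1
    for l in sorted(positions):
        runs.append(l - prev - 1)
        prev = l
    return runs


peak, codes = quantize_amplitudes([3.0, -1.0], bits=4)
print(peak, codes)
print(run_lengths([5, 2, 9]))
```

Because `run_lengths` sorts the positions, the amplitude codes would have to be reordered in the same way before transmission.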

(Problems of the Prior Art) The excitation pulse sequence search method proposed in the conventional method of Reference 2 has been described above. In its algorithm for obtaining the amplitudes and positions of the excitation pulse sequence, the conventional method of Reference 2 assumes that each pulse amplitude is a function only of the position at which that pulse is placed. For actual speech signals, however, this assumption does not hold: g_k in equation (8), which the conventional method of Reference 2 uses to obtain the excitation pulse sequence, is in general a function of l_1, ..., l_k. Consequently, the excitation pulse sequence determined by the conventional method of Reference 2 does not truly minimize J in equation (7), and a better-suited excitation pulse sequence exists. In addition, the conventional method quantizes the amplitudes only after all of them have been determined. With such quantization, the quantization error caused by amplitude quantization cannot be compensated for. Moreover, when the excitation pulse sequence is quantized directly in this way, the quantization characteristics depend strongly on the quantization step size of the pulse amplitudes, so many bits must be allocated to the amplitudes to obtain good quantization characteristics. Therefore, to obtain still better speech quality at transmission rates of about 16 kbit/s or less in a method that represents the excitation signal sequence by a plurality of pulses, two things are necessary: finding better-suited amplitudes and positions for the excitation pulse sequence, and using an excitation pulse search algorithm that compensates for the quantization noise caused by amplitude quantization.

(Object of the Invention) An object of the present invention is to provide a high-quality speech coding method suited to transmission rates of about 16 kbit/s.

(Constitution of the Invention) According to the present invention, there is provided a speech coding method characterized by: receiving a discrete speech signal sequence and obtaining short-time speech signal sequences by dividing the speech signal sequence into short time intervals; extracting from the short-time speech signal sequence parameters representing the spectral envelope and coding them; calculating an impulse response sequence corresponding to the spectral envelope; when sequentially obtaining parameters describing an excitation pulse sequence suited as the excitation signal sequence of the short-time speech signal sequence, determining the position of each newly placed excitation pulse by using the short-time speech signal sequence while successively orthogonalizing the impulse response sequence delayed in phase by an amount corresponding to the position of the newly placed excitation pulse; calculating and quantizing the sum of products, over a predetermined time, of the orthogonalized signal sequence and the short-time signal sequence, thereby obtaining the positions of the excitation pulses and the quantized sums of products as the parameters describing the excitation pulse sequence; coding the excitation pulses and the quantized sums of products; and coding the discrete speech signal sequence by combining the code of the parameters representing the spectral envelope with the code of the parameters describing the excitation pulse sequence.

Further, according to the present invention, there is provided a speech coding method characterized by: receiving a discrete speech signal sequence and obtaining short-time speech signal sequences by dividing the speech signal sequence into short time intervals; extracting from the short-time speech signal sequence parameters representing the spectral envelope and coding them; calculating an impulse response sequence whose spectrum is the spectral envelope with a predetermined correction applied; calculating a corrected short-time speech signal sequence by applying the same predetermined correction to the short-time speech signal sequence; when sequentially obtaining parameters describing an excitation pulse sequence suited as the excitation of the short-time speech signal sequence, determining the position of each newly placed excitation pulse by using the corrected short-time speech signal sequence while successively orthogonalizing the impulse response sequence delayed in phase by an amount corresponding to the position of the newly placed excitation pulse; calculating the sum of products, over a predetermined time, of the corrected short-time speech signal sequence and the orthogonalized signal sequence; quantizing the sum of products and coding the quantized sum of products and the determined excitation pulse position; and coding the discrete speech signal sequence by combining the code of the parameters representing the spectral envelope, the code indicating the excitation pulse positions, and the code indicating the quantized sums of products.

(Principle of the Invention) The speech coding method according to the present invention is characterized by the way the excitation pulse sequence is represented, by the algorithm for obtaining that representation, and by the quantization method. The algorithm of the present invention is therefore described below. Given equation (7), it sequentially obtains the amplitudes g_k, k = 1, ..., K, and positions l_k, k = 1, ..., K, of the excitation pulse sequence that minimizes J.

Setting to zero the partial derivatives, with respect to g_k, k = 1, ..., K, of the expression for the weighted squared error after K pulses have been placed,

$$ J = \sum_{n} \Bigl[ x_w(n) - \sum_{k=1}^{K} g_k\,h_w(n-l_k) \Bigr]^2 \tag{11} $$

gives

$$ \sum_{n} \Bigl[ x_w(n) - \sum_{i=1}^{K} g_i\,h_w(n-l_i) \Bigr] h_w(n-l_k) = 0, \qquad k = 1, \dots, K \tag{12} $$

With the inner product written as ⟨a(n), b(n)⟩ = Σ_n a(n) b(n), and the squared error written correspondingly (equations (13) and (14)), equation (12) becomes

$$ \sum_{i=1}^{K} g_i\,\langle h_w(n-l_i),\, h_w(n-l_k) \rangle = \langle x_w(n),\, h_w(n-l_k) \rangle, \qquad k = 1, \dots, K \tag{15} $$

Substituting the relation (15) into equation (11) gives

$$ J = \langle x_w(n),\, x_w(n) \rangle - \sum_{k=1}^{K} g_k\,\langle x_w(n),\, h_w(n-l_k) \rangle \tag{16} $$

In equation (11), the set {h_w(n−l_k)}, k = 1, ..., K, of differently delayed responses does not in general form an orthogonal system. That is,

$$ \langle h_w(n-l_i),\, h_w(n-l_j) \rangle \neq 0, \qquad i \neq j \tag{17} $$

Therefore, in order to find sequentially in k the {l_k} that reduce J in equation (11), consider successively transforming {h_w(n−l_k)} into an orthogonal sequence {η_k(n)}. Using Schmidt orthogonalization for this successive transformation gives the following.

$$ \eta_1(n) = h_w(n-l_1), \qquad \eta_k(n) = h_w(n-l_k) - \sum_{i=1}^{k-1} \frac{\langle h_w(n-l_k),\, \eta_i(n) \rangle}{\langle \eta_i(n),\, \eta_i(n) \rangle}\,\eta_i(n) \tag{18} $$

This Schmidt orthogonalization is equivalent to removing from h_w(n−l_k) its correlation with {h_w(n−l_i)}, i = 1, ..., k−1. Since {η_k(n)} satisfies the orthogonality relation

$$ \langle \eta_i(n),\, \eta_j(n) \rangle = 0, \qquad i \neq j \tag{19} $$

the error when x_w(n) is approximated by {η_k(n)} in the linear least-squares sense is

$$ J = \langle x_w(n),\, x_w(n) \rangle - \sum_{k=1}^{K} \frac{\langle x_w(n),\, \eta_k(n) \rangle^2}{\langle \eta_k(n),\, \eta_k(n) \rangle} \tag{20} $$

(Shin Hitotsumatsu, Kinjishiki [Approximation Formulas], p. 24, Takeuchi Shoten, 1963; Reference 3). Further setting

$$ \xi_k = \langle x_w(n),\, \eta_k(n) \rangle \tag{21} $$

equation (20) is expressed as

$$ J = \langle x_w(n),\, x_w(n) \rangle - \sum_{k=1}^{K} \frac{\xi_k^2}{\langle \eta_k(n),\, \eta_k(n) \rangle} \tag{22} $$
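The successive orthogonalization of the delayed responses and the resulting least-squares error can be sketched in Python as follows. This is a minimal sketch using the classical, unnormalized Gram-Schmidt procedure. The helper names are hypothetical, and the delayed responses are assumed to be linearly independent so that no division by zero occurs.

```python
def delayed(h_w, l, n_samples):
    """Return h_w(n - l) for n = 0..n_samples-1 (zero outside the response)."""
    return [h_w[n - l] if 0 <= n - l < len(h_w) else 0.0
            for n in range(n_samples)]


def inner(a, b):
    """Inner product <a(n), b(n)> = sum_n a(n) * b(n)."""
    return sum(p * q for p, q in zip(a, b))


def gram_schmidt(vectors):
    """Orthogonalize without normalizing; assumes independent vectors."""
    basis = []
    for v in vectors:
        eta = list(v)
        for e in basis:
            c = inner(v, e) / inner(e, e)      # correlation with earlier eta_i
            eta = [a - c * b for a, b in zip(eta, e)]
        basis.append(eta)
    return basis


def residual_error(x_w, basis):
    """J = <x_w, x_w> - sum_k xi_k^2 / <eta_k, eta_k>, with xi_k = <x_w, eta_k>."""
    return inner(x_w, x_w) - sum(inner(x_w, e) ** 2 / inner(e, e) for e in basis)


h_w = [1.0, 0.5, 0.25]
v2, v3 = delayed(h_w, 2, 8), delayed(h_w, 3, 8)   # overlapping, not orthogonal
eta = gram_schmidt([v2, v3])
x_w = [3.0 * a - b for a, b in zip(v2, v3)]        # lies in the span of v2, v3
print(inner(v2, v3), inner(eta[0], eta[1]))
print(residual_error(x_w, eta))
```

Because each term of the sum depends only on its own η_k(n), adding a pulse never changes the contribution of the pulses already placed.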

Suppose that, in the sequential process, l_1, ..., l_{k-1} have been determined. Then η_1(n), ..., η_{k-1}(n) have already been calculated by the recurrence (18). The k-th pulse position l_k is therefore determined so as to minimize the squared error in equation (22), that is, as the position that maximizes

$$ \frac{\xi_k^2}{\langle \eta_k(n),\, \eta_k(n) \rangle} \tag{23} $$

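Also, from equations (18) and (21), ξ_k is

$$ \xi_k = \langle x_w(n),\, h_w(n-l_k) \rangle - \sum_{i=1}^{k-1} \frac{\langle h_w(n-l_k),\, \eta_i(n) \rangle}{\langle \eta_i(n),\, \eta_i(n) \rangle}\,\xi_i \tag{24} $$

so equation (23) is equivalent to

$$ \frac{\Bigl[ \langle x_w(n),\, h_w(n-l_k) \rangle - \sum_{i=1}^{k-1} \dfrac{\langle h_w(n-l_k),\, \eta_i(n) \rangle}{\langle \eta_i(n),\, \eta_i(n) \rangle}\,\xi_i \Bigr]^2}{\langle \eta_k(n),\, \eta_k(n) \rangle} \tag{25} $$

Therefore, if ξ_k is quantized each time ξ_k and l_k are determined from equations (21) and (23), and the quantized ξ_i, i = 1, ..., k−1, are used in equation (25) when obtaining l_k, the resulting position l_k takes the quantization effect of ξ_i, i = 1, ..., k−1, into account. Writing the quantized ξ_k as ξ̂_k, l_k is obtained from equation (25) as the position that maximizes the following expression.

$$ \frac{\Bigl[ \langle x_w(n),\, h_w(n-l_k) \rangle - \sum_{i=1}^{k-1} \dfrac{\langle h_w(n-l_k),\, \eta_i(n) \rangle}{\langle \eta_i(n),\, \eta_i(n) \rangle}\,\hat{\xi}_i \Bigr]^2}{\langle \eta_k(n),\, \eta_k(n) \rangle} \tag{26} $$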
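The sequential search with quantization inside the loop can be sketched in Python as follows. This is an illustrative reading of the procedure, not the patented circuit: all names are hypothetical, the quantizer is supplied by the caller (identity by default), each position is assumed to be used at most once, and the position criterion is evaluated with the already-quantized values of the earlier pulses.

```python
def delayed(h_w, l, n_samples):
    """Return h_w(n - l) for n = 0..n_samples-1 (zero outside the response)."""
    return [h_w[n - l] if 0 <= n - l < len(h_w) else 0.0
            for n in range(n_samples)]


def inner(a, b):
    """Inner product <a(n), b(n)> = sum_n a(n) * b(n)."""
    return sum(p * q for p, q in zip(a, b))


def orthogonal_pulse_search(x_w, h_w, n_pulses, quantize=lambda v: v):
    """Return pulse positions l_k and quantized inner products xi_hat_k."""
    n = len(x_w)
    basis, norms = [], []           # eta_i(n) and <eta_i, eta_i>
    positions, xi_hat = [], []
    for _ in range(n_pulses):
        best = None
        for l in range(n):
            if l in positions:
                continue            # assumption: one pulse per position
            v = delayed(h_w, l, n)
            coeffs = [inner(v, e) / nrm for e, nrm in zip(basis, norms)]
            eta = v
            for c, e in zip(coeffs, basis):
                eta = [a - c * b for a, b in zip(eta, e)]
            nrm = inner(eta, eta)
            if nrm < 1e-12:
                continue            # candidate adds nothing new
            # candidate xi, using the quantized values of earlier pulses
            xi = inner(x_w, v) - sum(c * q for c, q in zip(coeffs, xi_hat))
            score = xi * xi / nrm
            if best is None or score > best[0]:
                best = (score, l, eta, nrm)
        if best is None:
            break
        _, l, eta, nrm = best
        positions.append(l)
        basis.append(eta)
        norms.append(nrm)
        xi_hat.append(quantize(inner(x_w, eta)))   # xi_k, then quantize
    return positions, xi_hat


h_w = [1.0, 0.5, 0.25]
x_w = [3.0 * a - b for a, b in zip(delayed(h_w, 2, 10), delayed(h_w, 5, 10))]
print(orthogonal_pulse_search(x_w, h_w, 2))
print(orthogonal_pulse_search(x_w, h_w, 2, quantize=lambda v: round(v * 2) / 2))
```

The second call uses a coarse uniform quantizer with step 0.5, chosen only for illustration. The quantized value of the first pulse then feeds into the choice of the second position.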

The present invention uses as transmission parameters the codes representing ξ̂_k, k = 1, ..., K, and l_k, k = 1, ..., K, obtained as described above.

On the other hand, once ξ_k, k = 1, ..., K, and l_k, k = 1, ..., K, have been determined, g_k, k = 1, ..., K, can be calculated as follows. First, comparing equations (16) and (20) yields relation (27). Substituting into it the relation (28) between {h_w(n−l_k)} and {η_k(n)} that follows from equation (18), where b_ii = 1 and b_ij = 0 for i < j, yields equation (29). Comparing both sides of equation (29) yields equation (30), and hence {g_k} is calculated from {ξ_k} by equation (31). The receiving side receives the coded ξ_k, k = 1, ..., K, and l_k, k = 1, ..., K, decodes them, and calculates g_k, k = 1, ..., K, from equation (31). This completes the description of the algorithm of the present invention.
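The recovery of the pulse amplitudes from the transmitted values can be sketched in Python as follows. The sketch derives the amplitudes under an assumed convention, since the exact form of the intermediate equations is not legible in this text. Each η_i(n) is written as a combination Σ_j b[i][j]·h_w(n−l_j) with b[i][i] = 1 and b[i][j] = 0 for i < j, so that g_j = Σ_{i≥j} b[i][j]·ξ_i / ⟨η_i, η_i⟩. All function names are hypothetical.

```python
def delayed(h_w, l, n_samples):
    """Return h_w(n - l) for n = 0..n_samples-1 (zero outside the response)."""
    return [h_w[n - l] if 0 <= n - l < len(h_w) else 0.0
            for n in range(n_samples)]


def inner(a, b):
    """Inner product <a(n), b(n)> = sum_n a(n) * b(n)."""
    return sum(p * q for p, q in zip(a, b))


def recover_amplitudes(h_w, positions, xi, n_samples):
    """Rebuild eta_k and the triangular matrix b, then map xi_k to g_k."""
    k_total = len(positions)
    basis, norms, b = [], [], []
    for k, l in enumerate(positions):
        v = delayed(h_w, l, n_samples)
        row = [0.0] * k_total
        row[k] = 1.0                               # b[k][k] = 1
        eta = list(v)
        for i, e in enumerate(basis):
            c = inner(v, e) / norms[i]
            eta = [a - c * q for a, q in zip(eta, e)]
            row = [r - c * bi for r, bi in zip(row, b[i])]
        basis.append(eta)
        norms.append(inner(eta, eta))
        b.append(row)
    return [sum(b[i][j] * xi[i] / norms[i] for i in range(j, k_total))
            for j in range(k_total)]


# Encoder side of the toy example: x_w = 3*h_w(n-2) - 1*h_w(n-3).
h_w = [1.0, 0.5, 0.25]
v2, v3 = delayed(h_w, 2, 8), delayed(h_w, 3, 8)
x_w = [3.0 * a - q for a, q in zip(v2, v3)]
c = inner(v3, v2) / inner(v2, v2)
eta2 = [q - c * a for a, q in zip(v2, v3)]
xi = [inner(x_w, v2), inner(x_w, eta2)]
print(recover_amplitudes(h_w, [2, 3], xi, 8))      # close to [3.0, -1.0]
```

The positions must be processed in the order in which they were orthogonalized at the encoder, because both η_k(n) and b depend on that order.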

(Embodiment) An embodiment of the speech coding method according to the present invention is described with reference to the drawings. FIG. 3(a) is a block diagram of the transmitting side, and FIG. 3(b) is a block diagram of the receiving side. In FIG. 3(a), reference numeral 500 denotes the encoder input terminal, to which the discrete speech signal sequence x(n) is input. Reference numeral 310 denotes a buffer memory circuit that stores one frame of the speech signal sequence. Reference numeral 320 denotes a K-parameter calculation circuit, which receives the speech signal x(n) stored in the buffer memory circuit 310 and calculates a predetermined number of K parameters. These values are output to the K-parameter coding circuit 330. The K-parameter coding circuit codes the K parameters based on a predetermined number of quantization bits and outputs the code to the multiplexer 380. It also decodes the coded K parameters and outputs the decoded values to the weighting circuit 340 and the impulse response sequence calculation circuit 350. The weighting circuit 340 receives the input speech signal x(n) and the decoded K parameters from circuit 330, calculates the aforementioned x_w(n) (the convolution of the speech signal sequence x(n) with the weighting function w(n)) using the weighting function w(n), which depends on the frequency characteristics of the synthesis filter, and outputs it to the excitation pulse sequence parameter calculation circuit 360. The impulse response sequence calculation circuit 350 receives the decoded K parameters from circuit 330, calculates the aforementioned h_w(n) (the convolution of the impulse response sequence h(n) of the synthesis filter with the weighting function w(n)) for a predetermined number of samples, and outputs h_w(n) to the excitation pulse sequence parameter calculation circuit 360.

Next, the excitation pulse sequence parameter calculation circuit 360 is described. This circuit receives x_w(n) from the weighting circuit 340 and h_w(n) from the weighted impulse response sequence calculation circuit 350, and calculates the parameters {l_k} and {ξ_k} representing the excitation pulse sequence using the algorithm of equations (18), (21), and (25) described above. FIG. 4 is a flowchart showing the procedure carried out in the excitation pulse sequence parameter calculation circuit 360. Step 5 sets the initial values by evaluating equations (18), (21), and (26) with k = 1: η_1(n) is calculated from equation (18), ξ_1 = ⟨x_w(n), η_1(n)⟩ from equation (21), and the l_1 that maximizes ξ_1² / ⟨η_1(n), η_1(n)⟩ is determined from equation (26). Step 6 is an addition that increments by one the value of k, which represents the number of pulses. Step 7 is a comparison: it determines whether the number of pulses calculated exceeds the predetermined number, and if it does, the calculation of pulse positions stops. Step 8 evaluates equations (18) and (26): η_k(n) is calculated from equation (18), and the expression of equation (26) is then evaluated. Step 9 obtains the excitation pulse position l_k, taking the l_k that maximizes equation (26) as the position of the excitation pulse. Step 10 obtains ξ̂_k: ξ_k is calculated from equation (21) and quantized to give ξ̂_k. Various methods are conceivable for quantizing {ξ_k}. For example, the first value obtained, |ξ_1|, may be used as a normalization coefficient, with the subsequently obtained {ξ_k} normalized by it and uniformly quantized one after another. Alternatively, |ξ_1| may be taken as the initial value and the differences between |ξ_{i-1}| and |ξ_i|, i = 2, ..., K, quantized one after another, with the signs preserved. This completes the description of the excitation pulse sequence parameter calculation circuit 360.
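The first quantization option mentioned above, normalizing by the first |ξ_1| and quantizing uniformly as each value arrives, can be sketched in Python as follows. Several details are assumptions of this sketch and are not stated in the text: the class name, the signed uniform quantizer, the clamping of values larger than the first one, and the idea that the normalization coefficient itself is conveyed to the decoder without further loss.

```python
class FirstValueNormalizedQuantizer:
    """Uniform quantizer whose full scale is set by the first value it sees."""

    def __init__(self, bits):
        self.levels = (1 << (bits - 1)) - 1   # e.g. 7 for 4 bits
        self.scale = None                     # becomes |xi_1|

    def __call__(self, xi):
        if self.scale is None:
            self.scale = abs(xi)              # normalization coefficient
        if self.scale == 0.0:
            return 0.0
        code = round(xi / self.scale * self.levels)
        code = max(-self.levels, min(self.levels, code))  # assumed clamping
        return code * self.scale / self.levels            # dequantized value


q = FirstValueNormalizedQuantizer(bits=4)
for value in (2.0, -0.9, 0.3):
    print(value, q(value))
```

Because the object keeps its scale between calls, one instance per frame can be passed as the quantizer of a sequential search, so that each value is quantized as soon as it is determined.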

Returning to FIG. 3(a), the coding circuit 370 receives the outputs of the excitation pulse sequence parameter calculation circuit, {ξ̂_k} and {l_k}, and codes them.

Various schemes are also conceivable for coding {ξ̂_k} and {l_k}. However, since {ξ̂_k} are values determined by successive orthogonal transformation, the order of orthogonalization must be made known to the decoding side when they are coded. As one example, {l_k} may be run-length coded in the order corresponding to {ξ̂_k}. As another example, since an ordering relation [...] holds among {ξ̂_k} for i < j, {l_k} may be rearranged into an order that is easy to code and then coded, with {ξ̂_k} transformed and transmitted accordingly. The decoding side then transforms the received values according to equation (32) and arranges them in order of magnitude, thereby restoring {l_k} to the original order. However, because the state [...] can occur, a code must also be assigned to [...]. The multiplexer 380 receives the output code of the K-parameter coding circuit and the output code of the coding circuit 370, combines them, and outputs them from the transmitting-side output terminal 510 to the transmission channel.

Next, the receiving side shown in FIG. 3(b) is described. The demultiplexer 390 receives the code through the receiving-side input terminal 520, separates the code representing the K parameters from the code representing the excitation pulse sequence, and outputs the former to the decoder 400 and the latter to the decoder 410. The decoder 400 decodes the code representing the K parameters received from the demultiplexer 390 and outputs the result to the impulse response sequence calculation circuit 420 and the speech reproduction circuit 450. The decoder 410 receives the code representing the excitation pulse sequence from the demultiplexer 390 and decodes it into the excitation pulse sequence parameters {l_k} and {ξ_k}. The impulse response sequence calculation circuit 420 receives the decoded K parameters, calculates the weighted impulse response sequence h_w(n) described above, and outputs it to the orthogonal transformation circuit 430. The orthogonal transformation circuit 430 receives the weighted impulse response sequence h_w(n) and the output {l_k} of the decoder 410, and calculates the orthogonal sequence {η_k(n)} by the recurrence of equation (18), together with the transformation matrix {b_ij} shown in equation (28). The excitation pulse amplitude calculation circuit 440 calculates the excitation pulse amplitudes {g_k} using equation (29) from {η_k(n)} and {b_ij}, which are the outputs of the orthogonal transformation circuit 430, and {ξ_k}, which is the output of the decoder 410, and outputs them to the speech reproduction circuit 450. The speech reproduction circuit 450 calculates a synthesis filter from the K parameters output by the decoder 400, calculates the driving excitation sequence that is the input to the synthesis filter from the output {l_k} of the decoder 410 and the output {g_k} of the excitation pulse amplitude calculation circuit, applies the calculated driving excitation sequence to the calculated synthesis filter to compute the reproduced speech signal sequence, and outputs it to the output terminal 530.
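The role of circuits 430 and 440 can be illustrated with a short Python sketch. Equations (28) and (29) are not reproduced in this part of the text, so the sketch assumes one plausible form: η_k(n) is a linear combination Σ_j b_kj h_w(n - l_j) produced by Gram-Schmidt orthogonalization, ξ_k = <x_w(n), η_k(n)>, and the amplitudes follow as g_j = Σ_k (ξ_k / <η_k(n), η_k(n)>) b_kj. The function names are hypothetical.

```python
def dot(a, b):
    """Inner product <a, b> over one frame."""
    return sum(x * y for x, y in zip(a, b))

def delayed(h, lag, n):
    """h(n - lag), zero-padded or truncated to n samples."""
    return [h[i - lag] if 0 <= i - lag < len(h) else 0.0 for i in range(n)]

def decode_amplitudes(h_w, positions, xis, n):
    """Recover pulse amplitudes g_j from decoded {l_k} and {xi_k} (assumed form)."""
    K = len(positions)
    # b[k][j] is the weight of h_w(n - l_j) in eta_k (lower triangular, unit diagonal).
    b = [[0.0] * K for _ in range(K)]
    etas, energies = [], []
    for k, l in enumerate(positions):
        eta = delayed(h_w, l, n)
        b[k][k] = 1.0
        for i in range(k):
            c = dot(eta, etas[i]) / energies[i]
            eta = [a - c * e for a, e in zip(eta, etas[i])]
            for j in range(i + 1):
                b[k][j] -= c * b[i][j]   # track what was subtracted, in terms of h_w
        etas.append(eta)
        energies.append(dot(eta, eta))
    # Collect the coefficient of each h_w(n - l_j) in sum_k (xi_k / energy_k) * eta_k.
    return [sum(xis[k] / energies[k] * b[k][j] for k in range(j, K))
            for j in range(K)]

# Example: the decoder receives two positions and the two xi values.
h_w = [1.0, 0.5, 0.25]
g = decode_amplitudes(h_w, [3, 10], [2.625, -1.3125], 16)
```

Because the decoder rebuilds {η_k(n)} and {b_ij} from h_w(n) and {l_k} alone, only the positions and the quantized ξ_k need to be transmitted in this sketch; the amplitudes themselves are never sent.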

One embodiment of the present invention has been described above. In this embodiment, <x_w(n), η_k(n)> is used as {ξ_k}, but ξ_k may be any quantity that contains <x_w(n), η_k(n)>; for example, <x_w(n), η_k(n)> / |η_k(n)| or <x_w(n), η_k(n)> / <η_k(n), η_k(n)> may be used. In addition, the impulse response sequence h_w(n) of the synthesis filter decays exponentially, so the correlation between h_w(n - l_i) and h_w(n - l_j) is small where |l_i - l_j| is large. Therefore, when |l_i - l_j| exceeds a predetermined value in the recurrence of equation (18), an approximately orthogonalized sequence {η_k(n)} can be calculated without performing the correlation-removal operation of equation (18). Such a configuration greatly reduces the computation required for orthogonalization.
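The effect of skipping the correlation removal for distant pulses can be checked numerically. The following Python sketch assumes a Gram-Schmidt form for the recurrence of equation (18) and uses a hypothetical threshold parameter max_lag; the impulse response 0.5^n is only a stand-in for an exponentially decaying h_w(n).

```python
import math

def dot(a, b):
    """Inner product <a, b> over one frame."""
    return sum(x * y for x, y in zip(a, b))

def delayed(h, lag, n):
    """h(n - lag), zero-padded or truncated to n samples."""
    return [h[i - lag] if 0 <= i - lag < len(h) else 0.0 for i in range(n)]

def orthogonalize(h_w, positions, n, max_lag=None):
    """Gram-Schmidt over delayed responses; pairs farther apart than max_lag are skipped."""
    etas, used = [], []
    for l in positions:
        eta = delayed(h_w, l, n)
        for e, le in zip(etas, used):
            if max_lag is not None and abs(l - le) > max_lag:
                continue                 # treat the pair as already uncorrelated
            c = dot(eta, e) / dot(e, e)
            eta = [a - c * b for a, b in zip(eta, e)]
        etas.append(eta)
        used.append(l)
    return etas

def normalized_correlation(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

h_w = [0.5 ** i for i in range(12)]      # stand-in for a decaying impulse response
exact = orthogonalize(h_w, [2, 10], 32)
approx = orthogonalize(h_w, [2, 10], 32, max_lag=6)
residual_exact = abs(normalized_correlation(exact[0], exact[1]))
residual_approx = abs(normalized_correlation(approx[0], approx[1]))
```

For this stand-in response the skipped pair, eight samples apart, retains a normalized correlation below one percent, which is the sense in which the sequence is only approximately orthogonal.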

Furthermore, in the algorithm described in the section on the operation and principle of the present invention, <x_w(n), h_w(n - l_i)> = ψ_xh(l_i) and <h_w(n - l_i), h_w(n - l_j)> = _hh(l_i, l_j), so the present invention can also be realized by calculating ψ_xh(l_i) and _hh(l_i, l_j) in advance. The relations between {b_ij}, {<η_k(n), η_k(n)>}, {ξ_k}, {l_k} and ψ_xh(l_i), _hh(l_i, l_j) can be expressed as follows. First, from equations (18) and (19), […]. Next, from equation (26), […]. The algorithm that uses equations (31) and (32) to obtain {l_k}, {ξ_k} and {g_k}, the important parameters of the present invention, from ψ_xh(l_i) and _hh(l_i, l_j) is described in detail in Japanese Patent Application No. Sho 58-150783, "Speech coding method" (Reference 4). {v_ij} in equations (17) and (18) of Reference 4 is equal to {b_ij} in equation (18) of this specification. {d_k} in equations (19) and (20) of Reference 4 is equal to {<η_k(n), η_k(n)>} in this specification. {y_k} in equations (25) and (26) of Reference 4 is equal to {ξ_k} in equation (21) of this specification. Equation (28) of Reference 4, which gives the position l_k, is equal to equation (23) of this specification. Finally, equations (31) and (32) of Reference 4, which give the amplitudes {g_k}, are equal to equation (31) of this specification.
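Why the precomputed correlations suffice can be seen from linearity. If η_k(n) is written as Σ_j b_kj h_w(n - l_j), which is an assumption about the form of {b_ij} rather than a quotation of equation (28), then <x_w(n), η_k(n)> = Σ_j b_kj ψ_xh(l_j) and <η_k(n), η_k(n)> = Σ_i Σ_j b_ki b_kj _hh(l_i, l_j). The Python sketch below checks this on a toy frame; the table for the quantity the text writes as _hh(l_i, l_j) is named cov_hh here, and all function names are hypothetical.

```python
def dot(a, b):
    """Inner product <a, b> over one frame."""
    return sum(x * y for x, y in zip(a, b))

def delayed(h, lag, n):
    """h(n - lag), zero-padded or truncated to n samples."""
    return [h[i - lag] if 0 <= i - lag < len(h) else 0.0 for i in range(n)]

def correlation_tables(x_w, h_w, positions):
    """Precompute psi_xh(l_j) and the table cov_hh[i][j] for the given positions."""
    n = len(x_w)
    shifted = [delayed(h_w, l, n) for l in positions]
    psi_xh = [dot(x_w, s) for s in shifted]
    cov_hh = [[dot(si, sj) for sj in shifted] for si in shifted]
    return psi_xh, cov_hh

def xi_and_energy(b_row, psi_xh, cov_hh):
    """<x_w, eta_k> and <eta_k, eta_k> from tables only; b_row holds b_k1..b_kK."""
    xi = sum(bj * pj for bj, pj in zip(b_row, psi_xh))
    energy = sum(bi * bj * cov_hh[i][j]
                 for i, bi in enumerate(b_row)
                 for j, bj in enumerate(b_row))
    return xi, energy

# Toy frame with two overlapping pulses at positions 3 and 4.
h_w = [1.0, 0.5, 0.25]
x_w = [0.0] * 16
for pos, amp in ((3, 2.0), (4, -1.0)):
    for i, v in enumerate(h_w):
        x_w[pos + i] += amp * v
psi_xh, cov_hh = correlation_tables(x_w, h_w, [3, 4])
c = cov_hh[0][1] / cov_hh[0][0]          # Gram-Schmidt coefficient for eta_2
xi2, energy2 = xi_and_energy([-c, 1.0], psi_xh, cov_hh)

# Direct computation from the signals, for comparison.
eta1 = delayed(h_w, 3, 16)
eta2 = [a - c * b for a, b in zip(delayed(h_w, 4, 16), eta1)]
xi2_direct, energy2_direct = dot(x_w, eta2), dot(eta2, eta2)
```

Both routes give the same ξ_2 and <η_2(n), η_2(n)>, so once the tables exist the sample-domain sequences need not be touched again during the search.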

The excitation pulse sequence has so far been calculated frame by frame, but a frame may instead be divided into several subframes, with the pulse sequence calculated for each subframe. With m as the number of subframes per frame, this configuration reduces the amount of computation to roughly 1/m of that of the configuration shown in FIG. 3.

In the configuration examples described above the frame length is fixed, but it may be made variable; a variable frame length improves the characteristics. Also, although K parameters are used as the parameters representing the spectral envelope of the short-time speech signal sequence, other well-known parameters (for example, LSP parameters) may be used instead. Furthermore, the weighting function w(n) described above is not essential to implementing the present invention and may be omitted. However, adding a weighting function w(n) that takes human auditory characteristics into account further enhances the effect of the present invention.

(Effect of the Invention) According to the configuration of the present invention, in calculating the parameters of the excitation pulse sequence, the optimum position l_k is obtained sequentially by equation (26), taking into account l_1, ..., l_k-1 and the quantization of ξ_1, ..., ξ_k-1, and the optimum amplitudes are determined by equation (31), taking into account the positions {l_k} and the quantization of {ξ_k}. Unlike the conventional method of Reference 2, which treats the amplitude of a pulse as a function only of the position at which that pulse stands, this yields an excitation pulse sequence better suited to reducing the squared error, and therefore better sound quality than the conventional method. Furthermore, a configuration such as that of the present invention, which does not quantize the amplitudes all at once but includes quantization of the amplitudes of the orthogonal sequence within the sequential process of orthogonalizing the impulse response sequence, can compensate for the quantization error caused by amplitude quantization in the course of the sequential quantization, and thus exhibits quantization characteristics superior to those of the conventional method.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment that realizes the conventional method; FIG. 2 is a flowchart showing the processing procedure performed by the excitation pulse sequence calculation circuit of the conventional method; FIGS. 3(a) and 3(b) are block diagrams showing an embodiment of the present invention; and FIG. 4 is a flowchart showing the processing procedure performed by the excitation pulse sequence parameter calculation circuit of the present invention. In the figures: 110, 310: buffer memory circuit; 120, 350, 420: impulse response sequence calculation circuit; 130: covariance function calculation circuit; 135: cross-correlation function sequence calculation circuit; 140: excitation pulse sequence calculation circuit; 150, 370: encoding circuit; 160, 380: multiplexer; 180, 320: K parameter calculation circuit; 190, 330: K parameter encoding circuit; 200, 340: weighting circuit; 360: excitation pulse sequence parameter calculation circuit; 390: demultiplexer; 400, 410: decoder; 430: orthogonal transformation circuit; 440: excitation pulse amplitude calculation circuit; 450: speech reproduction circuit; 1: initialization; 2: comparison; 3: pulse calculation; 4: addition; 5: initialization; 6: addition; 7: comparison; 8: parameter calculation of the excitation pulse sequence by successive orthogonalization; 9: maximum value detection; 10: parameter calculation and quantization of the excitation pulse sequence.

Claims

1. A speech coding method characterized by: inputting a discrete speech signal sequence and dividing it at short time intervals to obtain a short-time speech signal sequence; extracting and encoding a parameter representing a spectral envelope from the short-time speech signal sequence; calculating an impulse response sequence corresponding to the spectral envelope; when sequentially determining the parameters describing an excitation pulse sequence suitable as the driving excitation signal sequence of the short-time speech signal sequence, determining the position of a newly determined excitation pulse by using the short-time speech signal sequence while sequentially orthogonalizing the impulse response sequence having a phase delay corresponding to the position of the newly determined excitation pulse; calculating, over a predetermined time, a sum of products of the orthogonalized signal sequence and the short-time speech signal sequence, and quantizing the sum of products; encoding the position of the excitation pulse and the quantized sum of products, which are the parameters describing the excitation pulse sequence; and combining the code of the parameter representing the spectral envelope with the code of the parameters describing the excitation pulse sequence, thereby encoding the discrete speech signal sequence.

2. A speech coding method characterized by: inputting a discrete speech signal sequence and dividing it at short time intervals to obtain a short-time speech signal sequence; extracting and encoding a parameter representing a spectral envelope from the short-time speech signal sequence; calculating an impulse response sequence having a spectrum obtained by applying a predetermined correction to the spectral envelope; calculating a corrected short-time speech signal sequence by applying a predetermined correction to the short-time speech signal sequence; when sequentially determining the parameters describing an excitation pulse sequence suitable as the driving excitation of the short-time speech signal sequence, determining the position of a newly determined excitation pulse by using the corrected short-time speech signal sequence while sequentially orthogonalizing the impulse response sequence having a phase corresponding to the position of the newly determined excitation pulse; calculating, over a predetermined time, a sum of products of the corrected short-time speech signal sequence and the orthogonalized signal sequence, and quantizing the sum of products; encoding the quantized sum of products and the determined position of the excitation pulse; and combining the code of the parameter representing the spectral envelope, the code indicating the position of the excitation pulse and the code indicating the quantized sum of products, thereby encoding the discrete speech signal sequence.