JP3200872B2

JP3200872B2 - Code-driven linear predictive speech coding

Info

Publication number: JP3200872B2
Application number: JP15249991A
Authority: JP
Inventors: 鋼一柴垣
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1991-05-28
Filing date: 1991-05-28
Publication date: 2001-08-20
Anticipated expiration: 2016-08-20
Also published as: JPH04350700A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a code-driven linear predictive speech coding method, and more particularly to a method of improving sound quality when pitch information is used.

[0002]

[Prior Art] In conventional code-driven linear predictive speech coding that uses pitch information, the pitch is determined at the sampling interval of the input speech, so the pitch resolution is limited to that sampling interval.

[0003]

[Problem to Be Solved by the Invention] In the conventional code-driven linear predictive speech coding using pitch information described above, the pitch resolution equals the sampling interval of the input speech. As a result, not only is the pitch insufficiently accurate, but the codebook search is also insufficiently accurate, which limits the achievable improvement in sound quality.

[0004] An object of the present invention is to provide a code-driven linear predictive speech coding method with improved sound quality.

[0005]

[Means for Solving the Problem] To achieve the above object, the code-driven linear predictive speech coding method according to the present invention is a code-driven linear predictive speech coding method using pitch information, in which: a speech signal with an increased number of samples is created by interpolating the input speech; linear prediction filter coefficients and pitch information are calculated from the speech signal with the increased number of samples; a pitch reproduction filter is constructed from the pitch information, and a linear prediction filter is constructed from the linear prediction filter coefficients; the same codebook as in the case without interpolation is used; codes are placed only at the sample positions of the input speech before interpolation; the pitch period is expressed in units of the interpolated sampling interval, so that the drive signal obtained by driving the pitch reproduction filter with the code is also placed at the interpolated sampling positions; the codebook is searched so that the synthesized speech obtained by driving the cascade of the pitch reproduction filter and the linear prediction filter with the code is closest to the original speech, and a predetermined number of pieces of code information are obtained; and the linear prediction filter coefficients, the pitch information, and the code information are encoded and transmitted. On the receiving side, the received code is decoded to restore the linear prediction filter coefficients, the pitch information, and the code information; a first drive signal is synthesized by reading codes from the codebook according to the code information; the pitch reproduction filter is constructed from the pitch information, and a second drive signal is synthesized by driving the pitch reproduction filter with the first drive signal; the linear prediction filter is constructed from the linear prediction filter coefficients; and speech is reproduced by decimating the signal obtained by driving the linear prediction filter with the second drive signal.

[0006]

[Operation] In the present invention, a speech signal with a fine sampling interval is created by appropriately interpolating the input speech. Using pitch information with this fine sampling resolution, calculated from that speech signal, the codebook search is performed while placing codes only at the sample positions of the input speech before interpolation. This improves sound quality with almost no change in the amount of transmitted information compared with the case without interpolation.

[0007]

[Embodiment] Next, an embodiment of the present invention will be described with reference to the drawings.

[0008] FIG. 1 is a block diagram showing an embodiment of the code-driven linear predictive speech coding method according to the present invention.

[0009] The code-driven linear predictive speech coding method of this embodiment will be described with reference to FIG. 1. The speech signal X(n) (n = 0, 1, 2, ..., N−1) applied to the input terminal 10 is supplied to the interpolator 1, where it is interpolated. The interpolated speech signal X_I(n) (n = 0, 1/2, 1, 1 1/2, ..., N−1) is supplied to the linear prediction analyzer 2, the pitch extractor 3, and the codebook searcher 5.
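The interpolation step can be illustrated with a minimal sketch. The description does not specify the interpolation method, so linear interpolation by a factor of 2 is assumed here purely for illustration, and the function name `interpolate_2x` is hypothetical.

```python
def interpolate_2x(x):
    """Insert one linearly interpolated sample between each pair of input samples.

    Assumption: linear interpolation, factor 2. Original positions
    n = 0, 1, ..., N-1 land on even output indices; odd output indices
    hold the half-sample positions n = 1/2, 3/2, ...
    """
    if not x:
        return []
    out = []
    for a, b in zip(x, x[1:]):
        out.append(a)
        out.append((a + b) / 2.0)  # value at the half-sample position
    out.append(x[-1])
    return out


x = [0.0, 2.0, 4.0, 2.0]
x_i = interpolate_2x(x)
print(len(x_i))        # 7, i.e. 2 * N - 1 samples for n = 0, 1/2, ..., N-1
print(x_i[::2] == x)   # True: the original samples are preserved
```

A practical interpolator would more likely use a band-limited (low-pass) filter, which is presumably what "appropriate" interpolation refers to; the linear version only shows the indexing.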

[0010] The linear prediction analyzer 2 performs linear prediction analysis on the interpolated speech signal X_I(n) to obtain the linear prediction filter coefficients α, which are supplied to the codebook searcher 5 and the encoder 6.
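The description does not state how the coefficients α are computed. As one common possibility, the sketch below uses the autocorrelation method with the Levinson-Durbin recursion; this choice, the function names, and the sign convention x[n] ≈ Σ a[k]·x[n−k] are assumptions, not details taken from this document.

```python
def autocorrelation(x, order):
    """Autocorrelation values r[0..order] of the (interpolated) speech frame x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return predictor coefficients a[1..order] such that
    x[n] is approximated by sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # degenerate (all-zero) frame
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # updated prediction error energy
    return a[1:]


# Autocorrelation of an ideal first-order process with coefficient 0.5:
print(levinson_durbin([1.0, 0.5, 0.25], 2))  # [0.5, 0.0]
```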

[0011] The pitch extractor 3 extracts pitch information from the interpolated speech signal and supplies it to the codebook searcher 5 and the encoder 6.

[0012] The codebook searcher 5 constructs a pitch reproduction filter from the pitch information and a linear prediction filter from the linear prediction filter coefficients α. It then searches the codebook 4 for codes such that the synthesized speech obtained by driving the cascade of the pitch reproduction filter and the linear prediction filter with the code is closest to the original speech, obtains a predetermined number of pieces of code information, and supplies them to the encoder 6.

[0013] The encoder 6 encodes the linear prediction filter coefficients α, the pitch information, and the code information, and outputs the result from the encoder output terminal 11 to the transmission path.

[0014] On the receiving side, the encoded linear prediction filter coefficients α, pitch information, and code information input to the decoder input terminal 30 are decoded by the decoder 21. The linear prediction filter coefficients α are supplied to the linear prediction filter 25, the pitch information to the pitch reproduction filter 24, and the code information to the drive signal synthesizer 23.

[0015] The drive signal synthesizer 23 synthesizes a first drive signal by reading codes from the codebook 22 according to the code information, and supplies it to the pitch reproduction filter 24.

[0016] The pitch reproduction filter 24 is driven by the first drive signal to synthesize a second drive signal, which is supplied to the linear prediction filter 25.

[0017] The linear prediction filter 25 is driven by the second drive signal to create synthesized speech at the interpolated sampling interval, which is supplied to the decimator 26.

[0018] The decimator 26 decimates the synthesized speech at the interpolated sampling interval to create synthesized speech at the sampling interval before interpolation, and outputs it from the decoder output terminal 31.
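The receiving-side chain (drive signal synthesizer 23, pitch reproduction filter 24, linear prediction filter 25, decimator 26) might be sketched as follows. The filter forms are assumptions: a one-tap pitch filter d[n] = e[n] + gain·d[n − lag] and an all-pole linear prediction filter. The gain parameter, the interpolation factor of 2, and all function names are hypothetical and do not appear in the description.

```python
def pitch_filter(excitation, lag, gain):
    """Assumed one-tap pitch filter: d[n] = e[n] + gain * d[n - lag].
    `lag` is counted in interpolated (fine-grid) samples."""
    d = []
    for n, e in enumerate(excitation):
        past = d[n - lag] if n >= lag else 0.0
        d.append(e + gain * past)
    return d


def lpc_synthesis(drive, coeffs):
    """Assumed all-pole filter: y[n] = d[n] + sum_k coeffs[k-1] * y[n-k]."""
    y = []
    for n, d in enumerate(drive):
        acc = d
        for k, a in enumerate(coeffs, start=1):
            if n >= k:
                acc += a * y[n - k]
        y.append(acc)
    return y


def decode(code, lag, gain, coeffs, factor=2):
    """First drive signal -> second drive signal -> fine-grid speech -> decimation."""
    first_drive = []
    for c in code:                      # code values sit on original sample positions
        first_drive.append(c)
        first_drive.extend([0.0] * (factor - 1))
    second_drive = pitch_filter(first_drive, lag, gain)
    fine_speech = lpc_synthesis(second_drive, coeffs)
    return fine_speech[::factor]        # keep only the original-rate samples


# A lag of 3 fine-grid samples is 1.5 original samples.
print(decode([1.0, 0.0, 0.0, 0.0], lag=3, gain=0.5, coeffs=[]))  # [1.0, 0.0, 0.0, 0.25]
```

With an odd fine-grid lag, the first repetition of the pulse falls on an interpolated position and is removed by decimation, while the second repetition lands back on an original sample position.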

[0019] Next, each part of the multi-pulse coding method of this embodiment will be described with reference to FIG. 2.

[0020] (1) Pitch extraction
Pitch extraction is performed on the speech signal X_I(n) (n = 0, 1/2, 1, 1 1/2, ..., N−1) (shown in FIG. 2(b)), which is obtained by interpolating the input speech X(n) (n = 0, 1, ..., N−1) (shown in FIG. 2(a)). Because the sampling interval becomes finer, a more accurate pitch can be obtained, provided the interpolation is performed appropriately.
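To illustrate why the finer grid helps, the sketch below searches for the pitch lag by maximizing a normalized autocorrelation. This search criterion is an assumption (the description does not say how the pitch is extracted), and the test signal, lag ranges, and function name are hypothetical. The test signal is a sinusoid with a period of 10.5 original samples, generated directly on the half-sample grid rather than by interpolation.

```python
import math


def pitch_lag(signal, min_lag, max_lag):
    """Return the lag (in samples of `signal`) that maximizes the
    normalized autocorrelation num / sqrt(den)."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        num = sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        den = sum(signal[n - lag] ** 2 for n in range(lag, len(signal)))
        score = num / math.sqrt(den) if den > 0 else float("-inf")
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Period: 21 fine-grid samples = 10.5 original samples.
fine = [math.sin(2.0 * math.pi * n / 21.0) for n in range(210)]
coarse = fine[::2]  # samples at the original positions only

lag_fine = pitch_lag(fine, 16, 40)      # 8 to 20 original samples, half-sample steps
lag_coarse = pitch_lag(coarse, 8, 20)   # same range, whole-sample steps
print(lag_fine / 2.0, lag_coarse)
```

In this constructed case the fine-grid search can represent the 10.5-sample period exactly, whereas the original-rate search can only return a whole-sample lag next to it.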

[0021] (2) Codebook search
The codebook is searched so that the synthesized speech obtained by driving the cascade of the pitch reproduction filter and the linear prediction filter with the code is closest to the original speech. Here, the same codebook as in the case without interpolation is used, and codes are placed only at the sample positions of the input speech X(n) before interpolation (shown in FIG. 2(c)). However, because the pitch period is expressed in units of the interpolated sampling interval, the drive signal D(n) obtained by driving the pitch reproduction filter with the code (shown in FIG. 2(d)) also has pulses at the interpolated sampling positions. In this way, the synthesized speech obtained by driving the cascade of the pitch reproduction filter and the linear prediction filter with the code can be brought closer to the original speech than when the codebook is searched without interpolation.
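The search described above can be sketched as an exhaustive analysis-by-synthesis loop. The squared-error criterion, the one-tap pitch filter with a gain, the all-pole linear prediction filter, and every name below are assumptions made for illustration; the description only requires that the synthesized speech be "closest" to the original.

```python
def synthesize(code, lag, gain, coeffs, factor=2):
    """Drive the assumed cascade (pitch filter, then all-pole filter) on the fine grid.
    Code values are placed on original sample positions only; `lag` is in
    fine-grid samples, so the drive signal can reach interpolated positions."""
    excitation = []
    for c in code:
        excitation.append(c)
        excitation.extend([0.0] * (factor - 1))
    drive, speech = [], []
    for n, e in enumerate(excitation):
        drive.append(e + (gain * drive[n - lag] if n >= lag else 0.0))
        acc = drive[n]
        for k, a in enumerate(coeffs, start=1):
            if n >= k:
                acc += a * speech[n - k]
        speech.append(acc)
    return speech


def search_codebook(target, codebook, lag, gain, coeffs, factor=2):
    """Return the index of the code whose synthesized speech has the
    smallest squared error against the fine-grid target."""
    best_index, best_err = -1, float("inf")
    for index, code in enumerate(codebook):
        synth = synthesize(code, lag, gain, coeffs, factor)
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if err < best_err:
            best_index, best_err = index, err
    return best_index


codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
target = synthesize(codebook[1], lag=3, gain=0.5, coeffs=[0.9])
print(search_codebook(target, codebook, lag=3, gain=0.5, coeffs=[0.9]))  # 1
```

The codebook entries have the same length as at the original sampling rate, which is why the transmitted code index does not change when interpolation is introduced.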

[0022] Furthermore, because the same codebook as in the case without interpolation is used and codes are placed only at the sample positions of the input speech X(n) before interpolation, the code information to be transmitted does not change. Only the amount of pitch information increases, but since the pitch information is much smaller than the code information, the total amount of transmitted information remains almost unchanged.
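The size of the increase in pitch information can be estimated with a short calculation. The lag range of 20 to 147 original samples is a hypothetical figure chosen only for illustration; the description gives no numeric ranges or bit allocations.

```python
def lag_bits(min_lag, max_lag, factor):
    """Bits needed to index every pitch lag between min_lag and max_lag
    (in original samples) at a resolution of 1/factor sample."""
    steps = (max_lag - min_lag) * factor + 1
    return (steps - 1).bit_length()


# Hypothetical lag range: 20 to 147 original samples.
print(lag_bits(20, 147, 1))  # 7 bits at whole-sample resolution
print(lag_bits(20, 147, 2))  # 8 bits at half-sample resolution
```

Under these assumed figures, doubling the pitch resolution costs one extra bit per transmitted pitch value, which is small when the code information takes up most of each frame.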

[0023]

[Effects of the Invention] As described above, the present invention creates a speech signal with a fine sampling interval by appropriately interpolating the input speech and, using pitch information with this fine sampling resolution calculated from that speech signal, performs the codebook search while placing codes only at the sample positions of the input speech before interpolation. This has the effect of improving sound quality with almost no change in the amount of transmitted information compared with the case without interpolation.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the code-driven linear predictive speech coding method according to the present invention.

FIG. 2 is a diagram showing the signals at each part in the code-driven linear predictive speech coding method according to the present invention.

[Explanation of symbols]

1: Interpolator
2: Linear prediction analyzer
3: Pitch extractor
4: Codebook
5: Codebook searcher
6: Encoder
10: Encoder input terminal
11: Encoder output terminal
21: Decoder
22: Codebook
23: Drive signal synthesizer
24: Pitch reproduction filter
25: Linear prediction filter
26: Decimator
30: Decoder input terminal
31: Decoder output terminal

Claims

(57) [Claims]

1. A code-driven linear predictive speech coding method using pitch information, wherein: a speech signal with an increased number of samples is created by interpolating input speech; linear prediction filter coefficients and pitch information are calculated from the speech signal with the increased number of samples; a pitch reproduction filter is constructed from the pitch information, and a linear prediction filter is constructed from the linear prediction filter coefficients; the same codebook as in the case without interpolation is used; codes are placed only at the sample positions of the input speech before interpolation; the pitch period is expressed in units of the interpolated sampling interval, so that the drive signal obtained by driving the pitch reproduction filter with the code is also placed at the interpolated sampling positions; the codebook is searched so that the synthesized speech obtained by driving the cascade of the pitch reproduction filter and the linear prediction filter with the code is closest to the original speech, and a predetermined number of pieces of code information are obtained; the linear prediction filter coefficients, the pitch information, and the code information are encoded and transmitted; on the receiving side, the received code is decoded to restore the linear prediction filter coefficients, the pitch information, and the code information; a first drive signal is synthesized by reading codes from the codebook according to the code information; the pitch reproduction filter is constructed from the pitch information, and a second drive signal is synthesized by driving the pitch reproduction filter with the first drive signal; the linear prediction filter is constructed from the linear prediction filter coefficients; and speech is reproduced by decimating the signal obtained by driving the linear prediction filter with the second drive signal.