JPS6235680B2

Info

Publication number: JPS6235680B2
Application number: JP55136282A
Authority: JP
Inventors: Kazunori Ozawa; Taku Arazeki
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1980-09-30
Filing date: 1980-09-30
Publication date: 1987-08-03
Also published as: JPS5762096A

Description

[Detailed description of the invention]

The present invention relates to a high-efficiency coded transmission system for speech signals, and in particular to a low-rate speech transmission system operating at around 10 kbit/s.

One proposed high-efficiency coding scheme for speech converts the signal from the time domain to the frequency domain with an orthogonal transform, then quantizes and encodes the signal spectrum in the frequency domain for transmission. A representative example is adaptive transform coding (hereinafter ATC), described in the paper "Adaptive Transform Coding of Speech Signals," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, pp. 299-309, August 1977 (Reference 1). An improvement on ATC that achieves high-quality speech transmission with less transmitted information is vocoder-driven adaptive transform coding (hereinafter VD-ATC), proposed by R. E. Crochiere et al. Its details are given in the paper "Frequency Domain Coding of Speech," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, pp. 512-530, October 1979 (Reference 2).

FIG. 1 is a block diagram illustrating the VD-ATC system. On the transmitting side, a sampled speech signal sequence x(n) is applied to input terminal 10. Buffer memory 15 stores the input signal M samples at a time so that it can be processed in blocks of M samples. Window function circuit 20 multiplies the M samples held in buffer memory 15 by a predetermined window function. FIG. 2 shows an example of the window function described in Reference 2. In FIG. 2, n denotes the sample time and m the number of overlapping samples. Defining a block as a sequence of M sample values, the solid line in FIG. 2 shows the window for the current block and the broken lines show the windows for the preceding and following blocks. The output of window function circuit 20 is applied to discrete cosine transform circuit (hereinafter DCT circuit) 30. According to Reference 2, the M-sample discrete cosine transform (hereinafter DCT) is computed by equations (1) and (2).

[Formula]

In equation (1), v(n) (n = 0, 1, …, M−1) is the output of window function circuit 20, and V_c(k) (k = 0, 1, …, M−1) denotes the M-sample DCT coefficients of v(n). (The same notation is used below.) The inverse cosine transform (hereinafter IDCT) is computed by the following equation.

[Formula]

A known way to compute the M-sample DCT coefficients uses a 2M-sample discrete Fourier transform (hereinafter DFT). It is outlined briefly here. Let u(n) be the sequence of M sample values, defined as follows.

[Formula]

Let U(k) be the 2M-sample DFT:

[Formula]

The M-sample DCT coefficients V_c(k) are then obtained as

$$V_c(k) = \mathrm{Re}\{\, c(k)\, \exp(-j\pi k/2M)\, U(k) \,\}, \quad k = 0, 1, 2, \dots, M-1 \tag{6}$$

where Re{ } denotes the real part. A method that greatly reduces the amount of computation is described in detail in J. Makhoul, "A Fast Cosine Transform in One and Two Dimensions," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, pp. 27-34, February 1980, so it is not explained here.
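Because the equation images for u(n), U(k), and c(k) are not reproduced in this text, the following is only a minimal sketch of how equation (6) could be evaluated. It assumes that u(n) is v(n) zero-padded to 2M samples, that U(k) is the ordinary 2M-point DFT, and that c(0) = 1 and c(k) = √2 otherwise. These three choices are assumptions consistent with equation (6), not definitions taken from the source. The function names `dct_via_dft` and `dct_direct` are hypothetical.

```python
import cmath
import math


def dct_via_dft(v):
    """M-point DCT evaluated through a 2M-point DFT, in the form of equation (6)."""
    M = len(v)
    # Assumption: u(n) equals v(n) for n < M and is zero for M <= n < 2M.
    u = list(v) + [0.0] * M
    coeffs = []
    for k in range(M):
        # Direct (slow) evaluation of the 2M-point DFT at index k.
        U_k = sum(u[n] * cmath.exp(-2j * math.pi * n * k / (2 * M))
                  for n in range(2 * M))
        # Assumption: c(0) = 1, c(k) = sqrt(2) for k >= 1.
        c_k = 1.0 if k == 0 else math.sqrt(2.0)
        coeffs.append((c_k * cmath.exp(-1j * math.pi * k / (2 * M)) * U_k).real)
    return coeffs


def dct_direct(v):
    """Reference cosine sum using the same assumed normalization c(k)."""
    M = len(v)
    return [
        (1.0 if k == 0 else math.sqrt(2.0))
        * sum(v[n] * math.cos((2 * n + 1) * math.pi * k / (2 * M))
              for n in range(M))
        for k in range(M)
    ]
```

Under these assumptions the phase factor exp(−jπk/2M) turns each DFT term into cos((2n+1)πk/2M), which is why the two functions agree. A practical coder would replace the direct sums with an FFT.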
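Returning to FIG. 1, the output of DCT circuit 30, that is, the M-sample DCT coefficients, is applied to autocorrelation calculation circuit 40 and to DCT-coefficient quantizing encoder 110. Autocorrelation calculation circuit 40 squares each of the M input DCT coefficients, applies a 2M-sample inverse DFT (IDFT) to the results, and obtains M pseudo-autocorrelation values. These M values are supplied to pitch extraction circuit 50 and to prediction parameter calculation circuit 70. Pitch extraction circuit 50 uses the M autocorrelation values to compute the pitch period P and the pitch gain P_G, and supplies them to quantizer 60.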
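The text does not say how the M squared coefficients are arranged into a 2M-point sequence or how the pitch is picked, so the following sketch fills those gaps with labeled assumptions: an even-symmetric 2M-point extension with the bin k = M set to zero, and a pitch estimate taken as the largest value within a lag range, with the normalized peak used as the pitch gain. The names `pseudo_autocorrelation` and `pick_pitch` are hypothetical.

```python
import math


def pseudo_autocorrelation(dct_coeffs):
    """M pseudo-autocorrelation values from squared DCT coefficients."""
    M = len(dct_coeffs)
    power = [c * c for c in dct_coeffs]
    # Assumption: even-symmetric 2M-point extension, bin k = M set to zero.
    extended = power + [0.0] + power[:0:-1]
    N = 2 * M
    # The extended sequence is even, so its inverse DFT is a real cosine sum.
    return [
        sum(extended[k] * math.cos(2 * math.pi * k * n / N) for k in range(N)) / N
        for n in range(M)
    ]


def pick_pitch(r, min_lag, max_lag):
    """Assumed pitch rule: largest value in [min_lag, max_lag], gain = r(lag) / r(0)."""
    lag = max(range(min_lag, max_lag + 1), key=lambda n: r[n])
    return lag, r[lag] / r[0]
```

For example, a spectrum with equally spaced harmonic peaks every 8 bins in a block of M = 32 produces an autocorrelation peak at lag 8 under these assumptions.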
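Prediction parameter calculation circuit 70 uses the M autocorrelation values to compute a predetermined number of prediction parameters (K parameters, as an example) and supplies the K-parameter sequence to quantizer 60. K parameters are also called PARCOR parameters. Quantizer 60 quantizes the pitch period P, the pitch gain P_G, and the K-parameter sequence with predetermined numbers of quantization bits, and supplies the quantized parameter values to inverse quantizer 65 and side information encoder 130. Inverse quantizer 65 dequantizes these values and supplies the resulting pitch period P′ and pitch gain P_G′ to spectrum regeneration circuit 90. It supplies the dequantized K-parameter sequence to parameter conversion circuit 80. Parameter conversion circuit 80 converts the K-parameter sequence into a parameter sequence suited to spectrum regeneration (α parameters, for example) and supplies it to spectrum regeneration circuit 90. The α parameters are also called linear prediction coefficients. Spectrum regeneration circuit 90 computes the pitch spectrum σ_p(k) from P′ and P_G′, and the envelope spectrum σ_f(k) from the α parameters. The envelope spectrum is also called the formant spectrum. The circuit then regenerates the spectrum from σ_p(k) and σ_f(k). The regenerated spectrum is written σ(k) (k = 0, 1, …, M−1) below. Next, σ_f(k) and σ(k) are applied to bit allocation and step size calculation circuit 100, whose algorithm is as follows.

The step size is computed first:

$$\Delta(k) = Q \cdot A(b(k)) \cdot \sigma(k), \quad k = 0, 1, \dots, M-1 \tag{7}$$

Here Δ(k) is the quantization step size used to quantize the k-th DCT coefficient, σ(k) is the regenerated spectrum, b(k) is the number of allocated bits, and A(b(k)) is a constant determined by b(k). Q is a loading factor that determines the noise characteristics of the quantizer with respect to loading.

The number of bits allocated to the k-th DCT coefficient is computed by

$$b(k) = \delta + \frac{1}{2}\log_2 \frac{w(k)\,\sigma^2(k)}{D}, \quad k = 0, 1, \dots, M-1 \tag{8}$$

where δ is a correction term for application to a practical quantizer, w(k) is a frequency weighting function, and D is the quantization noise power given by the following equation.

[Formula] (9)

Here σ_e²(k) is the quantization noise power of the k-th DCT coefficient. The frequency weighting function w(k) is

$$w(k) = \sigma_f^{2\gamma}(k), \quad k = 0, 1, \dots, M-1 \tag{10}$$

where σ_f(k) is the envelope spectrum described above and γ is a factor, −1 ≤ γ ≤ 0, that determines the frequency characteristic of the quantization noise power. FIG. 3 shows how the quantization noise power σ_e²(k) changes with γ. In FIG. 3 the horizontal axis is k and the vertical axis is log power. The solid line is the signal power spectrum and the broken lines are quantization noise power spectra. For γ = 0 the quantization noise power spectrum is flat, and for γ = −1 it has the same shape as the signal power spectrum.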
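Equations (7), (8), and (10) can be sketched as follows. This is an illustration under assumptions, not the circuit's actual procedure: the noise power D and the correction term δ are passed in as given numbers, the allocation is left real-valued (any rounding or clipping to integer bit counts is not described in the text), and the constant A(b) is supplied by the caller because its values are not given. The names `allocate_bits` and `step_sizes` are hypothetical.

```python
import math


def allocate_bits(sigma, sigma_f, gamma, D, delta=0.0):
    """Real-valued bit allocation b(k) following equations (8) and (10)."""
    bits = []
    for s, sf in zip(sigma, sigma_f):
        w = sf ** (2.0 * gamma)                               # equation (10)
        bits.append(delta + 0.5 * math.log2(w * s * s / D))   # equation (8)
    return bits


def step_sizes(sigma, bits, Q, A):
    """Step size per equation (7); A is a caller-supplied function of b(k)."""
    return [Q * A(b) * s for s, b in zip(sigma, bits)]
```

With γ = −1 and a spectrum that equals its envelope, every coefficient receives the same number of bits, so the step size, and with it the noise, follows σ(k). That matches the statement that the noise spectrum then has the shape of the signal spectrum.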
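A value in the range −1 < γ < 0 is normally used. Changing the shape of the quantization noise power spectrum through the value of γ is called noise shaping.

DCT-coefficient adaptive quantizing encoder 110 is described next. Using the quantization step size Δ(k) and the allocated bits b(k) output by bit allocation and step size calculation circuit 100, it adaptively quantizes and encodes the DCT coefficients output by DCT circuit 30. The encoded DCT coefficients are supplied to multiplexer 120. Side information encoder 130 encodes the quantized parameter values from quantizer 60 as side information and supplies them to multiplexer 120. Multiplexer 120 transmits the output codes of side information encoder 130 and of DCT-coefficient adaptive quantizing encoder 110 through transmitting-side output terminal 140 once every M sample times.

The receiving side operates as follows. The transmitted code received at receiving-side input terminal 150 is applied to demultiplexer 160, which separates it into codes representing the DCT coefficients and codes representing the side information. The former are supplied to DCT-coefficient adaptive decoder 200 and the latter to side information decoder 170. Side information decoder 170 decodes its input, separates the pitch period P′ and pitch gain P_G′ from the K-parameter sequence, supplies the K-parameter sequence to parameter conversion circuit 175, and supplies P′ and P_G′ to spectrum regeneration circuit 180. Parameter conversion circuit 175 operates identically to parameter conversion circuit 80 on the transmitting side: it converts the K-parameter sequence into an α-parameter sequence and supplies it to spectrum regeneration circuit 180. Spectrum regeneration circuit 180 operates identically to spectrum regeneration circuit 90 on the transmitting side and supplies the regenerated spectrum σ(k) and the envelope spectrum σ_f(k) to bit allocation and step size calculation circuit 190. Circuit 190 operates identically to circuit 100 on the transmitting side and supplies the step size Δ(k) and the allocated bits b(k) to DCT-coefficient adaptive decoder 200. Decoder 200 uses Δ(k) and b(k) to decode the values received from demultiplexer 160 into DCT coefficients. The DCT coefficients are applied to inverse DCT (IDCT) circuit 210, which applies the inverse cosine transform to the M-sample DCT coefficients and yields M samples of the reproduced signal x̂(n) (n = 0, 1, …, M−1). Buffer memory circuit 220 stores the M samples of x̂(n) and then outputs them through receiving-side output terminal 230.

The VD-ATC system described above delivers very high-quality speech at transmission rates of around 16 kbit/s. However, because it must transmit both the DCT coefficients and the side information, the transmitted information is hard to compress substantially, and at low rates (9.6 kbit/s or below, for example) the quality of the reproduced speech degrades. One example of this degradation is that the pitch information is not reproduced adequately, so the reproduced speech sounds hoarse. The cause is that at low rates the envelope spectrum, the pitch spectrum, and so on cannot be reproduced adequately.

An object of the present invention is to provide an adaptive speech signal transmission method that reduces the amount of transmitted information and reproduces high-quality speech even at low transmission rates.

The transmission method of the invention is an adaptive speech signal transmission method in which an orthogonal transform sequence, obtained by orthogonally transforming a sequence of sample values obtained by sampling a speech signal, is adaptively quantized and transmitted. It is characterized by: generating a first signal sequence representing the discrete orthogonal-transform sequence of the sample value sequence; generating, from the first signal sequence, a second signal sequence representing features of the first signal sequence; generating, from the second signal sequence, a third signal sequence representing a spectrum; computing the difference sequence between the first and third signal sequences; generating a fourth signal sequence by adaptively quantizing the difference sequence; and transmitting the second and fourth signal sequences in combination.

The invention is now described in detail with reference to the drawings. FIG. 4 is a block diagram of a first embodiment of the invention. This embodiment uses the DCT as the orthogonal transform. On the transmitting side of FIG. 4, buffer memory 15 and window function circuit 20 operate identically to the elements with the same numbers in FIG. 1. DCT circuit 300 applies the discrete cosine transform (DCT) to the M outputs of window function circuit 20 and outputs the DCT coefficients V_c(k) (k = 0, 1, …, M−1). As noted in the description of VD-ATC, the DCT can be computed via the DFT or by the methods of Reference 3 (the fast cosine transform paper cited above) and similar works. Absolute value circuit 310 supplies the sign information of the input V_c(k) (k = 0, 1, …, M−1) to adaptive quantizing encoder 390. It also computes the absolute value |V_c(k)| and supplies the result to logarithm calculation circuit 320 and subtractor 380. Logarithm calculation circuit 320 computes the logarithm log|V_c(k)| (k = 0, 1, …, M−1) of the absolute values of the DCT coefficients. The 2M-point inverse discrete Fourier transform (hereinafter IDFT) circuit 330 computes the IDFT of the output of logarithm calculation circuit 320 and obtains the pseudo-cepstrum C(n) (n = 0, 1, …, 2M−1). It is called "pseudo" because the cepstrum is derived from DCT coefficients.

The cepstrum is described in detail in D. G. Childers et al., "The Cepstrum: A Guide to Processing," Proceedings of the IEEE, vol. 65, pp. 1428-1443, October 1977, so it is not explained here.

The pseudo-cepstrum C(n) is applied to lifter circuit 340. Lifter circuit 340 multiplies C(n) by a predetermined window function, separates and extracts the low-time portion and the high-time portion, and supplies the low-time portion to quantizer 350 and the high-time portion to pitch detection circuit 345. It is generally known that the low-time portion of the cepstrum carries the envelope spectrum information of the input speech signal and the high-time portion carries the pitch (pitch period and pitch gain), so the same is expected to hold for the pseudo-cepstrum. Pitch detection circuit 345 detects the pitch period P and the pitch gain P_G and supplies them to quantizer 350. Quantizer 350 quantizes the low-time portion of the pseudo-cepstrum output by lifter circuit 340, and the pitch period P and pitch gain P_G output by pitch detection circuit 345, with predetermined numbers of quantization bits. The quantized values of the low-time pseudo-cepstrum, the pitch period P, and the pitch gain P_G are called the side information below. The side information output by quantizer 350 is applied to decoder 355 and to side information encoder 410. Decoder 355 decodes the side information and supplies it to DFT calculation circuit 360. Discrete Fourier transform (hereinafter DFT) circuit 360 applies a 2M-point DFT to the decoded side information and obtains the regenerated spectrum in the log domain. Writing the first M points of this regenerated spectrum as σ_L′(k) (k = 0, 1, …, M−1), σ_L′(k) is composed of the pitch spectrum σ_P′(k) and the envelope spectrum σ_f′(k) (k = 0, 1, …, M−1). Examples of the envelope spectrum σ_f′(k), the pitch spectrum σ_P′(k), and the regenerated spectrum σ_L′(k) are shown in FIG. 5(a), (b), and (c).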
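The cepstral path of the first embodiment (logarithm circuit 320, 2M-point IDFT circuit 330, lifter circuit 340, and 2M-point DFT circuit 360) can be sketched as below. Several details are assumptions because the text does not fix them: the M log magnitudes are extended to 2M points by even symmetry with bin M repeating bin M−1, the lifter is a rectangular window with a hypothetical `cutoff`, and a small `floor` guards the logarithm. Quantization of the side information and the regeneration of the pitch spectrum from P and P_G are omitted, so the sketch reproduces only the envelope part of σ_L′(k).

```python
import math


def pseudo_cepstrum(dct_coeffs, floor=1e-12):
    """2M-point pseudo-cepstrum C(n) from M DCT coefficients."""
    log_mag = [math.log(max(abs(c), floor)) for c in dct_coeffs]
    # Assumption: even-symmetric 2M-point extension; bin M repeats bin M - 1.
    extended = log_mag + [log_mag[-1]] + log_mag[:0:-1]
    N = len(extended)
    # An even sequence has a real inverse DFT, so a cosine sum is enough.
    return [
        sum(extended[k] * math.cos(2 * math.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def lifter(cepstrum, cutoff):
    """Split C(n) into low-time and high-time parts (rectangular window assumed)."""
    N = len(cepstrum)
    low = [c if min(n, N - n) < cutoff else 0.0 for n, c in enumerate(cepstrum)]
    high = [c - l for c, l in zip(cepstrum, low)]
    return low, high


def log_spectrum(cepstrum):
    """First M points of the 2M-point DFT of an even cepstrum (log-domain spectrum)."""
    N = len(cepstrum)
    return [
        sum(cepstrum[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
        for k in range(N // 2)
    ]
```

Keeping only a short low-time portion yields a smoothed log spectrum; keeping every coefficient returns log|V_c(k)| exactly, which gives a convenient check of the transform pair.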
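In FIG. 5 each horizontal axis is the value of k (0 to M−1) and each vertical axis is log amplitude. For visual clarity the spectral values are drawn as continuous curves.

The log-domain regenerated spectrum σ_L′(k) is supplied to exponent calculation circuit 370 and to bit allocation and step size calculation circuit 400. Exponent calculation circuit 370 computes the exponential of its input, that is, it performs the inverse of the processing in logarithm calculation circuit 320. Its output σ′(k) (k = 0, 1, …, M−1) is supplied to subtractor 380, which computes the difference e(k) (k = 0, 1, …, M−1) between |V_c(k)|, the output of absolute value circuit 310, and σ′(k), and supplies the result to adaptive quantizing encoder 390. Adaptive quantizing encoder 390 adaptively quantizes e(k) using the quantization step size and the allocated bits output by bit allocation and step size calculation circuit 400, and encodes it together with the sign information of V_c(k) supplied by absolute value circuit 310. The code sequence from the adaptive quantizing encoder is supplied to multiplexer 420. Bit allocation and step size calculation circuit 400 uses the regenerated spectrum σ_L′(k) supplied by DFT calculation circuit 360 to compute the number of quantization bits and the quantization step size for the difference e(k). Equation (7) can be used to compute the step size. The simplest way to allocate quantization bits is to assign the same number of bits (2 bits, for example) to every coefficient. With this allocation the quantization noise power spectrum has a shape similar to the signal power spectrum, that is, the same characteristic as γ = −1 in FIG. 3. In the conventional VD-ATC system, quantizing with equal bit counts requires a considerable number of bits (4 or 5) for each DCT coefficient V_c(k), which greatly increases the transmitted information. In the present invention the difference e(k) is quantized instead of the DCT coefficient, so even with few bits there is almost no loss of quality and the transmitted information does not grow much. Adaptive bit allocation or noise shaping can of course also be applied, in which case equations (8) to (10) are used. For noise shaping to be effective in this embodiment, however, γ must take values different from those given in the description of the conventional VD-ATC system. In this embodiment the quantization noise power spectrum is flat for γ = 1 and similar to the signal power spectrum for γ = 0. A value in the range 0 < γ < 1 must therefore be used.

Multiplexer 420 receives the output code sequence of adaptive quantizing encoder 390 and the output code sequence of side information encoder 410, multiplexes them, and transmits them to the receiving side through transmitting-side output terminal 140.

The receiving side operates as follows. The transmitted code sequence is received through receiving-side input terminal 150 and applied to demultiplexer 430. Demultiplexer 430 separates the code sequence representing the difference e(k) (k = 0, 1, …, M−1) from the code sequence representing the side information, supplying the former to adaptive decoder 470 and the latter to side information decoder 440. Side information decoder 440 decodes its input to obtain the decoded side information and outputs it to DFT calculation circuit 450. DFT calculation circuit 450 operates identically to DFT calculation circuit 360 on the transmitting side and regenerates the log-domain spectrum σ_L′(k) (k = 0, 1, …, M−1) from the decoded side information. σ_L′(k) is output to bit allocation and step size calculation circuit 460 and to exponent calculation circuit 490. Circuit 460 operates identically to circuit 400 on the transmitting side, and exponent calculation circuit 490 operates identically to exponent calculation circuit 370 on the transmitting side. The output σ′(k) (k = 0, 1, …, M−1) of exponent calculation circuit 490 is supplied to adder 480. Adaptive decoder 470 performs the inverse of adaptive quantizing encoder 390 on the transmitting side. It separates the sign information of the DCT coefficients from the input code sequence and supplies it to IDCT circuit 500. Using the allocated bits and the quantization step size, it also decodes the codes representing the difference and supplies the decoded difference ê(k) to adder 480. Adder 480 adds ê(k) and σ′(k) and obtains |V̂_c(k)| (k = 0, 1, …, M−1), the absolute values of the DCT coefficients, which are supplied to IDCT (inverse cosine transform) circuit 500. IDCT circuit 500 forms the DCT coefficients V̂_c(k) (k = 0, 1, …, M−1) from the sign information and |V̂_c(k)|, applies an M-point inverse DCT to V̂_c(k), and obtains the reproduced signal x̂(n) (n = 0, 1, …, M−1). The output x̂(n) of IDCT circuit 500 is supplied to buffer memory circuit 220 and, once M samples have accumulated, is output as the reproduced speech signal through receiving-side output terminal 230.

Compared with the conventional VD-ATC system, this configuration has the following advantages.

(1) This embodiment quantizes and transmits the difference e(k) instead of quantizing the DCT coefficients directly. The information needed to transmit the quantized difference can be reduced to nearly 1/2 of that needed to transmit the DCT coefficients in the conventional system, so speech quality equal to that of the conventional VD-ATC system is obtained with less transmitted information.

(2) In the conventional VD-ATC system the side information serves to compute the quantization step size and the bit allocation. When the transmitted information is reduced to 9.6 kbit/s or below, more of the spectrum is lost and the receiving side cannot faithfully reproduce the spectral structure needed to reproduce the speech signal, namely the spectral envelope structure and the pitch structure, so quality degrades. With the configuration of this embodiment, the receiving side uses the transmitted side information to reproduce the spectral structure needed to reproduce the speech signal, so high-quality reproduced speech is obtained even when the transmitted information is reduced to 9.6 kbit/s or below.

(3) Even when a simple bit allocation is used to reduce computation (for example, 2 bits for every difference value), there is almost no loss of quality.

When the DCT coefficients are computed via the DFT as in equations (5.a), (5.b), and (6), the cepstrum may be computed from the DFT coefficients. A cepstrum obtained this way approximates the spectral structure of the input signal better than the pseudo-cepstrum computed from the DCT coefficients. In that case M-point DCT calculation circuits can be used in place of the 2M-point DFT calculation circuits 360 and 450. Using M-point DCT calculation circuits reduces the computation required for the DCT calculation to about 1/2 and also reduces the amount of difference information transmitted.

In FIG. 4 the cepstrum is quantized directly. Because the cepstrum is relatively sensitive to quantization, the low-time cepstrum representing the spectral envelope may instead be converted into parameters that are less sensitive to quantization (K parameters, for example) before quantization. This makes it possible to reduce the number of quantization bits for the K parameters and thus the amount of side information transmitted. Means other than the cepstrum can also be used to represent the spectral structure with few bits, for example K parameters derived from the pseudo-autocorrelation values as in the conventional VD-ATC system.

As an example, FIG. 9(a) to (i) show the signals at various points when an arbitrary block of an input speech signal sequence is processed by the method of the invention. The discrete sequences are drawn as continuous signals. FIG. 9(a) shows the output waveform of window function circuit 20 in the embodiment of FIG. 4, that is, the M-sample waveform within an arbitrary block of the input speech signal sequence; the horizontal axis is time and the vertical axis is amplitude. FIG. 9(b) shows the output of DCT circuit 300 in FIG. 4, the discrete cosine transform of the waveform in FIG. 9(a); the horizontal axis is frequency and the vertical axis is amplitude. FIG. 9(c) shows the output of logarithm calculation circuit 320 in FIG. 4; the horizontal axis is frequency and the vertical axis is amplitude. FIG. 9(d) shows the output of IDFT calculation circuit 330 in FIG. 4, the signal called the cepstrum; the horizontal axis is time and the vertical axis is amplitude. FIG. 9(e) shows the output of DFT calculation circuit 360 in FIG. 4; the horizontal axis is frequency and the vertical axis is amplitude. FIG. 9(f) shows the output waveform of exponent calculation circuit 370 in FIG. 4; the horizontal axis is frequency and the vertical axis is amplitude. FIG. 9(g) shows the allocated bits output by bit allocation and step size calculation circuit 400 in FIG. 4; the horizontal axis is frequency and the vertical axis is the number of allocated bits. FIG. 9(h) shows the input signal of IDCT circuit 500 in FIG. 4; the horizontal axis is frequency and the vertical axis is amplitude. FIG. 9(i) shows the output waveform of IDCT circuit 500 in FIG. 4, the reproduced speech signal; the horizontal axis is time and the vertical axis is amplitude.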
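The difference coding of the first embodiment (subtractor 380, adaptive quantizing encoder 390, adaptive decoder 470, adder 480) can be sketched as follows. The quantizer itself is not specified in the text, so a uniform mid-rise quantizer with clipping is assumed here, and clamping a negative reconstructed magnitude to zero is likewise an assumption. The names `encode_block` and `decode_block` are hypothetical, and `sigma_regen` stands for the regenerated spectrum σ′(k), which both sides derive from the side information.

```python
import math


def encode_block(V, sigma_regen, steps, bits):
    """Return (signs, indices) for one block of DCT coefficients V_c(k)."""
    signs, indices = [], []
    for v, s, step, b in zip(V, sigma_regen, steps, bits):
        signs.append(-1 if v < 0 else 1)     # sign information from circuit 310
        e = abs(v) - s                       # difference e(k) at subtractor 380
        lo, hi = -(1 << (b - 1)), (1 << (b - 1)) - 1
        # Assumed quantizer: uniform, mid-rise, clipped to b bits (b >= 1).
        indices.append(min(hi, max(lo, math.floor(e / step))))
    return signs, indices


def decode_block(signs, indices, sigma_regen, steps):
    """Rebuild the DCT coefficients from signs, indices, and sigma'(k)."""
    out = []
    for sg, i, s, step in zip(signs, indices, sigma_regen, steps):
        e_hat = (i + 0.5) * step             # assumed mid-rise reconstruction
        magnitude = max(0.0, e_hat + s)      # adder 480; zero clamp is an assumption
        out.append(sg * magnitude)
    return out
```

As long as the difference stays inside the quantizer range, the reconstruction error per coefficient is at most half a step, which is why a good regenerated spectrum lets a 2-bit quantizer suffice.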
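FIG. 6 is a block diagram of a second embodiment of the invention. In this embodiment the subtraction on the transmitting side and the addition on the receiving side are performed in the log domain rather than the linear domain. Elements in FIG. 6 bearing the same numbers as in FIG. 4 operate as in FIG. 4. With this configuration the exponent calculation circuit is unnecessary on the transmitting side and the difference spectrum can be quantized with fewer bits, so the transmitted information can be reduced further.

FIG. 7 is a block diagram of a third embodiment of the invention. Elements in FIG. 7 bearing the same numbers as in FIG. 4 operate as in FIG. 4. The transmitting side is described first. DFT calculation circuit 600 applies an M-point DFT to the M output samples of window function circuit 20. Writing the DFT coefficients as X(k) (k = 0, 1, …, M−1) and their real and imaginary parts as X_R(k) and X_I(k) (k = 0, 1, …, M−1), X_R(k) and X_I(k) are supplied to magnitude and phase calculation circuit 610. One example of the computation in circuit 610 is

$$|X(k)| = \sqrt{X_R(k)^2 + X_I(k)^2} \tag{11.a}$$

$$\arg(X(k)) = \tan^{-1}\!\left(X_I(k)/X_R(k)\right), \quad k = 0, 1, \dots, M-1 \tag{11.b}$$

where |X| is the magnitude of X and arg(X) is its phase. The DFT input is a speech signal and is therefore real. It is a known property of the DFT that for a real signal the real part of the DFT coefficients is an even function and the imaginary part is an odd function. Hence the magnitude of the DFT coefficients is even and the phase is odd, and all the information in the input signal is evidently contained in the M/2-point magnitude and phase sequences of the DFT coefficients. These M/2-point magnitude and phase sequences are written |X(k)|_{M/2} and arg(X(k))_{M/2} (k = 0, 1, …, M/2−1) below. The M-point |X(k)| is supplied to subtractor 380 and logarithm calculation circuit 320, and the M/2-point arg(X(k))_{M/2} is supplied to phase quantizing encoder 620. Logarithm calculation circuit 320 operates identically to logarithm calculation circuit 320 in FIG. 4. IDFT calculation circuit 335 performs an M-point inverse DFT on the output of logarithm calculation circuit 320. In this embodiment the output of IDFT calculation circuit 335 is the cepstrum C_P(n) (n = 0, 1, …, M−1). As is well known, the cepstrum C_P(n) is an even function. Lifter circuit 340, pitch detection circuit 345, quantizer 350, decoder 355, and side information encoder 410 operate identically to the elements with the same numbers in FIG. 4 and are not described again. DFT calculation circuit 365 performs an M-point DFT. In this embodiment the spectral structure is obtained by applying the DFT to a cepstrum, so a high-quality log-domain regenerated spectrum |X_L′(k)| (k = 0, 1, …, M−1) is obtained. Because |X_L′(k)| is of course even, only an M/2-point sequence (written |X_L′(k)|_{M/2} below) of the M-point output of DFT calculation circuit 365 is supplied to exponent calculation circuit 370 and to bit allocation and step size calculation circuit 400. Elements 370 and 400 operate identically to the elements with the same numbers in FIG. 4, except that in this embodiment they operate on M/2-point sequences. Subtractor 380 subtracts the M/2-point output sequence |X′(k)|_{M/2} of exponent calculation circuit 370 from |X(k)|_{M/2}, the M/2-point part of the M-point magnitude output |X(k)| (k = 0, 1, …, M−1) of magnitude and phase calculation circuit 610, and outputs the difference to adaptive quantizing encoder 390. Adaptive quantizing encoder 390 operates identically to the element with the same number in FIG. 4. Phase quantizing encoder 620 quantizes arg(X(k))_{M/2}. A simple quantization method is to assign the same number of bits (about 2 bits, for example) to each phase component. Multiplexer 630 receives the output of phase quantizing encoder 620, the output code sequence of adaptive quantizing encoder 390, and the output code sequence of side information encoder 410, multiplexes them, and transmits them to the receiving side through transmitting-side output terminal 140.

On the receiving side, the code received at input terminal 150 is applied to demultiplexer 640, which separates the received code sequence into the code sequence representing the phase, the code sequence representing the difference, and the code sequence representing the side information. These are supplied to phase decoder 650, adaptive decoder 470, and side information decoder 440, respectively. Phase decoder 650 decodes the M/2-point phase sequence from its input and outputs it to conversion circuit 660. Side information decoder 440 operates identically to the element with the same number in FIG. 4, but in this embodiment its output is a cepstrum. The cepstrum is supplied to DFT calculation circuit 455. DFT calculation circuit 455, exponent calculation circuit 490, and the bit allocation and step size calculation circuit operate identically to DFT calculation circuit 365, exponent calculation circuit 370, and bit allocation and step size calculation circuit 400 on the transmitting side, respectively. Adaptive decoder 470 decodes the difference information and outputs it to adder 480. Adder 480 adds the output of adaptive decoder 470 and the M/2-point output sequence |X′(k)|_{M/2} (k = 0, 1, …, M/2−1) of exponent calculation circuit 490, and outputs the resulting M/2-point magnitude sequence |X̂(k)|_{M/2} (k = 0, 1, …, M/2−1) to conversion circuit 660. Using the fact that the phase sequence is odd and the magnitude sequence is even, conversion circuit 660 obtains the M-point phase and magnitude sequences from the M/2-point sequences, converts them into the M-point real-part sequence X̂_R(k) and imaginary-part sequence X̂_I(k), and outputs these to IDFT calculation circuit 670. IDFT calculation circuit 670 applies an inverse DFT to the M-point X̂_R(k) and X̂_I(k) and obtains the reproduced signal x̂(n) (n = 0, 1, …, M−1). Buffer memory circuit 220 and receiving-side output terminal 230 operate identically to the elements with the same numbers in FIG. 4.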
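The role of conversion circuit 660 in the third embodiment, rebuilding an M-point spectrum from M/2 magnitudes and phases by even and odd symmetry, can be sketched as below. One point is an assumption: the bin k = M/2 is not among the M/2 points k = 0, …, M/2−1, and the text does not say how it is handled, so it is set to zero here. The phase is taken with a four-quadrant arctangent (`cmath.phase`), whereas equation (11.b) is written with a plain tan⁻¹. The function names are hypothetical and quantization is omitted.

```python
import cmath
import math


def dft(x):
    """Direct M-point DFT."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M))
            for k in range(M)]


def half_magnitude_phase(x):
    """Magnitude and phase for k = 0 .. M/2 - 1 of a real block x (M even)."""
    X = dft(x)
    half = len(x) // 2
    return [abs(c) for c in X[:half]], [cmath.phase(c) for c in X[:half]]


def rebuild_signal(mag_half, phase_half):
    """Rebuild M real samples from the M/2-point magnitude and phase sequences."""
    half = len(mag_half)
    M = 2 * half
    X = [0j] * M
    for k in range(half):
        X[k] = cmath.rect(mag_half[k], phase_half[k])
    # Assumption: bin k = M/2 is left at zero (it is not transmitted).
    for k in range(1, half):
        X[M - k] = X[k].conjugate()          # even magnitude, odd phase
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / M) for k in range(M)).real / M
            for n in range(M)]
```

For a block with no energy at the bin k = M/2 the round trip is exact up to rounding error; otherwise that component is lost under the zero assumption.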
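This embodiment provides the same advantages (1) to (3) described for the first embodiment. As in the first embodiment, the cepstrum may be converted into other parameters (K parameters, for example) before quantization and transmission, or means other than the cepstrum may be used. As shown in the second embodiment, the subtraction and addition may also be performed in the log domain. The benefits of doing so are the same as in the first and second embodiments.

The human ear is generally known to be relatively insensitive to the phase components of a signal, so in this embodiment even fairly coarse quantization of the phase components causes almost no loss of quality. Coarser phase quantization reduces the amount of transmitted information. As an alternative to the phase quantization method described for this embodiment, the phase components within a block may be approximated by a linear function, and the slope of that function together with the deviation of each phase component from the corresponding value of the function may be quantized and transmitted.

A minimum-phase condition can also be applied to the input speech sequence, that is, processing can proceed on the assumption that the input speech sequence satisfies the minimum-phase condition. Because the human ear is relatively insensitive to phase, as noted above, applying the minimum-phase condition causes little degradation of the speech. In this case the phase components can be regenerated on the receiving side, so they need not be transmitted. The transmitted information relating to the difference is therefore halved, and the overall amount of transmitted information can be reduced substantially.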
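The text does not say how the receiving side regenerates the phase under the minimum-phase assumption. One standard construction, shown here purely as an assumption and not as the patent's own procedure, is to fold the real cepstrum of the log magnitude into a causal sequence and take the imaginary part of its DFT. The function name `minimum_phase_from_magnitude` is hypothetical. The input is the full N-point magnitude sequence (N even, all values positive).

```python
import cmath
import math


def minimum_phase_from_magnitude(magnitude):
    """Phase of the minimum-phase spectrum that has the given magnitudes."""
    N = len(magnitude)
    log_mag = [math.log(m) for m in magnitude]
    # Real cepstrum: inverse DFT of the (even) log magnitude, a real cosine sum.
    c = [sum(log_mag[k] * math.cos(2 * math.pi * k * n / N) for k in range(N)) / N
         for n in range(N)]
    # Fold into a causal sequence: keep n = 0 and n = N/2, double 0 < n < N/2.
    folded = [0.0] * N
    folded[0] = c[0]
    for n in range(1, N // 2):
        folded[n] = 2.0 * c[n]
    folded[N // 2] = c[N // 2]
    # The DFT of the folded cepstrum is log|X| + j * phase.
    return [sum(folded[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)).imag
            for k in range(N)]
```

For a sequence that really is minimum phase, such as x = (1, 0.5, 0, …), the regenerated phase matches the true phase to within the cepstral aliasing error, which shrinks quickly as N grows.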
４の実施例を表わすブロツク図である。第８図に
おいて、第７図における構成要素と同一番号を記
した構成要素は、第７図と同一の動作を行なうの
で、ここでは説明を省略する。第８図におけるリ
フタ回路７００の高時間領域における動作は、第
７図におけるリフタ回路３４０と同一である。し
かしながら、低時間領域の動作においては、ケプ
ストラムに施すケプストラム窓は次に示す様にな
る。[Formula] In equation (1), v(n) (n=0,1,...,M
−1) is the output value of the window function circuit 20, and V _c (k)
(k=0, 1,...,M-1) indicates the DCT coefficient of M samples of V(n). (Similar expressions will be used hereinafter.) Inverse cosine transform (hereinafter referred to as IDCT) is calculated according to the following formula. The method for calculating the DCT coefficient for M samples is 2M
Discrete Fourier transform of the sample
A calculation method using Fourier Transform (hereinafter referred to as DFT) is known. This calculation method will be briefly explained below. Now, let u(n) be a sample value series of M samples, and define it as follows. If the DFT of 2M samples is U(k), U(k) is Therefore, the DCT series of M samples, V _c (k), can be found as follows. V _c (k)=Re{c(k)exp(-jπk/2M)U(k)} k=0,1,2...,M-1 (6) In the above formula, the symbol Re{ } is the real part represents. In addition, as a calculation method that significantly reduces the amount of calculation,
IE Transactions by J.MAKHOUL
IEEE Transactions on Acoustics, Speech and Signal Processing
Acoustics, Speech, and Signal Processing)
"A Fast Cosine Transform in One and Two Dimensions" published in 1980, Volume 28, February issue, pages 27-34 of One
It is explained in detail in the paper titled "And Two Dimensions", so I will not explain it here.
Returning to FIG. 2 again, the output value of the DCT circuit 30,
That is, the M sample DCT coefficients are generated by the autocorrelation calculation circuit 40 and the DCT coefficient quantization encoding circuit 1.
10 is input. The autocorrelation calculation circuit 40 calculates the square value of each DCT coefficient of M samples which are input values, and applies the inverse of 2M samples to these calculation results.
DFT (IDFT) is applied to calculate M pseudo-autocorrelation numbers. The M pseudo-autocorrelation numbers are provided to the pitch extraction circuit 50 and the prediction parameter calculation circuit 70, respectively. The pitch extraction circuit 50 has M
The pitch period P and the pitch gain P _G are calculated using the autocorrelation coefficients P and P G and are supplied to the quantizer 60 .
The prediction parameter calculation circuit 70 calculates a predetermined number of prediction parameter value sequences (as an example, a K parameter value sequence) using M autocorrelation numbers, and converts this K parameter value sequence into a quantizer 60. supply to. Note that the K parameter is also called a Percall parameter. The quantizer 60 quantizes the pitch period P, the pitch gain _PG , and the K parameter value series using a predetermined number of quantization bits, and provides the obtained quantization parameter values to the inverse quantizer 65 and the side information encoder 130. . The inverse quantizer 65 inversely quantizes the input quantization parameter value and supplies the obtained pitch period P' and pitch gain P _G ' to the spectrum regeneration circuit 90. Further, the K parameter value series obtained by inverse quantization is provided to the parameter conversion circuit 80. The parameter conversion circuit 80 converts the input K parameter value series into a parameter value series (for example, an α parameter value series) suitable for spectrum reproduction, and provides this α parameter value series to the spectrum reproduction circuit 90 . Note that the α parameter is also called a linear prediction coefficient. The spectrum reproducing circuit 90 calculates the pitch spectrum σ _p (k) using the pitch period P' and the pitch gain P _G '. Furthermore, the envelope spectrum σ _f (k) is calculated using the α parameter value series. Here, the envelope spectrum is also called formant spectrum. Furthermore, the spectrum is reproduced using σ _p (k) and σ _f (k). Below, the reproduced spectrum is σ(k)
It will be written as (k=0, 1..., M-1). Next, σ _f (k) and σ(k) are input to the bit allocation step size calculation circuit 100. The calculation algorithm of the bit allocation step size calculation circuit 100 will now be described. First, we will show the step size calculation algorithm. Δ(k)=Q・A(b(k))・σ(k) (k=0, 1,...,M−1) (7) Here, Δ(k) is the k-th DCT coefficient σ(k) is the reproduced spectrum. b(k) is the number of allocated bits, and A(b(k)) is a constant determined by b(k). Q is a load factor and determines the noise characteristics of the quantizer with respect to load. Next, the bit allocation calculation algorithm is shown. kth DCT
The number of bits allocated to the coefficient is calculated by the following equation. b(k)=δ+1/2log ₂ w(k)σ ² (k)/D (k=0, 1,...,M-1) (8) Here, δ is w(k) is the frequency weighting function, and D is the quantization noise power expressed by the following equation. Here, σ _e ² (k) indicates the quantization noise power of the k-th DCT coefficient. The frequency weighting function w(k) is
It is expressed as follows. w(k)=σ _f ² 〓(k) (k=0, 1,..., M-1) (10) Here, σ _f (k) is the aforementioned envelope spectrum. γ is a factor that determines the frequency characteristics of quantization noise power, and takes a value of −1≦γ≦0. FIG. 3 shows changes in the quantization noise power σ _e ² (k) depending on the value of γ.
In FIG. 3, the horizontal axis shows the value of k, and the vertical axis shows the logarithmic power. Moreover, the solid line part shows the signal power spectrum, and the broken line part shows the quantization noise power spectrum. When γ=0, the quantization noise power spectrum exhibits flat characteristics, and when γ=−1, it exhibits similar characteristics to the signal power spectrum.
As the value of γ, a value of −1<γ<0 is normally used. The method of changing the characteristics of the quantized noise power spectrum depending on the value of γ is called noise shaping. Next, the DCT coefficient adaptive quantization encoding circuit 110 will be explained. The DCT coefficient adaptive quantization encoding circuit 110 uses the quantization step size Δ(k) and the number of allocated bits b(k), which are the output values of the bit allocation step size calculation circuit 100, to calculate the output value of the DCT circuit 30. is
Adaptively quantize and encode DCT coefficients. The encoded DCT coefficients are provided to multiplexer 120. Side information encoder 130 encodes the quantization parameter value from quantizer 60 as side information and supplies it to multiplexer 120. The multiplexer 120 transmits the output code of the side information encoder 130 and the output code of the DCT coefficient adaptive quantization coding circuit 110 via the transmission side output terminal 140 every M sample times. Next, the operation on the receiving side will be explained. On the receiving side, the transmitted code received at the receiving side input terminal 150 is given to a demultiplexer 160, which converts the received code into
The code is separated into a code representing the DCT coefficient and a code representing side information. The codes representing the separated DCT coefficients are provided to a DCT coefficient adaptive decoder 200 and the codes representing side information are provided to a side information decoder 170. Side information decoder 170
decodes the input code, separates the pitch period P' and pitch gain P _G ' from the K parameter value series, and sends the K parameter value series to the parameter conversion circuit 1.
75, and the pitch period P' and pitch gain P _G ' are supplied to the spectrum regeneration circuit 180. Here, the parameter conversion circuit 175 performs the same operation as the parameter conversion circuit 80 on the transmitting side, converting the K parameter value series to the α parameter value series,
Spectrum regeneration circuit 18 for α parameter value series
Give to 0. The spectrum regeneration circuit 180 performs the same operation as the spectrum regeneration circuit 90 on the transmitting side, and supplies the regenerated spectrum σ(k) and the envelope spectrum σ _f (k) to the bit allocation step size calculation circuit 190. The bit allocation step size calculation circuit 190 is a bit allocation step size calculation circuit 100 on the transmitting side.
Perform the same operation as quantization step size Δ
(k) and the allocated bit number b(k) are supplied to the DCT coefficient adaptive decoder 200. The DCT coefficient adaptive decoder 200 decodes the value input from the demultiplexer 160 using the Δ(k) and b(k).
Get the DCT coefficients. The obtained DCT coefficients are the inverse DCT
(IDCT) circuit 210, M samples of
Perform inverse cosine transformation on the DCT coefficients and reproduce M samples of reproduced signal (n) (n=0, 1,...,M-1)
get. The buffer memory circuit 220 once stores M samples (n) and then outputs them through the receiving side output terminal 230. The VD-ATC method explained above is 16k bit/
Very high quality audio signals can be obtained at transmission speeds of around seconds. However, this VD
- Since the ATC method has to transmit both DCT coefficients and side information, it is difficult to significantly compress the amount of transmitted information, and the quality of the reproduced audio deteriorates at low-speed transmission (for example, 9.6 kbit/s or less). It has the disadvantage of deterioration. An example of audio quality deterioration is that pitch information is not reproduced sufficiently, resulting in reproduced audio sounding hoarse. This is due to the fact that the envelope spectrum, pitch spectrum, etc. cannot be reproduced satisfactorily in low-speed transmission. An object of the present invention is to provide an adaptive audio signal transmission method that can reduce the amount of transmitted information and reproduce high-quality audio signals even during low-speed transmission. The transmission method according to the present invention is an adaptive audio signal transmission method in which an orthogonal transform sequence obtained by orthogonally transforming a sample value sequence obtained by sampling an audio signal is adaptively quantized and transmitted. generating a first signal sequence representing a characteristic of the first signal sequence, based on the first signal sequence, generating a second signal sequence representing a characteristic of the first signal sequence; generating a third signal sequence representing a spectrum, calculating a difference sequence between the first signal sequence and the third signal sequence, and generating a fourth signal sequence by adaptively quantizing the difference sequence; It is characterized in that the second signal series and the fourth signal series are transmitted in combination. Next, the present invention will be explained in detail with reference to the drawings. FIG. 4 is a block diagram showing a first embodiment of the present invention. In this embodiment, DCT is used as the orthogonal transformation. On the transmitting side in FIG. 
4, a buffer memory 15 and a window function circuit 20
performs the same operations as the components with the same numbers in FIG. The DCT circuit 300 performs discrete cosine transformation (DCT) on the M output values of the window function circuit 20.
, and the DCT coefficient V _c (k) (k=0, 1,...,M-
1) Output. The DCT calculation method is as described above.
As mentioned in the explanation of the VD-ATC method, it can be calculated using a DFT or by the method disclosed in Document 3. The absolute value circuit 310 supplies the sign information of the input V_c(k) (k = 0, 1, ..., M-1) to the adaptive quantization encoder 390. It also calculates the absolute value |V_c(k)| of V_c(k) and supplies the result to the logarithm calculation circuit 320 and the subtracter 380. The logarithm calculation circuit 320 calculates the logarithm log|V_c(k)| (k = 0, 1, ..., M-1) of the absolute values of the DCT coefficients. The 2M-point inverse discrete Fourier transform (Inverse Discrete Fourier Transform; hereinafter referred to as IDFT) circuit 330 calculates the IDFT of the output of the logarithm calculation circuit 320 to obtain the pseudo-cepstrum C(n) (n = 0, 1, ..., 2M-1). It is called "pseudo" here because this cepstrum is obtained from the DCT coefficients. Details of the cepstrum are given in the paper by D. G. Childers et al. entitled "The Cepstrum: A Guide to Processing", Proceedings of the IEEE, Vol. 65, October 1977, pp. 1428-1443, so the explanation is omitted here. The pseudo-cepstrum C(n) is input to the lifter circuit 340.

The lifter circuit 340 multiplies the pseudo-cepstrum C(n) by a predetermined window function to separate and extract a low-time part and a high-time part. It supplies the low-time part to the quantizer 350 and the high-time part to the pitch detection circuit 345. It is generally known that the low-time part of a cepstrum carries the spectral envelope information of the input audio signal and that the high-time part carries the pitch information (pitch period and pitch gain); the same is considered to hold for the pseudo-cepstrum. The pitch detection circuit 345 detects the pitch period P and the pitch gain P_G and supplies them to the quantizer 350. The quantizer 350 quantizes the low-time part of the pseudo-cepstrum output from the lifter circuit 340, and the pitch period P and pitch gain P_G output from the pitch detection circuit 345, with predetermined numbers of quantization bits. Hereinafter, the quantized values of the low-time part of the pseudo-cepstrum, the pitch period P, and the pitch gain P_G are referred to as side information. The side information output from the quantizer 350 is input to the decoder 355 and the side information encoder 410. The decoder 355 decodes the side information and supplies it to the DFT calculation circuit 360.

The discrete Fourier transform (Discrete Fourier Transform; hereinafter referred to as DFT) circuit 360 performs a 2M-point DFT on the decoded side information to obtain a reproduced spectrum in the logarithmic domain. Denoting the first M points of this reproduced spectrum by σ_L'(k) (k = 0, 1, ..., M-1), σ_L'(k) is composed of a pitch spectrum σ_P'(k) and an envelope spectrum σ_f'(k) (k = 0, 1, ..., M-1). Examples of the envelope spectrum σ_f'(k), the pitch spectrum σ_P'(k), and the reproduced spectrum σ_L'(k) are shown in FIGS. 5a, 5b, and 5c.
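The path from |V_c(k)| to the pseudo-cepstrum and back to a smoothed logarithmic spectrum can be illustrated with a small sketch. It is not the circuit of FIG. 4, only an illustration under stated assumptions: the M log-magnitudes are extended to 2M points by even symmetry (the text does not say how the 2M-point input is formed), the lifter is a rectangular window with a hypothetical cutoff `n_cut`, and the function names `pseudo_cepstrum`, `lifter`, and `log_spectrum` are invented for this example. Quantization and the pitch parameters P and P_G are left out.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; with inverse=True, the 1/N-scaled inverse DFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def pseudo_cepstrum(abs_coeffs, floor=1e-12):
    """2M-point IDFT of log|V_c(k)| after an even extension (assumption)."""
    log_mag = [math.log(max(a, floor)) for a in abs_coeffs]
    # Even extension ext[n] == ext[2M - n], so the IDFT is real-valued.
    extended = log_mag + [log_mag[-1]] + log_mag[:0:-1]
    return [c.real for c in dft(extended, inverse=True)]


def lifter(cep, n_cut):
    """Split an even cepstrum into low-time and high-time parts (rectangular window)."""
    n = len(cep)
    low = [c if (i < n_cut or i > n - n_cut) else 0.0 for i, c in enumerate(cep)]
    high = [c - l for c, l in zip(cep, low)]
    return low, high


def log_spectrum(cep_part):
    """2M-point DFT of a cepstrum part; only the first M points are returned."""
    m = len(cep_part) // 2
    return [v.real for v in dft(cep_part)[:m]]


# Hypothetical magnitudes |V_c(k)| for one block with M = 8.
abs_vc = [8.0, 4.0, 6.0, 2.0, 3.0, 1.0, 1.5, 0.5]
cep = pseudo_cepstrum(abs_vc)
low, high = lifter(cep, n_cut=3)
envelope = log_spectrum(low)      # smoothed log spectrum from the low-time part
fine_part = log_spectrum(high)    # remaining fine structure from the high-time part
```

Because the DFT is linear, the two parts add up to the original log spectrum; in the embodiment the high-time part is not transmitted as such but summarized by the pitch period and pitch gain.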
In FIG. 5, the horizontal axis represents the value of k (0 to M-1) and the vertical axis represents logarithmic amplitude. In the figure, the spectrum values are drawn as continuous curves for visual clarity. The reproduced spectrum σ_L'(k) in the logarithmic domain is supplied to the exponent calculation circuit 370 and the bit allocation/quantization step size calculation circuit 400. The exponent calculation circuit 370 calculates the exponential of its input value; that is, it performs the inverse of the processing in the logarithm calculation circuit 320. The output value σ'(k) (k = 0, 1, ..., M-1) of the exponent calculation circuit 370 is supplied to the subtracter 380, which calculates the difference e(k) (k = 0, 1, ..., M-1) between the output |V_c(k)| of the absolute value circuit 310 and the output σ'(k) of the exponent calculation circuit 370 and supplies the result to the adaptive quantization encoder 390. The adaptive quantization encoder 390 adaptively quantizes e(k) using the quantization step size and the number of allocated bits output from the bit allocation/quantization step size calculation circuit 400.
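The subtraction and quantization step can be sketched as follows. This is a minimal illustration, not the encoder 390 itself: it assumes a uniform mid-rise quantizer with the same number of bits for every coefficient, and a single step size `step` standing in for the value that equation (7) would supply. The names `quantize_difference` and `reconstruct_magnitude` are hypothetical.

```python
import math


def quantize_difference(abs_coeffs, sigma, step, bits=2):
    """Quantize e(k) = |V_c(k)| - sigma'(k) with a uniform mid-rise quantizer."""
    levels = 1 << bits
    lo, hi = -levels // 2, levels // 2 - 1     # e.g. -2..1 for 2 bits
    codes = []
    for a, s in zip(abs_coeffs, sigma):
        idx = math.floor((a - s) / step)
        codes.append(min(max(idx, lo), hi))    # clip to the available levels
    return codes


def reconstruct_magnitude(codes, sigma, step):
    """Receiver side: add the decoded difference back to sigma'(k)."""
    return [max(s + (c + 0.5) * step, 0.0) for c, s in zip(codes, sigma)]


# Hypothetical values for four coefficients.
abs_vc = [8.0, 4.0, 6.0, 2.0]        # |V_c(k)|
sigma = [7.0, 4.5, 5.0, 2.5]         # sigma'(k) reproduced from side information
codes = quantize_difference(abs_vc, sigma, step=1.0)
decoded = reconstruct_magnitude(codes, sigma, step=1.0)
```

As long as e(k) stays inside the quantizer range, the reconstruction error is at most half a step, which is why a small residual e(k) can be coded with few bits.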
The adaptive quantization encoder 390 then performs encoding together with the sign information of V_c(k) supplied from the absolute value circuit 310. The code sequence from the adaptive quantization encoder 390 is supplied to the multiplexer 420. The bit allocation/quantization step size calculation circuit 400 uses the reproduced spectrum σ_L'(k) supplied from the DFT calculation circuit 360 to calculate the number of quantization bits and the quantization step size for the difference e(k). Equation (7) above can be applied to calculate the quantization step size. The simplest method of allocating quantization bits is to allocate the same number of bits (for example, 2 bits) to every difference. With such an allocation, the quantization noise power spectrum has characteristics similar to the signal power spectrum; that is, the characteristics are similar to those obtained in FIG. 3 when γ = -1. In the conventional VD-ATC method, quantizing with an equal number of bits requires a considerable number of bits (4 or 5) for each DCT coefficient V_c(k), which has the drawback of greatly increasing the amount of transmitted information. In the present invention, the difference e(k) is quantized instead of the DCT coefficients, so there is almost no deterioration in sound quality even with a small number of bits, and the amount of transmitted information does not increase significantly. Of course, adaptive bit allocation or noise shaping can also be applied, in which case equations (8) to (10) above may be used. However, to apply noise shaping effectively in this embodiment, the value of γ must differ from the value mentioned in the description of the conventional VD-ATC method. In this embodiment, the quantization noise power spectrum is flat when γ = 1 and is similar to the signal power spectrum when γ = 0. Therefore a value in the range 0 < γ < 1 must be adopted. The multiplexer 420 receives the output code sequences of the adaptive quantization encoder 390 and the side information encoder 410, multiplexes them, and transmits them to the receiving side via the transmitting side output terminal 140.

Next, the operation on the receiving side will be explained. On the receiving side, the transmitted code sequence is input to the receiving side input terminal 150 and applied to the demultiplexer 430. The demultiplexer 430 separates the received code sequence into the code sequence representing the difference e(k) (k = 0, 1, ..., M-1) and the code sequence representing the side information, and supplies them to the adaptive decoder 470 and the side information decoder 440. The side information decoder 440 decodes the input code sequence to obtain the decoded side information and outputs it to the DFT calculation circuit 450. The DFT calculation circuit 450 performs the same operation as the DFT calculation circuit 360 on the transmitting side and reproduces the spectrum σ_L'(k) (k = 0, 1, ..., M-1) from the decoded side information. σ_L'(k) is output to the bit allocation/quantization step size calculation circuit 460 and the exponent calculation circuit 490. The circuit 460 performs the same operation as the bit allocation/quantization step size calculation circuit 400 on the transmitting side, and the exponent calculation circuit 490 performs the same operation as the exponent calculation circuit 370 on the transmitting side. The output value σ'(k) (k = 0, 1, ..., M-1) of the exponent calculation circuit 490 is output to the adder 480. The adaptive decoder 470 performs the inverse operation of the adaptive quantization encoder 390 on the transmitting side. That is, it separates the sign information of the DCT coefficients from the input code sequence and supplies it to the IDCT circuit 500. It also decodes the code representing the difference e(k) using the number of allocated bits and the quantization step size, and supplies the result to the adder 480. The adder 480 adds the difference e(k) and σ'(k) to obtain the absolute value |V̂_c(k)| (k = 0, 1, ..., M-1) of the reproduced DCT coefficients. |V̂_c(k)| is supplied to the IDCT (inverse discrete cosine transform) circuit 500. The IDCT circuit 500 forms the DCT coefficients V̂_c(k) (k = 0, 1, ..., M-1) from the sign information and the absolute values, applies an M-point inverse DCT to them, and obtains the reproduced signal x̂(n) (n = 0, 1, ..., M-1). The output x̂(n) of the IDCT circuit 500 is supplied to the buffer memory circuit 220, and after the signal for M samples has been accumulated, it is output as the reproduced audio signal via the receiving side output terminal 230.

This configuration provides the following effects compared to the conventional VD-ATC method. (1) In this embodiment, the DCT coefficients are not quantized directly; instead the difference e(k) is quantized and transmitted. The amount of information required to quantize and transmit the difference e(k) can therefore be reduced to nearly 1/2 of that required to transmit the DCT coefficients in the conventional method, and the same sound quality as the conventional VD-ATC method can be obtained with less transmitted information. (2) In the conventional VD-ATC method, the side information contributes only to the calculation of the quantization step size and the number of allocated bits. When the transmission rate is set to 9.6 kbit/s or less, more of the spectrum is lost, and the spectral structure required for audio signal reproduction, that is, the spectral envelope structure and the pitch structure, cannot be reproduced faithfully on the receiving side, so the sound quality deteriorates. With the configuration of this embodiment, however, the spectral structure required for audio signal reproduction can be reproduced using the transmitted side information.
Even if the transmission rate is reduced to 9.6 kbit/s or less, high-quality reproduced audio can therefore be obtained. (3) Even if a simple method is used to calculate the number of allocated bits in order to reduce the amount of calculation (for example, allocating 2 bits to every difference), there is almost no deterioration in sound quality.

In addition, when the DCT coefficients are calculated using a DFT as shown in equations (5.a), (5.b), and (6), the cepstrum may be calculated from the DFT coefficients. The cepstrum obtained in this way can approximate the spectral structure of the input signal better than the pseudo-cepstrum calculated from the DCT coefficients. It is also possible to use an M-point DCT calculation circuit instead of the 2M-point DFT calculation circuits 360 and 450. Using an M-point DCT calculation circuit reduces the amount of calculation to approximately 1/2 and also reduces the amount of transmitted difference information. In FIG. 4 the cepstrum is quantized directly, but because the cepstrum is relatively sensitive to quantization, the low-time cepstrum, which represents the spectral envelope, may instead be converted into parameters that are less sensitive to quantization (for example, K parameters) and then quantized. In this case, the number of quantization bits can be reduced, and so can the amount of side information to be transmitted. It is also possible to use means other than the cepstrum to express the spectral structure with a small number of bits (for example, the K parameters obtained from pseudo-autocorrelation coefficients, as in the conventional VD-ATC method).

As an example, FIGS. 9a to 9i show the signals at various points when an arbitrary block of an input audio signal sequence is processed by the method of the present invention. In the figures, discrete signal sequences are drawn as continuous signals. FIG. 9a shows the output signal waveform of the window function circuit 20 in the embodiment of FIG. 4. This is the waveform of the M samples within an arbitrary block of the input audio signal sequence. The horizontal axis in FIG. 9a is time, and the vertical axis is amplitude.
FIG. 9b shows the output signal of the DCT circuit 300 of FIG. 4, that is, the discrete cosine transform of the waveform of FIG. 9a. The horizontal axis in FIG. 9b is frequency, and the vertical axis is amplitude. FIG. 9c shows the output signal of the logarithm calculation circuit 320 of FIG. 4. The horizontal axis is frequency and the vertical axis is amplitude.
FIG. 9d shows the output signal of the IDFT calculation circuit 330 of FIG. 4. This signal is called the cepstrum. The horizontal axis is time and the vertical axis is amplitude. FIG. 9e shows the output signal of the DFT calculation circuit 360 of FIG. 4. The horizontal axis is frequency and the vertical axis is amplitude.
FIG. 9f shows the output waveform of the exponent calculation circuit 370 of FIG. 4. The horizontal axis is frequency and the vertical axis is amplitude. FIG. 9g shows the number of allocated bits output by the bit allocation/quantization step size calculation circuit 400 of FIG. 4. The horizontal axis is frequency, and the vertical axis is the number of allocated bits. FIG. 9h shows the input signal of the IDCT circuit 500 of FIG. 4. The horizontal axis is frequency and the vertical axis is amplitude. FIG. 9i shows the output signal waveform of the IDCT circuit 500 of FIG. 4, that is, the reproduced audio signal. The horizontal axis is time and the vertical axis is amplitude.

FIG. 6 is a block diagram showing a second embodiment of the present invention. In this embodiment, the subtraction on the transmitting side and the addition on the receiving side are performed in the logarithmic domain rather than in the linear domain. In FIG. 6, components labeled with the same numbers as in FIG. 4 operate in the same way as in FIG. 4. With this configuration, no exponent calculation circuit is required on the transmitting side, and the difference spectrum can be quantized with a smaller number of quantization bits, so the amount of transmitted information can be reduced further.

FIG. 7 is a block diagram showing a third embodiment of the present invention. In FIG. 7, components labeled with the same numbers as in FIG. 4 perform the same operations as in FIG. 4. First, the transmitting side will be explained. The DFT calculation circuit 600 performs an M-point DFT on the M output sample values of the window function circuit 20. Denoting the DFT coefficients by X(k) (k = 0, 1, ..., M-1) and their real and imaginary parts by X_R(k) and X_I(k) (k = 0, 1, ..., M-1), X_R(k) and X_I(k) are supplied to the absolute value/phase calculation circuit 610. An example of the calculation performed by the circuit 610 is shown below.

$$|X(k)| = \sqrt{(X_R(k))^2 + (X_I(k))^2} \tag{11.a}$$

$$\arg(X(k)) = \tan^{-1}\left(X_I(k)/X_R(k)\right) \tag{11.b}$$

(k = 0, 1, ..., M-1)

In the above equations, |X| denotes the absolute value of X and arg(X) denotes the phase of X. Here, the input to the DFT is an audio signal and is therefore real-valued. It is a known property of the DFT that, for a real-valued input signal, the real part of the DFT coefficients is an even function and the imaginary part is an odd function. Consequently the absolute value of the DFT coefficients is an even function and the phase is an odd function, so all the information of the input signal is contained in the M/2-point absolute value and phase sequences of the DFT coefficients. Hereinafter, the M/2-point absolute value and phase sequences are written |X(k)| and arg(X(k)) with k = 0, 1, ..., M/2-1. The M-point |X(k)| is supplied to the subtracter 380 and the logarithm calculation circuit 320, and arg(X(k)) is supplied to the phase quantization encoder 620. The logarithm calculation circuit 320 performs the same operation as the logarithm calculation circuit 320 in FIG. 4. The IDFT calculation circuit 335 performs an M-point inverse DFT on the output of the logarithm calculation circuit 320. With the configuration of this embodiment, the cepstrum C_P(n) (n = 0, 1, ..., M-1) is obtained as the output of the IDFT calculation circuit 335. As is well known, the cepstrum C_P(n) is an even function. The lifter circuit 340, the pitch detection circuit 345, the quantizer 350, the decoder 355, and the side information encoder 410 operate in the same way as the components with the same numbers in FIG. 4, so their explanation is omitted. The DFT calculation circuit 365 performs an M-point DFT. With the configuration of this embodiment, because the DFT is performed on the cepstrum, a high-quality reproduced spectrum |X_L'(k)| (k = 0, 1, ..., M-1) in the logarithmic domain, representing the spectral structure, can be obtained. Since |X_L'(k)| is of course an even function, only an M/2-point sequence of the M-point output of the DFT calculation circuit 365 (hereinafter written |X_L'(k)|) is supplied to the exponent calculation circuit 370 and the bit allocation/quantization step size calculation circuit 400. The components 370 and 400 perform the same operations as the components with the same numbers in FIG. 4, except that in this embodiment they operate on M/2-point sequences. The subtracter 380 calculates the difference between the M/2-point sequence |X(k)| taken from the M-point absolute value output |X(k)| (k = 0, 1, ..., M-1) of the absolute value/phase calculation circuit 610 and the M/2-point output sequence |X'(k)| of the exponent calculation circuit 370, and outputs the difference to the adaptive quantization encoder 390. The adaptive quantization encoder 390 performs the same operation as the component with the same number in FIG. 4. The phase quantization encoder 620 quantizes arg(X(k)). A simple example of a quantization method for the phase quantization encoder 620 is to assign an equal number of bits (for example, about 2 bits) to each phase component. The multiplexer 630 receives the output code sequences of the phase quantization encoder 620, the adaptive quantization encoder 390, and the side information encoder 410, multiplexes them, and transmits them to the receiving side via the transmitting side output terminal 140.

On the receiving side, the code sequence received at the input terminal 150 is applied to the demultiplexer 640, which separates it into the code sequence representing the phase, the code sequence representing the difference, and the code sequence representing the side information. The code sequence representing the phase is supplied to the phase decoder 650, the code sequence representing the difference to the adaptive decoder 470, and the code sequence representing the side information to the side information decoder 440. The phase decoder 650 decodes the M/2-point phase sequence from the input code sequence and outputs it to the conversion circuit 660. The side information decoder 440 performs the same operation as the component with the same number in FIG. 4, and the decoded cepstrum is supplied to the DFT calculation circuit 455. The DFT calculation circuit 455, the exponent calculation circuit 490, and the receiving-side bit allocation/quantization step size calculation circuit perform the same operations as the transmitting-side DFT calculation circuit 365, exponent calculation circuit 370, and bit allocation/quantization step size calculation circuit 400, respectively. The adaptive decoder 470 decodes the difference information and outputs it to the adder 480. The adder 480 adds the output of the adaptive decoder 470 and the M/2-point output sequence |X'(k)| (k = 0, 1, ..., M/2-1) of the exponent calculation circuit 490, and outputs the resulting M/2-point absolute value sequence (k = 0, 1, ..., M/2-1) to the conversion circuit 660. The conversion circuit 660 uses the fact that the phase sequence is an odd function and the absolute value sequence is an even function to obtain the M-point phase and absolute value sequences from the M/2-point sequences, converts these into the M-point real part sequence X̂_R(k) and imaginary part sequence X̂_I(k), and outputs them to the IDFT calculation circuit 670. The IDFT calculation circuit 670 performs an inverse DFT on the M-point X̂_R(k) and X̂_I(k) to obtain the reproduced signal x̂(n) (n = 0, 1, ..., M-1). The buffer memory circuit 220 and the receiving side output terminal 230 perform the same operations as the components with the same numbers in FIG. 4.

With the configuration of this embodiment, effects similar to (1) to (3) described for the first embodiment are obtained. As in the first embodiment, the cepstrum may be converted into other parameters (for example, K parameters), quantized, and transmitted, or means other than the cepstrum may be used. Furthermore, as in the second embodiment, the subtraction and addition may be performed in the logarithmic domain. The effects of doing so are the same as in the first and second embodiments. It is generally known that the human ear is not very sensitive to the phase component of a signal, so in this embodiment there is almost no deterioration in sound quality even if the phase component is quantized fairly coarsely. Coarsening the quantization of the phase component reduces the amount of transmitted information. As a phase quantization method other than the one explained in this embodiment, the phase component may be approximated by a linear function within a block, and the slope of the linear function and the difference between each phase component and the corresponding value of the linear function may be quantized and transmitted. Furthermore, it is also possible to apply a minimum phase condition to the input speech sequence. That is, processing is performed on the assumption that the input speech sequence satisfies the minimum phase condition. Because the human ear is not very sensitive to phase, as noted above, applying the minimum phase condition causes little audible degradation. In this case, the phase component can be reproduced on the receiving side, so there is no need to transmit the phase component.
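The symmetry argument used by the conversion circuit 660 can be checked with a short sketch. It is an illustration only, under these assumptions: the block length M is even, the half sequence keeps bins 0 through M/2 (one more than the M/2 points named in the text, so that the Nyquist bin is preserved), no quantization is applied, and the phase is taken with `cmath.phase`, which resolves the quadrant that a bare arctangent as in equation (11.b) leaves open. The function names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; with inverse=True, the 1/N-scaled inverse DFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def half_spectrum(block):
    """Magnitude and phase of bins 0..M/2 of a real block of even length M."""
    m = len(block)
    spec = dft(block)[: m // 2 + 1]
    return [abs(c) for c in spec], [cmath.phase(c) for c in spec]


def rebuild_block(mags, phases):
    """Restore all M bins from even magnitude / odd phase, then invert the DFT."""
    m = 2 * (len(mags) - 1)
    half = [cmath.rect(a, p) for a, p in zip(mags, phases)]
    # X(M - k) is the complex conjugate of X(k) for a real input signal.
    full = half + [half[m - k].conjugate() for k in range(m // 2 + 1, m)]
    return [v.real for v in dft(full, inverse=True)]


# Hypothetical windowed block with M = 8.
block = [0.5, 1.0, -0.25, 0.75, -1.0, 0.1, 0.3, -0.6]
mags, phases = half_spectrum(block)
restored = rebuild_block(mags, phases)
```

Without quantization the restored block matches the input to rounding error, which confirms that the half sequences carry the full information of the real-valued block.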
Therefore, the amount of information to be transmitted regarding the difference is halved, making it possible to significantly reduce the overall amount of transmitted information. FIG. 8 is a block diagram showing a fourth embodiment of the present invention, to which the minimum phase condition is applied. In FIG. 8, components labeled with the same numbers as in FIG. 7 perform the same operations as in FIG. 7, so their explanation is omitted here. The operation of the lifter circuit 700 in FIG. 8 in the high-time region is the same as that of the lifter circuit 340 in FIG. 7. In the low-time region, however, the cepstral window applied to the cepstrum is as follows.

[Table]

In the above expression, (n) denotes the cepstral window, and n₀ is an arbitrary positive value shorter than the pitch period t. For the DFT calculations in the four embodiments described above, the fast Fourier transform (Fast Fourier Transform; hereinafter referred to as FFT) is known as an algorithm for high-speed computation, so using the FFT greatly reduces the computation time. In addition, when the spectral structure of the input signal is reproduced from the cepstrum, the embodiments described above use both the pitch spectrum and the envelope spectrum, but the reproduction may instead use only the envelope spectrum, omitting the pitch spectrum. In that case the information on the pitch is transmitted as part of the difference sequence, so the pitch detection circuit becomes unnecessary and errors in detecting pitch information in the pitch detection circuit are avoided.

As described above, according to the present invention, the difference between the orthogonal transform coefficients and the spectrum reproduced from the side information is transmitted. Compared with the conventional method, which transmits the orthogonal transform coefficients, the transmitted information can therefore be reduced to nearly 1/2, and a high-quality audio signal can be obtained even at low transmission rates of 9.6 kbit/s or less, where the sound quality of the conventional method deteriorated.

[Brief explanation of the drawing]

FIG. 1 is a block diagram of the VD-ATC (Vocoder Driven Adaptive Transform Coding) method, an example of a conventional adaptive transform coding method. FIG. 2 is a diagram showing an example of a window function. FIG. 3 is a diagram showing an example of the signal spectrum when noise shaping is applied. FIG. 4 is a block diagram showing a first embodiment of the present invention. FIG. 5 is a diagram showing an example of signal spectra representing the spectral structure of an audio signal. FIG. 6 is a block diagram showing a second embodiment of the present invention. FIG. 7 is a block diagram showing a third embodiment of the present invention. FIG. 8 is a block diagram showing a fourth embodiment of the present invention. FIGS. 9(a) to 9(i) are waveform diagrams illustrating the operation of the device of FIG. 4.

In the figures: 10 ... transmitting side input terminal; 15 ... buffer memory circuit; 20 ... window function circuit; 30, 300 ... discrete cosine transform (DCT) circuit; 40 ... autocorrelation function calculation circuit; 50 ... pitch detection circuit; 60 ... quantizer; 65 ... inverse quantizer; 70 ... prediction parameter calculation circuit; 80 ... parameter conversion circuit; 90 ... spectrum reproduction circuit; 100 ... bit allocation step size calculation circuit; 110 ... DCT coefficient adaptive quantization encoding circuit; 120, 420, 630 ... multiplexer; 130, 410 ... side information encoder; 140 ... transmitting side output terminal; 150 ... receiving side input terminal; 160, 430, 640 ... demultiplexer; 170, 440 ... side information decoder; 175 ... parameter conversion circuit; 180 ... spectrum reproduction circuit; 190 ... bit allocation step size calculation circuit; 200 ... DCT coefficient adaptive decoder; 210, 500 ... inverse discrete cosine transform (IDCT) circuit; 220 ... buffer memory circuit; 230 ... receiving side output terminal; 310 ... absolute value circuit; 320 ... logarithm calculation circuit; 330, 335, 670 ... inverse discrete Fourier transform (IDFT) calculation circuit; 340, 700 ... lifter circuit; 345 ... pitch detection circuit; 350 ... quantizer; 355 ... decoder; 360, 365, 450, 455, 600 ... discrete Fourier transform (DFT) calculation circuit; 370, 490 ... exponent calculation circuit; 380 ... subtracter; 390 ... adaptive quantization encoder; 400, 460 ... bit allocation/quantization step size calculation circuit; 470 ... adaptive decoder; 480 ... adder; 610 ... absolute value/phase calculation circuit; 620 ... phase quantization encoder; 650 ... phase decoder; 660 ... conversion circuit.

Claims

[Claims]

1. An adaptive audio signal transmission method in which an orthogonal transform sequence, obtained by orthogonally transforming a sample value sequence obtained by sampling an audio signal, is adaptively quantized and transmitted, the method comprising: generating a first signal sequence; generating, based on the first signal sequence, a second signal sequence representing a characteristic of the first signal sequence; generating, based on the second signal sequence, a third signal sequence representing a spectrum; calculating a difference sequence between the first signal sequence and the third signal sequence; generating a fourth signal sequence by adaptively quantizing the difference sequence; and transmitting the second signal sequence and the fourth signal sequence in combination.