JP6652469B2

JP6652469B2 - Decoding device, decoding method, and program

Info

Publication number: JP6652469B2
Application number: JP2016174266A
Authority: JP
Inventors: 亮介杉浦; 健弘守谷; 優鎌本; 康一古角; 隆仁川西; 賢一野口
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2016-09-07
Filing date: 2016-09-07
Publication date: 2020-02-26
Anticipated expiration: 2036-09-07
Also published as: JP2018040917A

Description

The present invention relates to a technique for decoding a time-series signal such as a sound signal.

Conventionally, the following countermeasure has been used when packet loss or the like on the transmission path between the encoding and decoding of a sound signal causes information that should be input to the decoding device to go missing, so that a correct decoded sound signal cannot be obtained. At the encoding stage, classification information of the sound signal, such as signal energy, phase, and pitch period, is added as auxiliary information. Based on the auxiliary information that was input to the decoding device before the loss occurred, the decoding device obtains a signal of the same type as the classification indicated by that auxiliary information, and thereby generates the decoded sound signal of the missing portion by interpolation (see, for example, Non-Patent Document 1).

J. Lecomte, T. Vaillancourt, S. Bruhn, H. Sung, K. Peng, K. Kikuiri, B. Wang, S. Subasingha, and J. Faure, “Packet-loss concealment technology advances in EVS,” in Proc. ICASSP 2015, pp. 5708-5712, 2015.

However, although the above technique reduces auditory discomfort when information is lost, it requires additional information beyond what is needed to decode a normal signal. In addition, because the above technique assumes that the sound signal contained in one packet is about 20 msec long, it requires additional algorithmic delay in order to use information that follows the missing signal.

An object of the present invention is to provide a decoding device, a decoding method, and a program that can interpolate missing information with better perceptual quality than the conventional technique, without requiring transmission of additional information and without increasing the algorithmic delay of decoding.

A decoding device according to one aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit selects, from among a signal in which the samples of the decoded sound signal of the preceding frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, a candidate signal having high temporal continuity with the preceding frame. It then uses, as the decoded sound signal of the frame, a signal generated based on the selected candidate, called the extended decoded sound signal, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding device according to another aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit uses, as the decoded sound signal of the frame, a signal generated based on an extended decoded sound signal and a signal obtained by linear prediction synthesis from the preceding frame. Here, the extended decoded sound signal is either a signal in which the samples of the decoded sound signal of the preceding frame are arranged in reverse time order or a signal obtained by inverting the polarity of that signal.

A decoding device according to another aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit selects a candidate signal having high similarity to a signal obtained by linear prediction synthesis from the preceding frame. The candidates are a signal in which the samples of the decoded sound signal of the preceding frame are arranged in reverse time order, a signal obtained by inverting the polarity of that signal, the decoded sound signal of the preceding frame, and a signal obtained by inverting the polarity of the decoded sound signal of the preceding frame. The unit uses, as the decoded sound signal of the frame, either the selected candidate, called the extended decoded sound signal, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding device according to another aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit selects, from a signal in which the samples of the decoded sound signal of the preceding frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, the signal having high similarity to a signal obtained by linear prediction synthesis from the preceding frame. The unit uses, as the decoded sound signal of the frame, either the selected signal, called the extended decoded sound signal, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding device according to another aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit selects, from among a plurality of candidate signals having the same power spectrum as the decoded sound signal of the preceding frame, a candidate having high temporal continuity with the preceding frame. The unit uses, as the decoded sound signal of the frame, a signal generated based on the selected candidate, called the extended decoded sound signal, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding device according to another aspect of the present invention is a decoding device that obtains a decoded sound signal for each frame, and includes an interpolation signal generation unit. For a frame whose sound signal code is missing, the interpolation signal generation unit selects, from among a plurality of candidate signals having the same power spectrum as the decoded sound signal of the preceding frame, a candidate having high similarity to a signal obtained by linear prediction synthesis from the preceding frame. The unit uses, as the decoded sound signal of the frame, either the selected candidate, called the extended decoded sound signal, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

Missing information can be interpolated with better perceptual quality than in the conventional technique, without requiring transmission of additional information and without increasing the algorithmic delay of decoding.

FIG. 1 is a block diagram showing an example of the encoding device assumed by the decoding device.
FIG. 2 is a flowchart showing an example of the encoding method assumed by the decoding device.
FIG. 3 is a diagram showing an example of a packet.
FIG. 4 is a block diagram showing an example of the decoding device of the first embodiment.
FIG. 5 is a flowchart showing an example of the decoding method of the first to third embodiments.
FIG. 6 is a block diagram showing an example of the decoding device of the second embodiment.
FIG. 7 is a block diagram showing an example of the decoding device of the third embodiment.
FIG. 8 is a diagram for explaining the technical background.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

[Encoding device assumed in the first to third embodiments]
FIG. 1 shows an example of the encoding device assumed in the first to third embodiments described later. As shown in FIG. 1, the assumed encoding device includes, for example, a linear prediction analysis unit 11, a signal encoding unit 12, and a packetizing unit 13.

The encoding method assumed in the first to third embodiments is realized, for example, by the units of the encoding device performing the processing of steps E1 to E3 shown in FIG. 2 and described below.

Each unit of the encoding device shown in FIG. 1 is described below.

<Linear prediction analysis unit 11>
A time-domain sound signal is input to the linear prediction analysis unit 11. The sound signal is, for example, a speech signal or an acoustic signal.

The linear prediction analysis unit 11 generates linear prediction coefficients α_1, α_2, …, α_p from the time-domain sound signal, which is input in units of frames of a predetermined time length corresponding to the time length of the sound signal contained in one packet (step E1). The linear prediction analysis unit 11 also encodes the generated linear prediction coefficients α_1, α_2, …, α_p to obtain a linear prediction coefficient code. An example of the linear prediction coefficient code is an LSP code, that is, a code corresponding to the sequence of quantized values of the LSP (Line Spectrum Pairs) parameter sequence corresponding to the linear prediction coefficients α_1, α_2, …, α_p. Here, p is an integer of 2 or more.

The linear prediction analysis unit 11 outputs the obtained linear prediction coefficient code to the packetizing unit 13.

The linear prediction analysis unit 11 also obtains quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p, which are the linear prediction coefficients corresponding to the obtained linear prediction coefficient code. Note that, with "·" standing for an arbitrary character, the notation "^·" means that "^" is placed above "·".

The linear prediction analysis unit 11 outputs the obtained quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p to the signal encoding unit 12.

As the linear prediction analysis processing, the linear prediction analysis unit 11 uses, for example, a method that computes the autocorrelation of the sound signal input in units of frames and applies the Levinson-Durbin algorithm to the autocorrelation to obtain the linear prediction coefficients. The linear prediction coefficient code is obtained, for example, by a conventional encoding technique. Examples of conventional encoding techniques include a technique that uses a code corresponding to the linear prediction coefficients themselves as the linear prediction coefficient code, a technique that converts the linear prediction coefficients into LSP parameters and uses a code corresponding to the LSP parameters as the linear prediction coefficient code, and a technique that converts the linear prediction coefficients into PARCOR coefficients and uses a code corresponding to the PARCOR coefficients as the linear prediction coefficient code.
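As a rough illustration of the autocorrelation method mentioned above, the following Python sketch computes the autocorrelation of one frame and runs the Levinson-Durbin recursion. The function names and the sign convention (x(n) is predicted as the sum of a[i] * x(n - i)) are assumptions made for illustration; the source does not fix them, and a practical analyzer would typically also window the frame first.

```python
def autocorrelation(frame, order):
    """Return r[0..order], the autocorrelation of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - k] for t in range(k, n))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Assumed convention (not stated in the source): x(n) is predicted as
    sum_{i=1..order} a[i] * x(n - i).
    Returns (a, err): a[0] is unused, err is the final prediction error.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

For example, `levinson_durbin(autocorrelation(frame, p), p)` would give one set of coefficients per frame, which an encoder could then quantize (for instance as LSP parameters).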

<Signal encoding unit 12>
The time-domain sound signal and the quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p output by the linear prediction analysis unit 11 are input to the signal encoding unit 12.

As in the encoding device of Non-Patent Document 1, for example, the signal encoding unit 12 performs linear prediction of the sound signal input in units of frames using the values of the quantized linear prediction coefficients ^α_1, ^α_2, …, ^α_p, obtains a prediction residual signal, which is the linear prediction residual, and encodes the obtained prediction residual signal to obtain a residual signal code (step E2). The signal encoding unit 12 outputs the obtained residual signal code to the packetizing unit 13.
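The residual computation in step E2 can be sketched as follows. This is only an illustration under an assumed sign convention (the prediction is the sum of coeffs[i] times the past samples, and the residual is the input minus the prediction); the function name and the `history` argument are hypothetical, and the actual residual coding of Non-Patent Document 1 is not shown.

```python
def prediction_residual(frame, coeffs, history):
    """Linear prediction residual of one frame.

    coeffs  : [a_1, ..., a_p], assumed convention x_hat(n) = sum a_i * x(n - i)
    history : samples preceding the frame, oldest first (zero-padded if short)
    """
    p = len(coeffs)
    past = ([0.0] * p + list(history))[-p:]  # last p samples before the frame
    buf = past + list(frame)
    residual = []
    for n in range(p, len(buf)):
        pred = sum(coeffs[i] * buf[n - 1 - i] for i in range(p))
        residual.append(buf[n] - pred)
    return residual
```

When the frame is well predicted by its past, the residual is close to zero, which is what makes it cheaper to encode than the signal itself.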

<Packetizing unit 13>
The linear prediction coefficient code output by the linear prediction analysis unit 11 and the residual signal code output by the signal encoding unit 12 are input to the packetizing unit 13. Because the input linear prediction coefficient code and residual signal code are codes representing the sound signal input to the encoding device, the combination of the linear prediction coefficient code and the residual signal code is referred to as the sound signal code.

As shown in FIG. 3, for example, the packetizing unit 13 forms one packet containing a header indicating the data length, the frame number, and the like, the sound signal code, and an error detection code, such as a CRC code, for detecting whether an error has occurred anywhere in the packet, and outputs this packet to the decoding device (step E3). In this example, the sound signal code is a code that includes the linear prediction coefficient code and the residual signal code.

If the data length of the sound signal code is fixed, the header need not include information indicating the data length. However, when the signal encoding unit 12 performs variable-length encoding, for example, the data length of the sound signal code may differ from packet to packet, so the header must include information indicating the data length.
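The packet structure described above (header, sound signal code, error detection code) can be sketched in Python as follows. The field widths, the byte order, and the use of CRC-32 are assumptions chosen for illustration; the source only states that the header carries a data length and a frame number and that an error detection code such as a CRC code is appended.

```python
import struct
import zlib

# Hypothetical header layout: 16-bit frame number, 16-bit payload length.
HEADER = struct.Struct(">HH")


def build_packet(frame_number, sound_signal_code):
    """Concatenate header, sound signal code, and a CRC-32 over both."""
    header = HEADER.pack(frame_number, len(sound_signal_code))
    body = header + sound_signal_code
    return body + struct.pack(">I", zlib.crc32(body))


def parse_packet(packet):
    """Return (frame_number, sound_signal_code), or None if the packet is damaged."""
    if len(packet) < HEADER.size + 4:
        return None
    body = packet[:-4]
    (crc,) = struct.unpack(">I", packet[-4:])
    if zlib.crc32(body) != crc:
        return None  # error detected somewhere in the packet
    frame_number, length = HEADER.unpack(body[:HEADER.size])
    payload = body[HEADER.size:]
    if len(payload) != length:
        return None
    return frame_number, payload
```

With a variable-length sound signal code, the length field in the header is what lets the receiver separate the payload from the trailing error detection code.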

[Decoding device and method of the first embodiment]
When a packet, which is information that should be input to the decoding device, goes missing because of packet loss or the like, the decoding device reduces auditory discomfort by generating the decoded sound signal of the frame corresponding to that packet from information contained in packets that were not lost, that is, by interpolation. Hereinafter, such a packet may be called a "missing packet" and the corresponding frame a "missing frame". To interpolate the missing frame without increasing the algorithmic delay of decoding, the decoding device must interpolate using information from packets that are earlier in time than the missing packet. The decoding device of the first embodiment therefore uses, for interpolation of the missing frame, a signal predicted from the decoded linear prediction coefficients of the frame preceding the missing frame (hereinafter also called the "predicted signal"). That is, the decoding device of the first embodiment uses the signal predicted from the decoded linear prediction coefficients of the previous frame as the decoded sound signal of the missing frame, which ensures continuity between the decoded sound signal of the missing frame and that of the previous frame.

However, this predicted signal decays over time, and if interpolation is performed using the predicted signal alone, the decoded sound signal of the missing frame and the decoded sound signal of the next frame may become discontinuous. The decoding device of the first embodiment therefore uses, for interpolation of the missing frame, a signal obtained as a weighted sum of the predicted signal of the missing frame and the decoded sound signal of the previous frame. That is, it obtains this weighted sum as the decoded sound signal of the missing frame. In doing so, instead of the decoded sound signal of the previous frame itself, the decoding device may use a version of it that has been folded back by time reversal or has had its polarity inverted. The decoding device may also choose adaptively, according to the decoded sound signal of the previous frame, whether to apply time reversal and whether to apply polarity inversion, so that the interpolated signal is more continuous with the decoded sound signal of the previous frame.
In short, the decoding device of the first embodiment may use, as the decoded sound signal of the missing frame, a signal obtained as a weighted sum of the predicted signal of the missing frame and a signal having the same power spectrum as the decoded sound signal of the previous frame.
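The following Python sketch illustrates why time reversal and polarity inversion are natural candidates: each of the four variants of the previous frame has the same power spectrum as the original. It also shows a weighted sum of a predicted signal and an extension signal. The function names, the naive DFT, and the per-sample weights are assumptions for illustration; the weights actually used by the decoding device are not specified in this passage.

```python
import cmath


def power_spectrum(x):
    """Power spectrum via a naive DFT (fine for short illustrative frames)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]


def extension_candidates(prev):
    """Variants of the previous frame that keep its power spectrum."""
    rev = list(prev)[::-1]
    return {
        "as_is": list(prev),
        "inverted": [-v for v in prev],
        "reversed": rev,
        "reversed_inverted": [-v for v in rev],
    }


def weighted_sum(predicted, extended, weights):
    """Per-sample mix: weights[n] applies to the predicted signal (hypothetical)."""
    return [w * p + (1.0 - w) * e
            for p, e, w in zip(predicted, extended, weights)]
```

A weight sequence that starts near 1 and decreases over the frame would lean on the predicted signal at the frame boundary, where continuity matters most, and on the extension signal later, where the predicted signal has decayed.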

In addition, to eliminate the discontinuity between the predicted signal of the missing frame and the decoded sound signal of the next frame, the decoding device of the first embodiment may take, as the decoded sound signal of the next frame, a weighted sum of the sound signal obtained by decoding the next frame and the decoded sound signal of the missing frame, thereby obtaining in the next frame a decoded sound signal that is continuous with the missing frame.

FIG. 4 shows a configuration example of the decoding device of the first embodiment. As shown in FIG. 4, the decoding device of the first embodiment includes, for example, a depacketizing unit 21, a linear prediction coefficient decoding unit 22, a signal decoding unit 23, and an interpolation signal generation unit 24. The interpolation signal generation unit 24 includes, for example, a linear prediction coefficient storage unit 241, a decoded sound signal storage unit 242, a linear prediction unit 243, a signal extension selection unit 244, and an interpolation signal combining unit 245.

The decoding method is realized, for example, by the units of the decoding device, which obtains a decoded sound signal for each frame, performing the processing of steps D1 to D6 shown in FIG. 5 and described below.

Each unit of the decoding device in FIG. 4 is described below.

<Depacketizing unit 21>
The packets output by the encoding device are input to the depacketizing unit 21.

The depacketizing unit 21 detects whether an error has occurred in a packet based on the error detection code, such as a CRC code, in the packet. The depacketizing unit 21 also detects, based on the frame numbers in the packet headers, whether any number is missing from the sequence of packets, that is, whether the packet of some frame among the sound signal codes of the frames that should be present consecutively has been lost because of packet loss or the like. From these detection results, the depacketizing unit 21 determines that the sound signal code of the frame corresponding to a packet is not missing when the packet has not been lost and contains no error, and determines that the sound signal code of the frame is missing when the packet has been lost or contains an error. It then generates loss determination information, which indicates whether the sound signal code of the frame corresponding to the packet is missing (step D1).
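The decision rule of step D1 can be summarized in a few lines of Python. This is a simplified offline sketch: the function name and the `(frame_number, crc_ok)` input format are hypothetical, and a real decoder would make the decision frame by frame as packets arrive rather than over a completed range.

```python
def loss_determination(received, first_frame, last_frame):
    """Map each expected frame number to True if its sound signal code is missing.

    received : iterable of (frame_number, crc_ok) pairs for packets that arrived
    A frame counts as missing if its packet never arrived or failed the CRC check.
    """
    intact = {number for number, crc_ok in received if crc_ok}
    return {number: number not in intact
            for number in range(first_frame, last_frame + 1)}
```

For instance, if frames 0 and 3 arrive intact, frame 1 arrives with a CRC error, and frame 2 never arrives, frames 1 and 2 are both flagged as missing.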

The depacketizing unit 21 outputs the loss determination information to the linear prediction coefficient storage unit 241, the decoded sound signal storage unit 242, the linear prediction unit 243, the signal extension selection unit 244, and the interpolation signal combining unit 245.

For a frame whose sound signal code is not missing, the depacketizing unit 21 extracts the sound signal code from the packet based on the information indicating the data length in the header, and outputs at least the linear prediction coefficient code of the sound signal code to the linear prediction coefficient decoding unit 22 and at least the residual signal code of the sound signal code to the signal decoding unit 23.

<Linear prediction coefficient decoding unit 22>
For a frame whose sound signal code is not missing, the linear prediction coefficient code output by the depacketizing unit 21 is input to the linear prediction coefficient decoding unit 22.

For each frame whose sound signal code is not missing, the linear prediction coefficient decoding unit 22 decodes the input linear prediction coefficient code, for example by a conventional decoding technique, to obtain decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p, and outputs the obtained decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p to the signal decoding unit 23 and the linear prediction coefficient storage unit 241 (step D2).

Here, conventional decoding techniques include, for example, a technique that, when the linear prediction coefficient code corresponds to quantized linear prediction coefficients, decodes the code to obtain decoded linear prediction coefficients identical to the quantized linear prediction coefficients, and a technique that, when the linear prediction coefficient code corresponds to quantized LSP parameters, decodes the code to obtain decoded LSP parameters identical to the quantized LSP parameters. Linear prediction coefficients and LSP parameters can be converted into each other, and it is well known that conversion between decoded linear prediction coefficients and decoded LSP parameters may be performed as needed, depending on the input linear prediction coefficient code and the information required by later processing. Accordingly, "decoding by a conventional decoding technique" covers both the decoding of the linear prediction coefficient code and any such conversion performed as needed.

Note that the linear prediction coefficient decoding unit 22 does nothing for a frame whose sound signal code is missing.

<Signal decoding unit 23>
For a frame whose sound signal code is not missing, the residual signal code output by the depacketizing unit 21 and the decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p output by the linear prediction coefficient decoding unit 22 are input to the signal decoding unit 23.

For each frame whose sound signal code is not missing, the signal decoding unit 23 obtains the residual signal corresponding to the residual signal code, as in the decoding device of Non-Patent Document 1, for example. It then obtains the decoded sound signal ^x(0), ^x(1), …, ^x(N-1) by linear prediction synthesis using the decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p, the residual signal, and the decoded sound signal up to one sample earlier, and outputs the obtained decoded sound signal ^x(0), ^x(1), …, ^x(N-1) to the decoded sound signal storage unit 242 and the interpolation signal combining unit 245 (step D3). Here, N is a predetermined positive integer.
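Linear prediction synthesis as used in step D3 can be sketched as the following recursion, in which each output sample is the residual plus a prediction formed from already decoded samples. The sign convention and the names `lp_synthesis` and `history` are assumptions for illustration; the source does not spell out the synthesis formula.

```python
def lp_synthesis(residual, coeffs, history):
    """Reconstruct one frame from its residual by linear prediction synthesis.

    coeffs  : [a_1, ..., a_p], assumed convention x(n) = e(n) + sum a_i * x(n - i)
    history : decoded samples preceding the frame, oldest first (zero-padded if short)
    """
    p = len(coeffs)
    buf = ([0.0] * p + list(history))[-p:]  # last p decoded samples
    for e in residual:
        pred = sum(coeffs[i] * buf[-1 - i] for i in range(p))
        buf.append(e + pred)
    return buf[p:]
```

Because each sample depends on the samples decoded just before it, the decoder needs the tail of the previous frame, which is one reason the decoded sound signal is kept in storage.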

Note that the signal decoding unit 23 does nothing for a frame whose sound signal code is missing.

<Linear prediction coefficient storage unit 241>
The loss determination information output by the depacketizing unit 21 and the decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p output by the linear prediction coefficient decoding unit 22 are input to the linear prediction coefficient storage unit 241.

For each frame, when the loss determination information of the frame indicates that the sound signal code is not missing, that is, when the sound signal code of the frame is not missing, the linear prediction coefficient storage unit 241 stores the input decoded linear prediction coefficients ^α_1, ^α_2, …, ^α_p as the previous-frame decoded linear prediction coefficients ^β_1, ^β_2, …, ^β_p.

When the loss determination information of the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the linear prediction coefficient storage unit 241 outputs the stored previous-frame decoded linear prediction coefficients ^β_1, ^β_2, …, ^β_p to the linear prediction unit 243.

＜復号音信号記憶部２４２＞
復号音信号記憶部２４２には、非パケット化部２１が出力した欠落判定情報と、信号復号部２３が出力した復号音信号^x(0),^x(1),…,^x(N-1)とが入力される。 <Decoded sound signal storage unit 242>
The decoded sound signal storage unit 242 stores the loss determination information output by the non-packetizing unit 21 and the decoded sound signals ^ x (0), ^ x (1), ..., ^ x (N) output by the signal decoding unit 23. -1) is input.

復号音信号記憶部２４２は、正常に復号された復号音信号を数フレーム分、例えば、２フレーム分記憶する。例えば、復号音信号記憶部２４２は、当該フレームの欠落判定情報が音信号符号が欠落していないことを示す場合、すなわち、当該フレームの音信号符号が欠落していない場合に、復号音信号記憶部２４２に記憶されている当該フレームの直前フレーム復号音信号^y(N+1),^y(N+2),…,^y(2N-1)の各サンプル値を、２フレーム前復号音信号^y(0),^y(1),…,^y(N-1)の各サンプル値としてそれぞれ記憶し、入力された復号音信号^x(0),^x(1),…,^x(N-1)の各サンプル値を前フレーム復号音信号^y(N+1),^y(N+2),…,^y(2N-1)の各サンプル値としてそれぞれ記憶する。 The decoded sound signal storage unit 242 stores decoded sound signals that have been successfully decoded for several frames, for example, two frames. For example, the decoded sound signal storage unit 242 stores the decoded sound signal when the missing signal determination information of the frame indicates that the sound signal code is not missing, that is, when the sound signal code of the frame is not missing. Each frame sample value of the decoded sound signal ^ y (N + 1), ^ y (N + 2),..., ^ Y (2N-1) immediately before the frame stored in the unit 242 is decoded two frames earlier. The sound signals ^ y (0), ^ y (1), ..., ^ y (N-1) are stored as respective sample values, and the input decoded sound signals ^ x (0), ^ x (1), ..., ^ x (N-1) are used as the sample values of the preceding frame decoded sound signal ^ y (N + 1), ^ y (N + 2), ..., ^ y (2N-1), respectively. Remember.

When the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the decoded sound signal storage unit 242 outputs the stored previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) to the linear prediction unit 243 and the signal extension selection unit 244.

<Linear prediction unit 243>
For each frame whose loss determination information indicates that the sound signal code is missing, that is, for each frame whose sound signal code is missing, the linear prediction unit 243 receives the loss determination information output by the depacketizing unit 21, the previous-frame decoded linear prediction coefficients ^β₁,^β₂,…,^β_p output by the linear prediction coefficient storage unit 241, and the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) output by the decoded sound signal storage unit 242.

When the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the linear prediction unit 243 generates the prediction signal predict(0),predict(1),…,predict(N-1) from the values of the previous-frame decoded linear prediction coefficients ^β₁,^β₂,…,^β_p, for example according to Equation (1), and outputs the generated prediction signal predict(0),predict(1),…,predict(N-1) to the interpolation signal combining unit 245 (step D4).

Here n=0,1,…,N-1, and predict(-k)=^y(2N-k) for k>0.

That is, the prediction signal predict(0),predict(1),…,predict(N-1) is the prediction signal of the frame obtained when the decoded linear prediction coefficients of the previous frame, whose sound signal code was not missing, are used as the decoded linear prediction coefficients of the frame and 0 is used for every sample value of the frame's residual signal.
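The zero-residual prediction described above can be sketched as follows. The sign convention is an assumption: the sketch uses predict(n) = Σ_{i=1..p} ^β_i · predict(n-i), and the patent's Equation (1) may define the coefficients with the opposite sign. The function name `predict_frame` is hypothetical.

```python
def predict_frame(y, beta, frame_len):
    """Extrapolate frame_len samples from the history y with a zero residual.

    Assumed convention: predict(n) = sum_{i=1..p} beta[i-1] * predict(n - i),
    with predict(-k) = y[2N - k] for k > 0, i.e. the tail of the stored history.
    """
    p = len(beta)
    if len(y) < p:
        raise ValueError("history shorter than prediction order")
    buf = list(y[-p:]) if p else []  # most recent p samples, oldest first
    out = []
    for _ in range(frame_len):
        # buf[-i] is the sample i steps in the past.
        s = sum(beta[i - 1] * buf[-i] for i in range(1, p + 1))
        out.append(s)
        buf.append(s)  # predicted samples feed later predictions
    return out
```

With the illustrative coefficients `[2.0, -1.0]`, the recursion continues a straight line, so a history ending in 3, 4 is extended to 5, 6.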

<Signal extension selection unit 244>
For each frame whose loss determination information indicates that the sound signal code is missing, that is, for each frame whose sound signal code is missing, the signal extension selection unit 244 receives the loss determination information output by the depacketizing unit 21 and the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) output by the decoded sound signal storage unit 242.

As described later, under certain assumptions, the time-reversed version of the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) and the time-reversed, polarity-inverted version are each equal to the decoded sound signal of the previous frame with its phase shifted by one frame length while its power spectrum is preserved. Accordingly, for each frame, when the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the signal extension selection unit 244 selects, for example as follows, a signal that is appropriate as the decoded sound signal of that frame (hereinafter sometimes called the "current frame") as inferred from the decoded sound signal of the previous frame, and generates the extended decoded sound signal extend(0),extend(1),…,extend(2N-1), which is the decoded sound signal of the current frame under that assumption (step D5).

If the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) can be expressed, for example, as a sum of sinusoids that are even-symmetric about the boundary between the N-th and (N+1)-th samples, the decoded sound signal of the current frame that follows it equals the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) arranged in reverse time order. Likewise, if the previous-frame decoded sound signal can be expressed as a sum of sinusoids that are odd-symmetric about the boundary between the N-th and (N+1)-th samples, the decoded sound signal of the current frame that follows it equals the previous-frame decoded sound signal arranged in reverse time order with its polarity inverted.

Based on this, the signal extension selection unit 244 first calculates even and odd, the energies of the sum and of the difference of the sample pairs that are symmetric about the boundary between the N-th and (N+1)-th samples of the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1), for example according to Equations (2) and (2'), and judges from the relative magnitude of these values whether the previous-frame decoded sound signal is closer to even symmetry or to odd symmetry.

Then, when even≥odd, the signal extension selection unit 244 selects as the extended decoded sound signal the previous-frame decoded sound signal in reverse order, extend(n)=^y(2N-1-n) (n=0,1,…,2N-1); when odd>even, it selects the polarity-inverted previous-frame decoded sound signal in reverse order, extend(n)=-^y(2N-1-n) (n=0,1,…,2N-1). It outputs the selected extended decoded sound signal extend(0),extend(1),…,extend(2N-1) to the interpolation signal combining unit 245.
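The symmetry test and selection described above can be sketched as below. The exact form of the two energies is an assumption: the sketch pairs ^y(N-1-k) with ^y(N+k) for k=0,…,N-1 and sums the squared sums and squared differences without normalization. The function name `select_extension` is hypothetical.

```python
def select_extension(y):
    """Return extend(0..2N-1) from the 2N-sample history y (first-embodiment rule)."""
    if len(y) % 2:
        raise ValueError("history must hold an even number of samples")
    n = len(y) // 2
    # Sample pairs mirrored about the boundary between y[N-1] and y[N] (assumed pairing).
    even = sum((y[n - 1 - k] + y[n + k]) ** 2 for k in range(n))
    odd = sum((y[n - 1 - k] - y[n + k]) ** 2 for k in range(n))
    # even >= odd: time-reversed history; otherwise time-reversed and polarity-inverted.
    sign = 1.0 if even >= odd else -1.0
    return [sign * v for v in reversed(y)]
```

A perfectly even-symmetric history gives odd = 0 and is simply mirrored, while a perfectly odd-symmetric history gives even = 0 and is mirrored with its sign flipped.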

That is, the interpolation signal generation unit 24, more specifically its signal extension selection unit 244, selects the signal obtained by arranging the samples of the previous frame's decoded sound signal in reverse time order when that decoded sound signal is close to even symmetry, selects the polarity-inverted version of that reversed signal when the decoded sound signal is close to odd symmetry, and uses the selected signal as the extended decoded sound signal of the frame.

In this way, for a frame whose sound signal code is missing, the interpolation signal generation unit 24, more specifically its signal extension selection unit 244, selects either the signal obtained by arranging the samples of the previous frame's decoded sound signal in reverse time order or the polarity-inverted version of that signal, and uses the selected signal as the extended decoded sound signal of the frame.

<Interpolation signal combining unit 245>
The interpolation signal combining unit 245 receives the loss determination information output by the depacketizing unit 21, the decoded sound signal ^x(0),^x(1),…,^x(N-1) output by the signal decoding unit 23, the prediction signal predict(0),predict(1),…,predict(N-1) output by the linear prediction unit 243, and the extended decoded sound signal extend(0),extend(1),…,extend(2N-1) output by the signal extension selection unit 244.

Depending on the loss determination information for the frame and for the previous frame, that is, depending on whether the sound signal code of the frame is missing and whether the sound signal code of the previous frame is missing, the interpolation signal combining unit 245 selectively performs, for example, one of the following three processes.

When the loss determination information for the frame and for the previous frame both indicate that the sound signal code is not missing, that is, when neither the sound signal code of the frame nor that of the previous frame is missing, the interpolation signal combining unit 245 outputs the decoded sound signal ^x(0),^x(1),…,^x(N-1) as the decoded sound signal ^X(0),^X(1),…,^X(N-1) of the frame (step D6).

When the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the interpolation signal combining unit 245 linearly combines the prediction signal predict(0),predict(1),…,predict(N-1) and the extended decoded sound signal extend(0),extend(1),…,extend(N-1) using a predetermined window function, for example according to Equation (3), and outputs the result as the decoded sound signal ^X(0),^X(1),…,^X(N-1) of the frame (step D6).

Here n=0,1,…,N-1 and w(n) denotes a window function. A window function other than the one in Equation (3) may be used as w(n), but it is desirable that the window function multiplying extend(n) increases with time, that is, takes larger values for larger n, and that the window function multiplying predict(n) decreases with time, that is, takes smaller values for larger n. Linearly combining two signals with window functions in this way is called crossfading here, and the window functions multiplying extend(n) and predict(n) are called crossfade functions.
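The crossfade for a lost frame can be sketched as follows. The window is an assumption: the sketch uses a linear ramp w(n)=(n+0.5)/N on extend(n) and the complementary weight 1-w(n) on predict(n), which satisfies the monotonicity conditions stated above but is not necessarily the window of Equation (3). The function name `conceal_lost_frame` is hypothetical.

```python
def conceal_lost_frame(predict, extend, frame_len):
    """Crossfade from predict (fading out) to extend (fading in) over one frame."""
    if len(predict) < frame_len or len(extend) < frame_len:
        raise ValueError("inputs shorter than one frame")
    out = []
    for n in range(frame_len):
        w = (n + 0.5) / frame_len  # assumed linear window, increasing in n
        out.append(w * extend[n] + (1.0 - w) * predict[n])
    return out
```

The output starts close to the prediction signal, which continues the previous frame smoothly, and ends close to the extended signal, which preserves the previous frame's power spectrum.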

In this way, for a frame whose sound signal code is missing, the interpolation signal generation unit 24, more specifically its interpolation signal combining unit 245, uses as the decoded sound signal of the frame a signal generated from two inputs: the extended decoded sound signal, which is either the previous frame's decoded sound signal with its samples arranged in reverse time order or the polarity-inverted version of that signal, and the signal obtained by linear prediction synthesis from the previous frame.

When the loss determination information for the frame indicates that the sound signal code is not missing and the loss determination information for the previous frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is not missing but that of the previous frame is missing, the interpolation signal combining unit 245 crossfades the decoded sound signal ^x(0),^x(1),…,^x(N-1) with the extended decoded sound signal extend(N),extend(N+1),…,extend(2N-1), for example according to Equation (4), and outputs the result as the decoded sound signal ^X(0),^X(1),…,^X(N-1) of the frame (step D6).

Here n=0,1,…,N-1 and w(n) denotes the window function described above. This operation improves the continuity between the frame in which information was lost and the current frame.
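The crossfade for the first correctly received frame after a loss can be sketched in the same style. Two points are assumptions: the linear ramp w(n)=(n+0.5)/N, and the assignment of the increasing weight to the decoded signal ^x(n) and the decreasing weight to the second half of the extended signal, extend(N+n). The function name `recover_frame` is hypothetical.

```python
def recover_frame(x, extend):
    """Crossfade the tail extend(N..2N-1) (fading out) into the decoded frame x (fading in)."""
    n_len = len(x)
    if len(extend) < 2 * n_len:
        raise ValueError("extended signal must hold two frames")
    out = []
    for n in range(n_len):
        w = (n + 0.5) / n_len  # assumed linear window, increasing in n
        out.append(w * x[n] + (1.0 - w) * extend[n_len + n])
    return out
```

Using the second half of the extended signal here means the concealed frame and the recovery frame together play out all 2N samples of the extension without a jump.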

<Interpolation signal generation unit 24>
Through the processing of the linear prediction coefficient storage unit 241, the decoded sound signal storage unit 242, the linear prediction unit 243, and the signal extension selection unit 244 described above, together with the processing of the interpolation signal combining unit 245 using Equation (3), the interpolation signal generation unit 24 can be said to do the following for a frame whose sound signal code is missing: from among a plurality of candidate signals that have the same power spectrum as the decoded sound signal of the previous frame, it selects a candidate with high temporal continuity with the previous frame and uses the selected candidate as the decoded sound signal of the frame.

When the sound signal code of a frame is missing, the interpolation signal combining unit 245 of the interpolation signal generation unit 24 may instead use the extended decoded sound signal output by the signal extension selection unit 244 directly as the decoded sound signal of the frame. That is, for a frame whose sound signal code is missing, the interpolation signal generation unit 24 may use as the decoded sound signal of the frame either the signal obtained by arranging the samples of the previous frame's decoded sound signal in reverse time order or the polarity-inverted version of that signal. In this case, the interpolation signal generation unit 24 need not operate the linear prediction coefficient storage unit 241 and the linear prediction unit 243, and need not include them.

In this way, the decoding device interpolates using a signal that has the same power spectrum as the previous frame, based only on information from the previous frame. No additional information needs to be transmitted, the delay stays within the algorithmic delay of normal decoding, and a decoded sound signal that is perceptually better than that of the prior art can be obtained.

[Decoding device and method of the second embodiment]
The decoding device and method of the second embodiment achieve a crossfade with high continuity by using the prediction signal when the signal extension selection unit 244 selects how to extend the signal. The details of the decoding device and method of the second embodiment are described below.

FIG. 6 shows an example of the decoding device of the second embodiment. As in the first embodiment, the decoding device of the second embodiment includes, for example, a depacketizing unit 21, a linear prediction coefficient decoding unit 22, a signal decoding unit 23, and an interpolation signal generation unit 24. The interpolation signal generation unit 24 includes, for example, a linear prediction coefficient storage unit 241, a decoded sound signal storage unit 242, a linear prediction unit 243, a signal extension selection unit 244, and an interpolation signal combining unit 245.

The linear prediction unit 243 and the signal extension selection unit 244, which differ from the first embodiment, are described below. Description of the parts that are the same as in the first embodiment is omitted.

<Linear prediction unit 243>
The linear prediction unit 243 receives the loss determination information output by the depacketizing unit 21, the previous-frame decoded linear prediction coefficients ^β₁,^β₂,…,^β_p output by the linear prediction coefficient storage unit 241, and the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) output by the decoded sound signal storage unit 242.

When the sound signal code is missing, the linear prediction unit 243 generates the prediction signal predict(0),predict(1),…,predict(N-1) in the same way as the linear prediction unit 243 of the first embodiment (step D4).

The linear prediction unit 243 outputs the generated prediction signal predict(0),predict(1),…,predict(N-1) to the interpolation signal combining unit 245 and the signal extension selection unit 244.

The difference from the first embodiment is that the prediction signal predict(0),predict(1),…,predict(N-1) is also output to the signal extension selection unit 244.

<Signal extension selection unit 244>
For each frame whose loss determination information indicates that the sound signal code is missing, that is, for each frame whose sound signal code is missing, the signal extension selection unit 244 receives the loss determination information output by the depacketizing unit 21, the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) output by the decoded sound signal storage unit 242, and the prediction signal predict(0),predict(1),…,predict(N-1) output by the linear prediction unit 243.

When the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the signal extension selection unit 244 selects the extended decoded sound signal extend(0),extend(1),…,extend(2N-1), for example as follows (step D5).

The signal extension selection unit 244 prepares, for example, the following four candidates extend₁(n), extend₂(n), extend₃(n), extend₄(n) for the extended decoded sound signal extend(0),extend(1),…,extend(2N-1). extend₁(n)=^y(2N-1-n) is the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) arranged in reverse time order. extend₂(n)=-^y(2N-1-n) is that reversed signal with its polarity inverted. extend₃(n)=^y(n) is the previous-frame decoded sound signal itself. extend₄(n)=-^y(n) is the previous-frame decoded sound signal with its polarity inverted.

Here n=0,1,…,2N-1.

As described later, under certain assumptions these candidates are equal to the decoded sound signal of the previous frame with its phase shifted by one frame length while its power spectrum is preserved. For each frame, when the loss determination information for the frame indicates that the sound signal code is missing, that is, when the sound signal code is missing, the signal extension selection unit 244 selects the candidate most appropriate for the frame, using as its criterion the continuity between the decoded sound signal of the previous frame and the extended decoded sound signal. The unit uses the prediction signal to measure this continuity: by the principle of linear prediction, the prediction signal is continuous with the decoded sound signal of the previous frame, so the signal extension selection unit 244 evaluates continuity by how close the values of each candidate are to those of the prediction signal.

That is, from the four candidates extend₁(n), extend₂(n), extend₃(n), extend₄(n), the signal extension selection unit 244 selects as the extended decoded sound signal extend(0),extend(1),…,extend(2N-1) the candidate that minimizes the squared distance between the prediction signal and the candidate, given for example by Equation (5), and outputs the selected extended decoded sound signal extend(0),extend(1),…,extend(2N-1) to the interpolation signal combining unit 245.

Here i=1,2,3,4.

Alternatively, from the four candidates extend₁(n), extend₂(n), extend₃(n), extend₄(n), the signal extension selection unit 244 may select as the extended decoded sound signal the candidate that maximizes the inner product value given, for example, by Equation (6).

Here i=1,2,3,4.
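The candidate selection of the second embodiment can be sketched as follows. The summation range is an assumption: both criteria are evaluated over the N samples n=0,…,N-1 for which the prediction signal exists, with no window applied. The function names `candidates` and `select_by_prediction` are hypothetical.

```python
def candidates(y):
    """Build the four candidates from the 2N-sample history y."""
    rev = list(reversed(y))
    return [rev, [-v for v in rev], list(y), [-v for v in y]]


def squared_distance(predict, e):
    """Squared distance over the samples covered by predict (assumed range 0..N-1)."""
    return sum((predict[k] - e[k]) ** 2 for k in range(len(predict)))


def inner_product(predict, e):
    """Inner product over the samples covered by predict (assumed range 0..N-1)."""
    return sum(predict[k] * e[k] for k in range(len(predict)))


def select_by_prediction(y, predict, criterion="distance"):
    """Pick the candidate closest to predict; ties go to the lowest candidate index."""
    if len(predict) > len(y):
        raise ValueError("prediction longer than history")
    cands = candidates(y)
    if criterion == "distance":
        return min(cands, key=lambda e: squared_distance(predict, e))
    if criterion == "inner":
        return max(cands, key=lambda e: inner_product(predict, e))
    raise ValueError("criterion must be 'distance' or 'inner'")
```

The two criteria need not agree in general, because the first N samples of the reversed and non-reversed candidates can carry different energy.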

In Equations (5) and (6), the signal extend_i(n)' obtained by multiplying extend_i(n) by the value of the crossfade window function may be used in place of extend_i(n). Similarly, in Equations (5) and (6), the signal predict(n)' obtained by multiplying predict(n) by the value of the crossfade window function may be used in place of predict(n).

In this way, the interpolation signal generation unit 24, more specifically its signal extension selection unit 244, selects, from among the signal obtained by arranging the samples of the previous frame's decoded sound signal in reverse time order, the polarity-inverted version of that signal, the previous frame's decoded sound signal itself, and the polarity-inverted version of the previous frame's decoded sound signal, the signal most similar to the signal obtained by linear prediction synthesis from the previous frame, and uses the selected signal as the extended decoded sound signal.

Also in this way, for a frame whose sound signal code is missing, the interpolation signal generation unit 24, more specifically its signal extension selection unit 244, selects one of the signal obtained by arranging the samples of the previous frame's decoded sound signal in reverse time order, the polarity-inverted version of that signal, the previous frame's decoded sound signal itself, and the polarity-inverted version of the previous frame's decoded sound signal, and uses the selected signal as the extended decoded sound signal.

The signal extension selection unit 244 outputs the selected extended decoded sound signal extend(0),extend(1),…,extend(2N-1) to the interpolation signal combining unit 245.

In the first embodiment, the candidates for the extended decoded sound signal extend(0),extend(1),…,extend(2N-1) were two signals: the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) arranged in reverse time order, and the polarity-inverted version of that signal. In the second embodiment described above, there are four candidates. Increasing the number of candidates for the extended decoded sound signal makes more accurate interpolation possible.

In the second embodiment as well, the candidates for the extended decoded sound signal extend(0),extend(1),…,extend(2N-1) may be limited, as in the first embodiment, to the two signals consisting of the previous-frame decoded sound signal ^y(0),^y(1),…,^y(2N-1) arranged in reverse time order and the polarity-inverted version of that signal.

In this case, the signal extension selection unit 244 calculates even and odd, the energies of the sum and of the difference of the samples that are symmetric about the boundary formed when the decoded sound signal of the previous frame and the prediction signal are placed one after the other, defined for example by the following equations.

Then, for the same reason as in the first embodiment, when even≥odd the signal extension selection unit 244 selects as the extended decoded sound signal the previous-frame decoded sound signal in reverse order, extend(n)=^y(2N-1-n) (n=0,1,…,2N-1); when odd>even, it selects the polarity-inverted previous-frame decoded sound signal in reverse order, extend(n)=-^y(2N-1-n) (n=0,1,…,2N-1). It outputs the selected extended decoded sound signal extend(0),extend(1),…,extend(2N-1) to the interpolation signal combining unit 245.
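The two-candidate variant of the second embodiment can be sketched as below. The pairing is an assumption: the sketch mirrors about the boundary between ^y(2N-1) and predict(0), pairing ^y(2N-1-k) with predict(k) for k=0,…,N-1, without normalization. The function name `select_extension_with_prediction` is hypothetical.

```python
def select_extension_with_prediction(y, predict):
    """Choose between the reversed history and its negation using the prediction signal."""
    n = len(predict)
    if len(y) < n:
        raise ValueError("history shorter than prediction")
    # Pairs mirrored about the boundary between y[2N-1] and predict[0] (assumed pairing).
    even = sum((y[-1 - k] + predict[k]) ** 2 for k in range(n))
    odd = sum((y[-1 - k] - predict[k]) ** 2 for k in range(n))
    sign = 1.0 if even >= odd else -1.0
    return [sign * v for v in reversed(y)]
```

When the prediction signal looks like a mirror image of the end of the history, the reversed history is chosen; when it looks like a negated mirror image, the reversed and polarity-inverted history is chosen.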

[Decoding device and method of the third embodiment]
The decoding device and method of the third embodiment estimate the linear prediction coefficients from the decoded sound signal of frames preceding the frame in which information was lost. Consequently, even when the encoding device assumed by the decoding device does not include the linear prediction analysis unit 11 and the packets input to the decoding device contain no linear prediction coefficient code, the decoding device can interpolate the signal using linear prediction. Alternatively, the decoding device can predict with higher accuracy by performing the linear prediction analysis at a higher order than the one used in encoding.
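One common way to estimate linear prediction coefficients from a decoded signal is the autocorrelation method solved with the Levinson-Durbin recursion. The sketch below uses that standard technique as a stand-in; the text above does not state which estimation method the third embodiment uses. The function name `estimate_lpc`, the absence of an analysis window, and the sign convention y(n) ≈ Σ_{i=1..p} a_i · y(n-i) are all assumptions.

```python
def estimate_lpc(y, order):
    """Estimate a[0..order-1] with y(n) ~ sum_i a[i-1] * y(n-i) (autocorrelation method)."""
    if order >= len(y):
        raise ValueError("order must be smaller than the signal length")
    # Autocorrelation r[0..order] of the stored decoded signal (no window applied).
    r = [sum(y[k] * y[k + lag] for k in range(len(y) - lag)) for lag in range(order + 1)]
    a = [0.0] * order
    err = r[0]  # prediction error energy of the order-0 predictor
    for m in range(order):
        if err <= 0.0:
            break  # silent or perfectly predictable history: keep remaining coefficients at 0
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err  # reflection coefficient for order m + 1
        new_a = a[:]
        new_a[m] = k
        for i in range(m):
            new_a[i] = a[i] - k * a[m - 1 - i]
        a = new_a
        err *= 1.0 - k * k
    return a
```

Because the coefficients come from the decoded history alone, the order can be chosen freely at the decoder, which matches the remark above about using a higher order than the encoder.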

第三実施形態の復号装置の例を図７に示す。第三実施形態の復号装置は、非パケット化部２１と、線形予測係数復号部２２と、信号復号部２３と、補間信号生成部２４とを例えば備えている。補間信号生成部２４は、復号音信号記憶部２４２と、線形予測部２４３と、信号拡張選択部２４４と、補間信号結合部２４５と、線形予測係数推定部２４６とを例えば備えている。 FIG. 7 shows an example of a decoding device according to the third embodiment. The decoding device according to the third embodiment includes, for example, a non-packetizing unit 21, a linear prediction coefficient decoding unit 22, a signal decoding unit 23, and an interpolation signal generation unit 24. The interpolation signal generation unit 24 includes, for example, a decoded sound signal storage unit 242, a linear prediction unit 243, a signal extension selection unit 244, an interpolation signal combination unit 245, and a linear prediction coefficient estimation unit 246.

The following describes the parts that differ from the first or second embodiment: the depacketizing unit 21, the linear prediction coefficient decoding unit 22, the decoded sound signal storage unit 242, the linear prediction unit 243, and the linear prediction coefficient estimation unit 246. Descriptions of parts that are the same as in the first or second embodiment are not repeated.

<Depacketizing unit 21>
The depacketizing unit 21 generates loss determination information by the same processing as in the first or second embodiment. It outputs the generated loss determination information not only to the decoded sound signal storage unit 242, the signal extension selection unit 244, and the interpolation signal combining unit 245 but also to the linear prediction coefficient estimation unit 246 (step D1).

The other processing of the depacketizing unit 21 is the same as in the first or second embodiment.

<Linear prediction coefficient decoding unit 22>
For each frame whose sound signal code is not missing, the linear prediction coefficient decoding unit 22 obtains decoded linear prediction coefficients ^α₁, ^α₂, …, ^α_p by the same processing as in the first or second embodiment, and outputs the obtained decoded linear prediction coefficients ^α₁, ^α₂, …, ^α_p to the signal decoding unit 23 (step D2).

The other processing of the linear prediction coefficient decoding unit 22 is the same as in the first or second embodiment.

<Decoded sound signal storage unit 242>
When the loss determination information of the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the decoded sound signal storage unit 242 outputs the stored previous-frame decoded sound signal ^y(0), ^y(1), …, ^y(2N-1) not only to the linear prediction unit 243 and the signal extension selection unit 244 but also to the linear prediction coefficient estimation unit 246.

The other processing of the decoded sound signal storage unit 242 is the same as in the first or second embodiment.

<Linear prediction coefficient estimation unit 246>
The linear prediction coefficient estimation unit 246 receives the loss determination information output by the depacketizing unit 21 and the previous-frame decoded sound signal ^y(0), ^y(1), …, ^y(2N-1) output by the decoded sound signal storage unit 242.

When the loss determination information of the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the linear prediction coefficient estimation unit 246 generates estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p from the previous-frame decoded sound signal ^y(0), ^y(1), …, ^y(2N-1) by the same processing as the linear prediction analysis used in the linear prediction analysis unit 11 in FIG. 1 (step D7).
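One possible way to carry out such an estimation is sketched below. It assumes the autocorrelation method with the Levinson-Durbin recursion, which is a common form of linear prediction analysis; this section does not state which analysis the linear prediction analysis unit 11 uses. The function names and the sign convention y(n) ≈ Σ γ_i y(n-i) are illustrative assumptions.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum_t x[t] * x[t - lag] for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def estimate_lpc(y_prev, order):
    """Estimate gamma_1..gamma_p with y(n) ~= sum_i gamma_i * y(n - i).

    Assumed method: autocorrelation + Levinson-Durbin recursion.
    """
    r = autocorrelation(y_prev, order)
    if r[0] == 0.0:
        return [0.0] * order              # silent frame: nothing to predict
    coeffs = []                           # coefficients of the current order
    err = r[0]                            # prediction error energy
    for m in range(1, order + 1):
        acc = r[m] - sum(coeffs[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err                     # reflection coefficient
        coeffs = [coeffs[i] - k * coeffs[m - 2 - i]
                  for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
        if err <= 0.0:                    # signal is perfectly predictable
            break
    return coeffs + [0.0] * (order - len(coeffs))


# Hypothetical usage: a decaying exponential is a first-order process.
y_prev = [0.9 ** n for n in range(200)]
gamma = estimate_lpc(y_prev, order=1)     # gamma[0] is close to 0.9
```

Because the analysis runs on the decoder side, the order passed to `estimate_lpc` is not tied to the order used by the encoder.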

The linear prediction coefficient estimation unit 246 outputs the generated estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p to the linear prediction unit 243.

<Linear prediction unit 243>
The linear prediction unit 243 receives the loss determination information output by the depacketizing unit 21, the estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p output by the linear prediction coefficient estimation unit 246, and the previous-frame decoded sound signal ^y(0), ^y(1), …, ^y(2N-1) output by the decoded sound signal storage unit 242.

For each frame, when the loss determination information of the frame indicates that the sound signal code is missing, that is, when the sound signal code of the frame is missing, the linear prediction unit 243 generates a prediction signal predict(0), predict(1), …, predict(N-1) from the values of the estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p, for example according to equation (1'), and outputs the generated prediction signal predict(0), predict(1), …, predict(N-1) to the interpolation signal combining unit 245.

In equation (1'), n = 1, 2, …, N. The prediction signal predict(0), predict(1), …, predict(N-1) is the prediction signal of the frame obtained when the estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p are used as the decoded linear prediction coefficients of the frame and 0 is used as every sample value of the residual signal of the frame.
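Prediction with an all-zero residual amounts to running the synthesis filter forward from the tail of the previous frame. The sketch below illustrates this under an assumed sign convention, value = Σ γ_i × (sample i steps back); equation (1') may use a different sign or indexing. The function name `predict_frame` is hypothetical.

```python
def predict_frame(y_prev, gamma, n_out):
    """Generate predict(0..n_out-1) with zero residual.

    y_prev : previous-frame decoded samples, most recent sample last
    gamma  : prediction coefficients gamma_1..gamma_p (assumed convention:
             value = sum_i gamma_i * sample i steps in the past)
    """
    p = len(gamma)
    # Zero-pad in front so that p past samples always exist.
    history = [0.0] * p + list(y_prev)
    out = []
    for _ in range(n_out):
        value = sum(gamma[i] * history[-1 - i] for i in range(p))
        out.append(value)
        history.append(value)             # predictions feed later predictions
    return out


# Hypothetical usage: first-order predictor, previous frame ends at 1.0.
pred = predict_frame([1.0], [0.5], 3)     # [0.5, 0.25, 0.125]
```

The usage example also shows why a low-order predictor loses energy over the frame: each output is a scaled copy of earlier outputs, with no residual to replenish it.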

[Technical background]
In the above embodiments, a frame with missing information is interpolated using at least one of two types of signals.

The first of the two types of signals is the prediction signal predict(0), predict(1), …, predict(N-1). It is obtained, for example, by equation (1) above using the decoded linear prediction coefficients ^β₁, ^β₂, …, ^β_p of the previous frame. By its nature, this prediction signal guarantees continuity with the previous frame. However, when the prediction order p is small relative to the frame length N, the energy of the prediction signal gradually decreases and its value falls to 0 in the latter half of the frame. Therefore, interpolating a frame with missing information using only the prediction signal can cause a discontinuity with the next frame.

As in the first and second embodiments, the decoded linear prediction coefficients ^β₁, ^β₂, …, ^β_p can be obtained by decoding, on the decoding side, the coefficients that were encoded on the encoding side.

Instead of the decoded linear prediction coefficients ^β₁, ^β₂, …, ^β_p, the prediction signal predict(0), predict(1), …, predict(N-1) can also be obtained, as in the third embodiment, using estimated linear prediction coefficients ^γ₁, ^γ₂, …, ^γ_p found by linear prediction analysis of a previously decoded waveform. In this case p can be made large, for example comparable to N, so that the energy of the prediction signal is maintained over more samples. However, obtaining linear prediction coefficients of a large order requires holding a long run of past samples (multiple frames), and the computational cost of the analysis becomes extremely large.

To prevent loss of signal within the frame without increasing the computational cost, the second of the two types of signals, the extended decoded sound signal, is used. For the extended decoded sound signal extend(0), extend(1), …, extend(2N-1), a signal that preserves the power spectrum of the previous-frame decoded sound signal ^y(0), ^y(1), …, ^y(2N-1) is used, for example. By cross-fading the extended decoded sound signal extend(0), extend(1), …, extend(2N-1) with the prediction signal predict(0), predict(1), …, predict(N-1), for example as in equation (3) above, and using the result as the decoded sound signal of the missing frame, the energy of the interpolated signal is maintained while temporal and spectral continuity with the frames before and after the missing frame is preserved.
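A cross-fade of this kind can be sketched as below. The linear fade weights and the choice to fade only over the first N samples (the length of the prediction signal) are illustrative assumptions; equation (3) is not reproduced in this section and may define different weights. The function name `crossfade` is hypothetical.

```python
def crossfade(predict, extend):
    """Blend predict(0..N-1) into extend(0..2N-1).

    Assumed weighting: the prediction signal fades out linearly over its
    own length while the extended signal fades in; after that the output
    is the extended signal alone.
    """
    n_fade = len(predict)
    out = []
    for n, e in enumerate(extend):
        if n < n_fade:
            w = 1.0 - n / n_fade          # weight on the prediction signal
            out.append(w * predict[n] + (1.0 - w) * e)
        else:
            out.append(e)
    return out


# Hypothetical usage with N = 2.
mixed = crossfade([1.0, 1.0], [0.0, 0.0, 0.0, 0.0])   # [1.0, 0.5, 0.0, 0.0]
```

Starting the output at the prediction signal keeps the junction with the previous frame smooth, while the extended signal carries the energy through the rest of the frame.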

An example of how the extended decoded sound signal extend(0), extend(1), …, extend(2N-1) is selected is described below.

When considering the frequency content of a signal, decomposing the signal into sinusoids is widely used, but there are various kinds of sinusoids that can be used for the decomposition. For example, as shown in FIG. 8, consider the case where a signal is decomposed into cosine waves over n = 0, 1, …, N-1 according to the following equation.

If the signal is extended to n+N = N, …, 2N-1 while keeping the frequency spectrum a(0), a(1), …, a(N-1) unchanged, it can be expressed as in the following equation.

In other words, the time-reversed signal is equal to the signal whose phase has been shifted by one frame length while its power spectrum is kept unchanged.

Similarly, when a signal is decomposed into sine waves over n = 0, 1, …, N-1 according to the following equation, it can be extended to n+N = N, …, 2N-1 as in the equation after it.

In other words, the time-reversed and polarity-inverted signal is equal to the signal whose phase has been shifted by one frame length while its power spectrum is kept unchanged.
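These two properties can be checked numerically. The sketch below assumes a DCT-II-style cosine basis cos(π(n+1/2)k/N) and a matching sine basis sin(π(n+1/2)(k+1)/N); the equations of the specification are not reproduced in this section and may use a different basis. The coefficient values are arbitrary test data.

```python
import math


def cos_synthesis(a, n):
    """x(n) = sum_k a(k) * cos(pi * (n + 1/2) * k / N), assumed cosine basis."""
    N = len(a)
    return sum(a[k] * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))


def sin_synthesis(a, n):
    """x(n) = sum_k a(k) * sin(pi * (n + 1/2) * (k + 1) / N), assumed sine basis."""
    N = len(a)
    return sum(a[k] * math.sin(math.pi * (n + 0.5) * (k + 1) / N)
               for k in range(N))


a = [0.3, -1.2, 0.7, 0.5]                 # arbitrary spectrum, N = 4
N = len(a)

# Cosine case: evaluating at n + N gives the time-reversed signal.
cos_ok = all(abs(cos_synthesis(a, N + n) - cos_synthesis(a, N - 1 - n)) < 1e-9
             for n in range(N))

# Sine case: evaluating at n + N gives the time-reversed, inverted signal.
sin_ok = all(abs(sin_synthesis(a, N + n) + sin_synthesis(a, N - 1 - n)) < 1e-9
             for n in range(N))
```

Under these assumed bases both flags evaluate to true: the same coefficients evaluated one frame later reproduce the mirrored signal, with a sign flip in the sine case.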

The above extensions show that if the signal is assumed to consist of cosine waves, the extension is the time-reversed signal, and if the signal is assumed to consist of sine waves, the extension is the time-reversed and polarity-inverted signal. For this reason, the first embodiment compares whether the even symmetry or the odd symmetry of the signal is stronger. If the even symmetry is stronger, the signal is regarded as consisting of even-symmetric cosine waves, and if the odd symmetry is stronger, it is regarded as consisting of odd-symmetric sine waves, and the extended signal is selected accordingly.

When the signal is decomposed into complex sinusoids, the signal extended to n+N = N, …, 2N-1 can be obtained as in the following equations.

These are, respectively, the signal as is and the polarity-inverted signal. The four extension methods above correspond to the extended decoded sound signal candidates extend₁(n), extend₂(n), extend₃(n), and extend₄(n) in the signal extension selection unit 244 of the second embodiment.

[Program and recording medium]
When the processing of each unit of the decoding device is implemented by a computer, the processing content of the functions that each unit of the decoding device should have is described by a program. By executing this program on the computer, the processing of each unit is realized on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The processing of each unit may be implemented by executing a predetermined program on a computer, or at least part of the processing may be implemented in hardware.

[Modifications]
Needless to say, other changes may be made as appropriate without departing from the spirit of the present invention.

Claims

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, selects, from a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, the signal candidate having higher temporal continuity with the preceding frame, and uses, as the decoded sound signal of the frame, a signal generated based on the extended decoded sound signal, which is the selected signal candidate, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, uses, as the decoded sound signal of the frame, a signal generated based on an extended decoded sound signal, which is a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order or a signal obtained by inverting the polarity of that signal, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, selects, from among a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order, a signal obtained by inverting the polarity of that signal, the decoded sound signal of the preceding frame, and a signal obtained by inverting the polarity of the decoded sound signal of the preceding frame, the signal candidate having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and uses, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal candidate, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, selects, from a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, the signal having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and uses, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, selects, from among a plurality of signal candidates having the same power spectrum as the decoded sound signal of the frame preceding that frame, the signal candidate having high temporal continuity with the preceding frame, and uses, as the decoded sound signal of the frame, a signal generated based on the extended decoded sound signal, which is the selected signal candidate, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding device for obtaining a decoded sound signal for each frame, the decoding device comprising:
an interpolation signal generation unit that, for a frame in which the sound signal code is missing, selects, from among a plurality of signal candidates having the same power spectrum as the decoded sound signal of the frame preceding that frame, the signal candidate having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and uses, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal candidate, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

The decoding device according to any one of claims 1, 2, and 4, wherein
the interpolation signal generation unit selects the signal in which samples of the decoded sound signal of the preceding frame are arranged in reverse time order when the decoded sound signal of the preceding frame is even-symmetric, selects the polarity-inverted signal when the decoded sound signal of the preceding frame is odd-symmetric, and uses the selected signal as the decoded sound signal of the frame or as the extended decoded sound signal.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, selecting, from a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, the signal candidate having higher temporal continuity with the preceding frame, and using, as the decoded sound signal of the frame, a signal generated based on the extended decoded sound signal, which is the selected signal candidate, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, using, as the decoded sound signal of the frame, a signal generated based on an extended decoded sound signal, which is a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order or a signal obtained by inverting the polarity of that signal, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, selecting, from among a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order, a signal obtained by inverting the polarity of that signal, the decoded sound signal of the preceding frame, and a signal obtained by inverting the polarity of the decoded sound signal of the preceding frame, the signal candidate having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and using, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal candidate, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, selecting, from a signal in which samples of the decoded sound signal of the frame preceding that frame are arranged in reverse time order and a signal obtained by inverting the polarity of that signal, the signal having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and using, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, selecting, from among a plurality of signal candidates having the same power spectrum as the decoded sound signal of the frame preceding that frame, the signal candidate having high temporal continuity with the preceding frame, and using, as the decoded sound signal of the frame, a signal generated based on the extended decoded sound signal, which is the selected signal candidate, and a signal obtained by linear prediction synthesis from the preceding frame.

A decoding method for obtaining a decoded sound signal for each frame, the decoding method comprising:
an interpolation signal generation step of, for a frame in which the sound signal code is missing, selecting, from among a plurality of signal candidates having the same power spectrum as the decoded sound signal of the frame preceding that frame, the signal candidate having high similarity to a signal obtained by linear prediction synthesis from the preceding frame, and using, as the decoded sound signal of the frame, the extended decoded sound signal, which is the selected signal candidate, or a signal generated based on the extended decoded sound signal and the signal obtained by linear prediction synthesis from the preceding frame.

A program for causing a computer to function as each unit of the decoding device according to any one of claims 1 to 7.