JP7790976B2

JP7790976B2 - Signal processing device, signal processing method, and signal processing program

Info

Publication number: JP7790976B2
Application number: JP2022001337A
Authority: JP
Inventors: 昭彦杉山
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-01-06
Filing date: 2022-01-06
Publication date: 2025-12-23
Anticipated expiration: 2042-01-06
Also published as: JP2023100564A

Description

The present invention relates to signal processing technology for canceling noise, interference signals, echoes, and the like that are mixed into a signal.

Speech signals input from microphones, handsets, and the like often have background noise superimposed on them, which is a significant problem for speech coding and speech recognition. As signal processing devices intended to cancel acoustically superimposed noise, Patent Documents 1 and 2 disclose a two-input noise canceller that uses two adaptive filters. The device receives a degraded signal (a mixture of a desired signal and noise) and a reference signal (mainly containing a signal correlated with the noise), cancels some or all of the noise, and outputs an enhanced signal (a signal in which the desired signal is enhanced). Of the two adaptive filters, the first is used to estimate the signal-to-noise ratio of the degraded signal, and a step size calculation unit uses that ratio to calculate the coefficient update step size of the second adaptive filter. The first adaptive filter operates in the same way as the second, but its coefficient update step size is set larger than that of the second adaptive filter. As a result, the output of the first adaptive filter tracks environmental changes well, but its noise estimate is less accurate than that of the second adaptive filter.

The step size calculation unit evaluates the signal-to-noise ratio of the degraded signal estimated using the first adaptive filter. When the speech signal is larger than the noise, it regards the interference from the speech signal as large and provides a small coefficient update step size to the second adaptive filter. Conversely, when the speech signal is smaller than the noise, it regards the interference from the speech signal as small and provides a large coefficient update step size to the second adaptive filter. By controlling the second adaptive filter with the coefficient update step size provided by the step size calculation unit in this way, both sufficient tracking of environmental changes and low distortion in the noise-cancelled signal are obtained.

Patent Document 3 discloses a configuration in which the first adaptive filter is removed from the configurations of Patent Documents 1 and 2. The signal-to-noise ratio is approximated by the ratio between the desired signal (such as speech) estimated using the second adaptive filter and the output of the second adaptive filter, and the second adaptive filter itself is controlled with a step size calculated from that signal-to-noise ratio. Patent Document 3 further discloses a noise canceller that extends the configurations of Patent Documents 1 and 2 so that, when the speech signal leaking into the noise at the input of the two-input device has a large influence, that is, when so-called crosstalk from the speech signal is present, the speech signal mixed into the noise is also cancelled. In addition to the configurations of Patent Documents 1 and 2, this canceller includes a third adaptive filter that cancels the crosstalk from the reference signal. To cancel noise accurately from the speech signal input, a second step size calculation unit calculates a coefficient update step size and controls the third adaptive filter.

In other words, the noise cancellers of Patent Documents 1 to 3 control the coefficient update of the adaptive filter with a signal-to-noise ratio estimated from the noise-cancelled signal and the adaptive filter output. By using a small step size when the signal-to-noise ratio is high and a large step size when it is low, they achieve both fast convergence and a low-distortion output signal.

However, in the noise cancellers of Patent Documents 1 to 3, the adaptive filter coefficients are not updated at all. This is because the initial values of the adaptive filter coefficients are normally set to zero. An adaptive filter with zero coefficients outputs zero. Because this output is the denominator of the signal-to-noise ratio estimate, the estimate becomes extremely large, and the corresponding step size is set to zero. A step size of zero means that no coefficient update is performed. To avoid this, the step size must be forced to a non-zero value immediately after coefficient updating begins, but no clear design method is disclosed for which value to use or for how long the step size must be held at a non-zero value. In other words, achieving both fast convergence and a low-distortion output signal in a two-input noise canceller requires manual control of the step size.
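The stall described above can be reproduced with a few lines of Python. This is only a minimal sketch: the mapping `step_size_from_snr` (step size proportional to 1/(1 + SNR)) and all function names are assumptions made for illustration, not the control law of Patent Documents 1 to 3.

```python
def snr_estimate(e_power, n_power):
    """Ratio of noise-cancelled signal power to adaptive-filter output power."""
    if n_power == 0.0:
        return float("inf")  # a zero filter output makes the ratio diverge
    return e_power / n_power


def step_size_from_snr(snr, mu_max=0.5):
    """Hypothetical monotone mapping: the larger the SNR, the smaller the step."""
    return mu_max / (1.0 + snr)  # evaluates to 0.0 when snr is infinite


def run_snr_controlled_nlms(x_p, x_r, num_taps=2, eps=1e-12):
    """NLMS whose step size is driven only by the ratio e1^2 / n1^2."""
    w = [0.0] * num_taps    # coefficients start at zero, as is usual
    buf = [0.0] * num_taps  # most recent reference samples
    for xp, xr in zip(x_p, x_r):
        buf = [xr] + buf[:-1]
        n1 = sum(wi * xi for wi, xi in zip(w, buf))  # pseudo-noise
        e1 = xp - n1                                 # noise-cancelled signal
        mu = step_size_from_snr(snr_estimate(e1 * e1, n1 * n1))
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + mu * e1 * xi / norm for wi, xi in zip(w, buf)]
    return w


reference = [1.0, -0.5, 0.8, -0.3, 0.6, -0.9]
degraded = [0.5 * v for v in reference]  # pure noise through a gain of 0.5
print(run_snr_controlled_nlms(degraded, reference))
```

Because the pseudo-noise is zero at every step, the estimated ratio is infinite, the step size is zero, and the coefficients never leave their initial values, even though the degraded signal here is pure noise that the filter could cancel.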

Patent Document 4 discloses a two-input noise canceller that requires no manual control of the step size and achieves both fast convergence and a low-distortion output signal. Immediately after the device starts operating, a new signal-to-noise ratio, defined as the ratio of the noise-cancelled signal to the reference signal, is used to control the coefficient update of the adaptive filter, which solves the problem of coefficients never being updated. Once the adaptive filter coefficients have grown, the device switches to the signal-to-noise ratio disclosed in Patent Document 3. Coefficient growth is judged by whether the new signal-to-noise ratio has become sufficiently close to the conventional one. When crosstalk is present, a third adaptive filter that cancels the speech signal from the noise input signal is introduced, and its step size is controlled on the same principle as that of the second adaptive filter.

Patent Document 1: Japanese Patent Application Laid-Open No. H10-215193
Patent Document 2: Japanese Patent Application Laid-Open No. 2000-172299
Patent Document 3: International Publication No. WO 2012/046582
Patent Document 4: International Publication No. WO 2019/092798

However, in the signal processing device described in Patent Document 4, the switch between the estimated signal-to-noise ratios does not take place when the gain of the acoustic impulse response approximated by the adaptive filter is less than 1. This is because, with an acoustic impulse response gain of less than 1, the new signal-to-noise ratio never becomes sufficiently close to the conventional signal-to-noise ratio. As a result, the signal-to-noise ratio is estimated with low accuracy, and fast convergence and a low-distortion output signal cannot both be achieved.
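A short power-domain calculation suggests why the two ratios stay apart. The symbols below are assumptions introduced for illustration: P_e(k), P_n(k) and P_R(k) are the powers of the noise-cancelled signal, the adaptive filter output and the reference signal, and g is the power gain of the acoustic path. The exact definitions in Patent Document 4 may differ, for example by using amplitudes instead of powers.

```latex
\begin{align*}
P_n(k) &\approx g\,P_R(k) \quad \text{(filter output after convergence)} \\
R_{\mathrm{conv}}(k) &= \frac{P_e(k)}{P_n(k)} \approx \frac{P_e(k)}{g\,P_R(k)} \\
R_{\mathrm{new}}(k) &= \frac{P_e(k)}{P_R(k)} \\
\frac{R_{\mathrm{new}}(k)}{R_{\mathrm{conv}}(k)} &\approx g
\end{align*}
```

Under these assumptions, a path gain of g = 0.25 keeps the new ratio about 6 dB below the conventional one however well the filter has converged, so a closeness test between the two ratios would not be satisfied.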

An object of the present invention is to provide technology that solves the above problem.

In order to achieve the above object, a device according to the present invention comprises:
a first input means for inputting a first mixed signal in which a first signal and a second signal are mixed;
a second input means for inputting a second mixed signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
a first adaptive filter that filters the second mixed signal to generate an estimate of the second signal;
a first subtraction unit that generates an estimate of the first signal from the first mixed signal and the estimate of the second signal; and
an estimation unit that estimates, as a first mixture ratio, the ratio of the amplitudes or powers of the first signal and the second signal, using the estimate of the first signal, the estimate of the second signal, the second mixed signal, and the coefficients of the first adaptive filter,
wherein the first adaptive filter is controlled using the first mixture ratio.

In order to achieve the above object, a method according to the present invention comprises:
inputting a first mixed signal in which a first signal and a second signal are mixed;
inputting a second mixed signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
filtering the second mixed signal to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixed signal and the estimate of the second signal;
estimating, as a first mixture ratio, the ratio of the amplitudes or powers of the first signal and the second signal, using the estimate of the first signal, the estimate of the second signal, the second mixed signal, and the coefficients of the first adaptive filter; and
controlling the generation of the estimate of the second signal using the first mixture ratio.

In order to achieve the above object, a program according to the present invention causes a computer to execute:
a step of inputting a first mixed signal in which a first signal and a second signal are mixed;
a step of inputting a second mixed signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
a step of filtering the second mixed signal to generate an estimate of the second signal;
a step of generating an estimate of the first signal from the first mixed signal and the estimate of the second signal;
a step of estimating, as a first mixture ratio, the ratio of the amplitudes or powers of the first signal and the second signal, using the estimate of the first signal, the estimate of the second signal, the second mixed signal, and the coefficients of the first adaptive filter; and
a step of controlling the generation of the estimate of the second signal using the first mixture ratio.

According to the present invention, it is possible to obtain a signal processing device that achieves both fast convergence and a low-distortion output signal without manual control of the step size, even when the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

FIG. 1 is a block diagram showing the configuration of a signal processing device according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a signal processing device according to a second embodiment of the present invention.
FIG. 3 is a block diagram showing a first configuration of an estimation unit according to the second embodiment of the present invention.
FIG. 4 is a diagram showing the time transition of values according to the second embodiment of the present invention.
FIG. 5 is a block diagram showing a second configuration of the estimation unit according to the second embodiment of the present invention.
FIG. 6 is a block diagram showing the configuration of a signal processing device according to a third embodiment of the present invention.
FIG. 7 is a block diagram showing a first configuration of an estimation unit according to the third embodiment of the present invention.
FIG. 8 is a block diagram showing a second configuration of the estimation unit according to the third embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of a computer according to the first embodiment of the present invention.
FIG. 10 is a flowchart showing an example of signal processing by a processor of the computer shown in FIG. 9.

Embodiments of the present invention are described in detail below, by way of example, with reference to the drawings. The components described in the following embodiments are merely examples, however, and are not intended to limit the technical scope of the present invention to those components alone.

[1. First Embodiment]
A signal processing device 100 according to a first embodiment of the present invention is described with reference to FIG. 1. The signal processing device 100 in FIG. 1 obtains an estimate e1(k) of a first signal from a first mixed signal xP(k) in which the first signal and a second signal are mixed.

As shown in FIG. 1, the signal processing device 100 includes a first input unit 101, a second input unit 102, an adaptive filter 103, a subtraction unit 104, an estimation unit 106, and a coefficient update control unit 107.

Of these, the first input unit 101 inputs a first mixed signal xP(k) in which the first signal and the second signal are mixed. The second input unit 102 inputs a second mixed signal xR(k) in which a third signal and a fourth signal are mixed. The first signal and the third signal originate from the same signal source A and are correlated with each other. The second signal and the fourth signal originate from the same signal source B and are correlated with each other.

The subtraction unit 104 receives the first mixed signal xP(k) and an estimate n1(k) of the second signal contained in the first mixed signal xP(k), and outputs the estimate e1(k) of the first signal. To obtain the estimate n1(k) of the second signal, the adaptive filter 103 filters the second mixed signal xR(k) using coefficients 141 that are updated based on the estimate e1(k) of the first signal.

The estimation unit 106 estimates the ratio of the amplitudes or powers of the first signal and the second signal as a first mixture ratio R1(k), using the estimate e1(k) of the first signal, the estimate n1(k) of the second signal, the first mixed signal xP(k), the second mixed signal xR(k), and the coefficient vector w1(k) of the adaptive filter 103. When the first mixture ratio R1(k) obtained by the estimation unit 106 is large, the coefficient update control unit 107 outputs to the adaptive filter 103 a control signal μ1(k) that reduces the amount by which the coefficients 141 of the adaptive filter 103 are updated.

With this configuration, the present embodiment can remove the second signal from a mixed signal containing the first signal and the second signal with a low computational load and without delay. As a result, even when the gain of the acoustic impulse response approximated by the adaptive filter is less than 1, it is possible to obtain an estimate of the first signal with little residual second signal and little distortion.

[2. Second Embodiment]
As a signal processing device according to a second embodiment of the present invention, a noise canceller is described that receives a degraded signal (a mixture of a desired signal and noise) and a reference signal (mainly containing a signal correlated with the noise), cancels some or all of the noise, and outputs an enhanced signal (a signal in which the desired signal is enhanced). Here, the degraded signal corresponds to the first mixed signal, in which the first signal and the second signal are mixed; the reference signal corresponds to the second mixed signal, in which the third signal and the fourth signal are mixed; and the enhanced signal (the estimate of the first signal) corresponds to the desired signal.

[2.1. Basic Noise Cancellation Technology]
The following briefly explains the basic technology of noise cancellation, in which an adaptive filter cancels noise, interference signals, echoes, and the like mixed into a desired signal input from a microphone, handset, communication channel, or the like, or enhances the desired signal.

As disclosed in Patent Documents 1 to 3, a two-input noise canceller uses an adaptive filter that approximates the impulse response of the acoustic path from the noise source to the speech input terminal in order to generate, from the reference signal, pseudo-noise (an estimate of the second signal) corresponding to the noise component mixed into the speech at the speech input terminal. It then suppresses the noise component by subtracting this pseudo-noise from the signal input to the speech input terminal (the first mixed signal). Here, the mixed signal is a signal in which the desired (speech) signal and noise are mixed, and it is generally supplied to the speech input terminal from a microphone or handset. The reference signal is a signal correlated with the noise component at the noise source and is captured near the noise source. Because it is captured near the noise source, the reference signal can be regarded as approximately equal to the noise component at the noise source. The reference signal supplied to the reference input terminal is input to the adaptive filter.

The coefficients of the adaptive filter are corrected by taking the correlation between the error, obtained by subtracting the pseudo-noise from the degraded signal, and the reference signal input to the reference input terminal. As coefficient correction algorithms for such an adaptive filter, Patent Documents 1 to 3 disclose the LMS (Least Mean-Square) algorithm and the LIM (Learning Identification Method). LIM is also called the normalized LMS (NLMS) algorithm.

Coefficient update by the normalized LMS algorithm is expressed by equation (1), where μ1(k) is the step size at time k. The coefficient vector w1(k) is defined in terms of its elements by equation (2). The reference signal vector xR(k) is expressed in terms of its elements by equation (3). Here, T denotes the transpose of a vector, and L is the total number of coefficients.
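Equations (1) to (3) themselves are not reproduced in this text. As a reading aid only, the textbook form of the normalized LMS update that matches the symbols defined above would look as follows. The element indexing and the absence of a regularization term in the denominator are assumptions, and the published equations may differ in such details.

```latex
\begin{align}
\mathbf{w}_1(k+1) &= \mathbf{w}_1(k)
  + \mu_1(k)\,\frac{e_1(k)\,\mathbf{x}_R(k)}{\mathbf{x}_R^{T}(k)\,\mathbf{x}_R(k)} \tag{1}\\
\mathbf{w}_1(k) &= \bigl[\,w_{1,0}(k),\; w_{1,1}(k),\; \dots,\; w_{1,L-1}(k)\,\bigr]^{T} \tag{2}\\
\mathbf{x}_R(k) &= \bigl[\,x_R(k),\; x_R(k-1),\; \dots,\; x_R(k-L+1)\,\bigr]^{T} \tag{3}
\end{align}
```

In this form the pseudo-noise is the inner product of the coefficient vector and the reference signal vector, and e1(k) is the degraded signal minus that pseudo-noise.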

The LMS algorithm and LIM belong to the class of algorithms called gradient methods, and the speed and accuracy of coefficient updating depend on a constant called the coefficient update step size. The filter coefficients are updated using the product of the coefficient update step size and the error. To reduce the interference that the desired signal contained in the error (the estimate of the first signal) causes to the coefficient update, the coefficient update step size must be set to an extremely small value or to zero. If the coefficient update step size is always kept small, however, the adaptive filter coefficients track environmental changes poorly. Patent Documents 1 to 3 disclose one method for solving the resulting problem of increased output error or distortion of the desired signal. Because the desired signal is generally speech, it is referred to as speech hereafter, but the present invention is not limited to speech and covers all kinds of signals, including acoustic (audio) signals. Nor is the coefficient update algorithm limited to LMS or LIM.

[2.2. Configuration of the Noise Canceller]
FIG. 2 is a block diagram showing the overall configuration of a noise canceller 200 according to this embodiment. The noise canceller 200 may function as part of a device such as a digital camera, a laptop computer, a mobile phone, a hearing aid, a television, a smart speaker, or a robot, but the present invention is not limited to these and is applicable to any signal processing device that is required to cancel noise from an input signal.

As shown in FIG. 2, the noise canceller 200 receives from an input terminal 201 a degraded signal (first mixed signal) xP(k) in which speech (the first signal) and noise (the second signal) are mixed. It receives from an input terminal 202 a reference signal (second mixed signal) xR(k) in which speech and noise are mixed, and outputs an estimate e1(k) of the speech (the enhanced signal) from an output terminal 205. The noise canceller 200 also includes an adaptive filter 203, a subtraction unit 204, and an estimation unit 206. The adaptive filter 203 combines the adaptive filter 103 and the coefficient update control unit 107 of FIG. 1: it receives the first mixture ratio R1(k), calculates the step size μ1(k), and updates the coefficient vector w1(k) using the calculated step size. The noise canceller 200 cancels noise by transforming the reference signal xR(k), which is correlated with the noise to be cancelled, with the adaptive filter 203 to generate pseudo-noise n1(k), and subtracting this from the degraded signal xP(k), which is a speech signal with noise superimposed on it.

The degraded signal xP(k) is supplied to the input terminal 201 as a sequence of sample values and is passed to the subtraction unit 204. The reference signal xR(k) is supplied to the input terminal 202 as a sequence of sample values and is passed to the adaptive filter 203 and the estimation unit 206.

The adaptive filter 203 convolves the reference signal xR(k) with the filter coefficients and passes the result to the subtraction unit 204 and the estimation unit 206 as the pseudo-noise n1(k). The adaptive filter 203 also supplies the coefficient vector w1(k) to the estimation unit 206.

The subtraction unit 204 is supplied with the degraded signal xP(k) from the input terminal 201 and the pseudo-noise n1(k) from the adaptive filter 203. It subtracts the pseudo-noise n1(k) from the degraded signal xP(k), passes the result to the output terminal 205 as the estimate e1(k) of the speech signal (the estimate of the first signal), and at the same time feeds it back to the adaptive filter 203.
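The signal flow through the adaptive filter 203 and the subtraction unit 204 can be sketched as a per-sample loop. The sketch below is a simplified illustration under stated assumptions: it uses a fixed step size `mu` in place of the R1(k)-controlled step size, a normalized LMS update with a small regularization constant `eps`, and a made-up two-tap acoustic path, none of which are taken from the patent.

```python
import random


def nlms_noise_canceller(x_p, x_r, num_taps, mu=0.5, eps=1e-8):
    """Return (e1 sequence, final coefficients) for a two-input canceller."""
    w1 = [0.0] * num_taps   # coefficient vector w1(k)
    buf = [0.0] * num_taps  # reference vector [xR(k), xR(k-1), ...]
    e_out = []
    for xp, xr in zip(x_p, x_r):
        buf = [xr] + buf[:-1]
        # Adaptive filter: convolution of the reference with the coefficients.
        n1 = sum(wi * xi for wi, xi in zip(w1, buf))
        # Subtraction unit: degraded signal minus pseudo-noise.
        e1 = xp - n1
        # e1 is fed back to the filter for the normalized LMS update.
        norm = sum(xi * xi for xi in buf) + eps
        w1 = [wi + mu * e1 * xi / norm for wi, xi in zip(w1, buf)]
        e_out.append(e1)
    return e_out, w1


# Illustrative data: noise only (no speech), hypothetical path [0.5, 0.25].
rng = random.Random(0)
x_r = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, 0.25]
x_p = [path[0] * x_r[k] + (path[1] * x_r[k - 1] if k > 0 else 0.0)
       for k in range(len(x_r))]

e1, w1 = nlms_noise_canceller(x_p, x_r, num_taps=2)
print(w1)
```

With no speech present the error contains only residual noise, so the coefficients settle close to the assumed path and the output e1(k) decays toward zero. Once speech is present, it also appears in e1(k) and disturbs the update, which is why the step size is controlled by the mixture ratio.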

The estimation unit 206 receives the estimate e1(k) of the speech, the pseudo-noise n1(k) (the output of the adaptive filter 203), the reference signal xR(k), the degraded signal xP(k), and the coefficient vector w1(k) of the adaptive filter 203, estimates the ratio of the amplitudes or powers of the speech and the noise at the input terminal 201 as the first mixture ratio R1(k), and passes it to the adaptive filter 203. The adaptive filter 203 updates the coefficient vector w1(k) using a small step size μ1(k) when the first mixture ratio R1(k) is large and a large step size μ1(k) when the first mixture ratio R1(k) is small. Methods for controlling the step size using the first mixture ratio R1(k), that is, the estimated signal-to-noise ratio, are disclosed in detail in Patent Documents 1 to 3. As also disclosed in Patent Documents 1 to 3, the first mixture ratio R1(k) may be averaged before it is used to calculate the step size μ1(k). This improves the accuracy with which the ratio of the amplitudes or powers of the speech and the noise is estimated.
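One possible shape for this control is sketched below. The concrete mapping is defined in Patent Documents 1 to 3 and is not reproduced here, so the piecewise-linear curve in decibels, the thresholds `lo_db` and `hi_db`, the limits `mu_min` and `mu_max`, and the first-order averaging with `alpha` are all hypothetical choices that only preserve the stated property: a large R1(k) gives a small step size and a small R1(k) gives a large one.

```python
import math


def smooth_ratio(prev_avg, r1, alpha=0.9):
    """First-order recursive average of the mixture ratio (assumed form)."""
    return alpha * prev_avg + (1.0 - alpha) * r1


def step_size(r1, mu_min=0.0, mu_max=0.5, lo_db=-10.0, hi_db=10.0):
    """Map a power ratio R1 to a step size that decreases as R1 grows."""
    r1_db = 10.0 * math.log10(max(r1, 1e-12))  # guard against log of zero
    if r1_db <= lo_db:
        return mu_max  # noise dominates: adapt quickly
    if r1_db >= hi_db:
        return mu_min  # speech dominates: freeze or slow the update
    t = (r1_db - lo_db) / (hi_db - lo_db)
    return mu_max + t * (mu_min - mu_max)


for ratio in (0.01, 0.5, 1.0, 2.0, 100.0):
    print(ratio, step_size(ratio))
```

A smooth monotone curve is used here rather than a hard switch so that the step size does not jump when the ratio fluctuates around a single threshold.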

[2.3. First Configuration of the Estimation Unit 206]
FIG. 3 is a block diagram showing a first internal configuration of the estimation unit 206. The estimation unit 206 includes a signal ratio estimation unit 301, a signal ratio estimation unit 302, a mixing unit 305, and a correction unit 310. The signal ratio estimation unit 301 receives the estimate e1(k) of the speech and the pseudo-noise n1(k), and estimates the ratio of the amplitudes or powers of the speech and the noise at the input terminal 201 as a second mixture ratio R2(k). The second mixture ratio R2(k) may be the ratio of the amplitudes or powers of the speech estimate e1(k) and the pseudo-noise n1(k), or the ratio may be calculated after a small constant is added to those amplitudes or powers. Adding a small constant prevents the quotient from diverging in the division. One or both of the speech estimate e1(k) and the pseudo-noise n1(k) may also be averaged before use. Averaging improves the accuracy of the ratio calculation.
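The calculation performed by the signal ratio estimation unit 301 can be illustrated with a block-averaged power ratio. This is a sketch under assumptions: the function name, the choice of power rather than amplitude, block averaging as the form of averaging, and the value of the small constant `delta` are illustrative and not specified in the text.

```python
def second_mixture_ratio(e1_block, n1_block, delta=1e-6):
    """Estimate R2 as a ratio of block-averaged powers.

    delta is added to both powers so that the quotient stays finite even
    when the pseudo-noise is zero, as it is right after start-up.
    """
    p_e = sum(v * v for v in e1_block) / len(e1_block)  # speech-estimate power
    p_n = sum(v * v for v in n1_block) / len(n1_block)  # pseudo-noise power
    return (p_e + delta) / (p_n + delta)


# Pseudo-noise still zero: the ratio is large but finite.
print(second_mixture_ratio([1.0, -1.0], [0.0, 0.0]))
# Speech estimate at twice the pseudo-noise amplitude, no constant added.
print(second_mixture_ratio([2.0, 2.0], [1.0, 1.0], delta=0.0))
```

The small constant keeps the division well defined, but the ratio is still very large while the coefficients are zero, so the step size derived from R2(k) alone would remain close to zero at start-up.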

The correction unit 310 receives the degraded signal xP(k) and the reference signal xR(k), corrects the reference signal xR(k), and obtains a corrected reference signal xRC(k). For M1 samples immediately after the noise canceller 200 starts operating, the powers of the degraded signal xP(k) and the reference signal xR(k) are each averaged, and their ratio is taken as a scaling factor g1. The natural number M1 is a predetermined constant. L sampling periods are needed before non-zero input signal samples have been supplied to all coefficients of the adaptive filter 203 and normal operation begins. One way to set M1 is therefore M1 = L. The correction unit 310 takes the product of the square root of the scaling factor g1 and the reference signal xR(k) as the corrected reference signal xRC(k).
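The correction step can be written out directly from this description. In the sketch below the function names and the guard constant `delta` are assumptions; the ratio is taken as degraded-signal power over reference-signal power, which is the direction implied by multiplying the reference signal by the square root of g1.

```python
import math


def estimate_scaling_factor(x_p, x_r, m1, delta=1e-12):
    """g1: ratio of averaged powers over the first m1 samples after start."""
    p_p = sum(v * v for v in x_p[:m1]) / m1  # degraded-signal power
    p_r = sum(v * v for v in x_r[:m1]) / m1  # reference-signal power
    return p_p / (p_r + delta)               # delta guards a silent reference


def correct_reference(x_r, g1):
    """xRC(k) = sqrt(g1) * xR(k)."""
    scale = math.sqrt(g1)
    return [scale * v for v in x_r]


# Start-up segment without speech: xP(k) is the reference scaled by 0.5.
x_r = [1.0, -1.0, 1.0, -1.0]
x_p = [0.5, -0.5, 0.5, -0.5]
g1 = estimate_scaling_factor(x_p, x_r, m1=4)
x_rc = correct_reference(x_r, g1)
print(g1, x_rc)
```

In this toy case g1 comes out at about 0.25 in power, so the corrected reference has the same amplitude as the noise in the degraded signal.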

In typical speech applications, the amplitude of the first signal can be assumed to be zero immediately after the device starts. This is because conversation does not begin within a short interval after start-up, for example 100 ms or 200 ms, so no speech (first signal) is present. Background noise, however, is present from the moment the device is switched on. During this interval the degraded signal xP(k) equals the noise (second signal), so the multiplier g1 can be estimated as the ratio of the power of the degraded signal (= noise) xP(k) to the power of the reference signal xR(k) (the noise that becomes mixed in at the input terminal 201). The multiplier g1 estimated in this way corresponds to the output-to-input power ratio of the adaptive filter 203 when white noise is input to it, that is, to the average gain of the impulse response of the acoustic system that the adaptive filter 203 approximates. The product of the power of the reference signal xR(k) and the multiplier g1 approximates the power of the pseudo-noise n1(k) (the estimate of the second signal) with only the gain of the adaptive filter 203 taken into account, and it is a more accurate approximation of that power than the power of the reference signal xR(k) alone. Therefore, by using this product in place of the power of the pseudo-noise n1(k), the amplitude or power ratio of the first signal to the second signal, that is, of speech to noise at the input terminal 201, can be determined with higher accuracy.

The multiplier g1 may be determined at any time and any number of times, as long as the amplitude of the speech (first signal) is zero. Determining g1 more frequently allows changes in the impulse response of the acoustic system approximated by the adaptive filter 203 to be tracked more accurately. Outside the start-up interval, whether the speech amplitude is zero can be judged from the speech estimate e1(k). Because e1(k) always contains some error even after the adaptive filter 203 has sufficiently converged, it is compared with a predetermined threshold β1 that allows for this error. A large β1 increases how often the amplitude is judged to be zero but also increases misjudgments. A small β1 decreases how often the amplitude is judged to be zero, which degrades tracking of changes in the acoustic impulse response.
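
A possible form of this re-estimation is sketched below in Python. The description does not say whether amplitude or power of e1(k) is compared with β1, nor over how many samples. The sketch assumes mean power over a short frame, and all names are hypothetical.

```python
def mean_power(frame):
    """Average power of a frame of samples."""
    return sum(v * v for v in frame) / len(frame)


def refresh_gain(g1, x_p_frame, x_r_frame, e1_frame, beta1, eps=1e-12):
    """Return a re-estimated g1 if the frame is judged speech-free, else the old g1."""
    if mean_power(e1_frame) >= beta1:
        return g1  # speech judged present: keep the previous multiplier
    return mean_power(x_p_frame) / (mean_power(x_r_frame) + eps)
```

Raising `beta1` makes the second branch fire more often, which matches the trade-off described above between update frequency and misjudgment.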

信号比推定部３０２は、音声の推定値ｅ１（ｋ）と補正参照信号ｘＲＣ（ｋ）とを受けて、入力端子２０１における音声と雑音の振幅または電力の比を第３混在比Ｒ３（ｋ）として推定する。第３混在比Ｒ３（ｋ）は、音声の推定値ｅ１（ｋ）と補正参照信号ｘＲＣ（ｋ）の振幅または電力の比としてもよいし、それらの振幅または電力に微小定数を加算してから比を計算してもよい。微小定数の加算は、除算による商の発散を防ぐ効果がある。また、音声の推定値ｅ１（ｋ）と補正参照信号ｘＲＣ（ｋ）のいずれかまたは両方を、平均化してから用いてもよい。 The signal ratio estimation unit 302 receives the speech estimate e1(k) and the corrected reference signal xRC(k), and estimates the ratio of the amplitude or power of speech to noise at the input terminal 201 as the third mixture ratio R3(k). The third mixture ratio R3(k) may be the ratio of the amplitude or power of the speech estimate e1(k) and the corrected reference signal xRC(k), or may be calculated by adding a small constant to these amplitudes or powers. Adding a small constant has the effect of preventing the quotient from diverging due to division. Alternatively, either or both of the speech estimate e1(k) and the corrected reference signal xRC(k) may be averaged before use.

混合部３０５は、第２混在比Ｒ２（ｋ）と第３混在比Ｒ３（ｋ）とを混合して、混合結果を第１混在比Ｒ１（ｋ）として出力する。第２混在比Ｒ２（ｋ）と第３混在比Ｒ３（ｋ）は、重み付き加算によって混合してもよいし、さらに複雑な高次多項式などを用いて混合してもよい。混合に先立って、第２混在比Ｒ２（ｋ）と第３混在比Ｒ３（ｋ）のいずれかまたは両方を平均化してもよい。平均化によって、第１混在比Ｒ１（ｋ）の計算精度、すなわち入力端子２０１における音声と雑音の振幅または電力の近似精度を向上することができる。 The mixer 305 mixes the second mixture ratio R2(k) and the third mixture ratio R3(k) and outputs the mixture result as the first mixture ratio R1(k). The second mixture ratio R2(k) and the third mixture ratio R3(k) may be mixed using weighted addition, or may be mixed using a more complex higher-order polynomial. Prior to mixing, one or both of the second mixture ratio R2(k) and the third mixture ratio R3(k) may be averaged. Averaging can improve the calculation accuracy of the first mixture ratio R1(k), i.e., the approximation accuracy of the amplitude or power of the speech and noise at the input terminal 201.

ここで、単純化のために、第２混在比Ｒ２（ｋ）と第３混在比Ｒ３（ｋ）を重み付き加算で混合することで、第１混在比Ｒ１（ｋ）を求める場合を考える。また、両者の重みの和が１となるように設定する。適応フィルタ２０３の係数は、ゼロに初期化されることが一般的である。そのため、係数更新開始時には擬似雑音ｎ１（ｋ）はゼロであり、第２混在比Ｒ２（ｋ）は分母がゼロで無限大となる。このため、第２混在比Ｒ２（ｋ）によって適応フィルタ２０３のステップサイズμ１（ｋ）を算出すると、極めて小さな値またはゼロとなり、係数が成長しない。係数が成長しないと、擬似雑音ｎ１（ｋ）も大きくならず、同じ問題が継続する。 For simplicity's sake, consider the case where the first mixture ratio R1(k) is calculated by mixing the second mixture ratio R2(k) and the third mixture ratio R3(k) using weighted addition. The sum of the weights for both is set to 1. The coefficients of the adaptive filter 203 are generally initialized to zero. Therefore, when coefficient updating begins, the pseudo-noise n1(k) is zero, and the second mixture ratio R2(k) has a denominator of zero and is infinite. Therefore, when the step size μ1(k) of the adaptive filter 203 is calculated using the second mixture ratio R2(k), the result is an extremely small value or zero, and the coefficient does not grow. If the coefficient does not grow, the pseudo-noise n1(k) does not increase, and the same problem persists.

On the other hand, the denominator of the third mixture ratio R3(k) is the corrected reference signal xRC(k), which is not necessarily zero when coefficient updating begins, because the microphone input contains small signals such as environmental noise. Even if the corrected reference signal xRC(k) happens to be zero, it does not stay at zero. The third mixture ratio R3(k) therefore does not become infinite, and the corresponding step size μ1(k) does not become extremely small. As a result, the coefficients of the adaptive filter 203 grow as they are updated and converge to values representing the acoustic characteristics of the path from the noise source to the input terminal 201. When the corrected reference signal xRC(k) is zero, the reference signal xR(k) is also zero and the coefficients of the adaptive filter 203 are not updated, so an extremely large value of R3(k) causes no problem. However, once the coefficients of the adaptive filter 203 have grown to some extent and the pseudo-noise n1(k) has grown with them, the third mixture ratio R3(k) approximates the amplitude or power ratio of speech to noise at the input terminal 201 less accurately than the second mixture ratio R2(k) does.

Therefore, the mixer 305 sets the weight of the third mixture ratio R3(k) to a large value when coefficient updating of the adaptive filter 203 begins and decreases it as the coefficients grow. The weight of the second mixture ratio R2(k) is set to a small value when coefficient updating begins and is increased over time. In other words, the proportion of the third mixture ratio R3(k) in the first mixture ratio R1(k) decreases as the number of coefficient updates increases.

For example, if the weight of the third mixture ratio R3(k) is set to 1 when coefficient updating of the adaptive filter 203 begins, the weight of the second mixture ratio R2(k) is 0, since the weights sum to 1. Coefficient growth corresponds to the number of coefficient updates. Therefore, the weight of R3(k) is set to 1 when coefficient updating begins and is decreased toward 0 in accordance with the number of updates of the coefficient vector w1(k). Correspondingly, the weight of R2(k) increases from 0 to 1. If the weight of R3(k) starts at 1 and the weight of R2(k) reaches 1, or is set to 1, at some point, the first mixture ratio R1(k) is in effect switched from R3(k) to R2(k). A similar effect is obtained even if the sum of the weights is set to a value other than 1 or the initial weight of R3(k) is set to a value other than 1. The weights can be determined using the amount of change in the coefficient vector w1(k) of the adaptive filter 203.
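
The weighted addition with weights that sum to 1 can be sketched in Python as follows. The linear ramp over `ramp_len` updates is an assumption made only for illustration, and the function names are hypothetical. The description requires only that the weight of R3(k) decrease with the number of updates.

```python
def r3_weight(k, ramp_len):
    """Weight of R3(k): 1 at k = 0, falling linearly to 0 after ramp_len updates (assumed schedule)."""
    return max(0.0, 1.0 - k / ramp_len)


def mix_ratios(r2, r3, w3):
    """R1(k) as a weighted sum of R2(k) and R3(k), with weights w3 and 1 - w3."""
    # Handle the end points explicitly: R2(k) can be infinite at start-up,
    # and 0 * inf would otherwise evaluate to NaN.
    if w3 >= 1.0:
        return r3
    if w3 <= 0.0:
        return r2
    return (1.0 - w3) * r2 + w3 * r3
```

The explicit end-point handling reflects the start-up case discussed above, where the denominator of R2(k) is zero and only R3(k) carries usable information.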

図４は、第２実施形態に係る信号の時間推移を模式的に示す図であり、補正参照信号ｘＲＣ（ｋ）、適応フィルタ２０３の出力である擬似雑音ｎ１（ｋ）、および係数ベクトルｗ１（ｋ）が示されている。図４の縦軸は、信号の値をパワーで表したものであり、係数は係数ベクトルのノルムとする。横軸は、サンプル数ｋで表した適応フィルタ２０３の係数更新回数である。ｘＲＣ（ｋ）とｎ１（ｋ）を比較することは、同一の分子を有し、分母がそれぞれｘＲＣ（ｋ）とｎ１（ｋ）である第３混在比Ｒ３（ｋ）と第２混在比Ｒ２（ｋ）を比較することと等価である。ｘＲＣ（ｋ）は係数更新と無関係なので、ｎ１（ｋ）が第３混在比Ｒ３（ｋ）と第２混在比Ｒ２（ｋ）の関係を決定する。ｎ１（ｋ）はｘＲＣ（ｋ）と係数ベクトルｗ１（ｋ）で決定され、ｘＲＣ（ｋ）の増減に起因する変化を除くと、その振幅またはパワーは係数更新に伴って増加する。すなわち、ｘＲＣ（ｋ）は一定で、ｎ１（ｋ）は係数更新による係数ベクトルｗ１（ｋ）の増加と共に増加する。 Figure 4 is a schematic diagram showing the time progression of a signal according to the second embodiment, illustrating the corrected reference signal xRC(k), the pseudo-noise n1(k) output from the adaptive filter 203, and the coefficient vector w1(k). The vertical axis in Figure 4 represents the signal value in terms of power, and the coefficient is the norm of the coefficient vector. The horizontal axis represents the number of coefficient updates for the adaptive filter 203, expressed as the number of samples k. Comparing xRC(k) and n1(k) is equivalent to comparing the third mixture ratio R3(k) and the second mixture ratio R2(k), which have the same numerator and denominators xRC(k) and n1(k), respectively. Because xRC(k) is unrelated to the coefficient updates, n1(k) determines the relationship between the third mixture ratio R3(k) and the second mixture ratio R2(k). n1(k) is determined by xRC(k) and the coefficient vector w1(k), and its amplitude or power increases with coefficient updates, excluding changes due to increases or decreases in xRC(k). In other words, xRC(k) is constant, and n1(k) increases as the coefficient vector w1(k) increases due to coefficient updates.

The relationship among the corrected reference signal xRC(k), the pseudo-noise n1(k) output by the adaptive filter 203, and the coefficient vector w1(k) described above is clear from Figure 4. In Figure 4, xRC(k), which is unrelated to coefficient updates, is drawn as constant for simplicity. n1(k), which depends on xRC(k), increases smoothly and saturates once the coefficient vector w1(k) converges. Because the amount of change in the coefficient vector w1(k) of the adaptive filter 203 becomes smaller as updating proceeds, it can be used as an indicator of how far n1(k) has grown from zero. In other words, the mixer 305 determines the weight of the third mixture ratio R3(k) from the change over time of the coefficient vector w1(k).

Correcting xR(k) by the multiplier g1 reduces the gap in Figure 4 between the power of n1(k) and the power of the reference: the difference between the power of n1(k) and the power of xRC(k) is smaller than the difference between the power of n1(k) and the power of xR(k). In other words, xRC(k) approximates n1(k) more accurately than xR(k) does. Through a more accurate third mixture ratio R3(k), this leads to more accurate step size control, which is desirable for controlling the adaptive filter 203. A speech estimate with less residual noise and less distortion can therefore be obtained.

係数ベクトルｗ１（ｋ）の時間変化は、係数ベクトルｗ１（ｋ）の要素の２乗総和または絶対値総和の時間変化としてもよいし、２乗部分和または絶対値部分和の時間変化であってもよい。部分和とする際には、係数ベクトルの要素を間引いたものを用いてもよいし、係数ベクトルの一部を切り出したものを用いてもよい。部分和を用いることで、時間変化の評価精度低下を抑えつつ、係数の時間変化に必要とする演算量を削減することができる。また、係数は入力信号に依存しないで変化するので、滑らかである。従って、入力信号に依存する他の指標よりも最小的に変動が小さく、飽和状態を正確に検出できる。 The change over time of the coefficient vector w1(k) may be the change over time of the sum of squares or the sum of absolute values of the elements of the coefficient vector w1(k), or the change over time of the partial sum of squares or the partial sum of absolute values. When using a partial sum, elements of the coefficient vector may be thinned out, or a portion of the coefficient vector may be extracted. By using a partial sum, the amount of calculation required for the change over time of the coefficients can be reduced while minimizing the deterioration in the accuracy of the evaluation of the change over time. Furthermore, since the coefficients change independently of the input signal, they are smooth. Therefore, they have minimal fluctuation compared to other indices that depend on the input signal, allowing for accurate detection of saturation.
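
The alternatives listed above (full or partial sums, squares or absolute values, thinned or cut-out coefficients) can be expressed with a single slicing helper. The Python sketch below is illustrative only. The function names are hypothetical, and taking the absolute difference of two snapshots as the "change over time" is an assumption.

```python
def coeff_sum(w, squared=True, start=0, stop=None, stride=1):
    """Sum of squares (or of absolute values) over a slice of the coefficient vector."""
    part = w[start:stop:stride]  # stride > 1 thins the vector; start/stop cut out a portion
    if squared:
        return sum(c * c for c in part)
    return sum(abs(c) for c in part)


def coeff_change(w_now, w_before, **kwargs):
    """Absolute change of the chosen sum between two snapshots of the coefficients."""
    return abs(coeff_sum(w_now, **kwargs) - coeff_sum(w_before, **kwargs))
```

For example, `coeff_sum(w, stride=2)` touches only every second coefficient, which roughly halves the arithmetic compared with the full sum.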

Equation (4) shows an example of mixing the second mixture ratio R2(k) and the third mixture ratio R3(k) by weighted addition. ζ1(k), given by equation (5), satisfies ζ1(0) = 0 and increases as the coefficients are updated. δ1(k), which corresponds to the change over time of the coefficient vector w1(k) of the adaptive filter 203, is obtained from equation (6). M is a predetermined natural number that determines how often the change over time of w1(k) is calculated. A larger M makes δ1(k) fluctuate less, so coefficient saturation is detected more reliably, but it also delays the response of δ1(k), so saturation is detected later. Equation (5) represents an example in which the content ratio of R3(k) in the first mixture ratio R1(k) is set to 0 when δ1(k) falls below the threshold ε1, which switches R1(k) from R3(k) to R2(k). This is one way of judging that δ1(k) has become sufficiently small. If ζ1(k) = δ1(k)/|δ1(M)| is used in place of equation (5), the change over time of w1(k) itself determines the content ratio of R3(k).
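
Equations (4) to (6) are not reproduced in this text, so the following Python sketch should not be read as their definition. It only illustrates the threshold decision described above, under the assumption that δ1(k) is the absolute change of the coefficient energy over the last M updates. The names `delta1` and `r3_content_ratio` are hypothetical.

```python
def delta1(energy_history, m):
    """Assumed form of the change over time: |E(k) - E(k - m)| for coefficient energy E."""
    if len(energy_history) <= m:
        return float("inf")  # too little history: treat the coefficients as still changing
    return abs(energy_history[-1] - energy_history[-1 - m])


def r3_content_ratio(delta, epsilon1):
    """Content ratio of R3(k) in R1(k): 1.0 while delta >= epsilon1, 0.0 once it falls below."""
    return 1.0 if delta >= epsilon1 else 0.0
```

A larger `m` compares snapshots that are further apart, which smooths the measure but also means that a change in the coefficients takes longer to show up in the decision.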

このように、混合部３０５は、時間変化δ１（ｋ）と閾値ε１との比較結果に基づいて、第１混在比Ｒ１（ｋ）における第３混在比Ｒ３（ｋ）の含有割合を１００％および０％のうちのいずれかに設定することができる。なお、係数ベクトルｗ１（ｋ）の時間変化δ１（ｋ）と閾値ε１とを繰り返し比較し、係数ベクトルｗ１（ｋ）の時間変化δ１（ｋ）が閾値ε１以上である場合に、第１混在比Ｒ１（ｋ）における第３混在比Ｒ３（ｋ）の含有割合を１００％にし、そうでない場合に、第１混在比Ｒ１（ｋ）における第３混在比Ｒ３（ｋ）の含有割合を０％にする処理を繰り返してもよい。これにより、環境の変化に対する追従性の高い第１混在比Ｒ１（ｋ）を得ることができ、例えば、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の更新量が急増した場合であっても、より正確な係数更新が可能となる。 In this way, the mixer 305 can set the content ratio of the third mixture ratio R3(k) in the first mixture ratio R1(k) to either 100% or 0% based on the comparison result between the time change δ1(k) and the threshold ε1. Alternatively, the mixer 305 can repeatedly compare the time change δ1(k) of the coefficient vector w1(k) with the threshold ε1, and if the time change δ1(k) of the coefficient vector w1(k) is equal to or greater than the threshold ε1, set the content ratio of the third mixture ratio R3(k) in the first mixture ratio R1(k) to 100%. Otherwise, set the content ratio of the third mixture ratio R3(k) in the first mixture ratio R1(k) to 0%. This allows for a first mixture ratio R1(k) that is highly responsive to environmental changes, enabling more accurate coefficient updating, even if, for example, the amount of updates to the coefficient vector w1(k) of the adaptive filter 203 suddenly increases.

The mixer 305 can also set the content ratio of the third mixture ratio R3(k) in the first mixture ratio R1(k) to 0% when the change over time δ1(k) of the coefficient vector w1(k) has been below the threshold ε1 at least a predetermined first constant L1 consecutive times, and to 100% otherwise. Here, L1 is an integer of 2 or more. Once the content ratio of R3(k) has been set to 0%, the comparison of δ1(k) with ε1 may be stopped by setting the threshold to a sufficiently large value.
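
The consecutive-count rule can be sketched as a small state machine in Python. The class name and the `latched` flag are hypothetical. The flag models the optional behaviour of stopping the comparison once the content ratio has been set to 0%.

```python
class ConsecutiveSwitch:
    """Switch the content ratio of R3(k) to 0 after l1 consecutive values below epsilon1."""

    def __init__(self, epsilon1, l1):
        self.epsilon1 = epsilon1
        self.l1 = l1
        self.run = 0          # current run length of delta < epsilon1
        self.latched = False  # True once the switch to R2(k) has happened

    def update(self, delta):
        """Return the content ratio of R3(k) in R1(k) (1.0 or 0.0) for this delta."""
        if self.latched:
            return 0.0
        self.run = self.run + 1 if delta < self.epsilon1 else 0
        if self.run >= self.l1:
            self.latched = True
            return 0.0
        return 1.0
```

Requiring several consecutive small values makes the switch less sensitive to a single, accidentally small δ1(k).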

The mixer 305 can also evaluate δ1(k), the change over time of the coefficient vector w1(k), over an interval whose number of samples equals a predetermined second constant L2. When δ1(k) is below the threshold ε1 at least a predetermined third constant L3 consecutive times within that interval, the content ratio of the third mixture ratio R3(k) in the first mixture ratio R1(k) is set to 0%; otherwise it is set to 100%. The second constant L2 and the third constant L3 are both integers of 2 or more and satisfy L2 > L3. The consecutiveness condition can also be dropped, so that only the number of times δ1(k) falls below ε1 is compared with L3. Once the content ratio of R3(k) has been set to 0%, the evaluation of δ1(k) may be stopped by setting L3 to a value greater than L2.
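
Both windowed variants can be sketched in Python with a sliding window of the last L2 comparison results. The function names are hypothetical, and the use of a sliding (rather than block-wise) window is an assumption.

```python
from collections import deque


def longest_run(flags):
    """Length of the longest run of True values in a sequence of booleans."""
    best = run = 0
    for f in flags:
        run = run + 1 if f else 0
        best = max(best, run)
    return best


def r3_content_ratios(deltas, epsilon1, l2, l3, consecutive=True):
    """Content ratio of R3(k) per sample, judged over a sliding window of l2 samples."""
    window = deque(maxlen=l2)  # keeps only the most recent l2 comparison results
    ratios = []
    for d in deltas:
        window.append(d < epsilon1)
        hits = longest_run(window) if consecutive else sum(window)
        ratios.append(0.0 if hits >= l3 else 1.0)
    return ratios
```

With `consecutive=False` the rule counts all below-threshold samples in the window, so it switches earlier when small values of δ1(k) are interleaved with occasional large ones.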

The mixer 305 may also set the minimum weight of the second mixture ratio R2(k) and of the third mixture ratio R3(k) to a value greater than 0 instead of 0.

2.4. Second Configuration of Estimation Unit 206
FIG. 5 is a block diagram showing a second internal configuration of the estimation unit 206. The estimation unit 206 includes a correction unit 310, a mixer 506, and a signal ratio estimation unit 503. The correction unit 310 receives the degraded signal xP(k) and the reference signal xR(k) and corrects the reference signal xR(k) to obtain the corrected reference signal xRC(k); it operates in the same way as the correction unit 310 already described. The mixer 506 mixes the corrected reference signal xRC(k) (the corrected second mixed signal) with the pseudo-noise n1(k) (the estimate of the second signal) on the basis of the coefficient vector w1(k) to generate a first mixed signal n3(k). The signal ratio estimation unit 503 receives the speech estimate e1(k) and the first mixed signal n3(k) and estimates the amplitude or power ratio of speech to noise at the input terminal 201 as the first mixture ratio R1(k). The first mixture ratio R1(k) may be the ratio of the amplitude or power of e1(k) to that of n3(k), or the ratio may be calculated after adding a small constant to those amplitudes or powers. Adding a small constant prevents the quotient from diverging in the division. Either or both of e1(k) and n3(k) may also be averaged before use, which can improve the accuracy of the ratio calculation.
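
A minimal Python sketch of this second configuration follows. It assumes that the mixing is done on powers with a weight `xrc_share` already derived from the coefficient vector; the description itself only says that xRC(k) and n1(k) are mixed on the basis of w1(k). All names and the constant `eps` are hypothetical.

```python
def mix_noise_power(xrc_power, n1_power, xrc_share):
    """Power of the first mixed signal n3(k) as a weighted sum of the two noise estimates."""
    return xrc_share * xrc_power + (1.0 - xrc_share) * n1_power


def first_mixture_ratio(e1_power, xrc_power, n1_power, xrc_share, eps=1e-6):
    """R1(k): speech-estimate power over mixed noise power, with a small constant added."""
    n3_power = mix_noise_power(xrc_power, n1_power, xrc_share)
    return (e1_power + eps) / (n3_power + eps)
```

Because the two noise estimates are combined before the division, only one ratio has to be computed per sample, and the zero pseudo-noise at start-up no longer produces an infinite intermediate value as long as `xrc_share` is 1 at that point.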

図５に示す推定部２０６の第２の内部構成は、図３に示す推定部２０６の第１の内部構成と等価である。すなわち、図３に示す第１の内部構成は、入力端子２０１における音声と雑音の振幅または電力の比に対して２つの推定値を信号比推定部３０１、３０２で生成し、それらを混合して第１混在比Ｒ１（ｋ）を算出する。図５に示す推定部２０６の第２の内部構成は、２種類の雑音の推定値、すなわち補正参照信号ｘＲＣ（ｋ）と擬似雑音ｎ１（ｋ）を混合して第１混合信号ｎ３（ｋ）を生成して分母を確定し、分子である音声の推定値ｅ１（ｋ）と作用させて第１混在比Ｒ１（ｋ）を算出する。これら２種類の構成が可能となったのは、図３に示す第１の内部構成と図５に示す第２の内部構成において、入力端子２０１における音声と雑音の振幅または電力の比を推定する際に、同一の分子、すなわち音声の推定値ｅ１（ｋ）を用いるからである。図５に示す推定部２０６の第２の内部構成は、図３に示す第１の内部構成よりも、構成要素が少なく、単純である。 The second internal configuration of the estimation unit 206 shown in FIG. 5 is equivalent to the first internal configuration of the estimation unit 206 shown in FIG. 3. That is, in the first internal configuration shown in FIG. 3, two estimates of the amplitude or power ratio of speech to noise at the input terminal 201 are generated by the signal ratio estimation units 301 and 302, and these estimates are mixed to calculate the first mixture ratio R1(k). The second internal configuration of the estimation unit 206 shown in FIG. 5 mixes two types of noise estimates, i.e., the corrected reference signal xRC(k) and the pseudo-noise n1(k), to generate the first mixed signal n3(k), determine the denominator, and then calculate the first mixture ratio R1(k) by combining this with the numerator, the speech estimate e1(k). These two types of configurations are possible because the first internal configuration shown in FIG. 3 and the second internal configuration shown in FIG. 5 use the same numerator, i.e., the speech estimate e1(k), when estimating the amplitude or power ratio of speech to noise at the input terminal 201. The second internal configuration of the estimation unit 206 shown in Figure 5 has fewer components and is simpler than the first internal configuration shown in Figure 3.

The operation of the mixer 506 is the same as that of the mixer 305, except that its inputs, the second and third mixture ratios, are replaced by the pseudo-noise and the reference signal (second mixed signal), respectively. The description of the operation of the mixer 305 therefore applies equally to the mixer 506.

以上の構成により、本実施形態は、適応フィルタ２０３の近似する音響インパルス応答の利得が１未満であっても、ステップサイズに特別な値を強制的に設定することなく、円滑に係数更新を行うことができ、結果として、雑音の消し残りが少なく、かつ、信号歪が少ない出力信号を得ることができる。 With the above configuration, this embodiment can smoothly update the coefficients without forcibly setting a special value for the step size, even if the gain of the acoustic impulse response approximated by the adaptive filter 203 is less than 1. As a result, it is possible to obtain an output signal with little residual noise and little signal distortion.

3. Third Embodiment
The description so far has assumed that the reference signal is captured near the noise source and is therefore the noise itself. In practice, this condition cannot always be met. In such cases the reference signal consists of the noise and a speech signal that leaks into it. This speech component mixed into the reference signal is called crosstalk. A configuration of a noise cancellation device for the case where crosstalk is present is disclosed in Patent Document 3.

本実施形態では、雑音の消去と同様に、クロストークを消去するための第２の適応フィルタを導入する。音声信号源から参照入力端子に至る音響経路（クロストークパス）のインパルス応答を近似する第２の適応フィルタを用いて、参照入力端子において混入する音声信号成分に対応した擬似クロストークを生成する。そして、参照入力端子に入力された信号（参照信号）からこの擬似クロストークを差し引くことによって、参照入力に混入する音声信号成分（クロストーク）を消去する。 In this embodiment, a second adaptive filter is introduced to cancel crosstalk, similar to the noise cancellation. The second adaptive filter, which approximates the impulse response of the acoustic path (crosstalk path) from the audio signal source to the reference input terminal, is used to generate pseudo-crosstalk corresponding to the audio signal components mixed in at the reference input terminal. Then, by subtracting this pseudo-crosstalk from the signal (reference signal) input to the reference input terminal, the audio signal components (crosstalk) mixed in the reference input are canceled.

本発明の第３実施形態としての雑音消去装置について、図６を用いて説明する。第２実施形態と比べた場合、本実施形態に係る雑音消去装置は、減算部２０４、適応フィルタ２０３に加えて、減算部８０４、適応フィルタ８０３とを備え、推定部２０６が推定部８０６に置換されている。その他の構成および動作は、第２実施形態と同様であるため、同じ構成には同じ符号を付して詳しい説明を省略する。 A noise cancellation device according to a third embodiment of the present invention will be described with reference to Figure 6. Compared to the second embodiment, the noise cancellation device according to this embodiment includes a subtraction unit 804 and an adaptive filter 803 in addition to the subtraction unit 204 and adaptive filter 203, and the estimation unit 206 has been replaced with an estimation unit 806. Other configurations and operations are similar to those of the second embodiment, and therefore the same components are designated by the same reference numerals and detailed descriptions will be omitted.

The noise cancellation device 800 uses an adaptive filter to transform a signal correlated with the crosstalk (third signal) to be cancelled, namely the output at the output terminal 205 (the estimated speech signal, or enhanced signal), and thereby generates the pseudo-crosstalk n2(k) (the estimate of the third signal). Subtracting n2(k) from the reference signal xR(k), in which speech and noise are mixed, cancels the crosstalk. When the coefficients of the adaptive filter 803 are updated, the step size is controlled using a fourth mixture ratio R4(k) that approximates the amplitude or power ratio of the fourth signal to the third signal. Coefficient updating therefore proceeds smoothly, and an output signal with little residual noise and little signal distortion is obtained.

The degraded signal xP(k) is supplied to the input terminal 201 as a sequence of sample values and passed to the subtraction unit 204. The reference signal xR(k) is supplied to the input terminal 202 as a sequence of sample values and passed to the subtraction unit 804.

減算部８０４には、入力端子２０２から参照信号ｘＲ（ｋ）が、適応フィルタ８０３から擬似クロストークｎ２（ｋ）が供給される。減算部８０４は、参照信号ｘＲ（ｋ）から擬似クロストークｎ２（ｋ）を減算し、その結果を雑音の推定値ｅ２（ｋ）（第４信号の推定値）として出力端子８０５に伝達すると同時に適応フィルタ８０３に帰還する。また、減算部８０４は、雑音の推定値ｅ２（ｋ）を推定部８０６に供給する。 The subtraction unit 804 receives the reference signal xR(k) from the input terminal 202 and the pseudo crosstalk n2(k) from the adaptive filter 803. The subtraction unit 804 subtracts the pseudo crosstalk n2(k) from the reference signal xR(k) and transmits the result as a noise estimate e2(k) (estimate of the fourth signal) to the output terminal 805, while simultaneously feeding it back to the adaptive filter 803. The subtraction unit 804 also supplies the noise estimate e2(k) to the estimation unit 806.

The adaptive filter 803 convolves the speech estimate e1(k) (the enhanced signal) with its filter coefficients and passes the result, as the pseudo-crosstalk n2(k) (the estimate of the third signal), to the subtraction unit 804 and the estimation unit 806. The adaptive filter 803 also supplies the coefficient vector w2(k) to the estimation unit 806.
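
The signal flow of the two cross-coupled filters can be sketched for one sample in Python. This is a filtering-only illustration with coefficient adaptation omitted. It assumes that the adaptive filter 803 operates on past values of e1(k) (a one-sample delay), so that the loop between the two subtraction units is computable. That delay, the function names, and the list-based history buffers are assumptions, not details taken from this description.

```python
def fir(coeffs, history):
    """Inner product of filter coefficients with the most recent input samples."""
    return sum(c * h for c, h in zip(coeffs, history))


def push(history, value):
    """Insert the newest sample at the front and drop the oldest one."""
    history.insert(0, value)
    history.pop()


def canceller_step(x_p, x_r, w1, w2, e2_hist, e1_hist):
    """Process one sample pair and return (e1, e2, n1, n2)."""
    n2 = fir(w2, e1_hist)  # pseudo-crosstalk from past speech estimates (filter 803)
    e2 = x_r - n2          # noise estimate (subtraction unit 804)
    push(e2_hist, e2)
    n1 = fir(w1, e2_hist)  # pseudo-noise from the noise estimate (filter 203)
    e1 = x_p - n1          # speech estimate (subtraction unit 204)
    push(e1_hist, e1)
    return e1, e2, n1, n2
```

The history lists must have the same lengths as the corresponding coefficient lists and are updated in place on each call.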

推定部８０６は、音声の推定値ｅ１（ｋ）、擬似雑音ｎ１（ｋ）（適応フィルタ２０３の出力）、雑音の推定値ｅ２（ｋ）（第４信号の推定値）、適応フィルタ２０３の係数ベクトルｗ１（ｋ）、および劣化信号ｘＰ（ｋ）を受けて、入力端子２０１における音声と雑音の振幅または電力の比を第１混在比Ｒ１（ｋ）として推定し、適応フィルタ２０３に伝達する。適応フィルタ２０３は、第１混在比Ｒ１（ｋ）が大きいときに小さなステップサイズμ１（ｋ）を、第１混在比Ｒ１（ｋ）が小さいときに大きなステップサイズμ１（ｋ）を用いて、係数を更新する。なお、図６に示す適応フィルタ２０３は、参照信号ｘＲ（ｋ）に代えて雑音の推定値ｅ２（ｋ）が入力される点で、図２に示す適応フィルタ２０３とは異なる。第１混在比Ｒ１（ｋ）、すなわち信号対雑音比の推定値を用いてステップサイズを制御する方法に関しては、特許文献１から３に詳細に開示されている。また、特許文献１から３に開示されているように、第１混在比Ｒ１（ｋ）を平均化してから、ステップサイズμ１（ｋ）の計算に用いてもよい。入力端子２０１における音声と雑音の振幅または電力の比に対する推定精度が向上する。 The estimation unit 806 receives the speech estimate e1(k), the pseudo-noise n1(k) (output of the adaptive filter 203), the noise estimate e2(k) (the estimate of the fourth signal), the coefficient vector w1(k) of the adaptive filter 203, and the noisy signal xP(k), estimates the amplitude or power ratio of speech to noise at the input terminal 201 as the first mixture ratio R1(k), and transmits this to the adaptive filter 203. The adaptive filter 203 updates the coefficients using a small step size μ1(k) when the first mixture ratio R1(k) is large and a large step size μ1(k) when the first mixture ratio R1(k) is small. Note that the adaptive filter 203 shown in FIG. 6 differs from the adaptive filter 203 shown in FIG. 2 in that the noise estimate e2(k) is input instead of the reference signal xR(k). Methods of controlling the step size using the first mixture ratio R1(k), i.e., an estimated value of the signal-to-noise ratio, are disclosed in detail in Patent Documents 1 to 3. Also, as disclosed in Patent Documents 1 to 3, the first mixture ratio R1(k) may be averaged and then used to calculate the step size μ1(k). This improves the estimation accuracy of the ratio of the amplitude or power of speech to noise at input terminal 201.

The estimation unit 806 further receives the pseudo crosstalk n2(k) (the output of the adaptive filter 803), the coefficient vector w2(k) of the adaptive filter 803, and the reference signal xR(k). From these it estimates the ratio of the amplitudes or powers of the noise and the crosstalk (the fourth signal and the third signal) at the input terminal 202 as the fourth mixture ratio R4(k), and passes it to the adaptive filter 803. The size of the coefficient vector w2(k) shown in equation (7) may be equal to or different from the size L of the coefficient vector w1(k).

The adaptive filter 803 updates its coefficients with a small step size μ2(k) when the fourth mixture ratio R4(k) is large and with a large step size μ2(k) when R4(k) is small. Methods for controlling the step size μ2(k) using the fourth mixture ratio R4(k), that is, an estimate of the noise-to-crosstalk ratio at the input terminal 202, are disclosed in detail in Patent Documents 1 to 3. As also disclosed in detail in Patent Documents 1 to 3, the fourth mixture ratio R4(k) may be averaged before it is used to calculate the step size μ2(k). Averaging improves the accuracy with which the ratio of the amplitudes or powers of the noise and the crosstalk at the input terminal 202 is estimated.

[3.1. First Configuration of Estimation Unit 806]
FIG. 7 is a block diagram showing a first internal configuration of the estimation unit 806. In addition to the configuration of the estimation unit 206, the estimation unit 806 includes a signal ratio estimation unit 901, a signal ratio estimation unit 902, a mixing unit 905, and a correction unit 710. The correction unit 310 shown in FIG. 7 differs from the correction unit 310 shown in FIG. 3 in two respects. First, its input is the noise estimate e2(k) (the estimate of the fourth signal) instead of the reference signal xR(k). Second, its output is the corrected noise estimate e2C(k) instead of the corrected reference signal xRC(k). In all other respects the two operate identically. The signal ratio estimation unit 302 shown in FIG. 7 is the same as the signal ratio estimation unit 302 shown in FIG. 3, except that it uses the corrected noise estimate e2C(k) instead of the corrected reference signal xRC(k).

The signal ratio estimation unit 901 receives the noise estimate e2(k) and the pseudo crosstalk n2(k) and estimates the ratio of the amplitudes or powers of the noise and the crosstalk as the fifth mixture ratio R5(k). The fifth mixture ratio R5(k) may be the ratio of the amplitudes or powers of the noise estimate e2(k) and the pseudo crosstalk n2(k), or it may be computed after a small constant has been added to those amplitudes or powers. Adding a small constant prevents the quotient from diverging when the denominator approaches zero. Either or both of the noise estimate e2(k) and the pseudo crosstalk n2(k) may also be averaged before use. Averaging improves the accuracy of the ratio calculation.
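The ratio computation described above, with a small constant and optional averaging, can be sketched as follows. The names `smoothed_power` and `mixture_ratio`, the first-order recursive average, and the values of `alpha` and `eps` are assumptions made for this example; the text only states that averaging and a small additive constant may be used.

```python
def smoothed_power(prev, sample, alpha=0.9):
    """First-order recursive average of the instantaneous power sample**2."""
    return alpha * prev + (1.0 - alpha) * sample * sample

def mixture_ratio(numerator_power, denominator_power, eps=1e-8):
    """Power ratio with a small constant added to both terms.

    The constant keeps the quotient finite when the denominator is zero.
    """
    return (numerator_power + eps) / (denominator_power + eps)
```

For the fifth mixture ratio R5(k), the numerator would be the averaged power of e2(k) and the denominator the averaged power of n2(k); the same helper applies to the other ratios by swapping the inputs.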

The correction unit 710 receives the reference signal xR(k) and the speech estimate e1(k), corrects the speech estimate e1(k), and obtains the corrected speech estimate e1C(k) (the corrected estimate of the first signal). The operation of the correction unit 710 is the same as that of the correction unit 310 once the noise estimate e2(k) is replaced with the speech estimate e1(k) and the degraded signal xP(k) is replaced with the reference signal xR(k). The scaling factor g2 used for the correction, however, is obtained differently.

Like the correction unit 310, the correction unit 710 can average the powers of the reference signal xR(k) and the speech estimate e1(k) over the M2 samples immediately after the noise canceller 800 starts operating, and take their ratio as the scaling factor g2. The natural number M2 is a predetermined constant and, as with the correction unit 310, can be set equal to the number of taps of the adaptive filter 803. When the noise is continuous and has no silent intervals, this method cannot yield an appropriate scaling factor g2, so g2 = g1 can be used instead. The reason is that g1 and g2 correspond to the same acoustic space, and because the reflectivity of that space dominates their values, they can be regarded as approximately equal. By using the product of the scaling factor g2 obtained in this way and the power of the speech estimate e1(k) in place of the power of the pseudo crosstalk n2(k), the ratio of the amplitudes or powers of the fourth signal and the third signal, that is, of the noise and the crosstalk at the input terminal 202, can be obtained more accurately.
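A minimal sketch of the scaling factor computation is given below. It assumes that g2 is the averaged power of xR(k) divided by the averaged power of e1(k), which is the direction implied by using g2 times the power of e1(k) in place of the power of n2(k). The function name `estimate_g2`, the flag `noise_has_silence`, and the constant `eps` are hypothetical.

```python
def estimate_g2(x_r, e1, m2, g1, noise_has_silence=True, eps=1e-12):
    """Estimate the scaling factor g2 from the first m2 samples.

    x_r: reference signal samples, e1: speech estimate samples.
    If the noise has no silent interval, fall back to g2 = g1.
    """
    if not noise_has_silence:
        return g1
    # Average powers over the first m2 samples after start-up.
    p_ref = sum(v * v for v in x_r[:m2]) / m2
    p_speech = sum(v * v for v in e1[:m2]) / m2
    return (p_ref + eps) / (p_speech + eps)
```

In practice `m2` would be the number of taps of the adaptive filter 803, as the text suggests, and the caller would decide whether the start-up interval is actually free of noise.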

The scaling factor g2 may be determined at any time and any number of times, as long as the amplitude of the noise (the fourth signal) is 0. Determining g2 more frequently allows more accurate tracking of changes in the impulse response of the acoustic system that the adaptive filter 803 approximates. At times other than immediately after start-up, the noise estimate e2(k) can be used to decide whether the noise amplitude is 0. Because the noise estimate e2(k) always contains some error even after the adaptive filter 803 has converged sufficiently, the decision allows for this error by comparing the noise estimate e2(k) with a predetermined threshold β2. A large threshold β2 makes the amplitude be judged 0 more often but increases decision errors. A small threshold β2 makes the amplitude be judged 0 less often, which degrades the tracking of changes in the acoustic impulse response.
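The threshold test can be sketched as below. The text only says that e2(k) is compared with β2; comparing the mean magnitude of a short block of samples is an assumption made here to tolerate the residual error in e2(k), and the name `noise_is_absent` is hypothetical.

```python
def noise_is_absent(e2_block, beta2):
    """Decide whether the noise amplitude can be treated as 0.

    e2_block: recent samples of the noise estimate e2(k).
    beta2: predetermined threshold; larger values flag silence more
    often at the cost of more false decisions.
    """
    mean_abs = sum(abs(v) for v in e2_block) / len(e2_block)
    return mean_abs < beta2
```

A caller could re-estimate g2 whenever this function returns true, which is how the trade-off between update frequency and decision errors described above would show up in practice.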

The signal ratio estimation unit 902 receives the noise estimate e2(k) and the corrected speech estimate e1C(k) (the corrected estimate of the first signal) and estimates the ratio of the amplitudes or powers of the noise and the crosstalk at the input terminal 202 as the sixth mixture ratio R6(k). The sixth mixture ratio R6(k) may be the ratio of the amplitudes or powers of the noise estimate e2(k) and the corrected speech estimate e1C(k), or it may be computed after a small constant has been added to those amplitudes or powers. Adding a small constant prevents the quotient from diverging when the denominator approaches zero. Either or both of the noise estimate e2(k) and the corrected speech estimate e1C(k) may also be averaged before use.

The mixing unit 905 mixes the fifth mixture ratio R5(k) and the sixth mixture ratio R6(k) using the coefficient vector w2(k) of the adaptive filter 803 and outputs the result as the fourth mixture ratio R4(k). The two ratios may be mixed by weighted addition based on the coefficient vector w2(k) of the adaptive filter 803, or by a more complex function such as a higher-order polynomial. Before mixing, either or both of the fifth mixture ratio R5(k) and the sixth mixture ratio R6(k) may be averaged. Averaging improves the calculation accuracy of the fourth mixture ratio R4(k), that is, the accuracy with which the ratio of the amplitudes or powers of the noise and the crosstalk is approximated.

The operation of the mixing unit 905 is the same as that of the mixing unit 305 once the input and output signals of the mixing unit 305 are replaced with those of the mixing unit 905 shown in FIG. 7, so a detailed description is omitted.
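One possible form of the weighted addition is sketched below. By analogy with the behavior the document attributes to the mixing unit 305, the sketch gives full weight to R6(k) while the coefficient vector w2(k) is still changing quickly and full weight to R5(k) once the change has become small. The normalized change measure, the reference level `change_ref`, and the function names are assumptions for illustration only.

```python
def coefficient_change(w_prev, w_curr, eps=1e-12):
    """Squared norm of the coefficient change, normalized by the
    squared norm of the current coefficient vector."""
    num = sum((a - b) ** 2 for a, b in zip(w_curr, w_prev))
    den = sum(a * a for a in w_curr) + eps
    return num / den

def mix_ratios(r5, r6, change, change_ref=1e-2):
    """Weighted addition of R5(k) and R6(k).

    The weight of R6(k) is 1 while the change is at or above change_ref
    and falls linearly to 0 as the change vanishes.
    """
    weight_r6 = min(1.0, change / change_ref)
    return weight_r6 * r6 + (1.0 - weight_r6) * r5
```

Before the first coefficient update the measured change is zero, so a caller following this sketch would initialize `change` to a value at or above `change_ref` to start with R6(k) alone.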

[3.3. Second Configuration of Estimation Unit 806]
FIG. 8 is a block diagram showing a second internal configuration of the estimation unit 806. In addition to the mixing unit 506, the signal ratio estimation unit 503, and the correction unit 310, the estimation unit 806 includes a mixing unit 1106, a signal ratio estimation unit 1103, and a correction unit 710. The correction units 310 and 710 in FIG. 8 are identical to the correction units 310 and 710 in FIG. 7 in their input and output signals and in their operation, so a detailed description is omitted. The mixing unit 506 in FIG. 8 receives the corrected noise estimate e2C(k) in place of the corrected reference signal xRC(k) (second mixture signal) received by the mixing unit 506 in FIG. 5. The operation of the mixing unit 506 in FIG. 8 is identical to that of the mixing unit 506 in FIG. 5, so a detailed description is omitted.

The mixing unit 1106 mixes the corrected speech estimate e1C(k) (the corrected enhanced signal, that is, the corrected estimate of the first signal) and the pseudo crosstalk n2(k) (the estimate of the third signal) using the coefficient vector w2(k) of the adaptive filter 803 to generate a second mixed signal n4(k). The operation of the mixing unit 1106 is the same as that of the mixing unit 506 once the input and output signals of the mixing unit 506 are replaced with those of the mixing unit 1106 shown in FIG. 8, so a detailed description is omitted.

The signal ratio estimation unit 1103 receives the noise estimate e2(k) and the second mixed signal n4(k) and estimates the ratio of the amplitudes or powers of the noise and the crosstalk as the fourth mixture ratio R4(k). The fourth mixture ratio R4(k) may be the ratio of the amplitudes or powers of the noise estimate e2(k) and the second mixed signal n4(k), or it may be computed after a small constant has been added to those amplitudes or powers. Adding a small constant prevents the quotient from diverging when the denominator approaches zero. Either or both of the noise estimate e2(k) and the second mixed signal n4(k) may also be averaged before use. Averaging improves the accuracy of the ratio calculation.

The second internal configuration of the estimation unit 806 shown in FIG. 8 is equivalent to the first internal configuration shown in FIG. 7. The first internal configuration generates two estimates of the ratio of the amplitudes or powers of the noise and the crosstalk in the signal ratio estimation units 901 and 902, and mixes them to calculate the fourth mixture ratio R4(k). The second internal configuration mixes two estimates, namely the corrected speech estimate e1C(k) and the pseudo crosstalk n2(k), using the coefficient vector w2(k) of the adaptive filter 803 to generate the second mixed signal n4(k). This fixes the denominator, which is then combined with the numerator, the noise estimate e2(k), to calculate the fourth mixture ratio R4(k). Both configurations are possible because each uses the same numerator, the noise estimate e2(k), when estimating the ratio of the amplitudes or powers of the noise and the crosstalk at the input terminal 202. The second internal configuration shown in FIG. 8 has fewer components and is simpler than the first internal configuration shown in FIG. 7.
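The two configurations can be compared with a short sketch. It assumes block-averaged powers, a scalar mixing weight in place of whatever weight is derived from w2(k), and simple weighted addition in both cases; the function names and `eps` are hypothetical. Under these assumptions the two results coincide when the weight is 0 or 1, and for intermediate weights the outcome depends on the mixing rule, which the text leaves open.

```python
def block_power(x):
    """Average power of a block of samples."""
    return sum(v * v for v in x) / len(x)

def ratio(num, den, eps=1e-8):
    """Power ratio with a small constant against division by zero."""
    return (num + eps) / (den + eps)

def r4_first_configuration(e2, n2, e1c, weight):
    """FIG. 7 style: form R5(k) and R6(k), then mix the two ratios."""
    r5 = ratio(block_power(e2), block_power(n2))
    r6 = ratio(block_power(e2), block_power(e1c))
    return weight * r6 + (1.0 - weight) * r5

def r4_second_configuration(e2, n2, e1c, weight):
    """FIG. 8 style: mix the signals into n4(k), then form one ratio."""
    n4 = [weight * a + (1.0 - weight) * b for a, b in zip(e1c, n2)]
    return ratio(block_power(e2), block_power(n4))
```

The second function needs only one ratio computation, which reflects the remark that the second configuration has fewer components.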

With the above configuration, this embodiment can update the coefficients smoothly without forcing the step size to a special value, even when the gain of the acoustic impulse response approximated by the adaptive filters 203 and 803 is less than 1 and crosstalk is present. As a result, an output signal with little residual noise and little signal distortion can be obtained.

[4. Other Embodiments]
Although multiple embodiments of the present invention have been described in detail above, any system or device that combines, in any way, the separate features included in the respective embodiments also falls within the scope of the present invention.

The present invention may be applied to a system consisting of multiple devices or to a stand-alone device. The present invention is also applicable when an information processing program (signal processing program) that realizes the functions of the above-described embodiments is supplied directly or remotely to a system or device. Such a program is executed by a processor, such as a DSP (Digital Signal Processor), that constitutes the signal processing device or the noise canceller. Furthermore, a program installed on a computer to realize the functions of the present invention on that computer, a medium storing such a program, and a WWW (World Wide Web) server from which such a program is downloaded also fall within the scope of the present invention.

FIG. 9 is a configuration diagram of a computer 1200 that executes a signal processing program when the first to third embodiments are implemented by that program. The computer 1200 includes an input unit 1201, a processor 1203, an output unit 1202, and a memory 1204.

The processor 1203 controls the operation of the computer 1200 by reading the signal processing program stored in the memory 1204. The processor 1203 is, for example, a DSP, a CPU (Central Processing Unit), or an MPU (Micro-Processing Unit). The memory 1204 includes one or more of RAM (Random Access Memory), ROM (Read-Only Memory), flash memory, EPROM (Erasable Programmable Read-Only Memory), and EEPROM (registered trademark) (Electrically Erasable Programmable Read-Only Memory).

FIG. 10 is a flowchart showing an example of signal processing by the processor of the computer shown in FIG. 9. The example in FIG. 10 is the flowchart for the case where the computer 1200 functions as the signal processing device 100 according to the first embodiment.

As shown in FIG. 10, in step S10 the processor 1203 executing the signal processing program first receives, from the input unit 1201, a first mixture signal xP(k) in which a first signal and a second signal are mixed, and a second mixture signal xR(k) in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed.
In step S11, the processor 1203 processes the second mixture signal xR(k) with a first adaptive filter (adaptive filter 103) to generate an estimate n1(k) of the second signal, and in step S12 it generates an estimate e1(k) of the first signal from the first mixture signal xP(k) and the estimate n1(k) of the second signal.
In step S13, the processor 1203 estimates the ratio of the amplitudes or powers of the first signal and the second signal as a first mixture ratio R1(k), using the estimate e1(k) of the first signal, the estimate n1(k) of the second signal, the second mixture signal xR(k), the first mixture signal xP(k), and the coefficient vector w1(k) of the first adaptive filter (adaptive filter 103).
In step S14, the processor 1203 controls the generation of the estimate n1(k) of the second signal using the first mixture ratio R1(k). This yields the same effects as the first embodiment.
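Steps S10 to S14 can be sketched as a per-sample loop. Several simplifications are assumptions of this example and not taken from the text: the coefficients are updated with a normalized LMS rule, R1(k) is estimated only from smoothed powers of e1(k) and n1(k) rather than from all the inputs listed for step S13, and the step size mapping with its lower bound `mu_min` is hypothetical. The function name `run_canceller` is also hypothetical.

```python
def run_canceller(x_p, x_r, taps=2, mu_min=0.05, mu_max=0.5, eps=1e-8):
    """Process paired samples of xP(k) and xR(k); return e1(k) and w1."""
    w1 = [0.0] * taps          # coefficient vector of the first adaptive filter
    buf = [0.0] * taps         # most recent samples of xR(k), newest first
    p_e1 = 0.0                 # smoothed power of e1(k)
    p_n1 = 0.0                 # smoothed power of n1(k)
    e1_out = []
    for xp_k, xr_k in zip(x_p, x_r):                    # S10: input both signals
        buf = [xr_k] + buf[:-1]
        n1 = sum(w * x for w, x in zip(w1, buf))        # S11: estimate 2nd signal
        e1 = xp_k - n1                                  # S12: estimate 1st signal
        p_e1 = 0.9 * p_e1 + 0.1 * e1 * e1               # S13: simplified R1(k)
        p_n1 = 0.9 * p_n1 + 0.1 * n1 * n1
        r1 = (p_e1 + eps) / (p_n1 + eps)
        # S14: large R1(k) -> small step size, bounded to [mu_min, mu_max].
        mu1 = min(mu_max, max(mu_min, mu_max / (1.0 + r1)))
        norm = sum(x * x for x in buf) + eps
        w1 = [w + mu1 * e1 * x / norm for w, x in zip(w1, buf)]
        e1_out.append(e1)
    return e1_out, w1
```

The lower bound `mu_min` is there because, with all coefficients initially zero, n1(k) is zero and this simplified R1(k) is very large at start-up; without a floor the step size would stay near zero and adaptation would stall.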

When the computer 1200 functions as the signal processing device according to the second embodiment, the processor 1203 receives the first mixture signal xP(k) and the second mixture signal xR(k) in step S10, and in step S11 processes the second mixture signal xR(k) with the first adaptive filter (adaptive filter 203) to generate the estimate n1(k) of the second signal. In step S12, the processor 1203 generates the estimate e1(k) of the first signal from the first mixture signal xP(k) and the estimate n1(k) of the second signal. In step S13, the processor 1203 generates the first mixture ratio R1(k) using the estimate e1(k) of the first signal, the estimate n1(k) of the second signal, the second mixture signal xR(k), the first mixture signal xP(k), and the coefficients 141 of the first adaptive filter (adaptive filter 203). In step S14, the processor 1203 controls the generation of the estimate n1(k) of the second signal using the first mixture ratio R1(k).

When the computer 1200 functions as the signal processing device according to the third embodiment, in step S13 the processor 1203 generates the first mixture ratio R1(k) using the estimate e1(k) of the first signal, the estimate n1(k) of the second signal, the estimate e2(k) of the fourth signal, the first mixture signal xP(k), and the coefficients 141 of the first adaptive filter (adaptive filter 203). In addition, the processor 1203 processes the estimate e1(k) of the first signal with the second adaptive filter (adaptive filter 803) to generate the estimate n2(k) of the third signal, and subtracts the estimate n2(k) of the third signal from the second mixture signal xR(k) to generate the estimate e2(k) of the fourth signal. The processor 1203 also processes, in the first adaptive filter (adaptive filter 203), the estimate e2(k) of the fourth signal in place of the second mixture signal xR(k). It further estimates the ratio of the amplitudes or powers of the noise and the crosstalk as the fourth mixture ratio R4(k), using the estimate e2(k) of the fourth signal, the estimate n2(k) of the third signal, the estimate e1(k) of the first signal, the second mixture signal xR(k), and the coefficients of the second adaptive filter (adaptive filter 803).
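The cross-coupled signal flow of the third embodiment can be sketched for a single sample as follows. Coefficient updates are left out to keep the flow visible. One assumption is made to avoid a delay-free loop: the second adaptive filter operates on past speech estimates e1(k-1), e1(k-2), and so on, which the text does not state. The function name `cross_coupled_step` and the buffer layout are hypothetical.

```python
def cross_coupled_step(x_p, x_r, w1, w2, e1_buf, e2_buf):
    """One sample of the two-filter signal flow, without adaptation.

    e1_buf: past speech estimates, newest first (input of filter w2).
    e2_buf: past noise estimates, newest first (input of filter w1).
    Returns e1(k), e2(k) and the updated buffers.
    """
    # Second adaptive filter: pseudo crosstalk n2(k) from past e1 samples.
    n2 = sum(w * v for w, v in zip(w2, e1_buf))
    # Remove the crosstalk from the reference signal: e2(k) = xR(k) - n2(k).
    e2 = x_r - n2
    e2_buf = [e2] + e2_buf[:-1]
    # First adaptive filter now takes e2(k) instead of xR(k).
    n1 = sum(w * v for w, v in zip(w1, e2_buf))
    # Remove the pseudo noise from the degraded signal: e1(k) = xP(k) - n1(k).
    e1 = x_p - n1
    e1_buf = [e1] + e1_buf[:-1]
    return e1, e2, e1_buf, e2_buf
```

With all coefficients of `w2` set to zero, e2(k) equals xR(k) and the flow reduces to the single-filter structure of the second embodiment.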

[5. Other]
Among the processes described in the above embodiments, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed as desired unless otherwise specified. For example, the various information shown in each drawing is not limited to what is illustrated.

The components of each device shown in the drawings are functional concepts and do not necessarily have to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to that shown, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

[6. Effects]
As described above, the signal processing device 100 according to the first embodiment includes a first input unit 101 (an example of a first input means), a second input unit 102 (an example of a second input means), an adaptive filter 103 (an example of a first adaptive filter), a subtraction unit 104 (an example of a first subtraction unit), and an estimation unit 106. The first input unit 101 receives a first mixture signal xP(k) in which a first signal and a second signal are mixed. The second input unit 102 receives a second mixture signal xR(k) in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed. The adaptive filter 103 filters the second mixture signal xR(k) to generate an estimate n1(k) of the second signal. The subtraction unit 104 generates an estimate e1(k) of the first signal from the first mixture signal xP(k) and the estimate n1(k) of the second signal. The estimation unit 106 estimates the ratio of the amplitudes or powers of the first signal and the second signal at the input terminal 201 as a first mixture ratio R1(k), using the estimate e1(k) of the first signal, the estimate n1(k) of the second signal, the second mixture signal xR(k), the first mixture signal xP(k), and the coefficients 141 of the adaptive filter 103. When the first mixture ratio R1(k) obtained by the estimation unit 106 is large, the coefficient update control unit 107 outputs to the adaptive filter 103 a control signal μ1(k) that reduces the amount by which the coefficients 141 of the adaptive filter 103 are updated. The signal processing device 100 controls the adaptive filter 103 using the control signal μ1(k). As a result, the signal processing device 100 can achieve both fast convergence and a low-distortion output signal without manual control of the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

The signal processing device according to the second embodiment includes an input terminal 201 (an example of a first input means), an input terminal 202 (an example of a second input means), an adaptive filter 203 (an example of a first adaptive filter), a subtraction unit 204 (an example of a first subtraction unit), and an estimation unit 206. The input terminal 201 receives a degraded signal xP(k) (an example of a first mixture signal) in which a speech signal (an example of a first signal) and noise (an example of a second signal) are mixed. The input terminal 202 receives a reference signal xR(k) (an example of a second mixture signal) in which a signal correlated with the speech signal (an example of a third signal) and a signal correlated with the noise (an example of a fourth signal) are mixed. The adaptive filter 203 filters the reference signal xR(k) to generate pseudo noise n1(k) (an example of an estimate of the second signal). The subtraction unit 204 generates a speech estimate e1(k) (an example of an estimate of the first signal) from the degraded signal xP(k) and the pseudo noise n1(k). The estimation unit 206 estimates the ratio of the amplitudes or powers of the speech signal and the noise at the input terminal 201 as a first mixture ratio R1(k), using the speech estimate e1(k), the pseudo noise n1(k), the reference signal xR(k), the degraded signal xP(k), and the coefficient vector w1(k) of the adaptive filter 203. The noise canceller 200 controls the adaptive filter 203 using the first mixture ratio R1(k). As a result, the signal processing device according to the second embodiment can achieve both fast convergence and a low-distortion output signal without manual control of the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

The estimation unit 206 also includes a signal ratio estimation unit 301 (an example of a first signal ratio estimation unit), a signal ratio estimation unit 302 (an example of a second signal ratio estimation unit), a correction unit 310, and a mixing unit 305 (an example of a first mixing unit). The signal ratio estimation unit 301 uses the speech estimate e1(k) and the pseudo noise n1(k) to estimate the ratio of the amplitudes or powers of the speech signal and the noise at the input terminal 201 as a second mixture ratio R2(k). The correction unit 310 receives the reference signal xR(k) and the degraded signal xP(k) and obtains a corrected reference signal xRC(k). The signal ratio estimation unit 302 uses the speech estimate e1(k) and the corrected reference signal xRC(k) to estimate the ratio of the amplitudes or powers of the speech signal and the noise at the input terminal 201 as a third mixture ratio R3(k). The mixing unit 305 mixes the second mixture ratio R2(k) and the third mixture ratio R3(k) based on the change over time of the coefficient vector w1(k) of the adaptive filter 203 to generate the first mixture ratio R1(k). As a result, the signal processing device according to the second embodiment can achieve both fast convergence and a low-distortion output signal without manual control of the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

また、混合部３０５は、適応フィルタ２０３の係数更新開始時に第１混在比Ｒ１（ｋ）における第３混在比Ｒ３（ｋ）の含有割合を１００％に設定し、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化が十分に小さくなったとき、第１混在比Ｒ１（ｋ）における第３混在比Ｒ３（ｋ）の含有割合を０％に設定する。これにより、第２実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 Furthermore, the mixing unit 305 sets the proportion of the third mixture ratio R3(k) in the first mixture ratio R1(k) to 100% when coefficient updating of the adaptive filter 203 starts, and sets that proportion to 0% when the time change of the coefficient vector w1(k) of the adaptive filter 203 becomes sufficiently small. As a result, the signal processing device according to the second embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.
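The behavior of the mixing unit 305 can be illustrated with a minimal Python sketch. The text fixes only the two end points (100% of R3(k) at the start of coefficient updating, 0% once the coefficient change is sufficiently small); the linear transition between them, the function names, and the parameters `delta_start` and `delta_small` are assumptions introduced here for illustration and are not taken from the embodiment.

```python
def blend_weight(delta, delta_start, delta_small):
    """Proportion of R3(k) in R1(k), in the range 0.0..1.0.

    delta       : current time change of the coefficient vector w1(k)
    delta_start : change observed when coefficient updating starts (assumed)
    delta_small : threshold below which the change counts as "sufficiently small" (assumed)
    """
    if delta_start <= delta_small:
        return 0.0
    # Linear interpolation between the two end points (an assumption).
    a = (delta - delta_small) / (delta_start - delta_small)
    return min(1.0, max(0.0, a))


def mix_ratios(r2, r3, alpha):
    """First mixture ratio R1(k) as a weighted mix of R2(k) and R3(k)."""
    return alpha * r3 + (1.0 - alpha) * r2


# At the start of adaptation R1(k) equals R3(k); after convergence it equals R2(k).
r1_start = mix_ratios(2.0, 10.0, blend_weight(1.0, 1.0, 0.01))
r1_converged = mix_ratios(2.0, 10.0, blend_weight(0.005, 1.0, 0.01))
```

Any monotonic transition between the two end points would satisfy the description equally well; the linear form is chosen here only because it is the simplest.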

また、推定部２０６は、補正部３１０と、混合部５０６（第２混合部の一例に相当）と、信号比推定部５０３（第３信号比推定部の一例に相当）とを備える。補正部３１０は、参照信号ｘＲ（ｋ）と劣化信号ｘＰ（ｋ）を入力として、補正参照信号ｘＲＣ（ｋ）を求める。混合部５０６は、補正参照信号ｘＲＣ（ｋ）と擬似雑音ｎ１（ｋ）とを、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化に基づいて混合して第１混合信号ｎ３（ｋ）を生成する。信号比推定部５０３は、第１混合信号ｎ３（ｋ）と音声の推定値ｅ１（ｋ）とを用いて、入力端子２０１における音声信号と雑音の振幅または電力の比を第１混在比Ｒ１（ｋ）として推定する。これにより、第２実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 The estimation unit 206 also includes a correction unit 310, a mixing unit 506 (corresponding to an example of a second mixing unit), and a signal ratio estimation unit 503 (corresponding to an example of a third signal ratio estimation unit). The correction unit 310 receives the reference signal xR(k) and the noisy signal xP(k) as inputs and calculates a corrected reference signal xRC(k). The mixing unit 506 mixes the corrected reference signal xRC(k) and the pseudo-noise n1(k) based on the time change of the coefficient vector w1(k) of the adaptive filter 203 to generate a first mixed signal n3(k). The signal ratio estimation unit 503 uses the first mixed signal n3(k) and the speech estimate e1(k) to estimate the amplitude or power ratio of the speech signal to the noise at the input terminal 201 as a first mixture ratio R1(k). As a result, the signal processing device according to the second embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

また、混合部５０６は、適応フィルタ２０３の係数更新開始時に第１混合信号ｎ３（ｋ）における補正参照信号ｘＲＣ（ｋ）の含有割合を１００％に設定し、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化が十分に小さくなったとき、第１混合信号ｎ３（ｋ）における補正参照信号ｘＲＣ（ｋ）の含有割合を０％に設定する。これにより、第２実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 Furthermore, the mixing unit 506 sets the proportion of the corrected reference signal xRC(k) in the first mixed signal n3(k) to 100% when coefficient updating of the adaptive filter 203 starts, and sets that proportion to 0% when the time change of the coefficient vector w1(k) of the adaptive filter 203 becomes sufficiently small. As a result, the signal processing device according to the second embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step size μ1(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.
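The signal-domain variant (mixing unit 506 followed by signal ratio estimation unit 503) can be sketched in the same way. The mixing weight `alpha`, the block-averaged power estimate, and the small constant `eps` are assumptions made for this illustration; this passage states only that the two signals are mixed according to the coefficient change and that an amplitude or power ratio is then estimated.

```python
def mix_signals(x_rc, n1, alpha):
    """First mixed signal n3(k): alpha = 1.0 gives xRC(k) only, alpha = 0.0 gives n1(k) only."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(x_rc, n1)]


def power_ratio(e1, n3, eps=1e-12):
    """Speech-to-noise power ratio from block-averaged powers (assumed estimator)."""
    p_speech = sum(v * v for v in e1) / len(e1)
    p_noise = sum(v * v for v in n3) / len(n3)
    return p_speech / (p_noise + eps)  # eps guards against division by zero


# Early in adaptation the noise power comes from the corrected reference signal.
n3 = mix_signals([1.0, 1.0], [3.0, 4.0], alpha=1.0)
r1 = power_ratio([2.0, 2.0], n3)
```

Compared with mixing two ratios, this arrangement needs only one ratio estimator, because the mixing happens before the power is measured.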

また、第３実施形態に係る信号処理装置は、入力端子２０１と、入力端子２０２と、適応フィルタ２０３と、減算部２０４と、減算部８０４（第２減算部の一例に相当）と、適応フィルタ８０３（第２適応フィルタの一例に相当）と、推定部８０６とを備える。適応フィルタ８０３は、第１信号の推定値ｅ１（ｋ）をフィルタ処理して擬似クロストークｎ２（ｋ）（第３信号の推定値の一例に相当）を生成する。減算部８０４は、参照信号ｘＲ（ｋ）から擬似クロストークｎ２（ｋ）を減算して雑音の推定値ｅ２（ｋ）（第４信号の推定値の一例に相当）を生成する。適応フィルタ２０３は、参照信号ｘＲ（ｋ）に代えて雑音の推定値ｅ２（ｋ）を入力とする。推定部８０６は、推定部２０６の機能に加えて、雑音の推定値ｅ２（ｋ）と擬似クロストークｎ２（ｋ）と第１信号の推定値ｅ１（ｋ）と参照信号ｘＲ（ｋ）と適応フィルタ８０３の係数とをさらに用いて、入力端子２０２における雑音とクロストークの振幅または電力の比を第４混在比Ｒ４（ｋ）としてさらに推定する。第３実施形態に係る信号処理装置は、第４混在比Ｒ４（ｋ）を用いて適応フィルタ８０３を制御する。これにより、第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 The signal processing device according to the third embodiment includes an input terminal 201, an input terminal 202, an adaptive filter 203, a subtraction unit 204, a subtraction unit 804 (corresponding to an example of a second subtraction unit), an adaptive filter 803 (corresponding to an example of a second adaptive filter), and an estimation unit 806. The adaptive filter 803 filters the estimate e1(k) of the first signal to generate pseudo crosstalk n2(k) (corresponding to an example of an estimate of the third signal). The subtraction unit 804 subtracts the pseudo crosstalk n2(k) from the reference signal xR(k) to generate a noise estimate e2(k) (corresponding to an example of an estimate of the fourth signal). The adaptive filter 203 receives the noise estimate e2(k) as input instead of the reference signal xR(k). In addition to the functions of the estimation unit 206, the estimation unit 806 further uses the noise estimate e2(k), the pseudo crosstalk n2(k), the estimate e1(k) of the first signal, the reference signal xR(k), and the coefficients of the adaptive filter 803 to estimate the amplitude or power ratio of the noise to the crosstalk at the input terminal 202 as a fourth mixture ratio R4(k). The signal processing device according to the third embodiment controls the adaptive filter 803 using the fourth mixture ratio R4(k). As a result, the signal processing device according to the third embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step size μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.
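The signal flow of the third embodiment (two cross-coupled filters and two subtractions) can be sketched for one sample period as follows. Coefficient updating and the step-size control by R1(k) and R4(k) are deliberately omitted, and the coefficients are held fixed. The class name, the use of only past speech estimates e1(k-1), e1(k-2), ... as input to the adaptive filter 803 (to resolve the feedback loop between the two filters), and the zero initial state are assumptions of this sketch, not details taken from the embodiment.

```python
from collections import deque


class CrossCoupledCanceller:
    """Signal flow only: filter 803 -> subtraction 804 -> filter 203 -> subtraction 204."""

    def __init__(self, w1, w2):
        self.w1 = list(w1)  # coefficients of adaptive filter 203 (noise path)
        self.w2 = list(w2)  # coefficients of adaptive filter 803 (crosstalk path)
        self.e2_hist = deque([0.0] * len(self.w1), maxlen=len(self.w1))
        self.e1_hist = deque([0.0] * len(self.w2), maxlen=len(self.w2))

    def step(self, x_p, x_r):
        # Pseudo crosstalk n2(k) from past speech estimates (assumed one-sample delay).
        n2 = sum(c * v for c, v in zip(self.w2, self.e1_hist))
        e2 = x_r - n2                    # noise estimate e2(k), subtraction unit 804
        self.e2_hist.appendleft(e2)
        # Pseudo-noise n1(k) from the noise estimate instead of xR(k).
        n1 = sum(c * v for c, v in zip(self.w1, self.e2_hist))
        e1 = x_p - n1                    # speech estimate e1(k), subtraction unit 204
        self.e1_hist.appendleft(e1)
        return e1, e2


canceller = CrossCoupledCanceller(w1=[0.5], w2=[0.25])
outputs = [canceller.step(x_p, x_r) for x_p, x_r in [(1.0, 2.0), (2.0, 2.0), (2.0, 2.0)]]
```

In a complete device each filter's coefficient update would be driven by its own error signal, with the step size derived from the corresponding mixture ratio.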

また、推定部８０６は、信号比推定部３０１と、補正部３１０と、信号比推定部３０２と、混合部３０５と、信号比推定部９０１（第４信号比推定部の一例に相当）と、補正部７１０と、信号比推定部９０２（第５信号比推定部の一例に相当）と、混合部９０５（第３混合部の一例に相当）とを備える。信号比推定部３０１は、音声の推定値ｅ１（ｋ）と擬似雑音ｎ１（ｋ）とを用いて、入力端子２０１における音声信号と雑音の振幅または電力の比を第２混在比Ｒ２（ｋ）として推定する。補正部３１０は、雑音の推定値ｅ２（ｋ）と劣化信号ｘＰ（ｋ）を入力として、雑音の補正推定値ｅ２Ｃ（ｋ）を求める。信号比推定部３０２は、音声の推定値ｅ１（ｋ）と雑音の補正推定値ｅ２Ｃ（ｋ）とを用いて、入力端子２０１における音声と雑音の振幅または電力の比を第３混在比Ｒ３（ｋ）として推定する。混合部３０５は、第２混在比Ｒ２（ｋ）と第３混在比Ｒ３（ｋ）を、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化に基づいて混合して第１混在比Ｒ１（ｋ）を生成する。信号比推定部９０１は、雑音の推定値ｅ２（ｋ）と擬似クロストークｎ２（ｋ）とを用いて、入力端子２０２における雑音とクロストークの振幅または電力の比を第５混在比Ｒ５（ｋ）として推定する。補正部７１０は、音声の推定値ｅ１（ｋ）と参照信号ｘＲ（ｋ）を入力として、音声の補正推定値ｅ１Ｃ（ｋ）を求める。信号比推定部９０２は、雑音の推定値ｅ２（ｋ）と音声の補正推定値ｅ１Ｃ（ｋ）とを用いて、入力端子２０２における雑音とクロストークの振幅または電力の比を第６混在比Ｒ６（ｋ）として推定する。混合部９０５は、第５混在比Ｒ５（ｋ）と第６混在比Ｒ６（ｋ）を、適応フィルタ８０３の係数ベクトルｗ２（ｋ）の時間変化に基づいて混合して第４混在比Ｒ４（ｋ）を生成する。これにより、第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ），μ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 The estimation unit 806 also includes a signal ratio estimation unit 301, a correction unit 310, a signal ratio estimation unit 302, a mixing unit 305, a signal ratio estimation unit 901 (corresponding to an example of a fourth signal ratio estimation unit), a correction unit 710, a signal ratio estimation unit 902 (corresponding to an example of a fifth signal ratio estimation unit), and a mixing unit 905 (corresponding to an example of a third mixing unit). The signal ratio estimation unit 301 uses the speech estimate e1(k) and the pseudo-noise n1(k) to estimate the amplitude or power ratio of the speech signal to the noise at the input terminal 201 as a second mixture ratio R2(k). The correction unit 310 receives the noise estimate e2(k) and the noisy signal xP(k) as inputs and calculates a corrected noise estimate e2C(k). The signal ratio estimation unit 302 uses the speech estimate e1(k) and the corrected noise estimate e2C(k) to estimate the amplitude or power ratio of the speech to the noise at the input terminal 201 as a third mixture ratio R3(k). The mixing unit 305 mixes the second mixture ratio R2(k) and the third mixture ratio R3(k) based on the time change of the coefficient vector w1(k) of the adaptive filter 203 to generate the first mixture ratio R1(k). The signal ratio estimation unit 901 uses the noise estimate e2(k) and the pseudo crosstalk n2(k) to estimate the amplitude or power ratio of the noise to the crosstalk at the input terminal 202 as a fifth mixture ratio R5(k). The correction unit 710 receives the speech estimate e1(k) and the reference signal xR(k) as inputs and calculates a corrected speech estimate e1C(k). The signal ratio estimation unit 902 uses the noise estimate e2(k) and the corrected speech estimate e1C(k) to estimate the amplitude or power ratio of the noise to the crosstalk at the input terminal 202 as a sixth mixture ratio R6(k). The mixing unit 905 mixes the fifth mixture ratio R5(k) and the sixth mixture ratio R6(k) based on the time change of the coefficient vector w2(k) of the adaptive filter 803 to generate the fourth mixture ratio R4(k). As a result, the signal processing device according to the third embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step sizes μ1(k) and μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

また、推定部８０６は、補正部３１０と、混合部５０６と、信号比推定部５０３と、補正部７１０と、混合部１１０６（第４混合部の一例に相当）と、信号比推定部１１０３（第６信号比推定部の一例に相当）とを備える。補正部３１０は、雑音の推定値ｅ２（ｋ）と劣化信号ｘＰ（ｋ）を入力として、雑音の補正推定値ｅ２Ｃ（ｋ）を求める。混合部５０６は、雑音の補正推定値ｅ２Ｃ（ｋ）と擬似雑音ｎ１（ｋ）を、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化に基づいて混合して第１混合信号ｎ３（ｋ）を生成する。信号比推定部５０３は、第１混合信号ｎ３（ｋ）と音声の推定値ｅ１（ｋ）を用いて、入力端子２０１における音声と雑音の振幅または電力の比を第１混在比Ｒ１（ｋ）として推定する。補正部７１０は、音声の推定値ｅ１（ｋ）と参照信号ｘＲ（ｋ）を入力として、音声の補正推定値ｅ１Ｃ（ｋ）を求める。混合部１１０６は、音声の補正推定値ｅ１Ｃ（ｋ）と擬似クロストークｎ２（ｋ）を、適応フィルタ８０３の係数ベクトルｗ２（ｋ）の時間変化に基づいて混合して第２混合信号ｎ４（ｋ）を生成する。信号比推定部１１０３は、第２混合信号ｎ４（ｋ）と雑音の推定値ｅ２（ｋ）を用いて、入力端子２０２における雑音とクロストークの振幅または電力の比を第４混在比Ｒ４（ｋ）として推定する。これにより、第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ），μ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 The estimation unit 806 also includes a correction unit 310, a mixing unit 506, a signal ratio estimation unit 503, a correction unit 710, a mixing unit 1106 (corresponding to an example of a fourth mixing unit), and a signal ratio estimation unit 1103 (corresponding to an example of a sixth signal ratio estimation unit). The correction unit 310 receives the noise estimate e2(k) and the noisy signal xP(k) as inputs and calculates a corrected noise estimate e2C(k). The mixing unit 506 mixes the corrected noise estimate e2C(k) and the pseudo-noise n1(k) based on the time change of the coefficient vector w1(k) of the adaptive filter 203 to generate a first mixed signal n3(k). The signal ratio estimation unit 503 uses the first mixed signal n3(k) and the speech estimate e1(k) to estimate the amplitude or power ratio of the speech to the noise at the input terminal 201 as a first mixture ratio R1(k). The correction unit 710 receives the speech estimate e1(k) and the reference signal xR(k) as inputs and calculates a corrected speech estimate e1C(k). The mixing unit 1106 mixes the corrected speech estimate e1C(k) and the pseudo crosstalk n2(k) based on the time change of the coefficient vector w2(k) of the adaptive filter 803 to generate a second mixed signal n4(k). The signal ratio estimation unit 1103 uses the second mixed signal n4(k) and the noise estimate e2(k) to estimate the amplitude or power ratio of the noise to the crosstalk at the input terminal 202 as a fourth mixture ratio R4(k). As a result, the signal processing device according to the third embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step sizes μ1(k) and μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

また、混合部５０６は、適応フィルタ２０３の係数更新開始時に第１混合信号ｎ３（ｋ）における雑音の補正推定値ｅ２Ｃ（ｋ）の含有割合を１００％に設定し、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化が十分に小さくなったとき、第１混合信号ｎ３（ｋ）における雑音の補正推定値ｅ２Ｃ（ｋ）の含有割合を０％に設定する。また、混合部１１０６は、適応フィルタ８０３の係数更新開始時に第２混合信号ｎ４（ｋ）における音声の補正推定値ｅ１Ｃ（ｋ）の含有割合を１００％に設定し、適応フィルタ８０３の係数ベクトルｗ２（ｋ）の時間変化が十分に小さくなったとき第２混合信号ｎ４（ｋ）における音声の補正推定値ｅ１Ｃ（ｋ）の含有割合を０％に設定する。これにより、第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ），μ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 Furthermore, the mixing unit 506 sets the proportion of the corrected noise estimate e2C(k) in the first mixed signal n3(k) to 100% when coefficient updating of the adaptive filter 203 starts, and sets that proportion to 0% when the time change of the coefficient vector w1(k) of the adaptive filter 203 becomes sufficiently small. Likewise, the mixing unit 1106 sets the proportion of the corrected speech estimate e1C(k) in the second mixed signal n4(k) to 100% when coefficient updating of the adaptive filter 803 starts, and sets that proportion to 0% when the time change of the coefficient vector w2(k) of the adaptive filter 803 becomes sufficiently small. As a result, the signal processing device according to the third embodiment can achieve both fast convergence and a low-distortion output signal without manually controlling the step sizes μ1(k) and μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.

また、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化は、係数ベクトルｗ１（ｋ）の２乗総和または絶対値総和の時間変化であり、適応フィルタ８０３の係数ベクトルｗ２（ｋ）の時間変化は、係数ベクトルｗ２（ｋ）の２乗総和または絶対値総和の時間変化である。これにより、第１～第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ），μ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 Furthermore, the time change in coefficient vector w1(k) of adaptive filter 203 is the time change in the sum of squares or sum of absolute values of coefficient vector w1(k), and the time change in coefficient vector w2(k) of adaptive filter 803 is the time change in the sum of squares or sum of absolute values of coefficient vector w2(k). As a result, the signal processing devices according to the first to third embodiments can achieve both fast convergence and low-distortion output signals without manually controlling the step sizes μ1(k) and μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.
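As an illustration of this convergence measure, the following Python sketch computes the sum of squares or the sum of absolute values of a coefficient vector and takes the magnitude of the difference between two successive sample times as the "time change". Defining the change as a one-step absolute difference is an assumption of this sketch; the passage does not specify how the change is measured, and a smoothed or windowed difference would also fit the description.

```python
def coeff_measure(w, kind="square"):
    """Sum of squares or sum of absolute values of all coefficients."""
    if kind == "square":
        return sum(c * c for c in w)
    if kind == "abs":
        return sum(abs(c) for c in w)
    raise ValueError("kind must be 'square' or 'abs'")


def coeff_change(w_prev, w_curr, kind="square"):
    """Time change of the measure between two successive coefficient vectors (assumed form)."""
    return abs(coeff_measure(w_curr, kind) - coeff_measure(w_prev, kind))


# Large change right after start-up, zero change once the coefficients stop moving.
change_startup = coeff_change([0.0, 0.0, 0.0], [1.0, -2.0, 2.0])
change_settled = coeff_change([1.0, -2.0, 2.0], [1.0, -2.0, 2.0])
```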

また、適応フィルタ２０３の係数ベクトルｗ１（ｋ）の時間変化は、係数ベクトルｗ１（ｋ）の２乗部分和または絶対値部分和の時間変化であり、適応フィルタ８０３の係数ベクトルｗ２（ｋ）の時間変化は、係数ベクトルｗ２（ｋ）の２乗部分和または絶対値部分和の時間変化である。これにより、第１～第３実施形態に係る信号処理装置は、適応フィルタの近似する音響インパルス応答の利得が１未満であっても、ステップサイズμ１（ｋ），μ２（ｋ）を手動制御することなく、高速収束と低歪出力信号とを両立できる。 Furthermore, the time change in coefficient vector w1(k) of adaptive filter 203 is the time change in the partial sum of squares or absolute values of coefficient vector w1(k), and the time change in coefficient vector w2(k) of adaptive filter 803 is the time change in the partial sum of squares or absolute values of coefficient vector w2(k). As a result, the signal processing devices according to the first to third embodiments can achieve both fast convergence and low-distortion output signals without manually controlling the step sizes μ1(k) and μ2(k), even if the gain of the acoustic impulse response approximated by the adaptive filter is less than 1.
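The partial-sum variant restricts the same measure to a subset of the coefficients, which reduces the number of operations per sample. In the sketch below the subset is a contiguous range of taps selected by the hypothetical parameters `start` and `stop`; which taps are actually used is not stated in this passage and is an assumption here.

```python
def partial_coeff_measure(w, start, stop, kind="square"):
    """Partial sum of squares or absolute values over taps start..stop-1 (assumed range)."""
    taps = w[start:stop]
    if kind == "square":
        return sum(c * c for c in taps)
    if kind == "abs":
        return sum(abs(c) for c in taps)
    raise ValueError("kind must be 'square' or 'abs'")


# Only the first two of four taps contribute to the measure.
partial_sq = partial_coeff_measure([1.0, -2.0, 2.0, 4.0], 0, 2)
partial_abs = partial_coeff_measure([1.0, -2.0, 2.0, 4.0], 0, 2, kind="abs")
```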

以上、本願の実施形態を図面に基づいて詳細に説明したが、これは例示であり、発明の開示の欄に記載の態様を始めとして、当業者の知識に基づいて種々の変形、改良を施した他の形態で本発明を実施することが可能である。 The above describes in detail the embodiments of the present application based on the drawings, but this is merely an example, and the present invention can be implemented in other forms that incorporate various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the Disclosure of the Invention section.

また、上述してきた「部（ｓｅｃｔｉｏｎ、ｍｏｄｕｌｅ、ｕｎｉｔ）」は、「手段」や「回路」などに読み替えることができる。例えば、減算部は、減算手段や減算回路に読み替えることができる。

（付記１）
第１信号と第２信号が混在した第１混在信号を入力する第１入力手段と、
前記第１信号と相関のある第３信号と前記第２信号と相関のある第４信号とが混在した第２混在信号を入力する第２入力手段と、
前記第２混在信号をフィルタ処理して前記第２信号の推定値を生成する第１適応フィルタ（適応フィルタ１０３、２０３）と、
前記第１混在信号と前記第２信号の推定値とから前記第１信号の推定値を生成する第１減算部（減算部１０４、２０４）と、
前記第１信号の推定値と前記第２信号の推定値と前記第２混在信号と前記第１混在信号と前記第１適応フィルタの係数とを用いて前記第１信号と前記第２信号の振幅または電力の比を第１混在比として推定する推定部（推定部１０６、２０６、８０６）と、
を備え、
前記第１混在比を用いて前記第１適応フィルタを制御する
信号処理装置。
（付記２）
前記推定部（推定部１０６、２０６）は、
前記第１信号の推定値と前記第２信号の推定値とを用いて前記第１信号と前記第２信号の振幅または電力の比を第２混在比として推定する第１信号比推定部（信号比推定部３０１）と、
前記第２混在信号と前記第１混在信号を用いて前記第２混在信号を補正して補正第２混在信号を生成する第１補正部（補正部３１０）と、
前記第１信号の推定値と前記補正第２混在信号とを用いて前記第１信号と前記第２信号の振幅または電力の比を第３混在比として推定する第２信号比推定部（信号比推定部３０２）と、
前記第２混在比と前記第３混在比を、前記第１適応フィルタの係数の時間変化に基づいて混合して前記第１混在比を生成する第１混合部（混合部３０５）と、
を備えた付記１に記載の信号処理装置。
（付記３）
前記第１混合部（混合部３０５）は、
前記第１適応フィルタの係数更新開始時に前記第１混在比における前記第３混在比の含有割合を１００％に設定し、前記第１適応フィルタの係数の時間変化が十分に小さくなったとき、前記第１混在比における前記第３混在比の含有割合を０％に設定する
付記２に記載の信号処理装置。
（付記４）
前記推定部（推定部１０６、２０６）は、
前記第２混在信号と前記第１混在信号を用いて前記第２混在信号を補正して補正第２混在信号を生成する第１補正部（補正部３１０）と、
前記補正第２混在信号と前記第２信号の推定値とを、前記第１適応フィルタの係数の時間変化に基づいて混合して第１混合信号を生成する第２混合部（混合部５０６）と、
前記第１混合信号と前記第１信号の推定値とを用いて前記第１信号と前記第２信号の振幅または電力の比を前記第１混在比として推定する第３信号比推定部（信号比推定部５０３）と、
を備えた付記１に記載の信号処理装置。
（付記５）
前記第２混合部（混合部５０６）は、
前記第１適応フィルタの係数更新開始時に前記第１混合信号における前記補正第２混在信号の含有割合を１００％に設定し、前記第１適応フィルタの係数の時間変化が十分に小さくなったとき、前記第１混合信号における前記補正第２混在信号の含有割合を０％に設定する
付記４に記載の信号処理装置。
（付記６）
前記第１信号の推定値をフィルタ処理して前記第３信号の推定値を生成する第２適応フィルタ（適応フィルタ８０３）と、
前記第２混在信号から前記第３信号の推定値を減算して前記第４信号の推定値を生成する第２減算部（減算部８０４）と、をさらに備え、
前記第１適応フィルタは、
前記第２混在信号に代えて前記第４信号の推定値を入力とし、
前記推定部は、
前記第４信号の推定値と前記第３信号の推定値と前記第１信号の推定値と前記第２混在信号と前記第２適応フィルタの係数とをさらに用いて、前記第４信号と前記第３信号の振幅または電力の比を第４混在比としてさらに推定し、
前記第４混在比を用いて前記第２適応フィルタを制御する
付記１に記載の信号処理装置。
（付記７）
前記推定部（推定部８０６）は、
前記第１信号の推定値と前記第２信号の推定値とを用いて前記第１信号と前記第２信号の振幅または電力の比を第２混在比として推定する第１信号比推定部（信号比推定部３０１）と、
前記第４信号の推定値と前記第１混在信号を用いて前記第４信号の推定値を補正して前記第４信号の補正推定値を生成する第１補正部（補正部３１０）と、
前記第１信号の推定値と前記第４信号の補正推定値とを用いて前記第１信号と前記第２信号の振幅または電力の比を第３混在比として推定する第２信号比推定部（信号比推定部３０２）と、
前記第２混在比と前記第３混在比を、前記第１適応フィルタの係数の時間変化に基づいて混合して前記第１混在比を生成する第１混合部（混合部３０５）と、
前記第４信号の推定値と前記第３信号の推定値とを用いて前記第４信号と前記第３信号の振幅または電力の比を第５混在比として推定する第４信号比推定部（信号比推定部９０１）と、
前記第１信号の推定値と前記第２混在信号を用いて前記第１信号の推定値を補正して前記第１信号の補正推定値を生成する第２補正部（補正部７１０）と、
前記第４信号の推定値と前記第１信号の補正推定値とを用いて前記第４信号と前記第３信号の振幅または電力の比を第６混在比として推定する第５信号比推定部（信号比推定部９０２）と、
前記第５混在比と前記第６混在比を、前記第２適応フィルタの係数の時間変化に基づいて混合して前記第４混在比を生成する第３混合部（混合部９０５）と、
を備えた付記６に記載の信号処理装置。
（付記８）
前記第１混合部（混合部３０５）は、
前記第１適応フィルタの係数更新開始時に前記第１混在比における前記第３混在比の含有割合を１００％に設定し、前記第１適応フィルタの係数の時間変化が十分に小さくなったとき前記第１混在比における前記第３混在比の含有割合を０％に設定し、
前記第３混合部（混合部９０５）は、
前記第２適応フィルタの係数更新開始時に前記第４混在比における前記第６混在比の含有割合を１００％に設定し、前記第２適応フィルタの係数の時間変化が十分に小さくなったとき前記第４混在比における前記第６混在比の含有割合を０％に設定する
付記７に記載の信号処理装置。
（付記９）
前記推定部（推定部８０６）は、
前記第４信号の推定値と前記第１混在信号を用いて前記第４信号の推定値を補正して前記第４信号の補正推定値を生成する第１補正部（補正部３１０）と、
前記第４信号の補正推定値と前記第２信号の推定値を、前記第１適応フィルタの係数の時間変化に基づいて混合して第１混合信号を生成する第２混合部（混合部５０６）と、
前記第１混合信号と前記第１信号の推定値を用いて前記第１信号と前記第２信号の振幅または電力の比を前記第１混在比として推定する第３信号比推定部（信号比推定部５０３）と、
前記第１信号の推定値と前記第２混在信号を用いて前記第１信号の推定値を補正して前記第１信号の補正推定値を生成する第２補正部（補正部７１０）と、
前記第１信号の補正推定値と前記第３信号の推定値を、前記第２適応フィルタの係数の時間変化に基づいて混合して第２混合信号を生成する第４混合部（混合部１１０６）と、
前記第２混合信号と前記第４信号の推定値を用いて前記第４信号と前記第３信号の振幅または電力の比を前記第４混在比として推定する第６信号比推定部（信号比推定部１１０３）と、
を備えた付記６に記載の信号処理装置。
（付記１０）
前記第２混合部（混合部５０６）は、
前記第１適応フィルタの係数更新開始時に前記第１混合信号における前記第４信号の補正推定値の含有割合を１００％に設定し、前記第１適応フィルタの係数の時間変化が十分に小さくなったとき前記第１混合信号における前記第４信号の補正推定値の含有割合を０％に設定し、
前記第４混合部（混合部１１０６）は、
前記第２適応フィルタの係数更新開始時に前記第２混合信号における前記第１信号の補正推定値の含有割合を１００％に設定し、前記第２適応フィルタの係数の時間変化が十分に小さくなったとき前記第２混合信号における前記第１信号の補正推定値の含有割合を０％に設定する
付記９に記載の信号処理装置。
（付記１１）
前記係数の時間変化は、
前記係数の２乗総和または絶対値総和の時間変化である
付記３または５または８または１０に記載の信号処理装置。
（付記１２）
前記係数の時間変化は、
前記係数の２乗部分和または絶対値部分和の時間変化である
付記３または５または８または１０に記載の信号処理装置。
（付記１３）
第１信号と第２信号が混在した第１混在信号を入力し、
前記第１信号と相関のある第３信号と前記第２信号と相関のある第４信号とが混在した第２混在信号を入力し、
前記第２混在信号を第１適応フィルタ（適応フィルタ１０３、２０３）で処理して前記第２信号の推定値を生成し、
前記第１混在信号と前記第２信号の推定値から前記第１信号の推定値を生成し、
前記第１信号の推定値と前記第２信号の推定値と前記第２混在信号と前記第１混在信号と前記第１適応フィルタの係数とを用いて前記第１信号と前記第２信号の振幅または電力の比を第１混在比として推定し、
前記第１混在比を用いて前記第２信号の推定値の生成を制御する
信号処理方法。
（付記１４）
前記第１信号の推定値を第２適応フィルタ（適応フィルタ８０３）で処理して前記第３信号の推定値を生成し、
前記第２混在信号から前記第３信号の推定値を減算して前記第４信号の推定値を生成し、
前記第１適応フィルタは前記第２混在信号に代えて前記第４信号の推定値を処理し、
前記第４信号の推定値と前記第３信号の推定値と前記第１信号の推定値と前記第２混在信号と前記第２適応フィルタの係数とをさらに用いて、
前記第４信号と前記第３信号の振幅または電力の比を第４混在比としてさらに推定し、
前記第４混在比を用いて前記第３信号の推定値の生成を制御する
付記１３に記載の信号処理方法。
（付記１５）
コンピュータに、
第１信号と第２信号が混在した第１混在信号を入力するステップと、
前記第１信号と相関のある第３信号と前記第２信号と相関のある第４信号とが混在した第２混在信号を入力するステップと、
前記第２混在信号を第１適応フィルタ（適応フィルタ１０３、２０３）で処理して前記第２信号の推定値を生成するステップと、
前記第１混在信号と前記第２信号の推定値から前記第１信号の推定値を生成するステップと、
前記第１信号の推定値と前記第２信号の推定値と前記第２混在信号と前記第１混在信号と前記第１適応フィルタの係数とを用いて前記第１信号と前記第２信号の振幅または電力の比を第１混在比として推定するステップと、
前記第１混在比を用いて前記第２信号の推定値の生成を制御するステップと
を実行させる信号処理プログラム。
（付記１６）
コンピュータに、
前記第１信号の推定値を第２適応フィルタ（適応フィルタ８０３）で処理して前記第３信号の推定値を生成するステップと、
前記第２混在信号から前記第３信号の推定値を減算して前記第４信号の推定値を生成するステップと、
前記第１適応フィルタは前記第２混在信号に代えて前記第４信号の推定値を処理するステップと、
前記第４信号の推定値と前記第３信号の推定値と前記第１信号の推定値と前記第２混在信号と前記第２適応フィルタの係数とをさらに用いて、
前記第４信号と前記第３信号の振幅または電力の比を第４混在比としてさらに推定するステップと、
前記第４混在比を用いて前記第３信号の推定値の生成を制御するステップと
を実行させる付記１５に記載の信号処理プログラム。
Furthermore, the above-mentioned "section, module, unit" can be read as "means" or "circuit", etc. For example, a subtraction section can be read as subtraction means or subtraction circuit.

(Appendix 1)
A signal processing device comprising:
first input means for inputting a first mixture signal in which a first signal and a second signal are mixed;
second input means for inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
a first adaptive filter (adaptive filter 103, 203) that filters the second mixture signal to generate an estimate of the second signal;
a first subtraction unit (subtraction unit 104, 204) that generates an estimate of the first signal from the first mixture signal and the estimate of the second signal; and
an estimation unit (estimation unit 106, 206, 806) that estimates a ratio of amplitudes or powers of the first signal and the second signal as a first mixture ratio using the estimate of the first signal, the estimate of the second signal, the second mixture signal, the first mixture signal, and the coefficients of the first adaptive filter,
wherein the first adaptive filter is controlled using the first mixture ratio.
(Appendix 2)
The signal processing device according to Appendix 1, wherein the estimation unit (estimation unit 106, 206) comprises:
a first signal ratio estimation unit (signal ratio estimation unit 301) that estimates a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
a first correction unit (correction unit 310) that corrects the second mixture signal using the second mixture signal and the first mixture signal to generate a corrected second mixture signal;
a second signal ratio estimation unit (signal ratio estimation unit 302) that estimates a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected second mixture signal; and
a first mixing unit (mixing unit 305) that mixes the second mixture ratio and the third mixture ratio based on a time change of the coefficients of the first adaptive filter to generate the first mixture ratio.
(Appendix 3)
The signal processing device according to Appendix 2, wherein the first mixing unit (mixing unit 305) sets the proportion of the third mixture ratio in the first mixture ratio to 100% when coefficient updating of the first adaptive filter starts, and sets the proportion of the third mixture ratio in the first mixture ratio to 0% when the time change of the coefficients of the first adaptive filter becomes sufficiently small.
(Appendix 4)
The signal processing device according to Appendix 1, wherein the estimation unit (estimation unit 106, 206) comprises:
a first correction unit (correction unit 310) that corrects the second mixture signal using the second mixture signal and the first mixture signal to generate a corrected second mixture signal;
a second mixing unit (mixing unit 506) that mixes the corrected second mixture signal and the estimate of the second signal based on a time change of the coefficients of the first adaptive filter to generate a first mixed signal; and
a third signal ratio estimation unit (signal ratio estimation unit 503) that estimates a ratio of amplitudes or powers of the first signal and the second signal as the first mixture ratio using the first mixed signal and the estimate of the first signal.
(Appendix 5)
The signal processing device according to Appendix 4, wherein the second mixing unit (mixing unit 506) sets the proportion of the corrected second mixture signal in the first mixed signal to 100% when coefficient updating of the first adaptive filter starts, and sets the proportion of the corrected second mixture signal in the first mixed signal to 0% when the time change of the coefficients of the first adaptive filter becomes sufficiently small.
(Appendix 6)
The signal processing device according to Appendix 1, further comprising:
a second adaptive filter (adaptive filter 803) that filters the estimate of the first signal to generate an estimate of the third signal; and
a second subtraction unit (subtraction unit 804) that subtracts the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal,
wherein the first adaptive filter receives the estimate of the fourth signal as input instead of the second mixture signal,
the estimation unit further estimates a ratio of amplitudes or powers of the fourth signal and the third signal as a fourth mixture ratio, further using the estimate of the fourth signal, the estimate of the third signal, the estimate of the first signal, the second mixture signal, and the coefficients of the second adaptive filter, and
the second adaptive filter is controlled using the fourth mixture ratio.
(Appendix 7)
The signal processing device according to Appendix 6, wherein the estimation unit (estimation unit 806) comprises:
a first signal ratio estimation unit (signal ratio estimation unit 301) that estimates a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
a first correction unit (correction unit 310) that corrects the estimate of the fourth signal using the estimate of the fourth signal and the first mixture signal to generate a corrected estimate of the fourth signal;
a second signal ratio estimation unit (signal ratio estimation unit 302) that estimates a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected estimate of the fourth signal;
a first mixing unit (mixing unit 305) that mixes the second mixture ratio and the third mixture ratio based on a time change of the coefficients of the first adaptive filter to generate the first mixture ratio;
a fourth signal ratio estimation unit (signal ratio estimation unit 901) that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as a fifth mixture ratio using the estimate of the fourth signal and the estimate of the third signal;
a second correction unit (correction unit 710) that corrects the estimate of the first signal using the estimate of the first signal and the second mixture signal to generate a corrected estimate of the first signal;
a fifth signal ratio estimation unit (signal ratio estimation unit 902) that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as a sixth mixture ratio using the estimate of the fourth signal and the corrected estimate of the first signal; and
a third mixing unit (mixing unit 905) that mixes the fifth mixture ratio and the sixth mixture ratio based on a time change of the coefficients of the second adaptive filter to generate the fourth mixture ratio.
(Appendix 8)
The signal processing device according to Appendix 7, wherein
the first mixing unit (mixing unit 305) sets the proportion of the third mixture ratio in the first mixture ratio to 100% when coefficient updating of the first adaptive filter starts, and sets the proportion of the third mixture ratio in the first mixture ratio to 0% when the time change of the coefficients of the first adaptive filter becomes sufficiently small, and
the third mixing unit (mixing unit 905) sets the proportion of the sixth mixture ratio in the fourth mixture ratio to 100% when coefficient updating of the second adaptive filter starts, and sets the proportion of the sixth mixture ratio in the fourth mixture ratio to 0% when the time change of the coefficients of the second adaptive filter becomes sufficiently small.
(Appendix 9)
The signal processing device according to Appendix 6, wherein the estimation unit (estimation unit 806) comprises:
a first correction unit (correction unit 310) that corrects the estimate of the fourth signal using the estimate of the fourth signal and the first mixture signal to generate a corrected estimate of the fourth signal;
a second mixing unit (mixing unit 506) that mixes the corrected estimate of the fourth signal and the estimate of the second signal based on a time change of the coefficients of the first adaptive filter to generate a first mixed signal;
a third signal ratio estimation unit (signal ratio estimation unit 503) that estimates a ratio of amplitudes or powers of the first signal and the second signal as the first mixture ratio using the first mixed signal and the estimate of the first signal;
a second correction unit (correction unit 710) that corrects the estimate of the first signal using the estimate of the first signal and the second mixture signal to generate a corrected estimate of the first signal;
a fourth mixing unit (mixing unit 1106) that mixes the corrected estimate of the first signal and the estimate of the third signal based on a time change of the coefficients of the second adaptive filter to generate a second mixed signal; and
a sixth signal ratio estimation unit (signal ratio estimation unit 1103) that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as the fourth mixture ratio using the second mixed signal and the estimate of the fourth signal.
(Appendix 10)
The signal processing device according to Appendix 9, wherein
the second mixing unit (mixing unit 506) sets the proportion of the corrected estimate of the fourth signal in the first mixed signal to 100% when coefficient updating of the first adaptive filter starts, and sets the proportion of the corrected estimate of the fourth signal in the first mixed signal to 0% when the time change of the coefficients of the first adaptive filter becomes sufficiently small, and
the fourth mixing unit (mixing unit 1106) sets the proportion of the corrected estimate of the first signal in the second mixed signal to 100% when coefficient updating of the second adaptive filter starts, and sets the proportion of the corrected estimate of the first signal in the second mixed signal to 0% when the time change of the coefficients of the second adaptive filter becomes sufficiently small.
(Appendix 11)
The signal processing device according to Appendix 3, 5, 8, or 10, wherein the time change of the coefficients is a time change in the sum of squares or the sum of absolute values of the coefficients.
(Appendix 12)
The signal processing device according to Appendix 3, 5, 8, or 10, wherein the time change of the coefficients is a time change in a partial sum of squares or a partial sum of absolute values of the coefficients.
(Appendix 13)
A signal processing method comprising:
inputting a first mixture signal in which a first signal and a second signal are mixed;
inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixture signal with a first adaptive filter (adaptive filter 103, 203) to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixture signal and the estimate of the second signal;
estimating a ratio of amplitudes or powers of the first signal and the second signal as a first mixture ratio using the estimate of the first signal, the estimate of the second signal, the second mixture signal, the first mixture signal, and the coefficients of the first adaptive filter; and
controlling generation of the estimate of the second signal using the first mixture ratio.
(Appendix 14)
The signal processing method according to Appendix 13, further comprising:
processing the estimate of the first signal with a second adaptive filter (adaptive filter 803) to generate an estimate of the third signal;
subtracting the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal;
processing, with the first adaptive filter, the estimate of the fourth signal instead of the second mixture signal;
further estimating a ratio of amplitudes or powers of the fourth signal and the third signal as a fourth mixture ratio, further using the estimate of the fourth signal, the estimate of the third signal, the estimate of the first signal, the second mixture signal, and the coefficients of the second adaptive filter; and
controlling generation of the estimate of the third signal using the fourth mixture ratio.
(Appendix 15)
On the computer,
inputting a first mixed signal in which a first signal and a second signal are mixed;
inputting a second mixed signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixed signal with a first adaptive filter (adaptive filter 103, 203) to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixed signal and an estimate of the second signal;
a step of estimating a ratio of amplitudes or powers of the first signal and the second signal as a first mixture ratio using the estimated value of the first signal, the estimated value of the second signal, the second mixture signal, the first mixture signal, and a coefficient of the first adaptive filter;
and controlling generation of an estimate of the second signal using the first mixture ratio.
(Appendix 16)
On the computer,
processing the estimate of the first signal with a second adaptive filter (adaptive filter 803) to produce an estimate of the third signal;
subtracting the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal;
the first adaptive filter processes an estimate of the fourth signal instead of the second mixed signal;
further using the estimated value of the fourth signal, the estimated value of the third signal, the estimated value of the first signal, the second mixed signal, and a coefficient of the second adaptive filter,
further estimating a ratio of amplitude or power of the fourth signal to the third signal as a fourth mixing ratio;
and controlling generation of an estimate of the third signal using the fourth mixing ratio.

100 Signal processing device
101 First input unit
102 Second input unit
103, 203 Adaptive filter (corresponding to an example of a first adaptive filter)
104, 204 Subtraction unit (corresponding to an example of a first subtraction unit)
106, 206, 806 Estimation unit
107 Coefficient update control unit
141 Coefficient
200, 800 Noise canceller (corresponding to an example of a signal processing device)
201 Input terminal (corresponding to an example of a first input unit)
202 Input terminal (corresponding to an example of a second input unit)
205, 805 Output terminal
301 Signal ratio estimator (corresponding to an example of a first signal ratio estimator)
302 Signal ratio estimator (corresponding to an example of a second signal ratio estimator)
305 Mixer (corresponding to an example of a first mixer)
310 Correction unit (corresponding to an example of a first correction unit)
506 Mixer (corresponding to an example of a second mixer)
503 Signal ratio estimator (corresponding to an example of a third signal ratio estimator)
710 Correction unit (corresponding to an example of a second correction unit)
803 Adaptive filter (corresponding to an example of a second adaptive filter)
804 Subtraction unit (corresponding to an example of a second subtraction unit)
901 Signal ratio estimator (corresponding to an example of a fourth signal ratio estimator)
902 Signal ratio estimator (corresponding to an example of a fifth signal ratio estimator)
905 Mixer (corresponding to an example of a third mixer)
1106 Mixer (corresponding to an example of a fourth mixer)
1103 Signal ratio estimator (corresponding to an example of a sixth signal ratio estimator)
A Signal source
B Signal source
xP(k) First mixture signal
xR(k) Second mixture signal
xRC(k) Corrected second mixture signal
e1(k) Estimate of the first signal, estimate of the speech signal
e2(k) Estimate of the noise (corresponding to an example of an estimate of the fourth signal)
e1C(k) Corrected estimate of the first signal, corrected estimate of the speech signal
e2C(k) Corrected estimate of the noise (corresponding to an example of a corrected estimate of the fourth signal)
n1(k) Estimate of the second signal, pseudo noise (corresponding to an example of an estimate of the second signal)
n2(k) Pseudo crosstalk (corresponding to an example of an estimate of the third signal)
n3(k) Mixed signal (corresponding to an example of a first mixed signal)
n4(k) Mixed signal (corresponding to an example of a second mixed signal)
R1(k) First mixture ratio
R2(k) Second mixture ratio
R3(k) Third mixture ratio
R4(k) Fourth mixture ratio
R5(k) Fifth mixture ratio
R6(k) Sixth mixture ratio

Claims

1. A signal processing device comprising:
a first input means for inputting a first mixture signal in which a first signal and a second signal are mixed;
a second input means for inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
a first adaptive filter that filters the second mixture signal to generate an estimate of the second signal;
a first subtraction unit that generates an estimate of the first signal from the first mixture signal and the estimate of the second signal; and
an estimation unit that estimates a ratio of amplitudes or powers of the first signal and the second signal as a first mixture ratio using the estimate of the first signal, the estimate of the second signal, the second mixture signal, the first mixture signal, and a coefficient of the first adaptive filter,
wherein the first adaptive filter is controlled using the first mixture ratio, and
the estimation unit comprises:
a first signal ratio estimator that estimates a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
a first correction unit that averages the power of each of the second mixture signal and the first mixture signal, sets the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generates a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
a second signal ratio estimator that estimates a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected second mixture signal; and
a first mixer that generates the first mixture ratio by mixing the second mixture ratio and the third mixture ratio, setting the weight of the third mixture ratio to a large value when updating of the coefficients starts and decreasing the weight as the coefficients grow.
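
The first correction unit and the first mixer of the claim above can be illustrated with a minimal Python sketch. Several points are assumptions made only for illustration and are not stated in the claim: the power is averaged over a fixed block rather than recursively, the scaling factor is taken as the averaged power of the first mixture signal divided by that of the second mixture signal (so that the corrected signal matches the power of the first mixture signal), and the two mixture ratios are combined as a convex combination. The function names are hypothetical.

```python
import math


def average_power(x):
    """Mean of the squared samples over one block."""
    return sum(v * v for v in x) / len(x)


def correct_second_mixture(x_p, x_r, eps=1e-12):
    """Scale x_r by the square root of a power ratio (assumed: P(x_p) / P(x_r))."""
    scale = average_power(x_p) / (average_power(x_r) + eps)
    gain = math.sqrt(scale)
    return [gain * v for v in x_r]


def mix_ratios(r2, r3, w3):
    """Assumed convex combination; w3 is the weight of the third mixture ratio."""
    return w3 * r3 + (1.0 - w3) * r2


x_p = [0.5, -1.0, 2.0, -0.5]   # first mixture signal (illustrative values)
x_r = [0.1, 0.2, -0.1, 0.3]    # second mixture signal (illustrative values)
x_rc = correct_second_mixture(x_p, x_r)
```

With `w3 = 1.0` the mixed result equals the third mixture ratio, and with `w3 = 0.0` it equals the second mixture ratio, which corresponds to the two end points described for the first mixer.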

2. The signal processing device according to claim 1, wherein the first mixer sets the content ratio of the third mixture ratio in the first mixture ratio to 100% when the coefficient update of the first adaptive filter starts, and sets the content ratio to 0% when the change over time in the coefficient of the first adaptive filter becomes sufficiently small.
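
One possible way to realize such a content ratio is sketched below in Python. The claim fixes only the two end points (100% at the start of the coefficient update and 0% once the time change is sufficiently small); the linear interpolation between them, the thresholds `small` and `large`, and the class name are hypothetical choices made for this sketch.

```python
class WeightSchedule:
    """Maps the time change of the coefficients to a weight in [0, 1]."""

    def __init__(self, small=1e-4, large=1e-2):
        self.small = small      # hypothetical "sufficiently small" threshold
        self.large = large      # hypothetical change at or above which the weight is 1
        self.started = False

    def weight(self, change):
        # The first call marks the start of the coefficient update: weight is 100%.
        if not self.started:
            self.started = True
            return 1.0
        # Afterwards, interpolate linearly between the two thresholds and clamp.
        t = (change - self.small) / (self.large - self.small)
        return min(1.0, max(0.0, t))
```

In this sketch the weight rises again if the coefficients start changing after convergence; whether that behavior is desired is a design choice that the claim does not settle.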

3. A signal processing device comprising:
a first input means for inputting a first mixture signal in which a first signal and a second signal are mixed;
a second input means for inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
a first adaptive filter that filters the second mixture signal to generate an estimate of the second signal;
a first subtraction unit that generates an estimate of the first signal from the first mixture signal and the estimate of the second signal; and
an estimation unit that estimates a ratio of amplitudes or powers of the first signal and the second signal as a first mixture ratio using the estimate of the first signal, the estimate of the second signal, the second mixture signal, the first mixture signal, and a coefficient of the first adaptive filter,
wherein the first adaptive filter is controlled using the first mixture ratio, and
the estimation unit comprises:
a first correction unit that averages the power of each of the second mixture signal and the first mixture signal, sets the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generates a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
a second mixer that generates a first mixed signal by mixing the corrected second mixture signal and the estimate of the second signal, setting the weight of the corrected second mixture signal to a large value when updating of the coefficients starts and decreasing the weight as the coefficients grow; and
a third signal ratio estimator that estimates, as the first mixture ratio, a ratio of amplitudes or powers of the first signal and the second signal using the first mixed signal and the estimate of the first signal.
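
In this variant the signals are mixed first and a single ratio is then estimated. A minimal Python sketch follows, assuming a sample-wise convex combination for the second mixer and a block power ratio for the third signal ratio estimator; both choices, and the function names, are illustrative assumptions rather than details taken from the claim.

```python
def mix_signals(x_rc, n1, w):
    """Second mixer (assumed form): w weights the corrected second mixture signal."""
    return [w * a + (1.0 - w) * b for a, b in zip(x_rc, n1)]


def power_ratio(numerator, denominator, eps=1e-12):
    """Ratio of block powers; eps guards against division by zero."""
    p_num = sum(v * v for v in numerator)
    p_den = sum(v * v for v in denominator)
    return p_num / (p_den + eps)


def first_mixture_ratio(e1, x_rc, n1, w):
    """Ratio of the first-signal estimate to the mixed second-signal estimate."""
    n3 = mix_signals(x_rc, n1, w)
    return power_ratio(e1, n3)
```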

4. The signal processing device according to claim 3, wherein the second mixer sets the content ratio of the corrected second mixture signal in the first mixed signal to 100% when the coefficient update of the first adaptive filter starts, and sets the content ratio to 0% when the change over time in the coefficient of the first adaptive filter becomes sufficiently small.

5. The signal processing device according to claim 1 or 3, further comprising:
a second adaptive filter that filters the estimate of the first signal to generate an estimate of the third signal; and
a second subtraction unit that subtracts the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal,
wherein the first adaptive filter receives the estimate of the fourth signal instead of the second mixture signal,
the estimation unit further estimates, as a fourth mixture ratio, a ratio of amplitudes or powers of the fourth signal and the third signal using the estimate of the fourth signal, the estimate of the third signal, the estimate of the first signal, the second mixture signal, and a coefficient of the second adaptive filter, and
the second adaptive filter is controlled using the fourth mixture ratio.
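
The cross-coupled arrangement of the two adaptive filters can be sketched in Python as follows. This is a sketch under stated assumptions, not the claimed implementation: the coefficient update uses normalized LMS (NLMS), the second adaptive filter sees only past samples of the first-signal estimate to avoid a delay-free loop, and the step sizes `mu1` and `mu2` are simply passed in, standing in for values derived from the first and fourth mixture ratios. The class and variable names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class CrossCoupledCanceller:
    def __init__(self, taps):
        self.w1 = [0.0] * taps        # first adaptive filter (noise path)
        self.w2 = [0.0] * taps        # second adaptive filter (crosstalk path)
        self.e1_hist = [0.0] * taps   # past first-signal estimates, newest first
        self.e2_hist = [0.0] * taps   # fourth-signal estimates, newest first

    def step(self, x_p, x_r, mu1, mu2, eps=1e-8):
        # Second adaptive filter: pseudo crosstalk from past first-signal estimates.
        n2 = dot(self.w2, self.e1_hist)
        e2 = x_r - n2                                  # estimate of the fourth signal
        self.e2_hist = [e2] + self.e2_hist[:-1]
        # First adaptive filter: pseudo noise from the fourth-signal estimate.
        n1 = dot(self.w1, self.e2_hist)
        e1 = x_p - n1                                  # estimate of the first signal
        # NLMS updates (assumed algorithm); each filter uses its own input vector.
        p2 = dot(self.e2_hist, self.e2_hist) + eps
        self.w1 = [w + mu1 * e1 * u / p2 for w, u in zip(self.w1, self.e2_hist)]
        p1 = dot(self.e1_hist, self.e1_hist) + eps
        self.w2 = [w + mu2 * e2 * u / p1 for w, u in zip(self.w2, self.e1_hist)]
        self.e1_hist = [e1] + self.e1_hist[:-1]
        return e1, e2
```

Calling `step` once per sample with `mu2 = 0.0` reduces the sketch to a single-filter noise canceller, which is a convenient way to check the first adaptive filter in isolation.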

6. The signal processing device according to claim 5, wherein the estimation unit comprises:
a first signal ratio estimator that estimates a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
a first correction unit that corrects the estimate of the fourth signal using the estimate of the fourth signal and the first mixture signal to generate a corrected estimate of the fourth signal;
a second signal ratio estimator that estimates a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected estimate of the fourth signal;
a first mixer that mixes the second mixture ratio and the third mixture ratio based on a time change in a coefficient of the first adaptive filter to generate the first mixture ratio;
a fourth signal ratio estimator that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as a fifth mixture ratio using the estimate of the fourth signal and the estimate of the third signal;
a second correction unit that corrects the estimate of the first signal using the estimate of the first signal and the second mixture signal to generate a corrected estimate of the first signal;
a fifth signal ratio estimator that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as a sixth mixture ratio using the estimate of the fourth signal and the corrected estimate of the first signal; and
a third mixer that mixes the fifth mixture ratio and the sixth mixture ratio based on a time change in a coefficient of the second adaptive filter to generate the fourth mixture ratio.

7. The signal processing device according to claim 6, wherein
the first mixer sets the content ratio of the third mixture ratio in the first mixture ratio to 100% when the coefficient update of the first adaptive filter starts, and sets the content ratio to 0% when the time change in the coefficient of the first adaptive filter becomes sufficiently small, and
the third mixer sets the content ratio of the sixth mixture ratio in the fourth mixture ratio to 100% when the coefficient update of the second adaptive filter starts, and sets the content ratio to 0% when the time change in the coefficient of the second adaptive filter becomes sufficiently small.

8. The signal processing device according to claim 5, wherein the estimation unit comprises:
a first correction unit that corrects the estimate of the fourth signal using the estimate of the fourth signal and the first mixture signal to generate a corrected estimate of the fourth signal;
a second mixer that mixes the corrected estimate of the fourth signal and the estimate of the second signal based on a time change in a coefficient of the first adaptive filter to generate a first mixed signal;
a third signal ratio estimator that estimates a ratio of amplitudes or powers of the first signal and the second signal as the first mixture ratio using the first mixed signal and the estimate of the first signal;
a second correction unit that corrects the estimate of the first signal using the estimate of the first signal and the second mixture signal to generate a corrected estimate of the first signal;
a fourth mixer that mixes the corrected estimate of the first signal and the estimate of the third signal based on a time change in a coefficient of the second adaptive filter to generate a second mixed signal; and
a sixth signal ratio estimator that estimates a ratio of amplitudes or powers of the fourth signal and the third signal as the fourth mixture ratio using the second mixed signal and the estimate of the fourth signal.

9. The signal processing device according to claim 8, wherein
the second mixer sets the content ratio of the corrected estimate of the fourth signal in the first mixed signal to 100% when the coefficient update of the first adaptive filter starts, and sets the content ratio to 0% when the time change in the coefficient of the first adaptive filter becomes sufficiently small, and
the fourth mixer sets the content ratio of the corrected estimate of the first signal in the second mixed signal to 100% when the coefficient update of the second adaptive filter starts, and sets the content ratio to 0% when the time change in the coefficient of the second adaptive filter becomes sufficiently small.

10. The signal processing device according to claim 2, 4, 7, or 9, wherein the time change of the coefficient is the time change of the sum of squares or the sum of absolute values of the coefficients.

11. The signal processing device according to claim 2, 4, 7, or 9, wherein the time change of the coefficient is the time change of a partial sum of squares or a partial sum of absolute values of the coefficients.
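
The coefficient measures named in the two claims above can be sketched in Python as below. Which coefficients enter the partial sum is not specified in the claims; taking the leading `taps` coefficients is an assumption of this sketch, as is measuring the time change by the absolute difference between consecutive values. The function names are hypothetical.

```python
def coefficient_measure(w, use_abs=False, taps=None):
    """Sum of squares (default) or of absolute values, over all or the leading taps."""
    part = w if taps is None else w[:taps]
    if use_abs:
        return sum(abs(c) for c in part)
    return sum(c * c for c in part)


def time_change(previous, current):
    """Assumed definition: absolute difference between consecutive measures."""
    return abs(current - previous)


w_old = [0.0, 0.0]
w_new = [3.0, -4.0]
change = time_change(coefficient_measure(w_old), coefficient_measure(w_new))
```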

12. A signal processing method comprising:
inputting a first mixture signal in which a first signal and a second signal are mixed;
inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixture signal with a first adaptive filter to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixture signal and the estimate of the second signal;
estimating a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
averaging the power of each of the second mixture signal and the first mixture signal, determining the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generating a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
estimating a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected second mixture signal;
generating a first mixture ratio by mixing the second mixture ratio and the third mixture ratio, setting the weight of the third mixture ratio to a large value when updating of the coefficients of the first adaptive filter starts and decreasing the weight as the coefficients grow; and
controlling generation of the estimate of the second signal using the first mixture ratio.
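
The last two steps of the method above, estimating a power ratio and using it to control the adaptive filter, might look as follows in Python. The method does not state how the ratio controls the filter; here it is assumed, purely for illustration, that the ratio is a power ratio tracked with exponential averaging and that the coefficient update step size shrinks as the first signal becomes dominant. The forgetting factor, `mu_max`, and all names are hypothetical.

```python
class RatioEstimator:
    """Tracks the power ratio of two sample streams with exponential averaging."""

    def __init__(self, forget=0.99, eps=1e-12):
        self.forget = forget   # hypothetical forgetting factor
        self.eps = eps
        self.p_num = 0.0
        self.p_den = 0.0

    def update(self, num_sample, den_sample):
        a = self.forget
        self.p_num = a * self.p_num + (1.0 - a) * num_sample * num_sample
        self.p_den = a * self.p_den + (1.0 - a) * den_sample * den_sample
        return self.p_num / (self.p_den + self.eps)


def step_size(r1, mu_max=0.5):
    """Assumed control law: a larger first mixture ratio gives a smaller step size."""
    return mu_max / (1.0 + r1)
```

For example, `step_size(0.0)` returns `mu_max`, and the step size is halved when the two powers are equal.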

13. A signal processing method comprising:
inputting a first mixture signal in which a first signal and a second signal are mixed;
inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixture signal with a first adaptive filter to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixture signal and the estimate of the second signal;
averaging the power of each of the second mixture signal and the first mixture signal, determining the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generating a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
generating a first mixed signal by mixing the corrected second mixture signal and the estimate of the second signal, setting the weight of the corrected second mixture signal to a large value when updating of the coefficients of the first adaptive filter starts and decreasing the weight as the coefficients grow;
estimating, as a first mixture ratio, a ratio of amplitudes or powers of the first signal and the second signal using the first mixed signal and the estimate of the first signal; and
controlling generation of the estimate of the second signal using the first mixture ratio.

14. The signal processing method according to claim 12 or 13, further comprising:
processing the estimate of the first signal with a second adaptive filter to generate an estimate of the third signal;
subtracting the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal, wherein the first adaptive filter processes the estimate of the fourth signal instead of the second mixture signal;
estimating, as a fourth mixture ratio, a ratio of amplitudes or powers of the fourth signal and the third signal using the estimate of the fourth signal, the estimate of the third signal, the estimate of the first signal, the second mixture signal, and a coefficient of the second adaptive filter; and
controlling generation of the estimate of the third signal using the fourth mixture ratio.

15. A signal processing program that causes a computer to execute:
inputting a first mixture signal in which a first signal and a second signal are mixed;
inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixture signal with a first adaptive filter to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixture signal and the estimate of the second signal;
estimating a ratio of amplitudes or powers of the first signal and the second signal as a second mixture ratio using the estimate of the first signal and the estimate of the second signal;
averaging the power of each of the second mixture signal and the first mixture signal, determining the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generating a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
estimating a ratio of amplitudes or powers of the first signal and the second signal as a third mixture ratio using the estimate of the first signal and the corrected second mixture signal;
generating a first mixture ratio by mixing the second mixture ratio and the third mixture ratio, setting the weight of the third mixture ratio to a large value when updating of the coefficients of the first adaptive filter starts and decreasing the weight as the coefficients grow; and
controlling generation of the estimate of the second signal using the first mixture ratio.

16. A signal processing program that causes a computer to execute:
inputting a first mixture signal in which a first signal and a second signal are mixed;
inputting a second mixture signal in which a third signal correlated with the first signal and a fourth signal correlated with the second signal are mixed;
processing the second mixture signal with a first adaptive filter to generate an estimate of the second signal;
generating an estimate of the first signal from the first mixture signal and the estimate of the second signal;
averaging the power of each of the second mixture signal and the first mixture signal, determining the ratio between the averaged power of the second mixture signal and that of the first mixture signal as a scaling factor, and generating a corrected second mixture signal by multiplying the second mixture signal by the square root of the scaling factor;
generating a first mixed signal by mixing the corrected second mixture signal and the estimate of the second signal, setting the weight of the corrected second mixture signal to a large value when updating of the coefficients of the first adaptive filter starts and decreasing the weight as the coefficients grow;
estimating, as a first mixture ratio, a ratio of amplitudes or powers of the first signal and the second signal using the first mixed signal and the estimate of the first signal; and
controlling generation of the estimate of the second signal using the first mixture ratio.

17. A signal processing program that further causes the computer to execute:
processing the estimate of the first signal with a second adaptive filter to generate an estimate of the third signal;
subtracting the estimate of the third signal from the second mixture signal to generate an estimate of the fourth signal, wherein the first adaptive filter processes the estimate of the fourth signal instead of the second mixture signal;
estimating, as a fourth mixture ratio, a ratio of amplitudes or powers of the fourth signal and the third signal using the estimate of the fourth signal, the estimate of the third signal, the estimate of the first signal, the second mixture signal, and a coefficient of the second adaptive filter; and
controlling generation of the estimate of the third signal using the fourth mixture ratio.