JP3514714B2

JP3514714B2 - Sound collection method and device

Info

Publication number: JP3514714B2
Application number: JP2000249547A
Authority: JP
Inventors: 和則小林; 賢一古家
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2000-08-21
Filing date: 2000-08-21
Publication date: 2004-03-31
Anticipated expiration: 2020-08-21
Also published as: JP2002062895A

Description

【発明の詳細な説明】Detailed Description of the Invention

【０００１】[0001]

[Technical Field of the Invention] The present invention relates to a method and an apparatus for high-quality sound collection in applications such as speech recognition, hands-free telephony, television cameras, teleconferencing, remote lectures, and abnormal-sound monitoring. Signals received by a plurality of microphones are filtered and output so as to reduce noise and frequency degradation and to pick up the sound emitted from a target sound source with high quality.

【０００２】[0002]

[Prior Art] First, the meaning of high-quality sound collection is explained.

[0003] A signal received by a microphone contains, in addition to the sound emitted from the target sound source (the target sound), noise such as air-conditioning sound, fan noise from electrical equipment, and electrical noise arising in the microphone amplifier, signal cables, and the like. In addition, the target sound component suffers frequency degradation in the course of sound collection. The smaller the frequency degradation of the target sound component, the closer the collected waveform is to the target sound, and hence the higher the quality. High-quality sound collection therefore means collection with a high SN ratio (the power ratio of the target signal to the noise) and with little frequency degradation of the target sound component.

[0004] Next, an adaptive array using a single virtual target sound source is described.

[0005] An adaptive array filters each of the signals picked up by a plurality of microphones (a microphone array), sums the filtered signals, and outputs the result. By adaptively updating the filter coefficients according to the properties of the noise, such as its intensity, position, and frequency, the array suppresses the noise and picks up the target sound with high quality.
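The filter-and-sum structure described here can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `fir_filter` and `filter_and_sum` and the sample values are assumptions introduced for the example, and the filters are fixed rather than adaptively updated.

```python
def fir_filter(coeffs, signal):
    """Causal FIR filter: out[n] = sum over p of coeffs[p] * signal[n - p]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for p, c in enumerate(coeffs):
            if n - p >= 0:
                acc += c * signal[n - p]
        out.append(acc)
    return out


def filter_and_sum(filters, mic_signals):
    """Filter each microphone signal with its own FIR filter and add the results."""
    filtered = [fir_filter(h, x) for h, x in zip(filters, mic_signals)]
    return [sum(samples) for samples in zip(*filtered)]


# Two microphones, two-tap filters (illustrative values only).
mics = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
taps = [[0.0, 0.5], [0.5, 0.0]]
output = filter_and_sum(taps, mics)
```

In this toy case the first filter delays its input by one sample, so the two impulses line up and add coherently at the output.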

[0006] A conventional adaptive array using a single virtual target sound source uses the noise actually picked up together with a virtual target signal, which virtually synthesizes the sound arriving at the microphones from a single preset virtual target sound source position. The filter coefficients are updated so that the sensitivity of the microphone array to the noise is low and its sensitivity to the virtual target sound source position is high. The sound of a source located at the virtual target sound source position can thus be picked up with high quality.

[0007] The actual target sound source, however, can be expected to lie away from the virtual target sound source position or to move. If the target sound source is a person, for example, the person will certainly move and will not speak from the same position every time. When the actual target sound source deviates from the virtual target sound source position in this way, the prior art cannot correct the mismatch between the two positions. The frequency characteristics for the target sound therefore degrade, making the sound hard to listen to and making speech recognition or abnormal-sound detection difficult.

[0008] Next, the prior art is described in detail.

[0009] FIG. 12 shows a conventional sound collecting device CS11.

[0010] The conventional sound collecting device CS11 comprises microphones 11_1 to 11_M; adders 12_1 to 12_M, 14A, 14B, and 15 (a + sign denotes addition and a − sign denotes subtraction); second variable filters 13A_1 to 13A_M; first variable filters 13B_1 to 13B_M; an adaptive algorithm unit 16; a signal generator 17C; a delay unit 19C; a virtual sound source position setting unit 26C; a spatial characteristic estimating unit 27C; spatial characteristic filters 18C_1 to 18C_M; and an adaptive period detection unit 20.

[0011] Next, the symbols used in the equations below are defined.

[0012] Let n be the time discretized by the sampling period, and let x_i(n) be the signal picked up by the i-th microphone 11_i at time n. Taking L samples of each signal (the samples required by the filter) and stacking them gives the vector
x(n) = [x_1(n), x_1(n−1), …, x_1(n−L+1), x_2(n), …, x_M(n−L+1)]^T.
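As a small illustration of how the stacked vector x(n) is laid out, the following sketch builds it from per-microphone sample lists. The helper name `stacked_vector` and the zero-padding before time 0 are assumptions made for the example.

```python
def stacked_vector(mic_signals, n, L):
    """Return [x_1(n), x_1(n-1), ..., x_1(n-L+1), x_2(n), ..., x_M(n-L+1)].

    Samples before time 0 are taken as zero (an assumption of this sketch).
    """
    x = []
    for signal in mic_signals:      # microphones i = 1..M
        for p in range(L):          # the L most recent samples, newest first
            k = n - p
            x.append(signal[k] if k >= 0 else 0.0)
    return x


mics = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
x_n = stacked_vector(mics, n=2, L=2)
```

With M = 2 and L = 2 the vector has M × L = 4 entries, grouped microphone by microphone.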

[0013] Let v'(n) be the output signal of the signal generator 17C, let g'_i(n) be the spatial characteristic filter for the i-th microphone 11_i, and let u'_i(n) = g'_i(n) * v'(n) be the output of that filter. Taking L samples of each output and stacking them gives the vector
u'(n) = [u'_1(n), u'_1(n−1), …, u'_1(n−L+1), u'_2(n), …, u'_M(n−L+1)]^T.

[0014] Here, * denotes convolution. The second variable filters 13A_1 to 13A_M and the first variable filters 13B_1 to 13B_M are L-tap FIR filters (filters that multiply each data sample by a constant and sum the products), and their filter coefficients are written as the vector
h'(n) = [h_1(n), h_1(n−1), …, h_1(n−L+1), h_2(n), …, h_M(n−L+1)]^T.

[0015] Here, h_i(n−p+1) denotes the p-th tap coefficient of the filter for the i-th microphone at time n, and the second variable filters and the first variable filters use the same filter coefficients. The output of the adder 14A is denoted y'(n), the output of the adder 14B is denoted y(n), the output (error) of the adder 15 is denoted e(n), and the delay of the delay unit 19C is denoted τ'_0.

[0016] Next, the converged solution of the filter and the update equation for the above conventional example are derived.

[0017] First, the mean square of the output (error) e(n) of the adder 15 is obtained. As this mean-square error decreases, the noise power at the output of the adder 14A decreases and the frequency degradation of the virtual target sound at the output of the adder 14A also decreases. The filter that minimizes the mean-square error is therefore taken as the optimum filter.

【００１８】[0018]

[Equation 1] Here, the overline denotes a time average.
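The image for equation (1) is not reproduced in this text. As a hedged reconstruction from the definitions in [0012] to [0015], assuming that the error at the adder 15 is the delayed generator signal minus the output y'(n) of the adder 14A, and writing the filter vector as h(n) as in equation (6), the mean-square error would take a form such as:

```latex
\overline{e^{2}(n)}
  = \overline{\left\{ v'(n-\tau'_{0})
      - \mathbf{h}^{T}(n)\,\bigl(\mathbf{u}'(n)+\mathbf{x}(n)\bigr) \right\}^{2}}
```

The sign convention and the grouping of terms are assumptions; the published equation may be arranged differently.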

[0019] Assuming that the noise and the virtual target sound are uncorrelated, equation (1) can be rewritten as the following equation (2).

【００２０】[0020]

[Equation 2] If the first variable filter h(n) is an L-tap FIR filter (a filter that multiplies each data sample by a constant and sums the products) and equation (2) is written in vector notation, the following equation (3) is obtained.

【００２１】[0021]

[Equation 3] Here, the virtual target signal v'(n) is assumed to be a stationary signal whose average power is

【００２２】[0022]

[Equation 4] and, in addition,

【００２３】[0023]

[Equation 5] holds.

[0024] Since the filter that minimizes equation (3) is the optimum filter, equation (3) is partially differentiated with respect to h(n) and set to zero to find the minimum.

【００２５】[0025]

[Equation 6] Solving equation (4) for h(n) gives the optimum filter h(opt, n) that minimizes equation (3).

【００２６】[0026]

[Equation 7] Adaptive algorithms such as the LMS algorithm, the NLMS algorithm, and the projection algorithm can be used to obtain the optimum filter of equation (5). Here, the update equation is shown for the NLMS method as an example.

[0027] The update equation is given by the following equation (6).

[0028] h(n+1) = h(n) + 2α [{x''(n) e(n)} / {x''(n) x''^T(n)}] … (6)
Here, x''(n) is given by the following equation (7).

[0029] x''(n) = u'(n) + x(n) … (7)
Here, α is the update coefficient, a constant greater than 0 and not greater than 1.
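The NLMS update of equations (6) and (7) can be sketched as follows. This is a minimal sketch under stated assumptions: the denominator of equation (6) is read as the scalar squared norm of the input vector, the error is taken as the desired sample minus the filter output, and the function name `nlms_step` and the small regularizer `eps` are introduced only for the example.

```python
def nlms_step(h, x_in, desired, alpha, eps=1e-12):
    """One update h(n+1) = h(n) + 2*alpha * x(n) e(n) / (x(n)^T x(n)).

    h       : current stacked filter coefficients
    x_in    : stacked filter input, e.g. x''(n) = u'(n) + x(n)
    desired : delayed virtual target sample, e.g. v'(n - tau'_0)
    eps     : small regularizer against division by zero (not in the source)
    """
    y = sum(hk * xk for hk, xk in zip(h, x_in))     # filter output
    e = desired - y                                 # error e(n), assumed sign
    power = sum(xk * xk for xk in x_in) + eps       # squared norm of the input
    h_next = [hk + 2.0 * alpha * xk * e / power for hk, xk in zip(h, x_in)]
    return h_next, e


# Repeatedly presenting one input/target pair drives the error toward zero.
h = [0.0, 0.0]
for _ in range(50):
    h, err = nlms_step(h, [1.0, 2.0], desired=5.0, alpha=0.25)
```

With alpha = 0.25 the error on this fixed pair roughly halves at every step; the value is chosen only to make the toy example converge quickly.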

[0030] The above shows that the optimum filter of equation (5) can be obtained using the update equation (6).

[0031] Next, the signal generator 17C is described.

[0032] The signal generator 17C is used so that the filter update incorporates the condition that sensitivity to the virtual target sound source position be maintained. To maintain sensitivity in all frequency bands, the signal output by the signal generators 17_1 to 17_J must therefore contain all frequency components. Moreover, successive-correction algorithms converge quickly for a white signal (a signal containing all frequency components uniformly). For these reasons, a signal generator that produces white noise is normally used.

[0033] The adaptive period detection unit 20 stops the adaptive operation while the actual target sound is present. If adaptation ran while the actual target sound was present, the filter would be updated so as to reduce the sensitivity to the actual target sound, so the filter update must be stopped in that case. The adaptive period detection unit 20 detects the presence of the actual target sound by monitoring the power of the signals picked up by the microphones, and stops the adaptive operation.
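A power-based detector of this kind could look like the sketch below. The text only states that the signal power is monitored; the frame-based averaging, the noise-power estimate, and the threshold factor `margin` are hypothetical choices made for illustration.

```python
def short_time_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def adaptation_allowed(mic_frames, noise_power, margin=4.0):
    """Return True when no target sound is detected in the current frame.

    mic_frames  : one frame of samples per microphone
    noise_power : estimate of the noise-only power (assumed to be known)
    margin      : hypothetical threshold factor; the source does not give one
    """
    power = sum(short_time_power(f) for f in mic_frames) / len(mic_frames)
    return power <= margin * noise_power


quiet = [[0.1, -0.1, 0.1, -0.1], [0.1, 0.1, -0.1, -0.1]]
loud = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
update_on_quiet = adaptation_allowed(quiet, noise_power=0.01)
update_on_loud = adaptation_allowed(loud, noise_power=0.01)
```

The filter update would run only on frames for which `adaptation_allowed` returns True.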

【００３４】[0034]

[Problems to Be Solved by the Invention] As described above, the conventional sound collecting device CS11 uses the noise actually picked up together with a virtual target signal, which virtually synthesizes the sound arriving at the microphones from a single preset virtual target sound source position. It updates the filter coefficients so that the sensitivity of the microphone array to the noise is low and its sensitivity to the virtual target sound source position is high, in an attempt to pick up the target sound with high quality.

[0035] However, the only position at which the sensitivity of the microphone array becomes high is the virtual target sound source position, not the actual target sound source position. No problem arises if the two positions coincide exactly, but if the actual target sound source position deviates from the virtual one, the frequency characteristics for the target sound degrade.

[0036] The degradation is especially severe for short-wavelength, high-frequency components (several kHz); a deviation of only a few centimeters can markedly degrade the characteristics for the target sound.

[0037] In the above prior art, the position from which sound can be picked up with high quality is limited to the virtual target sound source position. The prior art is therefore difficult to use for a moving sound source (such as a person) or when the source position is not known accurately (as in abnormal-sound monitoring).

[0038] In a conventional adaptive array using a single virtual target sound source, any deviation between the virtual target sound source position and the actual source position degrades the frequency characteristics of the target sound component, so the array is difficult to use for a moving sound source (such as a person) or when the position is not known accurately (as in abnormal-sound monitoring).

[0039] An object of the present invention is to provide a sound collecting method and apparatus that, in an adaptive array, reduce the degradation of the frequency characteristics of the target sound component that arises when the target sound source moves or when its position is not known accurately, and thereby achieve high-quality sound collection.

【００４０】[0040]

[Means for Solving the Problems] The present invention is a sound collecting apparatus having first variable filtering means, which filter the signals picked up by a plurality of arbitrarily placed sound collecting means with respectively different filter coefficients, and adding means, which add the output signals of the first variable filtering means and output the sum. Instead of setting a virtual target sound source position as a single point, a plurality of virtual target sound source positions are set within a predetermined sound collection range, thereby realizing a constraint that maintains sensitivity within that range.

【００４１】[0041]

[Embodiments of the Invention] FIG. 1 is a block diagram showing a sound collecting device CS1 according to a first embodiment of the present invention.

[0042] The sound collecting device CS1 comprises microphones 11_1 to 11_M; first variable filters 13B_1 to 13B_M; second variable filters 13A_1 to 13A_M; spatial characteristic filters 18_{1,1} to 18_{J,M}; signal generators 17_1 to 17_J; delay units 19_1 to 19_J; a sound collection range setting unit 30; a virtual target sound source position setting unit 26; a spatial characteristic estimating unit 27; an adaptive period detection unit 20; an adaptive algorithm unit 16; and adders 12_1 to 12_M, 14A, 14B, 15, 21_1 to 21_M, and 22.

[0043] The sound collecting device CS1 suppresses noise and picks up the target sound with high quality: it picks up the sound of sources inside a preset sound collection range and suppresses the sound of sources outside that range.

[0044] The signals picked up by the microphones 11_1 to 11_M are filtered by the first variable filters 13B_1 to 13B_M, respectively, then added by the adder 14B and output.

[0045] The first variable filters 13B_1 to 13B_M have been trained, as described later, so that the sensitivity is high for the sound collection range set by the sound collection range setting unit 30 and low for noise source positions outside that range. The output of the adder 14B is a high-quality sound with a high target-sound-to-noise ratio (SN ratio).

[0046] The sound collecting device CS1 differs from the conventional example in that the virtual target sound source positions are given as a sound collection range. As a result, even when the target sound source moves within the range or its position is not known accurately, the target sound component does not suffer large frequency degradation and sound can be picked up stably.

[0047] Next, the method of training the first variable filters 13B_1 to 13B_M in the sound collecting device CS1 is described in detail.

[0048] This training is performed using the noise actually picked up, virtual pickup signals synthesized from virtual target sound sources prepared in advance, and the second variable filters. When an actual target sound source is observed, the observed signal always contains noise, and the target sound cannot be separated from the noise; virtual target sound sources, which contain no noise, are therefore used.

[0049] First, the operation of synthesizing the virtual pickup signals using the virtual target sound sources is described.

[0050] The sound collection range setting unit 30 sets the range over which sound is to be picked up (the range of movement of the sound source, the range of the source position measurement error, or the like), and the virtual target sound source position setting unit 26 places virtual target sound source positions uniformly within the set range, for example filling the range at 5 cm intervals. The spacing of the virtual target sound source positions must be sufficiently small. Specifically, consider the two microphones that are farthest apart. For a source located at a given virtual target sound source position, let the first relative delay time be the difference between the time at which the first microphone picks up the sound and the time at which the second microphone picks it up. When the source moves to an adjacent virtual target sound source position, let the second relative delay time be the same difference measured there. The spacing of the virtual target sound source positions is set so that the variation in relative delay (the difference between the first and second relative delay times) is smaller than the period of the highest frequency of the pickup signal.
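The spacing criterion can be checked numerically. The sketch below assumes free-field propagation, a speed of sound of 340 m/s, and two-dimensional coordinates in metres; the function names and the example geometry are illustrative and not taken from the source.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value


def relative_delay(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Arrival-time difference (seconds) between two microphones for one source."""
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def spacing_ok(pos_1, pos_2, mic_a, mic_b, f_max):
    """True if moving between two adjacent virtual positions changes the
    relative delay by less than one period of the highest frequency f_max (Hz)."""
    variation = abs(relative_delay(pos_1, mic_a, mic_b)
                    - relative_delay(pos_2, mic_a, mic_b))
    return variation < 1.0 / f_max


mic_a, mic_b = (-0.5, 0.0), (0.5, 0.0)   # the two microphones farthest apart
near = spacing_ok((0.0, 1.0), (0.05, 1.0), mic_a, mic_b, f_max=4000.0)
far = spacing_ok((0.0, 1.0), (0.60, 1.0), mic_a, mic_b, f_max=4000.0)
```

For this geometry a 5 cm step keeps the delay variation below the 0.25 ms period of a 4 kHz component, whereas a 60 cm step does not.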

[0051] The spatial characteristic estimating unit 27 estimates the spatial characteristics, including the delay time and the attenuation, of the path along which sound travels from each set virtual target sound source position to each microphone position, and sets the coefficients of the spatial characteristic filters 18_{1,1} to 18_{J,M}.
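One simple way to turn a delay and an attenuation into filter coefficients is a single delayed, scaled tap. The sketch below assumes free-field propagation with 1/r attenuation and rounds the delay to a whole sample; the source does not state how the estimating unit 27 forms its estimate, so `spatial_filter` and its parameters are hypothetical.

```python
import math


def spatial_filter(source, mic, fs, c=340.0, length=64):
    """FIR coefficients modelling a pure delay with 1/r attenuation.

    source, mic : positions in metres
    fs          : sampling frequency in Hz
    c           : speed of sound in m/s (assumed value)
    length      : number of filter taps (assumed value)
    """
    r = math.dist(source, mic)              # path length in metres
    delay = round(r / c * fs)               # delay rounded to whole samples
    if delay >= length:
        raise ValueError("filter too short for this delay")
    g = [0.0] * length
    g[delay] = 1.0 / max(r, 1e-3)           # attenuation, guarded near r = 0
    return g


g = spatial_filter(source=(0.0, 1.7), mic=(0.0, 0.0), fs=8000)
```

A source 1.7 m from the microphone gives a delay of 40 samples at 8 kHz in this model.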

[0052] The mutually uncorrelated, stationary signals generated by the signal generators 17_1 to 17_J are filtered by the spatial characteristic filters 18_{1,1} to 18_{J,M} and added, microphone by microphone, by the adders 21_1 to 21_M.

[0053] The spatial characteristic estimating unit 27 also stores in advance each virtual target sound source position in association with the transfer functions from that position to each microphone, and retrieves the transfer functions on the basis of the virtual target sound source position.

[0054] By filtering the mutually uncorrelated, stationary signals generated by the signal generators 17_1 to 17_J with the spatial characteristic filters 18_{1,1} to 18_{J,M} in this way, pickup signals can be synthesized virtually.
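The synthesis step (generators, spatial characteristic filters, and per-microphone summation) can be sketched as follows. The function names, the use of seeded Gaussian noise for the generators, and the one-tap placeholder filters are assumptions made for the example.

```python
import random


def convolve(g, v):
    """Causal FIR filtering of v by g, truncated to len(v) samples."""
    return [sum(g[p] * v[n - p] for p in range(len(g)) if n - p >= 0)
            for n in range(len(v))]


def synthesize_pickup(generators, spatial_filters):
    """Virtual pickup signal per microphone: sum over sources of g * v_j.

    generators      : J source signals v_j(n)
    spatial_filters : spatial_filters[j][i] is the FIR filter for source j, mic i
    """
    num_mics = len(spatial_filters[0])
    length = len(generators[0])
    pickup = [[0.0] * length for _ in range(num_mics)]
    for v, filters_for_source in zip(generators, spatial_filters):
        for i, g in enumerate(filters_for_source):
            for n, sample in enumerate(convolve(g, v)):
                pickup[i][n] += sample
    return pickup


rng = random.Random(0)
# J = 2 independently drawn white Gaussian sources.
v = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(2)]
# M = 2 microphones; pure delays used as placeholder spatial filters.
g = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
pickup = synthesize_pickup(v, g)
```

Here microphone 1 receives source 1 directly and source 2 one sample late, and microphone 2 the reverse, which mimics two sources on opposite sides of the array.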

[0055] Next, the adders 12_1 to 12_M add the virtually synthesized pickup signals to the noise signals actually picked up; the sums are filtered by the second variable filters 13A_1 to 13A_M and then added by the adder 14A. The output of the adder 14A is the output for the virtually synthesized pickup signals.

[0056] If the noise component in this output is small and the degradation of the virtual target sound component is small, sound is being picked up with high quality. The adder 15, serving as the subtraction means, therefore takes the difference between the output signal of the adder 22 (the fifth adding means), which carries the original virtual target sound, and the output signal of the adder 14A (the fourth adding means), and the second variable filters 13A_1 to 13A_M are updated using the output of the adder 15 as the error signal.

[0057] To allow for a delay between input and output and so enable the second variable filters (the training filters) to be trained efficiently, the delay units 19_1 to 19_J add a delay to the original virtual target sounds, and the signal obtained by adding the delayed sounds in the adder 22 is used in the subtraction by the adder 15.
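The reference path (delay units, adder 22, and the subtraction at adder 15) can be sketched as below. The sign convention, reference minus filtered output, is an assumption of this sketch, as are the function names and sample values.

```python
def delayed(signal, tau):
    """Delay a signal by tau samples, padding the start with zeros."""
    return [0.0] * tau + signal[:len(signal) - tau]


def error_signal(generators, filtered_output, tau):
    """e(n) = sum over j of v_j(n - tau) minus y'(n).

    generators      : J virtual source signals v_j(n)
    filtered_output : y'(n), the output of the adder 14A
    tau             : common delay of the delay units, in samples
    """
    reference = [sum(column)
                 for column in zip(*(delayed(v, tau) for v in generators))]
    return [d - y for d, y in zip(reference, filtered_output)]


v = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
y_prime = [0.0, 0.0, 1.5, 2.5]   # a filter output that matches the reference
e = error_signal(v, y_prime, tau=2)
```

When the filtered output reproduces the delayed sum of the generator signals exactly, as in this toy case, the error is zero at every sample.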

[0058] On the basis of the error signal output by the adder 15 and the input signals (training signals) to the second variable filters 13A_1 to 13A_M, the adaptive algorithm unit 16 obtains the update vector for the second variable filters so that the mean square of the error signal is minimized.

[0059] The same filter coefficients as in the second variable filters 13A_1 to 13A_M are set in the first variable filters 13B_1 to 13B_M, which pick up the sound of a target sound source within the set sound collection range and suppress noise.

[0060] If the signals picked up by the microphones 11_1 to 11_M contain the actual target sound, the filters would be trained to lower the sensitivity to the actual target sound source, so the filter update must be stopped while the actual target sound is present. The adaptive period detection unit 20 detects the presence of the actual target sound by monitoring the power of the signals picked up by the microphones 11_1 to 11_M, and stops the adaptive operation of the first variable filters 13B_1 to 13B_M and the second variable filters 13A_1 to 13A_M.

[0061] Next, the adaptive algorithm unit 16 is described in detail.

[0062] Adaptive algorithms include the LMS algorithm, the NLMS algorithm, and the projection algorithm. In this specification, the converged solution of the filter and the update equation are derived below for the NLMS method as an example.

[0063] First, the symbols used in the equations are explained.

[0064] Let n be the time discretized by the sampling period, M the number of microphones, J the number of virtual target sound sources, and x_i(n) the signal picked up by the i-th microphone 11_i at time n. Taking L samples of each signal and stacking them gives the vector
x(n) = [x_1(n), x_1(n−1), …, x_1(n−L+1), x_2(n), …, x_M(n−L+1)]^T.

[0065] Let v_j(n) be the output signal of the j-th signal generator 17_j, let g_{i,j}(n) be the spatial characteristic filter for the j-th signal generator 17_j and the i-th microphone 11_i, and let u_{i,j}(n) = g_{i,j}(n) * v_j(n) be the output of that filter. Taking L samples of each output (the samples required by the filter) and stacking them gives the vector
u_j(n) = [u_{1,j}(n), u_{1,j}(n−1), …, u_{1,j}(n−L+1), u_{2,j}(n), …, u_{M,j}(n−L+1)]^T.
Here, * denotes convolution.

[0066] The second variable filters 13A_1 to 13A_M and the first variable filters 13B_1 to 13B_M are L-tap FIR filters, and their filter coefficients are written as the vector
h(n) = [h_1(n), h_1(n−1), …, h_1(n−L+1), h_2(n), …, h_M(n−L+1)]^T.
Here, h_i(n−p+1) denotes the p-th tap coefficient of the filter for the i-th microphone at time n, and the second variable filters 13A_1 to 13A_M and the first variable filters 13B_1 to 13B_M use the same filter coefficients.

[0067] Let y'(n) be the output of the adder 14A, y(n) the output of the adder 14B, e(n) the output (error) of the adder 15, and τ_0 the delay of the delay units 19_1 to 19_J (τ_0 is normally half the tap length of the second variable filters); all the delays τ_0 are assumed to be equal.

[0068] First, the mean square of the output (error) e(n) of the adder 15 is obtained. The filter that minimizes this mean-square error is the optimum filter.

【００６９】[0069]

[Equation 8] In equation (8), the overline denotes a time average. Since the virtual target signals v_j(n) are mutually uncorrelated and the virtual target signals are uncorrelated with the noise, equation (8) can be rewritten as the following equation (9).

【００７０】[0070]

[Equation 9] If the first variable filter h(n) is an L-tap FIR filter (a filter that multiplies each data sample by a constant and sums the products) and equation (9) is written in vector notation, the following equation (10) is obtained.

【００７１】[0071]

[Equation 10] Here, each virtual target signal v_j(n) is assumed to be a stationary signal whose average power is

【００７２】[0072]

[Equation 11] and, in addition,

【００７３】[0073]

[Equation 12] is assumed to hold.

[0074] Since the filter that minimizes equation (10) is the optimum filter, equation (10) is partially differentiated with respect to h(n) and set to zero to find the minimum.

【００７５】[0075]

[Equation 13] Solving equation (11) for h(n) gives the optimum filter h(opt, n) that minimizes equation (10).

【００７６】[0076]

[Equation 14] Adaptive algorithms such as the LMS algorithm, the NLMS algorithm, and the projection algorithm can be used to obtain the optimum filter of equation (12).

[0077] This specification uses the NLMS algorithm as an example; its update equation is given by the following equation (13).

[0078] h(n+1) = h(n) + 2α[{x'(n)e(n)} / {x'(n)x'^T(n)}] ... Equation (13)
Here, x'(n) is given by the following equation (14).

【００７９】[0079]

[Equation 15]
The description so far has shown that the optimum filter of equation (12) can be obtained by using the update equation (13).
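The update of equation (13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `nlms_update` and the small constant `eps` (added to avoid division by zero) are not part of the specification, and `x` stands for the learning-signal vector x'(n) of equation (14), whose exact layout is not reproduced here.

```python
from typing import List, Sequence


def nlms_update(h: Sequence[float], x: Sequence[float], e: float,
                alpha: float, eps: float = 1e-12) -> List[float]:
    """One NLMS step following equation (13):
    h(n+1) = h(n) + 2*alpha * x'(n) e(n) / (x'(n) x'(n)^T).

    h     : current filter coefficient vector h(n)
    x     : learning-signal vector x'(n), same length as h
    e     : error sample e(n)
    alpha : step size
    eps   : hypothetical regularizer, not in the source equation
    """
    power = sum(v * v for v in x)            # x'(n) x'(n)^T is a scalar
    scale = 2.0 * alpha * e / (power + eps)  # normalized step
    return [hk + scale * xk for hk, xk in zip(h, x)]


if __name__ == "__main__":
    print(nlms_update([0.0, 0.0], [1.0, 0.0], e=1.0, alpha=0.25))
```

Normalizing by the input power makes the effective step size independent of the signal level, which is the property that distinguishes NLMS from plain LMS.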

[0080] The sound collecting device CS1 can be used as a sound collecting device for voice recognition, hands-free telephones, TV cameras, teleconferencing, remote lectures, abnormal sound monitoring, and the like. By setting a plurality of virtual target sound source positions within a preset sound collection range, it realizes a constraint that maintains sensitivity within that range, so that a target sound source within the sound collection range can be picked up with little deterioration of the frequency characteristics while noise from outside the range is suppressed. Furthermore, even if the target sound source moves within the range, the filter need not be updated, and the movement of the sound source causes no loss of performance.

[0081] As described above, the above embodiment has an excellent feature not found in the conventional examples: high-quality sound collection is possible even when the target sound source moves or when its position is not accurately known.

[0082] In other words, the sound collecting device CS1 is a sound collecting device having first variable filtering means for filtering, with respectively different filter coefficients, the sound pickup signals collected by a plurality of arbitrarily arranged sound collecting means, and first adding means 14B for adding the output signals of the first variable filtering means and outputting the sum, the device comprising: sound collection range setting means 30 for setting a predetermined sound collection range; virtual target sound source position setting means 26 for setting a plurality of virtual target sound source positions within the sound collection range; spatial characteristic estimating means 27 for estimating, based on each virtual target sound source position and the position of each sound collecting means, spatial characteristics including the delay time and the attenuation of sound traveling from each virtual target sound source position to the position of each sound collecting means; pseudo target signal generating means 17 for generating mutually uncorrelated, stationary pseudo target signals equal in number to the virtual target sound source positions; spatial characteristic filtering means 18 for filtering each pseudo target signal using each spatial characteristic estimated by the spatial characteristic estimating means as filter coefficients; second adding means 21 for synthesizing pseudo target sound pickup signals by adding the output signals of the spatial characteristic filtering means for each sound collecting means; third adding means 12 for synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal; second variable filtering means 13 for filtering the synthesized learning signals with respectively different filter coefficients; fourth adding means 14 for adding together the output signals of the second variable filtering means; delay means 19 for delaying each of the pseudo target signals; fifth adding means 22 for adding together the delayed output signals from the delay means 19; subtracting means 15 for obtaining an error signal by subtracting the output signal of the fourth adding means 14 from the output signal of the fifth adding means 22; an adaptive period detection unit 20 for detecting, based on the sound pickup signals, a period during which no sound source exists within the sound collection range, and treating the detected period as the period in which adaptation is to be performed; and adaptive algorithm means 16 for updating the second variable filter coefficients and the first variable filter coefficients, during the period detected by the adaptive period detection unit, so that the mean square value of the error signal is minimized.
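The signal flow through the adders 21, 12, and 22 described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names (`fir`, `delay`, `learning_signals`, `desired_signal`) and the nested-list data layout are assumptions, and the spatial characteristics are taken to be short FIR coefficient lists.

```python
from typing import List, Sequence


def fir(coeffs: Sequence[float], signal: Sequence[float]) -> List[float]:
    """Causal FIR filtering: out[n] = sum_k coeffs[k] * signal[n - k]."""
    return [sum(c * signal[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(signal))]


def delay(signal: Sequence[float], d: int) -> List[float]:
    """Delay a signal by d samples, keeping its length."""
    return ([0.0] * d + list(signal))[:len(signal)]


def learning_signals(mic, pseudo, spatial):
    """mic[m]: sound pickup signal of microphone m.
    pseudo[j]: pseudo target signal of virtual source j.
    spatial[j][m]: FIR coefficients modeling the path from source j to mic m.
    """
    out = []
    for m, x_m in enumerate(mic):
        u_m = [0.0] * len(x_m)
        for j, v_j in enumerate(pseudo):
            g = fir(spatial[j][m], v_j)                # spatial characteristic filter 18
            u_m = [a + b for a, b in zip(u_m, g)]      # second adder 21
        out.append([a + b for a, b in zip(x_m, u_m)])  # third adder 12
    return out


def desired_signal(pseudo, tau0: int) -> List[float]:
    """Sum of the pseudo target signals, each delayed by tau0 (delays 19, adder 22)."""
    total = [0.0] * len(pseudo[0])
    for v_j in pseudo:
        total = [a + b for a, b in zip(total, delay(v_j, tau0))]
    return total
```

In use, the pseudo target signals would be mutually uncorrelated noise sequences, and the error e(n) would be the desired signal minus the summed outputs of the second variable filters applied to the learning signals.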

[0083] FIG. 2 is a diagram explaining the features of the above embodiment in comparison with the conventional example.

[0084] The conventional example is a device that uses a single virtual target sound source. The above embodiment, in contrast, extends such a device (AMNOR or the like) to a plurality of virtual target signal sources: as shown in FIG. 2, a plurality of mutually uncorrelated virtual target signal sources are set within a predetermined range, thereby realizing a constraint that maintains sensitivity within that range.

[0085] FIG. 3 is a diagram explaining the configuration of the above embodiment in comparison with the configuration of the conventional example.

[0086] FIG. 3(1) shows the basic configuration of the conventional example (a device using a single virtual target sound source, such as AMNOR), while FIG. 3(2) shows the basic configuration of the above embodiment.

[0087] In AMNOR and similar methods, the filter is trained to maintain sensitivity at a single position, so the frequency characteristics of the target sound deteriorate when the speaker moves away from the set position. The above embodiment, in contrast, has a plurality of signal generators that produce mutually uncorrelated signals, thereby simulating a situation in which a plurality of virtual target sound sources exist and realizing a constraint that maintains sensitivity within the set range. As a result, the signal of a sound source within the set range can be picked up without significant deterioration of the frequency characteristics, and noise from outside the range can be suppressed. Furthermore, even if the sound source moves within the range, the filter need not be updated, and the movement causes no loss of performance.

[0088] FIG. 4 is a block diagram showing a sound collecting device CS2 according to a second embodiment of the present invention.

[0089] The sound collecting device CS2 differs from the sound collecting device CS1 in that the first variable filters 13B₁ to 13B_M are replaced with semi-fixed filters 23₁ to 23_M (filters that hold their filter coefficients but whose coefficients can be rewritten), a sound pickup signal storage unit 25 is provided between the microphones 11₁ to 11_M and the adders 21₁ to 21_M, a filter coefficient storage unit 24 is provided between the adaptive algorithm unit 16 and the semi-fixed filters 23₁ to 23_M, and the adaptive period detection unit 20 is removed.

[0090] First, before the target sound is collected, the sound collecting device CS2 stores only noise in the sound pickup signal storage unit 25. Next, the sound pickup signal storage unit 25 outputs the stored signals and, as in the sound collecting device CS1, the second variable filters 13A₁ to 13A_M are updated; learning continues until the second variable filters 13A₁ to 13A_M have sufficiently converged.

[0091] Since, as described above, the stored sound pickup signals contain no target sound, there is no need to stop the adaptive operation, and hence no need to provide the adaptive period detection unit 20.

[0092] Filter coefficients identical to those of the sufficiently trained second variable filters 13A₁ to 13A_M are transferred from the adaptive algorithm unit 16 to the filter coefficient storage unit 24, which stores them. The filter coefficient storage unit 24 sets the filter coefficients in the semi-fixed filters 23₁ to 23_M, and during collection of the target sound the semi-fixed filters 23₁ to 23_M are used with their coefficients fixed.

[0093] This makes it possible to use the microphones 11₁ to 11_M, the semi-fixed filters 23₁ to 23_M, and the adder 14B separately from the other parts, which gives the advantages of portability and space saving.
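The stand-alone pickup path (microphones, semi-fixed filters 23, adder 14B) amounts to a filter-and-sum structure whose coefficients are loaded once and then held. The sketch below is illustrative only; the class name `SemiFixedFilterBank` and its methods are hypothetical, and the coefficients are assumed to be FIR taps produced by an offline learning stage.

```python
from typing import List, Sequence


class SemiFixedFilterBank:
    """Holds per-channel FIR coefficients that can be rewritten via load()."""

    def __init__(self, num_channels: int) -> None:
        # Start with all-zero single-tap filters until coefficients are loaded.
        self.coeffs: List[List[float]] = [[0.0] for _ in range(num_channels)]

    def load(self, coeffs: Sequence[Sequence[float]]) -> None:
        """Copy learned coefficients (e.g. from a coefficient store) into the bank."""
        self.coeffs = [list(h) for h in coeffs]

    def process(self, mic: Sequence[Sequence[float]]) -> List[float]:
        """Filter each microphone signal and sum the results (role of adder 14B)."""
        n_samples = len(mic[0])
        y = [0.0] * n_samples
        for x_m, h_m in zip(mic, self.coeffs):
            for n in range(n_samples):
                for k, c in enumerate(h_m):
                    if n - k >= 0:
                        y[n] += c * x_m[n - k]
        return y
```

Because `process` never modifies the coefficients, the learning stage can run elsewhere and at any speed, which matches the separation of learning and pickup described for CS2.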

[0094] In addition, since the filter learning process need not be computed in real time, it can be implemented with little hardware, and the computation can be performed even on a general-purpose computer such as a personal computer. However, because the filter coefficients of the semi-fixed filters 23₁ to 23_M are fixed, the sound collecting device CS2 has the disadvantage that it cannot follow movement of the noise source.

[0095] The rest of the configuration of the sound collecting device CS2 is the same as that of the sound collecting device CS1, so its description is omitted.

[0096] The sound pickup signal storage unit 25 is an example of sound pickup signal storage means that is provided between each sound collecting means 11 and each third adding means 12 and stores the sound pickup signals. The filter coefficient storage unit 24 is an example of filter coefficient storage means that is provided between the adaptive algorithm means 16 and each first variable filtering means 13 and stores the first variable filter coefficients.

[0097] FIG. 5 is a block diagram showing a sound collecting device CS3 according to a third embodiment of the present invention.

[0098] The sound collecting device CS3 is obtained from the sound collecting device CS1 or CS2 by replacing the spatial characteristic filters 18_{1,1} to 18_{J,M} with delay devices 28_{1,1} to 28_{J,M} and implementing the spatial characteristic estimating unit 27 with a distance calculation unit 271 and an inter-microphone relative delay amount calculation unit 272.

[0099] The other components are the same as those of the sound collecting device CS1 or CS2 and are therefore omitted from FIG. 5.

[0100] The distance calculation unit 271 calculates the distance between each virtual target sound source position and each microphone position. The inter-microphone relative delay amount calculation unit 272 divides each distance output by the distance calculation unit 271 by the speed of sound to obtain a delay time, subtracts the minimum delay time from each delay time to obtain the inter-microphone relative delay amounts, and sets them in the delay devices 28_{1,1} to 28_{J,M}.
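The relative delay computation described above can be sketched as follows. Several details are assumptions made for illustration: the function name `relative_delays`, a speed of sound of 340 m/s, taking the minimum over the microphones for one virtual source, and rounding the result to whole samples at a sampling rate `fs`.

```python
import math
from typing import List, Sequence

SPEED_OF_SOUND = 340.0  # m/s, assumed value


def relative_delays(source: Sequence[float],
                    mics: Sequence[Sequence[float]],
                    fs: float,
                    c: float = SPEED_OF_SOUND) -> List[int]:
    """Relative delay (in samples) from one virtual source to each microphone.

    1. distance between source and each microphone (role of unit 271)
    2. delay time = distance / speed of sound
    3. subtract the minimum delay time so the nearest microphone gets 0
    """
    times = [math.dist(source, m) / c for m in mics]
    t_min = min(times)
    return [round((t - t_min) * fs) for t in times]


if __name__ == "__main__":
    print(relative_delays((0.0, 0.0), [(3.4, 0.0), (6.8, 0.0)], fs=1000.0))
```

Calling the function once per virtual source position yields the J x M table of delays that would be set in the delay devices.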

[0101] By replacing the spatial characteristics with delays alone, the sound collecting device CS3 has the advantage that the amount of computation is reduced and the device can be built with little hardware.

[0102] The rest of the configuration of the sound collecting device CS3 is the same as that of the sound collecting device CS1 or CS2, so its description is omitted.

[0103] In other words, the sound collecting device CS3 is an example of a sound collecting device having first variable filtering means for filtering, with respectively different filter coefficients, the sound pickup signals collected by a plurality of arbitrarily arranged sound collecting means, and first adding means 14B for adding the output signals of the first variable filtering means and outputting the sum, the device comprising: sound collection range setting means 30 for setting a predetermined sound collection range; virtual target sound source position setting means 26 for setting a plurality of virtual target sound source positions within the sound collection range; spatial characteristic estimating means 27 for estimating, based on each virtual target sound source position and the position of each sound collecting means, spatial characteristics including the delay time and the attenuation of sound traveling from each virtual target sound source position to the position of each sound collecting means, the spatial characteristic estimating means 27 including distance calculating means 271 for calculating the distance from each virtual target sound source position to the position of each sound collecting means 11, and inter-sound-collecting-means relative delay amount calculating means 272 for obtaining the relative delay amounts between the sound collecting means 11 from the distances calculated by the distance calculating means 271 and the speed of sound; pseudo target signal generating means 17 for generating mutually uncorrelated, stationary pseudo target signals equal in number to the virtual target sound source positions; a plurality of first delay means 28 for delaying the pseudo target signals output by the signal generating means 17 by the relative delay amounts obtained by the relative delay amount calculating means 272; second adding means 21 for synthesizing pseudo target sound pickup signals by adding the output signals of the delay means for each sound collecting means; third adding means 12 for synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal; second variable filtering means 13 for filtering the synthesized learning signals with respectively different filter coefficients; fourth adding means 14 for adding together the output signals of the second variable filtering means; second delay means 19 for delaying each of the pseudo target signals; fifth adding means 22 for adding together the delayed output signals from the second delay means 19; subtracting means 15 for obtaining an error signal by subtracting the output signal of the fourth adding means 14 from the output signal of the fifth adding means 22; an adaptive period detection unit 20 for detecting, based on the sound pickup signals, a period during which no sound source exists within the sound collection range, and treating the detected period as the period in which adaptation is to be performed; and adaptive algorithm means 16 for updating the second variable filter coefficients and the first variable filter coefficients, during the period detected by the adaptive period detection unit, so that the mean square value of the error signal is minimized.

[0104] FIG. 6 is a diagram showing the configuration of a sound collecting device CS4 according to a fourth embodiment of the present invention.

[0105] The sound collecting device CS4 is obtained from the sound collecting device CS1 or CS2 by replacing the spatial characteristic filters 18_{1,1} to 18_{J,M} with delay devices 28_{1,1} to 28_{J,M} and gains (amplifiers) 29_{1,1} to 29_{J,M}, and implementing the spatial characteristic estimating unit 27 with a distance calculation unit 271, an inter-microphone relative delay amount calculation unit 272, and an inter-microphone relative attenuation amount calculation unit 273.

[0106] The other components are the same as those of the sound collecting device CS1 or CS2 and are therefore omitted from FIG. 6.

[0107] The distance calculation unit 271 calculates the distance between each virtual target sound source position and each microphone position. The inter-microphone relative delay amount calculation unit 272 divides each distance output by the distance calculation unit 271 by the speed of sound to obtain a delay time, subtracts the minimum delay time from each delay time to obtain the inter-microphone relative delay amounts, and sets them in the delay devices 28_{1,1} to 28_{J,M}.

[0108] The inter-microphone relative attenuation amount calculation unit 273 takes the reciprocal of each distance output by the distance calculation unit 271 to obtain an attenuation amount, subtracts the attenuation amount of the reference microphone from each attenuation amount to obtain the inter-microphone relative attenuation amounts, and sets them in the gains 29_{1,1} to 29_{J,M}.
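One possible reading of the relative attenuation computation is sketched below. It rests on an assumption that the text does not state: the attenuation obtained from the reciprocal of the distance is expressed in decibels, so that subtracting the reference microphone's attenuation corresponds to a ratio of linear gains. The function name `relative_gains` and the `ref` index are hypothetical.

```python
import math
from typing import List, Sequence


def relative_gains(source: Sequence[float],
                   mics: Sequence[Sequence[float]],
                   ref: int = 0) -> List[float]:
    """Linear gain for each microphone relative to the reference microphone.

    Assumes a 1/distance amplitude law and that the subtraction in the text
    is carried out on decibel values (an interpretation, not a stated fact).
    Distances must be non-zero.
    """
    att_db = [20.0 * math.log10(1.0 / math.dist(source, m)) for m in mics]
    rel_db = [a - att_db[ref] for a in att_db]   # subtract reference attenuation
    return [10.0 ** (r / 20.0) for r in rel_db]  # back to linear gain


if __name__ == "__main__":
    print(relative_gains((0.0, 0.0), [(1.0, 0.0), (2.0, 0.0)]))
```

Under this reading the reference microphone always receives a gain of 1, and a microphone twice as far from the virtual source receives a gain of one half.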

[0109] The sound collecting device CS4 replaces the above spatial characteristics with delay and attenuation alone, which reduces the amount of computation and allows the device to be built with little hardware.

[0110] The sound collecting device CS4 requires more computation than the sound collecting device CS3, but it approximates the spatial characteristics well and gives good results even for microphone arrangements in which a spherical wave model must be assumed (that is, when the microphone array is large relative to the distance between the microphones and the sound source).

[0111] The rest of the configuration of the sound collecting device CS4 is the same as that of the sound collecting device CS1 or CS2, so its description is omitted.

[0112] In other words, the sound collecting device CS4 is an example of a sound collecting device having first variable filtering means for filtering, with respectively different filter coefficients, the sound pickup signals collected by a plurality of arbitrarily arranged sound collecting means, and first adding means 14B for adding the output signals of the first variable filtering means and outputting the sum, the device comprising: sound collection range setting means 30 for setting a predetermined sound collection range; virtual target sound source position setting means 26 for setting a plurality of virtual target sound source positions within the sound collection range; spatial characteristic estimating means 27 for estimating, based on each virtual target sound source position and the position of each sound collecting means, spatial characteristics including the delay time and the attenuation of sound traveling from each virtual target sound source position to the position of each sound collecting means, the spatial characteristic estimating means 27 including distance calculating means 271 for calculating the distance from each virtual target sound source position to the position of each sound collecting means, inter-sound-collecting-means relative delay amount calculating means 272 for obtaining the relative delay amounts between the sound collecting means from the distances calculated by the distance calculating means 271 and the speed of sound, and inter-sound-collecting-means relative attenuation amount calculating means 273 for obtaining the relative attenuation amounts between the sound collecting means from the distances calculated by the distance calculating means 271; pseudo target signal generating means 17 for generating mutually uncorrelated, stationary pseudo target signals equal in number to the virtual target sound source positions; a plurality of first delay means 28 for delaying the pseudo target signals output by the signal generating means by the relative delay amounts obtained by the relative delay amount calculating means 272; a plurality of gain means 29 for attenuating the pseudo target signals output by the respective delay means 28 by the relative attenuation amounts obtained by the relative attenuation amount calculating means 273; second adding means 21 for synthesizing pseudo target sound pickup signals by adding the output signals of the gain means for each sound collecting means; third adding means 12 for synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal; second variable filtering means 13 for filtering the synthesized learning signals with respectively different filter coefficients; fourth adding means 14 for adding together the output signals of the second variable filtering means; second delay means 19 for delaying each of the pseudo target signals; fifth adding means 22 for adding together the delayed output signals from the second delay means 19; subtracting means 15 for obtaining an error signal by subtracting the output signal of the fourth adding means 14 from the output signal of the fifth adding means 22; an adaptive period detection unit 20 for detecting, based on the sound pickup signals, a period during which no sound source exists within the sound collection range, and treating the detected period as the period in which adaptation is to be performed; and adaptive algorithm means 16 for updating the second variable filter coefficients and the first variable filter coefficients, during the period detected by the adaptive period detection unit, so that the mean square value of the error signal is minimized.

[0113] FIG. 7 is a block diagram showing an adaptive period detection unit 20a, which is one specific example of the adaptive period detection unit 20 in each of the above embodiments.

[0114] The adaptive period detection unit 20a comprises a short-time average power calculation unit 201, a noise power setting unit 202, a threshold coefficient multiplication unit 205, and a power comparison unit 203.

[0115] The short-time average power calculation unit 201 obtains and outputs the short-time average power of one channel, or the average over a plurality of channels, of the signals picked up by the microphones 11₁ to 11_M. The short time here is, for example, 10 to 100 msec.

[0116] The noise power setting unit 202 obtains the long-time average of noise power measured in advance and outputs that noise power (a constant value). The long time here is, for example, 1 to 10 sec.

[0117] The threshold coefficient multiplication unit 205 multiplies the output of the noise power setting unit 202 by a threshold coefficient and sets the result as the threshold. The threshold coefficient is determined according to the magnitude of the fluctuation of the short-time average power of the noise; for example, if the short-time average power of the noise fluctuates by 10% around the long-time average, the threshold coefficient is set to 1.1.

[0118] The power comparison unit 203 compares the output of the short-time average power calculation unit 201 with the threshold set by the threshold coefficient multiplication unit 205 and, when the short-time average power exceeds the threshold, outputs an adaptive operation stop signal to the adaptive algorithm unit 16.
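The decision made by units 201, 202, 205, and 203 can be sketched as a simple power comparison. The function names and the frame-based processing are assumptions for illustration; `noise_power` stands for the pre-measured long-time average noise power, and the default coefficient 1.1 is the example value given in the text.

```python
from typing import Sequence


def short_time_power(frame: Sequence[float]) -> float:
    """Average power of one short frame (for example 10 to 100 ms of samples)."""
    return sum(s * s for s in frame) / len(frame)


def adaptation_stopped(frame: Sequence[float],
                       noise_power: float,
                       threshold_coeff: float = 1.1) -> bool:
    """Return True when adaptation should stop for this frame.

    threshold = noise_power * threshold_coeff   (units 202 and 205)
    stop if short-time power exceeds threshold  (unit 203)
    """
    return short_time_power(frame) > threshold_coeff * noise_power


if __name__ == "__main__":
    print(adaptation_stopped([1.0, -1.0, 1.0, -1.0], noise_power=1.0))
    print(adaptation_stopped([2.0, -2.0, 2.0, -2.0], noise_power=1.0))
```

A frame whose power stays within the expected fluctuation of the noise leaves adaptation running; a louder frame is taken to contain the target sound and halts the coefficient update.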

[0119] With the adaptive period detection unit 20a configured as above, the target sound is detected by exploiting the stationarity of the noise and the non-stationarity of the target sound, which has the advantage that the target sound can be detected with simple processing.

[0120] In other words, the adaptive period detecting means 20a is an example of means including short-time average power calculating means 201 for calculating the short-time average power of the sound pickup signal, noise power setting means 202 for setting the long-time average power of noise measured in advance, threshold setting means 205 for setting, as a threshold, the value obtained by multiplying the noise power by a threshold coefficient, and a power comparison unit 203 for detecting the adaptive period by comparing the threshold with the short-time average power.

[0121] FIG. 8 is a block diagram showing an adaptive period detection unit 20b, which is another specific example of the adaptive period detection unit 20 in each of the above embodiments.

【０１２２】適応期間検出部２０ｂは、短時間平均パワ
ー計算部２０１と、長時間平均パワー計算部２０４と、
閾値係数乗算部２０５と、パワー比較部２０３とを有す
る。The adaptive period detection unit 20b includes a short-time average power calculation unit 201, a long-time average power calculation unit 204, a threshold coefficient multiplication unit 205, and a power comparison unit 203.

【０１２３】短時間平均パワー計算部２０１は、マイク
ロホンで１１₁〜１１_Mで収音した信号のうちで、１チャ
ネルまたは複数チャネルの平均の短時間平均パワーを求
め、出力する。The short-time average power calculation unit 201 computes and outputs the short-time average power of one channel, or the average over a plurality of channels, of the signals picked up by the microphones 11_1 to 11_M.

【０１２４】長時間平均パワー計算部２０４は、マイク
ロホンで１１₁〜１１_Mで収音した信号のうちで、１チャ
ネルまたは複数チャネル平均の長時間平均パワーを求め
る。The long-time average power calculation unit 204 computes the long-time average power of one channel, or the average over a plurality of channels, of the signals picked up by the microphones 11_1 to 11_M.

【０１２５】閾値係数乗算部２０５は、長時間平均パワ
ー計算部２０４の出力に閾値係数を乗算し、閾値として
設定する。上記閾値係数は、雑音の短時間平均パワーの
変動の大きさに応じて決定し、たとえば、雑音の短時間
平均パワーが、長時間平均を中心に１０％の変動がある
場合、上記閾値係数が１．１に設定される。The threshold coefficient multiplication unit 205 multiplies the output of the long-time average power calculation unit 204 by a threshold coefficient and sets the result as the threshold. The threshold coefficient is determined according to the magnitude of the fluctuation of the short-time average power of the noise. For example, when the short-time average power of the noise fluctuates by 10% around the long-time average, the threshold coefficient is set to 1.1.

【０１２６】パワー比較部２０３は、短時間平均パワー
計算部２０１の出力と、閾値係数乗算部２０５に応じて
設定された閾値とを比較し、短時間平均パワーが、閾値
を超えた場合に、適応アルゴリズム部１６に、適応動作
停止信号を出力する。The power comparison unit 203 compares the output of the short-time average power calculation unit 201 with the threshold set by the threshold coefficient multiplication unit 205 and, when the short-time average power exceeds the threshold, outputs an adaptive operation stop signal to the adaptive algorithm unit 16.

【０１２７】適応期間検出部２０ｂを上記のように構成
すると、目的音の非定常性が、雑音の非定常性よりも強
いことに注目した目的音検出を行っており、簡単な処理
で目的音の検出ができるという利点を持つ。When the adaptive period detection unit 20b is configured as described above, target sound detection relies on the fact that the target sound is more strongly non-stationary than the noise, which has the advantage that the target sound can be detected with simple processing.

【０１２８】適応期間検出部２０ｂは、適応期間検出部
２０ａに比べ、多少処理量は多いが、雑音パワーの緩や
かな変化に追従することができ、雑音レベルを予め測定
する必要がないという利点を持つ。Compared with the adaptive period detection unit 20a, the adaptive period detection unit 20b requires somewhat more processing, but it has the advantages that it can follow gradual changes in the noise power and that the noise level does not need to be measured in advance.

【０１２９】つまり、適応期間検出部２０ｂは、上記収
音信号の短時間平均パワーを計算する短時間平均パワー
計算手段２０１と、上記収音信号の長時間平均パワーを
計算する長時間平均パワー計算手段２０４と、上記長時
間平均パワーに閾値係数を乗じた値を閾値として設定す
る閾値係数乗算手段２０５と、上記閾値と上記短時間平
均パワーとを比較し、適応期間を検出するパワー比較部
２０３とを含む手段の例である。In other words, the adaptive period detection unit 20b is an example of means including: short-time average power calculating means 201 for calculating the short-time average power of the sound pickup signal; long-time average power calculating means 204 for calculating the long-time average power of the sound pickup signal; threshold coefficient multiplication means 205 for setting, as a threshold, the value obtained by multiplying the long-time average power by a threshold coefficient; and a power comparing section 203 for comparing the threshold with the short-time average power to detect the adaptive period.
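A detector in the style of unit 20b, where the threshold tracks a running long-time average instead of a pre-measured noise power, might look like the sketch below. The text does not say how the long-time average is computed; the exponential smoothing, the smoothing factor `alpha`, and the function name are assumptions made only for illustration.

```python
def detect_stop_tracking(powers, coeff=1.1, alpha=0.01):
    """Return one flag per short-time power value; True means 'stop adaptation'.

    powers: sequence of short-time average powers (role of unit 201).
    The long-time average (role of unit 204) is approximated here by
    exponential smoothing, which is an assumption, not the patent's method.
    """
    long_avg = powers[0]  # assumed initialisation from the first value
    flags = []
    for p in powers:
        # Role of units 205 and 203: threshold = coeff * long-time average.
        flags.append(p > coeff * long_avg)
        # Slowly follow gradual changes in the background level.
        long_avg = (1.0 - alpha) * long_avg + alpha * p
    return flags


# Hypothetical power sequence: steady noise with one burst of target sound.
flags = detect_stop_tracking([1.0, 1.0, 1.0, 1.0, 1.0, 4.0, 1.0])
```

Because the threshold is derived from the signal itself, no separate noise measurement is needed, which matches the advantage stated in paragraph [0128]; the price is the extra averaging step.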

【０１３０】図９は、上記各実施例における適応期間検
出部２０ａの具体例である適応期間検出部２０ｃを示す
ブロック図である。FIG. 9 is a block diagram showing an adaptive period detecting section 20c which is a specific example of the adaptive period detecting section 20a in each of the above embodiments.

【０１３１】適応期間検出部２０ｃは、閾値係数乗算部
２０５を、立上り閾値係数乗算部２０６と、立下り閾値
係数乗算部２０７と、立上り立下り切替部２０８とによ
って実現した装置である。The adaptive period detecting section 20c is a device in which the threshold coefficient multiplying section 205 is realized by the rising threshold coefficient multiplying section 206, the falling threshold coefficient multiplying section 207, and the rising / falling switching section 208.

【０１３２】立上り閾値係数乗算部２０６は、雑音パワ
ー設定部２０２が出力した値に立上り閾値係数を乗算
し、この乗算結果を、立上り閾値として設定する。The rising threshold coefficient multiplying unit 206 multiplies the value output by the noise power setting unit 202 by the rising threshold coefficient, and sets the multiplication result as the rising threshold.

【０１３３】立下り閾値係数乗算部２０７は、雑音パワ
ー設定部２０２が出力した値に立下り閾値係数を乗算
し、この乗算結果を立下り閾値として設定する。The falling threshold coefficient multiplying unit 207 multiplies the value output by the noise power setting unit 202 by the falling threshold coefficient, and sets the multiplication result as the falling threshold.

【０１３４】上記立上り閾値係数または立下り閾値係数
は、雑音の短時間平均パワーの変動の大きさに応じて決
定し、たとえば、雑音の短時間平均パワーが、長時間平
均を中心に１０％の変動がある場合には、立上り閾値係
数は１．１に設定され、立下り閾値係数は、立上り閾値
係数に近い値に設定される。The rising threshold coefficient and the falling threshold coefficient are determined according to the magnitude of the fluctuation of the short-time average power of the noise. For example, when the short-time average power of the noise fluctuates by 10% around the long-time average, the rising threshold coefficient is set to 1.1 and the falling threshold coefficient is set to a value close to the rising threshold coefficient.

【０１３５】立上り立下り切替部２０８は、パワー比較
部２０３が適応動作停止信号を出力している場合に、立
下り閾値を選択し、それ以外の場合に、立上り閾値を選
択し、閾値に設定する。通常、目的音波形の立上り立下
りは、緩やかであることが予想される。たとえば、音声
であれば、立上り部分は、子音でパワーが小さく、立下
がりも緩やかである。このため、立ち上がり部分、立下
り部分で誤り検出を起こし易い。The rising/falling switching unit 208 selects the falling threshold while the power comparison unit 203 is outputting the adaptive operation stop signal, selects the rising threshold otherwise, and sets the selected value as the threshold. The rise and fall of a target sound waveform are usually expected to be gradual. In speech, for example, the rising portion is a consonant with low power, and the fall is also gradual. Detection errors are therefore likely to occur in the rising and falling portions.

【０１３６】なお、適応期間検出部２０ｃにおける閾値
係数乗算部２０５に、雑音パワー設定部２０２が出力し
た値を印加する代わりに、長時間平均パワー計算部２０
４が出力した値を印加するようにしてもよい。Note that, in the adaptive period detection unit 20c, the value output by the long-time average power calculation unit 204 may be applied to the threshold coefficient multiplication unit 205 instead of the value output by the noise power setting unit 202.

【０１３７】つまり、適応期間検出部２０ｃは、上記雑
音パワー設定手段２０２または上記長時間平均パワー計
算手段２０４の出力に、立上り閾値を乗算する立上り閾
値係数乗算手段２０６と、上記雑音パワー設定手段２０
２または上記長時間平均パワー計算手段２０４の出力
に、立下り閾値を乗算する立下り閾値係数乗算手段２０
７と、上記パワー比較部出力の状態によって、立上り閾
値係数乗算出力または立下り閾値係数乗算出力を選択
し、この選択された出力を閾値として設定する立上り立
下り切替手段２０８とを含む手段の例である。In other words, the adaptive period detection unit 20c is an example of means including: rising threshold coefficient multiplication means 206 for multiplying the output of the noise power setting means 202 or the long-time average power calculating means 204 by a rising threshold; falling threshold coefficient multiplication means 207 for multiplying the output of the noise power setting means 202 or the long-time average power calculating means 204 by a falling threshold; and rising/falling switching means 208 for selecting the rising threshold coefficient multiplication output or the falling threshold coefficient multiplication output according to the state of the output of the power comparing section and setting the selected output as the threshold.

【０１３８】なお、上記立上り閾値、立下り閾値は、雑
音パワー設定手段２０２で設定される。The rising threshold and the falling threshold are set by the noise power setting means 202.

【０１３９】図１０は、短時間平均パワーの立上り、立
下りで検出誤りを起こし易いことと、その対策とを説明
する図である。FIG. 10 is a diagram for explaining that a detection error is likely to occur at the rise and fall of the short-time average power and a countermeasure therefor.

【０１４０】図１０（１）は、閾値を１つだけ用いる方
法を示す図であり、短時間平均パワーの立上り部分、立
下り部分で３、検出誤りを起こしている。これは、目的
音成分のパワーが微小に上昇したために、雑音の短時間
平均パワーの微小な変動の影響を受け易くなるためであ
る。FIG. 10 (1) shows the method that uses only one threshold; 3 detection errors occur in the rising and falling portions of the short-time average power. This is because, where the power of the target sound component has risen only slightly, the detection is easily affected by small fluctuations in the short-time average power of the noise.

【０１４１】図９に示す適応期間検出部２０を使用する
と、立上りと立下りとの２つの閾値を設定することによ
って、雑音の短時間平均パワーの微小な変動の影響を受
け難くし、より正確な目的音検出が可能になる。When the adaptive period detection unit 20 shown in FIG. 9 is used, setting two thresholds, one for the rise and one for the fall, makes the detection less susceptible to small fluctuations in the short-time average power of the noise and enables more accurate target sound detection.

【０１４２】図１０（２）は、短時間平均パワーの立上
り部分、立下り部分で検出誤りを解消しているのが分か
る。In FIG. 10 (2), it can be seen that the detection error is eliminated at the rising portion and the falling portion of the short-time average power.
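The two-threshold switching of unit 20c can be illustrated with a short sketch. The switching rule (use the falling threshold while the stop signal is being output, the rising threshold otherwise) follows paragraph [0135]; the function name, the sample powers, and in particular the choice of a falling coefficient slightly below the rising one are assumptions, since the text only says the two coefficients are close.

```python
def detect_stop_hysteresis(powers, noise_power, rise_coeff=1.1, fall_coeff=1.05):
    """Return one flag per short-time power value; True means 'stop adaptation'.

    fall_coeff < rise_coeff is an assumed choice for this illustration.
    """
    rise_threshold = rise_coeff * noise_power   # role of unit 206
    fall_threshold = fall_coeff * noise_power   # role of unit 207
    stopping = False
    flags = []
    for p in powers:
        # Role of unit 208: pick the threshold from the current comparator state.
        threshold = fall_threshold if stopping else rise_threshold
        stopping = p > threshold                # role of unit 203
        flags.append(stopping)
    return flags


# Hypothetical powers: a burst whose tail hovers just above the noise level.
powers = [1.0, 1.2, 1.08, 1.08, 1.0, 1.08]
flags = detect_stop_hysteresis(powers, noise_power=1.0)
single = [p > 1.1 * 1.0 for p in powers]  # single-threshold result for comparison
```

In this made-up sequence the two-threshold detector keeps the stop signal asserted through the tail of the burst, whereas a single threshold of 1.1 releases it immediately after the peak.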

【０１４３】次に、図９に示す適応期間検出部２０を使
用した場合における上記各実施例のシミュレーション結
果を示す。Next, the simulation results of the above-mentioned respective embodiments when the adaptive period detecting section 20 shown in FIG. 9 is used will be shown.

【０１４４】マイクロホンアレーとして、無指向性のマ
イクロホンを２ｃｍ間隔で７つ直線状に並べたものを用
い、マイクロホンアレーの正面方向に５０ｃｍ離れた位
置を従来技術の仮想音源位置とした。As the microphone array, seven omnidirectional microphones arranged linearly at 2 cm intervals were used, and the position 50 cm away from the front of the microphone array was set as the virtual sound source position of the prior art.

【０１４５】上記各実施例の収音範囲は、従来例におけ
る仮想音源位置（１ポイントの位置）から、たとえば、
左に３０ｃｍの位置と、上記従来例における仮想音源位
置から右に３０ｃｍの位置との間の範囲であるとし、１
０ｃｍ間隔で７点の仮想目的音源位置を設けた。雑音に
は、白色雑音を用い、従来技術の仮想音源位置から横に
１ｍ離れた位置に、雑音源を配置した。このときに、従
来技術と上記各実施例とにおいて、音源−アレー出力間
の周波数特性を、図１０（２）に示してある。目的音源
位置は、従来技術の仮想目的音源位置と、そこから２０
ｃｍ横にずれた位置の２通りに設定した。The sound collection range of each of the above embodiments was set, for example, to the range between a position 30 cm to the left of the virtual sound source position of the conventional example (a single point) and a position 30 cm to its right, and seven virtual target sound source positions were placed in it at 10 cm intervals. White noise was used as the noise, and the noise source was placed 1 m to the side of the virtual sound source position of the prior art. The frequency characteristics between the sound source and the array output obtained under these conditions for the prior art and for each of the above embodiments are shown in FIG. 10 (2). The target sound source position was set in two ways: at the virtual target sound source position of the prior art, and at a position shifted 20 cm sideways from it.
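The simulated geometry can be written down concretely, together with the kind of distance-based quantities that the distance, relative delay, and relative attenuation calculation units (271 to 273) are said to produce. The sketch below is only an illustration under stated assumptions: a 2-D coordinate system with the array centred at the origin, a sound speed of 340 m/s, delays measured relative to the nearest microphone, and attenuation following a 1/distance law. None of these details is taken from the text.

```python
import math

SOUND_SPEED = 340.0  # m/s, assumed value


def relative_delays_and_gains(source, mics, c=SOUND_SPEED):
    """Delays (s) and amplitude gains relative to the nearest microphone."""
    dists = [math.dist(source, m) for m in mics]  # role of unit 271
    d_min = min(dists)
    delays = [(d - d_min) / c for d in dists]     # role of unit 272
    gains = [d_min / d for d in dists]            # role of unit 273 (assumed 1/d law)
    return delays, gains


# Seven microphones in a line at 2 cm spacing, centred on the origin (metres).
mics = [((i - 3) * 0.02, 0.0) for i in range(7)]
# Seven virtual target source positions at 10 cm spacing, 50 cm in front of
# the array, spanning 30 cm to the left through 30 cm to the right.
sources = [((j - 3) * 0.10, 0.5) for j in range(7)]

delays, gains = relative_delays_and_gains(sources[3], mics)
```

For the centre source the nearest microphone is the middle one, so its relative delay is zero and its relative gain is one, and the values are symmetric about the array centre.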

【０１４６】図１１は、シミュレーション結果を示す図
である。FIG. 11 is a diagram showing simulation results.

【０１４７】図１１（１）は、目的音源位置が、従来技
術の仮想目的音源位置にある場合に、音源−アレー出力
間の周波数特性を示す図である。図１１（２）は、目的
音源位置が従来技術の仮想目的音源位置から２０ｃｍず
れた場合に、音源−アレー出力間の周波数特性を示す図
である。FIG. 11 (1) is a diagram showing frequency characteristics between the sound source and the array output when the target sound source position is at the virtual target sound source position of the prior art. FIG. 11 (2) is a diagram showing frequency characteristics between the sound source and the array output when the target sound source position deviates from the virtual target sound source position of the related art by 20 cm.

【０１４８】図１１（１）に示す周波数特性では、従来
技術、上記実施例ともに、大きな周波数特性の劣化は生
じていないが、図１１（２）に示す周波数特性では、従
来技術の周波数特性の高周波部分が大きく劣化してい
る。上記各実施例では、図１１（２）に示す周波数特性
でも、周波数特性の大きな劣化は生じていない。In the frequency characteristics shown in FIG. 11 (1), neither the prior art nor the above embodiments show any large degradation. In the frequency characteristics shown in FIG. 11 (2), however, the high-frequency part of the prior-art characteristic is greatly degraded, whereas the above embodiments show no large degradation even there.

【０１４９】以上の結果から、従来方法では、仮想音源
位置から目的音源がずれると、周波数特性の大きな劣化
を生じることが確認された。しかし、上記各実施例は、
設定した収音範囲内で、目的音源が移動しても、周波数
特性の大きな劣化が生じず、安定して、目的音を高品質
に収音できることが確認された。These results confirm that, with the conventional method, the frequency characteristics degrade greatly when the target sound source deviates from the virtual sound source position. With each of the above embodiments, by contrast, it was confirmed that the frequency characteristics do not degrade greatly even if the target sound source moves within the set sound collection range, and that the target sound can be collected stably and with high quality.

【０１５０】また、このときの雑音抑圧性能は、従来技
術、上記各実施例ともに、１５ｄＢ以上あり、高い雑音
抑圧が行なわれていることが確認された。Further, the noise suppression performance at this time was 15 dB or more in both the prior art and each of the above-mentioned embodiments, and it was confirmed that high noise suppression was performed.
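To give a sense of scale for the 15 dB figure, the sketch below converts a ratio of noise powers into decibels. The text does not define how noise suppression was measured; treating it as the ratio of noise power before and after processing is an assumption made here, and the function name is hypothetical.

```python
import math


def suppression_db(noise_power_in, noise_power_out):
    """Noise suppression in dB, assumed to be 10*log10(input / output power)."""
    return 10.0 * math.log10(noise_power_in / noise_power_out)


# Under this assumed definition, 15 dB corresponds to a power ratio of 10**1.5.
ratio_for_15_db = 10.0 ** 1.5
```

Under that definition, a suppression of 15 dB means the residual noise power is roughly one thirtieth of the input noise power.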

【０１５１】以上のシミュレーション結果より、上記各
実施例は、目的音源が動く場合や、目的音源位置が正確
に分からない場合でも、高い雑音抑圧、低い周波数特性
の劣化で、高品質な収音ができることが確認された。The above simulation results confirm that, even when the target sound source moves or its position is not accurately known, each of the above embodiments achieves high-quality sound collection with high noise suppression and little degradation of frequency characteristics.

【０１５２】[0152]

【発明の効果】本発明によれば、収音範囲内に仮想目的
音源位置を複数設定することによって、その範囲内の感
度を保つような拘束条件を実現するので、上記収音範囲
内に存在する音源を、低い周波数特性の劣化で収音で
き、上記収音範囲外の雑音を、抑圧することができ、ま
た、上記収音範囲内で音源が移動しても、フィルタ修正
の必要がなく、音源移動による性能低下がなく、したが
って、目的音源が動く場合や、目的音源位置が正確に分
からない場合でも、雑音抑圧が高く、周波数特性の劣化
が低く、高品質な収音ができるという効果を奏する。According to the present invention, setting a plurality of virtual target sound source positions within the sound collection range realizes a constraint that maintains sensitivity within that range. A sound source within the sound collection range can therefore be collected with little degradation of frequency characteristics, while noise outside the range is suppressed. Moreover, even if the sound source moves within the sound collection range, the filter does not need to be corrected and performance does not drop because of the movement. Consequently, even when the target sound source moves or its position is not accurately known, the invention achieves high noise suppression, little degradation of frequency characteristics, and high-quality sound collection.

[Brief description of drawings]

【図１】本発明の第１の実施例である収音装置ＣＳ１を
示すブロック図である。FIG. 1 is a block diagram showing a sound collecting device CS1 which is a first embodiment of the present invention.

【図２】上記実施例の特徴を、従来例との比較で説明す
る図である。FIG. 2 is a diagram for explaining the features of the above-described embodiment in comparison with a conventional example.

【図３】上記実施例の構成を、従来例の構成との比較で
説明する図である。FIG. 3 is a diagram illustrating a configuration of the above-described embodiment in comparison with a configuration of a conventional example.

【図４】本発明の第２の実施例である収音装置ＣＳ２を
示すブロック図である。FIG. 4 is a block diagram showing a sound collecting device CS2 which is a second embodiment of the present invention.

【図５】本発明の第３の実施例である収音装置ＣＳ３を
示す構成図である。FIG. 5 is a configuration diagram showing a sound collecting device CS3 which is a third embodiment of the present invention.

【図６】本発明の第４の実施例である収音装置ＣＳ４の
構成を示す図である。FIG. 6 is a diagram showing a configuration of a sound collecting device CS4 which is a fourth embodiment of the present invention.

【図７】上記各実施例における適応期間検出部２０の１
つの具体例である適応期間検出部２０ａを示すブロック
図である。FIG. 7 is a block diagram showing an adaptive period detection unit 20a, which is one specific example of the adaptive period detection unit 20 in each of the above embodiments.

【図８】上記各実施例における適応期間検出部２０の別
の具体例である適応期間検出部２０ｂを示すブロック図
である。FIG. 8 is a block diagram showing an adaptation period detection unit 20b which is another specific example of the adaptation period detection unit 20 in each of the embodiments.

【図９】上記各実施例における適応期間検出部２０ａの
具体例である適応期間検出部２０ｃを示すブロック図で
ある。FIG. 9 is a block diagram showing an adaptation period detection unit 20c which is a specific example of the adaptation period detection unit 20a in each of the above embodiments.

【図１０】短時間平均パワーの立上り、立下りで検出誤
りを起こし易いことと、その対策とを説明する図であ
る。FIG. 10 is a diagram illustrating that a detection error is likely to occur at the rise and fall of the short-time average power and a countermeasure against the error.

【図１１】シミュレーション結果を示す図である。FIG. 11 is a diagram showing a simulation result.

【図１２】従来の収音装置ＣＳ１１を示す図である。FIG. 12 is a diagram showing a conventional sound collecting device CS11.

[Explanation of symbols]

１１₁〜１１_M…マイクロホン、１４Ｂ…第１の加算手段、２１₁〜２１_M…第２の加算手段、１２₁〜１２_M…第３の加算手段、１４Ａ…第４の加算手段、２２…第５の加算手段、１５、…減算手段、１３Ａ₁〜１３Ａ_M…第２の可変フィルタ、１３Ｂ₁〜１３Ｂ_M…第１の可変フィルタ、１６…適応アルゴリズム部、１７₁〜１７_J、１７Ｃ…信号発生器、１８_1,1〜１８_J,M、１８Ｃ₁〜１８Ｃ_M…空間特性フィル
タ、１９₁〜１９_J、１９Ｃ、２８_1,1〜２８_J,M…遅延器、２０…適応期間検出部、２３₁〜２３_M…半固定フィルタ、２４…フィルタ係数記憶部、２５…収音信号記憶部、２６、２６Ｃ…仮想音源位置設定部、２７…空間特性推定部、２９_1,1〜２９_J,M…ゲイン、３０…収音範囲設定部、２０１…短時間平均パワー計算部、２０２…雑音パワー設定部、２０３…パワー比較部、２０４…長時間平均パワー計算部、２０５…閾値係数乗算部、２０６…立上り閾値係数乗算部、２０７…立下り閾値係数乗算部、２０８…立上り立下り切替部、２７１…距離計算部、２７２…マイクロホン間相対遅延量計算部、２７３…マイクロホン間相対減衰量計算部。11_1 to 11_M ... microphones, 14B ... first adding means, 21_1 to 21_M ... second adding means, 12_1 to 12_M ... third adding means, 14A ... fourth adding means, 22 ... fifth adding means, 15 ... subtracting means, 13A_1 to 13A_M ... second variable filters, 13B_1 to 13B_M ... first variable filters, 16 ... adaptive algorithm unit, 17_1 to 17_J, 17C ... signal generators, 18_1,1 to 18_J,M, 18C_1 to 18C_M ... spatial characteristic filters, 19_1 to 19_J, 19C, 28_1,1 to 28_J,M ... delay units, 20 ... adaptive period detection unit, 23_1 to 23_M ... semi-fixed filters, 24 ... filter coefficient storage unit, 25 ... sound pickup signal storage unit, 26, 26C ... virtual sound source position setting unit, 27 ... spatial characteristic estimation unit, 29_1,1 to 29_J,M ... gains, 30 ... sound collection range setting unit, 201 ... short-time average power calculation unit, 202 ... noise power setting unit, 203 ... power comparison unit, 204 ... long-time average power calculation unit, 205 ... threshold coefficient multiplication unit, 206 ... rising threshold coefficient multiplication unit, 207 ... falling threshold coefficient multiplication unit, 208 ... rising/falling switching unit, 271 ... distance calculation unit, 272 ... inter-microphone relative delay amount calculation unit, 273 ... inter-microphone relative attenuation amount calculation unit.

フロントページの続き
(51)Int.Cl.⁷ 識別記号 ＦＩ Ｈ０４Ｒ 3/00 ３２０ Ｇ１０Ｌ 3/02 ３０１Ｆ
(56)参考文献 特開昭59−72295（ＪＰ，Ａ) 特開昭60−41393（ＪＰ，Ａ) 特開平８−271605（ＪＰ，Ａ) 特開2001−309483（ＪＰ，Ａ) 小林，古家，片岡，複数仮想音源を用いた適応型マイクロホンアレー，電子情報通信学会誌Ａ，日本，2003年４月１日，Ｖｏｌ．Ｊ86−Ａ，Ｎｏ．４，Ｐａｇｅｓ 333−344 小林，古家，複数仮想音源を用いた適応型アレーの収束特性および仮想音源配置に関する検討，電子情報通信学会技術研究報告［応用音響］，日本，2000年 10月27日，Ｖｏｌ．100，Ｎｏ．397, ＥＡ2000−54，Ｐａｇｅｓ 23−30 小林，古家，話者移動による適応形アレーの性能劣化の改善，日本音響学会 2000年秋季研究発表会講演論文集 −Ｉ −，日本，2000年９月20日，３−Ｐ− 19，Ｐａｇｅｓ 485−486
(58)調査した分野(Int.Cl.⁷，ＤＢ名) G10L 21/00 - 21/02 G10L 15/28 G01S 3/802 H04M 1/00 H04R 3/00
Continuation of front page
(51) Int.Cl.⁷, identification code, FI: H04R 3/00 320; G10L 3/02 301F
(56) References: JP-A-59-72295 (JP, A); JP-A-60-41393 (JP, A); JP-A-8-271605 (JP, A); JP-A-2001-309483 (JP, A); Kobayashi, Furuya, Kataoka, Adaptive microphone array using multiple virtual sound sources, Journal of the Institute of Electronics, Information and Communication Engineers A, Japan, April 1, 2003, Vol. J86-A, No. 4, Pages 333-344; Kobayashi, Furuya, A study on convergence characteristics and virtual sound source placement of an adaptive array using multiple virtual sound sources, IEICE Technical Report [Applied Acoustics], Japan, October 27, 2000, Vol. 100, No. 397, EA2000-54, Pages 23-30; Kobayashi, Furuya, Improvement of performance degradation of adaptive arrays due to speaker movement, Proceedings of the Acoustical Society of Japan 2000 Autumn Meeting -I-, Japan, September 20, 2000, 3-P-19, Pages 485-486
(58) Fields searched (Int.Cl.⁷, DB name): G10L 21/00 - 21/02; G10L 15/28; G01S 3/802; H04M 1/00; H04R 3/00

Claims

(57) [Claims]

1. A sound collecting device having first variable filtering means for filtering, with mutually different filter coefficients, sound collecting signals picked up by a plurality of arbitrarily arranged sound collecting means, and first adding means for adding the output signals of the first variable filtering means and outputting the sum, the sound collecting device comprising: sound collecting range setting means for setting a predetermined sound collecting range; virtual target sound source position setting means for setting a plurality of virtual target sound source positions within the sound collecting range; spatial characteristic estimating means for estimating, based on each virtual target sound source position and the position of each sound collecting means, a spatial characteristic including the delay time and the attenuation amount incurred until sound from the virtual target sound source position reaches the position of each sound collecting means; pseudo target signal generating means for generating stationary pseudo target signals that are mutually uncorrelated and equal in number to the virtual target sound source positions; spatial characteristic filtering means for filtering each pseudo target signal using each spatial characteristic estimated by the spatial characteristic estimating means as filter coefficients; second adding means for synthesizing pseudo target sound collecting signals by adding the output signals of the spatial characteristic filtering means for each sound collecting means; third adding means for synthesizing learning signals by adding each pseudo target sound collecting signal and each sound collecting signal; second variable filtering means for filtering the synthesized learning signals with mutually different filter coefficients; fourth adding means for adding the output signals of the second variable filtering means to one another; delay means for delaying each pseudo target signal; fifth adding means for adding the delayed output signals from the delay means; subtracting means for obtaining an error signal by subtracting the output signal of the fourth adding means from the output signal of the fifth adding means; adaptive period detecting means for detecting, based on the sound collecting signals, a period during which no sound source exists within the sound collecting range, and detecting that period as the period in which to adapt; and adaptive algorithm means for updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected by the adaptive period detecting means, in which no sound source exists within the sound collecting range.

2. The sound collecting device according to claim 1, further comprising: sound collecting signal storage means, provided between each sound collecting means and each third adding means, for storing each sound collecting signal; and filter coefficient storage means, provided between the adaptive algorithm means and each first variable filtering means, for storing the first variable filter coefficients.

3. A sound collecting device having first variable filtering means for filtering, with mutually different filter coefficients, sound collecting signals picked up by a plurality of arbitrarily arranged sound collecting means, and first adding means for adding the output signals of the first variable filtering means and outputting the sum, the sound collecting device comprising: sound collecting range setting means for setting a predetermined sound collecting range; virtual target sound source position setting means for setting a plurality of virtual target sound source positions within the sound collecting range; spatial characteristic estimating means for estimating, based on each virtual target sound source position and the position of each sound collecting means, a spatial characteristic including the delay time and the attenuation amount incurred until sound from the virtual target sound source position reaches the position of each sound collecting means, the spatial characteristic estimating means including distance calculating means for calculating the distance from each virtual target sound source position to the position of each sound collecting means, and inter-sound-collecting-means relative delay amount calculating means for obtaining the relative delay amount between the sound collecting means from the distance calculated by the distance calculating means and the speed of sound; pseudo target signal generating means for generating pseudo target signals equal in number to the virtual target sound source positions; a plurality of first delay means for delaying the pseudo target signals output by the pseudo target signal generating means by the relative delay amounts obtained by the inter-sound-collecting-means relative delay amount calculating means; second adding means for synthesizing pseudo target sound collecting signals by adding the output signals of the first delay means for each sound collecting means; third adding means for synthesizing learning signals by adding each pseudo target sound collecting signal and each sound collecting signal; second variable filtering means for filtering the synthesized learning signals with mutually different filter coefficients; fourth adding means for adding the output signals of the second variable filtering means to one another; second delay means for delaying each pseudo target signal; fifth adding means for adding the delayed output signals from the second delay means; subtracting means for obtaining an error signal by subtracting the output signal of the fourth adding means from the output signal of the fifth adding means; adaptive period detecting means for detecting, based on the sound collecting signals, a period during which no sound source exists within the sound collecting range, and detecting that period as the period in which to adapt; and adaptive algorithm means for updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected by the adaptive period detecting means, in which no sound source exists within the sound collecting range.

4. A sound collecting device having first variable filtering means for filtering, with mutually different filter coefficients, sound collecting signals picked up by a plurality of arbitrarily arranged sound collecting means, and first adding means for adding the output signals of the first variable filtering means and outputting the sum, the sound collecting device comprising: sound collecting range setting means for setting a predetermined sound collecting range; virtual target sound source position setting means for setting a plurality of virtual target sound source positions within the sound collecting range; spatial characteristic estimating means for estimating, based on each virtual target sound source position and the position of each sound collecting means, a spatial characteristic including the delay time and the attenuation amount incurred until sound from the virtual target sound source position reaches the position of each sound collecting means, the spatial characteristic estimating means including distance calculating means for calculating the distance from each virtual target sound source position to the position of each sound collecting means, inter-sound-collecting-means relative delay amount calculating means for obtaining the relative delay amount between the sound collecting means from the distance calculated by the distance calculating means and the speed of sound, and inter-sound-collecting-means relative attenuation amount calculating means for obtaining the relative attenuation amount between the sound collecting means from the distance calculated by the distance calculating means; pseudo target signal generating means for generating stationary pseudo target signals that are mutually uncorrelated and equal in number to the virtual target sound source positions; a plurality of first delay means for delaying the pseudo target signals output by the pseudo target signal generating means by the relative delay amounts obtained by the inter-sound-collecting-means relative delay amount calculating means; a plurality of gain means for attenuating the pseudo target signals output by the plurality of first delay means by the relative attenuation amounts obtained by the inter-sound-collecting-means relative attenuation amount calculating means; second adding means for synthesizing pseudo target sound collecting signals by adding the output signals of the gain means for each sound collecting means; third adding means for synthesizing learning signals by adding each pseudo target sound collecting signal and each sound collecting signal; second variable filtering means for filtering the synthesized learning signals with mutually different filter coefficients; fourth adding means for adding the output signals of the second variable filtering means to one another; second delay means for delaying each pseudo target signal; fifth adding means for adding the delayed output signals from the second delay means; subtracting means for obtaining an error signal by subtracting the output signal of the fourth adding means from the output signal of the fifth adding means; adaptive period detecting means for detecting, based on the sound collecting signals, a period during which no sound source exists within the sound collecting range, and detecting that period as the period in which to adapt; and adaptive algorithm means for updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected by the adaptive period detecting means, in which no sound source exists within the sound collecting range.

5. A sound collecting device, characterized in that the adaptive period detecting means is means including: short-time average power calculating means for calculating the short-time average power of the sound collecting signal; noise power setting means for setting a long-time average power of noise measured in advance; threshold setting means for setting, as a threshold, the value obtained by multiplying the noise power by a threshold coefficient; and a power comparing section for comparing the threshold with the short-time average power to detect the adaptive period.

6. The sound collecting device according to claim 1, characterized in that the adaptive period detecting means is means including: short-time average power calculating means for calculating the short-time average power of the sound collecting signal; long-time average power calculating means for calculating the long-time average power of the sound collecting signal; threshold coefficient multiplication means for setting, as a threshold, the value obtained by multiplying the long-time average power by a threshold coefficient; and a power comparing section for comparing the threshold with the short-time average power to detect the adaptive period.

7. The sound collecting device according to claim 5 or 6, comprising: rising threshold coefficient multiplication means for multiplying the output of the noise power setting means or the long-time average power calculating means by a rising threshold; falling threshold coefficient multiplication means for multiplying the output of the noise power setting means or the long-time average power calculating means by a falling threshold; and rising/falling switching means for selecting the rising threshold coefficient multiplication output or the falling threshold coefficient multiplication output according to the state of the output of the power comparing section and setting the selected output as the threshold.

8. A sound pickup method having a first variable filtering step of filtering sound pickup signals picked up by a plurality of arbitrarily arranged sound pickup means with respectively different filter coefficients, and a first adding step of adding the output signals of the first variable filtering step and outputting the sum, the method being characterized by comprising:
a sound pickup range setting step of setting a predetermined sound pickup range;
a virtual target sound source position setting step of setting a plurality of virtual target sound source positions within the sound pickup range;
a spatial characteristic estimation step of estimating, based on each virtual target sound source position and the position of each sound pickup means, spatial characteristics, including the delay time and the attenuation amount, of the path along which sound travels from each virtual target sound source position to the position of each sound pickup means;
a pseudo target signal generation step of generating stationary pseudo target signals that are mutually uncorrelated and equal in number to the virtual target sound source positions;
a spatial characteristic filtering step of filtering each pseudo target signal using each spatial characteristic estimated in the spatial characteristic estimation step as filter coefficients;
a second adding step of synthesizing pseudo target sound pickup signals by adding the output signals of the spatial characteristic filtering step for each sound pickup means;
a third adding step of synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal;
a second variable filtering step of filtering the synthesized learning signals with respectively different filter coefficients;
a fourth adding step of adding together the output signals of the second variable filtering step;
a delay step of delaying each of the pseudo target signals;
a fifth adding step of adding together the delayed output signals of the delay step;
a subtraction step of obtaining an error signal by subtracting the output signal of the fourth adding step from the output signal of the fifth adding step;
an adaptive period detecting step of detecting, based on the sound pickup signals, a period during which no sound source is present within the sound pickup range, and treating the detected period as the period in which adaptation is performed; and
an adaptive algorithm step of updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected in the adaptive period detecting step, in which no sound source is present within the sound pickup range.
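
The learning loop of claim 8 can be illustrated with a small simulation. The claim names only an "adaptive algorithm"; the sketch below substitutes the plain LMS update as one possible choice, uses a single virtual source and two sound pickup means, and copies the learned coefficients to the first variable filters at the end. All of these, together with every name and parameter value, are assumptions made for the example.

```python
import random


def lms_step(filters, histories, desired, step_size, adapt):
    """One sample of the learning loop; returns the error signal."""
    # Second variable filtering and fourth adding step: sum of all filter outputs.
    output = sum(w * u for taps, hist in zip(filters, histories)
                 for w, u in zip(taps, hist))
    error = desired - output                 # subtraction step
    if adapt:                                # update only inside the adaptive period
        for taps, hist in zip(filters, histories):
            for k in range(len(taps)):
                taps[k] += step_size * error * hist[k]   # LMS update (assumed algorithm)
    return error


def run_demo(num_samples=2000, num_taps=3, step_size=0.05, seed=0):
    rng = random.Random(seed)
    pseudo = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]  # pseudo target signal
    second_filters = [[0.0] * num_taps for _ in range(2)]
    histories = [[0.0] * num_taps for _ in range(2)]
    errors = []
    for n in range(num_samples):
        delayed = pseudo[n - 1] if n >= 1 else 0.0   # delay step output (desired signal)
        # Third adding step: pseudo target pickup signal plus the pickup signal (noise only).
        learning = [pseudo[n] + rng.uniform(-0.1, 0.1),
                    delayed + rng.uniform(-0.1, 0.1)]
        for hist, sample in zip(histories, learning):
            hist.insert(0, sample)                   # newest sample first
            hist.pop()
        errors.append(lms_step(second_filters, histories, delayed, step_size, True))
    first_filters = [list(taps) for taps in second_filters]  # assumed coefficient copy
    early = sum(e * e for e in errors[:200]) / 200
    late = sum(e * e for e in errors[-200:]) / 200
    return first_filters, early, late
```

Calling `run_demo()` returns the copied first variable filter coefficients together with the mean square error over the first and last 200 samples, which gives a quick check that the error shrinks as adaptation proceeds.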

9. The sound pickup method according to claim 8, further comprising: a sound pickup signal storing step of storing each of the sound pickup signals; and a filter coefficient storing step of storing the first variable filter coefficients.

10. A sound pickup method having a first variable filtering step of filtering sound pickup signals picked up by a plurality of arbitrarily arranged sound pickup means with respectively different filter coefficients, and a first adding step of adding the output signals of the first variable filtering step and outputting the sum, the method being characterized by comprising:
a sound pickup range setting step of setting a predetermined sound pickup range;
a virtual target sound source position setting step of setting a plurality of virtual target sound source positions within the sound pickup range;
a spatial characteristic estimation step of estimating, based on each virtual target sound source position and the position of each sound pickup means, spatial characteristics, including the delay time and the attenuation amount, of the path along which sound travels from each virtual target sound source position to the position of each sound pickup means, the spatial characteristic estimation step including a distance calculation step of calculating the distance from each virtual target sound source position to the position of each sound pickup means, and an inter-pickup-means relative delay amount calculation step of obtaining the relative delay amount between the sound pickup means from the distance calculated in the distance calculation step and the speed of sound;
a pseudo target signal generation step of generating pseudo target signals equal in number to the virtual target sound source positions;
a plurality of first delay steps of delaying each pseudo target signal output in the pseudo target signal generation step by the relative delay amount obtained in the inter-pickup-means relative delay amount calculation step;
a second adding step of synthesizing pseudo target sound pickup signals by adding the output signals of the first delay steps for each sound pickup means;
a third adding step of synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal;
a second variable filtering step of filtering the synthesized learning signals with respectively different filter coefficients;
a fourth adding step of adding together the output signals of the second variable filtering step;
a second delay step of delaying each of the pseudo target signals;
a fifth adding step of adding together the delayed output signals of the second delay step;
a subtraction step of obtaining an error signal by subtracting the output signal of the fourth adding step from the output signal of the fifth adding step;
an adaptive period detecting step of detecting a period during which no sound source is present within the sound pickup range, and treating the detected period as the period in which adaptation is performed; and
an adaptive algorithm step of updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected in the adaptive period detecting step, in which no sound source is present within the sound pickup range.
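
The distance calculation, relative delay calculation, first delay steps, and second adding step of claim 10 can be sketched as below. The sketch assumes a sound speed of 340 m/s, delays rounded to whole samples, and delays measured relative to the sound pickup means nearest to each virtual source; these choices and all names are hypothetical.

```python
import math

SOUND_SPEED = 340.0  # metres per second (assumed value)


def relative_delays(source, mics, sample_rate):
    """Delay of each pickup position, in samples, relative to the nearest one."""
    distances = [math.dist(source, mic) for mic in mics]      # distance calculation
    nearest = min(distances)
    return [round((d - nearest) / SOUND_SPEED * sample_rate) for d in distances]


def synthesize_pickup(pseudo_signals, sources, mics, sample_rate):
    """Delay every pseudo target signal per pickup position and sum per position."""
    length = len(pseudo_signals[0])
    pickup = [[0.0] * length for _ in mics]
    for signal, source in zip(pseudo_signals, sources):
        for m, delay in enumerate(relative_delays(source, mics, sample_rate)):
            for n in range(delay, length):
                pickup[m][n] += signal[n - delay]              # first delay + second adding
    return pickup
```

For a source at the origin and pickup positions 0.34 m and 0.68 m away, sampled at 1 kHz, the second position receives the pseudo target signal one sample later than the first.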

11. A sound pickup method having a first variable filtering step of filtering sound pickup signals picked up by a plurality of arbitrarily arranged sound pickup means with respectively different filter coefficients, and a first adding step of adding the output signals of the first variable filtering step and outputting the sum, the method being characterized by comprising:
a sound pickup range setting step of setting a predetermined sound pickup range;
a virtual target sound source position setting step of setting a plurality of virtual target sound source positions within the sound pickup range;
a spatial characteristic estimation step of estimating, based on each virtual target sound source position and the position of each sound pickup means, spatial characteristics, including the delay time and the attenuation amount, of the path along which sound travels from each virtual target sound source position to the position of each sound pickup means, the spatial characteristic estimation step including a distance calculation step of calculating the distance from each virtual target sound source position to the position of each sound pickup means, an inter-pickup-means relative delay amount calculation step of obtaining the relative delay amount between the sound pickup means from the distance calculated in the distance calculation step and the speed of sound, and an inter-pickup-means relative attenuation amount calculation step of obtaining the relative attenuation amount between the sound pickup means from the distance calculated in the distance calculation step;
a pseudo target signal generation step of generating stationary pseudo target signals that are mutually uncorrelated;
a plurality of first delay steps of delaying each pseudo target signal generated in the pseudo target signal generation step by the relative delay amount calculated in the inter-pickup-means relative delay amount calculation step;
a plurality of gain steps of attenuating the pseudo target signal output from each first delay step by the relative attenuation amount calculated in the inter-pickup-means relative attenuation amount calculation step;
a second adding step of synthesizing pseudo target sound pickup signals by adding the output signals of the gain steps for each sound pickup means;
a third adding step of synthesizing learning signals by adding each pseudo target sound pickup signal to the corresponding sound pickup signal;
a second variable filtering step of filtering the synthesized learning signals with respectively different filter coefficients;
a fourth adding step of adding together the output signals of the second variable filtering step;
a second delay step of delaying each of the pseudo target signals;
a fifth adding step of adding together the delayed output signals of the second delay step;
a subtraction step of obtaining an error signal by subtracting the output signal of the fourth adding step from the output signal of the fifth adding step;
an adaptive period detecting step of detecting a period during which no sound source is present within the sound pickup range, and treating the detected period as the period in which adaptation is performed; and
an adaptive algorithm step of updating the second variable filter coefficients and the first variable filter coefficients so that the mean square value of the error signal is minimized during the period, detected in the adaptive period detecting step, in which no sound source is present within the sound pickup range.
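
Claim 11 adds a relative attenuation amount derived from the same distances. The claim does not state the attenuation law; the sketch below assumes spherical spreading (amplitude inversely proportional to distance) and normalizes to the nearest sound pickup means. The function names and the whole-sample delay are likewise assumptions.

```python
import math


def relative_gains(source, mics):
    """Amplitude gain of each pickup position relative to the nearest one."""
    distances = [math.dist(source, mic) for mic in mics]   # distance calculation
    nearest = min(distances)
    return [nearest / d for d in distances]                # assumed 1/r spreading law


def delay_and_attenuate(signal, delay, gain):
    """First delay step followed by a gain step, keeping the signal length."""
    delay = min(delay, len(signal))                        # guard against over-long delays
    return [0.0] * delay + [gain * x for x in signal[:len(signal) - delay]]
```

A pickup position twice as far from the virtual source as the nearest one therefore receives the pseudo target signal at half the amplitude, in addition to the relative delay.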

12. The sound pickup method according to claim 8, wherein the adaptive period detecting step includes: a short-time average power calculating step of calculating the short-time average power of the sound pickup signal; a noise power setting step of setting the long-time average power of noise measured in advance; a threshold setting step of setting, as a threshold, the value obtained by multiplying the noise power by a threshold coefficient; and a power comparison step of comparing the threshold with the short-time average power to detect the adaptive period.

13. The sound pickup method according to claim 8, wherein the adaptive period detecting step includes: a short-time average power calculating step of calculating the short-time average power of the sound pickup signal; a long-time average power calculating step of calculating the long-time average power of the sound pickup signal; a threshold coefficient multiplication step of setting, as a threshold, the value obtained by multiplying the long-time average power by a threshold coefficient; and a power comparison step of comparing the threshold with the short-time average power to detect the adaptive period.

14. The sound pickup method according to claim 12 or 13, wherein the threshold coefficient multiplication step includes: a rising threshold coefficient multiplication step of multiplying the output of the noise power setting step or the long-time average power calculating step by a rising threshold coefficient; a falling threshold coefficient multiplication step of multiplying the output of the noise power setting step 202 or the long-time average power calculating step 204 by a falling threshold coefficient; and a rising/falling switching step of selecting, depending on the state of the output of the power comparison step, either the rising threshold coefficient multiplication output or the falling threshold coefficient multiplication output, and setting the selected output as the threshold.