JP4333524B2

JP4333524B2 - Loudspeaker

Info

Publication number: JP4333524B2
Application number: JP2004245783A
Authority: JP
Inventors: 実福島; 博昭竹山; 靖久井平; 武正庄司; 彰洋菊池
Original assignee: Panasonic Corp; Matsushita Electric Works Ltd
Current assignee: Panasonic Corp; Panasonic Electric Works Co Ltd
Priority date: 2004-08-25
Filing date: 2004-08-25
Publication date: 2009-09-16
Anticipated expiration: 2024-08-25
Also published as: JP2006067128A

Description

The present invention relates to a loudspeaker (hands-free) communication device used in intercoms and similar equipment.

Loudspeaker communication devices that implement hands-free calls have long been available. The user need not hold a handset: the voice signal transmitted from the far-end device is output through a loudspeaker to a talker standing away from the unit, and the talker's voice is picked up by a microphone and transmitted to the far-end device. In such a device, part of the talker's voice can return superimposed on the receive signal, either through acoustic coupling from the speaker to the microphone of the far-end device or through reflections caused by impedance mismatch between a device and the transmission line. When this returned component is large, the talker hears it as an unpleasant echo (acoustic echo or line echo). In addition, this acoustic coupling and reflection, together with acoustic coupling at the local terminal, form a closed loop in the call path; if the loop gain exceeds unity at any frequency, howling occurs at that frequency and a stable call can no longer be maintained. Suppressing these echoes and howling is therefore a key issue in designing a loudspeaker communication device.

To address this issue, a widely used approach employs a voice switch that continuously estimates the call state (transmitting, receiving, and so on) and, based on the estimate, inserts loss into the transmit path and the receive path in an appropriate proportion, thereby reducing the closed-loop gain and suppressing unpleasant echo and howling. FIG. 19 is a block diagram of a conventional so-called hands-free intercom (see Patent Document 1) consisting of an intercom master unit M (hereinafter "master unit") serving as the loudspeaker communication device and a door-phone slave unit S serving as the far-end device. The master unit M comprises a microphone 1, a speaker 2, a two-wire/four-wire conversion circuit 30, a microphone amplifier 31 that amplifies the transmit signal output from the microphone 1, a line amplifier 32 that amplifies the receive signal arriving from the far-end device via the transmission system, and a voice switch VS'. Although not illustrated, the door-phone slave unit S comprises a microphone, a speaker, a two-wire/four-wire conversion circuit, and the like.

The voice switch VS' comprises transmit-side loss insertion means 33 that inserts loss into the transmit-side signal path running from the microphone 1 through the microphone amplifier 31 to the two-wire/four-wire conversion circuit 30, receive-side loss insertion means 34 that inserts loss into the receive-side signal path running from the line amplifier 32 to the speaker 2, and insertion loss control means 35 that controls the amount of loss inserted by the transmit-side and receive-side loss insertion means 33, 34. The insertion loss control means 35, for example, estimates the power of the transmit signal and of the receive signal, compares the two estimates, and inserts a predetermined loss at whichever of the loss insertion means 33, 34 is on the side with the smaller instantaneous power, thereby switching between the transmit state and the receive state.

Patent Document 1: JP 2000-307745 A
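The conventional switching rule described above can be sketched as follows. This is a minimal illustration, not the circuit of Patent Document 1: the function name, the 20 dB loss value, and the tie-breaking toward the transmit state are all assumptions made for the example.

```python
def conventional_voice_switch(tx_power, rx_power, loss_db=20.0):
    """Return (tx_loss_db, rx_loss_db) for one decision step.

    Loss is inserted on whichever side has the smaller instantaneous power.
    loss_db and the tie-break (ties go to the transmit state) are assumed
    values for illustration only.
    """
    if tx_power >= rx_power:
        # Transmit state: pass the microphone signal, attenuate the receive path.
        return 0.0, loss_db
    # Receive state: pass the received signal, attenuate the transmit path.
    return loss_db, 0.0
```

With this rule, steady loud noise on one side keeps that side's power larger at every step, so the switch never leaves the corresponding state. That is the one-sided lock-up the following paragraph describes.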

In the conventional example, however, a problem arises when the ambient noise level at the far end (door-phone slave unit S) differs greatly from that at the near end (master unit M), for example when loud noise such as wind noise or traffic noise enters the microphone of a slave unit S installed outdoors. Because the insertion loss control means 35 estimates the call state by monitoring the transmit and receive signals, it always judges the receive state when the far-end ambient noise level is high, and always judges the transmit state when the near-end ambient noise level is high. The call state thus becomes fixed in either the receive state or the transmit state regardless of the actual conversation, a phenomenon known as one-sided lock-up of the voice switch.

The present invention has been made in view of the above circumstances, and its object is to provide a loudspeaker communication device that can suppress one-sided lock-up of the voice switch.

To achieve the above object, the invention of claim 1 is a loudspeaker communication device comprising a microphone, a speaker, and a voice switch that switches between a transmit state, in which the transmit signal is sent to the transmission system and the receive signal is attenuated, and a receive state, in which the receive signal is sent to the speaker and the transmit signal is attenuated. The voice switch comprises: a transmit-side loss insertion unit that inserts loss into the transmit-side signal path from the microphone to the transmission system; a receive-side loss insertion unit that inserts loss into the receive-side signal path from the transmission system to the speaker; a first speech interval detection unit that detects speech intervals in the transmit signal; a second speech interval detection unit that detects speech intervals in the receive signal; a transmit-side instantaneous power estimation unit that estimates the instantaneous power of the transmit signal; a transmit-side background noise power estimation unit that estimates the background noise power of the transmit signal; a receive-side instantaneous power estimation unit that estimates the instantaneous power of the receive signal; a receive-side background noise power estimation unit that estimates the background noise power of the receive signal; and an insertion loss control unit that controls the amount of loss inserted by the transmit-side and receive-side loss insertion units. The insertion loss control unit determines the call state by referring to the result of comparing the transmit-side and receive-side instantaneous power estimates and to the detection results of the first and second speech interval detection units and, according to the determined call state, switches to at least one of a transmit mode, in which the receive-side insertion loss is made relatively large, and a receive mode, in which the transmit-side insertion loss is made relatively large. The device is characterized in that it comprises at least one of (a) a first attenuator that attenuates the transmit-side instantaneous power estimate, together with a first attenuation control unit that increases the attenuation of the first attenuator when the transmit-side background noise power estimate exceeds a predetermined threshold, and (b) a second attenuator that attenuates the receive-side instantaneous power estimate, together with a second attenuation control unit that increases the attenuation of the second attenuator when the receive-side background noise power estimate exceeds a predetermined threshold; and in that the voice switch includes a high-pass filter that removes, from the receive signal referred to by the second speech interval detection unit, frequency components below the principal band of speech.

According to this invention, when the ambient noise level at the far end (or near end) is high, the receive-side (or transmit-side) instantaneous power estimate also becomes large, so the insertion loss control unit may misjudge the call state as the receive state (or transmit state) and the voice switch may lock up toward the receive side (or transmit side). In that case, if the receive-side (or transmit-side) background noise power estimate exceeds the predetermined threshold, the attenuation of the second (or first) attenuator is increased, which reduces the receive-side (or transmit-side) instantaneous power estimate. This prevents the insertion loss control unit from misjudging the call state as the receive state (or transmit state) and suppresses one-sided lock-up of the voice switch. Furthermore, removing from the receive signal any noise whose components lie below the principal band of speech prevents false detection of speech intervals and reduces the receive-side instantaneous power estimate for that noise, which also suppresses one-sided lock-up of the voice switch.
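The attenuator mechanism can be illustrated with a small sketch. It covers only the power comparison; the speech interval detection results that the insertion loss control unit also consults are left out. The function names, the threshold of 0.1, and the gain of 0.25 applied under high noise are hypothetical values chosen for the example and do not come from the patent.

```python
def attenuate(instant_power, noise_power, threshold, high_noise_gain=0.25):
    """Scale an instantaneous power estimate before the two sides are compared.

    When the background noise power estimate exceeds the threshold, the
    attenuation is increased (here: gain drops from 1.0 to high_noise_gain).
    Both gain values are assumptions for illustration.
    """
    gain = high_noise_gain if noise_power > threshold else 1.0
    return instant_power * gain


def compare_sides(tx_inst, tx_noise, rx_inst, rx_noise, threshold=0.1):
    """Compare the attenuated transmit-side and receive-side estimates."""
    tx = attenuate(tx_inst, tx_noise, threshold)
    rx = attenuate(rx_inst, rx_noise, threshold)
    return "transmit" if tx >= rx else "receive"
```

For example, `compare_sides(0.3, 0.001, 0.5, 0.4)` models a near-end talker (power 0.3) against a noisy far end (power 0.5, noise estimate 0.4). The receive-side estimate is scaled to 0.125, so the comparison favors the transmit side instead of locking toward the receive side.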

The invention of claim 2 is characterized in that, in the invention of claim 1, means is provided for identifying the far-end device with which a call is made via the transmission system and for enabling or disabling the high-pass filter according to the identified device.

According to this invention, when the far-end devices include some installed outdoors and some installed indoors, the high-pass filter is enabled during calls with an outdoor device, so that removing the noise prevents false detection of speech intervals and lowers the receive-side instantaneous power estimate for that noise, which suppresses one-sided lock-up of the voice switch. During calls with an indoor device, the high-pass filter is disabled, which keeps the call-state switching of the voice switch balanced.

The invention of claim 3 is characterized in that, in the invention of claim 1, means is provided for accepting an operation input made by the user and for enabling or disabling the high-pass filter according to that operation input.

According to this invention, the user can deliberately enable or disable the high-pass filter, which suppresses one-sided lock-up of the voice switch by preventing false detection of speech intervals and reducing the receive-side instantaneous power estimate for the noise, and also improves usability.

To achieve the above object, the invention of claim 4 is a loudspeaker communication device comprising a microphone, a speaker, and a voice switch that switches between a transmit state, in which the transmit signal is sent to the transmission system and the receive signal is attenuated, and a receive state, in which the receive signal is sent to the speaker and the transmit signal is attenuated. The voice switch comprises: a transmit-side loss insertion unit that inserts loss into the transmit-side signal path from the microphone to the transmission system; a receive-side loss insertion unit that inserts loss into the receive-side signal path from the transmission system to the speaker; a first speech interval detection unit that detects speech intervals in the transmit signal; a second speech interval detection unit that detects speech intervals in the receive signal; a transmit-side instantaneous power estimation unit that estimates the instantaneous power of the transmit signal; a transmit-side background noise power estimation unit that estimates the background noise power of the transmit signal; a receive-side instantaneous power estimation unit that estimates the instantaneous power of the receive signal; a receive-side background noise power estimation unit that estimates the background noise power of the receive signal; and an insertion loss control unit that controls the amount of loss inserted by the transmit-side and receive-side loss insertion units. The insertion loss control unit determines the call state by referring to the result of comparing the transmit-side and receive-side instantaneous power estimates and to the detection results of the first and second speech interval detection units and, according to the determined call state, switches to at least one of a transmit mode, in which the receive-side insertion loss is made relatively large, and a receive mode, in which the transmit-side insertion loss is made relatively large. The device is characterized in that it comprises at least one of (a) a first attenuator that attenuates the transmit-side instantaneous power estimate, together with a first attenuation control unit that increases the attenuation of the first attenuator when the transmit-side background noise power estimate exceeds a predetermined threshold, and (b) a second attenuator that attenuates the receive-side instantaneous power estimate, together with a second attenuation control unit that increases the attenuation of the second attenuator when the receive-side background noise power estimate exceeds a predetermined threshold; and in that each of the first and second speech interval detection units comprises: a first instantaneous power estimation unit that estimates the instantaneous power of a reference signal; a background noise power estimation unit that estimates the power of the background noise steadily present in the reference signal; a first determination unit that determines speech intervals of the reference signal based on the ratio of the instantaneous power estimate to the background noise power estimate; a low-pass filter that removes from the reference signal frequency components above the principal band of speech; a second instantaneous power estimation unit that estimates the instantaneous power of the reference signal after the low-pass filter has removed the high-frequency components; and a second determination unit that judges a non-speech interval when the first determination unit has judged a non-speech interval and, when the first determination unit has judged a speech interval, determines whether the interval is a speech interval based on the magnitude relationship between the instantaneous power estimate from the first instantaneous power estimation unit multiplied by a positive coefficient less than 1 and the instantaneous power estimate from the second instantaneous power estimation unit.

According to this invention, when the ambient noise level at the far end (or near end) is high, the receive-side (or transmit-side) instantaneous power estimate also becomes large, so the insertion loss control unit may misjudge the call state as the receive state (or transmit state) and the voice switch may lock up toward the receive side (or transmit side). In that case, if the receive-side (or transmit-side) background noise power estimate exceeds the predetermined threshold, the attenuation of the second (or first) attenuator is increased, which reduces the receive-side (or transmit-side) instantaneous power estimate. This prevents the insertion loss control unit from misjudging the call state as the receive state (or transmit state) and suppresses one-sided lock-up of the voice switch. Furthermore, the first and second speech interval detection units are prevented from falsely detecting speech intervals because of noise whose components lie above the principal band of speech.
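The two-stage speech interval decision of claim 4 can be sketched as below. The claim states only that the second stage depends on the magnitude relationship between the scaled full-band estimate and the low-passed estimate. The direction used here (speech when the low-passed power is at least the coefficient times the full-band power) is an assumption inferred from the stated purpose of rejecting high-frequency noise. The function name, the ratio threshold of 4.0, and the coefficient 0.5 are likewise hypothetical.

```python
def is_speech_interval(p_full, p_noise, p_lowpassed, ratio_threshold=4.0, k=0.5):
    """Two-stage speech interval decision on power estimates for one frame.

    p_full      : instantaneous power of the reference signal (first estimator)
    p_noise     : background noise power estimate of the reference signal
    p_lowpassed : instantaneous power after the low-pass filter (second estimator)
    k           : positive coefficient less than 1 (assumed value)
    """
    # Stage 1: power-to-noise ratio test, written without division so that
    # a zero noise estimate does not raise an error.
    if p_full < ratio_threshold * p_noise:
        return False
    # Stage 2 (assumed direction): accept as speech only if enough of the
    # power survives low-pass filtering, i.e. lies in or below the speech band.
    return p_lowpassed >= k * p_full
```

A frame dominated by high-frequency noise passes stage 1 but fails stage 2, because little of its power remains after low-pass filtering.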

The invention of claim 5 is characterized in that, in the invention of claim 4, the coefficient used in the second determination unit is variable.

According to this invention, the detection behavior of the first and second speech interval detection units can be optimized by changing the coefficient according to the far-end device or the user's intent.

The invention of claim 6 is characterized in that, in the invention of claim 4 or 5, the first determination unit of the first speech interval detection unit obtains the absolute value of the difference between two instantaneous power estimates produced by the first instantaneous power estimation unit a predetermined time interval apart, and makes its determination by referring to the result of comparing that absolute difference with a predetermined threshold.

According to this invention, intervals containing non-stationary, non-speech noise whose instantaneous power varies little over time are prevented from being falsely detected as speech intervals.
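The power-difference check of claim 6 might look like the following sketch. The claim says only that the absolute difference is compared with a threshold. Treating a small difference as evidence against speech is an assumption drawn from the stated effect, and the function name, the lag, and the threshold are hypothetical.

```python
def power_fluctuates(power_history, lag, diff_threshold):
    """Check whether instantaneous power changed enough over `lag` frames.

    power_history  : instantaneous power estimates, oldest first
    lag            : number of frames between the two compared estimates
    diff_threshold : minimum absolute difference treated as speech-like
    Returns False when there is not yet enough history to compare.
    """
    if len(power_history) <= lag:
        return False
    difference = abs(power_history[-1] - power_history[-1 - lag])
    return difference > diff_threshold
```

In a full detector this result would be combined with the power-to-noise ratio test of the first determination unit; how the two are combined is not specified in this section.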

To achieve the above object, the invention of claim 7 is a loudspeaker communication device comprising a microphone, a speaker, and a voice switch that switches between a transmit state, in which the transmit signal is sent to the transmission system and the receive signal is attenuated, and a receive state, in which the receive signal is sent to the speaker and the transmit signal is attenuated. The voice switch comprises: a transmit-side loss insertion unit that inserts loss into the transmit-side signal path from the microphone to the transmission system; a receive-side loss insertion unit that inserts loss into the receive-side signal path from the transmission system to the speaker; a first speech interval detection unit that detects speech intervals in the transmit signal; a second speech interval detection unit that detects speech intervals in the receive signal; a transmit-side instantaneous power estimation unit that estimates the instantaneous power of the transmit signal; a transmit-side background noise power estimation unit that estimates the background noise power of the transmit signal; a receive-side instantaneous power estimation unit that estimates the instantaneous power of the receive signal; a receive-side background noise power estimation unit that estimates the background noise power of the receive signal; and an insertion loss control unit that controls the amount of loss inserted by the transmit-side and receive-side loss insertion units. The insertion loss control unit determines the call state by referring to the result of comparing the transmit-side and receive-side instantaneous power estimates and to the detection results of the first and second speech interval detection units and, according to the determined call state, switches to at least one of a transmit mode, in which the receive-side insertion loss is made relatively large, and a receive mode, in which the transmit-side insertion loss is made relatively large. The device is characterized in that it comprises at least one of (a) a first attenuator that attenuates the transmit-side instantaneous power estimate, together with a first attenuation control unit that increases the attenuation of the first attenuator when the transmit-side background noise power estimate exceeds a predetermined threshold, and (b) a second attenuator that attenuates the receive-side instantaneous power estimate, together with a second attenuation control unit that increases the attenuation of the second attenuator when the receive-side background noise power estimate exceeds a predetermined threshold; in that the voice switch includes a high-pass filter that removes, from the receive signal referred to by the insertion loss control unit, frequency components below the principal band of speech; and in that the first speech interval detection unit comprises: a first instantaneous power estimation unit that estimates the instantaneous power of a reference signal; a background noise power estimation unit that estimates the power of the background noise steadily present in the reference signal; a first determination unit that determines speech intervals of the reference signal based on the ratio of the instantaneous power estimate to the background noise power estimate; a low-pass filter that removes from the reference signal frequency components above the principal band of speech; a second instantaneous power estimation unit that estimates the instantaneous power of the reference signal after the low-pass filter has removed the high-frequency components; and a second determination unit that judges a non-speech interval when the first determination unit has judged a non-speech interval and, when the first determination unit has judged a speech interval, determines whether the interval is a speech interval based on the magnitude relationship between the instantaneous power estimate from the first instantaneous power estimation unit multiplied by a positive coefficient less than 1 and the instantaneous power estimate from the second instantaneous power estimation unit.

According to this invention, when the ambient noise level at the far end (or near end) is high, the receive-side (or transmit-side) instantaneous power estimate also becomes large, so the insertion loss control unit may misjudge the call state as the receive state (or transmit state) and the voice switch may lock up toward the receive side (or transmit side). In that case, if the receive-side (or transmit-side) background noise power estimate exceeds the predetermined threshold, the attenuation of the second (or first) attenuator is increased, which reduces the receive-side (or transmit-side) instantaneous power estimate. This prevents the insertion loss control unit from misjudging the call state as the receive state (or transmit state) and suppresses one-sided lock-up of the voice switch. Furthermore, using the high-pass filter to remove from the receive signal any noise whose components lie below the principal band of speech prevents false detection of speech intervals and reduces the receive-side instantaneous power estimate for that noise, which also suppresses one-sided lock-up of the voice switch. In addition, using the low-pass filter to remove noise whose components lie above the principal band of speech prevents false detection of speech intervals in the first speech interval detection unit.

The invention of claim 8 is characterized in that, in the invention of claim 7, the first determination unit obtains the absolute value of the difference between two instantaneous power estimates produced by the first instantaneous power estimation unit a predetermined time interval apart, and makes its determination by referring to the result of comparing that absolute difference with a predetermined threshold.

To achieve the above object, the invention of claim 9 is a loudspeaker communication device comprising a microphone, a speaker, and a voice switch that switches between a transmission state, in which the transmission signal is sent out to the transmission system and the reception signal is attenuated, and a reception state, in which the reception signal is sent out to the speaker and the transmission signal is attenuated. The voice switch comprises a transmission side loss insertion unit that inserts a loss into the transmission side signal path from the microphone to the transmission system, a reception side loss insertion unit that inserts a loss into the reception side signal path from the transmission system to the speaker, a first voice section detection unit that detects voice sections of the transmission signal, a second voice section detection unit that detects voice sections of the reception signal, a transmission side instantaneous power estimation unit that estimates the instantaneous power of the transmission signal, a transmission side background noise power estimation unit that estimates the background noise power of the transmission signal, a reception side instantaneous power estimation unit that estimates the instantaneous power of the reception signal, a reception side background noise power estimation unit that estimates the background noise power of the reception signal, and an insertion loss amount control unit that controls the insertion loss amount in each of the transmission side and reception side loss insertion units. The insertion loss amount control unit determines the call state with reference to the result of comparing the transmission side and reception side instantaneous power estimates and to the detection results of the first and second voice section detection units, and, according to the determined call state, switches to at least one of a transmission mode, in which the reception side insertion loss amount is made relatively large, and a reception mode, in which the transmission side insertion loss amount is made relatively large. The device is characterized in that it comprises at least one of (a) a first attenuator that attenuates the transmission side instantaneous power estimate together with a first attenuation amount control unit that increases the attenuation of the first attenuator when the transmission side background noise power estimate exceeds a predetermined threshold value, and (b) a second attenuator that attenuates the reception side instantaneous power estimate together with a second attenuation amount control unit that increases the attenuation of the second attenuator when the reception side background noise power estimate exceeds a predetermined threshold value, and in that the voice switch is provided with a low-pass filter that removes, from the transmission signal referred to by the first voice section detection unit, frequency band components higher than the main component band of speech, and a high-pass filter that removes, from the reception signal referred to by the second voice section detection unit, frequency band components lower than the main component band of speech.

According to this invention, when the ambient noise level on the far end side (or near end side) is high, the instantaneous power estimate on the reception side (or transmission side) also becomes large, so the insertion loss amount control unit may misjudge the call state as the reception state (or transmission state) and the voice switch may lock to the reception side (or transmission side). In such a case, if the background noise power estimate on the reception side (or transmission side) exceeds a predetermined threshold value, the attenuation of the second (or first) attenuator is increased, which reduces the instantaneous power estimate on the reception side (or transmission side). This prevents the insertion loss amount control unit from misjudging the call state as the reception state (or transmission state) and suppresses one-sided locking of the voice switch. Further, by using the low-pass filter to remove noise having frequency band components higher than the main component band of speech, erroneous detection of voice sections in the first voice section detection unit can be prevented and the reception side instantaneous power estimate for that noise can be reduced, so that one-sided locking of the voice switch can be suppressed. In addition, by removing from the reception signal noise having frequency band components lower than the main component band of speech, erroneous detection of voice sections can be prevented and the reception side instantaneous power estimate for that noise can be reduced, so that one-sided locking of the voice switch can be suppressed.

The invention of claim 10 is characterized in that, in the invention of claim 9, the voice switch is provided with the low-pass filter having a variable cutoff frequency, and means is provided for taking in an operation input made by the user and changing the cutoff frequency of the low-pass filter according to that operation input.

According to this invention, an appropriate cutoff frequency can be set to suit the usage environment, so erroneous detection of voice sections can be prevented and usability improved at the same time.

The invention of claim 11 is characterized in that, in the invention of claim 9, means is provided for detecting the pitch of the speech contained in the reference signal and changing the cutoff frequency of the low-pass filter according to the detected speech pitch.

According to this invention, the voice of a young child, whose speech pitch is generally higher than that of an adult, can be prevented from being removed by the low-pass filter.

The invention of claim 12 is characterized in that, in the invention of claim 9, a high-pass filter that removes frequency band components lower than the main component band of speech is provided in series with the low-pass filter.

According to this invention, noise having frequency band components outside the main component band of speech is removed from the transmission signal by the high-pass filter and the low-pass filter, so erroneous detection of voice sections can be prevented and the transmission side instantaneous power estimate for that noise can be reduced. As a result, one-sided locking of the voice switch can be suppressed.

The invention of claim 13 is characterized in that, in the invention of claim 12, the filters that remove frequency band components higher or lower than the main component band of speech from the transmission signal and the reception signal are implemented as digital filters.

According to this invention, a filter having the desired characteristics can easily be realized without changing the circuit configuration.

According to the present invention, when the ambient noise level on the far end side (or near end side) is high, the instantaneous power estimate on the reception side (or transmission side) also becomes large, so the insertion loss amount control unit may misjudge the call state as the reception state (or transmission state) and the voice switch may lock to the reception side (or transmission side). In such a case, if the background noise power estimate on the reception side (or transmission side) exceeds a predetermined threshold value, the attenuation of the second (or first) attenuator is increased, which reduces the instantaneous power estimate on the reception side (or transmission side). The effect is that the insertion loss amount control unit is prevented from misjudging the call state as the reception state (or transmission state), and one-sided locking of the voice switch is suppressed.

Before describing embodiments of the present invention, a reference example of the present invention will be described.
As shown in FIG. 1, the loudspeaker communication device of this reference example has a microphone 1, a speaker 2, a microphone amplifier 5, a line amplifier 6, and a voice switch VS, and in this respect it is the same as the conventional loudspeaker communication device (interphone master unit M). A line amplifier 7 is inserted in the transmission side signal path between the voice switch VS and the two-wire/four-wire conversion circuit 30, and a speaker amplifier 8 is inserted in the reception side signal path between the voice switch VS and the speaker 2.

The voice switch VS in this reference example comprises a transmission side loss insertion unit 3 inserted in the transmission side signal path that carries the transmission signal to the line, a reception side loss insertion unit 4 inserted in the reception side signal path that carries the reception signal to the speaker 2, a first voice section detection unit 11 that detects voice sections of the transmission signal, a second voice section detection unit 12 that detects voice sections of the reception signal, and an insertion loss amount control unit 10 that controls the insertion loss amounts of the transmission side loss insertion unit 3 and the reception side loss insertion unit 4 according to the call state. The transmission side loss insertion unit 3 and the reception side loss insertion unit 4 are both amplifiers whose gains G_T and G_R (as loss amounts, -G_T and -G_R) are varied by the insertion loss amount control unit 10.

The first and second voice section detection units 11 and 12 have a common configuration. As shown in FIG. 2, each comprises an instantaneous power estimation unit 20 that estimates the instantaneous power of a reference signal (the transmission signal input to the transmission side loss insertion unit 3, or the reception signal input to the reception side loss insertion unit 4), a background noise power estimation unit 21 that estimates the background noise power of the reference signal, and a first determination unit 22 that compares the ratio of the instantaneous power estimate Ps to the background noise power estimate Pn (= Ps/Pn) with a predetermined threshold value and determines a voice section when the ratio is equal to or greater than the threshold value. When the first and second voice section detection units 11 and 12 detect a voice section, they set the transmission side and reception side voice section detection signals TSD and RSD, respectively, to "1"; when no voice section is detected (a non-voice section), they set TSD and RSD to "0". The instantaneous power estimation units 20 on the transmission and reception sides are each formed of a filter or the like whose response rises steeply and falls slowly, and the background noise power estimation units 21 on the transmission and reception sides are each formed of a filter or the like whose response rises slowly and falls steeply.
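The detector structure described above can be illustrated with a minimal Python sketch. It assumes one-pole smoothers for both estimators; the smoothing coefficients, the threshold value, and the names `smooth` and `VoiceSectionDetector` are illustrative assumptions and do not come from the patent, which only states the rise and fall behaviour of each filter.

```python
from dataclasses import dataclass


def smooth(prev: float, x: float, rise: float, fall: float) -> float:
    """One-pole smoother with separate coefficients for rising and falling input."""
    coeff = rise if x > prev else fall
    return prev + coeff * (x - prev)


@dataclass
class VoiceSectionDetector:
    threshold: float = 4.0  # assumed Ps/Pn threshold, not from the patent
    ps: float = 1e-6        # instantaneous power estimate Ps
    pn: float = 1e-6        # background noise power estimate Pn

    def step(self, sample: float) -> int:
        power = sample * sample
        # Unit 20: steep rise, slow fall (assumed coefficients).
        self.ps = smooth(self.ps, power, rise=0.5, fall=0.01)
        # Unit 21: slow rise, steep fall (assumed coefficients).
        self.pn = smooth(self.pn, power, rise=0.001, fall=0.5)
        # Unit 22: voice section when Ps/Pn is at or above the threshold.
        return 1 if self.ps / self.pn >= self.threshold else 0
```

With these assumed coefficients, steady noise drives Ps and Pn toward the same level, so the flag settles at 0, while a sudden loud sample raises Ps much faster than Pn and sets the flag to 1.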

The insertion loss amount control unit 10 further comprises: a line feedback gain multiplier (not shown) whose coefficient is a value determined according to the gain of the path that returns from the input point of the transmission side loss insertion unit 3, through the transmission side loss insertion unit 3 and the wraparound on the line side, to the input point of the reception side loss insertion unit 4; an acoustic coupling gain multiplier (not shown) whose coefficient is a value determined according to the gain of the path from the input point of the reception side loss insertion unit 4, through the reception side loss insertion unit 4 and the wraparound on the acoustic side (microphone 1 and speaker 2), to the input point of the transmission side loss insertion unit 3; a first comparator (not shown) that compares the output signal P2, obtained by inputting the reception side instantaneous power estimate Ps(R) output from the second voice section detection unit 12 to the acoustic coupling gain multiplier, with the transmission side instantaneous power estimate Ps(T) output from the first voice section detection unit 11; a second comparator (not shown) that compares the output signal P1, obtained by inputting the transmission side instantaneous power estimate Ps(T) to the line feedback gain multiplier, with the reception side instantaneous power estimate Ps(R); and an insertion loss amount distribution processing unit (not shown) that determines the call state based on the output signals C1 and C2 of the first and second comparators and the output signals C3 (= TSD) and C4 (= RSD) of the first voice section detection unit 11 and the second voice section detection unit 12, and controls the insertion loss amounts of the transmission side loss insertion unit 3 and the reception side loss insertion unit 4. Here, the output signal C1 of the first comparator is "0" when Ps(T) < P2 and "1" when Ps(T) ≥ P2. The output signal C2 of the second comparator is "0" when Ps(R) ≥ P1 and "1" when Ps(R) < P1.
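The two comparator outputs can be written out as a small Python sketch. The function name `comparator_outputs` and the parameter names `acoustic_gain` and `line_gain` are hypothetical; the patent says only that each multiplier coefficient is determined from the gain of the corresponding wraparound path.

```python
def comparator_outputs(ps_t: float, ps_r: float,
                       acoustic_gain: float, line_gain: float) -> tuple[int, int]:
    """Return (C1, C2) from the transmission and reception side power estimates."""
    p2 = acoustic_gain * ps_r  # Ps(R) through the acoustic coupling gain multiplier
    p1 = line_gain * ps_t      # Ps(T) through the line feedback gain multiplier
    c1 = 1 if ps_t >= p2 else 0  # first comparator: Ps(T) versus P2
    c2 = 0 if ps_r >= p1 else 1  # second comparator: Ps(R) versus P1
    return c1, c2
```

Under this reading, C1 = C2 = 1 means the transmission side power dominates even after accounting for both wraparound paths, and C1 = C2 = 0 means the reception side dominates.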

The insertion loss amount control unit 10 thus determines the call state by referring to the four binary signals C1 to C4 and decides the insertion loss amounts of the transmission side loss insertion unit 3 and the reception side loss insertion unit 4. The state is determined to be the transmission mode when C1 = C2 = 1 and C3 = 1, the reception mode when C1 = C2 = 0 and C4 = 1, the fast idle mode when C1 ≠ C2 and C3 and C4 are not both 0, and the slow idle mode in all other cases. In the transmission mode, the insertion loss amount of the transmission side loss insertion unit 3 is set to its minimum value and that of the reception side loss insertion unit 4 to its maximum value. In the reception mode, the insertion loss amount of the transmission side loss insertion unit 3 is set to its maximum value and that of the reception side loss insertion unit 4 to its minimum value. In the fast idle mode, the insertion loss amounts of the transmission side loss insertion unit 3 and the reception side loss insertion unit 4 are made equal to each other with a short transition time, and in the slow idle mode they are made equal to each other with a long transition time.
The configuration and operation of the voice switch VS described above are the same as those disclosed in Patent Document 1, so a detailed description is omitted.
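The mode decision from the four binary signals C1 to C4 can be sketched in Python as follows. The idle-mode condition on C3 and C4 is read here as "at least one of C3, C4 is 1", which is an interpretation of the text. The names `Mode`, `decide_mode`, and `target_losses`, the 0 dB and 40 dB limits, and the midpoint used for the idle modes are assumptions for illustration; the transition times that distinguish fast idle from slow idle are not modelled.

```python
from enum import Enum


class Mode(Enum):
    TRANSMIT = "transmit"
    RECEIVE = "receive"
    FAST_IDLE = "fast_idle"
    SLOW_IDLE = "slow_idle"


def decide_mode(c1: int, c2: int, c3: int, c4: int) -> Mode:
    if c1 == 1 and c2 == 1 and c3 == 1:
        return Mode.TRANSMIT
    if c1 == 0 and c2 == 0 and c4 == 1:
        return Mode.RECEIVE
    # Assumed reading: "C3 and C4 are not both 0".
    if c1 != c2 and (c3 == 1 or c4 == 1):
        return Mode.FAST_IDLE
    return Mode.SLOW_IDLE


def target_losses(mode: Mode, min_db: float = 0.0,
                  max_db: float = 40.0) -> tuple[float, float]:
    """Return (transmission side loss, reception side loss) targets in dB."""
    if mode is Mode.TRANSMIT:
        return min_db, max_db
    if mode is Mode.RECEIVE:
        return max_db, min_db
    # Idle modes equalise the two losses; the midpoint is an assumed value.
    mid = (min_db + max_db) / 2.0
    return mid, mid
```

For example, `decide_mode(1, 0, 1, 0)` gives the fast idle mode, for which this sketch targets equal losses on both sides.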

Next, the gist of this reference example will be described. This reference example is characterized in that the voice switch VS is provided with an attenuator 13 that attenuates the reception side instantaneous power estimate Ps(R), and an attenuation amount control unit 14 that increases the attenuation of the attenuator 13 when the reception side background noise power estimate Pn(R) exceeds a predetermined threshold value.

As shown in FIG. 3, the attenuation amount control unit 14 comprises: an average value calculation unit 14a that calculates the average value E[Pn(R)] of the reception side background noise power estimate Pn(R); an attenuation amount determination unit 14b that compares the average value E[Pn(R)] with a predetermined threshold value, sets the attenuation of the attenuator 13 to its initial value (= 0 dB) when E[Pn(R)] is equal to or less than the threshold value, and increases the attenuation to a value larger than the initial value when E[Pn(R)] exceeds the threshold value; an average calculation update/stop determination unit 14c that stops the calculation (updating) of the average value by the average value calculation unit 14a when the reception side voice section detection signal RSD is "1", that is, when the second voice section detection unit 12 is detecting a voice section, or when the insertion loss amount of the transmission side loss insertion unit 3 is less than a predetermined threshold value (the gain G_T is greater than a threshold value); and a counter 14d that causes the attenuation amount determination unit 14b to determine the attenuation intermittently at predetermined time intervals.
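The four parts of the attenuation amount control unit 14 can be sketched in Python under stated assumptions. The patent does not say how the average E[Pn(R)] is computed, so an exponential moving average is assumed; the class name `AttenuationController`, the helper `attenuate_power`, the 6 dB raised attenuation, and all default parameter values are hypothetical.

```python
class AttenuationController:
    def __init__(self, noise_threshold: float, gain_threshold: float,
                 raised_db: float = 6.0, alpha: float = 0.1, period: int = 4):
        self.noise_threshold = noise_threshold  # threshold on E[Pn(R)]
        self.gain_threshold = gain_threshold    # threshold on gain G_T
        self.raised_db = raised_db              # assumed attenuation above threshold
        self.alpha = alpha                      # assumed averaging coefficient
        self.period = period                    # counter 14d interval, in calls
        self.avg = 0.0
        self.count = 0
        self.attenuation_db = 0.0               # initial value, 0 dB

    def step(self, pn_r: float, rsd: int, g_t: float) -> float:
        # 14c: stop updating the average during received speech or when the
        # transmission side gain G_T is above its threshold.
        frozen = rsd == 1 or g_t > self.gain_threshold
        if not frozen:
            self.avg += self.alpha * (pn_r - self.avg)  # 14a (assumed EMA)
        self.count += 1
        if self.count >= self.period:                   # 14d: decide intermittently
            self.count = 0
            above = self.avg > self.noise_threshold     # 14b
            self.attenuation_db = self.raised_db if above else 0.0
        return self.attenuation_db


def attenuate_power(ps_r: float, attenuation_db: float) -> float:
    """Attenuator 13 applied to the power estimate Ps(R)."""
    return ps_r * 10.0 ** (-attenuation_db / 10.0)
```

The value returned by `step` would then be passed to `attenuate_power` before Ps(R) reaches the insertion loss amount control unit 10.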

When the ambient noise level on the far end side is high, the reception side instantaneous power estimate Ps(R) also becomes large, so the insertion loss amount control unit 10 may misjudge the call state as the reception state and the voice switch VS may lock to the reception side. However, if the far end ambient noise level rises and the reception side background noise power estimate Pn(R) exceeds the threshold value, the attenuation amount control unit 14 increases the attenuation of the attenuator 13 from its initial value and thereby reduces the reception side instantaneous power estimate Ps(R). This prevents the insertion loss amount control unit 10 from misjudging the call state as the reception state and suppresses one-sided locking of the voice switch VS. In addition, because the average calculation update/stop determination unit 14c stops the calculation of the average value by the average value calculation unit 14a while the second voice section detection unit 12 is detecting a voice section, the attenuation of the attenuator 13 is not changed while the reception signal contains speech, which prevents the insertion loss amount control unit 10 from erroneously switching between the transmission mode and the reception mode. Furthermore, because the average calculation update/stop determination unit 14c stops the calculation of the average value while the insertion loss amount of the transmission side loss insertion unit 3 is less than the predetermined threshold value, the attenuation of the attenuator 13 is not changed by the reception signal wrapping around into the transmission side signal path, which likewise prevents the insertion loss amount control unit 10 from erroneously switching between the transmission mode and the reception mode.

The present applicant has already filed an application (see Japanese Patent Application No. 2003-394670) for a loudspeaker communication device in which, as shown in FIG. 4, an attenuator 13' is inserted in the path that inputs the reception signal to the second voice section detection unit 12, and an attenuation amount control unit 14' increases the attenuation of the attenuator 13' when the reception side background noise power estimate Pn(R) exceeds a predetermined threshold value, so as to obtain a similar effect. In the device of FIG. 4, however, the attenuator 13', the second voice section detection unit 12, and the attenuation amount control unit 14' form a feedback loop. When the attenuation of the attenuator 13' increases and the reception signal is attenuated, the reception side background noise power estimate Pn(R) decreases and falls below the threshold value, whereupon the attenuation amount control unit 14' reduces the attenuation of the attenuator 13'. The attenuation of the attenuator 13' may therefore be increased and decreased repeatedly, making the operation of the insertion loss amount control unit 10 unstable. In this reference example, by contrast, the attenuator 13 attenuates the reception side instantaneous power estimate Ps(R) and no feedback loop is formed, so the repeated increase and decrease of attenuation described above for the attenuation amount control unit 14' and the attenuator 13' does not occur, and the operation of the insertion loss amount control unit 10 can be stabilized.
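The difference between the two arrangements can be illustrated with a deliberately idealized Python sketch. It assumes the background noise estimate tracks its input instantly and that the attenuator has only two settings; the function name `simulate` and all numeric values are hypothetical and chosen only to make the oscillation visible.

```python
def simulate(noise_power: float, threshold: float, raised_gain: float,
             steps: int, feedback: bool) -> list[float]:
    """Return the attenuator's linear power gain at each step (1.0 means 0 dB)."""
    gain = 1.0
    history = []
    for _ in range(steps):
        # Feedback (FIG. 4 style): noise is measured after the attenuator.
        # No feedback (reference example): noise is measured before it.
        observed = noise_power * gain if feedback else noise_power
        gain = raised_gain if observed > threshold else 1.0
        history.append(gain)
    return history
```

With noise power 10, threshold 4, and a raised gain of 0.25, the feedback arrangement alternates between the two settings on every step, while the arrangement without feedback holds the raised attenuation.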

In this reference example the attenuator 13 and the attenuation amount control unit 14 are provided on the reception side. However, if an attenuator that attenuates the transmission side instantaneous power estimate Ps(T) output from the first voice section detection unit 11 and an attenuation amount control unit that increases the attenuation of that attenuator when the transmission side background noise power estimate Pn(T) exceeds a predetermined threshold value are provided, erroneous determination of the call state in the insertion loss amount control unit 10 can be prevented and one-sided locking of the voice switch VS suppressed in the same way as on the reception side. The attenuator and attenuation amount control unit may be provided on either the transmission side or the reception side, or on both.

(Embodiment 1)
FIG. 5 shows a block diagram of this embodiment. Since the basic configuration of this embodiment is the same as that of the reference example, common components are given the same reference numerals and their description is omitted; only the configuration that characterizes this embodiment is described.

This embodiment is characterized in that the voice switch VS is provided with a high-pass filter (HPF) 15 that removes, from the reception signal referred to by the insertion loss amount control unit 10, frequency band components lower than the main component band of speech.

For example, because the door phone slave unit S, which is the other party's call device, is installed outdoors, the reception signal transmitted from the door phone slave unit S may contain so-called wind noise. Wind noise generally contains many frequency band components lower than the main component band of speech, so the high-pass filter 15 can remove a certain amount of the wind noise contained in the reception signal.

Because the second voice section detection unit 12 refers to the reception signal after the high-pass filter 15 has removed a certain amount of the wind noise, the second voice section detection unit 12 can be prevented from erroneously detecting a voice section because of wind noise, and the reception side instantaneous power estimate Ps(R) for wind noise can be reduced. As a result, one-sided locking of the voice switch VS can be suppressed.
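The effect of such a filter on the power seen by the detector can be illustrated with a first-order high-pass filter in Python. The patent does not specify the filter order, cutoff frequency, or sampling rate; the 300 Hz cutoff, the 8 kHz sampling rate, and the function names `high_pass`, `mean_power`, and `tone` are assumptions for this sketch.

```python
import math


def high_pass(samples: list[float], cutoff_hz: float,
              sample_rate_hz: float) -> list[float]:
    """First-order high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    a = 1.0 / (1.0 + 2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def mean_power(samples: list[float]) -> float:
    return sum(s * s for s in samples) / len(samples)


def tone(freq_hz: float, sample_rate_hz: float, n: int) -> list[float]:
    """Unit-amplitude sine wave of n samples."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
```

Filtering a 50 Hz tone (standing in for low-frequency wind noise) and a 1000 Hz tone (standing in for speech) with these settings leaves the low tone with far less power than the high one, which is the property that lowers Ps(R) for wind noise.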

However, when the wind noise level is low or almost zero, or when the other party's call device is installed indoors, removing the low-frequency components of the reception signal with the high-pass filter 15 may itself cause erroneous detection in the second voice section detection unit 12. It is therefore desirable to provide means for identifying the call device of the other party communicating via the transmission system and enabling or disabling the high-pass filter 15 according to the identified call device, or means for taking in an operation input made by the user and enabling or disabling the high-pass filter 15 according to that operation input. For example, in the loudspeaker communication device of this embodiment, a control unit (not shown) formed of a microcomputer detects a call from the other party's call device and switches the transmission line for each individual call device. The control unit can therefore identify the calling device. During a call with a call device installed outdoors, the high-pass filter 15 is enabled, so that one-sided locking of the voice switch VS is suppressed by preventing erroneous detection of voice sections through noise removal and by reducing the reception side instantaneous power estimate Ps(R) for that noise. During a call with a call device installed indoors, the high-pass filter 15 is disabled, so that the balance of call state switching in the voice switch VS is maintained. Alternatively, a push button may be provided on the housing (not shown) of the loudspeaker communication device of this embodiment, so that when the user presses the push button an operation input is taken into the control unit and the control unit enables or disables the high-pass filter 15 according to that input. The user can then deliberately enable or disable the high-pass filter 15, so that one-sided locking of the voice switch VS is suppressed by preventing erroneous detection of voice sections and reducing the reception side instantaneous power estimate Ps(R) for the noise, and usability is also improved. The high-pass filter 15 may be enabled and disabled by, for example, providing a signal path that bypasses the high-pass filter 15 between the reference point on the reception side signal path and the second voice section detection unit 12, and opening and closing a contact inserted in that path.

(Embodiment 2)
FIG. 6 shows a block diagram of the first and second voice section detection units 11' and 12' in this embodiment. Since the basic configuration of the first and second voice section detection units 11' and 12' and the other components are the same as in the reference example, common components are given the same reference numerals and their illustration and description are omitted as appropriate.

Each of the first and second voice section detection units 11' and 12' in this embodiment comprises an instantaneous power estimation unit 20; a background noise power estimation unit 21; a first determination unit 22; a low-pass filter (LPF) 23 that removes, from the reference signal (the transmission signal or the reception signal), frequency band components higher than the main component band of speech; a second instantaneous power estimation unit 24 that estimates the instantaneous power of the reference signal after the low-pass filter 23 has removed the high-frequency band components; and a second determination unit 25 that detects a voice section on the basis of the instantaneous power estimate Ps estimated by the instantaneous power estimation unit 20, the detection flag indicating the determination result of the first determination unit 22, and the instantaneous power estimate Ps_L estimated by the second instantaneous power estimation unit 24. The first determination unit 22 determines a voice section and sets the detection flag (voice section detection signal) to "1" when the ratio of the instantaneous power estimate Ps to the background noise power estimate Pn (= Ps/Pn) is equal to or greater than a predetermined threshold, and determines a non-voice section and sets the detection flag to "0" when the ratio is less than the threshold.

The second determination unit 25, on the other hand, performs the determination process shown in the flowchart of FIG. 7. When the detection flag output from the first determination unit 22 is "0", that is, when the first determination unit 22 has determined a non-voice section, the second determination unit 25 immediately determines a non-voice section and sets the output flag (voice section detection signal TSD or RSD) to "0". When the detection flag is "1", that is, when the first determination unit 22 has determined a voice section, the second determination unit 25 compares the value obtained by multiplying the instantaneous power estimate Ps from the instantaneous power estimation unit 20 by a predetermined coefficient γ (0 < γ < 1), namely Ps·γ, with the instantaneous power estimate Ps_L from the second instantaneous power estimation unit 24. If Ps·γ < Ps_L, it determines a voice section and sets the output flag to "1"; if Ps·γ ≥ Ps_L, it determines a non-voice section and sets the output flag to "0".
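The decision made by the second determination unit 25 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and argument names are hypothetical, and the inputs Ps, Ps_L and the first-stage flag are assumed to have been computed elsewhere.

```python
def second_determination(flag1: int, ps: float, ps_l: float, gamma: float) -> int:
    """Return 1 for a voice section and 0 for a non-voice section.

    flag1 : detection flag from the first determination unit 22 (0 or 1)
    ps    : instantaneous power estimate Ps of the unfiltered reference signal
    ps_l  : instantaneous power estimate Ps_L after the low-pass filter 23
    gamma : coefficient with 0 < gamma < 1
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must satisfy 0 < gamma < 1")
    if flag1 == 0:
        # The first stage already decided "non-voice": pass that through.
        return 0
    # Voice only if enough of the power survives the low-pass filter.
    return 1 if ps * gamma < ps_l else 0
```

When most of the signal power lies above the speech band, Ps_L falls well below Ps, the comparison fails, and the frame is rejected even though the first stage flagged it.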

That is, in an environment containing noise that is non-stationary and has frequency band components higher than the main component band of speech, such as a telephone ringing tone or an animal's cry (particularly that of a small dog), the first or second voice section detection unit 11, 12 may erroneously detect this noise as a voice section. In the first or second voice section detection unit 11', 12' of this embodiment, the second instantaneous power estimation unit 24 estimates the instantaneous power estimate Ps_L of the reference signal after the low-pass filter 23 has removed the noise components above the main component band of speech, and the second determination unit 25 makes an overall determination of the voice section on the basis of this estimate Ps_L, the instantaneous power estimate Ps of the reference signal before those components are removed, and the determination result of the first determination unit 22. This prevents the first and second voice section detection units 11' and 12' from erroneously detecting a voice section because of such noise.
Note that the optimum value of the coefficient γ, by which the instantaneous power estimate Ps from the instantaneous power estimation unit 20 is multiplied in the determination process of the second determination unit 25, is considered to differ for each counterpart call device. If γ is therefore made variable and, for example, is set automatically to the optimum value for each counterpart call device as described in Embodiment 1, or is set by a user operation, the detection operation of the first and second voice section detection units 11' and 12' can be optimized.

(Embodiment 3)
This embodiment is characterized by the determination process in the first determination unit 22 of the first and second voice section detection units 11' and 12'. The overall configuration is the same as in Embodiment 2, so illustration and description are omitted.

The first determination unit 22 in this embodiment determines that the reference signal x(n) is voice only when all of the following three conditions are satisfied: (1) the instantaneous power estimate Ps(n) is equal to or greater than a predetermined threshold Pth; (2) the ratio Ps(n)/Pn(n) of the instantaneous power estimate Ps(n) to the background noise power estimate Pn(n) is equal to or greater than a threshold δ; and (3) the absolute value of the difference between two instantaneous power estimates Ps(n) and Ps(n-K), calculated a predetermined time interval K apart, is equal to or greater than a predetermined threshold χ. The time interval K is set to a value appropriate to the characteristics of the target noise (for example, the baby's cry discussed below).

Next, the specific determination process in the first determination unit 22 is described with reference to the flowchart of FIG. 8. First, the instantaneous power estimate Ps(n) calculated by the instantaneous power estimation unit 20 is compared with the threshold Pth (step 1). If it is equal to or greater than Pth, the ratio Ps(n)/Pn(n) of the instantaneous power estimate Ps(n) to the background noise power estimate Pn(n) is compared with the threshold δ (step 2). If the ratio is equal to or greater than δ, the absolute value |Ps(n) - Ps(n-K)| of the difference between the two instantaneous power estimates Ps(n) and Ps(n-K) is compared with the threshold χ (step 3), and if it is equal to or greater than χ, a voice section is determined (step 4). If the instantaneous power estimate Ps(n) is less than Pth, the ratio Ps(n)/Pn(n) is less than δ, or the absolute difference |Ps(n) - Ps(n-K)| is less than χ, a non-voice section is determined (step 5).
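The flow of FIG. 8 as described above can be sketched as follows. The function and parameter names are hypothetical. The ratio test Ps(n)/Pn(n) ≥ δ is written as Ps(n) ≥ δ·Pn(n), which is equivalent for Pn(n) > 0 and avoids a division by zero; that rewriting is a choice made for this sketch, not something stated in the source.

```python
def first_determination(ps_n: float, ps_n_minus_k: float, pn_n: float,
                        p_th: float, delta: float, chi: float) -> bool:
    """Return True for a voice section, False for a non-voice section."""
    # Step 1: absolute level check against Pth.
    if ps_n < p_th:
        return False
    # Step 2: ratio to the background noise estimate against delta.
    if ps_n < delta * pn_n:
        return False
    # Step 3: fluctuation of the power estimate over the interval K against chi.
    if abs(ps_n - ps_n_minus_k) < chi:
        return False
    # Step 4: all three conditions hold.
    return True
```

A signal that is loud and well above the noise floor but whose power barely changes over the interval K fails step 3 and is treated as non-voice.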

Conditions (1) and (2) above have been in general use for some time. The present inventors confirmed by experiment that adding condition (3) prevents non-stationary ambient noise other than speech from being erroneously detected as voice. Specifically, taking a baby's cry as the non-stationary ambient noise, the instantaneous power estimate Ps and the absolute difference |Ps(n) - Ps(n-K)| of the instantaneous power estimates were obtained for reference signals x(n) containing, respectively, a caller's voice (a male voice and a female voice) and a baby's cry. The results are shown in FIGS. 9 to 11. FIGS. 9(a), 10(a), and 11(a) show the instantaneous power estimate Ps when the reference signal x(n) contains the baby's cry, the male voice, and the female voice, respectively, and part (b) of each figure shows the corresponding absolute difference of the instantaneous power estimates. The time interval K was 4 ms. As for the level of the reference signals x(n), the average sound pressures of the male and female voices were equal, and each was about 4 dB higher than that of the baby's cry.

Comparing FIGS. 9(a), 10(a), and 11(a) shows that the temporal fluctuation of the instantaneous power estimate Ps(n) is larger for the caller's voice than for the baby's cry. As a result, a significant difference appears in the absolute difference |Ps(n) - Ps(n-K)| of the instantaneous power estimates, as shown in FIGS. 9(b), 10(b), and 11(b). Adding the absolute difference |Ps(n) - Ps(n-K)| to the determination conditions therefore allows the baby's cry to be determined as noise (non-voice); in other words, it prevents the cry from being misjudged as voice. Other non-stationary ambient noise whose temporal fluctuation is, like that of a baby's cry, small compared with conversational speech, such as classical music or a dog's howl, can presumably also be determined as non-voice by this embodiment.

(Embodiment 4)
FIG. 12 shows a block diagram of this embodiment. Since the basic configuration of this embodiment is the same as that of Embodiment 1, common components are given the same reference numerals and their description is omitted; only the configuration that characterizes this embodiment is described.

This embodiment is characterized in that the first voice section detection unit 11' of Embodiment 2 is applied to the configuration of Embodiment 1. Noise such as telephone ringing tones and dog barks is thought to occur mostly indoors, so applying the countermeasure against erroneous voice section detection caused by this kind of noise only to the first voice section detection unit 11' reduces the total cost. It is also desirable for the first determination unit 22 of the first voice section detection unit 11' to perform the same determination process as in Embodiment 3, so that a baby's cry is not misjudged as voice.

(Embodiment 5)
FIG. 13 shows a block diagram of this embodiment. Since the basic configuration of this embodiment is the same as that of Embodiment 1, common components are given the same reference numerals and their description is omitted; only the configuration that characterizes this embodiment is described.

This embodiment is characterized in that, in the configuration of Embodiment 1, the voice switch VS includes a low-pass filter (LPF) 16 that removes frequency band components higher than the main component band of speech from the reference signal (transmission signal) referenced by the first voice section detection unit 11.

That is, as described in Embodiment 2, in an environment (mainly indoors) containing noise that is non-stationary and has frequency band components higher than the main component band of speech, such as a telephone ringing tone or an animal's cry (particularly that of a small dog), the first voice section detection unit 11 may erroneously detect this noise as a voice section. Removing the frequency band components above the main component band of speech from the transmission signal with the low-pass filter 16 reduces the influence of such noise and suppresses erroneous voice section detection in the first voice section detection unit 11, with a simpler configuration than the first voice section detection unit 11' of Embodiment 2.

The cutoff frequency of the low-pass filter 16 is chosen mainly with the frequency characteristics of adult voices in mind. The pitch of an infant's voice, however, is generally higher than that of an adult's voice, so the low-pass filter 16 may remove part of an infant's voice. When the loudspeaker communication device of this embodiment is used in a household with such an infant, therefore, unless the cutoff frequency of the low-pass filter 16 is set to an appropriate value, the first voice section detection unit 11 may fail to determine the infant's voice as voice, detect the voice section erroneously, and cause the voice switch to lock to one side. It is therefore desirable to provide either means that takes in an operation input from the user and switches the cutoff frequency of the low-pass filter 16 according to that input, or means that detects the pitch of the voice contained in the reference signal and changes the cutoff frequency of the low-pass filter 16 according to the detected pitch.
For example, a changeover switch may be provided on the housing (not shown) of the loudspeaker communication device of this embodiment so that, when the user operates it, the operation input is taken into the control unit and the control unit switches the cutoff frequency of the low-pass filter 16 among several values accordingly. The user can then switch the cutoff frequency depending on whether an infant is present, which prevents the voice section from being detected erroneously because the first voice section detection unit 11 fails to determine the infant's voice as voice. Alternatively, the pitch of the voice contained in the reference signal (transmission signal) may be detected and used to determine whether that voice is an infant's; the cutoff frequency is switched to a high value if it is an infant's voice and to a low value if it is not, which likewise prevents the voice section from being detected erroneously. Methods of detecting voice pitch are well known, so a detailed description is omitted.
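The pitch-based variant can be illustrated with a minimal sketch. Everything here is hypothetical: the function name, the rule that a pitch at or above a single threshold indicates an infant's voice, and the idea of exactly two cutoff values are assumptions made for illustration. The source specifies no numerical values, so none are built into the function.

```python
def select_lpf_cutoff(pitch_hz: float, infant_pitch_threshold_hz: float,
                      low_cutoff_hz: float, high_cutoff_hz: float) -> float:
    """Choose the cutoff of low-pass filter 16 from the detected voice pitch.

    A pitch at or above the threshold is treated as an infant's voice
    (assumed rule) and selects the higher cutoff; otherwise the lower
    cutoff is used.
    """
    if high_cutoff_hz <= low_cutoff_hz:
        raise ValueError("high_cutoff_hz must exceed low_cutoff_hz")
    if pitch_hz >= infant_pitch_threshold_hz:
        return high_cutoff_hz
    return low_cutoff_hz
```

The pitch value itself would come from whatever pitch detector the device already uses, and the threshold and cutoff values would have to be tuned for the actual filter.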

(Embodiment 6)
FIG. 14 shows a block diagram of this embodiment. Since the basic configuration of this embodiment is the same as that of Embodiment 5, common components are given the same reference numerals and their description is omitted; only the configuration that characterizes this embodiment is described.

This embodiment is characterized in that, in the configuration of Embodiment 5, a high-pass filter (HPF) 17 that removes frequency band components lower than the main component band of speech is provided in series with the low-pass filter 16. The cutoff frequency of the high-pass filter 17 is set to the lower limit of the main component band of speech, and the cutoff frequency of the low-pass filter 16 is set to its upper limit.

According to this embodiment, the low-pass filter 16 removes noise that is non-stationary and has frequency band components higher than the main component band of speech, such as telephone ringing tones and animal cries, and the high-pass filter 17 removes noise having frequency band components lower than the main component band of speech. This prevents the first voice section detection unit 11 from erroneously detecting a voice section because of noise and reduces the transmission-side instantaneous power estimate Ps(T) for the noise, so that the voice switch VS is kept from locking to one side.

(Embodiment 7)
FIG. 15 shows a block diagram of this embodiment. Since the basic configuration of this embodiment is the same as that of Embodiment 6, common components are given the same reference numerals and their description is omitted; only the configuration that characterizes this embodiment is described.

As shown in FIG. 15, this embodiment is characterized in that the components of the voice switch VS other than the transmission-side and receiving-side loss insertion units 3 and 4 are implemented by a digital circuit A such as a DSP. That is, the voice switch VS in this embodiment consists of the transmission-side loss insertion unit 3 and the receiving-side loss insertion unit 4, which are analog circuits; A/D conversion units 18 and 19, which A/D-convert the analog reference signals (the transmission signal and the reception signal) and output them to the digital circuit A; and the digital circuit A, which implements in software the functions of the insertion loss amount control unit 10, the first and second voice section detection units 11 and 12, the attenuator 13, the attenuation amount control unit 14, and first to third filter units 31 to 33 that filter the digital reference signals.

The first and second voice section detection units 11 and 12 have a common configuration. As shown in FIG. 16, each comprises an instantaneous power estimation unit 20 for the reference signal (the transmission signal input to the transmission-side loss insertion unit 3, or the reception signal input to the receiving-side loss insertion unit 4); a background noise power estimation unit 21 that estimates the background noise power of the reference signal; a first determination unit 22 that determines the voice section of the reference signal on the basis of the ratio of the instantaneous power estimate to the background noise power estimate; and a time constant update unit 26. The instantaneous power estimation unit 20 consists of an absolute average value calculation unit 201, which obtains the time average Pz(n) of the absolute value of the reference signal x(n) (the absolute average value), and an absolute average value smoothing unit 202, which smooths the time series of absolute average values Pz(n) calculated by the absolute average value calculation unit 201.

The absolute average value calculation unit 201 consists of an absolute value calculation unit 201a, which obtains the absolute value of the reference signal x(n) sampled at a predetermined sampling period; a sum calculation unit 201b, which obtains the sum of the absolute values over a predetermined time frame (M samples); and a division unit 201c, which divides the calculated sum by the number of samples M to obtain the absolute average value Pz(n). In short, the absolute average value calculation unit 201 performs the calculation of equation (1).

The absolute average value smoothing unit 202 consists of a multiplier 202a, which multiplies the absolute average value Pz(n) by a positive constant α (< 1); a delay shift register 202b; a multiplier 202c, which multiplies the instantaneous power estimate Ps(n-1) delayed by the delay shift register 202b by the positive constant (1 - α); and an adder 202d, which adds the outputs of the two multipliers 202a and 202c. In short, the absolute average value smoothing unit 202 performs the calculation of equation (2).
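Written out from the descriptions of units 201 and 202 above, equations (1) and (2) would take roughly the following form. Equation (2) follows directly from the multiplier and adder structure. In equation (1), taking the frame as the most recent M samples is an assumption, since the text only says that the absolute values are summed over a frame of M samples and divided by M.

```latex
\begin{align}
P_z(n) &= \frac{1}{M} \sum_{i=0}^{M-1} \lvert x(n-i) \rvert \tag{1} \\
P_s(n) &= \alpha \, P_z(n) + (1-\alpha) \, P_s(n-1), \qquad 0 < \alpha < 1 \tag{2}
\end{align}
```

Equation (2) is a first-order recursive (exponential) smoother: a smaller α gives a smoother but slower-reacting estimate Ps(n).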

The background noise power estimation unit 21, on the other hand, consists of a delay shift register 211, which delays the background noise power estimate Pn(n); a comparator 212, which compares the instantaneous power estimate Ps(n) with the estimate Pn(n-1) delayed by the delay shift register 211; first and second counters 213 and 214, which increment count values Cu and Cd, respectively, according to the comparison result of the comparator 212; a selector 215, which selects and outputs one of three correction values β(n), 0, and -β(n) (where β(n) > 0) according to how the count values Cu and Cd of the first and second counters 213 and 214 compare with thresholds Us and Ds; and an adder 216, which adds the delayed estimate Pn(n-1) to the correction value output from the selector 215. The first and second counters 213 and 214 update the count values Cu and Cd at every sampling period of the reference signal x according to the following rules.

If Ps(n) ≥ Pn(n-1), then Cu = Cu + 1 and Cd = 0.
If Ps(n) < Pn(n-1), then Cu = 0 and Cd = Cd + 1.
The selector 215 selects and outputs one of the three correction values according to the following rules.

If Cu = Us, output β(n) (and at the same time reset Cu to 0).
If Cd = Ds, output -β(n) (and at the same time reset Cd to 0).
If Cu ≠ Us and Cd ≠ Ds, output 0.
Therefore, if the thresholds Us and Ds, with which the count values Cu and Cd of the first and second counters 213 and 214 are compared, are set so that Us ≫ Ds, a filter is obtained whose response has a large rise time constant and a small fall time constant. The rise time constant is determined by the positive correction value β(n) and the threshold Us; it becomes shorter as the correction value β(n) becomes larger or as the threshold Us becomes smaller.
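The counter-and-selector rules above can be sketched as a small Python class. The class and attribute names are hypothetical, the correction step β is held constant here although the source writes it as β(n), and the thresholds are assumed to be at least 1.

```python
from dataclasses import dataclass


@dataclass
class BackgroundNoiseTracker:
    us: int          # threshold Us for the up counter Cu (assumed >= 1)
    ds: int          # threshold Ds for the down counter Cd (assumed >= 1)
    beta: float      # correction step; constant in this sketch
    pn: float = 0.0  # current background noise power estimate Pn
    cu: int = 0
    cd: int = 0

    def update(self, ps: float) -> float:
        """Consume one instantaneous power estimate Ps(n); return Pn(n)."""
        # Comparator 212 and counters 213/214.
        if ps >= self.pn:
            self.cu += 1
            self.cd = 0
        else:
            self.cu = 0
            self.cd += 1
        # Selector 215: choose +beta, -beta, or 0.
        correction = 0.0
        if self.cu == self.us:
            correction = self.beta
            self.cu = 0
        elif self.cd == self.ds:
            correction = -self.beta
            self.cd = 0
        # Adder 216: Pn(n) = Pn(n-1) + correction.
        self.pn += correction
        return self.pn
```

With Us much larger than Ds, the estimate rises only after Ps has stayed at or above it for many consecutive samples but drops quickly once Ps falls below it, which is the slow-rise, fast-fall behaviour described above.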

The first determination unit 22 determines a voice section and sets the detection flag (voice section detection signal) to "1" when the ratio of the instantaneous power estimate Ps to the background noise power estimate Pn (= Ps/Pn) is equal to or greater than a predetermined threshold, and determines a non-voice section and sets the detection flag to "0" when the ratio is less than the threshold.

FIG. 17 is a block diagram of the first and second filter units 31 and 32, and FIG. 18 is a block diagram of the third filter unit 33; all of them are digital filters. The first filter unit 31 is a second-order filter implemented with the five parameters b0 to b5, two delay operations D, and four additions. The second filter unit 32 is a first-order filter implemented with the three parameters a0 to a2, one delay operation D, and two additions. By setting the values of the parameters a0 to a2 and b0 to b5 appropriately, a high-pass filter or low-pass filter with the desired characteristics can easily be realized without changing the circuit configuration. The third filter unit 33 is a first-order filter implemented with the two parameters c1 and c2, one delay operation D, and two additions; by setting the values of c1 and c2 appropriately, a high-pass filter with the desired characteristics can easily be realized without changing the circuit configuration.

FIG. 1 is a block diagram showing a reference example of the present invention.
FIG. 2 is a block diagram showing the first and second voice section detection units of the reference example.
FIG. 3 is a block diagram showing the attenuation amount control unit of the reference example.
FIG. 4 is a block diagram showing a comparative example for the reference example.
FIG. 5 is a block diagram showing Embodiment 1 of the present invention.
FIG. 6 is a block diagram showing the first and second voice section detection units in Embodiment 2 of the present invention.
FIG. 7 is a flowchart for explaining the operation of the second determination unit in Embodiment 2.
FIG. 8 is a flowchart for explaining the operation of the first determination unit in Embodiment 3 of the present invention.
FIG. 9 is a waveform diagram showing the experimental results for a baby's cry in Embodiment 3.
FIG. 10 is a waveform diagram showing the experimental results for a male voice in Embodiment 3.
FIG. 11 is a waveform diagram showing the experimental results for a female voice in Embodiment 3.
FIG. 12 is a block diagram showing Embodiment 4 of the present invention.
FIG. 13 is a block diagram showing Embodiment 5 of the present invention.
FIG. 14 is a block diagram showing Embodiment 6 of the present invention.
FIG. 15 is a block diagram showing Embodiment 7 of the present invention.
FIG. 16 is a block diagram showing the first and second voice section detection units in Embodiment 7.
FIG. 17 is a block diagram showing the first and second filter units in Embodiment 7.
FIG. 18 is a block diagram showing the third filter unit in Embodiment 7.
FIG. 19 is a block diagram showing a conventional example.

Explanation of symbols

1 Microphone
2 Speaker
3 Transmission-side loss insertion unit
4 Receiving-side loss insertion unit
10 Insertion loss amount control unit
11 First voice section detection unit
12 Second voice section detection unit
13 Attenuator
14 Attenuation amount control unit

Claims

1. A loudspeaker apparatus comprising a microphone, a speaker, and a voice switch that switches between a transmission state, in which a transmission signal is sent to a transmission system and a reception signal is attenuated, and a reception state, in which the reception signal is sent to the speaker and the transmission signal is attenuated,
wherein the voice switch includes: a transmission-side loss insertion unit that inserts a loss into the transmission-side signal path from the microphone to the transmission system; a reception-side loss insertion unit that inserts a loss into the reception-side signal path from the transmission system to the speaker; a first voice section detection unit that detects voice sections of the transmission signal; a second voice section detection unit that detects voice sections of the reception signal; a transmission-side instantaneous power estimation unit that estimates the instantaneous power of the transmission signal; a transmission-side background noise power estimation unit that estimates the background noise power of the transmission signal; a reception-side instantaneous power estimation unit that estimates the instantaneous power of the reception signal; a reception-side background noise power estimation unit that estimates the background noise power of the reception signal; and an insertion loss amount control unit that controls the insertion loss amount of each of the transmission-side and reception-side loss insertion units,
wherein the insertion loss amount control unit determines the call state with reference to the result of comparing the transmission-side and reception-side instantaneous power estimates and to the detection results of the first and second voice section detection units, and, according to the determination result, switches to at least one of a transmission mode, in which the reception-side insertion loss amount is made relatively large, and a reception mode, in which the transmission-side insertion loss amount is made relatively large,
characterized in that the voice switch includes at least one of: a first attenuator that attenuates the transmission-side instantaneous power estimate, together with a first attenuation amount control unit that increases the attenuation amount of the first attenuator when the transmission-side background noise power estimate exceeds a predetermined threshold; or a second attenuator that attenuates the reception-side instantaneous power estimate, together with a second attenuation amount control unit that increases the attenuation amount of the second attenuator when the reception-side background noise power estimate exceeds a predetermined threshold; and in that the voice switch further includes a high-pass filter that removes, from the reception signal referred to by the second voice section detection unit, frequency band components lower than the main component band of speech.
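
As a rough illustration of the attenuator and attenuation amount control unit recited in claim 1, the sketch below attenuates an instantaneous power estimate more strongly once the background noise power estimate exceeds a threshold, and then uses the attenuated value in the transmit/receive comparison. The function names, the dB values, the rule for resolving simultaneous speech, and the choice to keep the previous mode when neither side is speaking are all assumptions made for illustration; the claim does not specify them.

```python
def attenuated_power(inst_power, noise_power, noise_threshold,
                     base_atten_db=0.0, extra_atten_db=6.0):
    """Attenuate an instantaneous power estimate (linear power units).

    The attenuation is increased by extra_atten_db when the background
    noise power estimate exceeds noise_threshold. All numeric defaults
    are hypothetical.
    """
    atten_db = base_atten_db
    if noise_power > noise_threshold:
        atten_db += extra_atten_db
    return inst_power * 10.0 ** (-atten_db / 10.0)


def decide_mode(tx_power, rx_power, tx_voice, rx_voice, previous_mode):
    """Choose "transmit" or "receive" from the power comparison and the
    two voice section detector flags (assumed decision rule)."""
    if tx_voice and (not rx_voice or tx_power > rx_power):
        return "transmit"
    if rx_voice:
        return "receive"
    # Neither side is speaking: keep the previous mode (assumption).
    return previous_mode
```

Under these assumptions, a noisy transmission side has its power estimate reduced before the comparison, so steady room noise is less likely to hold the device in the transmission mode while the far end is speaking.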

2. The loudspeaker apparatus according to claim 1, further comprising means for identifying the partner call device with which a call is made via the transmission system and for enabling or disabling the high-pass filter in accordance with the identified call device.

3. The loudspeaker apparatus according to claim 1, further comprising means for capturing an operation input made by a user and for enabling or disabling the high-pass filter in accordance with the operation input.

4. A loudspeaker apparatus comprising a microphone, a speaker, and a voice switch that switches between a transmission state, in which a transmission signal is sent to a transmission system and a reception signal is attenuated, and a reception state, in which the reception signal is sent to the speaker and the transmission signal is attenuated,
wherein the voice switch includes: a transmission-side loss insertion unit that inserts a loss into the transmission-side signal path from the microphone to the transmission system; a reception-side loss insertion unit that inserts a loss into the reception-side signal path from the transmission system to the speaker; a first voice section detection unit that detects voice sections of the transmission signal; a second voice section detection unit that detects voice sections of the reception signal; a transmission-side instantaneous power estimation unit that estimates the instantaneous power of the transmission signal; a transmission-side background noise power estimation unit that estimates the background noise power of the transmission signal; a reception-side instantaneous power estimation unit that estimates the instantaneous power of the reception signal; a reception-side background noise power estimation unit that estimates the background noise power of the reception signal; and an insertion loss amount control unit that controls the insertion loss amount of each of the transmission-side and reception-side loss insertion units,
wherein the insertion loss amount control unit determines the call state with reference to the result of comparing the transmission-side and reception-side instantaneous power estimates and to the detection results of the first and second voice section detection units, and, according to the determination result, switches to at least one of a transmission mode, in which the reception-side insertion loss amount is made relatively large, and a reception mode, in which the transmission-side insertion loss amount is made relatively large,
characterized in that the voice switch includes at least one of: a first attenuator that attenuates the transmission-side instantaneous power estimate, together with a first attenuation amount control unit that increases the attenuation amount of the first attenuator when the transmission-side background noise power estimate exceeds a predetermined threshold; or a second attenuator that attenuates the reception-side instantaneous power estimate, together with a second attenuation amount control unit that increases the attenuation amount of the second attenuator when the reception-side background noise power estimate exceeds a predetermined threshold; and in that each of the first and second voice section detection units includes: a first instantaneous power estimation unit that estimates the instantaneous power of a reference signal; a background noise power estimation unit that estimates the power of background noise constantly present in the reference signal; a first determination unit that determines voice sections of the reference signal based on the ratio of the instantaneous power estimate to the background noise power estimate; a low-pass filter that removes, from the reference signal, frequency band components higher than the main component band of speech; a second instantaneous power estimation unit that estimates the instantaneous power of the reference signal after the high-frequency band components have been removed by the low-pass filter; and a second determination unit that determines a non-voice section when the first determination unit determines a non-voice section and that, when the first determination unit determines a voice section, determines whether the section is a voice section based on the magnitude relationship between the value obtained by multiplying the instantaneous power estimate of the first instantaneous power estimation unit by a positive coefficient less than 1 and the instantaneous power estimate of the second instantaneous power estimation unit.
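
The two-stage determination in claim 4 can be sketched as follows. The first stage tests the ratio of instantaneous power to background noise power; the second stage compares the low-pass-filtered power with a fraction of the unfiltered power. The function names, the threshold and coefficient values, and in particular the direction of the second comparison (treating the frame as voice when the filtered power is at least `coeff` times the unfiltered power) are assumptions for illustration; the claim only states that the decision rests on the magnitude relationship between the two values.

```python
def first_determination(inst_power, noise_power, ratio_threshold=4.0):
    """First stage: voice if inst_power / noise_power exceeds the threshold.

    Written as a product so that noise_power == 0 does not divide by zero.
    The threshold value is hypothetical.
    """
    return inst_power > ratio_threshold * noise_power


def second_determination(first_is_voice, inst_power, lowpassed_power,
                         coeff=0.5):
    """Second stage: only runs its own test when the first stage says voice.

    coeff must satisfy 0 < coeff < 1, as in the claim. The comparison
    direction is an assumption: a sound whose power lies mostly above the
    speech band loses most of its power in the low-pass filter and is
    rejected.
    """
    if not first_is_voice:
        return False
    return lowpassed_power >= coeff * inst_power
```

For example, with `coeff=0.5`, a frame whose power drops from 8.0 to 6.0 after low-pass filtering is kept as voice, while a frame that drops from 8.0 to 1.0 is rejected.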

5. The loudspeaker apparatus according to claim 4, wherein the coefficient in the second determination unit is variable.

6. The loudspeaker apparatus according to claim 4 or 5, wherein the first determination unit included in the first voice section detection unit obtains the absolute value of the difference between two instantaneous power estimates estimated by the first instantaneous power estimation unit at a predetermined time interval, and makes its determination with reference to the result of comparing the absolute value of the difference with a predetermined threshold.
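
A minimal sketch of the difference test in claim 6, assuming frame-based processing. The class name, the interval expressed in frames, the threshold values, and the decision rule (requiring both the ratio test and a sufficiently large power change) are hypothetical; the claim says only that the comparison result is referred to, not how it affects the outcome.

```python
from collections import deque


class DifferenceCheckedDetector:
    """First determination unit with an added power-difference check."""

    def __init__(self, interval=3, ratio_threshold=4.0, diff_threshold=0.5):
        # Holds the last `interval` instantaneous power estimates.
        self.history = deque(maxlen=interval)
        self.ratio_threshold = ratio_threshold
        self.diff_threshold = diff_threshold

    def update(self, inst_power, noise_power):
        """Return True if the current frame is judged to be voice."""
        ratio_ok = inst_power > self.ratio_threshold * noise_power
        if len(self.history) == self.history.maxlen:
            # history[0] is the estimate from `interval` frames ago.
            diff = abs(inst_power - self.history[0])
            diff_ok = diff > self.diff_threshold
        else:
            # Not enough history yet: treat as non-voice (assumption).
            diff_ok = False
        self.history.append(inst_power)
        return ratio_ok and diff_ok
```

With this rule, a loud but steady signal passes the ratio test yet fails the difference test, whereas a signal whose power changes markedly over the interval passes both.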

7. A loudspeaker apparatus comprising a microphone, a speaker, and a voice switch that switches between a transmission state, in which a transmission signal is sent to a transmission system and a reception signal is attenuated, and a reception state, in which the reception signal is sent to the speaker and the transmission signal is attenuated,
wherein the voice switch includes: a transmission-side loss insertion unit that inserts a loss into the transmission-side signal path from the microphone to the transmission system; a reception-side loss insertion unit that inserts a loss into the reception-side signal path from the transmission system to the speaker; a first voice section detection unit that detects voice sections of the transmission signal; a second voice section detection unit that detects voice sections of the reception signal; a transmission-side instantaneous power estimation unit that estimates the instantaneous power of the transmission signal; a transmission-side background noise power estimation unit that estimates the background noise power of the transmission signal; a reception-side instantaneous power estimation unit that estimates the instantaneous power of the reception signal; a reception-side background noise power estimation unit that estimates the background noise power of the reception signal; and an insertion loss amount control unit that controls the insertion loss amount of each of the transmission-side and reception-side loss insertion units,
wherein the insertion loss amount control unit determines the call state with reference to the result of comparing the transmission-side and reception-side instantaneous power estimates and to the detection results of the first and second voice section detection units, and, according to the determination result, switches to at least one of a transmission mode, in which the reception-side insertion loss amount is made relatively large, and a reception mode, in which the transmission-side insertion loss amount is made relatively large,
characterized in that the voice switch includes at least one of: a first attenuator that attenuates the transmission-side instantaneous power estimate, together with a first attenuation amount control unit that increases the attenuation amount of the first attenuator when the transmission-side background noise power estimate exceeds a predetermined threshold; or a second attenuator that attenuates the reception-side instantaneous power estimate, together with a second attenuation amount control unit that increases the attenuation amount of the second attenuator when the reception-side background noise power estimate exceeds a predetermined threshold; in that the voice switch further includes a high-pass filter that removes, from the reception signal referred to by the second voice section detection unit, frequency band components lower than the main component band of speech; and in that the first voice section detection unit includes: a first instantaneous power estimation unit that estimates the instantaneous power of a reference signal; a background noise power estimation unit that estimates the power of background noise constantly present in the reference signal; a first determination unit that determines voice sections of the reference signal based on the ratio of the instantaneous power estimate to the background noise power estimate; a low-pass filter that removes, from the reference signal, frequency band components higher than the main component band of speech; a second instantaneous power estimation unit that estimates the instantaneous power of the reference signal after the high-frequency band components have been removed by the low-pass filter; and a second determination unit that determines a non-voice section when the first determination unit determines a non-voice section and that, when the first determination unit determines a voice section, determines whether the section is a voice section based on the magnitude relationship between the value obtained by multiplying the instantaneous power estimate of the first instantaneous power estimation unit by a positive coefficient less than 1 and the instantaneous power estimate of the second instantaneous power estimation unit.

8. The loudspeaker apparatus according to claim 7, wherein the first determination unit obtains the absolute value of the difference between two instantaneous power estimates estimated by the first instantaneous power estimation unit at a predetermined time interval, and makes its determination with reference to the result of comparing the absolute value of the difference with a predetermined threshold.

9. A loudspeaker apparatus comprising a microphone, a speaker, and a voice switch that switches between a transmission state, in which a transmission signal is sent to a transmission system and a reception signal is attenuated, and a reception state, in which the reception signal is sent to the speaker and the transmission signal is attenuated,
wherein the voice switch includes: a transmission-side loss insertion unit that inserts a loss into the transmission-side signal path from the microphone to the transmission system; a reception-side loss insertion unit that inserts a loss into the reception-side signal path from the transmission system to the speaker; a first voice section detection unit that detects voice sections of the transmission signal; a second voice section detection unit that detects voice sections of the reception signal; a transmission-side instantaneous power estimation unit that estimates the instantaneous power of the transmission signal; a transmission-side background noise power estimation unit that estimates the background noise power of the transmission signal; a reception-side instantaneous power estimation unit that estimates the instantaneous power of the reception signal; a reception-side background noise power estimation unit that estimates the background noise power of the reception signal; and an insertion loss amount control unit that controls the insertion loss amount of each of the transmission-side and reception-side loss insertion units,
wherein the insertion loss amount control unit determines the call state with reference to the result of comparing the transmission-side and reception-side instantaneous power estimates and to the detection results of the first and second voice section detection units, and, according to the determination result, switches to at least one of a transmission mode, in which the reception-side insertion loss amount is made relatively large, and a reception mode, in which the transmission-side insertion loss amount is made relatively large,
characterized in that the voice switch includes at least one of: a first attenuator that attenuates the transmission-side instantaneous power estimate, together with a first attenuation amount control unit that increases the attenuation amount of the first attenuator when the transmission-side background noise power estimate exceeds a predetermined threshold; or a second attenuator that attenuates the reception-side instantaneous power estimate, together with a second attenuation amount control unit that increases the attenuation amount of the second attenuator when the reception-side background noise power estimate exceeds a predetermined threshold; and in that the voice switch further includes a low-pass filter that removes, from the transmission signal referred to by the first voice section detection unit, frequency band components higher than the main component band of speech, and a high-pass filter that removes, from the reception signal referred to by the second voice section detection unit, frequency band components lower than the main component band of speech.
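
The low-pass and high-pass filtering recited in claim 9 could be realized in many ways. The sketch below uses standard first-order (one-pole) digital filters purely as an illustration; the filter order, the cutoff frequencies, the sample rate, and the helper names `tone` and `mean_power` are assumptions and are not taken from the patent.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out, y, prev_x = [], 0.0, 0.0
    for x in samples:
        y = alpha * (y + x - prev_x)
        prev_x = x
        out.append(y)
    return out


def tone(freq_hz, sample_rate_hz, n_samples):
    """Unit-amplitude sine wave used as a test signal."""
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [math.sin(step * n) for n in range(n_samples)]


def mean_power(samples):
    """Average of the squared samples."""
    return sum(s * s for s in samples) / len(samples)
```

As a usage example, filtering `tone(50, 8000, 8000)` with `high_pass(..., 300, 8000)` removes most of its power, while a 1000 Hz tone passes largely intact; the voice section detector then sees less low-frequency noise such as hum.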

10. The loudspeaker apparatus according to claim 9, wherein the low-pass filter of the voice switch has a variable cutoff frequency, and the voice switch includes means for capturing an operation input made by a user and for changing the cutoff frequency of the low-pass filter in accordance with the operation input.

11. The loudspeaker apparatus according to claim 9, further comprising means for detecting the pitch of the voice contained in the reference signal and for changing the cutoff frequency of the low-pass filter in accordance with the detected pitch.

12. The loudspeaker apparatus according to claim 9, wherein a high-pass filter that removes frequency band components lower than the main component band of speech is provided in series with the low-pass filter.

13. The loudspeaker apparatus according to claim 12, wherein the filters that remove, from the transmission signal and the reception signal, frequency band components higher or lower than the main component band of speech are digital filters.