JP4135702B2 - Intercom system

Info

Publication number: JP4135702B2
Application number: JP2004341085A
Authority: JP
Inventors: 貴二本杉; 義則室井
Original assignee: Matsushita Electric Works Ltd
Current assignee: Panasonic Electric Works Co Ltd
Priority date: 2004-11-25
Filing date: 2004-11-25
Publication date: 2008-08-20
Anticipated expiration: 2024-11-25
Also published as: JP2006154964A

Description

The present invention relates to an intercom system comprising a door phone slave unit and an intercom master unit, and more particularly to an intercom system that displays, on the intercom master unit, images captured by a camera in the door phone slave unit.

Intercom systems that display images captured by the camera of a door phone slave unit on an intercom master unit have been available for some time. In this type of system, the intercom master unit has a so-called backlight correction function: when the subject (a visitor) captured by the camera is backlit, the brightness of the entire captured image, that is, the level of the video signal, is increased to prevent the subject from being rendered as a black silhouette. For example, as shown in FIG. 13, in an image captured by the camera of a door phone slave unit, the relatively bright sky occupies the upper part of the frame and the relatively dark ground occupies the lower part, and the visitor's face is considered to overlap the sky in many cases. Conventionally, therefore, backlight correction was performed by detecting the luminance at a lower position in the captured image corresponding to the ground and raising the level of the video signal until the detected luminance reached or exceeded an optimum value (see FIG. 13(b)).

Separately, an intercom system has been proposed that compares the sampled values of one screen of the video signal against predetermined white and black values, determines whether the screen is backlit by comparing the number of samples exceeding the white value (a count) with the number of samples falling below the black value (a count), and raises the level of the video signal when the screen is judged to be backlit (see Patent Document 1).
Patent Document 1: Japanese Unexamined Utility Model Application Publication No. H07-24757

However, when backlight correction is performed so that the luminance at the bottom of the screen reaches the optimum value, or when, as in the conventional example of Patent Document 1, the screen is judged to be backlit from the relative sizes of its white and black areas, the position of the visitor's face, which should be given priority when adjusting brightness, is not taken into account. As a result, the visitor's face in the corrected image may be too bright or too dark, and the face may not be captured with appropriate brightness.

The present invention has been made in view of the above circumstances, and its object is to provide an intercom system that can always capture and display a visitor's face with appropriate brightness.

To achieve the above object, the invention of claim 1 is an intercom system comprising a door phone slave unit installed at, for example, the entrance outside a dwelling unit, and an intercom master unit installed inside the dwelling unit for conducting calls with the door phone slave unit. The door phone slave unit comprises imaging means for capturing an image of a visitor and call means for conducting a call with the intercom master unit. The intercom master unit comprises call means for conducting a call with the door phone slave unit and display means for displaying the image captured by the imaging means of the door phone slave unit. One of the door phone slave unit and the intercom master unit comprises image processing means for detecting the position of the visitor's face in the image captured by the imaging means by performing image processing on that image, and adjustment means for adjusting the brightness of the captured image so that the face position detected by the image processing means has appropriate brightness. The image processing means comprises: face detection template image storage means in which a face detection template image is stored in advance; density gradient direction extraction means for extracting density gradient direction images from both the image captured by the imaging means of the door phone slave unit and the face detection template image held in the face detection template image storage means; density gradient direction image storage means for storing each density gradient direction image so extracted; shape feature extraction means for extracting, from the density gradient direction image of the face detection template image held in the density gradient direction image storage means, the distance between a reference point and each coordinate point used as a reference, and the angle at which the line connecting the reference point and the coordinate point intersects the horizontal axis passing through the coordinate point; shape feature storage means for storing the extracted distance and angle information classified by the density gradient direction value of the face detection template image; voting means for voting on reference point candidate points in the density gradient direction image of the captured image, based on the density gradient direction value at each coordinate point of that image and on the shape features held in the shape feature storage means; and face detection means for detecting the position of the face based on the voting result obtained by the voting means.

According to this invention, the image processing means provided in either the door phone slave unit or the intercom master unit performs image processing on the image captured by the imaging means of the door phone slave unit and thereby detects the position of the visitor's face in that image, and the adjustment means adjusts the brightness of the image so that the detected face position has appropriate brightness. The visitor's face can therefore always be captured and displayed with appropriate brightness wherever it appears on the screen. Moreover, even if a body happens to match the face detection template image in size, the density gradient direction values differ, so votes are unlikely to concentrate at the position of the body and the body is not erroneously detected. Using density gradient direction images also yields image processing means that is robust to variations in illumination.

The invention of claim 2 is characterized in that, in the invention of claim 1, the image processing means and the adjustment means are provided in the door phone slave unit.

The invention of claim 3 is characterized in that, in the invention of claim 1, the face detection means comprises face size detection means that varies the size of the face detection template image and extracts the size of the face based on the change in vote value corresponding to that change in size.
According to this invention, a face can be detected even when the size of the face is not constant.

The invention of claim 4 is characterized in that, in the invention of claim 1, the face detection means comprises face size detection means that varies the distance information in the shape features and detects the size of the face based on the change in vote value corresponding to that change in distance information.

According to this invention, a face can be detected even when the size of the face is not constant.

The invention of claim 5 is characterized in that, in the invention of any one of claims 1 to 4, the face detection means comprises face rotation angle extraction means that varies the rotation angle of the face detection template image and extracts the rotation angle of the face based on the change in vote value corresponding to that change in rotation angle.

According to this invention, a face can be detected even when the rotation angle of the face is not constant.

The invention of claim 6 is characterized in that, in the invention of any one of claims 1 to 4, the face detection means comprises face rotation angle extraction means that varies the angle information in the shape features and extracts the rotation angle of the face based on the change in vote value corresponding to that change in angle information.

According to the invention of claim 1, the image processing means provided in either the door phone slave unit or the intercom master unit performs image processing on the image captured by the imaging means of the door phone slave unit and thereby detects the position of the visitor's face in that image, and the adjustment means adjusts the brightness of the image so that the detected face position has appropriate brightness. The visitor's face can therefore always be captured and displayed with appropriate brightness wherever it appears on the screen. In addition, even if a body happens to match the face detection template image in size, the density gradient direction values differ, so votes are unlikely to concentrate at the position of the body and the body is not erroneously detected, and the use of density gradient direction images yields image processing means that is robust to variations in illumination.

(Embodiment 1)
FIGS. 1 and 2 show a system configuration diagram and a block diagram, respectively, of the intercom system of this embodiment. The door phone slave unit 20 is connected to the intercom master unit (hereinafter abbreviated as "master unit") 1 by two pairs of signal lines L1 consisting of a two-wire call line and a two-wire control line.

The door phone slave unit 20 comprises: a power receiving unit 21 that generates internal operating power from the voltage applied to the call line by the master unit 1 and performs two-wire/four-wire conversion between the internal call path and the call line; an audio input/output unit 22 having a microphone and a speaker; an audio processing unit 23 that amplifies the audio signal input from the microphone and the audio signal sent from the master unit 1 over the call line; a camera unit 24 that has a solid-state image sensor such as a CCD and an optical system such as a lens, and captures an image of the visitor; a video processing unit 25 that is built mainly around a microcomputer, performs image processing (described later) on the video signal output from the camera unit 24, and frequency-modulates the processed video signal; an operation unit 26 that has a switch turned on when the visitor presses the call button 28 (see FIG. 1) and sends a call signal when the switch turns on, for example by changing the line voltage of the call line; and a display unit 27 that lights a display element (a light-emitting diode) when supplied with power over the control line, for example during a call with the master unit 1. The frequency-modulated video signal (hereinafter, "FM video signal") is multiplexed in the power receiving unit 21 with the audio signal (a baseband signal) output from the audio processing unit 23 and sent to the master unit 1 over the call line.

The master unit 1 comprises: a power supply unit 2 that applies a DC voltage to the call line to power the door phone slave unit 20 and separates the multiplexed audio signal and FM video signal it receives; a master unit control unit 3 built mainly around a microcomputer; an audio input/output unit 4 having a microphone and a speaker; an audio processing unit 5 that functions as a voice switch, comparing the audio signal input from the audio input/output unit 4 with the audio signal from the door phone slave unit 20 separated by the power supply unit 2 and attenuating whichever has the lower signal level, thereby switching the call direction (master unit 1 to door phone slave unit 20, or door phone slave unit 20 to master unit 1); a display unit 6 consisting of a liquid crystal display (LCD); a video processing unit 7 that demodulates the FM video signal sent from the door phone slave unit 20 and outputs the video signal to the display unit 6 for display; a master unit operation unit 8 that has a call switch turned on by operating the call button 11 (see FIG. 1) and a monitor switch turned on by operating the monitor button 12 (see FIG. 1), and outputs a corresponding operation signal to the master unit control unit 3 when each switch turns on; and a power source unit 9 that generates operating power for each unit from commercial power.
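The voice-switch behavior of the audio processing unit 5 can be illustrated with a minimal sketch. The function name, the attenuation factor of 0.1, and the tie-breaking rule (the master side wins when levels are equal) are assumptions made for illustration; the text only states that the lower-level signal is attenuated.

```python
def voice_switch(master_level, door_level, attenuation=0.1):
    """Return (gain for master-unit audio, gain for door-phone audio).

    The signal with the lower level is attenuated, which selects the
    call direction. The attenuation factor is a hypothetical value.
    """
    if master_level >= door_level:
        # Master unit is speaking: pass its audio, attenuate the door phone.
        return 1.0, attenuation
    # Door phone is speaking: pass its audio, attenuate the master unit.
    return attenuation, 1.0
```

A real voice switch would typically smooth the level estimates over time to avoid rapid toggling; that is outside what the text describes.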

Next, the basic operation of the master unit 1 and the door phone slave unit 20 is described.

First, when a visitor presses the call button 28 of the door phone slave unit 20, the operation unit 26 sends a call signal from the power receiving unit 21 to the master unit 1 over the call line. The video processing unit 25 and the camera unit 24 then start up, and the image of the visitor captured by the camera unit 24 is transmitted from the power receiving unit 21 to the master unit 1 over the call line. In the master unit 1, the master unit control unit 3 detects the call signal and controls the audio processing unit 5 to sound a ringing tone from the speaker of the audio input/output unit 4, and has the video processing unit 7 demodulate the FM video signal transmitted from the door phone slave unit 20 so that the image of the visitor is shown on the display unit 6. When a resident presses the call button 11 of the master unit 1, the master unit operation unit 8 outputs an operation signal, and the master unit control unit 3, on receiving it, forms a call path between the call line and the audio processing unit 5 and audio input/output unit 4, enabling a call with the door phone slave unit 20. When the monitor button 12 is pressed and the master unit operation unit 8 outputs an operation signal, the master unit control unit 3, on receiving it, supplies power from the power supply unit 2 to the door phone slave unit 20 over the call line. The camera unit 24 and the video processing unit 25 of the door phone slave unit 20 then start up and the image captured by the camera unit 24 is sent to the master unit 1, so that the image is shown on the display unit 6.

Next, the image processing and backlight correction processing in the video processing unit 25 of the door phone slave unit 20, which are the gist of the present invention, are described.

When the camera unit 24 and the video processing unit 25 are started by pressing the call button 28 as described above, the video processing unit 25 first performs image processing that detects the position of the visitor's face in the image captured by the camera unit 24 (hereinafter, "face position detection processing"). It then detects the brightness (luminance level) of the image at the detected face position and, if the detected value is at or below a predetermined threshold, performs backlight correction processing that increases the brightness of the entire image.

First, the face position detection processing is described. FIG. 3 is a block diagram of the part of the video processing unit 25 that performs face position detection processing. It consists of: image input means 251 that A/D-converts and takes in the captured image from the camera unit 24; face detection template image storage means 252 in which a face detection template is stored in advance; density gradient direction extraction means 253 that extracts the density gradient direction images A' and B' (see FIGS. 4(b) and 6(a)) of, respectively, the face detection template image A held in the face detection template image storage means 252 (shown in FIG. 4(a)) and the captured image B taken in by the image input means 251 (shown in FIG. 5); density gradient direction image storage means 254 that stores the extracted density gradient direction images A' and B'; shape feature extraction means 255 that extracts, from the density gradient direction image A' of the face detection template image A held in the density gradient direction image storage means 254, the distance L between a reference point O and each coordinate point used as a reference (hereinafter, "reference point R") and the angle α at which the line connecting the reference point O and each reference point R intersects the horizontal axis (X axis) passing through that reference point R; shape feature storage means 256 that stores the extracted information separately for each density gradient direction value θ of the face detection template image A; voting means 257 that votes on candidate points for the reference point in the density gradient direction image B', based on the density gradient direction value θ at each reference point R' of the density gradient direction image B' of the captured image B held in the density gradient direction image storage means 254 and on the shape feature information held in the shape feature storage means 256; and face detection means 258 that detects the position of the face based on the voting result obtained by the voting means 257.

The operation of each means is described below.
First, the density gradient direction extraction means 253 extracts the density gradient direction images B' and A' from, respectively, the captured image B shown in FIG. 5(a), which is captured by the camera unit 24 and taken in by the image input means 251, and the face detection template image A held in the face detection template image storage means 252. FIG. 4(b) shows the density gradient direction image A' of the face detection template image A, and FIG. 6(a) shows the density gradient direction image B' of the captured image B. To extract the density gradient direction images A' and B', a 3×3 x-direction Sobel filter 12x and a 3×3 y-direction Sobel filter 12y, as shown in FIG. 5, are applied to every pixel of the input grayscale image, and the derivative dx in the horizontal (X-axis) direction and the derivative dy in the vertical (Y-axis) direction are obtained for each image from Equations 1 and 2.

dx = (c + 2f + i) - (a + 2d + g) ... (Equation 1)
dy = (g + 2h + i) - (a + 2b + c) ... (Equation 2)
(Here a to i are the pixel values of the 3×3 neighborhood shown in FIG. 7: e is the pixel of interest, and a to d and f to i are its eight neighboring pixels.) The direction value θ of the derivative is then given by Equation 3 below.

θ = tan⁻¹(dy / dx) ... (Equation 3)
The density gradient direction images A' and B' are images whose pixel values are the density gradient direction value θ, and they are stored in the density gradient direction image storage means 254.
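As an illustration of Equations 1 to 3, the following is a minimal pure-Python sketch of building a density gradient direction image. The function name is hypothetical. Two choices here are assumptions not stated in the text: `math.atan2` is used in place of tan⁻¹(dy/dx) so that dx = 0 does not cause a division by zero and the full range of directions is distinguished, and border pixels and pixels with zero gradient are marked `None`.

```python
import math

def gradient_direction_image(img):
    """Return per-pixel gradient directions (radians) from 3x3 Sobel filters.

    img is a list of rows of grayscale values. Pixels without a defined
    direction (image border, zero gradient) are None (an assumption).
    """
    height, width = len(img), len(img[0])
    out = [[None] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # 3x3 neighborhood, named as in the text (e is the center pixel)
            a, b, c = img[y - 1][x - 1], img[y - 1][x], img[y - 1][x + 1]
            d, f = img[y][x - 1], img[y][x + 1]
            g, h, i = img[y + 1][x - 1], img[y + 1][x], img[y + 1][x + 1]
            dx = (c + 2 * f + i) - (a + 2 * d + g)  # Equation 1
            dy = (g + 2 * h + i) - (a + 2 * b + c)  # Equation 2
            if dx == 0 and dy == 0:
                continue  # flat region: no gradient direction
            out[y][x] = math.atan2(dy, dx)  # Equation 3, quadrant-aware
    return out
```

Because the direction depends on the ratio of dy to dx rather than on their magnitudes, scaling all pixel values by a constant leaves the result unchanged, which is consistent with the robustness to illumination changes claimed for this representation.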

The shape feature extraction means 255 first calculates the center of gravity of the density gradient direction image A' of the face detection template image A held in the density gradient direction image storage means 254 and takes it as the reference point O. As shown in FIG. 4(c), it then calculates the distance L between the reference point O and each reference point R and the angle α between the line connecting O and R and the horizontal axis β passing through R, and stores them in the shape feature storage means 256 as feature information of the face detection template image A, organized by the angle between the density gradient direction γ and the axis β, that is, by the density gradient direction value θ. The face detection template image A used in this embodiment is, as shown in FIG. 4(a), built to include the characteristic parts of an average human face (eyes, eyebrows, nose, mouth, and so on). The shape features of the face detection template image A may also be stored in the shape feature storage means 256 in advance.

Table 1 shows the shape features stored in the shape feature storage means 256, in a tabular format listing the angle α and the distance L for each density gradient direction value θ of the reference points.
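The shape feature table described above can be sketched as a dictionary keyed by gradient direction. This is a hedged illustration, not the patented implementation: the function names, the quantization of θ into 8 direction classes, and the use of the centroid of all pixels with a defined direction as the reference point O are assumptions. The input is a direction image in which pixels without a gradient are `None`.

```python
import math
from collections import defaultdict

def quantize(theta, bins=8):
    """Map an angle in radians to one of `bins` direction classes (assumed)."""
    return int((theta % (2 * math.pi)) / (2 * math.pi) * bins) % bins

def build_shape_features(direction_img, bins=8):
    """Return the reference point O and a table: direction class -> [(alpha, L)]."""
    points = [(x, y, t) for y, row in enumerate(direction_img)
              for x, t in enumerate(row) if t is not None]
    # Reference point O: centroid of the reference points R (assumption)
    ox = sum(p[0] for p in points) / len(points)
    oy = sum(p[1] for p in points) / len(points)
    table = defaultdict(list)
    for x, y, theta in points:
        dist = math.hypot(ox - x, oy - y)    # distance L from R to O
        alpha = math.atan2(oy - y, ox - x)   # angle of the line R -> O
        table[quantize(theta, bins)].append((alpha, dist))
    return (ox, oy), table
```

Grouping the (α, L) pairs by direction class means that, at lookup time, an edge point in the captured image only considers template points with a similar gradient direction.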

The voting means 257 uses the density gradient direction value θ of the density gradient direction image B' of the captured image B (FIG. 6(a)) held in the density gradient direction image storage means 254, together with the shape features of the face detection template image A held in the shape feature storage means 256, to obtain candidate points for the reference point from Equations 4 and 5 below, and casts a vote at each such point in the voting space. The face detection means 258 then detects the peak of the vote values in the voting result shown in FIG. 8(a), takes it as the reference point O' (o'x, o'y), and thereby determines the position of the face. FIG. 8(b) shows the relationship between the divisions shown in FIG. 8(a) and the vote values.

o'x = L × cos(α) + r'x ... (Equation 4)
o'y = L × sin(α) + r'y ... (Equation 5)
Here (r'x, r'y) are the coordinates of the reference point R' in the density gradient direction image B'.
Thus, according to the face position detection processing of this embodiment, even if a body happens to match the face detection template image in size, the density gradient direction values differ, so votes are unlikely to concentrate at the position of the body and the body is not erroneously detected. In addition, using density gradient direction images makes it possible to realize image processing means (the video processing unit 25) that is robust to variations in illumination.
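The voting and peak detection steps can be sketched as follows. The function names, the 8-class direction quantization, and the rounding of candidate points to integer cells are assumptions for illustration. The horizontal offset uses L × cos(α) and the vertical offset uses L × sin(α), as the geometry of a distance L at angle α requires. The `table` argument maps a direction class to the template's (α, L) pairs.

```python
import math
from collections import Counter

def vote(points, table, bins=8):
    """Accumulate votes for the reference point.

    points: (x, y, theta) reference points R' of the captured image.
    table:  direction class -> list of (alpha, L) from the template.
    """
    votes = Counter()
    for x, y, theta in points:
        k = int((theta % (2 * math.pi)) / (2 * math.pi) * bins) % bins
        for alpha, dist in table.get(k, []):
            ox = round(x + dist * math.cos(alpha))  # Equation 4
            oy = round(y + dist * math.sin(alpha))  # Equation 5
            votes[(ox, oy)] += 1
    return votes

def detect_face(votes):
    """Return the cell with the most votes, or None if nothing was voted."""
    return max(votes, key=votes.get) if votes else None
```

For example, with a table holding one entry per direction, two edge points on opposite sides of a face both vote for the same center, while a point whose direction class is absent from the table casts no vote.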

Incidentally, in the face position detection processing described above, a template image built to include the average contour of a face, as shown in FIG. 9(a), may be used in place of the face detection template image A held in the face detection template image storage means 252. Because the template image represents the average contour of a face, the position of the face can be detected correctly even in a captured image B such as that in FIG. 9(c), where, as under backlighting, the eyes, eyebrows, nose, mouth, and other facial features are not captured properly. FIG. 9(b) shows the density gradient direction image A' extracted by the density gradient direction extraction means 253 from the face detection template image A shown in FIG. 9(a).

After detecting the position of the face as described above, the video processing unit 25 detects the brightness (luminance level) of a range S containing the face position, as shown in FIG. 10(a), and compares the detected value with a predetermined threshold. If the detected value is at or below the threshold, it performs backlight correction processing that increases the brightness (luminance level) of the entire image until the detected value exceeds the threshold. As a result, in the backlight-corrected image shown in FIG. 10(b), the face of the visitor H can be displayed with appropriate brightness.
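The backlight correction loop can be sketched as below. This is an illustration under stated assumptions: the text does not say how the brightness of range S is measured or how the level is raised, so the sketch uses the mean luminance of a rectangular region, a multiplicative gain increased in 10% steps, clipping at 255, and an iteration cap. All names and default values are hypothetical.

```python
def backlight_correct(img, region, threshold=100, step=1.1,
                      max_level=255, max_iter=50):
    """Raise the gain of the whole image until the face region is bright enough.

    img:    list of rows of luminance values.
    region: (x0, y0, x1, y1) rectangle S around the detected face,
            with x1 and y1 exclusive.
    Returns (corrected image, applied gain).
    """
    x0, y0, x1, y1 = region

    def face_level(gain):
        # Mean luminance of range S after applying the gain, with clipping
        vals = [min(max_level, img[y][x] * gain)
                for y in range(y0, y1) for x in range(x0, x1)]
        return sum(vals) / len(vals)

    gain = 1.0
    for _ in range(max_iter):
        if face_level(gain) > threshold:
            break  # face region exceeds the threshold: stop raising the level
        gain *= step
    corrected = [[min(max_level, round(v * gain)) for v in row] for row in img]
    return corrected, gain
```

Note that the gain is applied to the entire image, as in the text, so bright background areas may saturate while the face region is brought above the threshold.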

Thus, according to this embodiment, the video processing unit 25, which serves as the image processing means of the door phone slave unit 20, performs image processing on the image captured by the camera unit 24 of the door phone slave unit 20 and thereby detects the position of the visitor's face in that image, and the video processing unit 25, also serving as the adjustment means, adjusts the brightness of the image so that the detected face position has appropriate brightness. The visitor's face can therefore always be captured and displayed with appropriate brightness wherever it appears on the screen.

(Embodiment 2)
This embodiment is characterized by the face position detection process in the video processing unit 25 of the door phone slave unit 20. The other configurations and operations are the same as in Embodiment 1, so their illustration and description are omitted.

The basic configuration of the video processing unit 25 is the same as that of Embodiment 1 shown in FIG. 3, so reference is made to FIG. 3. The video processing unit 25 of this embodiment is characterized in that a face size detection means 258a is provided in the face detection means 258, as shown in FIG. 11(a).
The face size detection means 258a has the voting means 257 perform voting while changing the size of the face detection template image. As shown in FIG. 11(b), a peak of the vote value is obtained from the voting result for each template image size, and the size at which that peak is largest is detected as the face size.
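The selection step described above, picking the template size whose voting result has the largest peak, can be sketched as follows. The function name and the data layout (a dictionary from candidate size to a two-dimensional vote accumulator) are assumptions made for illustration and are not taken from the source.

```python
def detect_face_size(vote_spaces):
    """Return (size, peak position, peak value) for the largest vote peak.

    vote_spaces: hypothetical dict mapping each candidate template size to
    its voting result, given as a list of rows of vote counts.
    """
    best_size, best_pos, best_peak = None, None, -1
    for size, accumulator in vote_spaces.items():
        for y, row in enumerate(accumulator):
            for x, votes in enumerate(row):
                if votes > best_peak:
                    best_size, best_pos, best_peak = size, (x, y), votes
    return best_size, best_pos, best_peak
```

The peak position returned alongside the size corresponds to the reference point of the face, so one search yields both the face size and the face position.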

As described above, this embodiment has the advantage that a face can be detected even when its size is not constant.
(Embodiment 3)
This embodiment is characterized by the face position detection process in the video processing unit 25 of the door phone slave unit 20. The other configurations and operations are the same as in Embodiment 1, so their illustration and description are omitted.
In Embodiment 2 the face size is detected by changing the size of the face detection template image, whereas this embodiment detects the face size by changing the distance information in the shape feature described above. The basic configuration of the video processing unit 25 is the same as that of Embodiment 1 shown in FIG. 3, so reference is made to FIG. 3. As in Embodiment 2, the face size detection means 258a (see FIG. 11(a)) is provided in the face detection means 258.

The face size detection means 258a of this embodiment (FIG. 11(a)) changes the distance information by varying a magnification s applied to the distance L in the shape feature. Using the density gradient direction value θ of the density gradient direction image B' of the captured image B shown in FIG. 6(a), stored in the density gradient direction image storage means 254, together with the shape feature of the face detection template image A stored in the shape feature storage means 256, the voting means 257 obtains a reference point candidate point from (Equation 6) and (Equation 7) below and votes for that point in the voting space. As shown in FIG. 11(b), a peak of the vote value is obtained from the voting result for each face size, and the size at which that peak is largest is detected as the face size.

ox' = s × L × cos(α) + r'x … (Equation 6)
oy' = s × L × sin(α) + r'y … (Equation 7)
s: magnification of the distance
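The voting with a scaled distance can be sketched as follows. The sketch assumes the shape feature is held as a table keyed by density gradient direction, with each entry a list of (L, α) pairs, and that the vertical offset is the sine component of the same distance and angle. The function name, the table layout and the rounding to integer coordinates are illustrative assumptions.

```python
import math
from collections import Counter


def vote_scaled(edge_points, shape_feature, s):
    """Vote for reference point candidates with the distance L scaled by s.

    edge_points: iterable of (r'x, r'y, theta) from the captured image.
    shape_feature: hypothetical dict, theta -> list of (L, alpha in radians).
    """
    votes = Counter()
    for rx, ry, theta in edge_points:
        for dist, alpha in shape_feature.get(theta, []):
            ox = round(s * dist * math.cos(alpha) + rx)  # horizontal candidate
            oy = round(s * dist * math.sin(alpha) + ry)  # vertical candidate
            votes[(ox, oy)] += 1
    return votes
```

Running this for several values of s and keeping the value whose accumulator has the largest peak gives the face size, without resizing the template image itself.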
(Embodiment 4)
This embodiment is characterized by the face position detection process in the video processing unit 25 of the door phone slave unit 20. The other configurations and operations are the same as in Embodiment 1, so their illustration and description are omitted. The basic configuration of the video processing unit 25 is the same as that of Embodiment 1 shown in FIG. 3, so reference is made to FIG. 3.

As shown in FIG. 12(a), the video processing unit 25 of this embodiment is characterized in that the face detection means 258 includes a face size detection means 258a (either of the face size detection means 258a of Embodiments 2 and 3 may be used) together with a face rotation angle extraction means 258b.

The face rotation angle extraction means 258b handles a captured image B in which the face is tilted, as shown in FIG. 12(b). It has the voting means 257 perform voting while rotating the face detection template image A. As shown in FIG. 12(c), a peak of the vote value is obtained from the voting result for each rotation angle, and the rotation angle at which that peak is largest is extracted as the rotation angle of the face.

This embodiment has the advantage that a face can be detected even when its rotation angle is not constant.
(Embodiment 5)
The face rotation angle extraction means 258b of Embodiment 4 rotates the face detection template image A, whereas the face rotation angle extraction means 258b of this embodiment changes the angle information in the shape feature described above. The basic configuration of the video processing unit 25 of this embodiment is the same as that of Embodiment 1 shown in FIG. 3, so reference is made to FIG. 3. As in Embodiment 4, the face detection means 258 includes the face rotation angle extraction means 258b together with, for example, the face size detection means 258a used in Embodiment 3 (see FIG. 12(a)).
The face rotation angle extraction means 258b of this embodiment changes the angle information in the shape feature described above. Using the density gradient direction value θ of the density gradient direction image stored in the density gradient direction image storage means 254 (where θ = θ' − φ, with φ the rotation angle and θ' the density gradient direction of the captured image B), together with the shape feature of the face detection template image A stored in the shape feature storage means 256, the voting means 257 obtains a reference point candidate point from (Equation 8) and (Equation 9) below and votes for that point in the voting space. As shown in FIG. 12(c), a peak of the vote value is obtained from the voting result for each rotation angle, and the rotation angle at which that peak is largest is detected as the rotation angle of the face.

ox' = L × cos(α + φ) + r'x … (Equation 8)
oy' = L × sin(α + φ) + r'y … (Equation 9)
φ: rotation angle (the angle information is changed by varying this value)
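The rotation search can be sketched in the same spirit. The sketch assumes gradient directions are quantized to whole degrees so that θ = θ' − φ can be used directly as a table key, and that the vertical offset is the sine component. The function names, the table layout and the degree quantization are illustrative assumptions, not details given in the source.

```python
import math
from collections import Counter


def vote_rotated(edge_points, shape_feature, phi_deg):
    """Vote for reference point candidates with the angle shifted by phi.

    edge_points: iterable of (r'x, r'y, theta') with theta' in whole degrees.
    shape_feature: hypothetical dict, theta (degrees) -> list of (L, alpha rad).
    """
    votes = Counter()
    phi = math.radians(phi_deg)
    for rx, ry, theta_img in edge_points:
        theta = (theta_img - phi_deg) % 360  # theta = theta' - phi
        for dist, alpha in shape_feature.get(theta, []):
            ox = round(dist * math.cos(alpha + phi) + rx)
            oy = round(dist * math.sin(alpha + phi) + ry)
            votes[(ox, oy)] += 1
    return votes


def detect_rotation(edge_points, shape_feature, candidate_angles):
    """Return the candidate angle whose voting result has the largest peak."""
    def peak(phi_deg):
        votes = vote_rotated(edge_points, shape_feature, phi_deg)
        return max(votes.values(), default=0)
    return max(candidate_angles, key=peak)
```

Because only the lookup key and the angle in the offset change, the template image and its shape feature are computed once and reused for every candidate angle.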
In Embodiments 1 to 5 above, the image processing means that performs the face position detection process and the adjustment means that performs the backlight correction process are both implemented by the video processing unit 25 of the door phone slave unit 20. They may instead be implemented by the control unit 3 of the master unit 1.

FIG. 1 is a system configuration diagram of Embodiment 1.
FIG. 2 is a block diagram of the same.
FIG. 3 is a circuit configuration diagram of the video processing unit of the door phone slave unit in the same.
FIG. 4 illustrates the operation of the video processing unit in the same: (a) is a representation of the face detection template image, (b) is a representation of the density gradient direction image of the face detection template image, and (c) is an explanatory diagram of the shape feature of the face detection template image.
FIG. 5 is a representation of a captured image used to explain the operation of the video processing unit in Embodiments 1 to 5.
FIG. 6 illustrates the operation of the video processing unit in the same: (a) is a representation of the density gradient direction image of the captured image, and (b) is an explanatory diagram of the shape feature of the captured image.
FIG. 7 is an explanatory diagram of the Sobel filter used in the same.
FIG. 8 is an explanatory diagram of the voting result for reference point candidate points in the same.
FIG. 9 illustrates the operation of the video processing unit in the same: (a) is a representation of the face detection template image, (b) is a representation of the density gradient direction image of the face detection template image, and (c) is a representation of a captured image of the face to be detected.
FIG. 10 illustrates the operation of the video processing unit in the same: (a) is a representation of the captured image before backlight correction, and (b) is a representation of the captured image after backlight correction.
FIG. 11: (a) is a configuration diagram of the face detection means used in Embodiments 2 and 3, and (b) is an explanatory diagram of the relationship between face size and vote value in the face size detection of the same face detection means.
FIG. 12: (a) is a configuration diagram of the face detection means used in Embodiments 4 and 5, (b) is a representation of a captured image of the face to be detected in the same, and (c) is an explanatory diagram of the relationship between face rotation angle and vote value in the face rotation detection of the same.
FIG. 13 is an explanatory diagram of backlight correction in a conventional example.

Explanation of symbols

1 Interphone master unit
20 Door phone slave unit
24 Camera unit
25 Video processing unit

Claims

1. An intercom system comprising a door phone slave unit installed at an entrance outside a dwelling unit and an interphone master unit installed inside the dwelling unit to communicate with the door phone slave unit, wherein the door phone slave unit includes imaging means and call means for making a call with the interphone master unit; the interphone master unit includes call means for making a call with the door phone slave unit and display means for displaying an image captured by the imaging means of the door phone slave unit; and either the door phone slave unit or the interphone master unit includes image processing means for detecting the position of a visitor's face in the image by performing image processing on the image captured by the imaging means, and adjustment means for adjusting the brightness of the image captured by the imaging means so that the face position detected by the image processing means has a proper brightness, the image processing means comprising: face detection template image storage means for storing a face detection template image in advance; density gradient direction extraction means for extracting density gradient direction images of the captured image taken by the imaging means of the door phone slave unit and of the face detection template image stored in the face detection template image storage means, respectively; density gradient direction image storage means for storing each density gradient direction image extracted by the density gradient direction extraction means; shape feature extraction means for extracting, as a shape feature, information on the distance between a reference point on the density gradient direction image of the face detection template image stored in the density gradient direction image storage means and each coordinate point to be referred to, and on the angle at which the line connecting the reference point and the coordinate point intersects the horizontal axis passing through the coordinate point; shape feature storage means for classifying and storing the distance and angle information extracted by the shape feature extraction means for each value of the density gradient direction of the face detection template image; voting means for performing voting processing on reference point candidate points in the density gradient direction image of the captured image, based on the value of the density gradient direction at each coordinate point to be referred to in the density gradient direction image of the captured image stored in the density gradient direction image storage means and on the shape feature stored in the shape feature storage means; and face detection means for detecting the face position based on the voting result obtained by the voting means.

2. The intercom system according to claim 1, wherein the image processing means and the adjusting means are provided in a door phone slave unit.

3. The intercom system according to claim 1, wherein the face detection means includes face size detection means that changes the size of the face detection template image and extracts the face size based on the change in vote value corresponding to the change in size.

4. The intercom system according to claim 1, wherein the face detection means includes face size detection means that changes the distance information in the shape feature and extracts the face size based on the change in vote value corresponding to the change in the distance information.

5. The intercom system according to any one of claims 1 to 4, wherein the face detection means includes face rotation angle extraction means that changes the rotation angle of the face detection template image and extracts the face rotation angle based on the change in vote value corresponding to the change in the rotation angle.

6. The intercom system according to any one of claims 1 to 4, wherein the face detection means includes face rotation angle extraction means that changes the angle information in the shape feature and extracts the face rotation angle based on the change in vote value corresponding to the change in the angle information.