JP4835898B2

JP4835898B2 - Video display method and video display device

Info

Publication number: JP4835898B2
Application number: JP2004307776A
Authority: JP
Inventors: 泰成畠澤; 昌美緒形
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-10-22
Filing date: 2004-10-22
Publication date: 2011-12-14
Anticipated expiration: 2024-10-22
Also published as: JP2006119408A

Description

The present invention relates to a video display device that displays large-screen video viewed by a large gathered audience, and more particularly to a video display method and a video display device that detect the region of the displayed video on which many viewers focus their attention and display the detected region enlarged.

The large-screen video of a video display device installed outdoors or in a similar location is viewed by the many people gathered there. However, which video each individual viewer is interested in is usually not reflected in how the video is displayed. To provide video that reflects viewers' interests, information must be collected from many viewers and then aggregated.

Conventionally, therefore, a video clipping control device has been disclosed in which a plurality of spectators wear gaze input devices, and the region on which the spectators focus is cut out according to the result of statistically processing the spectators' gaze information detected by those gaze input devices (see, for example, Patent Document 1).
Patent Document 1: JP 2003-125286 A

However, because the video clipping control device of Patent Document 1 requires a plurality of spectators to wear gaze input devices, it is suitable for acquiring the gaze information of a specific audience, as in sports spectating, but unsuitable for a video display device installed outdoors and viewed by an unspecified large number of people. The audience of a street video display is varied, including people who stop to watch and people who watch only while walking past, and it is not realistic to have each of them wear a gaze input device.

The present invention has been made in view of the above conventional problem, and its object is to provide a video display method and a video display device that, without using gaze input devices worn by viewers, acquire the gaze information of an unspecified large number of viewers of the video displayed on a video display device, detect the region of the video that attracts the viewers' interest, and display the detected region enlarged.

To achieve the above object, the video display method of the present invention comprises: a video display step of providing, on a display screen, a main region and one or more sub-regions smaller than the main region, and displaying a respective video in each of the main region and the one or more sub-regions; a viewer imaging step of capturing, with a plurality of cameras, viewers watching the video displayed on the display screen in the video display step; a face image recognition step of recognizing face images of the viewers based on the images captured in the viewer imaging step; a display desire degree calculation step of detecting, based on the face images recognized in the face image recognition step, the video at which each face image is gazing, and calculating, based on the detected video, a display desire degree indicating how strongly display of that video is desired; and a video switching step of comparing the display desire degrees calculated in the display desire degree calculation step and switching the videos displayed in the main region and the one or more sub-regions so that the video with the highest display desire degree is displayed in the main region.
The video display device of the present invention comprises: a video display unit that provides, on a display screen, a main region and one or more sub-regions smaller than the main region, and displays a respective video in each of the main region and the one or more sub-regions; a plurality of cameras that capture viewers watching the video displayed on the display screen; face image recognition means for recognizing face images of the viewers based on the images captured by the plurality of cameras; display desire degree calculation means for detecting, based on the face images recognized by the face image recognition means, the video at which each face image is gazing, and calculating, based on the detected video, a display desire degree indicating how strongly display of that video is desired; and video switching means for comparing the display desire degrees calculated by the display desire degree calculation means and switching the videos displayed in the main region and the one or more sub-regions so that the video with the highest display desire degree is displayed in the main region.

According to the video display method and the video display device of the present invention, the face images of viewers watching the video displayed on the display screen are recognized based on images captured by a plurality of cameras, the video at which each face image is gazing is detected based on the recognized face images, the display desire degree of each video is calculated, and the videos displayed in the main region and the one or more sub-regions are switched so that the video with the highest display desire degree is displayed in the main region.
Therefore, the video that attracts the interest of many viewers can be detected and displayed enlarged without using gaze input devices as in the conventional art.

The object of providing a video display control method and device that display, interchangeably on the display unit, the video in which a plurality of viewers are interested, without having the unspecified large number of viewers wear special gaze input devices, is achieved as follows. The video on which the interest of the viewers watching the video display means is concentrated is determined from the viewers' gaze information and the position information of their face images. If that video is one displayed in the sub display area, it is swapped with the video displayed in the main display area, so that the video in which the viewers are interested is displayed in the main display area.

Embodiment 1 of the present invention will be described below with reference to FIGS. 1 to 8.
FIG. 1 is a functional block diagram showing the configuration of the video display device of Embodiment 1. FIG. 2 is an explanatory diagram showing an example of the video display layout on the video display unit in Embodiment 1. FIG. 3 is an explanatory diagram showing the relationship between the video display unit and the viewers in Embodiment 1. FIG. 4 is an explanatory diagram showing the positional relationship between the video display unit and the viewer information acquisition cameras in Embodiment 1. FIG. 5 is an explanatory diagram showing the relationship between the video display unit and the viewers when the face images of viewers facing the display screen of the video display unit are recognized in Embodiment 1. FIG. 6 is an explanatory diagram for determining the positional relationship between the display screen of the video display unit and a viewer's face region in Embodiment 1. FIG. 7 is an explanatory diagram showing the position of a viewer's line of sight on the display screen of the video display unit in Embodiment 1. FIG. 8 is a flowchart showing the operation of the video display device of Embodiment 1.

As shown in FIG. 1, the video display device 100 includes a video display unit 101, two viewer information acquisition cameras 102, face image recognition means 103, face coordinate detection means 104, viewer gaze coordinate detection means 105, attention video information detection means 106, satisfaction detection means 107, display desire degree detection means 108, and display video creation means 109.

The video display unit 101 is installed in a location with heavy foot traffic, such as an outdoor plaza, and has a video display screen 101A with a horizontal dimension X and a vertical dimension Y. As shown in FIGS. 2 to 4, the video display screen 101A is divided into a main display area 101B consisting of one relatively large video display region, and a sub display area 101C in which a plurality of videos P1, P2, ... are displayed side by side in video display regions smaller than the main display area 101B.

The two viewer information acquisition cameras 102 capture the unspecified large number of viewers watching the video display screen of the video display unit 101. As shown in FIG. 4, they are placed at the left and right ends of the upper edge of the display screen of the video display unit 101, facing the viewers. The spacing between the left and right viewer information acquisition cameras 102 is set to a distance L corresponding to the X-axis dimension of the display screen, so that the position of each viewer's line of sight on the display screen can be determined accurately.

The face image recognition means 103 uses an SVM (Support Vector Machine) or a similar technique on the viewer images acquired by each viewer information acquisition camera 102 to recognize, as shown in FIG. 3, the face images of the viewers 200 facing the video display screen 101A, and determines from the recognized face images the number of viewers 200 facing the video display screen 101A.
The face coordinate detection means 104 detects, for each recognized viewer 200, the position coordinates of the face image on the video display screen 101A, based on the face images recognized by the face image recognition means 103 for each viewer information acquisition camera 102.

The viewer gaze coordinate detection means 105 detects the face direction and the eyeball direction of each viewer 200 based on the two face images recognized by the face image recognition means 103, one for each viewer information acquisition camera 102, and thereby detects the gaze direction of each recognized viewer 200. Based on the detected gaze direction and the position coordinates of the face image detected by the face coordinate detection means 104, it then detects, for each recognized viewer, the position coordinates of the point on the video display screen 101A at which the viewer 200 is gazing.

The attention video information detection means 106 determines, based on the gaze position coordinates detected by the viewer gaze coordinate detection means 105, which video on the video display screen 101A the viewer 200 is gazing at in that instant, and thereby detects attention video information. It acquires the detected attention video information over a fixed period and calculates, for each viewer, the attention video meta information P^i_meta(t) of the viewer 200.

The satisfaction detection means 107 calculates, based on the position coordinates of the face image detected by the face coordinate detection means 104, a face movement coefficient F^i_move that represents how still the viewer 200 remained while facing the video display screen 101A and watching the video, and calculates the satisfaction S^i(t) of the viewer 200 based on the calculated face movement coefficient F^i_move.

The display desire degree detection means 108 calculates, for each video in the main display area 101B and the sub display area 101C, the display desire degree D^i(t) of the viewer 200, based on the attention video meta information P^i_meta(t) calculated by the attention video information detection means 106 and on the face movement coefficient F^i_move and the satisfaction S^i(t) calculated by the satisfaction detection means 107. It then sums the calculated display desire degrees D^i(t) for each video to calculate the display desire degree AD(t) of all recognized viewers for each video.

The display video creation means 109 creates the video to be displayed in the main display area 101B and the sub display area 101C so that the video with the highest display desire degree calculated by the display desire degree detection means 108 is displayed in the main display area 101B, and outputs it to the video display unit 101. As shown in FIG. 1, receiving means 110 is connected to the display video creation means 109. The receiving means 110 receives the video information captured by one main display area camera 301 and a plurality of sub display area cameras 302 installed at a remote location. The display video creation means 109 creates the video to be output to the video display unit 101 based on the video information from the one main display area camera 301 and the plurality of sub display area cameras 302 received by the receiving means 110.

Next, the operation of the video display device of Embodiment 1 will be described.
The videos captured at the remote location by the main display area camera 301 and the sub display area cameras 302 are received through the communication line 111 by the receiving means 110 of the video display device 100 at the local site. The received videos are processed into display video by the display video creation means 109 and displayed by the video display unit 101. As a result, one large video P_MAX is displayed in the main display area 101B of the video display unit 101, and a plurality of videos P1, P2, ... are displayed side by side in the video display regions of the sub display area 101C, as shown in FIG. 2.

When a plurality of viewers 200 are watching the video display screen 101A of the video display unit 101 displaying video as described above, each viewer 200 is interested in a different video, as indicated by the arrows in FIG. 2, and the video in the main display area 101B does not always attract the greatest interest.
Therefore, the video on which the interest of the viewers 200 is concentrated is detected based on the gaze information and the face position information of the viewers 200. If that video is one displayed in the sub display area 101C, it is swapped with the video displayed in the main display area 101B, so that the video in which the viewers 200 are interested is displayed in the main display area 101B. This is described in detail below with reference to FIGS. 2 to 8.

First, the left and right viewer information acquisition cameras 102 capture the viewers 200 within their respective fields of view to obtain image information of the viewers 200 (step S11). The positional relationship between the viewer information acquisition cameras 102 and the video display unit 101 is known and is assumed to be the arrangement shown in FIG. 4. In FIG. 4, the coordinates (0, 0, 0) denote the upper-left corner of the video display screen 101A, and the coordinates (X, Y, 0) denote its lower-right corner.

The face image recognition means 103 uses the image signals obtained from the viewer information acquisition cameras 102 to recognize, by a technique such as SVM, the face images of the viewers 200 facing the video display screen 101A, and further detects the number of viewers whose face images were recognized (step S12). Let m be the number of detected viewers 200. In this face image recognition, it suffices to recognize faces that are roughly frontal, that is, persons facing the video display screen 101A, because a person not facing the video display screen 101A can be regarded as not interested in the video. FIG. 5 illustrates this. In FIG. 5, the viewers 200 inside the area enclosed by the curve are those whose faces are recognized, and the viewers 200 marked with a cross are those whose faces cannot be recognized. Among the viewers inside the enclosed area, the viewer denoted 200A is walking in the direction of arrow A or has directed their gaze toward video P1 in the sub display area 101C, and the viewer denoted 200B is one who has difficulty viewing.

The face coordinate detection means 104 detects, for each viewer 200 recognized by the face image recognition means 103, the position coordinates of the face relative to the video display screen 101A (step S13). For the face image of a target viewer 200 in the image from one viewer information acquisition camera 102, the corresponding face image is located in the image from the other viewer information acquisition camera. As shown in FIG. 6, the optical axes 102A of the two viewer information acquisition cameras 102 are parallel to each other, the line segment connecting the lenses of the viewer information acquisition cameras 102 (in the X-Y plane) is orthogonal to the optical axes 102A, the image sensor surfaces of the viewer information acquisition cameras 102 lie in a common plane orthogonal to the optical axes 102A, and the focal lengths f (= H) of the lenses are equal.
Let the distances by which the corresponding face image i (the face image of the i-th of the m viewers) is offset from the lens center in each of the two viewer information acquisition cameras be, respectively, …

The coordinates on the video display screen 101A of the face image of viewer i to be determined are then given by Equation 1.

[Equation 1]
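As a non-authoritative illustration of the geometry described for step S13, the following sketch applies the standard triangulation for two parallel pinhole cameras. It is not a reproduction of Equation 1. The function name, the argument names, and the placement of the left camera at the origin with the right camera at distance `baseline` along the X axis are assumptions made for this example.

```python
def triangulate_face(u_left, u_right, v_left, baseline, focal_length):
    """Estimate a face position from its offsets in two parallel cameras.

    u_left, u_right: horizontal offsets of the face from each lens center
    v_left:          vertical offset of the face in the left camera
    baseline:        camera spacing (the distance L in the description)
    focal_length:    common focal length f of both lenses
    All names and the sign convention are assumptions for this sketch.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_length * baseline / disparity  # distance from the screen plane
    x = u_left * z / focal_length            # horizontal position
    y = v_left * z / focal_length            # vertical position
    return (x, y, z)
```

Under these assumptions, dropping the depth component gives the projection of the face onto the screen plane, which corresponds to face position coordinates of the form (F^i_x, F^i_y, 0).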

The viewer gaze coordinate detection means 105 detects the direction of the line of sight with which each viewer 200 is gazing at the video display screen 101A, and detects, for each recognized viewer, the position coordinates of the point being viewed on the video display screen 101A (step S14).
In Embodiment 1, an image recognition method is used to detect the gaze direction of viewer i. Specifically, a three-dimensional face image model is created from the face images obtained from the two viewer information acquisition cameras 102, and the face direction is determined by matching the model against a three-dimensional face database. In addition, in the region of the face image corresponding to the eyeballs, the eyeball direction is determined from the positional relationship between the dark part (iris and pupil) and the white part of the eye. The gaze direction is obtained from the face direction and the eyeball direction. The relationship between viewer i and the video display screen 101A at this time is as shown in FIG. 7.
As is apparent from FIG. 7, the coordinates (V^i_x, V^i_y, 0) of the point on the video display screen 101A that viewer i is looking at are given by Equation 2.

[Equation 2]
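The relationship in FIG. 7 can be pictured as a ray from the viewer's face meeting the screen plane z = 0. The following is a minimal sketch under that assumption, not a reproduction of Equation 2. The function name, the tuple representation, and the convention that the viewer stands at z > 0 and looks toward decreasing z are hypothetical.

```python
def gaze_point_on_screen(face_pos, gaze_dir):
    """Intersect a gaze ray with the screen plane z = 0.

    face_pos: (x, y, z) of the face, with z > 0 in front of the screen
    gaze_dir: (dx, dy, dz) gaze direction; dz < 0 means toward the screen
    Returns (V_x, V_y, 0.0), or None if the gaze never reaches the plane.
    """
    fx, fy, fz = face_pos
    dx, dy, dz = gaze_dir
    if dz >= 0:
        return None  # looking parallel to or away from the screen
    t = -fz / dz     # ray parameter at which z becomes 0
    return (fx + t * dx, fy + t * dy, 0.0)
```

The returned point may lie outside the rectangle from (0, 0, 0) to (X, Y, 0), in which case the viewer is not looking at any displayed video.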

The attention video information detection means 106 detects, based on the gaze position coordinates obtained by the viewer gaze coordinate detection means 105, the video on the video display screen 101A at which the viewer 200 is gazing, acquires this attention video information over a fixed period, and calculates the attention video meta information of the viewer 200 for each viewer (step S15).
Specifically, let k be the number of videos displayed in the main display area 101B and the sub display area 101C of the video display screen 101A. From the coordinates (V^i_x, V^i_y, 0) that viewer i is looking at, it is determined which video the viewer is looking at in that instant. The attention video information of viewer i at the current time t, P^i(t) = (p^i_1(t), p^i_2(t), ..., p^i_k(t)), is thereby obtained.
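The determination of which video contains the gaze point can be sketched as a rectangle hit test. The source does not define the components p^i_j(t). This example assumes each is 1 when the gaze point falls inside the region of video j and 0 otherwise. The function name and the rectangle format are also hypothetical.

```python
def attention_vector(gaze_xy, regions):
    """Return a k-element 0/1 vector marking the video being looked at.

    gaze_xy: (V_x, V_y) gaze point on the screen
    regions: list of k rectangles (x0, y0, x1, y1), one per displayed video,
             with region 0 assumed to be the main display area
    A point outside every region yields an all-zero vector.
    """
    x, y = gaze_xy
    return [
        1 if (x0 <= x < x1 and y0 <= y < y1) else 0
        for (x0, y0, x1, y1) in regions
    ]
```

For example, with a main region on the left and two stacked sub-regions on the right, a gaze point inside the upper sub-region marks only the second component.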

Next, the attention video information of viewer i is acquired over a fixed period and summed. This is done because a person attending to something does not keep looking at that region continuously and the gaze sometimes strays, and also to reduce the effect of false region detections caused by noise and the like. Accordingly, when the gaze position information is summed from time t-T to the current time t, the attention video meta information P^i_meta(t) of viewer i is obtained from Equation 3.

[Equation 3]
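The summation over the interval from t-T to t can be sketched with a fixed-length window of recent samples. This is an illustrative reading of the text rather than a reproduction of Equation 3. The class name, the use of a sample count for the window length, and the plain unweighted sum are assumptions.

```python
from collections import deque


class AttentionAccumulator:
    """Keep the most recent attention vectors and sum them per video."""

    def __init__(self, window):
        # window: number of samples retained (assumed to span t-T .. t)
        self.samples = deque(maxlen=window)

    def add(self, attention):
        # Older samples fall out automatically once the window is full.
        self.samples.append(list(attention))

    def meta(self):
        # Component-wise sum over the retained samples.
        return [sum(column) for column in zip(*self.samples)]
```

Because brief glances elsewhere contribute only one sample each, the video watched for most of the window dominates the result.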

The satisfaction detection means 107 calculates the face movement coefficient and calculates the satisfaction of the viewer 200 based on the calculated face movement coefficient (step S16).
The face movement coefficient, which represents how still the viewer 200 remained while facing the video display screen 101A and watching the video, is determined by acquiring the face position coordinates (F^i_x, F^i_y, 0) from time t-T to the current time t. First, the difference between the current position and the previously acquired position is taken. Since the aim is to obtain the region of interest within the video from an unspecified large number of people, persons who are not interested in the displayed video should be excluded as far as possible. Whether viewer i is stationary is considered the most straightforward and effective cue for determining whether the viewer is standing still and watching the video or merely happens to glance at it while walking. The movement time coefficient f^i_t at the current time t is given by Equation 4 using a monotonically decreasing function G(x).

[Equation 4]

This equation gives the instantaneous face movement coefficient based on the instantaneous amount of face movement. By summing the instantaneous face movement coefficient from time t-T to the current time t, the face movement coefficient F^i_move, which represents how still viewer i has remained while watching the video up to the current time, is obtained from Equation 5.

[Equation 5]
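The two-stage calculation, an instantaneous coefficient from each position difference followed by a sum over the window, can be sketched as follows. The source states only that G(x) is monotonically decreasing. The exponential form, the `scale` parameter, and the function name are assumptions chosen for illustration.

```python
import math


def face_movement_coefficient(positions, scale=1.0):
    """Sum a decreasing function of frame-to-frame face displacement.

    positions: list of (F_x, F_y) face positions sampled from t-T to t
    scale:     assumed length scale controlling how fast G falls off
    A stationary viewer scores 1.0 per interval; a moving viewer scores less.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        displacement = math.hypot(x1 - x0, y1 - y0)
        total += math.exp(-displacement / scale)  # assumed choice of G(x)
    return total
```

With this choice, a viewer who stands still for the whole window reaches the maximum value, while a passer-by accumulates a much smaller coefficient.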

In addition, the satisfaction of viewer i is determined as a weighting. The more attention a person is paying, the more that person tries to watch the video under the best possible conditions.
The closer the gaze coordinates (V^i_x, V^i_y, 0) on the video display screen 101A are to the face position coordinates (F^i_x, F^i_y, 0), the more directly the viewer is watching the video from the front. It is intuitively clear that the best conditions for watching the video on the video display unit 101 are directly in front of it. However, when many people watch the video display screen 101A, not everyone can watch under good conditions, that is, from directly in front. Therefore, as the satisfaction with the current conditions, the satisfaction S^i(t) of viewer i is defined by how far apart the gaze position and the face position are at time t. With H(x) a monotonically decreasing function, the satisfaction S^i(t) is obtained from Equation 6.

[Equation 6]

The satisfaction obtained from the above equation is higher the better the viewing conditions. A viewer with lower satisfaction is less content with the current situation and is considered to have a stronger desire to watch, in the main display area 101B, the video that he or she is looking at.
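The definition of satisfaction as a decreasing function of the separation between gaze point and face position can be sketched as below. The source requires only that H(x) be monotonically decreasing. The reciprocal form, the `scale` parameter, and the function name are assumptions for this example.

```python
import math


def satisfaction(gaze_xy, face_xy, scale=1.0):
    """Score viewing conditions from gaze-to-face separation on the screen plane.

    gaze_xy: (V_x, V_y) point being looked at
    face_xy: (F_x, F_y) face position projected onto the screen plane
    scale:   assumed length scale for the fall-off
    Returns 1.0 for a head-on view and approaches 0 as the separation grows.
    """
    separation = math.hypot(gaze_xy[0] - face_xy[0], gaze_xy[1] - face_xy[1])
    return 1.0 / (1.0 + separation / scale)  # assumed choice of H(x)
```

A viewer looking straight ahead at the video receives the maximum score, and a viewer looking far to one side receives a low score, which the description interprets as a stronger wish to see that video in the main display area.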

Using the above coefficients, the display desire degree detection means 108 obtains the display desire degree D^i(t) of viewer i for the videos at the current time t from Equation 7 (step S17).

[Equation 7]

Here, D^i(t) represents how strongly the viewer wishes to view each of the k videos. The display desire degree detection means 108 further obtains D^i(t) for all m recognized viewers and sums them to calculate, from Equation 8, the display desire degree AD(t) of all viewers for the videos (step S18).

[Equation 8]

The display video creation means 109 determines whether the video with the highest display desire degree calculated by the display desire degree detection means 108 is the video in the main display area 101B (step S19). If it is, that video is displayed in the main display area 101B (step S20). If the video with the highest display desire degree is instead a video in the sub display area 101C, for example video P1, the video P_MAX displayed in the main display area is replaced with video P1, and video P1 is displayed on the video display unit 101 (step S21). Through this processing, the video of greatest interest to the viewers continues to be displayed in the main display area.
The processing of steps S11 to S21 in FIG. 8 is repeated at a predetermined interval, for example every few tens of seconds to about one minute, so that the video attracting the most interest on the video display unit 101 is swapped into the main display area 101B. The processing cycle is not limited to the interval described above.
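Steps S18 to S21 can be sketched as follows, assuming each viewer's display desire degree is a list with one value per video. The function names and the sample numbers are hypothetical, and the tie-breaking rule (keep the current main video when it ties for the maximum) is an assumption not stated in the text.

```python
def aggregate_desire(per_viewer):
    """Sum the per-viewer desire vectors element-wise (one entry per video)."""
    k = len(per_viewer[0])
    return [sum(d[j] for d in per_viewer) for j in range(k)]

def choose_main_video(per_viewer, current_main):
    """Return the index of the video to show in the main display area."""
    total = aggregate_desire(per_viewer)
    best = max(range(len(total)), key=lambda j: total[j])
    # Keep the current video when it already has the highest aggregate desire.
    return current_main if total[current_main] == total[best] else best

desires = [
    [0.2, 0.7, 0.1],  # viewer 1
    [0.1, 0.6, 0.3],  # viewer 2
    [0.5, 0.4, 0.1],  # viewer 3
]
new_main = choose_main_video(desires, current_main=0)
```

Here video 1 has the largest summed desire, so it replaces video 0 in the main display area; calling the function again with `current_main=1` leaves the layout unchanged.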

As described above, Embodiment 1 acquires the line-of-sight information of an unspecified number of viewers watching the video on an outdoor video display unit, without the line-of-sight input devices that conventionally had to be worn by the audience. Based on the acquired line-of-sight information, it detects the video that attracts the viewers' interest and displays the detected video in the main display area 101B. The video that attracts the viewers' interest can therefore be displayed large without imposing any effort on the viewers, further arousing their interest.

Next, Embodiment 2 will be described with reference to FIGS. 9 to 16.
FIG. 9 is a diagram showing the schematic configuration of the video display system of Embodiment 2. FIG. 10 is a functional block diagram showing the configuration of the video display system of Embodiment 2. FIG. 11 is an explanatory diagram showing the positional relationship between the video display unit and the viewer information acquisition cameras in Embodiment 2. FIGS. 12, 13, 14 and 15 are explanatory diagrams showing, respectively, the first, second, third and final stages of the processing that determines the size and order of the regions to be displayed from the display desire degree matrix in Embodiment 2. FIG. 16 is a flowchart showing the operation of the video display system of Embodiment 2.

First, the configuration of the video display system shown in FIG. 9 will be described.
As shown in FIG. 9, a landscape, a person, a downtown area or the like is photographed by cameras at a remote location 90, and the video is displayed in real time on a video display device 92 at the current location 91 (corresponding to the video display unit 401 shown in FIGS. 10 and 11). Photographing is performed with two cameras 93 and 94: camera 93 is the main display area camera and camera 94 is the sub display area camera. The sub display area camera 94 captures the overall video of the subject. The main display area camera 93 is used to capture, enlarged, the part (region) of the subject captured by the sub display area camera 94 that attracts the viewers' interest. The main display area camera 93 and the sub display area camera 94 are connected to the video display device 92 via a communication line 95.

As shown in FIG. 10, the video display device 400 of Embodiment 2 includes a video display unit 401, two viewer information acquisition cameras 402, face image recognition means 403, face coordinate detection means 404, viewer gaze coordinate detection means 405, line-of-sight position matrix creation means 406, line-of-sight position information detection means 407, satisfaction detection means 408, display desire degree detection means 409, display area / display order determination means 410, moving area detection means 411, transmission means 412, reception means 413 and display video creation means 414.

The video display unit 401 is installed in a place with heavy foot traffic, such as an outdoor plaza, and has a video display screen 401A with horizontal dimension X and vertical dimension Y. As shown in FIG. 11, the video display screen 401A is divided into a main display area 401B, which displays the video of the region in which the viewers are interested, and a sub display area 401C, which displays the overall video at reduced size. In Embodiment 2, the video captured by the main display area camera 93 can be displayed in the main display area 401B, and the video captured by the sub display area camera 94 can be displayed in the sub display area 401C.

The two viewer information acquisition cameras 402 photograph the unspecified number of viewers watching the video display screen of the video display unit 401. As shown in FIG. 11, they are placed at the left and right of the upper edge of the display screen of the video display unit 401, facing the viewers. The spacing between the left and right viewer information acquisition cameras 402 is set to a distance L corresponding to the X-axis dimension of the display screen, so that the position of each viewer's line of sight on the display screen can be determined accurately.

Based on the viewer video acquired by each viewer information acquisition camera 402, the face image recognition means 403 recognizes, by a support vector machine (SVM) or the like, the face images of the viewers 200 who are facing the video display screen 401A as shown in FIG. 9, and detects the number of viewers 200 facing the video display screen 401A from the recognized face images.
Like the face coordinate detection means 104 of Embodiment 1, the face coordinate detection means 404 detects, for each recognized viewer 200, the position coordinates of the face image relative to the video display screen 401A, based on the face images recognized by the face image recognition means 403 for each viewer information acquisition camera 402.

Like the viewer gaze coordinate detection means 105 of Embodiment 1, the viewer gaze coordinate detection means 405 detects the face direction and eyeball direction of each viewer 200 from both face images recognized by the face image recognition means 403 for the respective viewer information acquisition cameras 402, and from these detects the line-of-sight direction of each recognized viewer 200. Based on the detected line-of-sight direction and the position coordinates of the face image detected by the face coordinate detection means 404, it detects, for each recognized viewer, the position on the video display screen 401A at which the viewer 200 is gazing, as line-of-sight position information.

Based on the line-of-sight position information detected by the viewer gaze coordinate detection means 405, the line-of-sight position matrix creation means 406 creates an X×Y matrix Vⁱ(t) that indicates, for the coordinates (0, 0, 0) to (X, Y, 0) on the video display screen 401A, whether the line of sight of the viewer 200 is directed at each position.
The line-of-sight position information detection means 407 adds the past line-of-sight position information (from time t-T to the current time t) to the X×Y matrix created by the line-of-sight position matrix creation means 406, and calculates the result for each viewer as the current line-of-sight position information, referred to as line-of-sight position meta information Vⁱ_meta(t).

Based on the position coordinates of the face image detected by the face coordinate detection means 404, the satisfaction detection means 408 calculates a face movement coefficient Fⁱ_move, which indicates how still the viewer 200 has remained while facing the video display screen 401A and watching the video, and calculates the satisfaction Sⁱ(t) of the viewer 200 for each viewer based on the calculated face movement coefficient Fⁱ_move.

The display desire degree detection means 409 calculates, for each recognized viewer 200, the display desire degree Dⁱ(t) of that viewer for the video over the entire area of the video display screen 401A, based on the line-of-sight position meta information Vⁱ_meta(t) calculated by the line-of-sight position information detection means 407 and on the face movement coefficient Fⁱ_move and satisfaction Sⁱ(t) calculated by the satisfaction detection means 408. It then adds the calculated display desire degrees Dⁱ(t) together to obtain the display desire degree AD(t) of all recognized viewers over the video display screen 401A.

Based on the display desire degree AD(t) calculated by the display desire degree detection means 409, the display area / display order determination means 410 determines the size of each display area to be shown in the main display area 401B of the video display screen 401A and the order in which these display areas are shown, and sends the display area and display order information to the main display area camera 93.
The moving area detection means 411 determines whether each display area determined by the display area / display order determination means 410 is a moving area or a stationary area. When a display area is determined to be a moving area, it sends a tracking command to the main display area camera 93 so that the display area (that is, the shooting area of the main display area camera 93) is changed to track the motion.

The transmission means 412 transmits, to the main display area camera 93 installed at the remote location, the information indicating the display areas and their order determined by the display area / display order determination means 410, together with the moving-or-stationary determination made by the moving area detection means 411.
The reception means 413 receives the videos captured by the main display area camera 93 and the sub display area camera 94 installed at the remote location and outputs them to the display video creation means 414.
The display video creation means 414 processes the videos of the main display area camera 93 and the sub display area camera 94 received from the reception means 413 so that they are displayed in the main display area 401B and the sub display area 401C of the video display unit 401, respectively.

Next, the operation of the video display device of Embodiment 2 will be described with reference to FIGS. 9 to 16.
In FIG. 9, the videos captured by the main display area camera 301 and the sub display area camera 302 at the remote location 90 are transmitted over the communication line 95 to the video display device 92 at the current location 91 and displayed. More specifically, the main display area camera 93 photographs a subject to obtain an enlarged video of the region in which the viewers are interested, for example the person 96 at the shooting site in the remote location 90 shown in FIG. 9, while the sub display area camera 94 captures the overall video, including the person 96 and the background building 97 at the shooting site. The video captured by the main display area camera 93 is received by the reception means 413 and displayed enlarged by the display video creation means 414 in the main display area 401B of the video display unit 401, as shown in FIGS. 9 and 11. The video captured by the sub display area camera 94 is received by the reception means 413 and displayed reduced by the display video creation means 414 in the sub display area 401C of the video display unit 401, as shown in FIGS. 9 and 11.

In this state, the left and right viewer information acquisition cameras 402 photograph the viewers 200 within their respective fields of view to obtain video information of the viewers 200 (step S31). The positional relationship between the viewer information acquisition cameras 402 and the video display unit 401 is known and is assumed to be the arrangement shown in FIG. 11. In FIG. 11, the coordinates (0, 0, 0) represent the upper left corner of the video display screen 401A, and the coordinates (X, Y, 0) represent its lower right corner.

As in Embodiment 1, the face image recognition means 403 uses the video signals obtained from the viewer information acquisition cameras 402 to recognize, by a technique such as SVM, the face images of the viewers 200 facing the video display screen 401A, and counts these viewers (step S32). Let m be the number of viewers 200 recognized at this time.
As in Embodiment 1, the face coordinate detection means 404 obtains the face position coordinates relative to the video display screen 401A for each of the viewers 200 recognized by the face image recognition means 403 (step S33).

As in Embodiment 1, the viewer gaze coordinate detection means 405 detects the direction of the line of sight of each viewer 200 gazing at the video display screen 401A and detects, for each recognized viewer, the position coordinates on the video display screen 401A at which the viewer is looking (step S34).
Based on the line-of-sight position information obtained by the viewer gaze coordinate detection means 405, the line-of-sight position matrix creation means 406 creates an X×Y line-of-sight position matrix that indicates, for the coordinates (0, 0, 0) to (X, Y, 0) on the video display screen 401A, whether the line of sight of the viewer 200 is directed at each position (step S35). That is, the line of sight is taken to be directed not at the single point (Vⁱ_x, Vⁱ_y, 0) but at the interior of a circle of radius r centered on (Vⁱ_x, Vⁱ_y, 0); under this assumption, the X×Y line-of-sight position matrix Vⁱ(t), which indicates whether the line of sight is directed at each position at the current time t, can be created based on Equation 9.

[Equation 9]
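The line-of-sight position matrix described above can be sketched in Python as a grid that marks every position inside the circle of radius r around the gaze point. The binary 0/1 values, the inclusive circle boundary, the row-by-row (y, then x) indexing and the function name are assumptions for illustration; Equation 9 itself is not reproduced here.

```python
def gaze_matrix(width, height, gaze_x, gaze_y, radius):
    """Return a height-by-width grid with 1 where the gaze circle covers the position."""
    r2 = radius * radius
    return [
        [1 if (x - gaze_x) ** 2 + (y - gaze_y) ** 2 <= r2 else 0
         for x in range(width)]
        for y in range(height)
    ]

# A 7x5 screen, gaze point at (3, 2), circle of radius 1.
v = gaze_matrix(width=7, height=5, gaze_x=3, gaze_y=2, radius=1)
```

With radius 1 on an integer grid, the marked positions are the gaze point and its four direct neighbors.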

The above line-of-sight position matrix corresponds to the positions of the coordinates (0, 0, 0) to (X, Y, 0) on the video display screen 401A shown in FIG. 11: matrix element v₁₁ indicates the upper left corner of the video display screen 401A, and element v_xy indicates the lower right corner. The line-of-sight position information detection means 407 therefore adds the past line-of-sight position information (from time t-T to the current time t) to the X×Y line-of-sight position matrix created by the line-of-sight position matrix creation means 406, and thereby obtains the current line-of-sight position information for each viewer as line-of-sight position meta information Vⁱ_meta(t) from Equation 10 (step S36).

[Equation 10]
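The accumulation over the interval from t-T to t can be sketched as a sliding window of gaze matrices that are summed element-wise. The class name, the fixed-length window measured in frames, and the unweighted sum are assumptions for illustration; the text states only that past line-of-sight position information is added.

```python
from collections import deque

class GazeHistory:
    """Keeps the gaze matrices of the last `window` time steps and sums them."""

    def __init__(self, window):
        # deque with maxlen silently drops the oldest frame when full.
        self.frames = deque(maxlen=window)

    def push(self, matrix):
        self.frames.append(matrix)

    def meta(self):
        """Element-wise sum of the stored matrices."""
        rows, cols = len(self.frames[0]), len(self.frames[0][0])
        return [[sum(f[y][x] for f in self.frames) for x in range(cols)]
                for y in range(rows)]

history = GazeHistory(window=3)
for frame in ([[1, 0], [0, 0]], [[1, 1], [0, 0]], [[0, 1], [0, 0]], [[0, 1], [0, 1]]):
    history.push(frame)
v_meta = history.meta()  # only the three most recent frames contribute
```

Positions that a viewer keeps looking at accumulate larger values, so the summed matrix acts as a dwell-time map for that viewer.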

In the same manner as in Embodiment 1, the satisfaction detection means 408 calculates the face movement coefficient Fⁱ_move and, based on the calculated face movement coefficient Fⁱ_move, calculates the satisfaction Sⁱ(t) of the viewer 200 (step S37).
Based on the line-of-sight position meta information Vⁱ_meta(t) calculated by the line-of-sight position information detection means 407 and on the face movement coefficient Fⁱ_move and satisfaction Sⁱ(t) calculated by the satisfaction detection means 408, the display desire degree detection means 409 calculates from Equation 11, for each recognized viewer 200, the display desire degree Dⁱ(t) of that viewer for the video over the entire area of the video display screen 401A at the current time t (step S38).

[Equation 11]

The display desire degree detection means 409 further adds together the display desire degrees Dⁱ(t) calculated for the individual recognized viewers 200 and calculates from Equation 12 the display desire degree AD(t) of all recognized viewers for the video on the video display screen 401A (step S39). This display desire degree AD(t) is obtained as an X×Y matrix.

[Equation 12]
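Since AD(t) is described as the sum of the per-viewer matrices Dⁱ(t), step S39 amounts to an element-wise matrix sum. The following minimal sketch uses a hypothetical helper name and made-up 2×2 values for two viewers.

```python
def sum_matrices(matrices):
    """Element-wise sum of equally sized 2-D lists."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[y][x] for m in matrices) for x in range(cols)]
            for y in range(rows)]

d_viewer_1 = [[0, 2], [1, 0]]  # illustrative display desire of viewer 1
d_viewer_2 = [[1, 2], [0, 0]]  # illustrative display desire of viewer 2
ad = sum_matrices([d_viewer_1, d_viewer_2])
```

The resulting matrix has the same X×Y shape as each viewer's matrix, which is what the region selection that follows operates on.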

Based on the display desire degree AD(t) obtained as an X×Y matrix, the display area / display order determination means 410 selects the regions to be displayed in the main display area 401B and determines the size and display order of the selected regions by the processing described below (step S40).

Process 1: Among the elements ad_ij of the display desire degree matrix AD(t), regions where ad_ij falls below a predetermined threshold (ad_ij < th_value) are treated as regions not to be displayed (NG), and regions where ad_ij is at or above the threshold are selected as regions to be displayed. In the main display area 401B shown in FIG. 12, the hatched regions are the regions not to be displayed (NG), and the white regions are the regions 120 to be displayed. Within the regions 120 to be displayed, the whiter the region, the larger the value of ad_ij.

Process 2: The elements ad_ij that are not in a region not to be displayed (NG) are connected to form n connected regions AREA_j (j = 0, ..., n).
Among these regions, any region whose area is below a threshold (area of AREA_j < th_AREA) is treated as NG, and regions at or above the threshold become areas 130 to be displayed. The center point 131 of each area 130 is then obtained as shown in FIG. 13. If the center point 131 lies within the sub display area 401C by a certain threshold or more, that area is treated as NG.

Process 3: As shown in FIG. 14, a region 140 with an aspect ratio of N:L, centered on the center point, is extracted from each display area 130 so that no NG region is included.
Process 4: For the extracted regions 140 that are not NG, the average value of the magnitude of each region 140 is obtained, and labels such as those shown in FIG. 15 are attached in descending order of this average. That is, the numbers "1", "2" and "3" indicating the display order are assigned to the regions 140 to be displayed, as shown in FIG. 15. The order of the numbers indicating the display order of the regions is not limited to that shown in FIG. 15.
With the above processing, the size and display order of the display areas to be shown in the main display area 401B can be determined.
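Processes 1, 2 and 4 can be sketched as thresholding followed by connected-region labeling and ranking. The sketch below makes several assumptions that the text does not settle: 4-connectivity for joining elements, the mean display desire of a region as the ranking value in Process 4, and the centroid as the center point. It omits Process 3 (the N:L crop) and the check against the sub display area; the function name and sample matrix are hypothetical.

```python
from collections import deque

def select_regions(ad, th_value, th_area):
    """Return connected regions of `ad`, labeled in descending order of mean desire."""
    rows, cols = len(ad), len(ad[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for y in range(rows):
        for x in range(cols):
            # Process 1: elements below th_value are not displayed (NG).
            if seen[y][x] or ad[y][x] < th_value:
                continue
            # Process 2: flood-fill 4-connected elements into one region.
            cells, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                cells.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and ad[ny][nx] >= th_value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(cells) < th_area:
                continue  # regions smaller than th_area are NG
            mean = sum(ad[cy][cx] for cy, cx in cells) / len(cells)
            center = (sum(cy for cy, _ in cells) / len(cells),
                      sum(cx for _, cx in cells) / len(cells))
            regions.append({"cells": cells, "mean": mean, "center": center})
    # Process 4: label regions 1, 2, 3, ... from the highest mean downwards.
    regions.sort(key=lambda r: r["mean"], reverse=True)
    for label, region in enumerate(regions, start=1):
        region["label"] = label
    return regions

ad = [
    [5, 5, 0, 0, 0],
    [5, 5, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [3, 0, 0, 0, 0],
]
regions = select_regions(ad, th_value=2, th_area=2)
```

In this example the isolated element with value 3 passes the value threshold but is discarded by the area threshold, and the block of 9s is labeled 1 ahead of the block of 5s.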

The moving area detection means 411 determines whether each display area selected by the display area / display order determination means 410 is a moving area or a stationary area. It performs block matching over time on the video within the display area and detects a motion vector (step S41). The detected motion vector (mv_x, mv_y) is compared with predetermined thresholds (mv_x > th_mv_x or mv_y > th_mv_y), and when the motion vector exceeds a threshold, the display area is determined to be a moving area (step S42). When the display area is a moving area, the moving area detection means 411 performs motion detection and tracks the moving area. When the display area is not a moving area, it is a stationary area and is kept as it is.
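Steps S41 and S42 can be illustrated with an exhaustive-search block matcher that minimizes the sum of absolute differences (SAD), followed by the threshold test. The SAD criterion, the search range, the use of absolute values in the comparison, and all names below are assumptions for this sketch; the text specifies only block matching over time and a comparison of (mv_x, mv_y) with thresholds.

```python
def block_match(prev, curr, top, left, size, search):
    """Return (mv_x, mv_y) that best maps a block of `prev` onto `curr` (minimum SAD)."""
    h, w = len(curr), len(curr[0])
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + size > h or x0 + size > w:
                continue  # candidate block would leave the frame
            sad = sum(abs(prev[top + r][left + c] - curr[y0 + r][x0 + c])
                      for r in range(size) for c in range(size))
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv

def is_moving(mv, th_x, th_y):
    """A region counts as moving when either component exceeds its threshold (assumed absolute)."""
    return abs(mv[0]) > th_x or abs(mv[1]) > th_y

def frame_with_block(left):
    """Build a 6x6 frame of zeros with a distinctive 2x2 block at row 1, column `left`."""
    frame = [[0] * 6 for _ in range(6)]
    frame[1][left], frame[1][left + 1] = 9, 8
    frame[2][left], frame[2][left + 1] = 7, 6
    return frame

prev_frame, curr_frame = frame_with_block(1), frame_with_block(3)
mv = block_match(prev_frame, curr_frame, top=1, left=1, size=2, search=2)
moving = is_moving(mv, th_x=1, th_y=1)
```

The block shifts two columns to the right between the frames, so the detected vector exceeds the horizontal threshold and the region would be handed to tracking.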

Next, the transmission means 412 transmits, to the main display area camera 93 installed at the remote location, the information indicating the display areas (shooting locations) and display order determined by the display area / display order determination means 410, together with the motion detection information, that is, the moving-or-stationary determination made by the moving area detection means 411 (step S45). In response, the main display area camera 93 changes the angle of the camera body, the zoom, whether tracking is used and so on according to the information it is given, captures the video of the designated region, and transmits it to the video display device at the current location. The video display unit 401 that receives this video displays each display area for a fixed time and then displays, in the main display area 401B, the overall video shown in the sub display area 401C.
The processing of steps S31 to S46 in FIG. 16 is repeated at a predetermined interval, for example every few tens of seconds to about one minute. The processing cycle is not limited to the interval described above.

To summarize the video display control described above, the video display processing is repeated according to the following procedure.
Procedure 1: The overall video from the remote location 90 acquired by the sub display area camera 94 is displayed over the entire display area of the video display unit 401.
Procedure 2: A plurality of regions in which the viewers 200 watching the overall video are interested are obtained.
Procedure 3: The display area of the video display unit 401 is divided into the main display area 401B and the sub display area 401C. The video P1 of interest to the viewers, acquired by the main display area camera 93, is displayed enlarged in the main display area 401B as shown in FIG. 9, and the overall video P2 acquired by the sub display area camera 94 is displayed reduced in the sub display area 401C as shown in FIG. 9.
Procedure 4: The regions of interest obtained in Procedure 2 are switched at fixed intervals until all of them have been displayed.
Procedure 5: Return to Procedure 1.
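One cycle of the procedure above can be sketched as a generator that yields the successive screen layouts. The dictionary layout, the string placeholders for videos and the function name are hypothetical; the sketch covers only the ordering of the layouts, not gaze measurement or camera control.

```python
def display_schedule(regions_of_interest):
    """Yield the screen layouts of one cycle of Procedures 1 to 4."""
    # Procedure 1: the overall video fills the whole screen.
    yield {"main": "overall", "sub": None}
    # Procedures 3 and 4: each region of interest in turn, overall video in the sub area.
    for region in regions_of_interest:
        yield {"main": region, "sub": "overall"}

# Procedure 2 is assumed to have produced these two regions, in display order.
cycle = list(display_schedule(["region 1", "region 2"]))
```

Procedure 5 corresponds to calling the generator again with the regions found in the next measurement.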

As described above, Embodiment 2 acquires the line-of-sight information of an unspecified number of viewers watching the video on an outdoor video display unit, without the line-of-sight input devices that conventionally had to be worn by the audience, and displays the video that attracts the viewers' interest based on the acquired line-of-sight information.
The video that attracts the viewers' interest can therefore be displayed large without imposing any effort on the viewers, which in turn further arouses their interest.

FIG. 1 is a functional block diagram showing the configuration of the video display device of Embodiment 1.
FIG. 2 is an explanatory diagram showing an example of the video display form of the video display unit.
FIG. 3 is an explanatory diagram showing the relationship between the video display unit and the viewers.
FIG. 4 is an explanatory diagram showing the positional relationship between the video display unit and the viewer information acquisition cameras.
FIG. 5 is an explanatory diagram showing the relationship between the video display unit and the viewers when recognizing the face images of viewers facing the display screen of the video display unit.
FIG. 6 is an explanatory diagram for obtaining the positional relationship between the display screen of the video display unit and a viewer's face region.
FIG. 7 is an explanatory diagram showing the line-of-sight position of a viewer looking at the display screen of the video display unit.
FIG. 8 is a flowchart showing the operation of the video display device of Embodiment 1.
FIG. 9 is a diagram showing the schematic configuration of the video display system of Embodiment 2.
FIG. 10 is a functional block diagram showing the configuration of the video display system of Embodiment 2.
FIG. 11 is an explanatory diagram showing the positional relationship between the video display unit and the viewer information acquisition cameras.
FIG. 12 is an explanatory diagram showing the first stage of the processing that determines the size and order of the regions to be displayed from the display desire degree matrix.
FIG. 13 is an explanatory diagram showing the second stage of the processing that determines the size and order of the regions to be displayed from the display desire degree matrix.
FIG. 14 is an explanatory diagram showing the third stage of the processing that determines the size and order of the regions to be displayed from the display desire degree matrix.
FIG. 15 is an explanatory diagram showing the final stage of the processing that determines the size and order of the regions to be displayed from the display desire degree matrix.
FIG. 16 is a flowchart for explaining the operation of the video display system of Embodiment 2.

Explanation of symbols

100: video display device
101: video display unit
101A: video display screen
101B: main display area
101C: sub display area
102: viewer information acquisition camera
103: face image recognition means
104: face coordinate detection means
105: viewer gaze coordinate detection means
106: attention video information detection means
107: satisfaction degree detection means
108: display desire degree detection means
109: display video creation means
110: receiving means
200: viewer
301: main display area camera
302: sub display area camera
400: video display device
401: video display unit
402: viewer information acquisition camera
403: face image recognition means
404: face coordinate detection means
405: viewer gaze coordinate detection means
406: gaze position matrix creation means
407: gaze position information detection means
408: satisfaction degree detection means
409: display desire degree detection means
410: display area and display order determination means
411: moving region detection means
412: transmitting means
413: receiving means
414: display video creation means
401A: video display screen
401B: main display area
401C: sub display area

Claims

A video display method comprising:
a video display step of providing, on a display screen, a main area and one or more sub areas smaller than the main area, and displaying video in the main area and in the one or more sub areas;
a viewer photographing step of photographing, with a plurality of cameras, viewers who watch the video displayed on the display screen in the video display step;
a face image recognition step of recognizing face images of the viewers based on the images photographed in the viewer photographing step;
a display desire degree calculation step of detecting, based on the face images recognized in the face image recognition step, the video at which each face image is gazing, and calculating, based on the detected video, a display desire degree for displaying that video; and
a video switching step of comparing the display desire degrees calculated in the display desire degree calculation step, and switching the video of the main area and the one or more sub areas so that the video with the highest display desire degree is displayed in the main area.
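
As an illustration of the video switching step, the following minimal Python sketch compares per-video display desire degrees and reassigns videos so that the highest-scoring one occupies the main area. The function name `switch_videos`, the dictionary layout, the video identifiers, and the policy of swapping the winning video with the current main video are all assumptions made for illustration; the claim only requires that the video with the highest display desire degree end up in the main area.

```python
def switch_videos(layout, desire):
    """Return a new layout with the highest-desire video in the main area.

    layout: dict mapping an area name ("main", "sub1", ...) to a video id.
    desire: dict mapping a video id to its display desire degree.
    Both structures are hypothetical; the claim does not prescribe them.
    """
    new_layout = dict(layout)
    # Find the area currently showing the video with the highest desire degree.
    best_area = max(layout, key=lambda area: desire.get(layout[area], 0.0))
    if best_area != "main":
        # Assumed policy: swap the winning video with the current main video.
        new_layout["main"] = layout[best_area]
        new_layout[best_area] = layout["main"]
    return new_layout


layout = {"main": "camera_A", "sub1": "camera_B", "sub2": "camera_C"}
desire = {"camera_A": 3.0, "camera_B": 7.5, "camera_C": 1.0}
print(switch_videos(layout, desire))
# {'main': 'camera_B', 'sub1': 'camera_A', 'sub2': 'camera_C'}
```

A swap keeps every video on screen; other policies (for example, re-sorting all sub areas by desire degree) would satisfy the same claim wording.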

The video display method according to claim 1, wherein:
in the viewer photographing step, the plurality of cameras capture a moving image for a predetermined period of time;
in the face image recognition step, the face images are recognized for each frame constituting the moving image captured in the viewer photographing step; and
in the display desire degree calculation step, the display desire degree is calculated based on the face images of each frame recognized in the face image recognition step.
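
The per-frame processing in this claim can be pictured as a tally over frames. The sketch below assumes a hypothetical upstream stage that already yields, for every frame, the area each recognized face image is gazing at; it then counts gaze hits per area over the whole clip. The input format and the function name `tally_gaze_per_area` are illustrative assumptions, and treating the raw count as a display desire degree is one possible choice rather than the claimed formula.

```python
from collections import Counter


def tally_gaze_per_area(frames):
    """Count, over all frames, how many face images gaze at each area.

    frames: iterable of lists; each inner list holds the area name gazed at
    by each face recognized in that frame (None when the face is looking
    off-screen). This input format is hypothetical.
    """
    counts = Counter()
    for gazed_areas in frames:
        for area in gazed_areas:
            if area is not None:
                counts[area] += 1
    return counts


frames = [
    ["main", "sub1", "sub1"],  # frame 0: three faces recognized
    ["sub1", None],            # frame 1: one face looks away from the screen
    ["main", "sub1"],          # frame 2: two faces recognized
]
counts = tally_gaze_per_area(frames)
print(counts["main"], counts["sub1"])  # 2 4
```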

The video display method according to claim 1, wherein the display desire degree calculation step includes:
a face coordinate detection step of detecting position coordinates of the face image relative to the display screen based on the face image recognized in the face image recognition step;
a gaze direction detection step of detecting a gaze direction of the face image based on the face image recognized in the face image recognition step;
a gaze coordinate detection step of detecting the position coordinates on the display screen at which the face image is gazing, based on the gaze direction of the face image detected in the gaze direction detection step and the position coordinates of the corresponding face image detected in the face coordinate detection step; and
an attention video detection step of detecting which of the main area and the sub areas the face image is gazing at, based on the position coordinates detected in the gaze coordinate detection step,
and wherein the display desire degree of each video of the main area and the one or more sub areas is calculated based on the detection information detected in the attention video detection step.
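
The gaze coordinate detection and attention video detection steps can be sketched geometrically as a ray-plane intersection followed by a rectangle hit test. The Python sketch below assumes, purely for illustration, that the display screen lies in the plane z = 0, that face positions are given in the same coordinate system with z > 0 in front of the screen, and that each area is an axis-aligned rectangle. The function names, the coordinate convention, and the example layout are hypothetical and are not taken from the patent.

```python
def gaze_point_on_screen(face_pos, gaze_dir):
    """Intersect a gaze ray with the screen plane z = 0.

    face_pos: (x, y, z) of the face in screen coordinates, z > 0 in front.
    gaze_dir: (dx, dy, dz) gaze direction; dz < 0 points toward the screen.
    Returns (x, y) on the screen plane, or None if the ray never reaches it.
    """
    x, y, z = face_pos
    dx, dy, dz = gaze_dir
    if dz >= 0:
        return None  # looking away from, or parallel to, the screen
    t = -z / dz      # ray parameter at which z + t * dz == 0
    return (x + t * dx, y + t * dy)


def area_at(point, areas):
    """Return the name of the area whose rectangle contains the point.

    areas: dict name -> (left, bottom, right, top); a hypothetical layout.
    Rectangles are half-open so that adjacent areas do not overlap.
    """
    if point is None:
        return None
    px, py = point
    for name, (left, bottom, right, top) in areas.items():
        if left <= px < right and bottom <= py < top:
            return name
    return None


areas = {
    "main": (0.0, 0.0, 6.0, 4.0),
    "sub1": (6.0, 2.0, 8.0, 4.0),
    "sub2": (6.0, 0.0, 8.0, 2.0),
}
point = gaze_point_on_screen((7.0, 1.0, 10.0), (-0.5, 0.25, -1.0))
print(point, area_at(point, areas))  # (2.0, 3.5) main
```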

The video display method according to claim 1, wherein the display desire degree calculation step includes:
a face coordinate detection step of detecting position coordinates of the face image relative to the display screen based on the face image recognized in the face image recognition step;
a movement amount detection step of detecting a movement amount of the face image based on the position coordinates of the face image detected in the face coordinate detection step;
a satisfaction degree calculation step of calculating a satisfaction degree of the face image based on the movement amount detected in the movement amount detection step;
a gaze direction detection step of detecting a gaze direction of the face image based on the face image recognized in the face image recognition step;
a gaze coordinate detection step of detecting the position coordinates on the display screen at which the face image is gazing, based on the gaze direction of the face image detected in the gaze direction detection step and the position coordinates of the corresponding face image detected in the face coordinate detection step; and
an attention video detection step of detecting which of the main area and the sub areas the face image is gazing at, based on the position coordinates detected in the gaze coordinate detection step,
and wherein the display desire degree of each video of the main area and the one or more sub areas is calculated based on the detection information detected in the attention video detection step, the movement amount detected in the movement amount detection step, and the satisfaction degree calculated in the satisfaction degree calculation step.
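
To make the movement amount and satisfaction degree steps concrete, the sketch below computes a face's movement amount as the total distance its position coordinates travel across frames, then maps that amount to a satisfaction value. The claim states only that the satisfaction degree is calculated from the movement amount; the particular mapping used here (less movement treated as higher satisfaction), the `scale` parameter, and the function names are assumptions for illustration and should not be read as the patent's formula.

```python
import math


def movement_amount(positions):
    """Total path length of a face image across frames.

    positions: list of (x, y) face coordinates, one entry per frame.
    """
    return sum(math.dist(p, q) for p, q in zip(positions, positions[1:]))


def satisfaction(movement, scale=1.0):
    """Map a movement amount to a value in (0, 1].

    Assumption for illustration only: a viewer who moves less is treated
    as more satisfied. The claim does not specify this relationship.
    """
    return 1.0 / (1.0 + movement / scale)


track = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]  # hypothetical face positions
m = movement_amount(track)
print(m)                # 5.0
print(satisfaction(m))  # strictly between 0 and 1
```

How these two values are then combined with the attention video detection result into a display desire degree is left open here, since the claim does not state the combination rule.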

The video display method according to claim 1, wherein the plurality of cameras includes two cameras, and the two cameras are arranged horizontally, one at each horizontal end of the display screen.

A video display device comprising:
a video display unit that provides, on a display screen, a main area and one or more sub areas smaller than the main area, and displays video in the main area and in the one or more sub areas;
a plurality of cameras that photograph viewers who watch the video displayed on the display screen;
face image recognition means for recognizing face images of the viewers based on the images photographed by the plurality of cameras;
display desire degree calculation means for detecting, based on the face images recognized by the face image recognition means, the video at which each face image is gazing, and calculating, based on the detected video, a display desire degree for displaying that video; and
video switching means for comparing the display desire degrees calculated by the display desire degree calculation means, and switching the video of the main area and the one or more sub areas so that the video with the highest display desire degree is displayed in the main area.

The video display device according to claim 6, wherein:
the plurality of cameras capture a moving image for a predetermined period of time;
the face image recognition means recognizes the face images for each frame constituting the moving image captured by the cameras; and
the display desire degree calculation means calculates the display desire degree based on the face images of each frame recognized by the face image recognition means.

The video display device according to claim 6, wherein the display desire degree calculation means includes:
face coordinate detection means for detecting position coordinates of the face image relative to the display screen based on the face image recognized by the face image recognition means;
gaze direction detection means for detecting a gaze direction of the face image based on the face image recognized by the face image recognition means;
gaze coordinate detection means for detecting the position coordinates on the display screen at which the face image is gazing, based on the gaze direction of the face image detected by the gaze direction detection means and the position coordinates of the corresponding face image detected by the face coordinate detection means; and
attention video detection means for detecting which of the main area and the sub areas the face image is gazing at, based on the position coordinates detected by the gaze coordinate detection means,
and wherein the display desire degree of each video of the main area and the one or more sub areas is calculated based on the detection information detected by the attention video detection means.

The video display device according to claim 6, wherein the display desire degree calculation means includes:
face coordinate detection means for detecting position coordinates of the face image relative to the display screen based on the face image recognized by the face image recognition means;
movement amount detection means for detecting a movement amount of the face image based on the position coordinates of the face image detected by the face coordinate detection means;
satisfaction degree calculation means for calculating a satisfaction degree of the face image based on the movement amount detected by the movement amount detection means;
gaze direction detection means for detecting a gaze direction of the face image based on the face image recognized by the face image recognition means;
gaze coordinate detection means for detecting the position coordinates on the display screen at which the face image is gazing, based on the gaze direction of the face image detected by the gaze direction detection means and the position coordinates of the corresponding face image detected by the face coordinate detection means; and
attention video detection means for detecting which of the main area and the sub areas the face image is gazing at, based on the position coordinates detected by the gaze coordinate detection means,
and wherein the display desire degree of each video of the main area and the one or more sub areas is calculated based on the detection information detected by the attention video detection means, the movement amount detected by the movement amount detection means, and the satisfaction degree calculated by the satisfaction degree calculation means.

The video display device according to claim 6, wherein the plurality of cameras includes two cameras, and the two cameras are arranged horizontally, one at each horizontal end of the display screen.