JP6550307B2

JP6550307B2 - Image display system and image display method

Info

Publication number: JP6550307B2
Application number: JP2015180009A
Authority: JP
Inventors: 吏中野; 貴司折目; 康夫高橋; 純一暦本; 雄一郎竹内; 潤渡辺; 直紀永井
Original assignee: Sony Corp; Sony Network Communications Inc; Daiwa House Industry Co Ltd
Current assignee: Sony Corp; Sony Network Communications Inc; Daiwa House Industry Co Ltd
Priority date: 2015-09-11
Filing date: 2015-09-11
Publication date: 2019-07-24
Anticipated expiration: 2035-09-11
Also published as: WO2017043661A1; JP2017055354A

Description

The present invention relates to an image display system and an image display method, and more particularly to an image display system and an image display method capable of reducing the transmission load of image data for the frame images that constitute a video of a user.

Image display systems using ICT (information and communication technology) are already known. Such a system is used, for example, when users in separate spaces converse with each other. In such a case, each user can talk with the conversation partner while viewing an image of that partner (more specifically, a video made up of a plurality of frame images) displayed on a display such as a screen. As a result, a user looking at the conversation partner through the display can converse in the same atmosphere (sense of presence) as when actually facing that partner.

On the other hand, the sense of presence of a dialogue improves as the image quality of the conversation partner's image shown on the display increases. However, the higher the image quality, the larger the volume of image data sent from the partner's side, and the greater the load involved in transmitting and receiving that data (the communication load). One conceivable measure against this problem is, when a new image is acquired, to send image data only for the image corresponding to the difference from the previously acquired image (that is, the image of the part that moved) (see Patent Document 1). With such a configuration, the data volume at the time of transmission is reduced while the image receiving apparatus can still display a high-quality image (video).

Patent Document 1: JP 2003-299088 A

Incidentally, when the person shown in a frame image moves, the part that moved (the specified part) is identified and then the image data of that specified part is transmitted. Naturally, the specified part must be identified accurately by an appropriate procedure. In particular, in a dialogue using the above system, whether the specified part is identified appropriately can affect the sense of presence of the dialogue.

The present invention has been made in view of the above problem, and its object is to provide an image display system that can appropriately identify the part of a person shown in a frame image that has moved, while reducing the transmission load of image data. Likewise, another object of the present invention is to provide an image display method that can appropriately identify the part of a person shown in a frame image that has moved, while reducing the transmission load of image data.

According to the image display system of the present invention, the above problem is solved by a system having (A) an imaging device that images a first user; (B) a measurement device that measures measurement target values relating to the positions of the parts of the first user's body; (C) a first computer that acquires the frame images constituting the video of the first user captured by the imaging device; (D) a second computer that communicates with the first computer in order to acquire the frame images; and (E) a display that displays the frame images acquired by the second computer to a second user who is in a different place from the first user, wherein (F) the first computer executes (f1) a process of identifying, among the parts of the body, a specified part that moved during the period from the previous acquisition of the frame image to the current acquisition of the frame image, based on the change in the measurement results of the measurement target values during that period, (f2) a process of extracting, from the person image of the first user in the frame image currently acquired by the first computer, a region including the specified part, and (f3) a process of generating image data of the region and transmitting it to the second computer, and (G) the second computer, on receiving the image data of the region, causes the display to display a frame image composed by superimposing the image of the region at the position corresponding to the region in the frame image that was displayed on the display before the image data was received.

In the image display system of the present invention configured as above, the part of the first user's body that moved during the period from the previous frame image acquisition to the current frame image acquisition (that is, the specified part) is identified based on the change in the measurement results of the measurement target values relating to the positions of the parts of the first user's body. Because the specified part is identified from the change in those measurement results, it can be identified accurately.
Meanwhile, the first computer extracts the region including the specified part from the person image of the first user in the currently acquired frame image, and transmits the image data of that region to the second computer. This reduces the data transmission load compared with transmitting image data of the entire person image of the first user.
As a result, the image display system of the present invention can appropriately identify the part of the body of the person shown in the frame image that has moved (that is, the specified part) while reducing the transmission load of image data.

In a preferable configuration of the image display system of the present invention, in the process of identifying the specified part, the first computer identifies, among a plurality of set sites defined on the skeleton of the first user, the set sites that moved during the period, based on the change in the measurement results of the measurement target values during the period, and identifies the specified part so that it includes at least those set sites.
In this configuration, the specified part can be identified by checking whether each of the set sites defined on the skeleton of the first user has moved. Because identifying the specified part only requires checking each set site for movement, the specified part can be identified more easily.

In a more preferable configuration of the image display system of the present invention, when identifying the set sites that moved during the period in the process of identifying the specified part, the first computer determines, for each set site, whether the displacement of that set site during the period is equal to or greater than a threshold, and identifies each set site whose displacement is equal to or greater than the threshold as a set site that moved during the period.
In this configuration, whether the displacement of a set site is equal to or greater than the threshold is determined for each set site. This per-site determination makes it still easier to establish whether each set site has moved, in other words, to identify the specified part.
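The per-site threshold determination described above can be sketched as follows. This is a minimal illustration only: the joint names, the use of 3D coordinates in metres, the Euclidean distance as the displacement measure, and the threshold value are all assumptions that do not come from the source.

```python
import math

def moved_joints(prev, curr, threshold):
    """Return the names of set sites (joints) whose displacement between
    the previous and current measurement is at least `threshold`.

    `prev` and `curr` map a hypothetical joint name to an (x, y, z) position.
    """
    moved = []
    for name, prev_pos in prev.items():
        # Euclidean distance is an assumed displacement measure.
        displacement = math.dist(prev_pos, curr[name])
        if displacement >= threshold:
            moved.append(name)
    return moved

# Hypothetical measurements at the previous and current frame acquisition.
prev = {"head": (0.0, 1.6, 2.0), "right_hand": (0.4, 1.0, 2.0)}
curr = {"head": (0.0, 1.6, 2.0), "right_hand": (0.4, 1.3, 2.0)}
print(moved_joints(prev, curr, threshold=0.05))
```

Only the hand has moved by more than the assumed threshold here, so it alone would be treated as a set site that moved during the period.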

In a still more preferable configuration of the image display system of the present invention, when performing the determination for each set site, the first computer performs the determination for a given set site and next performs it for the set site located adjacent to that set site, and when extracting the region including the specified part, the first computer extracts the region so that all of the set sites that moved during the period are contained in the region.
In this configuration, the determination for a given set site is followed by the determination for the adjacent set site. When the region including the specified part is extracted, it is extracted so as to include every set site that moved during the period from the previous frame image acquisition to the current one. As a result, the image of the part that moved is appropriately extracted from the person image of the first user. By superimposing the image of the extracted region on the previously displayed image (frame image), the frame image currently acquired by the first computer (strictly, the person image of the first user in that frame image) can be reproduced appropriately.
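One way to picture a region that contains every set site that moved is an axis-aligned rectangle around their pixel positions. The sketch below assumes a rectangular region, a fixed pixel margin, and hypothetical joint names; the source does not state the shape of the region or how it is computed, and the adjacent-site visiting order is not modeled here.

```python
def bounding_region(pixel_pos, moved, margin, width, height):
    """Return (left, top, right, bottom) of the smallest rectangle that
    covers all moved joints, padded by `margin` and clamped to the frame.

    Returns None when no joint moved, i.e. nothing needs to be extracted.
    """
    if not moved:
        return None
    xs = [pixel_pos[j][0] for j in moved]
    ys = [pixel_pos[j][1] for j in moved]
    return (
        max(min(xs) - margin, 0),
        max(min(ys) - margin, 0),
        min(max(xs) + margin, width),
        min(max(ys) + margin, height),
    )

# Hypothetical pixel positions of three joints in a 640x480 frame.
pixel_pos = {"elbow": (100, 200), "hand": (140, 260), "head": (90, 40)}
print(bounding_region(pixel_pos, ["elbow", "hand"], margin=10, width=640, height=480))
```

The margin is there because a joint position is a single point while the limb around it occupies an area; how much padding is appropriate is left open.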

In an even more preferable configuration of the image display system of the present invention, at least one of the plurality of set sites is a site on the body axis of the upper body of the first user, and when the first computer determines, in the determination for the set site on the body axis, that the displacement is equal to or greater than the threshold, the first computer extracts the image of the upper body as the region.
In this configuration, when the displacement of the set site on the body axis is determined to be equal to or greater than the threshold, the image of the upper body is extracted as the region. Performing region extraction in units of the upper-body image in this way makes the region extraction processing simpler to execute.
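The body-axis rule can be sketched as a simple decision on top of the list of moved set sites. The joint names in `BODY_AXIS` and the returned labels are hypothetical; the source states only that movement of a set site on the body axis causes the upper-body image to be extracted as the region.

```python
# Hypothetical names for the set sites lying on the upper-body axis.
BODY_AXIS = {"head", "neck", "spine"}

def region_kind(moved):
    """Decide which kind of region to extract for the moved set sites.

    Returns "upper_body" when any moved site lies on the body axis,
    "parts" when only other sites moved, and None when nothing moved.
    """
    if not moved:
        return None
    if any(joint in BODY_AXIS for joint in moved):
        return "upper_body"
    return "parts"

print(region_kind(["right_hand"]))
print(region_kind(["neck", "right_hand"]))
```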

In a yet more preferable configuration of the image display system of the present invention, the first computer executes a process of generating background image data representing the background image in the frame image separately from the image data other than the background image and transmitting it to the second computer, and the frequency with which the first computer executes the process of transmitting the background image data is lower than the frequency with which the first computer acquires the frame image from the imaging device.
In this configuration, the background image data representing the background image in the frame image is generated separately from the image data other than the background image and transmitted to the second computer. The background image data is transmitted less frequently than the first computer acquires frame images from the imaging device. This reflects the fact that the background image generally changes little, so its image data needs to be transmitted fewer times. Making the transmission frequency of the background image data lower than the acquisition frequency of the frame images, as in this configuration, therefore reduces the data transmission load further.
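As a rough illustration of sending the background less often than frames are acquired, the sketch below schedules a background transmission once every fixed number of frames. The fixed interval and the item labels are assumptions; the source says only that the background transmission frequency is lower than the frame acquisition frequency.

```python
def plan_transmissions(num_frames, background_interval):
    """Return, for each acquired frame, the list of items to transmit.

    The person region is considered for every frame, while the background
    is included only once every `background_interval` frames (assumed policy).
    """
    plan = []
    for index in range(num_frames):
        items = ["person_region"]
        if index % background_interval == 0:
            items.append("background")
        plan.append(items)
    return plan

plan = plan_transmissions(num_frames=5, background_interval=3)
for index, items in enumerate(plan):
    print(index, items)
```

With five frames and an interval of three, the background is scheduled for frames 0 and 3 only.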

In a further preferable configuration of the image display system of the present invention, the system has an information providing device that, in a state where the second user is in front of the display, provides the second computer with information on at least one of the positional relationship between the second user and the display and the posture of the second user; the first computer further executes a process of acquiring the at least one item that the second computer has identified from the information; and in the process of generating the image data of the region, the first computer generates the image data such that, within the image of the region, a second image displayed in a range different from that of a first image is of lower quality than the first image, the first image being the image displayed on the display in a range determined according to the at least one item.
In this configuration, within the image of the extracted region, the image quality is lowered for images other than the image in a predetermined range (for example, the image that lies within the second user's central visual field when shown on the display). This reflects the fact that images outside the central visual field are hard to perceive visually, so even if their quality is relatively low, the effect on the sense of presence felt by the second user is small. Accordingly, this configuration reduces the data transmission load further. The effect becomes more pronounced as the extracted region becomes larger.
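The idea of keeping full quality only inside a viewer-dependent range can be sketched on a small grayscale image. Coarser quantization of pixel values is used here purely as an assumed stand-in for lower quality, and the rectangular `focus_box` is a hypothetical representation of the range determined from the second user's position or posture; the source specifies neither.

```python
def reduce_quality_outside(image, focus_box, step=32):
    """Return a copy of `image` (rows of 0-255 values) in which pixels
    outside `focus_box` are quantized to multiples of `step`.

    `focus_box` is (left, top, right, bottom), right and bottom exclusive.
    """
    left, top, right, bottom = focus_box
    result = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            inside = left <= x < right and top <= y < bottom
            # Pixels inside the focus range keep their original value.
            new_row.append(value if inside else (value // step) * step)
        result.append(new_row)
    return result

image = [[10, 100, 200],
         [45, 77, 130]]
print(reduce_quality_outside(image, focus_box=(1, 0, 2, 2)))
```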

In a particularly preferable configuration of the image display system of the present invention, the system has a distance measuring device that measures the distance between the second user and the display in a state where the second user is in front of the display; the first computer acquires the measurement result of the distance from the second computer; and when the distance is equal to or greater than a preset value, the first computer lowers the image quality of the person image to a predetermined image quality, generates low-quality person image data representing the person image at the lowered quality, and transmits it to the second computer.
In this configuration, when the distance between the second user and the display is equal to or greater than the preset value, the image quality of the person image of the first user is lowered, and data representing the person image at the lowered quality (low-quality person image data) is generated and transmitted to the second computer. This reflects the fact that, once the distance reaches the set value or more, a slight drop in the quality of the image shown on the display has little effect on the sense of presence felt by the second user. Accordingly, this configuration reduces the data transmission load while preserving the sense of presence of the dialogue.
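The distance-based rule reduces to a single comparison, sketched below. The quality values, the unit of metres, and the function name are illustrative assumptions; the source states only that the person image is lowered to a predetermined quality when the distance is equal to or greater than a preset value.

```python
def choose_person_quality(distance_m, preset_m, normal_quality=90, reduced_quality=50):
    """Return the quality setting to use for the person image.

    The reduced setting is chosen when the measured viewer-to-display
    distance is equal to or greater than the preset value.
    """
    return reduced_quality if distance_m >= preset_m else normal_quality

print(choose_person_quality(1.2, preset_m=2.0))
print(choose_person_quality(2.5, preset_m=2.0))
```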

According to the image display method of the present invention, the above problem is also solved by an image display method that uses a first computer, which acquires the frame images constituting the video of a first user captured by an imaging device, and a second computer, which communicates with the first computer in order to acquire the frame images, to display the frame images acquired by the second computer on a display to a second user who is in a different place from the first user, the method comprising: (A) the first computer executing a process of acquiring the measurement results of measurement target values from a measurement device that measures the measurement target values relating to the positions of the parts of the first user's body; (B) the first computer executing a process of identifying, among the parts of the body, a specified part that moved during the period from the previous acquisition of the frame image to the current acquisition of the frame image, based on the change in the measurement results of the measurement target values during that period; (C) the first computer executing a process of extracting, from the person image of the first user in the currently acquired frame image, a region including the specified part; (D) the first computer executing a process of generating image data of the region and transmitting it to the second computer; and (E) the second computer, on receiving the image data, causing the display to display a frame image composed by superimposing the image of the region represented by the image data at the position corresponding to the region in the frame image that was displayed on the display before the image data was received.
According to this method, the part of the first user's body that moved (that is, the specified part) can be appropriately identified within the person image of the first user in the frame image, while the transmission load of image data is reduced.
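The receiving-side step, in which the image of the received region is superimposed on the previously displayed frame, can be sketched as follows. Frames are modeled as plain lists of pixel rows, and the function and parameter names are hypothetical; how the actual system encodes the region position is not stated in the source.

```python
def overlay_region(previous_frame, patch, left, top):
    """Return a new frame in which `patch` replaces the pixels of
    `previous_frame` starting at column `left` and row `top`.

    The previously displayed frame itself is left unmodified.
    """
    frame = [row[:] for row in previous_frame]
    for dy, patch_row in enumerate(patch):
        # Overwrite only the columns covered by the received region.
        frame[top + dy][left:left + len(patch_row)] = patch_row
    return frame

previous_frame = [[0, 0, 0, 0] for _ in range(3)]
patch = [[7, 7],
         [7, 7]]
print(overlay_region(previous_frame, patch, left=1, top=1))
```

Pixels outside the received region are carried over from the previously displayed frame, which is why only the moved part has to be transmitted.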

According to the image display system and the image display method of the present invention, the part of the user's body that has moved can be appropriately identified within the person image of the user in the frame image, while the transmission load of image data is reduced. As a result, image data can be transmitted and received more smoothly while the sense of presence (realism) of a dialogue conducted with the conversation partner's person image shown on the display is preserved.

FIG. 1 is a conceptual diagram of an image display system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the device configuration of a communication unit constituting the image display system.
FIG. 3 is a diagram showing a frame image of the video captured by the imaging device and depth data.
FIG. 4 is a diagram showing the state of the display used in an embodiment of the present invention, in which (A) shows the state when no dialogue is in progress and (B) shows the state during a dialogue.
FIG. 5 is an explanatory diagram of the separation and composition of a background image and a person image.
FIGS. 6(A), 6(B) and 6(C) are explanatory diagrams of the image quality reduction process.
FIGS. 7(A), 7(B), 7(C) and 7(D) are explanatory diagrams of image cropping.
FIG. 8 is an explanatory diagram of the image quality adjustment process.
FIG. 9 is a diagram showing the dialogue communication flow.
FIG. 10 is a diagram showing the flow of the pre-communication process.
FIG. 11 is a diagram showing the flow of the current information notification process.
FIG. 12 is a diagram showing the flow of the image processing and transmission process.
FIG. 13 is a diagram showing the flow of the cropping region selection process.
FIG. 14 is a diagram showing the flow of the cropping region calculation process.
FIG. 15 is a diagram showing the flow of the image quality adjustment process.
FIG. 16 is a diagram showing the flow of the display video reconstruction process.

An embodiment of the present invention (hereinafter, the present embodiment) is described below. The embodiment described below is merely an example to facilitate understanding of the present invention and does not limit the present invention. That is, the present invention may be modified and improved without departing from its gist, and the present invention naturally includes its equivalents.

<< Application of the image display system according to the present embodiment >>
First, the application of the image display system according to the present embodiment (hereinafter, the present system S) is outlined. The present system S is used so that users in places remote from each other can converse while seeing each other. In a dialogue using the present system S (hereinafter, dialogue communication), each user comes to feel as if actually meeting and talking with the conversation partner. In the following description, this visual effect is referred to as the sense of presence (realism).

In the present embodiment, dialogue communication takes place while each user is in a given room (the user's own room) at home. However, this is not a limitation, and dialogue communication using the present system S may take place while a user is somewhere other than home, for example in a meeting place, a commercial facility, a school classroom, a cram school, a public facility such as a hospital, or a company or office. Dialogue communication may also take place between users who are in different rooms of the same building.
As described above, the present system S can be used widely in situations where people in different places converse while seeing each other's faces.

The following description takes as an example a case in which two users, Mr. A and Mr. B, carry out dialogue communication. The description is given from Mr. B's viewpoint (in other words, from the position of the one who sees Mr. A). In this case, Mr. A corresponds to the "first user" and Mr. B corresponds to the "second user". Here, "first user" and "second user" are relative concepts that switch according to who views the image and who is viewed; when Mr. A's viewpoint is taken as the reference, Mr. B corresponds to the "first user" and Mr. A corresponds to the "second user".

To carry out dialogue communication, both Mr. A and Mr. B enter their own rooms. Specifically, a mirror-type display (the display 5 shown in FIG. 2) is installed in each room. To carry out dialogue communication, Mr. A and Mr. B each move to a position in front of the display. If the present system S is running at that time, dialogue communication starts. The timing at which the system is started is not particularly limited and may differ from the above as long as it is suitable.

When dialogue communication starts, an image of Mr. A is shown on the display on Mr. B's side. This image is captured by a camera 2 (corresponding to the imaging device) installed on Mr. A's side; strictly speaking, it is a frame image constituting the video of Mr. A captured by the camera 2. That is, the image shown on the display on Mr. B's side switches at a constant rate (specifically, a rate corresponding to the frame image acquisition rate). As a result, a continuous sequence of images of Mr. A, that is, a video, is shown on the display, and Mr. B comes to feel as if facing Mr. A (sense of presence).

Incidentally, the display on Mr. B's side shows a life-size, full-body image of Mr. A. Specifically, the display is constituted by the mirror-type display 5 as described above and has the same shape and size as a typical full-length mirror, which makes it suitable for showing a full-body video of Mr. A at life size. With this configuration, Mr. B sees a life-size Mr. A on the display and feels as if meeting Mr. A through a pane of glass.

<< Configuration of the image display system according to the present embodiment >>
Next, the specific configuration of the present system S is described. The present system S is made up of information communication units (hereinafter, communication units) provided in both Mr. A's home and Mr. B's home. Specifically, the present system S is made up of a first communication unit 100A used by Mr. A at Mr. A's home and a second communication unit 100B used by Mr. B at Mr. B's home. The configurations of the first communication unit 100A and the second communication unit 100B are described below.

Note that "first communication unit 100A" and "second communication unit 100B" are concepts that follow from the relationship between the first user and the second user described above. When Mr. A is regarded as the first user, the communication unit used by Mr. A is the first communication unit 100A and the communication unit used by Mr. B is the second communication unit 100B. Conversely, when Mr. A is regarded as the second user, the communication unit used by Mr. B is the first communication unit 100A and the communication unit used by Mr. A is the second communication unit 100B.

The first communication unit 100A and the second communication unit 100B have substantially the same mechanical configuration. Specifically, as shown in FIG. 1, each unit is equipped with a home server 1, a camera 2, a microphone 3, an infrared sensor 4, a display 5, and speakers 6. Of these devices, the camera 2, the microphone 3, the infrared sensor 4, the display 5, and the speakers 6 are placed in a room of each user's home (the room the user enters for face-to-face dialogue). FIG. 1 is a conceptual diagram showing the configuration of the system S.

ホームサーバ１は、本システムＳの中枢をなす装置であり、ＣＰＵ、ＲＯＭやＲＡＭ等のメモリ、通信用インタフェース及びハードディスクドライブ等を有するコンピュータである。なお、第一通信ユニット１００Ａが有するホームサーバ１は、第一コンピュータに相当し、第二通信ユニット１００Ｂが有するホームサーバ１は、第二コンピュータに相当する。 The home server 1 is a device that forms the core of the present system S, and is a computer having a CPU, a memory such as a ROM and a RAM, a communication interface, a hard disk drive, and the like. The home server 1 of the first communication unit 100A corresponds to a first computer, and the home server 1 of the second communication unit 100B corresponds to a second computer.

A program for dialogue communication is installed on the home server 1. When the CPU executes this program, the home server 1 provides the dialogue communication function described later. The home servers 1 are communicably connected to each other via an external communication network GN such as the Internet and exchange various data. The data exchanged by the home servers 1 is the data needed for dialogue communication, for example image data of various images and audio data.

カメラ２は、撮像範囲（画角）内にある被写体の映像を撮像する撮像装置であり、本実施形態では公知のネットワークカメラによって構成されている。また、カメラ２は、ユーザ（Ａさん、Ｂさん）がディスプレイ５の前に立っているときに当該ユーザの全身像を撮像する。すなわち、第一通信ユニット１００Ａが有するカメラ２は、Ａさんの部屋内に設置されたディスプレイ５の前にＡさんが立っているとき、Ａさん及びその周辺を撮像する。同様に、第二通信ユニット１００Ｂが有するカメラ２は、Ｂさんの部屋内に設置されたディスプレイ５の前にＢさんが立っているとき、Ｂさん及びその周辺を撮像する。 The camera 2 is an imaging device that captures an image of an object within an imaging range (angle of view), and is configured by a known network camera in the present embodiment. In addition, when the user (Mr. A, Mr. B) stands in front of the display 5, the camera 2 captures a full-length image of the user. That is, the camera 2 included in the first communication unit 100A captures an image of Mr. A and the periphery thereof when Mr. A stands in front of the display 5 installed in Mr. A's room. Similarly, when Mr. B stands in front of the display 5 installed in Mr. B's room, the camera 2 included in the second communication unit 100B images Mr. B and the periphery thereof.

なお、本実施形態では、図２に示すように、カメラ２のレンズがディスプレイ５の表示画面５ａに面している。ここで、表示画面５ａを構成するディスプレイ５の鏡面パネルは、透明なガラスによって構成されている。したがって、カメラ２は、ディスプレイ５の前に立っているユーザを上記の鏡面パネル越しで撮像することになる。図２は、各通信ユニットの機器構成を示す図であり、各機器の配置位置についての説明図である。ただし、カメラ２の配置位置は、図２に図示の位置に限定されるものではなく、ディスプレイ５から離れた位置でもよい。 In the present embodiment, as shown in FIG. 2, the lens of the camera 2 faces the display screen 5 a of the display 5. Here, the mirror surface panel of the display 5 constituting the display screen 5a is made of transparent glass. Therefore, the camera 2 captures an image of the user standing in front of the display 5 through the above-mentioned mirror panel. FIG. 2 is a view showing the device configuration of each communication unit, and an explanatory view of the arrangement position of each device. However, the arrangement position of the camera 2 is not limited to the position illustrated in FIG. 2 and may be a position away from the display 5.

Incidentally, when no user is standing in front of the display 5, the camera 2 captures the interior space of the room in which it is installed (strictly speaking, the range within the angle of view of the camera 2). A frame image of the video captured at this time is used as the "background image".

そして、カメラ２の撮像映像を構成するフレーム画像は、データ化されてホームサーバ１（厳密には、同じ通信ユニットに属するホームサーバ１）に伝送される。 Then, the frame image constituting the captured image of the camera 2 is digitized and transmitted to the home server 1 (strictly speaking, the home server 1 belonging to the same communication unit).

マイク３は、ユーザの話し声等、マイク３が設置された部屋内で発生する音を集音する装置である。そして、マイク３は、集音した音を示す音声信号をホームサーバ１（厳密には、同じ通信ユニットに属するホームサーバ１）に対して出力する。なお、本実施形態では、図２に示すようにディスプレイ５の直上位置にマイクが設置されている。 The microphone 3 is a device for collecting a sound generated in a room in which the microphone 3 is installed, such as a user's speaking voice. Then, the microphone 3 outputs an audio signal indicating the collected sound to the home server 1 (strictly, the home server 1 belonging to the same communication unit). In the present embodiment, as shown in FIG. 2, a microphone is installed at a position immediately above the display 5.

The infrared sensor 4 is a so-called depth sensor that measures the depth of a measurement object by an infrared method. Specifically, the infrared sensor 4 emits infrared light from a light emitting unit 4a toward the measurement object and measures the depth by receiving the reflected light at a light receiving unit 4b. Here, "depth" is the distance from a reference position to the measurement object (that is, the depth distance). In the present embodiment, the position of the display screen 5a (front surface) of the display 5 is set as the reference position. In other words, the infrared sensor 4 measures, as the depth, the distance between the measurement object and the display screen 5a along the normal direction of the display screen 5a. The reference position is not limited to this position, however, and may be set to any position.

A depth measurement is obtained for each pixel when a frame image of the video captured by the camera 2 is divided into a predetermined number of pixels. Collecting the per-pixel depth measurements for each frame image yields the depth data shown in FIG. 3. The depth data indicates the depth measurement for each pixel of the frame image and, as shown in FIG. 3, is bitmap data obtained by setting the color and shade of each pixel according to its depth measurement. FIG. 3 is a diagram showing a frame image and the depth data for that frame image.

To describe the depth data in more detail, depth data is acquired for each of the frame images that make up the video captured by the camera 2. As shown in FIG. 3, the depth measurements naturally differ between pixels belonging to the image of a subject located toward the back of the frame image (the black pixels in the figure) and pixels belonging to the image of a subject located toward the front (the white pixels in the figure). This property makes it possible to distinguish and separate, among the pixels constituting the depth data, those belonging to the background image from those belonging to the person image.
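
As a rough illustration of how a depth difference can separate person pixels from background pixels, the sketch below thresholds a per-pixel depth map. The function name `split_by_depth`, the threshold value, and the list-of-lists depth format are assumptions made for illustration; the description does not specify a particular algorithm.

```python
def split_by_depth(depth_map, threshold_m):
    """Return a mask that is True where a pixel is nearer than threshold_m.

    Nearer pixels are treated as candidates for the person image,
    farther pixels as background (an illustrative assumption).
    """
    return [[depth < threshold_m for depth in row] for row in depth_map]


# Hypothetical 3x4 depth map in metres, measured from the display screen 5a.
depth_map = [
    [3.0, 3.0, 3.0, 3.0],
    [3.0, 1.2, 1.1, 3.0],
    [3.0, 1.2, 1.2, 3.0],
]
mask = split_by_depth(depth_map, threshold_m=2.0)
```

A fixed threshold is the simplest possible choice; a real system would likely need something more robust against furniture standing near the user.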

The infrared sensor 4 described above is installed in both Mr. A's room and Mr. B's room. When Mr. A stands in front of the display 5 installed in his room, the infrared sensor 4 of the first communication unit 100A measures the depth of each part of Mr. A's body. That is, the infrared sensor 4 of the first communication unit 100A corresponds to a measurement device that measures depth as the measurement target value for the position of each part of Mr. A's body.

同様に、Ｂさんの部屋に設置されたディスプレイ５の前にＢさんが立つと、第二通信ユニット１００Ｂの赤外線センサ４がＢさんの身体各部について深度を計測するようになる。すなわち、第二通信ユニット１００Ｂの赤外線センサ４は、ディスプレイ５の前にＢさんが居る状態で深度、換言すると、Ｂさんとディスプレイ５との間の距離を計測する距離計測装置に相当する。 Similarly, when Mr. B stands in front of the display 5 installed in Mr. B's room, the infrared sensor 4 of the second communication unit 100 B measures the depth of each part of Mr. B's body. That is, the infrared sensor 4 of the second communication unit 100B corresponds to a distance measurement device that measures the depth in a state where Mr. B is in front of the display 5, in other words, the distance between Mr. B and the display 5.

The device (measurement device) that measures the measurement target value for the position of each body part is not limited to the infrared sensor 4. For example, it may be a sensor worn by the user that directly measures the position of each body part (a motion-capture sensor). Likewise, the method of measuring the distance to the display 5 is not limited to using the infrared sensor 4. For example, the user's standing position may be detected by a sensor or the like and the distance to the display 5 determined from the detection result. Alternatively, the distance may be determined by analyzing the video captured by the camera 2.

The speaker 6 is a device that emits the sound (reproduced sound) obtained by expanding the audio data received by the home server 1. Specifically, when the home server 1 of the first communication unit 100A receives audio data from the home server 1 of the second communication unit 100B, it expands the audio data and causes the speakers 6 to reproduce the sound collected in Mr. B's room. Conversely, when the home server 1 of the second communication unit 100B receives audio data from the home server 1 of the first communication unit 100A, it expands the audio data and causes the speakers 6 to reproduce the sound collected in Mr. A's room. In the present embodiment, as shown in FIG. 2, a plurality of speakers 6 (four in FIG. 2) are installed at positions on either side of the display 5 in its width direction.

ディスプレイ５は、ホームサーバ１が取得したフレーム画像を表示画面５ａにて画像を表示する表示器である。より具体的に説明すると、第一通信ユニット１００Ａが有するディスプレイ５は、第一通信ユニット１００Ａのホームサーバ１が取得したフレーム画像をＡさんに対して表示する。他方、第二通信ユニット１００Ｂが有するディスプレイ５は、第二通信ユニット１００Ｂのホームサーバ１が取得したフレーム画像をＢさんに対して表示する。 The display 5 is a display for displaying the frame image acquired by the home server 1 on the display screen 5a. More specifically, the display 5 of the first communication unit 100A displays the frame image acquired by the home server 1 of the first communication unit 100A to Mr. A. On the other hand, the display 5 of the second communication unit 100B displays the frame image acquired by the home server 1 of the second communication unit 100B to Mr. B.

As described above, the display 5 according to the present embodiment is a mirror-type display. In normal times it functions as a piece of furniture placed in the room, specifically as a full-length mirror, as shown in FIG. 4(A). That is, during non-dialogue periods (when dialogue communication is not taking place), no frame image is displayed on the display screen 5a of the display 5, so the display screen 5a functions as a mirror surface. During a dialogue (when dialogue communication is taking place), a frame image is displayed (reproduced) on the display screen 5a, as shown in FIG. 4(B). FIGS. 4(A) and 4(B) show a configuration example of the display 5 according to the present embodiment, with (A) showing the non-dialogue state and (B) showing the dialogue state.

As described above, the display 5 according to the present embodiment is used as a full-length mirror during non-dialogue periods and displays frame images on the display screen 5a during a dialogue. As a result, the presence of the display screen 5a is hard to notice when no dialogue is taking place, while during a dialogue the user experiences the visual effect of facing the dialogue partner through a pane of glass.

For a configuration that serves as both an image display and a full-length mirror, a known configuration can be used, such as the one described in International Publication No. 2009/122716. The display 5 is also not limited to a configuration that doubles as a full-length mirror. Any device large enough to display a whole-body image of the dialogue partner may be used as the display 5. From the viewpoint of making the display screen 5a hard to notice during non-dialogue periods, other furniture or building materials installed in the room that have a mirror surface portion are suitable; for example, a door (glass door) or a window (glass window) may be used as the display 5. The display 5 is furthermore not limited to one that doubles as furniture or a building material, and may be an ordinary display that always presents the display screen 5a while it is running.

<< Functions of the Home Server >>
Next, the dialogue communication function of the home server 1 of each communication unit will be described. Only the functions related to image display are described below; functions related to audio reproduction and the like are omitted. To keep the description easy to follow, the case in which an image delivered from Mr. A's side (the first communication unit 100A) is displayed on Mr. B's side (the second communication unit 100B) is used as the example. Note that the description below also holds when the viewpoint is reversed. That is, the functions described for the home server 1 of the first communication unit 100A are also provided in the home server 1 of the second communication unit 100B, and the functions described for the home server 1 of the second communication unit 100B are also provided in the home server 1 of the first communication unit 100A.

The home server 1 of the first communication unit 100A functions as the server on the image distribution side and specifically has the following functions (1) to (5).
(1) Frame image acquisition function
(2) Skeletal model identification function
(3) Current information identification and notification function
(4) Counterpart visual field estimation function
(5) Image processing and transmission function

The home server 1 of the second communication unit 100B functions as the server on the image display side and specifically has the following function (6).
(6) Display image reconstruction function
Each function is described in detail below.

(Frame image acquisition function)
The home server 1 of the first communication unit 100A acquires frame images captured by the camera 2 of the same unit at intervals corresponding to the camera's frame rate. More specifically, when Mr. A is in front of the display 5 in the room (strictly speaking, the room he enters for dialogue communication), the camera 2 captures Mr. A and his background. The home server 1 therefore acquires a frame image containing Mr. A's person image and the background image. When Mr. A is not in the room, on the other hand, the home server 1 acquires a frame image consisting only of the background image (an image of the interior space of the room).

When acquiring a frame image, the home server 1 of the first communication unit 100A also acquires the depth data for that frame image. As described above, the depth data for a frame image indicates the depth measurement for each pixel when the frame image is divided into predetermined pixels, and specifically consists of the bitmap data shown in FIG. 3.

(Skeletal model identification function)
As described above, each time the home server 1 of the first communication unit 100A acquires a frame image, it acquires the depth data for that frame image. The home server 1 then identifies Mr. A's skeletal model based on the frame image (strictly speaking, Mr. A's person image in the frame image) and the depth data for that frame image. Specifically, in the depth data for a frame image containing Mr. A's person image, the depth of pixels belonging to the person image (the white pixels in FIG. 3) clearly differs from that of pixels belonging to the rest of the image (the black pixels and the hatched pixels in FIG. 3), as shown in FIG. 3. Using this characteristic, the home server 1 extracts the pixels belonging to the person image from the depth data, and then identifies Mr. A's skeletal model from the extracted pixels.

As shown in FIG. 3, the skeletal model is a simplified model of position information on the human skeleton, in particular the head, shoulders, elbows, hands, legs, waist, hip joints, knees, and feet. The parts set in the skeletal model correspond to the "set parts" of the present invention. The set parts include parts on the body axis of the first user's upper body, specifically the head and the waist. A known method can be used to identify the skeletal model (for example, the methods described in Japanese Unexamined Patent Application Publication Nos. 2014-155693 and 2013-116311).

The home server 1 of the first communication unit 100A identifies the skeletal model each time it acquires depth data, in other words, each time it acquires a frame image. This makes it possible to detect changes in the position of each part of Mr. A's body as represented by the skeletal model, and more specifically to detect the presence or absence of movement (displacement) for each of the plurality of set parts set in the skeletal model.
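
The per-frame comparison described above could be sketched as follows. The part names, the 3D coordinate tuples, and the tolerance value are hypothetical; the description only states that movement of each set part can be detected, not how.

```python
import math


def moved_parts(prev_model, curr_model, tolerance=0.02):
    """Return the names of set parts displaced by more than `tolerance`.

    Each model maps a part name to an (x, y, z) position in metres.
    A part missing from the previous model is treated as moved.
    """
    moved = []
    for name, curr_pos in curr_model.items():
        prev_pos = prev_model.get(name)
        if prev_pos is None or math.dist(prev_pos, curr_pos) > tolerance:
            moved.append(name)
    return moved


# Hypothetical skeletal models for two consecutive frame images.
prev_model = {"head": (0.0, 1.6, 1.2), "right_hand": (0.3, 1.0, 1.1)}
curr_model = {"head": (0.0, 1.6, 1.2), "right_hand": (0.3, 1.3, 1.0)}
result = moved_parts(prev_model, curr_model)
```

The tolerance is there to absorb sensor noise, so that small jitter in the depth measurement is not reported as movement.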

As shown in FIG. 3, the home server 1 of the first communication unit 100A can also extract the person image from a given frame image based on the skeletal model identified from the depth data for that frame image. This specification does not describe the extraction method in detail, but the rough procedure is as follows. First, the group of pixels belonging to the person image is identified in the depth data based on the identified skeletal model. Then, the region corresponding to the identified pixel group is extracted from the frame image. The image extracted by this procedure is the person image in the frame image.
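
The second step of this rough procedure, extracting from the frame image the region that corresponds to the identified pixel group, might look like the sketch below. It assumes that the pixel group is already available as a boolean mask aligned with the frame image; the names and the use of `None` for excluded pixels are illustrative only.

```python
def extract_person(frame, person_mask, fill=None):
    """Keep frame pixels where person_mask is True and replace the rest with `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, person_mask)
    ]


# Hypothetical 2x3 grayscale frame image and matching person mask.
frame = [[10, 20, 30], [40, 50, 60]]
person_mask = [[False, True, False], [False, True, True]]
person_image = extract_person(frame, person_mask)
```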

(Current information identification and notification function)
During dialogue communication, the home server 1 of the first communication unit 100A identifies information about Mr. A's current state (hereinafter, current information) and transmits it to the home server 1 of the second communication unit 100B. Here, "current information" means content relating to at least one of the positional relationship between the display 5 and Mr. A when he is in front of the display 5, and Mr. A's posture. In the present embodiment, it consists of the distance (depth distance) between Mr. A and the display 5, Mr. A's height, and the orientation of Mr. A's face. The content identified as current information is not limited to the above and may include other information, such as the direction of Mr. A's line of sight or the position of his face (in both the vertical and horizontal directions).

To describe how each item of current information is identified: the distance between Mr. A and the display 5 can be identified from the depth measured by the infrared sensor 4 while Mr. A is standing in front of the display 5, that is, from the depth data. In other words, the home server 1 of the first communication unit 100A identifies the distance between Mr. A and the display 5 based on the measurement result of the infrared sensor 4. Put differently, the infrared sensor 4 can be regarded as an information providing device that provides the home server 1 with depth measurements as information on the distance between Mr. A and the display 5.

Ａさんの身長については、上記の方法により特定したＡさんとディスプレイ５との間の距離と、深度データから特定した骨格モデルと、に基づいて特定することが可能である。より具体的に説明すると、第一通信ユニット１００Ａのホームサーバ１は、骨格モデル上でのＡさんの身長（以下、モデル上の身長）を割り出す。また、ホームサーバ１は、Ａさんとディスプレイ５との間の距離から、実際のＡさんの身長に対するモデル上の身長の比率を算出する。そして、ホームサーバ１は、割り出したモデル上の身長、及び、算出した比率に基づいてＡさんの身長（実際の身長）を特定する。 The height of Mr. A can be identified based on the distance between Mr. A and the display 5 identified by the above method and the skeletal model identified from the depth data. More specifically, the home server 1 of the first communication unit 100A determines the height of Mr. A on the skeletal model (hereinafter, the height on the model). Further, the home server 1 calculates the ratio of the height on the model to the actual height of Mr. A from the distance between Mr. A and the display 5. Then, the home server 1 specifies the height (actual height) of Mr. A based on the calculated height on the model and the calculated ratio.
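
One way to read this calculation is sketched below, assuming a simple pinhole-camera model in which the ratio of on-model height (in pixels) to actual height (in metres) equals the focal length divided by the distance. The pinhole model, the focal length parameter, and the treatment of the distance to the display 5 as the distance to the camera are all assumptions; the description does not state how the ratio is derived from the distance.

```python
def estimate_height(model_height_px, distance_m, focal_length_px):
    """Estimate the actual height in metres from the height on the model in pixels."""
    # Ratio of on-model height to actual height (pixels per metre),
    # under the assumed pinhole-camera model.
    ratio_px_per_m = focal_length_px / distance_m
    return model_height_px / ratio_px_per_m


# Hypothetical values: 850 px tall on the model, 2.0 m away, 1000 px focal length.
height_m = estimate_height(model_height_px=850.0, distance_m=2.0, focal_length_px=1000.0)
```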

Ａさんの顔の向きは、Ａさんがディスプレイ５の前に立っている状態でカメラ２が撮像した際のフレーム画像から特定することが可能である。より具体的に説明すると、第一通信ユニット１００Ａのホームサーバ１は、上記のフレーム画像に対して公知の画像解析処理を適用し、Ａさんの顔の向きを特定する。換言すると、カメラ２は、Ａさんの姿勢（顔の向き）に関する情報として、Ａさんの人物画像を含むフレーム画像をホームサーバ１に提供する情報提供装置に該当すると言える。 The direction of the face of Mr. A can be specified from a frame image when the camera 2 captures an image while Mr. A is standing in front of the display 5. More specifically, the home server 1 of the first communication unit 100A applies known image analysis processing to the above-described frame image to specify the direction of Mr. A's face. In other words, it can be said that the camera 2 corresponds to an information providing device that provides the home server 1 with a frame image including a person's image of the person A as information on the posture (the direction of the face) of the person A.

After identifying the three items of current information above, the home server 1 of the first communication unit 100A notifies the home server 1 of the second communication unit 100B of them. The home server 1 of the second communication unit 100B identifies and reports current information in the same way. That is, while Mr. B is in front of the display 5, the home server 1 of the second communication unit 100B identifies the distance between Mr. B and the display 5, Mr. B's height, and the orientation of Mr. B's face, and notifies the home server 1 of the first communication unit 100A of them. The infrared sensor 4 of the second communication unit 100B serves as an information providing device that provides the home server 1 with information on the distance between Mr. B and the display 5, more specifically the depth measurements. Likewise, the camera 2 of the second communication unit 100B serves as an information providing device that provides the home server 1 with information on Mr. B's posture (face orientation), more specifically a frame image containing Mr. B's person image.
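
For illustration, the three items of current information exchanged between the home servers could be carried in a small message such as the one below. The field names, the JSON encoding, and the representation of face orientation as a single yaw angle are assumptions; the description does not define a message format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CurrentInfo:
    distance_m: float    # distance between the user and the display 5
    height_m: float      # the user's height
    face_yaw_deg: float  # face orientation (a single yaw angle is an assumption)


def encode(info):
    """Serialize current information for notification to the other home server."""
    return json.dumps(asdict(info))


def decode(payload):
    """Rebuild current information from a received payload."""
    return CurrentInfo(**json.loads(payload))


info = CurrentInfo(distance_m=1.5, height_m=1.7, face_yaw_deg=-10.0)
roundtrip = decode(encode(info))
```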

The home server 1 of the first communication unit 100A thus acquires Mr. B's current information (that is, the content identified by the home server 1 of the second communication unit 100B based on the information provided by the infrared sensor 4 and the camera 2) when the home server 1 of the second communication unit 100B reports it.

(Counterpart visual field estimation function)
Based on the acquired current information of Mr. B, the home server 1 of the first communication unit 100A estimates the region corresponding to Mr. B's visual field, more specifically the range corresponding to his central visual field. In more detail, the home server 1 determines the height of Mr. B's eyes (eye height) and the direction of his gaze (gaze direction) from the information on his height and face orientation. The home server 1 then identifies the range that spreads by a predetermined angle (viewing angle) around a virtual line extending from the eye height in the gaze direction. This range corresponds to Mr. B's central visual field (hereinafter simply referred to as the central visual field area).

第一通信ユニット１００Ａのホームサーバ１は、上記の方法によりＢさんの中心視野領域を推定した後、その推定結果を示す位置を記憶する。ここで、「推定結果を示す位置」とは、第二通信ユニット１００Ｂが有するディスプレイ５の表示画面５ａに対するＢさんの中心視野領域の相対位置のことである。 The home server 1 of the first communication unit 100A estimates the central visual field area of Mr. B by the above method, and then stores the position indicating the estimation result. Here, the "position indicating the estimation result" is the relative position of the central visual field of Mr. B with respect to the display screen 5a of the display 5 of the second communication unit 100B.
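
A minimal geometric sketch of such an estimate is given below. It projects a gaze line and a fixed half-angle onto the plane of the display screen 5a, with the user assumed to stand at horizontal position 0. The eye offset below the top of the head, the 15 degree half-angle, and the yaw and pitch parameters are hypothetical values chosen for illustration, not figures from the description.

```python
import math


def central_field_on_screen(height_m, distance_m, yaw_deg, pitch_deg,
                            half_angle_deg=15.0, eye_offset_m=0.12):
    """Return the extent of the central visual field on the screen plane.

    Coordinates are in metres: x is horizontal (0 directly in front of the
    user), y is height above the floor.
    """
    eye_height = height_m - eye_offset_m  # assumed eye position below head top

    def hit(angle_deg):
        # Offset on the screen plane of a ray leaving the eye at this angle.
        return distance_m * math.tan(math.radians(angle_deg))

    return {
        "x_min": hit(yaw_deg - half_angle_deg),
        "x_max": hit(yaw_deg + half_angle_deg),
        "y_min": eye_height + hit(pitch_deg - half_angle_deg),
        "y_max": eye_height + hit(pitch_deg + half_angle_deg),
    }


# Hypothetical user: 1.7 m tall, 1.0 m from the screen, looking straight ahead.
region = central_field_on_screen(1.7, 1.0, yaw_deg=0.0, pitch_deg=0.0)
```

The returned extents are relative to the screen plane, which matches the idea of storing the central visual field area as a position relative to the display screen 5a.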

As described above, in the present embodiment the central visual field area of the dialogue partner can be estimated appropriately based on the partner's height and face orientation. The method of estimating the central visual field area is not limited to the one above; any other method suitable for estimating the central visual field area may be adopted.

(Image processing and transmission function)
The home server 1 of the first communication unit 100A transmits image data to the home server 1 of the second communication unit 100B so that a frame image containing Mr. A's person image is displayed on the display 5 of the second communication unit 100B. As for the image data to be transmitted, high-quality image data is transmitted in principle in order to preserve the sense of presence in dialogue communication. However, the higher the image quality, the greater the transmission load during data transmission (hereinafter, data transmission load). To reduce the data transmission load, the home server 1 of the first communication unit 100A therefore applies predetermined processing to the frame image acquired from the camera 2 and transmits the data of the processed image (image data).

The processing for reducing the data transmission load is described below with reference to FIGS. 5 to 8. FIG. 5 is an explanatory diagram of the process of separating a frame image into a background image and a person image. FIGS. 6(A), 6(B) and 6(C) are explanatory diagrams of the image quality reduction process: (A) shows the positional relationship between Mr. B and the display 5, (B) shows the image displayed on the display 5 when Mr. B is close to the display 5, and (C) shows the image displayed on the display 5 when Mr. B is far from the display 5. FIGS. 7(A), 7(B), 7(C) and 7(D) are explanatory diagrams of cutting out a selected image from a frame image: (A) compares the previous frame image with the current frame image, (B) compares the previous skeleton model with the current skeleton model, (C) shows the image cut out of the current frame image as the transmission target, and (D) shows the procedure for reconstructing the display image using the cutout image. FIG. 8 is an explanatory diagram of the image quality adjustment process.

First, the image separation process is described with reference to FIG. 5. When dialogue communication starts, the home server 1 of the first communication unit 100A acquires the frame images (captured images) sent sequentially from the camera 2. When an acquired frame image contains the person image of Mr. A and its background image, the home server 1 extracts the person image from the frame image as shown in FIG. 5, thereby separating the person image from the background image. The home server 1 then transmits only the image data of the person image.

The image data of the background image, on the other hand, is generated separately from the other image data and transmitted to the home server 1 of the second communication unit 100B. In the present embodiment, the transmission of background image data is executed less frequently than the home server 1 of the first communication unit 100A acquires frame images from the camera 2.

More specifically, the home server 1 of the first communication unit 100A acquires from the camera 2 a frame image consisting only of the background image, either immediately after the start of dialogue communication or in the pre-communication processing described later. After acquiring this frame image, the home server 1 transmits its image data as the image data of the background image. From then until the dialogue communication ends, the home server 1 does not transmit background image data again. Transmission of background image data is limited to the start of dialogue communication and similar moments because the background image generally changes little.

Once the home server 1 has transmitted the background image data one time at the start of dialogue communication, it thereafter transmits only the image data of the person image in each frame image and does not transmit background image data. This reduces the data transmission load compared with transmitting the image data of the entire frame image (that is, the image data of both the person image and the background image).
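
The send-once behavior for the background can be illustrated with a minimal sketch. It assumes that frames are 2D lists of pixel values and that a person mask (a 2D list of booleans) is already available from some segmentation step; the class name `PersonOnlySender`, the packet layout, and the mask are hypothetical and not taken from the source.

```python
class PersonOnlySender:
    """Sends the background once, then only person pixels (illustrative sketch)."""

    def __init__(self):
        self.background_sent = False

    def send_background(self, background_frame):
        """Return the one-time background packet, or None if it was already sent."""
        if self.background_sent:
            return None
        self.background_sent = True
        return {"kind": "background",
                "pixels": [row[:] for row in background_frame]}

    def send_person(self, frame, person_mask):
        """Return a packet holding only the pixels flagged as belonging to the person."""
        pixels = {
            (y, x): frame[y][x]
            for y, mask_row in enumerate(person_mask)
            for x, is_person in enumerate(mask_row)
            if is_person
        }
        return {"kind": "person", "pixels": pixels}
```

In this sketch the saving comes from the person packet carrying only the masked pixels, while the full background is paid for a single time.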

The separated background image and person image are recombined by the home server 1 of the second communication unit 100B. More specifically, the home server 1 of the second communication unit 100B receives and expands both the background image data that the home server 1 of the first communication unit 100A transmitted at the time of dialogue communication or the like and the person image data transmitted thereafter, and constructs an image that combines the two images (composite image). This composite image substantially matches the frame image as it was when the home server 1 of the first communication unit 100A acquired it from the camera 2, that is, the frame image before it was separated into the person image and the background image.

By combining the background image and the person image in this way, the home server 1 of the second communication unit 100B obtains a new frame image. The newly obtained frame image is then displayed on the display 5 as the current display image.
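
On the receiving side, the recombination can be sketched as overlaying the received person pixels on the stored background. The data layout (a 2D list for the background, a dictionary mapping `(row, column)` to a pixel value for the person image) and the function name `composite` are assumptions made for illustration only.

```python
def composite(background, person_pixels):
    """Overlay person pixels on a copy of the background and return the new frame."""
    frame = [row[:] for row in background]  # keep the stored background untouched
    for (y, x), value in person_pixels.items():
        frame[y][x] = value
    return frame
```

Because the background is copied rather than modified, the same stored background can be reused for every subsequent person image.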

Next, the image quality reduction process is described with reference to FIGS. 6(A), 6(B) and 6(C). As described above, the home server 1 of the first communication unit 100A extracts the person image of Mr. A from the frame image acquired from the camera 2 and transmits the data of that person image. Meanwhile, the home server 1 of the first communication unit 100A acquires, as current information of Mr. B, the distance between Mr. B and the display 5 from the home server 1 of the second communication unit 100B.

When the distance between Mr. B and the display 5 is less than a threshold (for example, the distance indicated by symbol d1 in FIG. 6(A)), the home server 1 of the first communication unit 100A generates image data that displays the extracted person image at its original image quality and transmits that image data to the home server 1 of the second communication unit 100B. Here, the threshold is the reference value used to decide whether to execute the image quality reduction process, and is a value set in advance for the above distance. The specific value of the threshold is not particularly limited, but it is desirably set to a value suitable for deciding whether to execute the image quality reduction process.

On the other hand, when the distance between Mr. B and the display 5 is equal to or greater than the threshold (for example, the distance indicated by symbol d2 in FIG. 6(A)), the home server 1 of the first communication unit 100A executes the image quality reduction process on the extracted person image. In this process, the image quality of the extracted person image is lowered to a predetermined image quality, and image data representing the person image at the lowered quality (hereinafter, low-quality person image data) is generated. Here, "lowering the image quality" means lowering the resolution. The "predetermined image quality" is set at least lower than the image quality of the frame image as acquired from the camera 2 by the home server 1 of the first communication unit 100A, that is, the image quality of the original image, and is desirably set to a level that does not impair the sense of presence of the dialogue communication.

After being generated, the low-quality person image data is transmitted to the home server 1 of the second communication unit 100B. The data transmission load at this time is reduced by the amount by which the image quality was lowered.
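
The threshold decision and the resolution reduction can be sketched as below. The threshold of 2.0 m, the reduction factor of 2, and the simple pixel-skipping downsampling are all assumptions chosen for illustration; the source only states that the resolution is lowered when the distance is at or above a preset threshold.

```python
def downsample(image, factor):
    """Lower the resolution by keeping every `factor`-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]


def prepare_person_image(person_image, viewer_distance_m,
                         threshold_m=2.0, factor=2):
    """Return the image unchanged for a near viewer, downsampled for a far viewer."""
    if viewer_distance_m < threshold_m:
        return person_image  # distance below threshold: keep original quality
    return downsample(person_image, factor)  # at or above threshold: reduce quality
```

With a factor of 2, the number of transmitted pixels falls to roughly a quarter, which is where the load reduction in this sketch comes from.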

As described above, the image quality of the person image delivered by the home server 1 of the first communication unit 100A differs depending on whether the distance between Mr. B and the display 5 is at or above the threshold or below it. Consequently, the image quality of the person image within the frame image displayed on the display 5 of the second communication unit 100B (that is, the composite of the person image and the background image) also changes with that distance. Specifically, when the distance between Mr. B and the display 5 is less than the threshold, as shown in FIG. 6(B), the person image in the image displayed on the display 5 has substantially the same image quality as the person image in the frame image (original image) that the home server 1 of the first communication unit 100A acquired from the camera 2.

On the other hand, when the distance between Mr. B and the display 5 is equal to or greater than the threshold, as shown in FIG. 6(C), the person image in the image displayed on the display 5 has somewhat lower image quality (lower resolution) than the person image in the frame image that the home server 1 of the first communication unit 100A acquired from the camera 2. In this case, however, even though the image quality of the person image in the display image is lowered, Mr. B, who is looking at the display 5, is far from the display 5 and therefore hardly notices anything unnatural about the lowered quality. In other words, if the distance is equal to or greater than the threshold, the sense of presence (realism) of the dialogue communication is not impaired even when the image quality reduction process is applied to the person image and the low-quality person image data is transmitted to the home server 1 of the second communication unit 100B. This makes it possible to reduce the data transmission load by the amount of the quality reduction while still preserving the sense of presence of the dialogue communication.

Next, the cutting out of an image is described with reference to FIGS. 7(A), 7(B), 7(C) and 7(D). As described above, the home server 1 of the first communication unit 100A extracts the person image of Mr. A from the frame image acquired from the camera 2. The home server 1 then generates image data of the extracted person image. When the distance between Mr. B and the display 5 is less than the threshold, the image data of the person image is generated, as described above, so as to have the same image quality as the original image. Because such image data has higher image quality, it imposes a correspondingly larger data transmission load.

On the other hand, as shown in FIG. 7(A), when two consecutively acquired frame images (the previous frame image and the current frame image) are compared, the person image contains parts that differ between the frame images and parts that are common to both. That is, the person image in the currently acquired frame image contains parts that have moved since the previously acquired frame image and parts that have not moved.

The home server 1 of the first communication unit 100A cuts out, from the person image in the currently acquired frame image, the image of the part that moved, generates image data of the cutout image, and transmits it to the home server 1 of the second communication unit 100B. Here, the "image of the part that moved" is the image of those parts of Mr. A's body that moved during the period from the acquisition of the previous frame image to the acquisition of the current frame image.

As described above, in the present embodiment, the image data of the part that moved, out of the person image in the currently acquired frame image, is transmitted to the home server 1 of the second communication unit 100B. The image data of the transmitted person image can thus be reduced by the amount corresponding to the parts of the person image that did not move. As a result, the data transmission load when transmitting the image data of the person image can be reduced further.

To generate the image data of the part that moved, it is necessary to identify which parts of Mr. A's body moved during the period from the acquisition of the previous frame image to the acquisition of the current frame image (hereinafter, the identified part). In the present embodiment, the identified part is identified based on the change, during that period, in the measurement result of the infrared sensor 4 of the first communication unit 100A.

More specifically, as shown in FIG. 7(B), a skeleton model is determined from each of the depth data for the previously acquired frame image and the depth data for the currently acquired frame image. The identified part is then identified by comparing the two skeleton models. In the case illustrated in FIG. 7(B), the hand and the elbow are identified as the identified part. The specific procedure for identifying the identified part is described later.

As described above, in the present embodiment, when identifying the identified part (that is, the part of Mr. A's body that moved) within the person image of Mr. A in the frame image, the two skeleton models are compared and the identified part is determined from the differences (changes) between them. As a result, the identified part is identified appropriately and accurately.
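
The comparison of two skeleton models can be sketched as a per-joint displacement check. The representation of a skeleton model as a dictionary from joint name to a 3D position, the tolerance of 0.02 m, and the function name `moved_parts` are assumptions for illustration; the source defers the actual procedure to a later section.

```python
import math


def moved_parts(previous_model, current_model, tolerance_m=0.02):
    """Return the names of joints whose position changed by more than the tolerance."""
    moved = []
    for name, position in current_model.items():
        if name not in previous_model:
            continue  # joint not tracked in the previous model; skip it in this sketch
        if math.dist(previous_model[name], position) > tolerance_m:
            moved.append(name)
    return moved
```

For example, if only the hand and elbow positions differ between the two models, the function returns those two joint names, mirroring the case of FIG. 7(B).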

After identifying the identified part, the home server 1 of the first communication unit 100A extracts, from the person image of Mr. A in the currently acquired frame image, a region that includes the identified part (hereinafter also called the cutout region or cutout image). Specifically, the home server 1 extracts the cutout region so that it includes the set body parts that moved during the period from the acquisition of the previous frame image to the acquisition of the current frame image. Taking the case of FIG. 7(B) as an example, when the hand and the elbow are identified as the identified part, the home server 1 extracts, as shown in FIG. 7(C), the image of the range from the hand to the elbow (that is, the hand and forearm) within the person image of Mr. A as the cutout region.

In the present embodiment, in addition to the region extracted by the above procedure, the home server 1 of the first communication unit 100A also extracts a region including Mr. A's entire face (that is, the head image) as a cutout region. This reflects the fact that Mr. A's facial expression and mouth movements change readily during dialogue communication.

Once region extraction (selection of the cutout region) has been performed as described above, the home server 1 of the first communication unit 100A generates image data of the extracted region and transmits it to the home server 1 of the second communication unit 100B. The image data of the cutout region incorporates display position data indicating the display position of the region (strictly, its position relative to the frame image).
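
One way to picture a cutout region together with its display position data is a bounding box around the moved points. The sketch below assumes the moved parts are already available as `(row, column)` pixel coordinates and uses a rectangular region with a fixed margin; the rectangle, the margin, the payload keys, and the name `cutout_payload` are all hypothetical choices, not details given in the source.

```python
def cutout_payload(frame, moved_points, margin=1):
    """Cut a rectangle around the moved points and attach its position in the frame."""
    rows = [y for y, _ in moved_points]
    cols = [x for _, x in moved_points]
    # Clamp the box to the frame so the margin never reaches outside the image.
    top = max(min(rows) - margin, 0)
    left = max(min(cols) - margin, 0)
    bottom = min(max(rows) + margin + 1, len(frame))
    right = min(max(cols) + margin + 1, len(frame[0]))
    pixels = [row[left:right] for row in frame[top:bottom]]
    # "top" and "left" play the role of the display position data.
    return {"top": top, "left": left, "pixels": pixels}
```

A second payload of the same shape could be built for the head region, which is always included according to the description above.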

When the home server 1 of the second communication unit 100B receives the image data of the cutout region, it obtains the frame image to be displayed this time by combining the image obtained by expanding that image data (that is, the cutout image) with the previously displayed frame image. Here, the "previously displayed frame image" is the frame image (display image) that was displayed on the display 5 immediately before the image data of the cutout region was received.

In more detail, the home server 1 of the second communication unit 100B analyzes the display position data in the received image data and determines the position corresponding to the cutout region (that is, the display position of the cutout image). Then, as shown in FIG. 7(D), the home server 1 superimposes the cutout image at the determined position of the cutout region within the person image of Mr. A in the previously displayed frame image. As a result, as shown in the same figure, the frame image to be displayed this time (strictly, the person image of Mr. A in that frame image) is obtained.
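
The superimposition step can be sketched as pasting the cutout image into a copy of the previously displayed frame at the position given by the display position data. The 2D-list image layout, the `top`/`left` parameters, and the name `paste_cutout` are assumptions for illustration.

```python
def paste_cutout(previous_frame, cutout, top, left):
    """Return a new frame with the cutout image overlaid at (top, left)."""
    height, width = len(previous_frame), len(previous_frame[0])
    if top < 0 or left < 0 or top + len(cutout) > height or left + len(cutout[0]) > width:
        raise ValueError("cutout does not fit inside the previous frame")
    frame = [row[:] for row in previous_frame]  # leave the previous frame intact
    for dy, cutout_row in enumerate(cutout):
        for dx, pixel in enumerate(cutout_row):
            frame[top + dy][left + dx] = pixel
    return frame
```

Pixels outside the cutout region are carried over unchanged from the previous frame, which is why only the moved part needs to be transmitted.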

Next, the image quality adjustment process is described with reference to FIG. 8. As explained so far, the home server 1 of the first communication unit 100A generates image data for the person image of Mr. A, or for a partial image within that person image, in the frame image captured by the camera 2 (hereinafter collectively called the transmission image). Meanwhile, as described above, the home server 1 of the first communication unit 100A estimates Mr. B's central visual field region.

The home server 1 of the first communication unit 100A then executes the image quality adjustment process on the transmission image. In this process, within the transmission image, the image displayed outside Mr. B's central visual field region on the display screen 5a of the display 5 (second image) is given lower image quality than the image displayed inside the central visual field region (first image). "Giving the second image lower image quality than the first image" means making the resolution of the second image lower than that of the first image. The degree to which the image quality of the second image is lowered is not particularly limited, but it is preferably set so that Mr. B does not perceive anything unnatural when the second image is displayed on the display 5 after the reduction.

In the image quality adjustment process, the home server 1 of the first communication unit 100A generates the image data of the transmission image such that the second image has lower image quality than the first image, and transmits it to the home server 1 of the second communication unit 100B.

When the image data of the transmission image is received by the home server 1 of the second communication unit 100B, a frame image including the transmission image is displayed on the display 5 of the second communication unit 100B. In this display image, the first image displayed inside Mr. B's central visual field region (the hatched portion in FIG. 8) has higher image quality, whereas the second image displayed outside the central visual field region (that is, in the peripheral visual field region) has lower image quality. Even with such a display image, the image displayed outside the central visual field region (second image) is difficult to perceive visually, so Mr. B, who is looking at the display 5, hardly notices anything unnatural. In other words, even if the display image contains portions of differing image quality, the effect on the sense of presence (realism) of the dialogue communication is small as long as the portion displayed in the central visual field region has high image quality. Therefore, in the present embodiment, the data transmission load can be reduced by the amount of the quality reduction of the second image while still preserving the sense of presence of the dialogue communication.
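
The idea of keeping the first image sharp and coarsening the second image can be sketched as follows. The sketch assumes a rectangular central region given as `(top, left, bottom, right)` in pixel coordinates (bottom and right exclusive) and simulates lower resolution by replacing outside pixels with the integer mean of their block; the rectangle, the block size, and the name `adjust_quality` are assumptions and not the method fixed by the source.

```python
def adjust_quality(image, central, block=2):
    """Coarsen pixels outside the central region; leave pixels inside it untouched."""
    top, left, bottom, right = central
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for by in range(0, height, block):
        for bx in range(0, width, block):
            cells = [(y, x)
                     for y in range(by, min(by + block, height))
                     for x in range(bx, min(bx + block, width))]
            outside = [(y, x) for y, x in cells
                       if not (top <= y < bottom and left <= x < right)]
            if not outside:
                continue  # block lies fully inside the central region (first image)
            mean = sum(image[y][x] for y, x in cells) // len(cells)
            for y, x in outside:
                out[y][x] = mean  # second image: one value per block
    return out
```

After this step the outside area carries one value per block, so an encoder could represent it with far fewer samples than the central area.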

To select the range of the transmission image whose image quality is to be lowered (that is, the second image), Mr. B's central visual field region must be estimated; in the present embodiment, as described above, it is estimated from Mr. B's height and face orientation. Mr. B's central visual field region is thereby estimated appropriately, and as a result an appropriate range of Mr. A's person image is selected as the second image, which is determined according to Mr. B's central visual field region.

(Display image reconstruction function)
The home server 1 of the second communication unit 100B receives the image data transmitted from the home server 1 of the first communication unit 100A and displays, on the display 5, the image obtained by expanding that image data. As described above, the image data transmitted from the home server 1 of the first communication unit 100A consists of background image data and person image data that are transmitted separately. The home server 1 of the second communication unit 100B therefore receives each set of image data, expands it, and combines the background image with the person image. In this way, the home server 1 of the second communication unit 100B reconstructs the images received from the home server 1 of the first communication unit 100A (received images) and obtains the frame image to be displayed on the display 5 this time (display image).

When the home server 1 of the second communication unit 100B receives image data for a part of the person image (that is, image data of a cutout region), it obtains the person image of Mr. A to be displayed this time by superimposing the cutout image at the corresponding position in the previously displayed frame image.

The home server 1 of the second communication unit 100B then displays the obtained frame image on the display 5. In doing so, it adjusts the display size of Mr. A's person image in the frame image so that it matches Mr. A's actual size (life size). Specifically, the home server 1 of the second communication unit 100B adjusts the display size of Mr. A's person image according to the distance between Mr. A and the display 5 and Mr. A's distance, which are included in the current information of Mr. A acquired from the home server 1 of the first communication unit 100A.

<<Flow of dialogue using the image display system according to the present embodiment>>
Next, the dialogue between users conducted with the present system S, that is, the specific flow of dialogue communication (hereinafter, dialogue communication flow), is described with reference to FIGS. 9 to 16. FIG. 9 shows the overall dialogue communication flow. FIG. 10 shows the flow of the pre-communication processing. FIG. 11 shows the flow of the current information notification process. FIG. 12 shows the flow of the image processing and transmission process. FIG. 13 shows the flow of the cutout region selection process. FIG. 14 shows the flow of the cutout region calculation process. FIG. 15 shows the flow of the image quality adjustment process. FIG. 16 shows the flow of the display image reconstruction process.

The dialogue communication flow described below employs the image display method of the present invention. That is, the image display method of the present invention is realized by each device of the present system S, in particular the home servers 1 of the first communication unit 100A and the second communication unit 100B (corresponding to the first computer and the second computer), performing its respective functions.

First, the general outline of the dialogue communication flow is described with reference to FIG. 9. At the start of the dialogue communication flow, pre-communication processing is executed (S001). The pre-communication processing is executed to determine whether dialogue communication can be started, and it is executed before the dialogue communication flow starts, for example, when Mr. A or Mr. B enters the room (strictly, the room in which he is present when conducting dialogue communication).

When dialogue communication starts after the communication pre-processing, the following are executed: current information notification processing (S002), reception of the other party's current information (S003), image processing and transmission processing (S004), reception of the other party's image (S005), and display image reconstruction processing (S006). These steps are executed by the home servers 1 of both the first communication unit 100A and the second communication unit 100B, and they are repeated until dialogue communication ends (S007). When Mr. A or Mr. B performs an operation to end the dialogue communication, the present system S accepts that operation and the dialogue communication ends.

Next, the flow of each of the processes S001 to S007 in the dialogue communication flow will be described. The processing executed by the communication unit on Mr. A's side (the first communication unit 100A) and that executed by the communication unit on Mr. B's side (the second communication unit 100B) are substantially the same. Therefore, except for the display image reconstruction processing described later, only the processing performed by the first communication unit 100A is described below; for the display image reconstruction processing, the processing performed by the second communication unit 100B is described.

First, the communication pre-processing will be described with reference to FIG. 10. It begins when the camera 2 images the room in which it is installed and the home server 1 acquires the captured image (frame image) of the room (S011). At this time, the home server 1 acquires depth data for the frame image together with the frame image (S012).

Based on the frame image and depth data acquired in steps S011 and S012, the home server 1 then determines whether Mr. A is in front of the display 5 (S013). If it determines that Mr. A is in front of the display 5, the home server 1 waits until the other party's home server 1 obtains the corresponding result (that is, a determination that Mr. B is in front of the display 5). Communication can start once both home servers 1 have obtained this result (S014), and the communication pre-processing ends at that point.

If, on the other hand, the home server 1 determines that Mr. A is not in front of the display 5, it determines whether the update time for the background image has arrived (S015). If the update time has arrived, the home server 1 transmits the image data of the frame image acquired in step S011 to the other party's home server 1 (S016). The image data transmitted here represents an image that shows only the interior of the room without Mr. A, that is, a background image.

As described above, while Mr. A is not in front of the display 5 during the communication pre-processing, the home server 1 transmits the image data of the background image each time the background image update time arrives. The update cycle (time interval) of the background image is not particularly limited and can be set arbitrarily.
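The branching in steps S013 to S016 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `PreProcessor`, the action strings, and the choice to treat the very first frame as already due for a background update are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreProcessor:
    """Hypothetical helper that tracks when the background image was last sent."""
    update_interval_s: float              # background update cycle (arbitrary value)
    last_sent_s: Optional[float] = None   # None until a background has been sent

    def step(self, now_s: float, user_present: bool) -> str:
        """Return the action for one captured frame (S013 to S016)."""
        if user_present:
            # S013 "yes": wait for the other side; communication can then start (S014).
            return "ready"
        # S015: has the background update time arrived?
        # Assumption: the first frame without a user always counts as due.
        due = (self.last_sent_s is None
               or now_s - self.last_sent_s >= self.update_interval_s)
        if due:
            self.last_sent_s = now_s
            return "send_background"      # S016
        return "idle"
```

Calling `step` once per captured frame yields `"send_background"` only when the chosen interval has elapsed, which mirrors the periodic background refresh described above.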

Next, the current information notification processing will be described with reference to FIG. 11. This processing is performed while Mr. A is in front of the display 5, and it notifies the other party's home server 1 of Mr. A's position and posture in that state as current information. Specifically, the home server 1 calculates the distance between Mr. A and the display 5 from the depth data acquired together with the frame image in which Mr. A appears (S021). The home server 1 also identifies Mr. A's skeletal model from the depth data and the frame image (S022). It then calculates Mr. A's height from the distance calculated in step S021 and the skeletal model identified in step S022 (S023). Finally, the home server 1 identifies the orientation of Mr. A's face from Mr. A's person image in the acquired frame image (S024).

The home server 1 then notifies the other party's home server 1 of the current information obtained in the preceding steps, namely the distance between Mr. A and the display 5, Mr. A's height, and the orientation of Mr. A's face (S025). The current information notification processing ends at this point.
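As an illustration of what the notified current information might look like on the wire, the sketch below bundles the three values into one record and serializes it. The field names, the units, the use of JSON, and the reduction of face orientation to a single yaw angle are assumptions for the example; the description does not specify a data format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CurrentInfo:
    """Hypothetical record for the values notified in S025."""
    distance_to_display_m: float   # result of S021
    height_m: float                # result of S023
    face_yaw_deg: float            # result of S024 (single angle is an assumption)


def encode(info: CurrentInfo) -> bytes:
    """Serialize the record for transmission to the other home server."""
    return json.dumps(asdict(info)).encode("utf-8")


def decode(payload: bytes) -> CurrentInfo:
    """Rebuild the record on the receiving side (S003)."""
    return CurrentInfo(**json.loads(payload.decode("utf-8")))
```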

Next, reception of the other party's current information will be described. Through communication with the other party's home server 1, the home server 1 acquires the current information notified by that server (that is, Mr. B's current information). Specifically, the home server 1 receives from the other party's home server 1 data indicating the distance between Mr. B and the display 5, Mr. B's height, and the orientation of Mr. B's face.

Next, the image processing and transmission processing will be described with reference to FIG. 12. This processing is executed each time the home server 1 acquires a frame image from the camera 2, and it transmits the image data of the acquired frame image, or of a part of that frame image, to the other party's home server 1. The type of image data transmitted changes according to the time elapsed since dialogue communication started, the acquired current information of Mr. B, and so on.

Specifically, the image data of the background image is transmitted immediately after dialogue communication starts (S031, S032). This image data represents a frame image that the home server 1 acquired in advance at a stage prior to the start of communication (for example, during the communication pre-processing described above); more precisely, it is a frame image captured by the camera 2 before Mr. A moved in front of the display 5.

Once the image data of the background image has been transmitted immediately after the start of communication, it is not transmitted again until dialogue communication ends. In other words, the processing that transmits the background image data is executed less frequently than the home server 1 acquires frame images from the camera 2. As a result, after the background image data has been transmitted once at the start of dialogue communication, it need not be transmitted again, and the data transmission load is reduced accordingly.

After the background image data has been transmitted, only the image data of Mr. A's person image is transmitted. That is, after transmitting the background image data, the home server 1 extracts Mr. A's person image from the frame image acquired from the camera 2 (S033). The home server 1 then decides the subsequent processing based on the distance between Mr. B and the display 5, which is part of the acquired current information of Mr. B.

Specifically, the home server 1 determines whether the distance between Mr. B and the display 5 is equal to or greater than a threshold (S034). If it is, the home server 1 executes image quality reduction processing on Mr. A's person image extracted in step S033 (S035), which lowers the image quality (resolution) of the extracted person image to a predetermined level. The home server 1 then generates image data representing the person image at the reduced quality, that is, low-quality person image data, and transmits it to the other party's home server 1 (S036). The low-quality person image data transmitted here displays Mr. A's person image, or more strictly Mr. A's whole-body image, at the reduced image quality.

As described above, when the distance between Mr. B and the display 5 is equal to or greater than the threshold, the home server 1 generates low-quality person image data so that the person image of Mr. A displayed to Mr. B has lower image quality, and transmits the generated data to the other party's home server 1. Transmitting low-quality person image data in this way reduces the data transmission load by an amount corresponding to the reduction in image quality.

If, on the other hand, the distance between Mr. B and the display 5 is less than the threshold, the home server 1 cuts out a partial region from Mr. A's person image and transmits the image data of that cutout region. To do so, the home server 1 executes processing that selects which region to cut out of Mr. A's person image, that is, cutout region selection processing (S037).
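The branching from S031 to S037 amounts to choosing one of three payload types per frame. The sketch below illustrates that choice; the function name `choose_payload`, its parameters, and the returned labels are hypothetical and serve only to make the decision order explicit.

```python
def choose_payload(background_sent: bool,
                   partner_distance_m: float,
                   threshold_m: float) -> str:
    """Pick what to transmit for the current frame (hypothetical labels)."""
    if not background_sent:
        # S031, S032: the background goes out once, right after communication starts.
        return "background"
    if partner_distance_m >= threshold_m:
        # S034 "yes": the viewer is far from the display, so send a
        # reduced-quality whole-body image (S035, S036).
        return "low_quality_person"
    # S034 "no": the viewer is close, so send only a cutout region (S037).
    return "cutout_region"
```

The comparison uses `>=` because the description sends the low-quality image when the distance is equal to or greater than the threshold.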

The procedure of the cutout region selection processing will be described with reference to FIG. 13. First, the home server 1 calculates the displacement of each of the set parts on Mr. A's body axis, specifically the head and the waist (S101). Here, "displacement" means the amount a part moved during the period from the time the home server 1 acquired the previous frame image to the time it acquired the current frame image (hereinafter, the image acquisition interval). In the present embodiment, the displacement is calculated from the change in Mr. A's skeletal model identified in the current information notification processing (specifically, the difference between the skeletal model identified when the previous frame image was acquired and the skeletal model identified when the current frame image was acquired).

After calculating the displacements, the home server 1 determines whether the displacement of at least one of the head and the waist is equal to or greater than a threshold (S102). Here, the "threshold" is a value set for cutout region selection and serves as the reference for determining whether each set part of the skeletal model moved during the image acquisition interval. The specific value of the threshold is not particularly limited, but it is desirably set to a value suitable for selecting the cutout region appropriately.
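The displacement check of S101 and S102 can be sketched as below. The description does not state which distance measure is used, so the Euclidean distance between a set part's two-dimensional positions in the previous and current skeletal models is an assumption, as are the type aliases and function names.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
Skeleton = Dict[str, Point]   # set part name -> position (hypothetical layout)


def displacement(prev: Skeleton, curr: Skeleton, part: str) -> float:
    """Distance a set part moved between the previous and current models."""
    (x0, y0), (x1, y1) = prev[part], curr[part]
    return math.hypot(x1 - x0, y1 - y0)   # Euclidean distance (assumption)


def moved(prev: Skeleton, curr: Skeleton, part: str, threshold: float) -> bool:
    """True when the displacement is equal to or greater than the threshold."""
    return displacement(prev, curr, part) >= threshold
```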

When the displacement of at least one of the head and the waist is equal to or greater than the threshold, the home server 1 further calculates the displacement of each foot (S103) and determines whether each is equal to or greater than the threshold (S104). If the displacement of at least one foot is equal to or greater than the threshold, the home server 1 cuts out the upper-body image and the lower-body image, that is, the whole-body image, from Mr. A's person image (S105). Conversely, if the displacements of both feet are less than the threshold, the home server 1 cuts out the upper-body image from Mr. A's person image (S106).

As described above, in the present embodiment, when the displacement of at least one of the head and the waist is equal to or greater than the threshold, the upper-body image is cut out of Mr. A's person image. This is because, if at least one of the head and the waist is moving, the body axis, that is, the upper body, is assumed to be moving and displaced. Selecting the cutout region in units of the upper-body image also simplifies the selection processing.

If, on the other hand, the displacements of the head and the waist are both less than the threshold, the home server 1 calculates the displacement of each of the four limbs (two hands and two feet) (S107) and determines whether each is equal to or greater than the threshold (S108). If every displacement is less than the threshold, the home server 1 cuts out the head image from Mr. A's person image (S109).

If, instead, at least one of these displacements is equal to or greater than the threshold, the home server 1 executes cutout region calculation processing to determine the cutout region in finer detail (S110). The procedure of this processing will be described with reference to FIG. 14. First, the home server 1 calculates the displacement of set parts other than those whose displacement has already been calculated (the head, the waist, and the four limbs) (S121). More specifically, the home server 1 identifies each limb whose displacement is equal to or greater than the threshold and calculates the displacement of the set part adjacent to it. Here, "the set part adjacent to a given part" means, among the multiple set parts defined in the skeletal model, the one located next to that part, or more strictly, the one that neighbors it on the side closer to the body axis.

The home server 1 then determines whether the calculated displacement is equal to or greater than the threshold (S122). If it is, the home server 1 stores, for the set part whose displacement was determined to be equal to or greater than the threshold (hereinafter, the corresponding part), its coordinates in the previous frame image and its coordinates in the current frame image (S123). Here, the "coordinates in the previous frame image" are two-dimensional coordinates representing the position of the corresponding part relative to the frame image that the home server 1 previously acquired from the camera 2, and the "coordinates in the current frame image" are two-dimensional coordinates representing its position relative to the frame image that the home server 1 has just acquired from the camera 2.

Next, the home server 1 determines whether there is a set part adjacent to the corresponding part (S124). If there is, it calculates the displacement of that set part (S125) and determines whether the result is equal to or greater than the threshold (S126). If it is, the home server 1 stores the coordinates in the previous frame image and in the current frame image for that set part, which thereby becomes a new corresponding part (S123).

Thereafter, the home server 1 repeats the displacement calculation (S125), the comparison with the threshold (S126), and the storage of coordinates (S123) for the set part adjacent to each newly identified corresponding part. On reaching a set part whose displacement is less than the threshold, that is, a set part that has not moved, the home server 1 reads out the coordinates stored so far and identifies the X and Y components of each. It then identifies the maximum and minimum values of each component (S127) and takes the region defined by those values (specifically, the rectangular region whose vertex coordinates are given by the minimum and maximum values of each component) as the cutout region (S128).

The series of steps S121 to S128 described above is repeated until all set parts have been processed (S129). When no unprocessed set part remains, the home server 1 ends the cutout region calculation processing.
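The walk from a moved limb toward the body axis and the resulting bounding rectangle (S121 to S128) can be sketched for a single limb as follows. The part names, the `INWARD` adjacency table, and the Euclidean displacement are hypothetical. The sketch also stores the coordinates of the limb itself, which the description implies (the region should contain every moved part) but does not state step by step.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Skeleton = Dict[str, Point]
Rect = Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y)

# Hypothetical adjacency: each set part maps to its neighbor on the body-axis side.
INWARD = {"right_hand": "right_elbow", "right_elbow": "right_shoulder"}


def dist(a: Point, b: Point) -> float:
    return math.hypot(b[0] - a[0], b[1] - a[1])


def cutout_rect(prev: Skeleton, curr: Skeleton,
                limb: str, threshold: float) -> Optional[Rect]:
    """Bounding rectangle of all moved parts along one limb chain."""
    stored: List[Point] = []
    part: Optional[str] = limb
    # Walk inward while the current part has moved (S122, S126).
    while part is not None and dist(prev[part], curr[part]) >= threshold:
        stored += [prev[part], curr[part]]   # S123: previous and current coordinates
        part = INWARD.get(part)              # S124: next part toward the body axis
    if not stored:
        return None
    xs = [p[0] for p in stored]
    ys = [p[1] for p in stored]
    return (min(xs), min(ys), max(xs), max(ys))   # S127, S128
```

Because both the previous and the current coordinates are included, the rectangle covers where the limb was as well as where it is now, so pasting the cutout over the previous image erases the old limb position.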

Returning to the cutout region selection processing: when the cutout region calculation processing has been executed, the home server 1 cuts out of Mr. A's person image both the image of the region calculated (determined) in that processing and the head image (S111).
The home server 1 ends the cutout region selection processing once the cutout region has been selected by the procedure described above.
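The overall decision tree of the cutout region selection processing (S102 to S111) can be summarized in a few lines. In this sketch the input is a hypothetical dictionary that already records, for each set part, whether its displacement reached the threshold; the part names and the returned labels are illustrative only.

```python
from typing import Dict, List

LIMBS = ("left_hand", "right_hand", "left_foot", "right_foot")


def select_cutouts(moved: Dict[str, bool]) -> List[str]:
    """Return labels for the regions to cut out of the person image."""
    if moved["head"] or moved["waist"]:                  # S102
        if moved["left_foot"] or moved["right_foot"]:    # S104
            return ["whole_body"]                        # S105
        return ["upper_body"]                            # S106
    if not any(moved[limb] for limb in LIMBS):           # S108
        return ["head"]                                  # S109
    # A limb moved while the body axis stayed still: compute a tight
    # region around the moved parts and add the head image (S110, S111).
    return ["calculated_region", "head"]
```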

As described above, in the present embodiment, when the distance between Mr. B and the display 5 is less than the threshold, the home server 1 cuts out a partial region of Mr. A's person image and transmits only the image data of that region to the other party's home server 1. This reduces the data transmission load compared with transmitting the image data of the entire person image. The regions cut out are the head image and a region containing the set parts of Mr. A's body that moved between acquisition of the previous frame image and acquisition of the current frame image (the image acquisition interval).

In the present embodiment, the set parts that moved during the image acquisition interval are identified from the change in the skeletal model (specifically, the difference between the previous skeletal model and the current one). This makes it possible to identify appropriately and accurately the portion of Mr. A's body that moved during the image acquisition interval (the portion to be identified).

In the present embodiment, the presence or absence of movement during the image acquisition interval is checked for each set part. As a result, the portion of Mr. A's body that moved during that interval (the portion to be identified) can be identified easily. Moreover, to check whether each set part moved, the displacement of each set part during the image acquisition interval is calculated and compared against the threshold. This procedure makes it even easier to identify the portion that moved.

Furthermore, in the cutout region calculation processing of the present embodiment, after the displacement of one set part has been compared with the threshold, the set part located next to it is evaluated. The cutout region is then selected so as to contain every set part that moved during the image acquisition interval (every corresponding part). Specifically, the coordinates in the previous frame image and in the current frame image are obtained for each corresponding part, the maximum and minimum values of the X and Y components of those coordinates are identified, and the region defined by those maximum and minimum values is selected as the cutout region.

Selecting the cutout region by this procedure ensures that the image of the portion of Mr. A's person image that moved during the image acquisition interval is selected appropriately. Furthermore, by superimposing the cutout image on the previous display image (frame image) to form the current display image, the frame image that the home server 1 has just acquired (strictly speaking, Mr. A's person image within that frame image) can be reproduced appropriately.
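The superimposition mentioned above can be illustrated with a toy example in which an image is a list of pixel rows. The representation, the function name `superimpose`, and the assumption that the cutout lies entirely inside the previous display image are choices made for this sketch, not details taken from the description.

```python
from typing import List

Image = List[List[int]]   # rows of pixel values (toy representation)


def superimpose(previous: Image, cutout: Image, left: int, top: int) -> Image:
    """Paste the cutout over a copy of the previous display image.

    Assumes the cutout fits entirely inside the previous image.
    """
    result = [row[:] for row in previous]   # leave the previous image untouched
    for dy, row in enumerate(cutout):
        result[top + dy][left:left + len(row)] = row
    return result
```

Pixels outside the pasted rectangle keep their previous values, which is why only the moved region needs to be transmitted.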

Returning to the image processing and transmission processing: after selecting the cutout region, the home server 1 checks the data size of the image data of the cutout region (that is, the image data to be transmitted) and determines whether it is equal to or greater than a set value (S039). Here, the "set value" is a value set in advance as the reference for deciding whether to execute image quality adjustment processing on the image to be transmitted. The specific value is not particularly limited, but it is desirably set to a value suitable for deciding appropriately whether to execute the image quality adjustment processing.

If the data size is less than the set value, the home server 1 transmits the image data of the cutout region to the other party's home server 1 without performing image quality adjustment processing on the image of the cutout region (the cutout image) (S040). If the data size is equal to or greater than the set value, the home server 1 executes image quality adjustment processing on the cutout image (S041). After that processing ends, the home server 1 generates image data for displaying the adjusted cutout image (that is, the quality-adjusted image) and transmits it to the other party's home server 1 (S042).
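The size check in S039 to S042 reduces to a single comparison. In the sketch below, `prepare_cutout` and its parameters are hypothetical, and the `adjust` callable stands in for whatever image quality adjustment processing is applied.

```python
from typing import Callable


def prepare_cutout(data: bytes, set_value: int,
                   adjust: Callable[[bytes], bytes]) -> bytes:
    """Return the bytes to transmit for one cutout image."""
    if len(data) < set_value:   # S039 "no": small enough to send unchanged
        return data             # S040
    return adjust(data)         # S041, then transmitted in S042
```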

The procedure of the image quality adjustment processing will be described with reference to FIG. 15. First, the home server 1 estimates Mr. B's central visual field region from the acquired current information of Mr. B, specifically Mr. B's height and the orientation of Mr. B's face (S131). The home server 1 then determines whether the data of the cutout image to be transmitted is data of Mr. A's whole-body image (S132).

If the data of the cutout image is data of the whole-body image (put simply, if step S105 was reached in the cutout region selection processing), the home server 1 makes the part of the cutout image that would be displayed outside Mr. B's central visual field region on the display screen 5a of the display 5 (the second image) lower in image quality than the part that would be located within the central visual field region (the first image) (S133).

If the data of the cutout image is not data of the whole-body image, the home server 1 selects the cutout image (S134) and determines whether it contains a part that would be displayed outside Mr. B's central visual field region on the display screen 5a of the display 5 (a second image) (S135). If the selected cutout image contains a part corresponding to the second image, the home server 1 lowers the image quality of the second image relative to the part displayed within Mr. B's central visual field region (the first image) (S133).

The home server 1 then determines whether any unprocessed cutout image remains (S136) and repeats the image selection (S134), the determination of whether a second image is present (S135), and the quality reduction of the second image (S133) for each unprocessed cutout image. When no unprocessed cutout image remains, the home server 1 ends the image quality adjustment processing.
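The core of S133, lowering the quality only of pixels that fall outside the viewer's central visual field region, can be illustrated as follows. The description lowers the image quality (resolution) to a predetermined level; this sketch substitutes a simple coarsening of pixel values as a stand-in, and the image representation, the rectangle convention, and the `step` parameter are all assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]            # rows of pixel values (toy representation)
Rect = Tuple[int, int, int, int]   # left, top, right, bottom; right/bottom exclusive


def adjust_quality(cutout: Image, origin: Tuple[int, int],
                   central: Rect, step: int = 64) -> Image:
    """Coarsen pixels displayed outside the central visual field region.

    origin is the display position of the cutout's top-left pixel.
    """
    ox, oy = origin
    left, top, right, bottom = central
    out: Image = []
    for y, row in enumerate(cutout):
        new_row = []
        for x, value in enumerate(row):
            inside = left <= ox + x < right and top <= oy + y < bottom
            # First image (inside) is kept; second image (outside) is coarsened.
            new_row.append(value if inside else (value // step) * step)
        out.append(new_row)
    return out
```

When a cutout lies entirely inside the central region, the function returns it unchanged, which corresponds to the "no second image" outcome of S135.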

As described above, in this embodiment, when the size of the image data of the cutout image to be transmitted is equal to or greater than a set value, an image quality adjustment process is executed that lowers the image quality of part of the cutout image. As a result, the image data of the cutout image after processing is smaller than before processing, and the transmission load of the image data is reduced. This effect becomes more pronounced as the area cut out from Mr. A's person image (that is, the cutout area) becomes larger.

In addition, Mr. B's central visual field area is estimated in order to choose the part of the cutout image whose image quality is to be lowered (the second image). The image quality of the part of the cutout image that is displayed in the area of the display screen 5a of the display 5 outside the estimated central visual field area (the peripheral visual field area) is then lowered to a predetermined image quality. This reflects the fact that an image in the peripheral visual field area is hard to perceive visually, so that even if its image quality is somewhat low, the effect on the sense of realism of the dialogue communication felt by the viewer of the display image is small. As a result, the part whose image quality is lowered (the second image) is selected appropriately from the cutout image, and the data transmission load can be reduced effectively without impairing the sense of realism of the dialogue communication.
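As a rough illustration of lowering the quality of only the peripheral part, the sketch below replaces pixels outside an assumed central visual field rectangle with the mean of their tile. The specification does not say how the "predetermined image quality" is reached, so the block-averaging method, the grayscale list-of-lists image format, and the names `lower_peripheral_quality` and `inside` are assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[int]]            # hypothetical format: grayscale pixels, row-major
Rect = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


def inside(rect: Rect, x: int, y: int) -> bool:
    """True if display coordinate (x, y) lies inside `rect`."""
    left, top, right, bottom = rect
    return left <= x < right and top <= y < bottom


def lower_peripheral_quality(cutout: Image, origin: Tuple[int, int],
                             central: Rect, block: int = 2) -> Image:
    """Return a copy of `cutout` in which every pixel displayed outside
    `central` is replaced by the mean of its block x block tile.
    `origin` is the display position of the cutout's top-left pixel."""
    ox, oy = origin
    height, width = len(cutout), len(cutout[0])
    out = [row[:] for row in cutout]
    for ty in range(0, height, block):
        for tx in range(0, width, block):
            tile = [(x, y)
                    for y in range(ty, min(ty + block, height))
                    for x in range(tx, min(tx + block, width))]
            mean = sum(cutout[y][x] for x, y in tile) // len(tile)
            for x, y in tile:
                if not inside(central, ox + x, oy + y):
                    out[y][x] = mean   # second image: detail is discarded
    return out


cutout_image = [[0, 10, 20, 30],
                [40, 50, 60, 70]]
# The left 2x2 pixels fall inside the assumed central area and stay untouched.
print(lower_peripheral_quality(cutout_image, (0, 0), (0, 0, 2, 2)))
```

Flattening tiles to a single value makes the peripheral part cheaper to encode with most image codecs, which is the effect the paragraph above relies on; a real system might instead lower the resolution or raise the compression ratio for that part.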

The home server 1 then ends the image processing and transmission process once it has finished transmitting the various image data.

Next, the display video reconstruction process will be described with reference to FIG. 16. In this process, the home server 1 of the second communication unit 100B reconstructs the images obtained by expanding the image data received from the home server 1 of the first communication unit 100A, and obtains the image (frame image) to be displayed on the display 5 this time.

More specifically, the home server 1 of the second communication unit 100B receives the image data of the background image immediately after the dialogue communication starts (No in S051). From then on, the home server 1 of the second communication unit 100B receives the image data of Mr. A's person image (Yes in S051). If the image data received at this time is data of Mr. A's whole-body image (Yes in S052), the home server 1 adjusts the display size of the whole-body image according to Mr. A's current information (specifically, Mr. A's height) so that it matches Mr. A's actual size (life size) (S054). The home server 1 then combines the previously acquired background image with the person image of Mr. A acquired this time to obtain the frame image (display image) to be displayed on the display 5 this time (S055).
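The life-size adjustment in S054 amounts to a scale factor. The following is a minimal sketch, assuming the display's pixel density and the pixel height of the person in the received image are known; the function name `life_size_scale` and its parameters are hypothetical and do not appear in the specification.

```python
def life_size_scale(height_cm: float, person_height_px: int,
                    display_px_per_cm: float) -> float:
    """Return the factor by which the person image must be scaled so that
    the person appears at their real height on the display.

    height_cm          -- the person's real height (Mr. A's current information)
    person_height_px   -- height of the person in the received image, in pixels
    display_px_per_cm  -- assumed physical pixel density of the display
    """
    if person_height_px <= 0:
        raise ValueError("person_height_px must be positive")
    target_px = height_cm * display_px_per_cm   # pixels needed for life size
    return target_px / person_height_px


# A 170 cm person who is 850 px tall in the image, shown on a display with
# 10 px per cm, has to be enlarged by a factor of 2.
print(life_size_scale(170.0, 850, 10.0))
```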

On the other hand, when the image data received from the home server 1 of the first communication unit 100A is image data of a part of Mr. A's person image (that is, a cutout image) (No in S052), the home server 1 of the second communication unit 100B reconstructs Mr. A's person image using that image data.

In detail, the home server 1 of the second communication unit 100B superimposes the image indicated by the image data received this time (the cutout image) on the person image of Mr. A that was displayed on the display 5 last time (S053). In doing so, the home server 1 analyzes the display position data embedded in the received image data to identify the display position of the cutout image, and superimposes the cutout image at that position in the previously displayed person image of Mr. A. The identified display position is the position, within the frame image displayed on the display 5 immediately before the image data of the cutout area was received (that is, the previous display image), that corresponds to the cutout area, in other words to the rectangular area selected as the cutout area.
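The superimposition in S053 can be sketched as pasting a rectangular patch into a copy of the previous person image. The image format (lists of pixel rows), the `(x, y)` form of the display position data, and the name `superimpose` are assumptions for illustration only.

```python
from typing import List, Tuple

Image = List[List[int]]   # hypothetical format: pixel rows, row-major


def superimpose(previous: Image, cutout: Image,
                position: Tuple[int, int]) -> Image:
    """Return a new image in which `cutout` overwrites `previous` with its
    top-left corner at `position` (x, y). Pixels that would fall outside
    `previous` are ignored, and `previous` itself is left unchanged."""
    x0, y0 = position
    result = [row[:] for row in previous]
    for dy, row in enumerate(cutout):
        for dx, pixel in enumerate(row):
            y, x = y0 + dy, x0 + dx
            if 0 <= y < len(result) and 0 <= x < len(result[y]):
                result[y][x] = pixel
    return result


previous_person = [[0, 0, 0],
                   [0, 0, 0],
                   [0, 0, 0]]
received_cutout = [[1, 2],
                   [3, 4]]
print(superimpose(previous_person, received_cutout, (1, 1)))
```

Only the moved region travels over the network; everything outside the pasted rectangle is reused from the previously displayed person image.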

In this way, the home server 1 of the second communication unit 100B uses the cutout image and the previously displayed person image of Mr. A to reconstruct (obtain) the person image of Mr. A to be displayed on the display 5 this time. The home server 1 of the second communication unit 100B then adjusts the display size of Mr. A's person image by the same procedure as described above, and combines the background image with the person image of Mr. A obtained this time to obtain the current display image (S055).

The home server 1 of the second communication unit 100B then causes the display 5 to display the frame image (display image) obtained this time (S056). At this point, the home server 1 ends the display video reconstruction process.

The series of processes described above is executed repeatedly until the dialogue communication ends. This realizes dialogue communication with a sense of realism while effectively reducing the data transmission load.

1 Home server
2 Camera (imaging device, information providing device)
3 Microphone
4 Infrared sensor (measurement device, information providing device, distance measurement device)
5 Display
5a Display screen
6 Speaker
100A First communication unit
100B Second communication unit
GN External network
S This system (image display system)

Claims

An image display system comprising:
an imaging device that images a first user;
a measurement device that measures a measurement target value regarding the position of each part of the body of the first user;
a first computer that acquires a frame image constituting a video of the first user captured by the imaging device;
a second computer that communicates with the first computer to acquire the frame image; and
a display that displays the frame image acquired by the second computer to a second user who is at a different place from the first user,
wherein the first computer executes:
a process of identifying, among the parts of the body, a specific part that moved during a period from the previous acquisition of the frame image to the current acquisition of the frame image, based on a change in the measurement result of the measurement target value during the period;
a process of extracting an area including the specific part from the person image of the first user in the frame image acquired this time by the first computer; and
a process of generating image data of the area and transmitting it to the second computer, and
wherein, when the second computer receives the image data of the area, the second computer causes the display to display the frame image formed by superimposing the image of the area on a position corresponding to the area in the frame image displayed on the display before the image data was received.

The image display system according to claim 1, wherein, in the process of identifying the specific part, the first computer identifies, among a plurality of set sites set on the skeleton of the first user, the set site that moved during the period based on a change in the measurement result of the measurement target value during the period, and identifies the specific part so that it includes at least that set site.

The image display system according to claim 2, wherein, when identifying the set site that moved during the period in the process of identifying the specific part, the first computer determines, for each set site, whether the displacement amount of the set site during the period is equal to or more than a threshold, and identifies the set site whose displacement amount is equal to or more than the threshold as the set site that moved during the period.

The image display system according to claim 3, wherein, when performing the determination for each set site, the first computer performs the determination for one set site and then performs the determination for the set site located next to that set site, and, when extracting the area including the specific part, extracts the area such that all the set sites that moved during the period are included in the area.

The image display system according to claim 3 or 4, wherein at least one of the plurality of set sites is a site on the body axis of the upper body of the first user, and
the first computer extracts the image of the upper body as the area when, in the determination for the set site on the body axis, the displacement amount is determined to be equal to or more than the threshold.

The image display system according to any one of the preceding claims, wherein the first computer executes a process of generating background image data indicating a background image in the frame image separately from image data other than the background image and transmitting the background image data to the second computer, and
the frequency at which the first computer executes the process of transmitting the background image data is lower than the frequency at which the first computer acquires the frame image from the imaging device.

The image display system according to any one of claims 1 to 6, further comprising an information providing device that provides the second computer with information relating to at least one of the positional relationship between the second user and the display and the posture of the second user while the second user is in front of the display,
wherein the first computer further executes a process of acquiring, from the second computer, the content of the at least one identified from the information, and, in the process of generating the image data of the area, generates the image data of the area such that, within the image of the area, a second image displayed on the display in a range different from that of a first image has lower image quality than the first image, the first image being displayed on the display in a range determined according to the content of the at least one.

The image display system according to any one of claims 1 to 7, further comprising a distance measurement device that measures the distance between the second user and the display while the second user is in front of the display,
wherein the first computer acquires the measurement result of the distance from the second computer and, when the distance is equal to or greater than a preset value, lowers the image quality of the person image to a predetermined image quality, generates low-quality person image data indicating the person image at the lowered image quality, and transmits the low-quality person image data to the second computer.

An image display method in which, using a first computer that acquires a frame image constituting a video of a first user captured by an imaging device and a second computer that communicates with the first computer to acquire the frame image, the frame image acquired by the second computer is displayed on a display to a second user who is at a different place from the first user, the method comprising:
executing a process of acquiring the measurement result of a measurement target value from a measurement device that measures the measurement target value regarding the position of each part of the body of the first user;
executing, by the first computer, a process of identifying, among the parts of the body, a specific part that moved during a period from the previous acquisition of the frame image to the current acquisition of the frame image, based on a change in the measurement result of the measurement target value during the period;
executing a process of extracting an area including the specific part from the person image of the first user in the frame image acquired this time;
executing a process of generating image data of the area and transmitting the image data to the second computer; and
when the second computer receives the image data, displaying on the display the frame image formed by superimposing the image of the area indicated by the image data on a position corresponding to the area in the frame image displayed on the display before the image data was received.