JP7640427B2 - Gaze position analysis system and method

Info

Publication number: JP7640427B2
Application number: JP2021162786A
Authority: JP
Inventors: 浩彦佐川; 貴之藤原
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-10-01
Filing date: 2021-10-01
Publication date: 2025-03-05
Anticipated expiration: 2041-10-01
Also published as: JP2023053631A; WO2023054661A1

Description

本発明は、注視位置分析システム及び注視位置分析方法に関する。 The present invention relates to a gaze position analysis system and a gaze position analysis method.

利用者に装着する形態の視線計測装置を用いて取得した視線情報から利用者の関心や作業状況等の分析を行う場合、様々な物体が配置された三次元空間中を利用者が自由に移動することになる。このため、三次元空間中において利用者が注視した位置がどこであるか、またその遷移がどのようになっているか等を確認できることが望ましい。 When analyzing a user's interests, work status, etc. from gaze information acquired using a gaze measurement device worn by the user, the user moves freely in a three-dimensional space in which various objects are placed. For this reason, it is desirable to be able to confirm where in the three-dimensional space the user is gazing, and how that transition occurs.

利用者に装着する形態の視線計測装置を対象として、利用者が三次元空間中を自由に移動して取得された視線情報を三次元空間中の注視位置として分析する技術は、特許文献１および特許文献２が開示されている。 Patent Documents 1 and 2 disclose technology that uses a gaze measurement device worn by a user to analyze gaze information acquired while the user moves freely in three-dimensional space as the gaze position in the three-dimensional space.

特許文献１には、複数の異なる撮影位置から撮影された画像を用いて仮想三次元空間を生成するとともに仮想三次元空間における撮影位置を算出し、画像を撮影したタイミングで取得した利用者の視線方向から仮想三次元空間における利用者の注視位置及び注視時間を算出する技術が開示されている。 Patent document 1 discloses a technology that generates a virtual three-dimensional space using images captured from multiple different shooting positions, calculates the shooting positions in the virtual three-dimensional space, and calculates the user's gaze position and gaze time in the virtual three-dimensional space from the user's line of sight obtained at the time the images were captured.

一方、特許文献２には、三次元空間中の表示装置の位置及び利用者の視線と三次元空間上の物体の位置から、利用者が着目していた物体を特定する技術が開示されている。 On the other hand, Patent Document 2 discloses a technology that identifies an object that a user is focusing on based on the position of the display device in three-dimensional space, the user's line of sight, and the position of the object in three-dimensional space.

Patent Document 1: 特開２０２０－１３５７３７号公報 (JP 2020-135737 A)
Patent Document 2: 特開２０１８－１９５３１９号公報 (JP 2018-195319 A)

特許文献１では、複数の画像から生成した仮想三次元空間はそれを構成するデータが粗い場合や、部分的にしかデータを生成できない場合があり、その様な場合、何に注視していたかを判定することが困難となる。 In Patent Document 1, the virtual three-dimensional space generated from multiple images may contain sparse data or may only be partially generated, making it difficult to determine what the subject was looking at.

一方、特許文献２では、三次元空間中の利用者の視線と三次元空間上の物体の位置が常に取得可能なことが前提であるため、必要となる設備が大掛かりになるとともに適用範囲も限定されるという問題がある。また、分析の対象となる全ての物体の位置をあらかじめ明確にしておく必要もあり、事前準備に労力を要する。 On the other hand, in Patent Document 2, it is assumed that the user's line of sight in three-dimensional space and the positions of objects in the three-dimensional space can always be obtained, which poses the problem that the equipment required is large-scale and the scope of application is limited. In addition, the positions of all objects to be analyzed must be clearly defined in advance, which requires a lot of effort in advance preparation.

三次元空間中の注視位置を分析する場合、三次元空間全体の中をどのように注視位置が遷移したかを閲覧する他、特定の物体のみに着目し、その物体上を注視位置がどのように遷移したかを容易に閲覧できることが望ましい。 When analyzing gaze position in three-dimensional space, it is desirable to be able to view how the gaze position has transitioned across the entire three-dimensional space, as well as to focus on a specific object and easily view how the gaze position has transitioned over that object.

この際、三次元空間中に存在する各物体に対する三次元モデル上に注視位置を対応付けることができると、より詳細な分析が可能になることが期待できる。また、各物体モデルは三次元空間中に自動的に配置することができれば、分析に必要な事前設定を簡略化することができると考えられる。 If it were possible to associate the gaze position with a three-dimensional model of each object that exists in three-dimensional space, it would be possible to perform a more detailed analysis. Furthermore, if each object model could be automatically positioned in three-dimensional space, it would be possible to simplify the pre-settings required for analysis.

本発明の目的は、注視位置分析システムにおいて、三次元空間中に存在する各物体に対応する三次元モデル上に利用者の注視位置を自動的に対応付けることにある。 The object of the present invention is to automatically associate a user's gaze position with a three-dimensional model corresponding to each object existing in a three-dimensional space in a gaze position analysis system.

本発明の一態様の注視位置分析システムは、空間モデル作成部と物体位置姿勢推定部と注視位置算出部を有し、空間中に存在する物体の三次元モデルである物体モデル上に利用者の物体モデル上の注視位置である物体モデル注視位置を自動的に対応付ける注視位置分析システムであって、前記空間モデル作成部は、前記利用者の視野と同様の映像である一人称映像を取得し、前記利用者の前記一人称映像上の注視位置である一人称映像注視位置を取得し、前記利用者が視線を向けた範囲における空間の三次元モデルである空間モデルを前記一人称映像から作成し、前記空間モデル中における前記一人称映像の撮影位置姿勢を算出し、前記物体位置姿勢推定部は、前記物体モデルと前記空間モデルとをマッチングすることにより、前記空間モデル中における前記物体の位置姿勢を推定し、前記空間モデル中における前記物体の位置姿勢を用いて、前記物体モデルを前記空間モデル中に配置し、前記空間モデル中における前記一人称映像の前記撮影位置姿勢と前記一人称映像注視位置を用いて、前記空間モデル中における注視方向を算出し、前記注視位置算出部は、前記空間モデル中における前記注視方向と前記物体モデルとの交点を求めることにより、前記物体モデル注視位置を算出することを特徴とする。 A gaze position analysis system according to one aspect of the present invention has a spatial model creation unit, an object position and orientation estimation unit, and a gaze position calculation unit, and automatically associates an object model gaze position, which is the user's gaze position on an object model, with the object model, which is a three-dimensional model of an object existing in a space. The spatial model creation unit acquires a first-person video, which is video similar to the user's field of view, acquires a first-person video gaze position, which is the user's gaze position on the first-person video, creates from the first-person video a spatial model, which is a three-dimensional model of the space in the range toward which the user directed his or her gaze, and calculates the shooting position and orientation of the first-person video in the spatial model. The object position and orientation estimation unit estimates the position and orientation of the object in the spatial model by matching the object model with the spatial model, places the object model in the spatial model using the position and orientation of the object in the spatial model, and calculates the gaze direction in the spatial model using the shooting position and orientation of the first-person video in the spatial model and the first-person video gaze position. The gaze position calculation unit calculates the object model gaze position by determining the intersection of the gaze direction and the object model in the spatial model.

本発明の一態様の注視位置分析システムは、空間中に存在する物体の三次元モデルである物体モデル上に利用者の物体モデル上の注視位置を自動的に対応付ける注視位置分析システムであって、撮影された複数枚の撮影画像から、周囲の空間の三次元モデルである空間モデルを作成する空間モデル作成部と、前記空間モデルと前記物体モデルとをマッチングし、前記マッチングで得られた位置姿勢により前記物体モデルを前記空間モデル上に配置する物体位置姿勢推定部と、前記配置された物体モデルと前記空間モデル中における注視方向に基づいて、前記物体モデル上の前記注視位置を算出する注視位置算出部と、を有することを特徴とする。 The gaze position analysis system according to one aspect of the present invention is a gaze position analysis system that automatically matches a user's gaze position on an object model, which is a three-dimensional model of an object existing in space, and is characterized by having a spatial model creation unit that creates a spatial model, which is a three-dimensional model of the surrounding space, from a plurality of captured images, an object position and orientation estimation unit that matches the spatial model with the object model and places the object model on the spatial model based on the position and orientation obtained by the matching, and a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and the gaze direction in the spatial model.

本発明の一態様によれば、注視位置分析システムにおいて、三次元空間中に存在する各物体に対応する三次元モデル上に利用者の注視位置を自動的に対応付けることができる。 According to one aspect of the present invention, a gaze position analysis system can automatically map a user's gaze position onto a three-dimensional model that corresponds to each object that exists in a three-dimensional space.

本発明の実施例の注視位置分析システムを一般的なコンピュータによって実行させる場合のコンピュータの構成図である。 FIG. 1 is a configuration diagram of a general-purpose computer that executes the gaze position analysis system according to the embodiment of the present invention.
本発明で想定している視線計測装置の基本的な構成を表す図である。 FIG. 2 is a diagram showing the basic configuration of the gaze measurement device assumed in the present invention.
一人称映像データのフォーマットの一例を示す図である。 FIG. 3 is a diagram showing an example of the format of first-person video data.
注視位置データのフォーマットの一例を示す図である。 FIG. 4 is a diagram showing an example of the format of gaze position data.
物体モデルデータのフォーマットの一例を示す図である。 FIG. 5 is a diagram showing an example of the format of object model data.
空間モデルデータのフォーマットの一例を示す図である。 FIG. 6 is a diagram showing an example of the format of spatial model data.
物体配置データのフォーマットの一例を示す図である。 FIG. 7 is a diagram showing an example of the format of object placement data.
空間モデル作成プログラムで実行される処理の内容を説明する図である。 FIG. 8 is a diagram explaining the processing executed by the spatial model creation program.
注視位置算出プログラムで実行される処理のフローチャートの一例を示す図である。 FIG. 9 is a diagram showing an example of a flowchart of the processing executed by the gaze position calculation program.
注視位置算出プログラムで実行される処理の内容を説明する図である。 FIG. 10 is a diagram explaining the processing executed by the gaze position calculation program.
実際の物体上に設置されたＡＲマーカーの一例を示す図である。 FIG. 11 is a diagram showing an example of an AR marker placed on an actual object.
形状や大きさが変化していく物体モデルの一例を示す図である。 FIG. 12 is a diagram showing an example of an object model whose shape and size change.
物体モデルと物体モデル上の注視位置の一例を示す図である。 FIG. 13 is a diagram showing an example of an object model and gaze positions on the object model.
注視位置を常に利用者に正対するように表示した場合の一例を示す図である。 FIG. 14 is a diagram showing an example in which the gaze position is displayed so that it always faces the user.
物体モデルの位置姿勢の調整方法を制限して注視位置を表示した場合の一例を示す図である。 FIG. 15 is a diagram showing an example in which the gaze position is displayed with the method of adjusting the position and orientation of the object model restricted.

以下、図面を用いて本発明の実施例について説明する。 The following describes an embodiment of the present invention with reference to the drawings.

図１は、本発明における実施例における注視位置分析システムを一般的なコンピュータによって実行させる場合のコンピュータの構成図である。 FIG. 1 is a diagram showing the configuration of a general-purpose computer that executes the gaze position analysis system according to an embodiment of the present invention.
図１における視線計測装置１０１は、利用者の視野と同様の映像である一人称映像と一人称映像上の注視位置を計測し、それぞれを一人称映像データ１１０と注視位置データ１１１を格納するためのデータベースに記録するための入力装置であり、「アイトラッカー」等の名称で一般的に良く利用されている装置を使用することができる。 The gaze measurement device 101 in FIG. 1 is an input device that measures a first-person video, which is video similar to the user's field of view, and the gaze position on that first-person video, and records them in the databases that store the first-person video data 110 and the gaze position data 111, respectively. A commonly used device known by names such as "eye tracker" can be used.

特に、本発明における視線計測装置１０１としては、利用者に装着して用いることができる携帯型の装置を想定している。これにより、利用者が自由に空間上を移動できる状態で注視方向を計測することができる。 In particular, the gaze measurement device 101 of the present invention is assumed to be a portable device that can be worn by the user. This makes it possible to measure the gaze direction of the user while they are able to move freely in space.

図２に、利用者に装着して用いる視線計測装置１０１の基本的な構成を示す。 FIG. 2 shows the basic configuration of the gaze measurement device 101 that is worn by a user.
図２における２０１は、利用者の一人称映像を取得するための撮影装置であり、一般的にパソコン等で用いられるカメラと同等の装置を使用することができる。２０２は、利用者の目の動きを検知し、注視方向を計測するためのセンサが搭載されている眼鏡型の装置である。２０３は、取得された一人称映像データ１１０と注視位置データ１１１をデータベースに記録するための端末である。端末２０３は、情報処理装置１０２にデータを送信するようにしても良い。 In FIG. 2, 201 is an imaging device for acquiring the user's first-person video; a device equivalent to the cameras generally used with personal computers can be used. 202 is a glasses-type device equipped with a sensor that detects the movement of the user's eyes and measures the gaze direction. 203 is a terminal for recording the acquired first-person video data 110 and gaze position data 111 in the databases. The terminal 203 may also transmit the data to the information processing device 102.

視線計測装置１０１としては、空間中を利用者が自由に移動できる状態での注視位置の計測が可能で、且つ、一人称映像上での注視位置の取得が可能であれば、携帯型に限らず、据え置き型の装置を用いても良い。 The gaze measurement device 101 is not limited to a portable type, but may be a stationary type device as long as it is capable of measuring the gaze position while the user is able to move freely through the space and is capable of acquiring the gaze position on a first-person image.

図１における情報処理装置１０２は、注視位置分析システムにおける各プログラムを実行するための情報処理装置である。 The information processing device 102 in FIG. 1 is an information processing device for executing each program in the gaze position analysis system.

入力装置１０３には、システムの開始や終了等を制御するためのキーボード、ボタン、マウスあるいはタッチパネル等の一般的なコンピュータにおける入力装置が含まれる。 The input device 103 includes typical computer input devices such as a keyboard, buttons, a mouse, or a touch panel for controlling the start and end of the system.

出力装置１０４は、利用者に注視位置分析の結果やシステムの動作状況等を表示するための手段であり、スマートフォンやタブレット端末の画面、あるいは一般的なコンピュータ用の表示装置が含まれる。 The output device 104 is a means for displaying the results of gaze position analysis and the system's operating status to the user, and may include the screen of a smartphone or tablet device, or a display device for a general computer.

また、１０５は、注視位置分析システムにおける各プログラムを格納するための記憶装置である。記憶装置１０５には、空間モデル作成プログラム１０６、物体位置姿勢推定プログラム１０７、注視位置算出プログラム１０８および注視位置表示プログラム１０９が含まれる。 105 is a storage device for storing each program in the gaze position analysis system. The storage device 105 contains a spatial model creation program 106, an object position and orientation estimation program 107, a gaze position calculation program 108, and a gaze position display program 109.

ここで、情報処理装置１０２は、空間モデル作成プログラム１０６に従って処理を実行することで空間モデル部として機能する。また、情報処理装置１０２は、物体位置姿勢推定プログラム１０７に従って処理を実行することで物体位置姿勢推定部として機能する。また、情報処理装置１０２は、注視位置算出プログラム１０８に従って処理を実行することで注視位置算出部として機能する。また、情報処理装置１０２は、注視位置表示プログラム１０９に従って処理を実行することで注視位置表示部として機能する。 Here, the information processing device 102 functions as a spatial model creation unit by executing processing according to the spatial model creation program 106. The information processing device 102 also functions as an object position and orientation estimation unit by executing processing according to the object position and orientation estimation program 107, as a gaze position calculation unit by executing processing according to the gaze position calculation program 108, and as a gaze position display unit by executing processing according to the gaze position display program 109.

一人称映像データ１１０のデータベースには、利用者の視野と同様の映像である一人称映像のデータが格納される。一人称映像データ１１０は上述のように視線計測装置１０１で取得されることを想定している。 The database of first-person video data 110 stores data on first-person video, which is video similar to the user's field of vision. It is assumed that the first-person video data 110 is acquired by the gaze measurement device 101 as described above.

図３に、一人称映像データ１１０のフォーマットの一例を示す。 FIG. 3 shows an example of the format of the first-person video data 110.
図３におけるデータの名称３０１は、一人称映像データに付与された名称であり、任意の文字および記号の列を用いることができる。また、一人称映像データ１１０には、あらかじめ定められた時間間隔、あるいは任意のタイミングで取得した複数枚の画像が取得された順番で含まれており、図３におけるデータ数３０２は、データ中に含まれる画像の枚数を表す。 The data name 301 in FIG. 3 is a name given to the first-person video data, and can be any string of characters and symbols. The first-person video data 110 contains a plurality of images, captured at predetermined time intervals or at arbitrary timings, in the order in which they were captured, and the number of data 302 in FIG. 3 indicates the number of images contained in the data.

３０３の時刻１は最初に画像が取得された時刻、３０４の画像１は最初に取得された画像、３０５の撮影位置姿勢１は最初の画像を撮影したカメラの位置姿勢を表す。３０６の時刻ｎはｎ番目の画像が取得された時刻、３０７の画像ｎはｎ番目の画像、３０８の撮影位置姿勢ｎはｎ番目の画像を撮影したカメラの位置姿勢を表す。なお、撮影位置姿勢３０５および３０８は、後述するように、空間モデル作成プログラム１０６によって算出されるデータであり、各画像を取得した時点では空欄である。 Time 1 in 303 indicates the time when the first image was acquired, image 1 in 304 indicates the first image acquired, and shooting position and orientation 1 in 305 indicates the position and orientation of the camera that captured the first image. Time n in 306 indicates the time when the nth image was acquired, image n in 307 indicates the nth image, and shooting position and orientation n in 308 indicates the position and orientation of the camera that captured the nth image. Note that shooting positions and orientations 305 and 308 are data calculated by the spatial model creation program 106, as described below, and are blank when each image is acquired.

また、一人称映像データ１１０のフォーマットとしては、一般的に用いられる動画フォーマット等、データ中の各時刻における画像が容易に取得できるフォーマットであれば、どのようなフォーマットを用いても良い。注視位置データ１１１のデータベースには、視線計測装置１０１で取得された注視位置のデータが格納される。 The format of the first-person video data 110 may be any format, such as a commonly used video format, as long as it is a format that allows easy acquisition of images at each time in the data. The gaze position data 111 database stores the gaze position data acquired by the gaze measurement device 101.

図４に、注視位置データ１１１のフォーマットの一例を示す。 FIG. 4 shows an example of the format of the gaze position data 111.
上述のように、注視位置は一人称映像データ１１０上における位置座標として表されることを前提とするため、注視位置データ１１１には対応する一人称映像データ１１０が存在する。このため、データの名称４０１に対応する一人称映像データにおけるデータの名称３０１と同じ名称を記述することにより、注視位置データ１１１と一人称映像データ１１０との対応関係を表す。 As described above, the gaze position is assumed to be expressed as position coordinates on the first-person video data 110, so each set of gaze position data 111 has corresponding first-person video data 110. The correspondence between the gaze position data 111 and the first-person video data 110 is therefore expressed by writing, in the data name 401, the same name as the data name 301 of the corresponding first-person video data.

また、注視位置データ１１１には、あらかじめ定められた時間間隔、あるいは任意のタイミングで取得した複数の位置座標が取得された順番で含まれている。図４におけるデータ数４０２は、データ中に含まれる位置座標の数を表す。４０３は最初に注視位置の位置座標が取得された時刻、４０４は最初に取得された注視位置の位置座標、４０５は注視位置算出プログラム１０８により注視位置データが物体モデルに対応付けられた場合、その物体の名称を記述する。 The gaze position data 111 also includes multiple position coordinates acquired at predetermined time intervals or at arbitrary timing in the order in which they were acquired. The number of data 402 in FIG. 4 indicates the number of position coordinates included in the data. 403 is the time when the gaze position coordinates were first acquired, 404 is the position coordinates of the first acquired gaze position, and 405 describes the name of the object when the gaze position data is associated with an object model by the gaze position calculation program 108.

４０６はｎ番目に注視位置の位置座標が取得された時刻、４０７はｎ番目に取得された注視位置の位置座標、４０８は注視位置算出プログラム１０８により注視位置データが物体モデルに対応付けられた場合、その物体の名称を記述する。注視位置データが取得された時点では、対応付けられた物体モデルは無いため、４０５および４０８は空欄となる。 406 is the time when the position coordinates of the gaze position were acquired for the nth time, 407 is the position coordinates of the gaze position acquired for the nth time, and 408 describes the name of the object when the gaze position data is associated with an object model by the gaze position calculation program 108. At the time when the gaze position data is acquired, there is no associated object model, so 405 and 408 are left blank.

また、注視位置データ１１１が取得された時点では、注視位置は一人称映像上の位置座標、すなわち二次元の座標データであるが、注視位置データ１１１が物体モデルに対応付けられた場合は物体モデル上の位置座標となり三次元の座標データに書き換えられる。 In addition, when the gaze position data 111 is acquired, the gaze position is a position coordinate on the first-person video, i.e., two-dimensional coordinate data, but when the gaze position data 111 is associated with an object model, it becomes a position coordinate on the object model and is rewritten into three-dimensional coordinate data.

上述した注視位置データ１１１では、視線計測装置１０１で取得された注視位置データと、物体モデルに対応付けられた注視位置データを同じデータで管理することを前提としている。しかし、視線計測装置１０１で取得された注視位置データと、物体モデルに対応付けられた注視位置データを別々のデータとして管理するようにしても良い。 The gaze position data 111 described above is based on the premise that the gaze position data acquired by the gaze measurement device 101 and the gaze position data associated with the object model are managed as the same data. However, the gaze position data acquired by the gaze measurement device 101 and the gaze position data associated with the object model may be managed as separate data.
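The gaze position data format of FIG. 4 can be pictured as a small record type. The following is only an illustrative sketch: the class names, field names, and the `associate` helper are hypothetical and do not appear in the patent, and the actual storage format is not specified beyond items 401 to 408.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

Point2D = Tuple[float, float]          # position on the first-person video
Point3D = Tuple[float, float, float]   # position on an object model


@dataclass
class GazeSample:
    time: float                          # items 403 / 406
    position: Union[Point2D, Point3D]    # items 404 / 407
    object_name: Optional[str] = None    # items 405 / 408, blank at acquisition


@dataclass
class GazePositionData:
    name: str                            # item 401, same string as data name 301
    samples: List[GazeSample] = field(default_factory=list)

    @property
    def count(self) -> int:              # item 402
        return len(self.samples)


def associate(sample: GazeSample, object_name: str, point: Point3D) -> None:
    """Rewrite a 2D sample in place once it has been mapped onto an object model."""
    sample.position = point
    sample.object_name = object_name
```

Keeping the 2D and 3D coordinates in the same field mirrors the single-record management described above; storing them in separate records, as the text also allows, would simply mean two such classes.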

物体モデルデータ１１２のデータベースには、利用者が視線を向ける範囲内の空間に存在する物体の三次元モデルを格納する。本発明では、それぞれの物体の三次元モデルは物体の形状を表す点の集まり、すなわち点群データとして表されることを想定する。 The database of object model data 112 stores three-dimensional models of objects that exist in the space within the user's line of sight. In this invention, it is assumed that the three-dimensional model of each object is represented as a collection of points that represent the shape of the object, i.e., point cloud data.

一般的な三次元ＣＡＤ等を使用して作成される物体モデルは、多角形の集まりとして表される場合が多いが、多角形で表された物体モデルは、容易に点群モデルに変換することができる。例えば、まずそれぞれの多角形について、ある頂点とその頂点と隣り合う頂点以外を結ぶ線で多角形を三角形に分割し、それぞれの三角形については、ある頂点とそれに向かい合う辺の中点を結ぶ線で三角形を分割することを繰り返し、最終的に全ての三角形の頂点を選択することで、多角形で表されている物体モデルを点群データとして表すことができる。 Object models created with general 3D CAD tools are often represented as collections of polygons, but a polygonal object model can easily be converted into a point cloud model. For example, each polygon is first divided into triangles by lines connecting one vertex to every vertex that is not adjacent to it. Each triangle is then repeatedly divided by a line connecting one of its vertices to the midpoint of the opposite side. Finally, the vertices of all resulting triangles are selected, so that the polygonal object model is represented as point cloud data.
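The polygon-to-point-cloud conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: polygons are assumed to be convex, the function names are hypothetical, and splitting along the median of the longest edge is a choice made here to avoid thin triangles (the text only says "a vertex and the midpoint of the opposite side").

```python
import numpy as np


def fan_triangulate(polygon):
    """Split a convex polygon (n x 3 vertices) into triangles using diagonals from vertex 0."""
    polygon = np.asarray(polygon, dtype=float)
    return [(polygon[0], polygon[i], polygon[i + 1]) for i in range(1, len(polygon) - 1)]


def split_by_median(tri):
    """Split a triangle in two along the median that bisects its longest edge."""
    a, b, c = tri
    # Lengths of the edges opposite a, b and c respectively.
    edges = [np.linalg.norm(c - b), np.linalg.norm(a - c), np.linalg.norm(b - a)]
    k = int(np.argmax(edges))
    # Relabel so that a is the vertex opposite the longest edge (b, c).
    a, b, c = [(a, b, c), (b, c, a), (c, a, b)][k]
    m = (b + c) / 2.0
    return [(a, b, m), (a, m, c)]


def polygons_to_points(polygons, depth=3):
    """Convert polygons to a point cloud made of all triangle vertices after `depth` splits."""
    triangles = [t for p in polygons for t in fan_triangulate(p)]
    for _ in range(depth):
        triangles = [s for t in triangles for s in split_by_median(t)]
    points = np.array([v for t in triangles for v in t])
    # Shared vertices appear many times; keep each one once.
    return np.unique(points.round(9), axis=0)
```

Increasing `depth` doubles the number of triangles at each level, so the point density can be tuned to roughly match that of the spatial model.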

図５に、物体モデルデータ１１２のデータベースに格納される物体モデルのフォーマットの一例を示す。 FIG. 5 shows an example of the format of an object model stored in the database of the object model data 112.
図５における５０１は、物体に付与された名称であり、任意の文字および記号の列を用いることができる。５０２は物体モデル中に含まれる点の数、５０３は最初の点の位置座標、５０４はｎ番目の点の位置座標を表す。図５に示すフォーマットでは点の位置座標のみを含めているが、例えば色に関する情報等、それぞれの点に付属する情報を含めるようにしても良い。 In FIG. 5, 501 is a name given to the object, and can be any string of characters and symbols. 502 indicates the number of points contained in the object model, 503 indicates the position coordinates of the first point, and 504 indicates the position coordinates of the nth point. The format shown in FIG. 5 contains only the position coordinates of the points, but information attached to each point, such as color information, may also be included.

空間モデルデータ１１３のデータベースには、利用者が視線を向ける範囲内の空間に対する三次元モデルである空間モデルを格納する。空間モデルは、一人称映像データ１１０を用いて空間モデル作成プログラム１０６によって作成されるデータであり、上述した物体モデルと同様に点群データとして表されることを想定している。 The spatial model data 113 database stores a spatial model, which is a three-dimensional model of the space within the user's line of sight. The spatial model is data created by the spatial model creation program 106 using the first-person video data 110, and is assumed to be represented as point cloud data in the same way as the object model described above.

図６に、空間モデルデータ１１３のデータベースに格納される空間モデルのフォーマットの一例を示す。 FIG. 6 shows an example of the format of a spatial model stored in the database of the spatial model data 113.
図６におけるモデルの名称６０１は、空間モデルに付与された名称であり、任意の文字および記号の列を用いることができる。６０２は空間モデルの作成に使用された一人称映像データの名称、６０３は空間モデル中に含まれる点の数、６０４は最初の点の位置座標、６０５はｎ番目の点の位置座標を表す。図６に示すフォーマットでは点の位置座標のみを含めているが、例えば色に関する情報等、それぞれの点に付属する情報を含めるようにしても良い。 The model name 601 in FIG. 6 is a name given to the spatial model, and can be any string of characters and symbols. 602 is the name of the first-person video data used to create the spatial model, 603 is the number of points contained in the spatial model, 604 is the position coordinates of the first point, and 605 is the position coordinates of the nth point. The format shown in FIG. 6 contains only the position coordinates of the points, but information attached to each point, such as color information, may also be included.

物体配置データ１１４のデータベースには、物体位置姿勢推定プログラム１０７によって、空間モデルにマッチングされ、空間モデル上に配置された物体モデルに関するデータを格納する。 The object placement data 114 database stores data on object models that are matched to a spatial model and placed on the spatial model by the object position and orientation estimation program 107.

図７に、物体配置データ１１４のデータベースに格納される物体配置データ１１４のフォーマットの一例を示す。 FIG. 7 shows an example of the format of the object placement data 114 stored in the database of the object placement data 114.
物体配置データ１１４は空間モデルごとに格納される想定として、図７におけるモデルの名称７０１に対応する空間モデルの名称を記述する。物体の数７０２は、対象となっている空間モデルに配置されている物体モデルの数を表す。 The object placement data 114 is assumed to be stored for each spatial model, and the name of the corresponding spatial model is written in the model name 701 in FIG. 7. The number of objects 702 indicates the number of object models placed in that spatial model.

図７において、７０３は空間モデル上に配置されている最初の物体の名称、７０４は最初の物体の空間モデル上における位置姿勢、７０５は最初の物体が空間モデルに配置された時刻を表す。７０６は空間モデル上に配置されているｎ番目の物体の名称、７０７はｎ番目の物体の空間モデル上における位置姿勢、７０８はｎ番目の物体が空間モデルに配置された時刻を表す。 In FIG. 7, 703 indicates the name of the first object placed on the spatial model, 704 indicates the position and orientation of the first object on the spatial model, and 705 indicates the time when the first object was placed on the spatial model. 706 indicates the name of the nth object placed on the spatial model, 707 indicates the position and orientation of the nth object on the spatial model, and 708 indicates the time when the nth object was placed on the spatial model.

本発明は、視線計測装置１０１から取得された一人称映像データ１１０および注視位置データ１１１から、空間モデル作成プログラム１０６、物体位置姿勢推定プログラム１０７および注視位置算出プログラム１０８によって、各物体上における注視位置の分析処理を行う。本実施例では特に、視線計測装置１０１から一人称映像データ１１０および注視位置データ１１１をリアルタイムに取得しながら処理を行うことを想定している。 The present invention performs analysis processing of the gaze position on each object using a spatial model creation program 106, an object position and orientation estimation program 107, and a gaze position calculation program 108 from first-person video data 110 and gaze position data 111 acquired from the gaze measurement device 101. In particular, this embodiment assumes that processing is performed while first-person video data 110 and gaze position data 111 are acquired in real time from the gaze measurement device 101.

このためまず、空間モデル作成プログラム１０６では、視線計測装置１０１から取得され、一人称映像データ１１０のデータベースに保存されたデータから、新たに保存されたデータを常時読み込み、空間モデルを作成する処理を行う。 For this reason, the spatial model creation program 106 first performs a process of constantly reading newly saved data from the data acquired from the gaze measurement device 101 and stored in the database of first-person video data 110, and creating a spatial model.

映像データから点群データで表現される空間モデルを作成する技術としては、良く知られたＳＬＡＭ（ＳｉｍｕｌｔａｎｅｏｕｓＬｏｃａｌｉｚａｔｉｏｎａｎｄＭａｐｐｉｎｇ）およびＭＶＳ（Ｍｕｌｔｉ－ＶｉｅｗＳｔｅｒｅｏ）法を組み合わせた技術を用いることができる。 A technique that can be used to create a spatial model represented by point cloud data from video data is a combination of the well-known SLAM (Simultaneous Localization and Mapping) and MVS (Multi-View Stereo) methods.

ＳＬＡＭはカメラを移動させた場合に対応する連続した複数枚の画像を用いて、画像間の対応関係を解析することにより、粗い点群データを作成する技術である。一方、ＭＶＳ法は、ＳＬＡＭにおける解析結果を用いることにより、より詳細な密な点群データを作成する技術である。 SLAM is a technology that creates coarse point cloud data by analyzing the correspondence between images using multiple consecutive images that correspond to the movement of the camera. On the other hand, the MVS method is a technology that creates more detailed, dense point cloud data by using the analysis results of SLAM.

ＳＬＡＭおよびＭＶＳ法を用いて点群データによる空間モデルを作成する場合のイメージを図８に示す。 Figure 8 shows an example of how a spatial model is created using point cloud data with the SLAM and MVS methods.

図８において８０１は、利用者が視線を移動する範囲に存在している物体を表しており、簡単のため、周囲には他の物体は存在しないと想定している。８０２および８０４は一人称映像を撮影した際のカメラの位置を表しており、８０３および８０５はそれぞれのカメラの位置から一人称映像を撮影した際のカメラの視野の範囲およびカメラの姿勢を図示したものである。 In Figure 8, 801 represents an object that exists within the range where the user's line of sight moves, and for simplicity, it is assumed that there are no other objects in the vicinity. 802 and 804 represent the camera positions when the first-person video was captured, and 803 and 805 illustrate the range of the camera's field of view and the camera's attitude when the first-person video was captured from each camera position.

８０２および８０４に示すように、複数の箇所から撮影した同一物体あるいは同一箇所の画像を用いることにより、８０６に示すような点群データで表される空間モデルが作成される。 As shown in 802 and 804, by using images of the same object or the same location taken from multiple locations, a spatial model represented by point cloud data as shown in 806 is created.
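SLAM and MVS pipelines are far more involved than can be shown here, but the geometric idea behind FIG. 8, recovering a 3D point from images taken at two camera positions, can be illustrated with linear (DLT) triangulation. This is a sketch of that single step only, not of SLAM or MVS; the function names and the pinhole intrinsics matrix `K` are assumptions introduced for the example.

```python
import numpy as np


def projection_matrix(K, R, cam_pos):
    """P = K [R | -R C] for a camera with world-to-camera rotation R and centre C."""
    t = -R @ np.asarray(cam_pos, dtype=float)
    return K @ np.hstack([R, t.reshape(3, 1)])


def project(P, X):
    """Project a 3D point X to pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def triangulate(P1, x1, P2, x2):
    """Recover the 3D point seen at pixel x1 in view 1 and pixel x2 in view 2."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

Repeating this for many matched image features across many frames yields the kind of point cloud shown as 806; real systems additionally estimate the camera poses themselves and refine everything jointly.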

図８では、カメラの位置は２箇所のみを示しているが、より正確な空間モデル８０６を作成するために、一般的には、それ以上の枚数の画像が用いられる。以上のような技術を用いることにより、利用者の注視位置の移動に応じて、空間モデル８０６をリアルタイムに作成することができる。空間モデル作成プログラム１０６で使用する技術としては、上記の技術の他、同様に点群データで表された空間モデルを作成できる技術であれば、どのような技術を用いても良い。 In FIG. 8, only two camera positions are shown, but more images are generally used to create a more accurate spatial model 806. By using the above-mentioned techniques, the spatial model 806 can be created in real time in response to changes in the user's gaze position. In addition to the above-mentioned techniques, any other technique may be used in the spatial model creation program 106 as long as it can create a spatial model similarly represented by point cloud data.

また、ＳＬＡＭ技術を用いることにより、作成された空間モデル上における一人称映像を撮影したカメラの位置姿勢も同時に算出することができる。後述するように、カメラの位置姿勢は物体モデル上の注視位置の算出で必要となる情報である。このため、カメラの位置姿勢を算出することができない空間モデルの作成技術を使用する場合は、空間モデル上のカメラの位置姿勢を取得する手段を別途利用する必要がある。例えば、位置姿勢を取得するセンサを利用する等、空間中の位置姿勢を取得する技術であればどのような技術でも使用することができる。 Furthermore, by using SLAM technology, the position and orientation of the camera that captured the first-person video on the created spatial model can also be calculated at the same time. As described below, the position and orientation of the camera is information required to calculate the gaze position on the object model. For this reason, when using a spatial model creation technology that cannot calculate the position and orientation of the camera, it is necessary to use a separate means of acquiring the position and orientation of the camera on the spatial model. For example, any technology that acquires the position and orientation in space can be used, such as using a sensor that acquires the position and orientation.

物体位置姿勢推定プログラム１０７は、空間モデル作成プログラム１０６によって作成された空間モデル中の一部の点群データに対して、物体モデルの点群データをマッチング、すなわち位置や姿勢がうまく合うように合わせ込みを行い、空間モデル中における物体モデルの位置姿勢を求めるためのプログラムである。 The object position and orientation estimation program 107 is a program for matching the point cloud data of the object model to a portion of the point cloud data in the spatial model created by the spatial model creation program 106, i.e., for adjusting the position and orientation so that they match well, and for determining the position and orientation of the object model in the spatial model.

物体位置姿勢推定プログラム１０７で使用する技術としては、良く知られたＩＣＰ（ＩｔｅｒａｔｉｖｅＣｌｏｓｅｓｔＰｏｉｎｔ）アルゴリズムやＮＤＴ（ＮｏｒｍａｌＤｉｓｔｒｉｂｕｔｉｏｎＴｒａｎｓｆｏｒｍ）アルゴリズムと呼ばれる技術を用いることができる。あるいは、点群データ同士のマッチングを行い、空間モデル中における物体モデルの位置姿勢を求めることができる技術であればどのような技術を用いても良い。また、物体位置姿勢推定プログラム１０７は、後述の注視位置算出プログラム１０８によって、処理を行うタイミングが制御される。 The object position and orientation estimation program 107 may use well-known techniques such as the ICP (Iterative Closest Point) algorithm or the NDT (Normal Distributions Transform) algorithm. Alternatively, any technique may be used as long as it can match point cloud data with each other and determine the position and orientation of the object model in the spatial model. The timing at which the object position and orientation estimation program 107 performs its processing is controlled by the gaze position calculation program 108, which is described later.
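As an illustration of the point cloud matching mentioned above, the following is a minimal point-to-point ICP sketch using NumPy. It assumes the object model starts reasonably close to its true pose and uses a brute-force nearest-neighbour search; the function names are hypothetical, and a practical system would more likely rely on an existing registration library with a spatial index and outlier rejection.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~= dst_i (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(obj_points, space_points, iterations=30):
    """Estimate the pose (R, t) that places obj_points onto space_points."""
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = obj_points.copy()
    for _ in range(iterations):
        # Brute-force nearest neighbour in the spatial model for each object point.
        d2 = ((moved[:, None, :] - space_points[None, :, :]) ** 2).sum(axis=2)
        nearest = space_points[d2.argmin(axis=1)]
        R, t = best_rigid_transform(moved, nearest)
        moved = moved @ R.T + t
        # Accumulate the incremental transform into the overall pose.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

The returned `(R_total, t_total)` plays the role of the position and orientation stored as items 704 and 707 of the object placement data, under the assumption that a pose is stored as a rotation plus a translation.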

注視位置算出プログラム１０８は、視線計測装置１０１によって取得された注視位置データ１１１、物体モデルデータ１１２および空間モデル作成プログラム１０６によって作成された空間モデルデータ１１３を用いて、物体モデル上における注視位置を求めるプログラムである。 The gaze position calculation program 108 is a program that calculates the gaze position on the object model using gaze position data 111 acquired by the gaze measurement device 101, object model data 112, and spatial model data 113 created by the spatial model creation program 106.

図９のフローチャートを用いて、注視位置算出プログラム１０８の処理の流れを説明する。 The flow of processing of the gaze position calculation program 108 is described with reference to the flowchart of FIG. 9.
図９におけるステップ９０１では、注視位置データ１１１のデータベースに格納されている注視位置データ１１１から新たな注視位置データ１１１を取得する。あるいは、視線計測装置１０１から新たな注視位置データを直接取得するようにしても良い。 In step 901 in FIG. 9, new gaze position data 111 is acquired from the gaze position data 111 stored in the database of the gaze position data 111. Alternatively, new gaze position data may be acquired directly from the gaze measurement device 101.

ステップ９０２では、新たに取得された注視位置データ１１１、および空間モデル作成プログラム１０６によって空間モデルを作成する際に算出され、一人称映像データ１１０に記載されている撮影位置姿勢３０５および３０８を用いて、空間モデル上の注視方向を算出する。 In step 902, the gaze direction on the spatial model is calculated using the newly acquired gaze position data 111 and the shooting positions and attitudes 305 and 308 calculated when creating the spatial model by the spatial model creation program 106 and recorded in the first-person video data 110.

また、空間モデル上の注視方向は、注視の始点となる位置とその位置からの視線方向を示すベクトルで表される。空間モデル上の注視方向の算出は、まず、新たに取得された注視位置データ１１１と同じ時刻の一人称映像データに対する撮影位置姿勢を一人称映像データ１１０から取得する。 The gaze direction on the spatial model is expressed by the position that is the starting point of the gaze and a vector indicating the line-of-sight direction from that position. To calculate the gaze direction on the spatial model, first, the shooting position and orientation for the first-person video data at the same time as the newly acquired gaze position data 111 is acquired from the first-person video data 110.

全く同じ時刻の撮影位置姿勢が存在しない場合は、例えば、注視位置データ１１１の時刻の前後の時刻に対応する一人称映像データ１１０に対する撮影位置姿勢を取得し、注視位置データ１１１の時刻との関係に基づいて補間を行うことにより求めた撮影位置姿勢を使用する、等の方法を用いれば良い。次に、取得した撮影位置姿勢を用いて、一人称映像上の注視位置を空間モデル上の注視位置に座標変換する。 If there is no shooting position and orientation at exactly the same time, a method may be used in which, for example, the shooting positions and orientations for the first-person video data 110 corresponding to the times before and after the time of the gaze position data 111 are acquired, and a shooting position and orientation obtained by interpolating between them according to their relationship to the time of the gaze position data 111 is used. Next, the acquired shooting position and orientation are used to convert the gaze position on the first-person video into a gaze position on the spatial model.
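
The interpolation of the shooting position and orientation between the two neighboring times may be sketched, for example, as follows. The description does not specify the interpolation method; this sketch assumes linear interpolation of the position and spherical linear interpolation of the orientation, with the orientation held as a unit quaternion. The function names and the pose layout `(time, position, quaternion)` are hypothetical.

```python
import math

def lerp(a, b, w):
    """Linear interpolation between two 3-vectors."""
    return tuple(x + (y - x) * w for x, y in zip(a, b))

def slerp(q0, q1, w):
    """Spherical interpolation between two unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                       # take the shorter arc
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:                    # nearly identical: plain lerp is stable
        q = tuple(a + (b - a) * w for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - w) * theta) / math.sin(theta)
        s1 = math.sin(w * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def interpolate_pose(t, pose_before, pose_after):
    """Each pose is (time, position, quaternion); t lies between the two times."""
    t0, p0, q0 = pose_before
    t1, p1, q1 = pose_after
    w = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
    return lerp(p0, p1, w), slerp(q0, q1, w)
```

The weight `w` is the fraction of the interval between the two video frames that has elapsed at the time of the gaze sample.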

上述のように、注視位置データ１１１は一人称映像上の注視位置で表されるが、良く知られたピンホールカメラモデルにカメラの視野角や焦点距離等の情報を適用することにより、撮影位置、一人称映像および注視位置の位置関係を実際の空間と同一サイズで求めることができる。さらに、一人称映像データ１１０から取得した撮影位置姿勢により、求めた位置関係を座標変換することにより、図１０に示すように、空間モデル上におけるカメラの撮影位置、一人称映像および注視位置の位置関係を表すことができる。 As described above, the gaze position data 111 is represented by the gaze position on the first-person video, but by applying information such as the camera's viewing angle and focal length to a well-known pinhole camera model, the positional relationship between the shooting position, first-person video, and gaze position can be obtained at the same size as in the actual space. Furthermore, by performing coordinate conversion on the obtained positional relationship using the shooting position and orientation obtained from the first-person video data 110, the positional relationship between the camera's shooting position, first-person video, and gaze position on the spatial model can be expressed as shown in FIG. 10.

図１０において、１００１は空間モデル、１００２は空間モデル１００１上における一人称映像を撮影したカメラの撮影位置、１００３は一人称映像の空間モデル１００１上における撮影範囲を示しており、カメラの撮影姿勢に対応する。 In FIG. 10, 1001 indicates the spatial model, 1002 indicates the shooting position of the camera that captured the first-person video on the spatial model 1001, and 1003 indicates the shooting range of the first-person video on the spatial model 1001, which corresponds to the shooting posture of the camera.

また、１００４は空間モデル１００１に対応付けられた一人称映像上の注視位置である。空間モデル１００１上の注視方向は、図１０において、１００２を始点として一人称映像上の注視位置１００４を通過するベクトル１００５として求めることができる。 1004 is the gaze position on the first-person image that is associated with the spatial model 1001. The gaze direction on the spatial model 1001 can be obtained as vector 1005 in FIG. 10 that starts at 1002 and passes through gaze position 1004 on the first-person image.
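
The construction of the gaze direction vector 1005 from the shooting position 1002 and the gaze position 1004 may be sketched, for example, as follows. This is a minimal sketch assuming an ideal pinhole camera with the principal point at the image center, square pixels, and no lens distortion; the camera axis convention (x right, y down, z forward) and all names are assumptions, not taken from the description.

```python
import math

def gaze_ray(pixel, image_size, fov_x_deg, cam_pos, cam_rot):
    """Return (origin, unit direction) of the gaze ray in spatial-model coordinates.

    pixel      : (u, v) gaze position on the first-person image, in pixels
    image_size : (width, height) in pixels
    fov_x_deg  : horizontal viewing angle of the camera, in degrees
    cam_pos    : shooting position in the spatial model
    cam_rot    : 3x3 rotation matrix (list of rows) from camera axes to model axes
    """
    width, height = image_size
    # Focal length in pixels, derived from the horizontal viewing angle.
    f = (width / 2.0) / math.tan(math.radians(fov_x_deg) / 2.0)
    # Direction through the pixel in the camera frame.
    d_cam = (pixel[0] - width / 2.0, pixel[1] - height / 2.0, f)
    # Rotate the direction into the spatial-model frame and normalize it.
    d = tuple(sum(cam_rot[i][j] * d_cam[j] for j in range(3)) for i in range(3))
    n = math.sqrt(sum(c * c for c in d))
    return tuple(cam_pos), tuple(c / n for c in d)
```

The returned origin corresponds to the shooting position 1002 and the returned direction to the vector 1005.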

図９におけるステップ９０３では、ステップ９０２で求めた空間モデル１００１上の注視方向に、空間モデル１００１上に配置された物体モデルが存在するかどうかの判定を行い、空間モデル１００１上の注視方向に空間モデル１００１上に配置された物体モデルが存在すると判定された場合はステップ９０６に、存在しないと判定された場合は９０４に、それぞれ進む。 In step 903 in FIG. 9, it is determined whether an object model placed on the spatial model 1001 exists in the gaze direction on the spatial model 1001 determined in step 902. If it is determined that an object model placed on the spatial model 1001 exists in the gaze direction on the spatial model 1001, the process proceeds to step 906, and if it is determined that an object model does not exist, the process proceeds to step 904.

判定を行う方法としては、まず、物体配置データ１１４から対象となる空間モデル１００１に配置された物体モデルの情報を取得し、該当する物体モデルの点群データを物体モデルデータ１１２から読み込み空間モデル上に配置する。次に、空間モデル１００１上の注視方向を表すベクトル（以下、注視方向ベクトル）、例えば図１０におけるベクトル１００５と空間モデル１００１上に配置された物体モデルの各点との距離を求め、注視方向ベクトルと最も近い距離を選択する。 The determination is made as follows. First, information on the object models placed in the target spatial model 1001 is obtained from the object placement data 114, and the point cloud data of each corresponding object model is read from the object model data 112 and placed on the spatial model. Next, the distance between the vector representing the gaze direction on the spatial model 1001 (hereinafter, the gaze direction vector), for example vector 1005 in FIG. 10, and each point of the object model placed on the spatial model 1001 is calculated, and the smallest of these distances is selected.

選択された距離があらかじめ定められた閾値以下であれば、対象となる物体モデルが注視方向に存在すると判定することができる。距離に対する閾値としては、例えば、物体モデルを構成する点群データにおける点間の距離において、最も大きい距離や最も大きい距離の２分の１等を選択することができる。また、判定方法としては、空間モデル上の注視方向ベクトルとの距離があらかじめ定められた閾値以下である点が、あらかじめ定められた閾値以上の個数存在する場合に、対象となる物体モデルが注視方向にあると判定するようにしても良い。 If the selected distance is equal to or less than a predetermined threshold, it can be determined that the target object model exists in the gaze direction. As the distance threshold, for example, the largest distance between points in the point cloud data constituting the object model, or half of that largest distance, can be selected. Alternatively, the target object model may be determined to be in the gaze direction when the number of points whose distance from the gaze direction vector on the spatial model is equal to or less than a predetermined threshold is itself equal to or greater than a predetermined count.
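
The point-based determination described above may be sketched, for example, as follows. The sketch covers both variants: with `min_count=1` it is equivalent to testing the smallest distance against the threshold, and with a larger `min_count` it is the count-based variant. Treating the gaze direction vector as a half-line starting at the shooting position (so that points behind the camera are measured from the start point) is an assumption of this sketch, and the function names are hypothetical.

```python
import math

def distance_to_ray(point, origin, direction):
    """Distance from a point to a ray given by its origin and unit direction."""
    v = tuple(p - o for p, o in zip(point, origin))
    # Position of the closest location along the ray, clamped to the start point.
    s = max(0.0, sum(a * b for a, b in zip(v, direction)))
    closest = tuple(o + s * d for o, d in zip(origin, direction))
    return math.dist(point, closest)

def object_in_gaze_direction(points, origin, direction, max_dist, min_count=1):
    """True if at least min_count object-model points lie within max_dist of the ray."""
    near = sum(1 for p in points
               if distance_to_ray(p, origin, direction) <= max_dist)
    return near >= min_count
```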

また、物体モデルデータ１１２のデータベースに格納する物体モデルとして、点群データの他に多角形（ポリゴン）の集合で表された形式のデータであるポリゴンデータも格納しておき、空間モデル１００１上に該当する物体モデルもポリゴンデータを配置し、空間モデル１００１上に配置された各ポリゴンと空間モデル上に配置された視線方向を表すベクトルとが交差する点が存在する場合、対象となる物体モデルが注視方向に存在すると判定するようにしても良い。 In addition to the point cloud data, polygon data, which is data represented as a collection of polygons, may also be stored as object models in the database of object model data 112, and the polygon data of the corresponding object model may also be placed on the spatial model 1001. If the vector representing the line of sight placed on the spatial model intersects any of the polygons placed on the spatial model 1001, it may be determined that the target object model exists in the gaze direction.

さらには、空間モデル１００１上に配置された物体が複数存在する場合は、上述した注視方向ベクトル１００５と最も近い距離にある点や注視方向ベクトル１００５と交差するポリゴンの注視方向ベクトル上の位置を求め、注視方向ベクトル１００５の始点、すなわち撮影位置に最も近い点あるいはポリゴンを含む物体モデルを注視方向に存在する物体モデルとして選択すれば良い。 Furthermore, if multiple objects are placed on the spatial model 1001, the position along the gaze direction vector is obtained for the point closest to the gaze direction vector 1005, or for the polygon that intersects the gaze direction vector 1005, of each object. The object model containing the point or polygon closest to the starting point of the gaze direction vector 1005, i.e., the shooting position, is then selected as the object model existing in the gaze direction.

またさらには、空間モデル１００１と同じ位置関係を表すことができる仮想空間を用意し、物体が空間モデル１００１上に配置されるごとに、仮想空間上に物体モデルを配置し、仮想空間上で上述した判定処理を実行するようにしても良い。 Furthermore, a virtual space that can express the same positional relationship as the spatial model 1001 may be prepared, and each time an object is placed on the spatial model 1001, an object model may be placed in the virtual space, and the above-mentioned determination process may be executed in the virtual space.

ステップ９０４では、物体位置姿勢推定プログラム１０７を用いて、空間モデル１００１上の注視方向に存在する空間モデル１００１上の点群データに対して、物体モデル１００６をマッチングする。この処理は、図１０において、空間モデル１００１に対して物体モデル１００６を位置や姿勢を様々に調整することにより、対象とする物体モデル１００６で空間モデル１００１の一部を置き換えることを可能とするための処理である。 In step 904, the object position and orientation estimation program 107 is used to match the object model 1006 to the point cloud data on the spatial model 1001 that exists in the gaze direction on the spatial model 1001. In FIG. 10, this process adjusts the position and orientation of the object model 1006 relative to the spatial model 1001 in various ways so that a part of the spatial model 1001 can be replaced with the target object model 1006.

視線方向に存在する空間モデル１００１上の点群データとしては、例えば、注視方向ベクトル１００５からの距離があらかじめ定められた距離以下である空間モデル上の点群データを選択することができる。あらかじめ定められた距離としては、マッチングを行う物体モデルの最大サイズや最大サイズの２分の１等を用いることができる。 As point cloud data on the spatial model 1001 existing in the gaze direction, for example, point cloud data on the spatial model whose distance from the gaze direction vector 1005 is equal to or less than a predetermined distance can be selected. The predetermined distance can be the maximum size of the object model to be matched or half the maximum size, etc.

あるいは、注視方向ベクトル１００５を中心軸とするあらかじめ定められた大きさの特定の形状の範囲内に存在する点群データとすることもできる。また、点群データ中の各点の注視方向ベクトル上での位置を求め、点群データが注視方向ベクトル上で最も集中している箇所を中心にあらかじめ定められた範囲の点を選択することもできる。以上の他にも、注視方向ベクトルを中心としてマッチングの対象とする点群データを選択できる方法であれば、どのような方法を用いても良い。 Alternatively, the point cloud data may be present within a range of a specific shape and a predetermined size with the gaze direction vector 1005 as the central axis. Also, the position of each point in the point cloud data on the gaze direction vector may be obtained, and points within a predetermined range may be selected centered on the point where the point cloud data is most concentrated on the gaze direction vector. In addition to the above, any method may be used as long as it is capable of selecting point cloud data to be matched with the gaze direction vector as the center.
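
The selection of the point cloud data to be matched may be sketched, for example, as follows. The description presents the distance-based selection and the selection around the densest location on the gaze direction vector as alternatives; this sketch combines the two for illustration, using a cylinder around the vector and a fixed-width histogram along it. The histogram approach, the parameter names, and the function names are all assumptions.

```python
import math
from collections import Counter

def project(point, origin, direction):
    """Return (position along the ray, distance from the ray) for a point."""
    v = tuple(p - o for p, o in zip(point, origin))
    s = sum(a * b for a, b in zip(v, direction))
    foot = tuple(o + s * d for o, d in zip(origin, direction))
    return s, math.dist(point, foot)

def select_candidates(points, origin, direction, radius, bin_size, half_range):
    """Keep points inside a cylinder around the ray, near its densest stretch."""
    inside = [(p, s) for p in points
              for s, d in [project(p, origin, direction)]
              if s >= 0.0 and d <= radius]
    if not inside:
        return []
    # Histogram of positions along the ray; the fullest bin marks the densest stretch.
    bins = Counter(int(s // bin_size) for _, s in inside)
    densest = max(bins, key=bins.get)
    center = (densest + 0.5) * bin_size
    return [p for p, s in inside if abs(s - center) <= half_range]
```

Following the description, `radius` could be set to the maximum size of the object model to be matched or to half of it.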

空間モデル上の点群データにマッチングする物体モデルは、物体モデルデータ１１２のデータベースに格納されている全ての物体モデルを対象とすることもできるし、あるいは、実際の物体上に、良く知られたＱＲコード（登録商標）やＡＲマーカー、物体の名称を表す記号や文字列等を設置し、良く知られた画像認識技術や文字認識技術を用いてそれらを読み取り、対応する物体モデルを物体モデルデータ１１２から選択してマッチングするようにしても良い。 The object models to be matched to the point cloud data on the spatial model can be all object models stored in the database of object model data 112, or well-known QR codes (registered trademark), AR markers, symbols or character strings representing the names of objects, etc. can be placed on the actual objects, and these can be read using well-known image recognition technology or character recognition technology, and the corresponding object models can be selected from the object model data 112 for matching.

図１１の１１０１に、実際の物体上に設置されたＡＲマーカーの一例を示す。あるいは、空間モデル上の点群データと物体モデルの点群データから、良く知られた点群特徴量（三次元特徴量）を抽出し、空間モデル上の点群データと類似した点群特徴量を有する物体モデルを物体モデルデータ１１２から選択してマッチングするようにしても良い。 1101 in FIG. 11 shows an example of an AR marker placed on an actual object. Alternatively, well-known point cloud features (three-dimensional features) may be extracted from the point cloud data on the spatial model and the point cloud data of the object model, and an object model having point cloud features similar to the point cloud data on the spatial model may be selected from the object model data 112 for matching.

また、組み立て作業を行うような場合、特定の箇所に存在する物体の形状や大きさが変化していくが、このような場合、図１２の１２０１、１２０２および１２０３に示すように、作業の工程ごとに物体の状態を示す物体モデルとその順序関係を物体モデルデータ１１２に保存しておき、順に空間モデルとマッチングを行うようにすることで、作業の工程に応じた物体モデルを空間モデルにマッチングできるようになる。 In addition, when performing assembly work, the shape and size of an object present at a specific location change. In such a case, as shown in 1201, 1202, and 1203 in FIG. 12, the object model showing the state of the object for each work step and its sequential relationship are stored in the object model data 112, and matching with the spatial model is performed in sequence, making it possible to match the object model according to the work step with the spatial model.

具体的には、例えば、注視方向に存在する空間モデル上の点群データにいずれの物体モデルもマッチングされていない場合は、最初の工程に対応する物体モデルを選択してマッチングを行う。 Specifically, for example, if no object model is matched to the point cloud data on the spatial model that exists in the gaze direction, the object model corresponding to the first step is selected and matching is performed.

一方、注視方向に存在する空間モデル上の点群データに、いずれかの工程に対応する物体モデルがすでにマッチングされている場合は、次の工程に対応する物体モデルを選択し、空間モデル上の点群データとのマッチングを行い、既にマッチングされている物体モデルと空間モデルの一致度より高い一致度で次の工程に対応する物体モデルが空間モデルにマッチングされた場合、既にマッチングされている物体モデルを次の工程に対応する物体モデルで置き換えるようにすれば良い。 On the other hand, if an object model corresponding to one of the processes has already been matched to the point cloud data on the spatial model present in the gaze direction, the object model corresponding to the next process is selected and matched with the point cloud data on the spatial model. If the object model corresponding to the next process is matched to the spatial model with a higher degree of match than the already matched object model and spatial model, the already matched object model is replaced with the object model corresponding to the next process.

一致度としては、後述するステップ９０５での処理において使用するものと同様のものを用いることができる。さらに、次の工程に対応する物体モデルの空間モデルへのマッチングは、既にマッチングされている物体モデル上に注視位置が移動した場合や、空間モデル作成プログラム１０６において空間モデルの変化が検出された場合に行うようにすれば良い。 The degree of matching may be the same as that used in the processing in step 905 described below. Furthermore, matching of the object model corresponding to the next step to the spatial model may be performed when the gaze position moves onto an already matched object model or when a change in the spatial model is detected in the spatial model creation program 106.

空間モデルの変化の検出は、空間モデル作成プログラム１０６において、例えば、新たに取得された一人称映像の画像とその直前のあらかじめ定められた枚数の一人称映像の画像を用いて生成された空間モデルとそれ以前に生成されていた空間モデルとを比較し、両者の差分があらかじめ定められた閾値以上の大きさであれば、空間モデルが変化したと判定すればよい。 To detect a change in the spatial model, the spatial model creation program 106, for example, compares a spatial model generated using a newly acquired first-person video image and a predetermined number of first-person video images immediately preceding that image with the spatial model generated previously, and if the difference between the two is equal to or greater than a predetermined threshold, it is determined that the spatial model has changed.

空間モデルと空間モデルの差分は、例えば、一方の空間モデル中の各点に対して、他方の空間モデル中の最も近い点を検索し、検索された点との距離を算出し、一方の空間モデル中の全ての点についての距離の平均を求めることにより算出することができる。あるいは、求められた距離があらかじめ定められた閾値以上である点の数を用いても良い。あるいは、点群間の差分を算出できる方法であれば、どのような方法を用いても良い。 The difference between spatial models can be calculated, for example, by searching for the closest point in the other spatial model for each point in one spatial model, calculating the distance to the searched point, and averaging the distances for all points in one spatial model. Alternatively, the number of points whose calculated distance is equal to or greater than a predetermined threshold value may be used. Alternatively, any method may be used as long as it can calculate the difference between point groups.
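
The two difference measures described above may be sketched, for example, as follows. The nearest-point search here is a brute-force loop chosen for brevity; a practical implementation would likely use a spatial index, which the description does not discuss. The function names are hypothetical.

```python
import math

def nearest_distance(point, model):
    """Distance from a point to the closest point of a point cloud."""
    return min(math.dist(point, q) for q in model)

def mean_nearest_distance(model_a, model_b):
    """Average, over the points of model_a, of the distance to the nearest point of model_b."""
    return sum(nearest_distance(p, model_b) for p in model_a) / len(model_a)

def count_far_points(model_a, model_b, dist_threshold):
    """Number of points of model_a whose nearest point in model_b is at least dist_threshold away."""
    return sum(1 for p in model_a if nearest_distance(p, model_b) >= dist_threshold)

def model_changed(model_a, model_b, change_threshold):
    """Judge that the spatial model changed if the mean difference reaches the threshold."""
    return mean_nearest_distance(model_a, model_b) >= change_threshold
```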

ステップ９０５では、空間モデル上の注視方向に存在する空間モデル上の点群データに物体モデルが正しくマッチングされたかどうかを判定し、正しくマッチングされた場合は物体モデルのマッチング結果を物体配置データ１１４のデータベースに保存した後ステップ９０６に、そうでない場合はステップ９０８に進む。 In step 905, it is determined whether the object model has been correctly matched to the point cloud data on the spatial model that exists in the gaze direction on the spatial model. If the match is correct, the matching result of the object model is stored in the database of object placement data 114 and then the process proceeds to step 906; if not, the process proceeds to step 908.

空間モデル上の点群データに物体モデルが正しくマッチングされたかどうかを判定する方法としては、空間モデル上の点群データと、マッチング処理を行った結果に基づいて空間モデル上に配置された物体モデルとの一致度を求め、一致度があらかじめ定められた値より大きい場合、物体モデルが正しくマッチングされたと判定すれば良い。 A method for determining whether an object model has been correctly matched to point cloud data on a spatial model is to calculate the degree of match between the point cloud data on the spatial model and the object model placed on the spatial model based on the results of the matching process, and if the degree of match is greater than a predetermined value, it is determined that the object model has been correctly matched.

空間モデル上の点群データにマッチングされる物体モデルが複数存在する場合は、最も大きい一致度を選択し、選択した一致度があらかじめ定められた値より大きい場合、物体モデルが正しくマッチングされたと判定するとともに、選択された一致度に対応する物体モデルが空間モデル上の点群データにマッチングされたと判断すれば良い。 If there are multiple object models that can be matched to the point cloud data on the spatial model, the highest degree of matching is selected, and if the selected degree of matching is greater than a predetermined value, it is determined that the object models have been matched correctly, and that the object model corresponding to the selected degree of matching has been matched to the point cloud data on the spatial model.

一致度としては、例えば、マッチング結果を用いて空間モデル上に配置された物体モデル中の各点に対して、空間モデル上の点群データ中でもっとも距離が小さい点を探索し、求めた距離があらかじめ定められた閾値より小さい点の数の個数を求め、物体モデル中における点の数に対する求めた点の個数の割合を用いることができる。空間モデル上の点群データと物体モデルの一致度としては、上記の他、点群データ同士のマッチング結果の良否を判定できる指標であれば、どのような指標を用いても良い。 The degree of agreement can be determined, for example, by searching for the point in the point cloud data on the spatial model that is the shortest distance from each point in the object model placed on the spatial model using the matching results, determining the number of points whose distance is smaller than a predetermined threshold, and calculating the ratio of the number of points determined to the number of points in the object model. In addition to the above, any index can be used as the degree of agreement between the point cloud data on the spatial model and the object model, as long as it can determine whether the matching result between the point cloud data is good or bad.
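
The degree of match described above, and the selection of the best-matching object model in step 905, may be sketched, for example, as follows. The nearest-point search is brute force for brevity, and the function names and the dictionary layout of the candidates are hypothetical.

```python
import math

def match_degree(object_points, scene_points, dist_threshold):
    """Fraction of object-model points that have a scene point closer than dist_threshold."""
    hits = sum(1 for p in object_points
               if min(math.dist(p, q) for q in scene_points) < dist_threshold)
    return hits / len(object_points)

def best_match(degrees, min_degree):
    """degrees maps an object-model name to its match degree.

    Returns the name with the highest degree if that degree exceeds
    min_degree, otherwise None (no object model matched correctly).
    """
    if not degrees:
        return None
    name = max(degrees, key=degrees.get)
    return name if degrees[name] > min_degree else None
```

Here `object_points` are the object-model points after they have been placed on the spatial model using the matching result.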

ステップ９０６では、ステップ９０４でのマッチング結果を用いて空間モデル上に配置された物体モデルにおける注視位置を算出する。空間モデル上に配置された物体モデルと空間モデル上の注視方向ベクトルは、図１０における１００５となるため、上述したステップ９０３における処理、すなわち、空間モデル上の注視方向に空間モデル上に配置された物体モデルが存在するかどうかの判定処理と同様の処理を用いることができる。 In step 906, the gaze position on the object model placed on the spatial model using the matching result of step 904 is calculated. Because the object model placed on the spatial model and the gaze direction vector on the spatial model are related as shown by vector 1005 in FIG. 10, a process similar to the process in step 903 described above, that is, the process of determining whether an object model placed on the spatial model exists in the gaze direction on the spatial model, can be used.

ただしステップ９０３では、物体モデルが点群データの場合、注視方向ベクトル１００５との距離があらかじめ定められた閾値以下の点が存在するかどうかに着目していたが、注視位置の算出では、注視方向ベクトルとの距離があらかじめ定められた閾値以下であり、かつ、撮影位置１００２に最も近い点を選択し、それを注視位置とすることが異なる。 However, when the object model is point cloud data, step 903 only checks whether there is a point whose distance from the gaze direction vector 1005 is equal to or less than a predetermined threshold. The calculation of the gaze position differs in that, among the points whose distance from the gaze direction vector is equal to or less than the predetermined threshold, the point closest to the shooting position 1002 is selected and used as the gaze position.

あるいは、注視方向ベクトル１００５との距離があらかじめ定められた閾値以下であり、撮影位置１００２に最も近い点からあらかじめ定められた範囲内に複数の点が存在する場合は、撮影位置１００２に最も近い点と、そこからあらかじめ定められた範囲内に点の平均を注視位置としても良い。 Alternatively, if multiple points whose distance from the gaze direction vector 1005 is equal to or less than the predetermined threshold lie within a predetermined range of the point closest to the shooting position 1002, the average of that closest point and the points within the predetermined range of it may be used as the gaze position.
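
The calculation of the gaze position on a point cloud object model may be sketched, for example, as follows. Passing `average_radius` selects the averaging variant; omitting it returns the single closest point. Ignoring points behind the shooting position is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
import math

def gaze_position_on_points(points, origin, direction, max_dist, average_radius=None):
    """Gaze position on a point cloud, or None if no point is near the gaze ray.

    origin / direction : shooting position and unit gaze direction vector
    max_dist           : threshold on the distance from the gaze direction vector
    average_radius     : if given, average the points within this range of the
                         point closest to the shooting position
    """
    near = []
    for p in points:
        v = tuple(a - b for a, b in zip(p, origin))
        s = sum(a * b for a, b in zip(v, direction))
        if s < 0.0:                     # behind the shooting position
            continue
        foot = tuple(o + s * d for o, d in zip(origin, direction))
        if math.dist(p, foot) <= max_dist:
            near.append(p)
    if not near:
        return None
    first = min(near, key=lambda p: math.dist(p, origin))
    if average_radius is None:
        return first
    group = [p for p in near if math.dist(p, first) <= average_radius]
    return tuple(sum(c) / len(group) for c in zip(*group))
```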

あるいは、物体モデルが多角形の集まりとして表される場合は、注視方向ベクトルと物体モデル中の各多角形との交点を求め、撮影位置１００２に最も近い交点を注視位置として求めるようにすれば良い。 Alternatively, if the object model is represented as a collection of polygons, the intersections between the gaze direction vector and each polygon in the object model can be found, and the intersection closest to the shooting position 1002 can be found as the gaze position.
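
For an object model represented by polygons, the intersection test may be sketched, for example, as follows. The description does not name an intersection algorithm; this sketch assumes the polygons have been split into triangles and uses the well-known Möller-Trumbore ray-triangle test. The function names are hypothetical.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri, eps=1e-9):
    """Distance along the ray to the triangle, or None if there is no hit."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(direction, e2)
    a = dot(e1, h)
    if abs(a) < eps:                    # ray is parallel to the triangle plane
        return None
    f = 1.0 / a
    s = sub(origin, v0)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = f * dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * dot(e2, q)
    return t if t > eps else None       # keep only hits in front of the origin

def gaze_position_on_mesh(origin, direction, triangles):
    """Intersection closest to the shooting position, or None if nothing is hit."""
    hits = [t for t in (ray_triangle(origin, direction, tri) for tri in triangles)
            if t is not None]
    if not hits:
        return None
    t = min(hits)
    return tuple(o + t * d for o, d in zip(origin, direction))
```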

以上によって求められた注視位置は空間モデル上での注視位置であるため、ステップ９０６では、求められた注視位置をステップ９０４のマッチング結果により得られた空間モデル上における物体モデルの位置姿勢を用いて座標変換を行う。これにより、空間モデル１００１上での注視位置１００４を物体モデル１００６上、すなわち物体モデルの座標系における注視位置１００７に変換する。 Since the gaze position obtained as described above is the gaze position on the spatial model, in step 906, the determined gaze position is subjected to coordinate conversion using the position and orientation of the object model on the spatial model obtained by the matching result in step 904. As a result, the gaze position 1004 on the spatial model 1001 is converted to the gaze position 1007 on the object model 1006, i.e., in the coordinate system of the object model.
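
The coordinate conversion from the spatial model to the coordinate system of the object model may be sketched, for example, as follows. The sketch assumes that the matching result is expressed as a rotation matrix R and a translation t that place the object model in the spatial model (p_model = R p_object + t), so that the conversion is p_object = Rᵀ (p_model − t). This representation and the function name are assumptions.

```python
def to_object_frame(p_model, obj_rot, obj_pos):
    """Convert a point from spatial-model coordinates to object-model coordinates.

    obj_rot : 3x3 rotation matrix (list of rows) of the object model in the spatial model
    obj_pos : position of the object model's origin in the spatial model
    """
    d = tuple(a - b for a, b in zip(p_model, obj_pos))
    # Multiply by the transpose of the rotation, which is its inverse.
    return tuple(sum(obj_rot[j][i] * d[j] for j in range(3)) for i in range(3))
```

Applied to the gaze position 1004 on the spatial model, this yields the gaze position 1007 in the object model's own coordinate system.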

また、ステップ９０５において空間モデル上に物体モデルが配置された場合、ステップ９０６において、過去の注視位置データに対して、配置された物体モデル上における注視位置の計算を行うようにしても良い。これは、注視位置算出プログラム１０８の処理の初期段階等では、空間モデル上の各物体に対する点群データが少なく、空間モデル上への物体モデルの配置が行えない場合が想定されるためである。 In addition, if an object model is placed on the spatial model in step 905, the gaze position on the placed object model may also be calculated in step 906 for past gaze position data. This is because, in the initial stages of processing of the gaze position calculation program 108, there may be too little point cloud data for each object on the spatial model to place the object model on the spatial model.

ステップ９０７では、ステップ９０６で求めた物体モデル上の注視位置を注視位置データ１１１に保存する。また、物体モデルの注視位置を保存する際、空間モデル上における一人称映像の撮影位置姿勢や、空間モデル中における注視位置、空間モデル上における物体モデルの位置姿勢等を合わせて保存するようにしても良い。 In step 907, the gaze position on the object model determined in step 906 is stored in gaze position data 111. When storing the gaze position of the object model, the shooting position and orientation of the first-person video on the spatial model, the gaze position in the spatial model, the position and orientation of the object model on the spatial model, etc. may also be stored together.

ステップ９０８では、入力装置１０３等から終了の指示がある場合は処理を終了し、そうでなければステップ９０１に戻る。 In step 908, if an instruction to end the process is received from the input device 103 or the like, the process ends; if not, the process returns to step 901.
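
The branching among steps 901 to 908 described above may be summarized, for example, as the following transition function. The transition from step 907 to step 908 is not stated explicitly in the text and is assumed here from the description of step 908; the function and parameter names are hypothetical.

```python
def next_step(step, object_present=False, match_ok=False, stop_requested=False):
    """Return the step that follows `step` in the flow of FIG. 9, or None at the end.

    object_present : result of the determination in step 903
    match_ok       : result of the determination in step 905
    stop_requested : whether an end instruction was received (step 908)
    """
    if step == 901:
        return 902
    if step == 902:
        return 903
    if step == 903:
        return 906 if object_present else 904
    if step == 904:
        return 905
    if step == 905:
        return 906 if match_ok else 908
    if step == 906:
        return 907
    if step == 907:
        return 908                      # assumed; see the note above
    if step == 908:
        return None if stop_requested else 901
    raise ValueError(f"unknown step: {step}")
```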

上述した注視位置算出プログラム１０８では、一人称映像データ１１０および注視位置データ１１１をリアルタイムに取得し、空間モデルデータ１１３を生成しながら物体モデルデータ１１２の空間モデル上への配置を行い、物体モデル上への注視位置の計算を行っていたが、対象となる一人称映像データ１１０および注視位置データ１１１の取得が一通り完了した後、一人称映像データ１１０および注視位置データ１１１に保存されているデータを順番に読み込みながら注視位置算出プログラム１０８の処理を実行するようにしても良い。 In the gaze position calculation program 108 described above, the first-person video data 110 and gaze position data 111 are acquired in real time, and the object model data 112 is placed on the spatial model while generating the spatial model data 113, and the gaze position on the object model is calculated. However, after the acquisition of the target first-person video data 110 and gaze position data 111 has been completed, the processing of the gaze position calculation program 108 may be executed while sequentially reading the data stored in the first-person video data 110 and gaze position data 111.

その際、空間モデルの作成、あるいは、空間モデルの作成および空間モデル上への物体モデルの配置を行った後、物体モデル上への注視位置の計算を行うようにしても良い。またその際、空間モデル上に物体モデルが配置されていない箇所に対して、手動で物体モデルを配置した後、上述した処理を行うようにしても良い。 In this case, the gaze position on the object model may be calculated after creating a spatial model, or after creating a spatial model and placing an object model on the spatial model. In addition, in a location on the spatial model where no object model is placed, the object model may be manually placed, and then the above-mentioned processing may be performed.

あるいは、注視位置算出プログラム１０８の処理を実行した後、注視位置算出プログラム１０８を実行した後に生成される空間モデルデータ１１３および物体配置データ１１４を用いて、再度、注視位置算出プログラム１０８を実行するようにしても良い。これは、注視位置算出プログラム１０８の処理の初期段階等では、空間モデル上の各物体に対する点群データが少なく、空間モデル上への物体モデルの配置が行えない場合が想定される。 Alternatively, after the gaze position calculation program 108 has been executed, it may be executed again using the spatial model data 113 and object placement data 114 generated by that execution. This re-execution addresses the case where, in the early stages of processing of the gaze position calculation program 108, there is too little point cloud data for each object on the spatial model to place the object model on the spatial model.

このような場合、注視位置データ１１１に格納された注視位置の内、物体モデル上の注視位置が計算されないままになる注視位置が生じる可能性があるためである。この場合、注視位置算出プログラム１０８は以下の変更を行った上で実行される。 In such a case, some of the gaze positions stored in the gaze position data 111 may be left without a gaze position on the object model having been calculated. For the re-execution, the gaze position calculation program 108 is run with the following changes.

一つ目の変更点としては、ステップ９０１における注視位置データの取得は注視位置データ１１１に保存されているデータから保存された順序でデータを読み込むことによって行う。さらに、取得された注視位置データに対して物体モデル上への注視位置の計算が行われているかどうかを判定し、行われていればステップ９０８に進む処理を追加する。 The first change is that the gaze position data is acquired in step 901 by reading the data from gaze position data 111 in the order in which it was saved. In addition, a process is added to determine whether the gaze position on the object model has been calculated for the acquired gaze position data, and if so, to proceed to step 908.

二つ目の変更点としては、ステップ９０３において、空間モデル上の注視方向に物体モデルが配置されていない場合、ステップ９０４ではなくステップ９０８に進むようにすることである。 The second change is that in step 903, if an object model is not located in the gaze direction on the spatial model, the process proceeds to step 908 instead of step 904.

三つ目の変更点は、ステップ９０４およびステップ９０５を削除することである。注視位置算出プログラム１０８に対して、以上の三点の変更を行うことにより、物体モデル上への注視位置の計算が行われていない注視位置に対して、物体モデル上への注視位置の計算を行うことができる。またその際、空間モデル上に物体モデルが配置されていない箇所に対して、手動で物体モデルを配置した後、上述した処理を行うようにしても良い。 The third change is to delete steps 904 and 905. By making the above three changes to the gaze position calculation program 108, it is possible to calculate the gaze position on the object model for gaze positions where the gaze position on the object model has not been calculated. In this case, the above-mentioned processing may be performed after manually placing an object model in a location where no object model is placed on the spatial model.

注視位置表示プログラム１０９は、利用者からの指示に応じて、注視位置データの表示を行うプログラムである。注視位置は、視線分析で良く知られた方法、例えば、注視位置の頻度分布であるヒートマップによる方法や、注視位置が一定範囲内にとどまっていた時間の長さを円の大きさ等で視覚的に表示する方法等により表示を行う。 The gaze position display program 109 is a program that displays gaze position data in response to instructions from the user. The gaze position is displayed using methods well known in gaze analysis, such as a heat map method that shows the frequency distribution of gaze positions, or a method that visually shows the length of time that the gaze position remained within a certain range using the size of a circle, etc.

また、一人称映像上の注視位置を表示する他、物体モデル上における注視位置の表示を行う。物体モデル上における注視位置の表示方法としては、空間モデルに対応する仮想空間である仮想空間モデルを用意し、仮想空間モデル中に物体配置データ１１４中に保存されているデータに基づいて物体モデルを配置し、仮想空間モデル中に配置された物体モデル上に注視位置を表示する方法を用いることができる。あるいは、特定の物体モデルのみを選択し、選択された物体モデル上にのみ、注視位置を表示する方法を用いることもできる。 In addition to displaying the gaze position on the first-person image, the gaze position on the object model is also displayed. A method for displaying the gaze position on the object model can be to prepare a virtual space model, which is a virtual space corresponding to the spatial model, place an object model in the virtual space model based on the data stored in the object placement data 114, and display the gaze position on the object model placed in the virtual space model. Alternatively, a method can be used in which only specific object models are selected and the gaze position is displayed only on the selected object models.

さらに、物体モデル上に注視位置を表示する場合、注視位置が常に利用者に正対するように仮想空間モデルあるいは物体モデルを調整して表示するようにしても良い。 Furthermore, when displaying the gaze position on an object model, the virtual space model or object model may be adjusted and displayed so that the gaze position always faces the user directly.

図１３の１３０１は物体モデルの例を、１３０２、１３０３、１３０４および１３０５は物体モデル上の注視位置の例を示す。また、図１４に、図１３で示す注視位置を常に利用者に正対するように表示した場合の一例を示す。 In FIG. 13, 1301 shows an example of an object model, and 1302, 1303, 1304, and 1305 show examples of gaze positions on the object model. Also, FIG. 14 shows an example of the gaze position shown in FIG. 13 always displayed facing the user.

図１４において、１４０１は注視点１３０２を利用者に正対するように表示した物体モデル、１４０２は１４０１上に表示された注視点である。１４０３は注視点１３０３を利用者に正対するように表示した物体モデル、１４０４は１４０３上に表示された注視点である。１４０５は注視点１３０４を利用者に正対するように表示した物体モデル、１４０６は１４０５上に表示された注視点である。１４０７は注視点１３０５を利用者に正対するように表示した物体モデル、１４０８は１４０７上に表示された注視点である。 In FIG. 14, 1401 is an object model in which the gaze point 1302 is displayed facing the user, and 1402 is the gaze point displayed on 1401. 1403 is an object model in which the gaze point 1303 is displayed facing the user, and 1404 is the gaze point displayed on 1403. 1405 is an object model in which the gaze point 1304 is displayed facing the user, and 1406 is the gaze point displayed on 1405. 1407 is an object model in which the gaze point 1305 is displayed facing the user, and 1408 is the gaze point displayed on 1407.

図１４に示す表示方法では、各注視位置が完全に利用者に正対するように物体モデルを調整することにより、注視位置の表示を行っている。これを行うためには、例えば、物体モデルが点群データである場合は、注視位置を含むあらかじめ定められた範囲にある点を選択し、選択された点から注視位置を含む平面の法線の向きを求め、求めた法線が利用者側に向くように物体モデルの位置姿勢を調整するようにすればよい。物体モデルが多角形の集まりで表現されている場合は、注視位置を含む多角形を物体モデルから選択し、その法線が利用者側に向くように物体モデルの位置姿勢を調整するようにすればよい。 In the display method shown in FIG. 14, the gaze positions are displayed by adjusting the object model so that each gaze position squarely faces the user. To do this, for example, if the object model is point cloud data, the points within a predetermined range including the gaze position are selected, the direction of the normal of the plane containing the gaze position is determined from the selected points, and the position and orientation of the object model are adjusted so that this normal points toward the user. If the object model is represented as a collection of polygons, the polygon containing the gaze position is selected from the object model, and the position and orientation of the object model are adjusted so that its normal points toward the user.

また、注視位置を表示する際に、表示する物体モデルの位置姿勢の調整方法に制限を設けるようにしても良い。さらに、注視位置が利用者に正対する程度を調整することにより、注視位置を表示する際に物体モデルの姿勢が大きく変化することを抑えるようにしても良い。 In addition, when displaying the gaze position, restrictions may be placed on the method of adjusting the position and orientation of the displayed object model. Furthermore, by adjusting the degree to which the gaze position faces the user, large changes in the orientation of the object model when displaying the gaze position may be suppressed.

例えば、最初の注視位置は利用者に正対するように表示し、それ以降の注視位置は、利用者に正対する程度があらかじめ定められた範囲を超えないように、物体モデルの位置姿勢を調整する。注視位置が利用者に正対する程度としては、注視位置を含む平面の法線方向と利用者に正対する方向、すなわち画面に垂直な方向との間の角度や内積等を用いることができる。 For example, the first gaze position is displayed so that it faces the user directly, and the position and orientation of the object model are adjusted for subsequent gaze positions so that the degree to which the gaze position faces the user does not exceed a predetermined range. The degree to which the gaze position faces the user can be determined by the angle or dot product between the normal direction of the plane including the gaze position and the direction facing the user, i.e., the direction perpendicular to the screen.
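
The degree to which a gaze position faces the user may be computed, for example, as follows. The sketch uses the angle between the surface normal at the gaze position and the direction perpendicular to the screen; taking that direction as the +z axis pointing from the screen toward the user is an assumption, as are the function names.

```python
import math

def facing_angle_deg(normal, view_dir=(0.0, 0.0, 1.0)):
    """Angle in degrees between a surface normal and the direction facing the user.

    0 means the gaze position faces the user squarely; larger values mean
    it is turned further away.
    """
    dot = sum(a * b for a, b in zip(normal, view_dir))
    norm = math.sqrt(sum(a * a for a in normal)) * math.sqrt(sum(b * b for b in view_dir))
    cos_angle = max(-1.0, min(1.0, dot / norm))   # guard against rounding error
    return math.degrees(math.acos(cos_angle))

def within_facing_range(normal, max_angle_deg, view_dir=(0.0, 0.0, 1.0)):
    """True if the facing angle stays within the predetermined range."""
    return facing_angle_deg(normal, view_dir) <= max_angle_deg
```

The dot product mentioned in the text is the quantity `dot / norm` above; either it or the angle derived from it can serve as the measure.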

図１５に、上述の方法で注視位置を表示した場合の表示例を示す。 FIG. 15 shows an example of a display in which the gaze position is displayed using the above-mentioned method.
図１５は、物体モデルを表示する際に、物体モデルの上下方向の座標軸を利用者側に傾けた状態で、注視位置を表示する際の位置姿勢の調整方法として、物体モデルの上下方向の座標軸周りの回転のみを許容して表示した場合を示している。物体モデル１５０１は最初の注視位置１５０２を利用者に正対するように位置姿勢を調整している。 FIG. 15 shows a case where an object model is displayed with its vertical coordinate axis tilted toward the user, and only rotation around the vertical coordinate axis of the object model is permitted as a method of adjusting the position and orientation when displaying the gaze position. The position and orientation of the object model 1501 are adjusted so that the initial gaze position 1502 faces the user directly.

ただし図１５では、物体モデルの上下方向の座標軸が傾いており、注視位置を利用者に完全に正対させることはできないため、できるだけ正対するように位置姿勢の調整を行う。具体的には、上述と同様に注視位置を含む平面の法線を求め、求めた法線と利用者に正対する方向、すなわち画面に垂直な方向との差が最も小さくなるように、物体モデルの位置姿勢を調整する。物体モデル１５０３では、注視位置１５０４が利用者に正対する程度が注視位置１５０２より低い状態、すなわち注視位置の確認は可能であるが、利用者には正対していない状態で表示されている。物体モデル１５０５上の注視位置１５０６および物体モデル１５０７上の注視位置１５０８も注視位置１５０４と同じ方法で表示された場合を示している。 However, in FIG. 15, the coordinate axis of the object model in the vertical direction is tilted, and the gaze position cannot be perfectly faced to the user, so the position and orientation are adjusted so that it faces as directly as possible. Specifically, the normal of the plane including the gaze position is found as described above, and the position and orientation of the object model is adjusted so that the difference between the found normal and the direction facing the user, i.e., the direction perpendicular to the screen, is minimized. In object model 1503, gaze position 1504 is displayed in a state where it faces the user less directly than gaze position 1502, i.e., the gaze position can be confirmed, but is not facing the user. Gaze position 1506 on object model 1505 and gaze position 1508 on object model 1507 are also displayed in the same way as gaze position 1504.

As described above, the gaze position analysis system according to the embodiment of the present invention generates, from a first-person video that approximates the user's field of view, a spatial model, which is a three-dimensional model of the space over which the user has directed his or her gaze, and calculates the shooting position and orientation of each first-person image in the spatial model. It calculates the gaze direction in the spatial model from the shooting position and orientation of the first-person video in the spatial model and the gaze position on the first-person video. It places an object model, which is a three-dimensional model of an object, in the spatial model according to the position and orientation of the object in the spatial model estimated by matching the object model with the spatial model. Finally, it calculates the gaze position on the object model by finding the intersection between the gaze direction in the spatial model and the object model.
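The last two steps of this summary, computing a gaze direction from the shooting position and orientation and intersecting it with the placed object model, can be sketched as follows. The sketch makes several assumptions that are not stated in the description: a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`; a camera-to-world rotation given as a 3x3 row-major matrix; and an object model represented as a triangle mesh whose vertices are already expressed in spatial-model coordinates. The intersection test uses the Möller-Trumbore algorithm, chosen here for brevity.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def gaze_ray(cam_pos, cam_rot, gaze_px, fx, fy, cx, cy):
    """Return (origin, unit direction) of the gaze ray in spatial-model coordinates.

    Assumptions: pinhole camera; cam_rot maps camera coordinates to model coordinates.
    """
    d_cam = ((gaze_px[0] - cx) / fx, (gaze_px[1] - cy) / fy, 1.0)
    d = tuple(dot(cam_rot[i], d_cam) for i in range(3))
    length = math.sqrt(dot(d, d))
    return cam_pos, tuple(c / length for c in d)


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Distance along the ray to the triangle, or None (Moller-Trumbore)."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


def gaze_position_on_model(origin, direction, vertices, triangles):
    """Nearest intersection of the gaze ray with the mesh, or None if it misses."""
    best = None
    for i, j, k in triangles:
        t = ray_triangle(origin, direction, vertices[i], vertices[j], vertices[k])
        if t is not None and (best is None or t < best):
            best = t
    if best is None:
        return None
    return tuple(o + best * d for o, d in zip(origin, direction))
```

Taking the nearest hit reflects the fact that the user sees only the first surface along the line of sight. A practical system would likely use a spatial index or a mesh library rather than a linear scan over triangles.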

According to the embodiment of the present invention, for a gaze measurement device worn by the user, the user's gaze positions can be automatically associated with three-dimensional models of the various objects present in the space, using gaze information measured while the user moves freely through the space being measured.

101 Gaze measurement device
102 Information processing device
103 Input device
104 Output device
105 Storage device
106 Spatial model creation program
107 Object position and orientation estimation program
108 Gaze position calculation program
109 Gaze position display program
110 First-person video data
111 Gaze position data
112 Object model data
113 Spatial model data
114 Object arrangement data

Claims

A gaze position analysis system having a spatial model creation unit, an object position and orientation estimation unit, and a gaze position calculation unit, the system automatically associating an object model gaze position, which is a gaze position of a user on an object model, with the object model, which is a three-dimensional model of an object existing in a space, wherein
the spatial model creation unit:
acquires a first-person video that is a video similar to the user's visual field;
acquires a first-person video gaze position, which is a gaze position of the user on the first-person video;
creates, from the first-person video, a spatial model, which is a three-dimensional model of the space in the range in which the user is looking; and
calculates a shooting position and orientation of the first-person video in the spatial model;
the object position and orientation estimation unit:
estimates a position and orientation of the object in the spatial model by matching the object model with the spatial model;
places the object model in the spatial model using the position and orientation of the object in the spatial model; and
calculates a gaze direction in the spatial model using the shooting position and orientation of the first-person video in the spatial model and the first-person video gaze position; and
the gaze position calculation unit:
calculates the object model gaze position by determining an intersection between the gaze direction and the object model in the spatial model.

The gaze position analysis system according to claim 1, wherein, if data to which no object model has been matched exists in the spatial model in the gaze direction, the object position and orientation estimation unit estimates the position and orientation of the object in the spatial model by matching the object model with that unmatched data.

The gaze position analysis system according to claim 1, wherein the object position and orientation estimation unit:
selects, from the data in the spatial model, data that exists within a predetermined range based on the gaze direction; and
estimates the position and orientation of the object in the spatial model by matching the object model with the selected data in the spatial model.

The gaze position analysis system according to claim 3, wherein the object position and orientation estimation unit:
reads a preset image pattern, character string, or symbol string from the first-person video;
selects the object model corresponding to the image pattern, the character string, or the symbol string;
calculates a position of the image pattern, the character string, or the symbol string in the spatial model from the spatial model, the shooting position and orientation of the first-person video in the spatial model, and a position of the image pattern, the character string, or the symbol string in the first-person video;
selects data in the spatial model that exists within the predetermined range based on the position in the spatial model of the image pattern, the character string, or the symbol string; and
estimates the position and orientation of the object in the spatial model by matching the selected object model with the selected data in the spatial model.

The gaze position analysis system according to claim 1, wherein the object position and orientation estimation unit:
calculates a difference between data representing a spatial model created from the first-person video at a certain time and data in the range of the already created spatial model that corresponds to the first-person video at the same time;
if the calculated difference is greater than a predetermined threshold, replaces the data in the corresponding range of the spatial model with the data representing the spatial model created from the first-person video;
matches the object model with the replaced data; and
replaces the object model that was matched to the data before the replacement with the newly matched object model.

The gaze position analysis system according to claim 1, further comprising an object model storage unit that stores, for an object whose shape or structure changes, the object model corresponding to each stage of the change and the order relationship of the change, wherein
for a location where the object model corresponding to the changing object is matched, each time a change in the spatial model is detected, the object position and orientation estimation unit selects from the object model storage unit the object model corresponding to the stage of the change of the object and matches the selected object model with the spatial model.

The gaze position analysis system according to claim 1, wherein, when any of the object models is newly matched to any location on the spatial model, the gaze position calculation unit determines, for the previous object model gaze positions, whether each object model gaze position is on the object model newly matched on the spatial model.

The gaze position analysis system according to claim 1, wherein
the object position and orientation estimation unit matches, through manual operation by the user, the object model to a portion of the spatial model to which no object model has been matched, and
the gaze position calculation unit determines, for all of the object model gaze positions already acquired, whether the object model gaze position is on the object model that the user manually matched on the spatial model.

The gaze position analysis system according to claim 1, wherein the gaze position calculation unit stores at least one of the name of the object model, the shooting position and orientation of the first-person video in the spatial model, the first-person video gaze position in the spatial model, and the position and orientation of the object model in the spatial model, in association with the object model gaze position on the object model.

The gaze position analysis system according to claim 1, further comprising a gaze position display unit that displays the object model gaze position on the object model, wherein
when displaying the object model gaze position on the object model, the gaze position display unit adjusts the position and orientation of the object model so that the object model gaze position at each time is always displayed directly facing the user.

The gaze position analysis system according to claim 1, further comprising a gaze measurement device that captures the first-person video of the user and measures the first-person video gaze position while the user is able to move freely within the space, wherein
the spatial model creation unit acquires the first-person video and the first-person video gaze position from the gaze measurement device.

The gaze position analysis system according to claim 11, wherein
the gaze measurement device includes an imaging device, and
the spatial model creation unit creates the spatial model of the surroundings from a plurality of images captured by the imaging device.

A gaze position analysis system that automatically associates a gaze position of a user with an object model, which is a three-dimensional model of an object existing in a space, the system comprising:
a spatial model creation unit that creates a spatial model, which is a three-dimensional model of the surrounding space, from a plurality of captured images;
an object position and orientation estimation unit that matches the spatial model with the object model and places the object model on the spatial model based on a position and orientation obtained by the matching; and
a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and a gaze direction in the spatial model.

The gaze position analysis system according to claim 13, wherein the gaze position calculation unit calculates the gaze position on the object model by determining an intersection between the gaze direction in the spatial model and the object model.

A gaze position analysis method for automatically associating an object model gaze position, which is a gaze position of a user on an object model, with the object model, which is a three-dimensional model of an object existing in a space, the method comprising the steps of:
acquiring a first-person video similar to the user's field of view;
acquiring a first-person video gaze position, which is a gaze position of the user on the first-person video;
creating, from the first-person video, a spatial model, which is a three-dimensional model of the space in the range in which the user is looking;
calculating a shooting position and orientation of the first-person video in the spatial model;
estimating a position and orientation of the object in the spatial model by matching the object model with the spatial model;
placing the object model in the spatial model using the position and orientation of the object in the spatial model;
calculating a gaze direction in the spatial model using the shooting position and orientation of the first-person video in the spatial model and the first-person video gaze position; and
calculating the object model gaze position by determining an intersection between the gaze direction and the object model in the spatial model.