JP6311461B2

JP6311461B2 - Gaze analysis system and gaze analysis apparatus

Info

Publication number: JP6311461B2
Application number: JP2014116404A
Authority: JP
Inventors: 智子小堀; 小川　隆; 隆小川; 隼沖
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2014-06-05
Filing date: 2014-06-05
Publication date: 2018-04-18
Anticipated expiration: 2034-06-05
Also published as: JP2015228992A

Description

The present invention relates to a technique for estimating and analyzing a person's line of sight.

Conventionally, it has been known to estimate and track the line of sight of a person looking at printed material under evaluation and to measure the gaze directed at the products it features, thereby analyzing the degree of gaze and interest and evaluating consumers' purchasing trends. To provide consumers with appropriate purchasing information, it is important to know how they view the information printed in the material under evaluation.

To analyze and evaluate the degree of attention paid to the information in such printed material, the applicant has proposed a technique that can estimate and track a subject's line of sight easily and with high accuracy, without causing unnecessary eye movement in the subject (see Patent Document 1). With this technique, a gaze point can be converted onto a reference image, which is an image of the printed material or the like captured from the front, and a heat map can then be created on the reference image. A heat map is a type of graph that expresses the magnitude of numerical values, obtained for example by clustering, as color shading or the like. The heat map makes it possible to confirm which parts of the reference image the subject was looking at.
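As a rough illustration of the heat map idea described above, the following sketch counts gaze points per grid cell of a reference image; the counts could then be rendered as color shading. The function name, the grid cell size, and the plain per-cell counting are assumptions of this sketch. Patent Document 1 may build its heat map differently (for example, from clustering results).

```python
def gaze_heatmap(points, width, height, cell):
    """Count gaze points per cell of a width x height reference image.

    points: iterable of (x, y) coordinates on the reference image.
    cell:   side length of one square grid cell, in pixels (assumed value).
    Returns a list of rows, each a list of per-cell counts.
    """
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore points that fall outside the reference image.
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // cell][int(x) // cell] += 1
    return grid
```

Larger counts would be drawn in a stronger color, so the cells viewed most often stand out.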

Patent Document 1: JP 2013-81762 A
Patent Document 2: JP 2007-319187 A

Non-Patent Document 1: David G. Lowe, "Distinctive image features from scale-invariant keypoints", Int. Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004.
Non-Patent Document 2: H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded Up Robust Features", In ECCV, pp. 404-417, 2006.
Non-Patent Document 3: Koichi Hashimoto, "Visual Servo-II: Basics of Computer Vision", Transactions of the Virtual Reality Society of Japan, Vol. 4, No. 4, 1999.
Non-Patent Document 4: Tobii Technology Japan, Ltd., Tobii Glasses Eye Tracker, [online], [retrieved September 8, 2011], Internet <URL: http://www.tobii.com/ja-JP/eye-tracking-research/japan/products/hardware/tobii-glasses-eye-tracker/>
Non-Patent Document 5: OpenCV, SIFT/SURF, [online], [retrieved September 8, 2011], Internet <URL: http://opencv.jp/opencv-2.2/cpp/features2d_feature_detection_and_description.html>
Non-Patent Document 6: V. Lepetit and P. Fua, "Keypoint Recognition using Randomized Trees", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 9, pp. 1465-1479, 2006.
Non-Patent Document 7: M. Ozuysal, P. Fua, V. Lepetit, "Fast Keypoint Recognition in Ten Lines of Code", Proc. of Int. Conf. on Computer Vision and Pattern Recognition (2007).

A heat map created with the technique of Patent Document 1 makes it possible to confirm which parts of the reference image were viewed and to what extent. However, when the object is three-dimensional, has many curved surfaces, or has many areas with few features, a reference image captured from a single angle does not correctly reflect the features of the object. In addition, when the object is imaged in a dim place, the scene image captured by the scene camera used for gaze tracking is darker than what the eye actually sees, which makes it difficult to recognize and match the object.

Accordingly, an object of the present invention is to provide a line-of-sight analysis system and a line-of-sight analysis apparatus that can identify the location on an object to which a viewer paid attention, even when the object is in an environment unsuited to imaging.

To solve the above problem, the present invention provides a line-of-sight analysis system comprising a line-of-sight tracking apparatus, which includes a camera that images an object and a distance sensor that measures the distance to the object and which acquires a gaze point of a subject, and a line-of-sight analysis apparatus capable of data communication with the line-of-sight tracking apparatus. The line-of-sight analysis apparatus comprises: scene image acquisition means for acquiring a scene image captured by the camera and a gaze point on the scene image; scene distance image acquisition means for acquiring a scene distance image based on the distance measured by the distance sensor; scene image feature point extraction means for extracting a plurality of scene image feature points from the scene image; scene distance image feature point extraction means for extracting a plurality of scene distance image feature points from the scene distance image; scene image association means for associating reference image feature points in a reference image prepared in advance with the scene image feature points; scene distance image association means for associating reference distance image feature points in a reference distance image prepared in advance with the scene distance image feature points; transformation matrix creation means that takes, as a selected image, whichever of the scene image and the scene distance image has more associations and creates, for each selected image, a coordinate transformation matrix that converts coordinates between the selected image and the corresponding reference image or reference distance image, based on the correspondence between their feature points; and corresponding point calculation means that, for each of a plurality of the scene images, uses the created coordinate transformation matrix to convert the coordinates of the gaze point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point.

The line-of-sight analysis system according to the present invention uses a line-of-sight tracking apparatus that includes a camera for imaging an object and a distance sensor for measuring the distance to the object, and that acquires the gaze point of a subject. The system acquires a scene image captured by the camera and a gaze point on the scene image; acquires a scene distance image based on the distance measured by the distance sensor; extracts a plurality of scene image feature points from the scene image and a plurality of scene distance image feature points from the scene distance image; associates the scene image feature points with the reference image feature points and the scene distance image feature points with the reference distance image feature points; takes, as a selected image, whichever of the scene image and the scene distance image has more associations; creates, for each selected image, a coordinate transformation matrix that converts coordinates between the selected image and the corresponding reference image or reference distance image, based on the correspondence between their feature points; and, for each of the plurality of scene images, uses the created matrix to convert the coordinates of the gaze point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point. As a result, the location on the object to which the viewer paid attention can be identified even when the object is in an environment unsuited to imaging.

Further, in the present invention, the line-of-sight tracking apparatus of the line-of-sight analysis system performs imaging by the camera and measurement by the distance sensor in synchronization. Because imaging and distance measurement are synchronized, the gaze point acquired together with the scene image can also be used as the gaze point corresponding to the scene distance image.

The present invention also provides a line-of-sight analysis apparatus connected for data communication to a line-of-sight tracking apparatus that includes a camera for imaging an object and a distance sensor for measuring the distance to the object, and that acquires a scene image captured by the camera, a scene distance image based on the distance measured by the distance sensor, and a gaze point of a subject on the scene image. The line-of-sight analysis apparatus comprises: scene image acquisition means for acquiring the scene image captured by the camera and the gaze point on the scene image; scene distance image acquisition means for acquiring the scene distance image based on the distance measured by the distance sensor; scene image feature point extraction means for extracting a plurality of scene image feature points from the scene image; scene distance image feature point extraction means for extracting a plurality of scene distance image feature points from the scene distance image; reference image feature point acquisition means for acquiring a plurality of reference image feature points in a reference image; reference distance image feature point acquisition means for acquiring a plurality of reference distance image feature points in a reference distance image; scene image association means for obtaining correspondences between the scene image feature points and the reference image feature points; scene distance image association means for obtaining correspondences between the scene distance image feature points and the reference distance image feature points; transformation matrix creation means that takes, as a selected image, whichever of the scene image and the scene distance image has more correspondences and creates, for each selected image, a coordinate transformation matrix that converts coordinates between the selected image and the corresponding reference image or reference distance image, based on the correspondence between their feature points; and corresponding point calculation means that, for each of a plurality of the scene images, uses the created coordinate transformation matrix to convert the coordinates of the gaze point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point.

The line-of-sight analysis apparatus according to the present invention acquires a scene image captured by the camera of the line-of-sight tracking apparatus and a gaze point on the scene image, and also acquires a scene distance image based on the distance measured by the distance sensor of the line-of-sight tracking apparatus. It extracts a plurality of scene image feature points from the scene image and a plurality of scene distance image feature points from the scene distance image; associates the scene image feature points with the reference image feature points and the scene distance image feature points with the reference distance image feature points; takes, as a selected image, whichever of the scene image and the scene distance image has more associations; creates, for each selected image, a coordinate transformation matrix that converts coordinates between the selected image and the corresponding reference image or reference distance image, based on the correspondence between their feature points; and, for each of the plurality of scene images, uses the created matrix to convert the coordinates of the gaze point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point. Therefore, when combined with a line-of-sight tracking apparatus that includes a camera and a distance sensor and acquires the scene image captured by the camera, the scene distance image based on the distance measured by the distance sensor, and the corresponding gaze point of the subject, the apparatus can identify the location on the object to which the viewer paid attention even when the object is in an environment unsuited to imaging.

The present invention has the effect that the location on an object to which a viewer paid attention can be identified even when the object is in an environment unsuited to imaging.

FIG. 1 is an external view showing an outline of a line-of-sight analysis system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the hardware configuration of the line-of-sight analysis system according to the embodiment.
FIG. 3 is a flowchart showing the processing operation of the line-of-sight analysis system according to the embodiment.
FIG. 4 is a diagram showing scene images and gaze points acquired by the line-of-sight tracking apparatus 200.
FIG. 5 is a diagram showing a gaze point and feature points on a scene image.
FIG. 6 is a diagram showing the correspondence between reference image feature points and scene image feature points.
FIG. 7 is a diagram showing the conversion formula from a gaze point of a scene image to a corresponding point of the reference image.
FIG. 8 is a diagram showing an example of coordinate transformation from a gaze point of a scene image to a corresponding point of the reference image.
FIG. 9 is a diagram showing the relationship between the reference image and the corresponding points obtained by converting gaze points.
FIG. 10 is a diagram showing an example of a heat map created by the line-of-sight analysis system according to the embodiment.

<1. System configuration>
Preferred embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is an external view showing an outline of a line-of-sight analysis system according to an embodiment of the present invention. In FIG. 1, reference numeral 100 denotes the line-of-sight analysis apparatus, 200 the line-of-sight tracking apparatus, and 300 the object. FIG. 1 shows an example in which a printed flyer is used as the object.

The line-of-sight tracking apparatus 200 is a glasses-type device. Its attached camera images the object 300 from a position close to the viewpoint of the person wearing it to obtain a scene image, and its attached distance sensor acquires a scene distance image based on the distance from the wearer. In synchronization with the imaging timing and the distance measurement timing, the viewpoint position is acquired by an infrared sensor, and the gaze point on the scene image is obtained. The scene images, the gaze point corresponding to each scene image, and the scene distance images are passed to the line-of-sight analysis apparatus 100 as data by transmission or other means. The line-of-sight analysis apparatus 100 stores a reference image and a reference distance image of the object 300 in advance. It associates the feature points of each scene image acquired from the line-of-sight tracking apparatus 200 with the feature points of the reference image, and likewise associates the feature points of each scene distance image with the feature points of the reference distance image. Taking whichever image yields more associations as the selected image, it calculates a coordinate transformation matrix between the selected image and the reference image or reference distance image, and uses that matrix to convert the gaze point of each scene image into a corresponding point on the reference image.
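The selection and conversion steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the tie-breaking rule is invented for the sketch, and the coordinate transformation matrix is assumed to be a 3x3 matrix applied in homogeneous coordinates, which this section does not itself specify.

```python
def choose_selected_image(n_scene_matches, n_distance_matches):
    """Pick whichever image type has more feature-point associations."""
    # Preferring the scene image on a tie is an assumption of this sketch.
    return "scene" if n_scene_matches >= n_distance_matches else "distance"


def apply_transform(matrix, point):
    """Map a gaze point (x, y) through a 3x3 matrix (assumed form)."""
    x, y = point
    u = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    v = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this matrix")
    # Divide by w to return from homogeneous to image coordinates.
    return (u / w, v / w)
```

For example, a matrix that scales by 2 and shifts by (10, 5) maps the gaze point (3, 4) to (16, 13) on the reference image.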

FIG. 2(a) is a hardware configuration diagram of the line-of-sight analysis apparatus 100. The line-of-sight analysis apparatus 100 can be realized by installing a dedicated program on a general-purpose computer. As shown in FIG. 2, the line-of-sight analysis apparatus 100 includes a CPU (Central Processing Unit) 10a; a RAM (Random Access Memory) 10b serving as main memory; a storage device 10c, such as a hard disk or flash memory, for storing the programs executed by the CPU 10a and data; an instruction input unit 10d such as a keyboard and mouse; a data input/output I/F (interface) 10e for data communication with external devices such as the line-of-sight tracking apparatus 200 and data storage media; and a display unit 10f, which is a display device such as a liquid crystal display. These components are connected to one another via a bus.

The storage device 10c stores the reference image together with reference image feature points, which are feature points extracted from the reference image by the methods described in Patent Document 1 and Non-Patent Documents 1 to 7, and reference distance image feature points, which are feature points extracted from the reference distance image, each in association with its image. The reference image is either the image data of the object itself or an image obtained by photographing the object from the front. The reference distance image is obtained by placing a distance sensor in front of the object, measuring distances, and building from the measured distances a data array that associates coordinates with distances. Because this data array can be rendered as an image by displaying the value at each coordinate as a gray level on a display device, it is referred to here as a reference distance image. The reference image and the reference distance image are prepared with the same pixel size (number of pixels vertically × horizontally).
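To make the idea of rendering a distance array as an image concrete, here is a minimal sketch that linearly maps distances to gray levels. The function name, the 0 to 255 range, and the choice that the nearest point becomes level 0 are all assumptions of this sketch; the text above only states that each value is displayed as a gradation.

```python
def distance_image_to_gray(distances, max_level=255):
    """Map a 2-D list of distances to integer gray levels 0..max_level."""
    flat = [d for row in distances for d in row]
    nearest, farthest = min(flat), max(flat)
    span = farthest - nearest
    if span == 0:
        # A perfectly flat scene has no contrast; use a single level.
        return [[0 for _ in row] for row in distances]
    return [
        [round((d - nearest) / span * max_level) for d in row]
        for row in distances
    ]
```

Once in this form, the distance data can be handled like an ordinary grayscale image with the same pixel size as the reference image.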

FIG. 2(b) is a hardware configuration diagram of the line-of-sight tracking apparatus 200. The infrared light source 201 irradiates the subject's eyeball 301 (the cornea (pupil) and the sclera (white of the eye)) with infrared light 207. The infrared sensor 202 receives infrared light 208 (reflected light) reflected from the eyeball 301. The scene camera 203 is a two-dimensional image sensor. It captures, as a scene image, a view roughly equivalent to the subject's visual field (an image that includes the flyer and is centered on the gaze point). The distance sensor 206 measures the distance to the object. The line-of-sight tracking apparatus 200 shown in FIG. 2(b) is realized by attaching a known distance sensor to the known line-of-sight tracking apparatus described in Patent Document 1.

Distance sensors of various known types can be used as the distance sensor 206. This embodiment adopts the TOF (Time-of-Flight) method, which measures distance from the time a projected laser takes to travel to the object and back. Accordingly, the distance sensor 206 shown in FIG. 2 includes a laser light source, a time measurement circuit, and a light receiving unit.
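The TOF relation itself is simple: the pulse covers the distance twice, so the distance is half the round-trip time multiplied by the speed of light. The sketch below shows this calculation; the function name is hypothetical, and a real sensor would also handle calibration and noise, which are not covered here.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_distance_m(round_trip_time_s):
    """Distance to the object, in meters, from the laser round-trip time."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0
```

At reading distances of around one meter the round trip takes only a few nanoseconds, which is why the sensor needs a dedicated time measurement circuit.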

The processing unit 204 includes an arithmetic processing circuit, a storage circuit, a connector, and a secondary battery. The processing unit 204 measures the line of sight from the received reflected light 208 and estimates its direction based on, for example, the sclera reflection method (limbus tracking method; see Patent Document 2), which exploits the difference in reflectance between the cornea and the sclera of the eyeball 301. Instead of the sclera reflection method, the processing unit 204 may use the pupil-corneal reflection method, the corneal reflection method, the dual Purkinje method, or the like. Then, as described in Patent Document 1, it uses the estimated direction of the line of sight and the position of the printed material in three-dimensional space to obtain the coordinates of the gaze point on the scene image, which is a two-dimensional plane. The processing unit 204 transmits the obtained gaze point and the captured scene image to the line-of-sight analysis apparatus 100, to which it is connected for communication. It likewise transmits the acquired scene distance image to the line-of-sight analysis apparatus 100. The connector provides the communication connection between the line-of-sight tracking apparatus 200 and the line-of-sight analysis apparatus 100. The secondary battery supplies power to each circuit of the processing unit 204 itself, the infrared light source 201, the infrared sensor 202, and the scene camera 203. In the processing unit 204, the arithmetic processing circuit executes each of the above processes according to a program stored in the storage circuit.

<2. Processing operation>
Next, the processing operation of the line-of-sight analysis system shown in FIGS. 1 and 2 is described with reference to the flowchart shown in FIG. 3. First, the operator starts the glasses-type line-of-sight tracking apparatus 200, puts it on like a pair of glasses, and begins viewing the object 300. When the line-of-sight tracking apparatus 200 is started, the scene camera 203 starts imaging, the distance sensor 206 starts measuring distance, the infrared light source 201 starts emitting light, and the infrared sensor 202 starts receiving light. The series of processes in steps S11 to S14 shown in FIG. 3 and the series of processes in steps S15 to S18 are performed in parallel.

The series of processes in steps S11 to S14 is described first. The line-of-sight tracking apparatus 200 acquires scene images and gaze points (step S11). Specifically, the scene camera 203 captures images at predetermined time intervals to acquire scene images, and the infrared sensor 202 is used to acquire the gaze point. The scene camera 203, the infrared light source 201, and the infrared sensor 202 are synchronized with one another, so a gaze point is acquired at the moment each scene image is captured. The time interval can be set as appropriate, for example to 1 second, and the number of images can also be set freely. This embodiment is described on the assumption that 10 scene images are captured at 1-second intervals. The processing unit 204 of the line-of-sight tracking apparatus 200 sequentially transmits the acquired scene images and gaze points to the line-of-sight analysis apparatus 100 via a data communication unit (not shown). Each gaze point is transmitted in a format that includes at least its coordinates in the corresponding scene image. The line-of-sight analysis apparatus 100 receives the scene images and gaze points via the data input/output I/F 10e.
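One possible way to represent the data produced in step S11 is sketched below. The record layout, field names, and helper functions are hypothetical; the text only requires that each gaze point carry at least its coordinates in the corresponding scene image, and uses 10 images at 1-second intervals as its example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GazeRecord:
    scene_index: int    # 1-based index of the scene image (C1, C2, ...)
    timestamp_s: float  # capture time relative to the first image
    gaze_x: float       # gaze point x coordinate in the scene image
    gaze_y: float       # gaze point y coordinate in the scene image


def make_capture_schedule(num_images=10, interval_s=1.0):
    """Capture times for images taken at a fixed interval."""
    return [i * interval_s for i in range(num_images)]


def pair_frames_with_gaze(timestamps, gaze_points):
    """Attach one gaze point to each synchronized scene image."""
    if len(timestamps) != len(gaze_points):
        raise ValueError("each scene image needs exactly one gaze point")
    return [
        GazeRecord(i + 1, t, x, y)
        for i, (t, (x, y)) in enumerate(zip(timestamps, gaze_points))
    ]
```

Because capture and gaze measurement are synchronized, pairing by position in the sequence is sufficient in this sketch; an unsynchronized setup would need timestamp matching instead.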

FIG. 4 is a diagram showing the relationship between the scene images C and the gaze points E. When 10 scene images are captured as described above, the line-of-sight analysis apparatus 100 receives the 10 scene images C1 to C10 and extracts the corresponding gaze points E1 to E10, one for each scene image. In FIG. 4, the black dots inside the rectangular frames representing the scene images C1 to C10 indicate the positions of the gaze points E1 to E10. Although FIG. 4 shows the scene images C1 to C10 only as rectangular frames, in practice each scene image records at least part (preferably all) of the object 300, as shown in FIG. 5. FIG. 5(a) shows an example of a single scene image, and FIG. 5(b) shows a state in which the gaze point is mapped onto the scene image.

Next, the line-of-sight analysis apparatus 100 extracts scene image feature points (step S12). Specifically, it analyzes each scene image and extracts scene image feature points, which are points that express the features of the scene image. Scene image feature points can be extracted by various known methods, for example the SIFT algorithm described in Non-Patent Document 1, the SURF (Speeded Up Robust Features) algorithm described in Non-Patent Document 2, the Randomized Tree algorithm described in Non-Patent Document 6, or the Ferns algorithm described in Non-Patent Document 7. That is, the CPU 10a extracts scene image feature points from a scene image by executing a program that implements one of these algorithms. In this embodiment, the CPU 10a executes a program corresponding to the SIFT algorithm described in Non-Patent Document 1, so that scene image feature points are extracted from each of the scene images C1 to C10. FIG. 5(c) shows one feature point Q extracted from the scene image C.

Next, the line-of-sight analysis apparatus 100 acquires reference image feature points (step S13). Specifically, it acquires from the storage device 10c the reference image feature points that were extracted from the reference image in advance and stored. The line-of-sight analysis apparatus 100 then associates the scene image feature points extracted from the scene images with the reference image feature points (step S14). Specifically, using the technique described in Patent Document 1, it extracts a predetermined number of candidate feature points from the reference image and from each scene image, and associates with each other those feature points whose SIFT feature values are highly similar.

FIG. 6 shows the result of associating the reference image feature points P with the scene image feature points Q. Using the technique described in Patent Document 1, the feature points are associated with each other as shown in FIG. 6. Whether the similarity of SIFT feature values is high is normally determined by comparing the similarity with a predetermined threshold, so the number of scene image feature points that are associated with reference image feature points differs from one scene image to another. Because the purpose of the association in step S14 is to identify corresponding points between images of the same object, various known techniques other than the one described in Patent Document 1 can also be used.
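The threshold-based association described above can be illustrated with a minimal sketch. It is not the technique of Patent Document 1: it assumes a plain nearest-neighbour search with a Euclidean distance threshold, and the names `match_features` and `max_distance`, as well as the toy three-dimensional descriptors, are hypothetical (real SIFT descriptors have 128 dimensions).

```python
import math

def match_features(scene_desc, ref_desc, max_distance=0.5):
    """Pair each scene descriptor with its nearest reference descriptor.

    A pair is kept only when the Euclidean distance is at most max_distance,
    i.e. when the similarity is judged to be high enough (assumed criterion).
    Returns a list of (scene_index, ref_index) tuples.
    """
    matches = []
    for i, s in enumerate(scene_desc):
        best_j, best_d = None, float("inf")
        for j, r in enumerate(ref_desc):
            d = math.dist(s, r)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d <= max_distance:
            matches.append((i, best_j))
    return matches

# Toy descriptors for one reference image and two scene images.
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
scene_a = [(0.1, 0.0, 0.9), (0.9, 0.1, 0.0), (0.0, 1.0, 0.1)]
scene_b = [(0.1, 0.0, 0.9), (0.5, 0.5, 0.5)]

matches_a = match_features(scene_a, reference)
matches_b = match_features(scene_b, reference)
print(len(matches_a), len(matches_b))
```

Because the second descriptor of `scene_b` is too far from every reference descriptor, the two scene images end up with different numbers of associated feature points, which is the situation the paragraph describes.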

Next, the series of processing from step S15 to step S18 will be described. First, the line-of-sight tracking apparatus 200 acquires scene distance images (step S15). Specifically, the distance sensor 206 measures distance at predetermined time intervals to acquire the scene distance images. The time interval can be set as appropriate, for example to 1 second, and the number of measurements can also be set arbitrarily. This embodiment is described on the assumption that the distance is measured 10 times at 1-second intervals. Based on the distances measured by the distance sensor 206, the processing unit 204 of the line-of-sight tracking apparatus 200 creates a data array in which coordinates and distances are associated with each other. Because this data array can be rendered as an image by displaying the value at each coordinate as a gradation on a display device, it is referred to as a scene distance image. The processing unit 204 of the line-of-sight tracking apparatus 200 sequentially transmits the acquired scene distance images to the line-of-sight analysis apparatus 100 via a data communication unit (not shown). The line-of-sight analysis apparatus 100 receives the scene distance images via the data input/output I/F 10e.
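As an illustration of why the data array can be treated as an image, the following minimal sketch builds a small coordinate-to-distance array and maps it to 8-bit gradations. The function names, the sample format `(x, y, distance)`, and the linear mapping with a `max_distance` cap are assumptions made for this example and are not specified in the description.

```python
def to_distance_image(samples, width, height):
    """Build a height x width array of distances from (x, y, distance) samples."""
    image = [[0.0] * width for _ in range(height)]
    for x, y, distance in samples:
        image[y][x] = distance
    return image

def to_gray_levels(image, max_distance):
    """Map each distance linearly to an 8-bit gradation, clamped to 0..255."""
    return [
        [min(255, max(0, int(255 * d / max_distance))) for d in row]
        for row in image
    ]

# Four hypothetical measurements (distances in metres) on a 2 x 2 grid.
samples = [(0, 0, 0.5), (1, 0, 1.0), (0, 1, 1.5), (1, 1, 2.0)]
distance_image = to_distance_image(samples, width=2, height=2)
gray = to_gray_levels(distance_image, max_distance=2.0)
print(gray)
```

Once the distances are expressed as gradations in this way, the same feature extraction that is applied to camera images can be applied to the array.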

Next, the line-of-sight analysis apparatus 100 extracts scene distance image feature points (step S16). Specifically, it analyzes each scene distance image and extracts scene distance image feature points, which are points that express the features of the scene distance image. Scene distance image feature points can be extracted from a scene distance image by various known methods. As with scene images, for example, the SIFT algorithm described in Non-Patent Document 1, the SURF (Speeded Up Robust Features) algorithm described in Non-Patent Document 2, the Randomized Tree algorithm described in Non-Patent Document 6, or the Ferns algorithm described in Non-Patent Document 7 can be adopted. That is, the CPU 10a extracts the scene distance image feature points from the scene distance image by executing a program that implements one of these algorithms. In this embodiment, the CPU 10a executes a program corresponding to the SIFT algorithm described in Non-Patent Document 1, so that scene distance image feature points are extracted from each of the scene distance images CK1 to CK10.

Next, the line-of-sight analysis apparatus 100 acquires reference distance image feature points (step S17). Specifically, it acquires from the storage device 10c the reference distance image feature points that were extracted from the reference distance image in advance and stored. It then associates the scene distance image feature points extracted from the scene distance images with the reference distance image feature points (step S18). Specifically, as in step S14, it uses the technique described in Patent Document 1 to extract a predetermined number of candidate feature points from the reference distance image and from each scene distance image, and associates with each other those feature points whose SIFT feature values are highly similar. As in step S14, the feature points are associated with each other in the manner illustrated in FIG. 6.

As described above, the series of processing in steps S11 to S14 and the series of processing in steps S15 to S18 are performed in parallel. The order between them is therefore not fixed: the series of steps S11 to S14 may be executed first, as in the description above, or the series of steps S15 to S18 may be executed first. The processing may also alternate between individual scene images and scene distance images. In any case, both series must be finished before proceeding to the next processing.

Step S11 and step S15 are preferably performed at the same time. Specifically, imaging by the camera 203 and measurement by the distance sensor 206 are performed in synchronization. Here, synchronization means that both are performed within a predetermined, relatively short time range (for example, 0.1 second).
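A minimal sketch of this notion of synchronization follows, assuming that each camera frame and each distance measurement carries a timestamp in seconds. The function name `pair_synchronized`, the timestamps, and the nearest-in-time pairing rule are hypothetical; only the 0.1-second window comes from the text.

```python
def pair_synchronized(frame_times, distance_times, tolerance=0.1):
    """Pair each camera frame with the distance measurement closest in time.

    A pair is kept only when the two timestamps differ by at most
    `tolerance` seconds. Returns (frame_index, distance_index) tuples.
    Assumes distance_times is not empty.
    """
    pairs = []
    for i, t_frame in enumerate(frame_times):
        j = min(range(len(distance_times)),
                key=lambda k: abs(distance_times[k] - t_frame))
        if abs(distance_times[j] - t_frame) <= tolerance:
            pairs.append((i, j))
    return pairs

frame_times = [0.00, 1.00, 2.00]      # camera 203 capture times (hypothetical)
distance_times = [0.04, 1.25, 2.08]   # distance sensor 206 times (hypothetical)
pairs = pair_synchronized(frame_times, distance_times)
print(pairs)
```

In this example the second frame has no distance measurement within 0.1 second, so it would not count as synchronized under the definition above.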

When the association of feature points between the reference image and each scene image in step S14 has been completed for all scene images, and the association of feature points between the reference distance image and each scene distance image in step S18 has been completed for all scene distance images, the image to be adopted is selected (step S20). Specifically, of the two images, the one for which more feature points were associated in steps S14 and S18 is selected. This is because a larger number of associations is considered to indicate higher matching accuracy.

The image to be adopted is selected for each pair of a scene image and a scene distance image. For example, the scene image C1 and the scene distance image CK1 are compared, and the one with more associated feature points is selected. Therefore, when 10 scene images and 10 scene distance images are used as in this embodiment, one of the two images bearing the same number is selected in each case, and a total of 10 images are selected. The result may be, for example, that C1, C2, C5, and C8 are selected from the scene images, and CK3, CK4, CK6, CK7, CK9, and CK10 are selected from the scene distance images.
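The per-pair selection can be sketched as follows. The match counts are invented for illustration, and the tie rule (a tie goes to the scene image) is an assumption, since the description does not say how ties are handled.

```python
def select_images(scene_counts, distance_counts):
    """For each index, pick the image type with more associated feature points.

    Ties go to the scene image (an assumption; the text defines no tie rule).
    """
    selected = []
    for n, (c, ck) in enumerate(zip(scene_counts, distance_counts), start=1):
        selected.append(f"C{n}" if c >= ck else f"CK{n}")
    return selected

# Hypothetical numbers of associated feature points for C1..C10 and CK1..CK10.
scene_counts = [40, 35, 10, 12, 50, 8, 9, 33, 5, 7]
distance_counts = [20, 30, 25, 30, 15, 22, 27, 11, 19, 21]
selected = select_images(scene_counts, distance_counts)
print(selected)
```

With these counts the selection reproduces the example in the text: four scene images and six scene distance images, ten images in total.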

Next, a coordinate transformation matrix is created using the feature points of each selected image and of the corresponding reference image or reference distance image (step S30). When a scene image has been selected, the reference image feature points (X, Y) and the scene image feature points (x, y) are substituted into the predetermined conversion formula shown in FIG. 7(a) to obtain the coordinate transformation matrix. In this embodiment, a homography matrix H, which projects one plane in three-dimensional space onto another plane, is used as the coordinate transformation matrix. FIG. 7 shows the conversion formula from the gazing point on the scene image to the corresponding point on the reference image when the homography matrix H is used. In the homography matrix H shown as the coordinate transformation matrix in FIG. 7(b), the element h33 is a scale factor, so when H is normalized with h33 = 1, the coordinate transformation matrix is determined by obtaining the remaining eight elements. Accordingly, the elements h11, h12, h13, h21, h22, h23, h31, and h32 can be obtained by substituting at least a predetermined number of pairs of corresponding reference image feature points (X, Y) and scene image feature points (x, y).
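FIG. 7 is not reproduced in this text. As a reference, the standard form of a homography normalized with h33 = 1 is written out below, using the same symbols (x, y), (X, Y), and h11 to h32 as the description; the scalar s is introduced here only to express the homogeneous scale and does not appear in the source.

```latex
s \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix}
  = H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
H = \begin{pmatrix}
      h_{11} & h_{12} & h_{13} \\
      h_{21} & h_{22} & h_{23} \\
      h_{31} & h_{32} & 1
    \end{pmatrix}

X = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + 1},
\qquad
Y = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + 1}
```

Each pair of corresponding feature points yields two linear equations in the eight unknown elements, so in this standard formulation at least four pairs (no three of them collinear) are needed to determine H.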

When a scene distance image has been selected in step S20, the reference distance image feature points (X, Y) and the scene distance image feature points (x, y) are substituted into the predetermined conversion formula shown in FIG. 7(a), and the coordinate transformation matrix is obtained in the same manner as when a scene image is selected.
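The computation of the eight elements can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: it uses exactly four correspondences and solves the resulting 8 x 8 linear system directly, whereas the description allows a predetermined number or more of correspondences. The function names `solve_linear` and `estimate_homography` and the sample points are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def estimate_homography(scene_pts, ref_pts):
    """Estimate H (with h33 = 1) from four (x, y) -> (X, Y) correspondences."""
    if len(scene_pts) != 4 or len(ref_pts) != 4:
        raise ValueError("this sketch expects exactly four correspondences")
    a, b = [], []
    for (x, y), (X, Y) in zip(scene_pts, ref_pts):
        # X * (h31 x + h32 y + 1) = h11 x + h12 y + h13, and likewise for Y.
        a.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        b.append(X)
        a.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        b.append(Y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

# Hypothetical matched feature points: scene image -> reference image.
scene_pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
ref_pts = [(10, 20), (12, 20), (12, 23), (10, 23)]
H = estimate_homography(scene_pts, ref_pts)
print(H)
```

With more than four correspondences, a least-squares or robust estimator would typically be used instead. The same routine applies unchanged whether the feature points come from a scene image or from a scene distance image.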

Next, using the created coordinate transformation matrices, the coordinates of the gazing point corresponding to each scene image are converted into coordinates on the reference image to calculate the corresponding points (step S40). Specifically, the gazing point on the scene image is coordinate-transformed into the corresponding point on the reference image using the coordinate transformation matrix created in step S30. FIG. 8 is a diagram illustrating how the gazing point E of one scene image C is coordinate-transformed into the corresponding point T on the reference image K. Because the coordinates of the reference image K and the reference distance image KK are aligned from the outset, a corresponding point TK on the reference distance image KK can be treated directly as a point on the reference image K. By transforming the gazing point of each scene image onto the reference image, a plurality of corresponding points are obtained on the reference image.
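Applying a homography to a gazing point amounts to a matrix product followed by a division by the third homogeneous coordinate. The sketch below assumes a 3 x 3 matrix given as nested lists; the function name `to_reference` and the two example matrices are hypothetical.

```python
def to_reference(h, gaze):
    """Map a gazing point (x, y) on a scene image to (X, Y) on the reference image."""
    x, y = gaze
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    X = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    Y = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return X, Y

# A matrix with scaling and translation only (hypothetical values).
H_affine = [[2.0, 0.0, 10.0], [0.0, 3.0, 20.0], [0.0, 0.0, 1.0]]
# A matrix with a perspective term h31 (hypothetical values).
H_persp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]

T1 = to_reference(H_affine, (4.0, 5.0))
T2 = to_reference(H_persp, (2.0, 4.0))
print(T1, T2)
```

The division by `w` is what distinguishes a projective transformation from an affine one; it is the step that compensates for the object being viewed at an angle.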

FIG. 9 is a diagram illustrating the relationship between the reference image K and the corresponding points T and TK. As described above, the reference image K is an image recording the object 300 as seen from the front, and there is only one such image. When the scene images C1, C2, C5, and C8 and the scene distance images CK3, CK4, CK6, CK7, CK9, and CK10 have been chosen as the selected images, the line-of-sight analysis apparatus 100 transforms the gazing points E1, E2, E5, and E8 using the coordinate transformation matrices corresponding to the scene images C1, C2, C5, and C8, respectively, and takes the transformed coordinates as the corresponding points T1, T2, T5, and T8. It likewise transforms the gazing points E3, E4, E6, E7, E9, and E10 using the coordinate transformation matrices corresponding to the scene distance images CK3, CK4, CK6, CK7, CK9, and CK10, respectively, and takes the transformed coordinates as the corresponding points TK3, TK4, TK6, TK7, TK9, and TK10. In FIG. 9, the black dots inside the rectangular frame representing the reference image K indicate the positions of the corresponding points T1 to TK10. Although FIG. 9 shows the reference image K only as a rectangular frame, in practice the image of the object 300 as seen from the front is recorded in it.

In the example of FIG. 9, a total of 10 corresponding points T and TK, one for each selected scene image or scene distance image, are obtained on the single reference image. FIG. 9 shows the case in which all corresponding points have mutually different coordinates (pixels), that is, the case in which no two corresponding points fall on the same coordinate (pixel). When a plurality of corresponding points fall on the same coordinate (pixel), the number of corresponding points at that coordinate (pixel) is counted and recorded.
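Counting corresponding points per pixel can be sketched with a dictionary keyed by pixel coordinates. Rounding the transformed coordinates to the nearest integer pixel is an assumption of this example, as are the function name `count_per_pixel` and the sample points.

```python
from collections import Counter

def count_per_pixel(points):
    """Round each corresponding point to a pixel and count how many land on it."""
    return Counter((round(x), round(y)) for x, y in points)

# Hypothetical corresponding points on the reference image.
points = [(10.2, 20.1), (10.4, 19.8), (55.0, 7.3), (9.9, 20.3)]
counts = count_per_pixel(points)
print(counts)
```

The recorded counts are the values that a later visualization step can turn into colors.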

Subsequently, a heat map is created using the reference image and the corresponding points T and TK. Specifically, the heat map is created by processing such as clustering, in which the coordinates of the reference image K are divided into predetermined regions, or by using a Gaussian filter or the like to spread the colored area beyond the corresponding points themselves. Various known techniques can be used to create a heat map from the coordinates (pixels) of specific points such as the corresponding points. The resulting heat map represents, for each measurement, the gazing point obtained from whichever is more suitable of the image captured by the camera and the distances measured by the distance sensor. Consequently, the places on the object that the viewer paid attention to can be identified even when the object is three-dimensional, has many curved surfaces, or has many areas with few features, and even when the object is imaged in a dim place.
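One of the known approaches mentioned above, spreading each corresponding point with a Gaussian weight, can be sketched as follows. This is an illustration under assumptions rather than the method of the embodiment: the function name `heat_map`, the parameter `sigma`, the three-sigma cut-off, and the sample points are all hypothetical, and the points are assumed to be integer pixel coordinates.

```python
import math

def heat_map(points, width, height, sigma=1.0):
    """Accumulate a Gaussian bump around every corresponding point.

    points: iterable of integer (x, y) pixel coordinates on the reference image.
    Returns a height x width list of lists of heat values.
    """
    radius = int(3 * sigma)  # ignore contributions beyond three sigma
    heat = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                d2 = (x - px) ** 2 + (y - py) ** 2
                heat[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    return heat

# Two corresponding points on the same pixel and one elsewhere.
points = [(2, 2), (2, 2), (6, 3)]
heat = heat_map(points, width=8, height=5)
print([round(v, 2) for v in heat[2]])
```

Pixels hit by several corresponding points accumulate larger values, and their neighbours receive smaller values, which gives the smooth colored regions typical of a heat map once the values are mapped to colors.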

FIG. 10 shows an example of a heat map created by the line-of-sight analysis system according to this embodiment. A region that contains many corresponding points is shown in a reddish color representing heat, and the color changes to yellow and then green as the number of corresponding points in the region decreases. In the example of FIG. 10, the colors corresponding to the heat (the number of corresponding points) are overlaid on the reference image, which is an image of the object, a flyer, captured from the front.

The above embodiment has been described for the case in which the object is a printed flyer. Even for a flat object such as a printed flyer, the effect of the present invention is obtained in particular when the object has many areas with few features or when it is viewed in a dim place. As described above, the present invention is also effective when a three-dimensional object with many curved surfaces is imaged in a dim place. An example of such an object is the interior of the passenger compartment of an automobile.

The preferred embodiment of the present invention has been described above, but the present invention is not limited to that embodiment, and various modifications are possible. For example, in the above embodiment, the scene camera 203 and the distance sensor 206 of the eyeglass-type line-of-sight tracking apparatus 200 are installed side by side in the left-right direction at a position corresponding to one of the lenses, but they may instead be installed one above the other, or separately at the left and right ends. Where possible, they are preferably installed adjacent to each other.

DESCRIPTION OF SYMBOLS
10a ... CPU (Central Processing Unit)
10b ... RAM (Random Access Memory)
10c ... Storage device
10d ... Instruction input unit
10e ... Data input/output I/F
10f ... Display unit
100 ... Line-of-sight analysis apparatus
200 ... Line-of-sight tracking apparatus
201 ... Infrared light source
202 ... Infrared sensor
203 ... Scene camera
204 ... Processing unit
206 ... Distance sensor
207 ... Infrared light
208 ... Infrared light (reflected light)
300 ... Object

Claims

A line-of-sight analysis system comprising a line-of-sight tracking apparatus, which has a camera that images an object and a distance sensor that measures the distance to the object and which acquires a gazing point of a subject, and a line-of-sight analysis apparatus capable of data communication with the line-of-sight tracking apparatus,
wherein the line-of-sight analysis apparatus comprises:
scene image acquisition means for acquiring a scene image captured by the camera and a gazing point on the scene image;
scene distance image acquisition means for acquiring a scene distance image based on the distance measured by the distance sensor;
scene image feature point extracting means for extracting a plurality of scene image feature points from the scene image;
scene distance image feature point extracting means for extracting a plurality of scene distance image feature points from the scene distance image;
scene image associating means for associating reference image feature points in a reference image prepared in advance with the scene image feature points;
scene distance image associating means for associating reference distance image feature points in a reference distance image prepared in advance with the scene distance image feature points;
transformation matrix creation means for selecting, from the scene image and the scene distance image, the image with the larger number of associations as a selected image, and for creating, for each selected image, a coordinate transformation matrix that transforms coordinates between the selected image and the reference image or the reference distance image, based on the correspondence between the feature points of the selected image and those of the reference image or the reference distance image; and
corresponding point calculation means for converting, for each of a plurality of scene images and using the created coordinate transformation matrix, the coordinates of the gazing point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point.

The line-of-sight analysis system according to claim 1, wherein the line-of-sight tracking apparatus synchronously performs imaging by the camera and measurement by the distance sensor.

A line-of-sight analysis apparatus connected, so as to be capable of data communication, to a line-of-sight tracking apparatus that has a camera that images an object and a distance sensor that measures the distance to the object and that acquires a scene image captured by the camera, a scene distance image based on the distance measured by the distance sensor, and a gazing point of a subject on the scene image, the line-of-sight analysis apparatus comprising:
scene image acquisition means for acquiring a scene image captured by the camera and a gazing point on the scene image;
scene distance image acquisition means for acquiring a scene distance image based on the distance measured by the distance sensor;
scene image feature point extracting means for extracting a plurality of scene image feature points from the scene image;
scene distance image feature point extracting means for extracting a plurality of scene distance image feature points from the scene distance image;
reference image feature point acquisition means for acquiring a plurality of reference image feature points in a reference image;
reference distance image feature point acquisition means for acquiring a plurality of reference distance image feature points in a reference distance image;
scene image associating means for obtaining a correspondence between the scene image feature points and the reference image feature points;
scene distance image associating means for obtaining a correspondence between the scene distance image feature points and the reference distance image feature points;
transformation matrix creation means for selecting, from the scene image and the scene distance image, the image with the larger number of correspondences as a selected image, and for creating, for each selected image, a coordinate transformation matrix that transforms coordinates between the selected image and the reference image or the reference distance image, based on the correspondence between the feature points of the selected image and those of the reference image or the reference distance image; and
corresponding point calculation means for converting, for each of a plurality of scene images and using the created coordinate transformation matrix, the coordinates of the gazing point associated with the scene image into coordinates on the reference image, thereby calculating a corresponding point.

A program for causing a computer to function as the line-of-sight analysis apparatus according to claim 3.