JP7766291B2 - Image analysis method and image analysis device

Info

Publication number: JP7766291B2
Application number: JP2021085166A
Authority: JP
Inventors: 亮古川; 道弘三鴨; 洋川崎; 志郎岡; 信治田中; 立昌佐川
Original assignee: Kyushu University NUC; Hiroshima University NUC; National Institute of Advanced Industrial Science and Technology AIST; Kindai University
Current assignee: Kyushu University NUC; Hiroshima University NUC; National Institute of Advanced Industrial Science and Technology AIST; Kindai University
Priority date: 2021-05-20
Filing date: 2021-05-20
Publication date: 2025-11-10
Anticipated expiration: 2041-05-20
Also published as: JP2022178393A

Description

Article 30, Paragraph 2 of the Patent Act applies. Date held: July 20, 2020. Meeting name and venue: international conference "42nd Annual International Conferences of the IEEE Engineering in Medicine and Biology Society in conjunction with the 43rd Annual Conference of the Canadian Medical and Biological Engineering Society" (held online).

The present disclosure relates to an image analysis method and an image analysis device, and in particular to an image analysis method and an image analysis device that analyze the correspondence between a reference pattern image and a captured image generated from a subject onto which pattern light is projected.

Techniques exist for expressing depth in a two-dimensional image generated by photographing an object (subject) with an imaging device (camera), that is, for constructing a three-dimensional image. Stereo imaging, for example, is a typical technique for representing three-dimensional images.

Stereo imaging methods are generally classified into passive stereo methods and active stereo methods. In the active stereo method, a light projector provided separately from the imaging device projects light onto the subject, and the imaging device photographs the illuminated subject. The light from the projector is reflected by the subject and reaches the imaging device, so the optical path of the light shifts according to the distance corresponding to the depth of the subject. The active stereo method uses this shift in the optical path to calculate the distance between the imaging device and the subject by triangulation, and represents a three-dimensional image (depth) based on this distance.
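The triangulation step can be illustrated with a minimal sketch. It assumes an idealized, rectified projector-camera pair with a pinhole model, in which the shift of a projected feature in the image (the disparity) is inversely proportional to depth. The function name, parameter names, and units are hypothetical, and the patent text itself gives no formula.

```python
def depth_from_disparity(focal_length_px: float, baseline_mm: float, disparity_px: float) -> float:
    """Depth by triangulation for a rectified projector-camera pair.

    Assumption (not from the patent text): pinhole model, Z = f * b / d,
    where f is the focal length in pixels, b the projector-camera baseline,
    and d the observed shift of the projected feature in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_mm / disparity_px


# Example: f = 500 px, baseline = 4 mm, observed shift = 20 px.
depth_mm = depth_from_disparity(500.0, 4.0, 20.0)
```

The formula only applies once it is known which projected feature produced which observed feature, which is the correspondence problem the rest of this description addresses.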

Patent Document 1: JP 2009-300277 A

In the active stereo method described above, calculating the distance from the shift in the optical path requires associating, on the two-dimensional plane of the subject, the original position of the light to be projected onto the subject with the position at which the light is actually projected. One active stereo method is the pattern light projection method, in which the projector projects light that forms a specific pattern (such as a grid). Compared with, for example, a method that projects spot light, the pattern light projection method can reduce the number of times the subject must be photographed.

However, the pattern light projection method must associate the original position of the pattern light with the actually projected position for each of the multiple grid cells, which makes the association difficult. The reason is that the shape of the pattern light reflected from the subject is distorted according to the depth of the subject, so it is difficult to determine which grid cell of the pattern light to be projected corresponds to which grid cell of the actually projected pattern light.

Techniques exist that solve this problem by taking into account the epipolar geometry between the projection device and the imaging device. Using epipolar geometry requires that the positional relationship between the projection device and the imaging device be known. However, when, for example, the imaging device is incorporated into an endoscope and photographs a subject inside the human body, the endoscope enters the body and the positional relationship between the projection device and the imaging device cannot be known. In such cases epipolar geometry cannot be used, and the association described above is difficult.
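To make the role of epipolar geometry concrete, the following sketch evaluates the standard epipolar constraint x_cam^T F x_proj = 0, where F is the fundamental matrix determined by the projector-camera arrangement. The matrix used in the example is an assumed, idealized F for a purely horizontal offset between projector and camera; it is not taken from the patent. The point of the sketch is that F must be known before candidate correspondences can be filtered this way.

```python
def epipolar_residual(F, x_proj, x_cam):
    """Return x_cam^T * F * x_proj for 2D points given as (x, y).

    A residual of zero means the pair is consistent with the epipolar
    geometry described by the 3x3 fundamental matrix F.
    """
    xp = (x_proj[0], x_proj[1], 1.0)  # homogeneous projector point
    xc = (x_cam[0], x_cam[1], 1.0)    # homogeneous camera point
    Fx = [sum(F[r][c] * xp[c] for c in range(3)) for r in range(3)]
    return sum(xc[r] * Fx[r] for r in range(3))


# Assumed fundamental matrix for a rectified setup (horizontal offset only):
# the constraint reduces to "both points lie on the same image row".
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

same_row = epipolar_residual(F_RECTIFIED, (10.0, 5.0), (30.0, 5.0))   # consistent pair
other_row = epipolar_residual(F_RECTIFIED, (10.0, 5.0), (30.0, 7.0))  # inconsistent pair
```

When the relative pose of projector and camera is unknown, as in the endoscope case above, no such F is available and the candidate set cannot be narrowed in this way.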

Patent Document 1 discloses a technique that achieves three-dimensional reconstruction by using a simple grid pattern made up of many straight lines in which only the vertical and horizontal directions can be distinguished, and by using the intersections of these lines as feature points. Although the technique disclosed in Patent Document 1 achieves three-dimensional reconstruction easily by using a simple pattern, it does not solve the problem described above.

A method according to one embodiment is a computer-implemented method for analyzing the correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected. The reference image includes a set of first elements extending vertically in a two-dimensional plane and a set of second elements extending horizontally in the two-dimensional plane, and at least one of the set of first elements and the set of second elements includes elements that extend intermittently. The method includes the steps of: identifying, in the captured image, each node formed by the set of first elements and the set of second elements; classifying each node based on its position relative to adjacent nodes and assigning a classification value to the node; generating a lattice graph from the classified nodes; and associating the nodes in the captured image with nodes in the reference image by inputting the classification values and the lattice graph into a graph convolutional network (GCN), the GCN being configured to learn the classification values of nodes classified from the reference image and a lattice graph generated from the reference image.

A method according to another embodiment is a computer-implemented method for analyzing the correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected. The reference image includes a set of first elements extending vertically or horizontally in a two-dimensional plane and a set of second elements extending intermittently in the same direction as the set of first elements, and the second elements are arranged at random intervals in that direction. The method includes the steps of: identifying, in the captured image, each node formed by the set of first elements and the set of second elements; classifying each node based on its angle relative to adjacent nodes and assigning a classification value to the node; generating a lattice graph from the classified nodes; and associating the nodes in the captured image with nodes in the reference image by inputting the classification values and the lattice graph into a graph convolutional network (GCN), the GCN being configured to learn the classification values of nodes classified from the reference image and a lattice graph generated from the reference image.
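As a rough illustration of what inputting classification values and a lattice graph into a GCN involves, the sketch below implements a single generic graph-convolution layer in plain Python. It is an assumption-laden toy, not the network of the embodiments: the aggregation rule (mean over each node and its neighbors), the ReLU activation, the one-hot encoding of classification values, and all names are hypothetical choices.

```python
def gcn_layer(adjacency, features, weights):
    """One generic graph-convolution layer: ReLU(mean_aggregate(features) @ weights).

    adjacency: N x N 0/1 matrix of the lattice graph (node i adjacent to node j).
    features:  N x D per-node feature vectors (e.g. one-hot classification values).
    weights:   D x M weight matrix (fixed here; learned in a real network).
    """
    num_nodes = len(adjacency)
    in_dim = len(features[0])
    out_dim = len(weights[0])
    output = []
    for i in range(num_nodes):
        # Each node aggregates over itself and its adjacent nodes.
        group = [j for j in range(num_nodes) if j == i or adjacency[i][j]]
        mean = [sum(features[j][k] for j in group) / len(group) for k in range(in_dim)]
        output.append([
            max(0.0, sum(mean[k] * weights[k][m] for k in range(in_dim)))
            for m in range(out_dim)
        ])
    return output


# Three nodes in a row; classification values encoded as one-hot vectors.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for illustration only
hidden = gcn_layer(adjacency, features, weights)
```

After such a layer, each node's vector mixes its own classification value with those of its neighbors, which is what lets locally ambiguous nodes become distinguishable once several layers are stacked.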

The image analysis method and image analysis device according to the embodiments facilitate the association between an original pattern and the pattern contained in an image obtained by photographing a subject.

FIG. 1 is a diagram illustrating an example of the configuration of an image analysis system.
FIG. 2 is a block diagram illustrating an example of the configuration of a computer device.
FIG. 3 is a diagram illustrating an example of a reference image according to the first embodiment.
FIG. 4 is a diagram illustrating another example of the reference image according to the first embodiment.
FIG. 5 is a flowchart illustrating an example of processing executed by the image analysis system according to the first embodiment.
FIG. 6 is a diagram illustrating an example of classifying nodes in an image.
FIG. 7 is a diagram illustrating a region centered on a marker.
FIG. 8 is a block diagram illustrating an example of the configuration of a GCN.
FIG. 9 is a diagram illustrating an operation that takes into account feature vectors, relationships with adjacent nodes, and weight values.
FIG. 10 is a flowchart illustrating an example of processing executed by the image analysis system according to the second embodiment.
FIG. 11 is a diagram illustrating an example of processing for calculating the inner product of feature vectors of a reference image and a captured image according to the second embodiment.
FIG. 12 is a diagram illustrating an example of a reference image according to the third embodiment.
FIG. 13 is a diagram illustrating an example of processing for obtaining, from a captured image according to the third embodiment, the phase of the grid (the position of each pixel relative to the repeating grid, expressed as a rotation angle of at least 0 and less than 2π).
FIG. 14 is a diagram illustrating an example of connection based on proximity according to the third embodiment.
FIG. 15 is a diagram illustrating an example of classifying nodes in an image according to the third embodiment.

An image analysis method and an image analysis device according to one embodiment are described in detail below with reference to the accompanying drawings. In this embodiment, the image analysis method and the image analysis device are implemented in an image analysis system. Also in this embodiment, the object onto which light from the projector is projected and which is photographed by the camera is referred to as the "subject". The subject includes any object having depth, such as a human, an animal, a thing, or a part thereof.

An image analysis system according to the embodiments is described below. The image analysis system is applied to an example in which a subject inside the human body is photographed. Based on the active stereo method, the image analysis system analyzes the correspondence between a pattern image (reference image) corresponding to the pattern light projected onto the subject and a captured image generated by photographing the subject onto which the pattern light is projected.

＜First Embodiment＞
First, an example of the configuration of an image analysis system 100 is described with reference to FIG. 1. In this embodiment, the image analysis system 100 includes a computer device 1 and an endoscope 2. The computer device 1 is coupled to the endoscope 2 via a bus, a network, or the like.

The computer device 1 is any information processing device that has at least a computing function. The computer device 1 generates an image based on an imaging signal received from an imaging device (described below), and analyzes the correspondence between a reference image corresponding to the pattern light projected by a light projecting device and a captured image generated by photographing the subject onto which the pattern light is projected.

The endoscope 2 includes an imaging device 21 (camera) and a light projecting device 22 built into its tip (head). In FIG. 1, an ellipse O2 corresponding to an ellipse O1 surrounding the tip of the endoscope 2 shows an enlarged view of the imaging device 21, the light projecting device 22, and an endoscope head 23. The endoscope head 23 is inserted into the human body, the light projecting device 22 projects pattern light PL onto the subject, and the imaging device 21 photographs the subject.

The imaging device 21 includes one or more CCD image sensors, CMOS image sensors, or the like, photographs the subject, and transmits an imaging signal to the computer device 1. To generate images with a high pixel count, it is desirable that the imaging device 21 have a larger number of image sensors arrayed in it.

The light projecting device 22 includes a diffractive optical element (DOE) 22a, a lens 22b, and an optical fiber 22c. In FIG. 1, a rectangle R2 corresponding to a rectangle R1 surrounding the light projecting device 22 shows an enlarged view of the diffractive optical element 22a, the lens 22b, and the optical fiber 22c.

The diffractive optical element 22a diffracts the pattern light PL to be projected. The lens 22b is implemented by, for example, a gradient-index (GRIN) lens. Using a gradient-index lens, whose refractive index varies in the radial direction, allows the light projecting device 22 to be made compact. Light from a light source (not shown) propagates through the optical fiber 22c and is emitted as the pattern light PL through the lens 22b and the diffractive optical element 22a. The pattern light PL is described in detail later.

In this embodiment, the imaging device 21 and the light projecting device 22 are incorporated into the endoscope 2 in order to photograph a subject inside the human body, but the embodiment is not limited to such a configuration. For example, to photograph an arbitrary subject, the imaging device that photographs the subject and the light projecting device that projects pattern light onto the subject may exist independently of each other. In other words, this embodiment employs the configuration of an active stereo system that includes at least an imaging device that photographs the subject and a light projecting device that projects pattern light onto the subject.

Next, the components of the computer device 1 are described in detail with reference to FIG. 2. The computer device 1 includes a control device 11, a memory 12, a storage device 13, a communication device 14, an input device 15, and an output device 16. The memory 12, the storage device 13, the communication device 14, the input device 15, and the output device 16 are each coupled to the control device 11 through an internal bus and are controlled by the control device 11.

The control device 11, also referred to as a processor, includes a central processing unit (CPU), a graphics processing unit (GPU), and the like. The control device 11 generates an image based on the imaging signal received from the imaging device 21 and performs operations such as analyzing, based on the pattern in the image, the correspondence between the generated image and the reference image corresponding to the pattern light PL projected by the light projecting device 22.

The memory 12 is a volatile data storage device that stores computer-executable instructions processed by the control device 11, data resulting from operations performed by those instructions, and the like. The memory 12 may be implemented by RAM (random access memory), for example SRAM (static RAM) or DRAM (dynamic RAM).

The storage device 13 is a non-volatile data storage device that stores programs and the like containing the computer-executable instructions described above. The storage device 13 may be implemented by non-volatile semiconductor memory such as ROM (read-only memory), a magnetic storage device (such as a hard disk drive), an optical disc, or the like. Data such as programs may be stored in NAS (Network Attached Storage) and/or a SAN (Storage Area Network) in addition to or instead of the storage device 13.

The communication device 14 communicates with external equipment coupled to the computer device 1, for example by receiving the imaging signal from the endoscope 2 (imaging device 21) and transmitting a signal corresponding to the reference image to the endoscope 2 (light projecting device 22).

The input device 15 accepts input from a user and transmits the accepted input to the control device 11. The input device 15 is implemented by, for example, a mouse, a touchpad, a keyboard, or a trackball.

The output device 16 outputs the results of operations performed by the control device 11 (for example, it displays the reference image generated by the control device 11). The output device 16 is implemented by, for example, a display (liquid crystal, CRT, or the like).

Although this embodiment employs a configuration in which the input device 15 and the output device 16 are incorporated into the computer device 1, the embodiment is not limited to such a configuration. Either or both of the input device 15 and the output device 16 may be configured as devices independent of the computer device 1.

Next, the reference image RI that defines the pattern light PL emitted by the light projecting device 22 is described with reference to FIG. 3. As described above, the image analysis system 100 projects the pattern light PL onto the subject. The reference image RI is an image representing a predetermined pattern and is stored in the computer device 1 or the endoscope 2. The light projecting device 22 emits the pattern light PL based on the reference image RI.

The reference image RI includes a set SE1 of elements extending vertically in a two-dimensional plane (a set of multiple first elements) and a set SE2 of elements extending horizontally in the two-dimensional plane (a set of multiple second elements). In the example shown in FIG. 3, the set of first elements SE1 is a set of straight lines extending vertically in the two-dimensional plane, and the individual lines are first elements E1a to E1n (n is any integer of 2 or more). The set of second elements SE2 is a set of straight lines extending horizontally in the two-dimensional plane, and the individual lines are second elements E2a to E2n (n is any integer of 2 or more). Each first element in the set of first elements SE1 intersects each second element in the set of second elements SE2. Note that n, both above and below, denotes any integer of 2 or more, and the occurrences of n do not all have the same value.

As shown in FIG. 3, each straight line in the set of first elements SE1 extends continuously in the vertical direction in the two-dimensional plane of the reference image RI. In contrast, part of each straight line in the set of second elements SE2 extends intermittently in the horizontal direction in the two-dimensional plane of the reference image RI. In FIG. 3, a circle C2 corresponding to a circle C1 surrounding part of the reference image RI shows an enlarged view of the region of the reference image RI enclosed by the circle C1.

In the region inside the circle C2, the first elements E1a to E1d each extend continuously in the vertical direction. The second elements E2a, E2d, and E2g extend intermittently in the horizontal direction. The second elements E2b, E2e, and E2h extend continuously in the horizontal direction. The second elements E2c, E2f, and E2i extend intermittently in the horizontal direction.

The second element E2a extends rightward from an intersection I1 with the first element E1a and terminates at an intersection I2 with the first element E1b. The second element E2d extends rightward from an intersection I3 with the first element E1b and terminates at an intersection I4 with the first element E1c. The second element E2g extends rightward from an intersection I5 with the first element E1c and terminates at an intersection I6 with the first element E1d. The second element E2d extends from the intersection I3 at a position higher than the second element E2a, and the second element E2g extends from the intersection I5 at a position lower than the second element E2d.

The second element E2b extends rightward from an intersection I7 with the first element E1a and connects to the second element E2e at an intersection I8 with the first element E1b. The second element E2e extends rightward from the intersection I8 and connects to the second element E2h at an intersection I9 with the first element E1c. The second element E2h extends rightward from the intersection I9 and connects to the next second element (no reference sign) at an intersection I10 with the first element E1d.

The second element E2c extends rightward from an intersection I11 with the first element E1a and terminates at an intersection I12 with the first element E1b. The second element E2f extends rightward from an intersection I13 with the first element E1b and terminates at an intersection I14 with the first element E1c. The second element E2i extends rightward from an intersection I15 with the first element E1c and terminates at an intersection I16 with the first element E1d. The second element E2f extends from the intersection I13 at a position lower than the second element E2c, and the second element E2i extends from the intersection I15 at a position higher than the second element E2f.
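The layout described for FIG. 3 can be modeled compactly by storing, for every horizontal row and every gap between two neighboring vertical lines, the vertical offset of the horizontal segment in that gap. The sketch below generates such an offset table. Several things are assumptions made for illustration only: that continuous and intermittent rows strictly alternate (as they do inside circle C2), that offsets are drawn at random from three levels, and all function and parameter names.

```python
import random


def make_row_offsets(num_rows, num_gaps, step=2, seed=0):
    """Return offsets[r][g]: vertical offset of the segment of row r in gap g.

    Assumed layout: odd rows are continuous (offset 0 everywhere, like
    E2b/E2e/E2h); even rows are intermittent, so consecutive segments always
    sit at different heights (like E2a/E2d/E2g).
    """
    rng = random.Random(seed)
    levels = (-step, 0, step)
    offsets = []
    for r in range(num_rows):
        if r % 2 == 1:
            offsets.append([0] * num_gaps)
            continue
        row = [rng.choice(levels)]
        while len(row) < num_gaps:
            # Force a height change at every vertical line the row crosses.
            row.append(rng.choice([v for v in levels if v != row[-1]]))
        offsets.append(row)
    return offsets


# Three rows crossing four vertical lines leave three gaps per row.
table = make_row_offsets(num_rows=3, num_gaps=3)
```

The sequence of height changes along a row is what gives each intersection a local signature, which a plain grid of unbroken lines lacks.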

In addition, n markers of arbitrary shape are arranged in the reference image RI at predetermined intersection positions. In this embodiment, nine markers M1 to M9 are arranged. The markers M1 to M9 may be arranged at regular intervals in the two-dimensional plane of the reference image RI, or at random intervals. The same applies to the reference image shown in FIG. 4, described later. Each of the markers M1 to M9 is used to improve the accuracy of the association, as described in detail later.

This embodiment employs a reference image that includes a set of elements extending continuously in the vertical direction in a two-dimensional plane and a set of elements extending intermittently in the horizontal direction in the two-dimensional plane, but the reference image is not limited to such a configuration. For example, as shown in FIG. 4, the reference image RI may include a set of elements extending intermittently in the vertical direction in a two-dimensional plane and a set of elements extending continuously in the horizontal direction in the two-dimensional plane.

As shown in FIG. 4, part of each straight line in the set of first elements SE1 extends intermittently in the vertical direction in the two-dimensional plane of the reference image RI. In contrast, each straight line in the set of second elements SE2 extends continuously in the horizontal direction in the two-dimensional plane of the reference image RI. In FIG. 4, a circle C2 corresponding to a circle C1 surrounding part of the reference image RI shows an enlarged view of the region of the reference image RI enclosed by the circle C1.

In the region inside the circle C2, the second elements E2a to E2d each extend continuously in the horizontal direction. The first elements E1a, E1d, and E1g extend intermittently in the vertical direction. The first elements E1b, E1e, and E1h extend continuously in the vertical direction. The first elements E1c, E1f, and E1i extend intermittently in the vertical direction.

The first element E1a extends downward from an intersection I1 with the second element E2a and terminates at an intersection I2 with the second element E2b. The first element E1d extends downward from an intersection I3 with the second element E2b and terminates at an intersection I4 with the second element E2c. The first element E1g extends downward from an intersection I5 with the second element E2c and terminates at an intersection I6 with the second element E2d. The first element E1d extends from the intersection I3 at a position to the left of the first element E1a, and the first element E1g extends from the intersection I5 at a position to the right of the first element E1d.

The first element E1b extends downward from an intersection I7 with the second element E2a and connects to the first element E1e at an intersection I8 with the second element E2b. The first element E1e extends downward from the intersection I8 and connects to the first element E1h at an intersection I9 with the second element E2c. The first element E1h extends downward from the intersection I9 and connects to the next first element (no reference sign) at an intersection I10 with the second element E2d.

第１の要素Ｅ１ｃは、第２の要素Ｅ２ａとの交点Ｉ１１から下方向に延在し、第２の要素Ｅ２ｂとの交点Ｉ１２において終端する。第１の要素Ｅ１ｆは、第２の要素Ｅ２ｂとの交点Ｉ１３から下方向に延在し、第２の要素Ｅ２ｃとの交点Ｉ１４において終端する。第１の要素Ｅ１ｉは、第２の要素Ｅ２ｃとの交点Ｉ１５から下方向に延在し、第２の要素Ｅ２ｄとの交点Ｉ１６において終端する。第１の要素Ｅ１ｆは、交点Ｉ１３から、第１の要素Ｅ１ｃよりも右の位置で延在し、第１の要素Ｅ１ｉは、交点Ｉ１５から、第１の要素Ｅ１ｆよりも左の位置で延在する。 The first element E1c extends downward from the intersection I11 with the second element E2a and terminates at the intersection I12 with the second element E2b. The first element E1f extends downward from the intersection I13 with the second element E2b and terminates at the intersection I14 with the second element E2c. The first element E1i extends downward from the intersection I15 with the second element E2c and terminates at the intersection I16 with the second element E2d. The first element E1f extends from the intersection I13 at a position to the right of the first element E1c, and the first element E1i extends from the intersection I15 at a position to the left of the first element E1f.

図３および図４において示したパターンはいずれも、二次元平面において縦方向に延在する直線の要素の集合、およびそれぞれが縦方向に延在する直線の要素の集合と交差し、二次元平面において横方向に延在する直線の要素の集合を含むと言える。また、上記パターンは、縦方向に延在する直線の要素の集合およびそれぞれが横方向に延在する直線の要素の集合の少なくともいずれかが、断続的に延在する要素を含むと言える。 The patterns shown in Figures 3 and 4 can both be said to include a set of straight-line elements extending vertically in a two-dimensional plane, and a set of straight-line elements extending horizontally in the two-dimensional plane, each of which intersects the vertically extending elements. Furthermore, in each of these patterns, at least one of the set of vertically extending elements and the set of horizontally extending elements includes elements that extend intermittently.

要素が二次元平面において断続的に延在するとは、その要素が、交差するもう一方の要素のうちの同一の要素と複数の位置において交差し、その交点の位置が二次元平面において異なることであると言える。図３において示したパターンでは、第２の要素が二次元平面において横方向に断続的に延在するとは、第２の要素（例えば、第２の要素Ｅ２ａおよびＥ２ｄ）が同一の第１の要素（例えば、第１の要素Ｅ１ｂ）の複数の位置において交差し（例えば、交点Ｉ２およびＩ３）、その交点の位置が二次元平面において異なることであると言える。図４において示したパターンでは、第１の要素が二次元平面において縦方向に断続的に延在するとは、第１の要素（例えば、第１の要素Ｅ１ａおよびＥ１ｄ）が同一の第２の要素（例えば、第２の要素Ｅ２ｂ）の複数の位置において交差し（例えば、交点Ｉ２およびＩ３）、その交点の位置が二次元平面において異なることであると言える。 An element extending intermittently in a two-dimensional plane means that the element intersects with the same element of the other intersecting elements at multiple positions, and the positions of the intersections are different in the two-dimensional plane. In the pattern shown in Figure 3, a second element extending intermittently in the horizontal direction in a two-dimensional plane means that a second element (e.g., second elements E2a and E2d) intersects with the same first element (e.g., first element E1b) at multiple positions (e.g., intersections I2 and I3), and the positions of the intersections are different in the two-dimensional plane. In the pattern shown in Figure 4, a first element extending intermittently in the vertical direction in a two-dimensional plane means that a first element (e.g., first elements E1a and E1d) intersects with the same second element (e.g., second element E2b) at multiple positions (e.g., intersections I2 and I3), and the positions of the intersections are different in the two-dimensional plane.

なお、本実施形態では、基準画像ＲＩが、二次元平面において縦方向に延在する直線の集合、および二次元平面において横方向に延在する直線の集合を含むが、そのような構成に限定されない。例えば、基準画像ＲＩは、二次元平面において縦方向に延在する曲線の集合などの任意の形状を有する要素の集合、および二次元平面において横方向に延在する曲線の集合などの任意の形状を有する要素の集合を含んでもよい。 In this embodiment, the reference image RI includes a set of straight lines extending vertically in a two-dimensional plane and a set of straight lines extending horizontally in a two-dimensional plane, but is not limited to such a configuration. For example, the reference image RI may include a set of elements having an arbitrary shape, such as a set of curved lines extending vertically in a two-dimensional plane, and a set of elements having an arbitrary shape, such as a set of curved lines extending horizontally in a two-dimensional plane.

次に、図５に示すフローチャートを参照して、画像分析システム１００が実行する処理の例を説明する。上述したように、本実施形態では、撮影対象に投影するパターン光に対応する基準画像と、パターン光が投影された撮影対象を撮影することによって生成された撮影画像との間の対応関係を分析する。具体的には、図３または図４に示した基準画像と、基準画像に対応するパターン光を投影した撮影対象を撮影することによって生成された撮影画像との対応関係を分析する。対応関係は、ニューラルネットワークを使用することによって分析される。 Next, an example of processing performed by the image analysis system 100 will be described with reference to the flowchart shown in Figure 5. As described above, in this embodiment, the correspondence between a reference image corresponding to a pattern light projected onto the subject and a captured image generated by capturing an image of the subject onto which the pattern light is projected is analyzed. Specifically, the correspondence between the reference image shown in Figure 3 or Figure 4 and a captured image generated by capturing an image of the subject onto which pattern light corresponding to the reference image is projected is analyzed. The correspondence is analyzed using a neural network.

撮影対象にパターン光を投影すると、奥行に応じて、撮影対象から反射したパターン光の形状が歪み、対応して、生成された撮影画像において認識されるパターンの形状も歪むことになる。従来の画像処理では、このように歪んだ形状のパターンを含む撮影画像からは必ずしも正確なパターンを認識することができないことがある。本実施形態では、ニューラルネットワークが基準画像内のパターンを構成するノードを学習し、学習したノードに関する情報から、撮影画像内のパターンを構成するノードを基準画像内のノードと対応付けて抽出する。 When patterned light is projected onto a subject, the shape of the patterned light reflected from the subject is distorted depending on the depth, and the shape of the pattern recognized in the generated captured image is also distorted accordingly. With conventional image processing, it is not always possible to accurately recognize a pattern from a captured image that contains such a distorted pattern. In this embodiment, a neural network learns the nodes that make up the pattern in the reference image, and, based on information about the learned nodes, extracts the nodes that make up the pattern in the captured image by associating them with the nodes in the reference image.

本実施形態では、図３に示したいずれかの基準画像ＲＩが予め生成され、コンピュータデバイス１の記憶装置１３が記憶されているものとする。基準画像ＲＩは、上述したマーカＭ１乃至Ｍ９を含む。 In this embodiment, it is assumed that one of the reference images RI shown in Figure 3 is generated in advance and stored in the storage device 13 of the computer device 1. The reference image RI includes the markers M1 to M9 described above.

まず、内視鏡２の投光装置２２は、基準画像ＲＩに対応するパターン光ＰＬを撮影対象に投影する（ステップＳ５０１）。基準画像ＲＩは、コンピュータデバイス１の通信装置１４から光ファイバー２２ｃを通じて投光装置２２に送信される。パターン光ＰＬは、回折光学素子２２ａによって回折するので、奥行を有する撮影対象の全体に到達する。 First, the light projecting device 22 of the endoscope 2 projects pattern light PL corresponding to the reference image RI onto the subject (step S501). The reference image RI is transmitted to the light projecting device 22 from the communication device 14 of the computer device 1 via the optical fiber 22c. The pattern light PL is diffracted by the diffractive optical element 22a and reaches the entire subject, which has depth.

次に、内視鏡２の撮像装置２１は、パターン光ＰＬが投影された撮影対象を撮影する（ステップＳ５０２）。撮影対象を撮影すると、撮影信号が撮像装置２１からコンピュータデバイス１の通信装置１４に送信され、制御装置１１が、撮影信号に基づいて撮影画像ＤＩを生成する。生成した撮影画像は、メモリ１２または記憶装置１３に記憶される。 Next, the imaging device 21 of the endoscope 2 captures an image of the subject onto which the pattern light PL is projected (step S502). When the subject is captured, an imaging signal is transmitted from the imaging device 21 to the communication device 14 of the computer device 1, and the control device 11 generates a captured image DI based on the imaging signal. The generated captured image DI is stored in the memory 12 or the storage device 13.

次に、コンピュータデバイス１の制御装置１１は、撮影画像ＤＩ内のパターンを構成する第１の要素の組の全ての第１の要素に対し、要素を構成するライン（列）を識別する。同様に、第２の要素の組の全ての第２の要素に対し、要素を構成するライン（行）を識別する（ステップＳ５０３）。すなわち、制御装置１１は、撮影画像ＤＩ内の全ての列および全ての行を識別する。識別された列および行は、識別番号が割り当てられる（列に第１の要素識別子、行に第２の要素識別子）。上述したように、第２の要素の一部は、二次元平面において断続的に延在するが、断続的に延在する要素については、線形になるラインを認識する。 Next, the control device 11 of the computer device 1 identifies the lines (columns) that make up all first elements in the set of first elements that make up the pattern in the captured image DI. Similarly, it identifies the lines (rows) that make up all second elements in the set of second elements (step S503). That is, the control device 11 identifies all columns and all rows in the captured image DI. The identified columns and rows are assigned identification numbers (first element identifiers to columns, second element identifiers to rows). As described above, some of the second elements extend intermittently in a two-dimensional plane, but for the intermittently extending elements, linear lines are recognized.

次に、制御装置１１は、ステップＳ５０３において識別した列と行との全ての交点をノードとして識別する（ステップＳ５０４）。次に、制御装置１１は、ノードごとに隣接するノードに対する相対位置に基づいて分類する（ステップＳ５０５）。識別および分類されたノードは、相対位置に基づいてラベル付けされる（撮影画像ＤＩ内のノードにラベルが付加される（分類値が付与される）。 Next, the control device 11 identifies all intersections of the columns and rows identified in step S503 as nodes (step S504). Next, the control device 11 classifies each node based on its relative position to adjacent nodes (step S505). The identified and classified nodes are labeled based on their relative positions (labels are added to the nodes in the captured image DI (classification values are assigned)).

ステップＳ５０３乃至ステップＳ５０５の処理は、深層学習において学習された学習データに基づいて実行される。本実施形態では、制御装置１１は、学習のためにＵ－Ｎｅｔを実装する。Ｕ－Ｎｅｔは、深層学習を利用した完全畳み込みネットワーク（ＦＣＮ：fully convolution network）の１つであり、画像内のセグメンテーションを推定する。セグメンテーションとは、画像を複数の領域に分割する処理を意味する。Ｕ－Ｎｅｔは、畳み込み演算、および活性化演算（ＲｅＬＵ）、最大プーリング演算などを通じて、学習データに基づいて、各々の画素が何を表すかを分類する。 The processing of steps S503 to S505 is performed based on the learning data learned through deep learning. In this embodiment, the control device 11 implements a U-Net for learning. The U-Net is a fully convolutional network (FCN) that uses deep learning and estimates segmentation within an image. Segmentation refers to the process of dividing an image into multiple regions. The U-Net classifies what each pixel represents based on the learning data through convolution operations, activation operations (ReLU), max pooling operations, and other operations.
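As an illustration of the three operations named above (convolution, ReLU activation, and max pooling), the following is a minimal pure-Python sketch of a single convolution stage. It is not the U-Net of this embodiment: the function names, the 3x3 vertical-line kernel, and the toy 5x5 image are assumptions chosen only to show how a vertically extending stripe produces a strong response that survives pooling.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding) 2D cross-correlation on a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(fmap):
    """Element-wise ReLU: negative responses are clipped to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; an odd trailing row/column is dropped."""
    return [
        [max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
         for x in range(0, len(fmap[0]) - 1, 2)]
        for y in range(0, len(fmap) - 1, 2)
    ]


# Toy input (assumption): a 5x5 image containing one vertical stripe.
toy_image = [[0, 0, 1, 0, 0] for _ in range(5)]
# Hypothetical hand-written kernel that responds to vertical lines.
vertical_kernel = [[-1, 2, -1] for _ in range(3)]

pooled = max_pool_2x2(relu(conv2d_valid(toy_image, vertical_kernel)))
```

In a trained network the kernel weights are learned from the learning data rather than written by hand, and many such stages are stacked with an upsampling path to produce a per-pixel classification.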

図６は、撮影画像ＤＩに基づいてＵ－Ｎｅｔを学習する処理の例を視覚的に示す。図６では、Ｕ－Ｎｅｔにおける学習の例を示すために、１つの画像に基づいてＵ－Ｎｅｔを学習する例を提示するが、実際には、あらゆる基準画像、および基準画像に対応するパターン光を投影した撮影対象を撮影することによって生成された撮影画像に基づいて、ノードを識別および分類するよう学習される。なお、図６に示す例では、上述したマーカＭ１乃至Ｍ９は考慮しない。また、図６は、撮影画像ＤＩ内の一部のパターンのみを示す。 Figure 6 visually illustrates an example of the process of training a U-Net based on a captured image DI. While Figure 6 presents an example of training a U-Net based on a single image to illustrate an example of training in a U-Net, in reality, the system is trained to identify and classify nodes based on any reference image and any captured image generated by capturing an image of a target onto which patterned light corresponding to the reference image is projected. Note that the example shown in Figure 6 does not take into account the markers M1 to M9 described above. Also, Figure 6 shows only some of the patterns in the captured image DI.

まず、図６（ａ）に示すように、制御装置１１が、撮影画像ＤＩを学習データとして認識する。撮影画像ＤＩは、二次元平面において縦方向に延在する第１の要素の集合ＳＥ１、および二次元平面において横方向に延在する第２の要素の集合ＳＥ２を含む。なお、撮影画像ＤＩでは、第１の要素の集合ＳＥ１は、撮影対象の奥行によって生じるパターン光の経路のずれに起因して、歪んで表される。 First, as shown in FIG. 6(a), the control device 11 recognizes the captured image DI as learning data. The captured image DI includes a first set of elements SE1 extending vertically in a two-dimensional plane, and a second set of elements SE2 extending horizontally in the two-dimensional plane. Note that in the captured image DI, the first set of elements SE1 is distorted due to a deviation in the path of the pattern light caused by the depth of the subject.

次に、ユーザは、各々の第１の要素を識別するためのラベルを付加するために、第１の要素の集合ＳＥ１のそれぞれを描くように縦マークＶＭを付加する。図６（ｂ）に示すように、縦マークＶＭは、それぞれの第１の要素をなぞるように曲線を描くことによって付加される。縦マークＶＭは、制御装置１１によって学習データとして認識される。 Next, to add a label for identifying each first element, the user adds a vertical mark VM along each first element in the set SE1. As shown in FIG. 6(b), each vertical mark VM is added by drawing a curve that traces the corresponding first element. The vertical marks VM are recognized as learning data by the control device 11.

次に、制御装置１１は、例えば、縦マークＶＭの所定のエリアを認識し、マークを付加する。本実施例では、図６（ｃ）に示すように、縦マークＶＭの二次元平面における右側の予め定められた画素数にわたる領域が縦マーク領域ＶＲ１として認識され、縦マークＶＭの二次元平面における左側の予め定められた画素数にわたる領域が縦マーク領域ＶＲ２として認識される。縦マーク領域ＶＲ１および縦マーク領域ＶＲ２は、制御装置１１によって学習データとして認識される。 Next, the control device 11 recognizes, for example, a predetermined area of the vertical mark VM and adds a mark. In this embodiment, as shown in FIG. 6(c), an area spanning a predetermined number of pixels on the right side of the vertical mark VM in the two-dimensional plane is recognized as vertical mark area VR1, and an area spanning a predetermined number of pixels on the left side of the vertical mark VM in the two-dimensional plane is recognized as vertical mark area VR2. Vertical mark area VR1 and vertical mark area VR2 are recognized as learning data by the control device 11.

同様に、各々の第２の要素を識別するためのラベルを付加するために、ユーザは、第２の要素の集合ＳＥ２のそれぞれを描くように横マークを付加する。横マークは、それぞれの第２の要素をなぞるように曲線を描くことによって付加される。横マークは、制御装置１１によって学習データとして認識される。 Similarly, to add a label for identifying each second element, the user adds a horizontal mark along each second element in the set SE2. Each horizontal mark is added by drawing a curve that traces the corresponding second element. The horizontal marks are recognized as learning data by the control device 11.

次に、制御装置１１は、例えば、横マークの所定のエリアを認識し、マークを付加する。本実施例では、図６（ｄ）に示すように、横マークの二次元平面における上側の予め定められた画素数にわたる領域が横マーク領域ＨＲ１として認識され、横マークの二次元平面における下側の予め定められた画素数にわたる領域が横マーク領域ＨＲ２として認識される。横マーク領域ＨＲ１および横マーク領域ＨＲ２は、制御装置１１によって学習データとして認識される。 Next, the control device 11 recognizes, for example, a predetermined area of the horizontal mark and adds a mark. In this embodiment, as shown in FIG. 6(d), the area covering a predetermined number of pixels above the horizontal mark in the two-dimensional plane is recognized as horizontal mark area HR1, and the area covering a predetermined number of pixels below the horizontal mark in the two-dimensional plane is recognized as horizontal mark area HR2. Horizontal mark area HR1 and horizontal mark area HR2 are recognized as learning data by the control device 11.

上述したように、第２の要素の集合ＳＥ２は、二次元平面において断続的に延在する要素を含む。断続的に延在する要素は、同一の第１の要素と２つの位置において交差し、その２つの交点が異なる。図６（ａ）の例では、交点Ｉ１は、交点Ｉ２よりも二次元平面において低い位置にあり、交点Ｉ３は、交点Ｉ４よりも二次元平面において低い位置にある。横マーク領域ＨＲ１および横マーク領域ＨＲ２は、これらの交点を覆うよう、線形になるラインとして描かれる。図６（ｄ）における破線の枠内の領域が示すように、横マーク領域ＨＲ１および横マーク領域ＨＲ２は、交点Ｉ１およびＩ２に対応して二次元平面において右上方向に延在し、交点Ｉ３およびＩ４に対応して二次元平面において右下方向に延在する。 As described above, the second element set SE2 includes elements that extend intermittently in a two-dimensional plane. The intermittently extending elements intersect with the same first element at two different points. In the example of FIG. 6(a), intersection I1 is located lower in the two-dimensional plane than intersection I2, and intersection I3 is located lower in the two-dimensional plane than intersection I4. Horizontal mark areas HR1 and HR2 are drawn as linear lines that cover these intersections. As shown by the area within the dashed line frame in FIG. 6(d), horizontal mark areas HR1 and HR2 extend upward and to the right in the two-dimensional plane corresponding to intersections I1 and I2, and extend downward and to the right in the two-dimensional plane corresponding to intersections I3 and I4.

次に、制御装置１１は、縦マーク領域ＶＲ１と縦マーク領域ＶＲ２との境界を第１の要素として識別番号（第１の要素識別子）を割り当てる。第１の要素識別子が割り当てられた第１の要素は、学習データとして認識される。同様に、制御装置１１は、横マーク領域ＨＲ１と横マーク領域ＨＲ２との境界を第２の要素として識別番号（第２の要素識別子）を割り当てる。第２の要素識別子が割り当てられた第２の要素は、学習データとして認識される。なお、第１の要素識別子は、各々の第１の要素を識別するための任意の記号、形状、および色などを有してもよい。第２の要素識別子も同様である。 Next, the control device 11 assigns an identification number (first element identifier) to the boundary between the vertical mark area VR1 and the vertical mark area VR2 as a first element. The first element to which the first element identifier is assigned is recognized as learning data. Similarly, the control device 11 assigns an identification number (second element identifier) to the boundary between the horizontal mark area HR1 and the horizontal mark area HR2 as a second element. The second element to which the second element identifier is assigned is recognized as learning data. Note that the first element identifier may have any symbol, shape, color, etc. to identify each first element. The same applies to the second element identifier.
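The relationship between a traced mark, its two side regions, and the element recovered as their boundary can be sketched as follows. This is a hedged illustration only: the label values, the function names, the `band` parameter (standing in for the "predetermined number of pixels"), and the choice to assign the mark pixel itself to VR1 are all assumptions, and the sketch handles a single vertical mark.

```python
VR1, VR2, BACKGROUND = 1, 2, 0  # hypothetical label values


def mark_regions(curve_x, width, height, band):
    """Label a band of `band` pixels on each side of one traced vertical mark.

    curve_x[y] is the x-coordinate of the traced mark in image row y.
    Pixels left of the mark become VR2; the mark pixel and the pixels to
    its right become VR1 (an assumption of this sketch).
    """
    labels = [[BACKGROUND] * width for _ in range(height)]
    for y in range(height):
        cx = curve_x[y]
        for x in range(max(0, cx - band), min(width, cx + band)):
            labels[y][x] = VR2 if x < cx else VR1
    return labels


def boundary_from_regions(labels):
    """Recover the element position in each row as the VR2 -> VR1 transition."""
    boundary = []
    for row in labels:
        xs = [x for x in range(1, len(row))
              if row[x - 1] == VR2 and row[x] == VR1]
        boundary.append(xs[0] if xs else None)
    return boundary


# Toy traced mark (assumption): a slightly bent vertical curve.
region_labels = mark_regions([3, 3, 4, 4], width=8, height=4, band=2)
element_columns = boundary_from_regions(region_labels)
```

Representing an element as the boundary between two wide regions, rather than as a one-pixel line, gives a segmentation network larger areas to classify; the horizontal marks and regions HR1 and HR2 could be handled the same way with rows and columns exchanged.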

なお、Ｕ－Ｎｅｔでは、例えば、基準画像を出力装置１６に表示し、表示された基準画像に対して、ユーザが入力装置１５を介して縦マークおよび横マークを入力することによって、行および列を認識するよう学習されてもよい。 In addition, U-Net may learn to recognize rows and columns by, for example, displaying a reference image on the output device 16 and having the user input vertical and horizontal marks on the displayed reference image via the input device 15.

次に、制御装置１１は、第１の要素と第２の要素との交点をノードとして認識する。そして、制御装置１１は、全てのノードを、隣接するノードに対する相対位置に基づいて分類し、ラベルを付加する（分類値を付与する）。本実施形態では、隣接する２つのノードの相対位置に基づいて、２つのノードが二次元平面において同一または略同一の高さの位置にあること、２つのノードのうち左に位置するノードの方が二次元平面において高い位置にあること、または２つのノードのうち右に位置するノードの方が二次元平面において高い位置にあること、の３個のクラスに分類される。 Next, the control device 11 recognizes the intersections of the first elements and the second elements as nodes. The control device 11 then classifies every node based on its relative position with respect to adjacent nodes and adds a label (assigns a classification value). In this embodiment, each node is classified, based on the relative positions of two adjacent nodes, into one of three classes: (1) the two nodes are at the same or approximately the same height in the two-dimensional plane; (2) the left node of the two is higher in the two-dimensional plane; or (3) the right node of the two is higher in the two-dimensional plane.

なお、実際には、ノードの隣接するノードに対する相対位置を認識することができないことがあるので、このようなノードをｕｎｋｎｏｗｎクラスとして分類してもよい。この場合、ノードは、４個のクラスに分類される。本実施形態では、ノードを４個のクラスに分類する。図６（ｅ）は、ラベル付けされたノードを示す。 In practice, it may be impossible to recognize the relative position of a node relative to its adjacent nodes, so such nodes may be classified as unknown. In this case, the nodes are classified into four classes. In this embodiment, the nodes are classified into four classes. Figure 6(e) shows the labeled nodes.

２つのノードが二次元平面において同一または略同一の高さの位置あることとは、例えば、２つのノードの二次元平面における高さの差が、予め定められた閾値範囲（例えば、ｍ画素（ｍは任意の数））にあることを意味する。このように分類されるノードは、赤（Ｒ）ラベルが付加され、図６（ｅ）では、白抜きの円がＲラベルを表す。 Two nodes being at the same or nearly the same height in a two-dimensional plane means, for example, that the difference in height between the two nodes in the two-dimensional plane is within a predetermined threshold range (for example, m pixels (m is an arbitrary number)). Nodes classified in this way are labeled red (R), and in Figure 6(e), the open circle represents the R label.

Ｒラベルが付加されるノードは、パターンにおいて、連続的に延在する第２の要素と第１の要素との交点に対応する。図６（ａ）に示した例では、交点Ｉ７およびＩ８を構成する第２の要素は連続的に延在しているので、交点Ｉ７およびＩ８に対応して、ノードＮ４およびＮ５が識別され、Ｒラベルが付加される。同様に、ノードＮ６乃至Ｎ１１、Ｎ１４、およびＮ１６乃至Ｎ２０も、連続的に延在する第２の要素と第１の要素との交点に対応して、Ｒラベルが付加される。 Nodes labeled with an R correspond to the intersections of continuously extending second and first elements in the pattern. In the example shown in Figure 6(a), the second elements that make up intersections I7 and I8 extend continuously, so nodes N4 and N5 are identified and labeled with an R in correspondence with intersections I7 and I8. Similarly, nodes N6 to N11, N14, and N16 to N20 also correspond to the intersections of continuously extending second and first elements, and are labeled with an R.

２つのノードのうち右に位置するノードの方が二次元平面において高い位置にあることとは、例えば、右に位置するノードが左に位置するノードよりも二次元平面において高い位置にあり、２つのノードの二次元平面における高さの差が、予め定められた閾値を上回ることを意味する。このように分類されるノードは、緑（Ｇ）ラベルが付加され、図６（ｅ）では、黒塗りの円がＧラベルを表す。 When the right-side node of two nodes is positioned higher on a two-dimensional plane, it means, for example, that the right-side node is positioned higher on a two-dimensional plane than the left-side node, and the difference in height between the two nodes on the two-dimensional plane exceeds a predetermined threshold. Nodes classified in this way are labeled green (G), and in Figure 6(e), a black circle represents the G label.

Ｇラベルが付加されるノードは、パターンにおいて、断続的に延在する第２の要素と第１の要素との２つ交点（右に位置する第２の要素との交点の方が、左に位置する第２の要素との交点よりも高い位置にある）に対応する。図６（ａ）に示した例では、交点Ｉ１およびＩ２を構成する第２の要素は断続的に延在し、右に位置する第２の要素との交点Ｉ２の方が、左に位置する第２の要素との交点Ｉ１よりも高い位置にあるので、交点Ｉ１およびＩ２に対応して、ノードＮ１が識別され、Ｇラベルが付加される。同様に、交点Ｉ５およびＩ６を構成する第２の要素は断続的に延在し、右に位置する第２の要素との交点Ｉ６の方が、左に位置する第２の要素との交点Ｉ５よりも高い位置にあるので、交点Ｉ５およびＩ６に対応して、ノードＮ３が識別され、Ｇラベルが付加される。同様に、ノードＮ１３も、断続的に延在する第２の要素と第１の要素との交点に対応して、Ｇラベルが付加される。 Nodes labeled with the G label correspond to two intersections between intermittently extending second elements and first elements in the pattern (the intersection with the second element on the right is higher than the intersection with the second element on the left). In the example shown in Figure 6(a), the second elements constituting intersections I1 and I2 extend intermittently, and intersection I2 with the second element on the right is higher than intersection I1 with the second element on the left. Therefore, node N1 is identified corresponding to intersections I1 and I2 and labeled with the G label. Similarly, the second elements constituting intersections I5 and I6 extend intermittently, and intersection I6 with the second element on the right is higher than intersection I5 with the second element on the left. Therefore, node N3 is identified corresponding to intersections I5 and I6 and labeled with the G label. Similarly, a G label is added to node N13, corresponding to the intersection between the intermittently extending second element and the first element.

２つのノードのうち左に位置するノードの方が二次元平面において高い位置にあることとは、例えば、左に位置するノードが右に位置するノードよりも二次元平面において高い位置にあり、２つのノードの二次元平面における高さの差が、予め定められた閾値を上回ることを意味する。このように分類されるノードは、青（Ｂ）ラベルが付加され、図６（ｅ）では、網掛けの円がＢラベルを表す。 The left node of two nodes being higher in the two-dimensional plane means, for example, that the left node is positioned higher in the two-dimensional plane than the right node, and the difference in height between the two nodes in the two-dimensional plane exceeds a predetermined threshold. Nodes classified in this way are labeled blue (B), and in Figure 6(e), the shaded circle represents the B label.

Ｂラベルが付加されるノードは、パターンにおいて、断続的に延在する第２の要素と第１の要素との２つ交点（左に位置する第２の要素との交点の方が、右に位置する第２の要素との交点よりも高い位置にある）に対応する。図６（ａ）に示した例では、交点Ｉ３およびＩ４を構成する第２の要素は断続的に延在し、左に位置する第２の要素との交点Ｉ３の方が、右に位置する第２の要素との交点Ｉ４よりも高い位置にあるので、交点Ｉ３およびＩ４に対応して、ノードＮ２が識別され、Ｂラベルが付加される。同様に、ノードＮ１２およびＮ１５も、断続的に延在する第２の要素と第１の要素との交点に対応して、Ｂラベルが付加される。 Nodes labeled with the B label correspond to two intersections between intermittently extending second elements and a first element in the pattern (the intersection with the second element on the left is higher than the intersection with the second element on the right). In the example shown in Figure 6(a), the second elements constituting intersections I3 and I4 extend intermittently, and intersection I3 with the second element on the left is higher than intersection I4 with the second element on the right. Therefore, node N2 is identified corresponding to intersections I3 and I4 and labeled with the B label. Similarly, nodes N12 and N15 correspond to intersections between intermittently extending second elements and first elements, and are also labeled with the B label.
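The R, G, B, and unknown classification rules described above reduce to a threshold comparison on the heights of the left and right intersection points. The following minimal sketch makes that rule explicit. The function name and arguments are hypothetical, and the sketch assumes the usual image convention in which the y-coordinate grows downward, so a smaller y means a higher position.

```python
def classify_node(left_y, right_y, threshold):
    """Classify a node from the heights of its left and right intersections.

    left_y / right_y: y-coordinates in pixels (smaller y = higher position,
    an assumption of this sketch), or None when a point was not detected.
    threshold: the predetermined height tolerance m, in pixels.
    """
    if left_y is None or right_y is None:
        return "unknown"            # relative position cannot be determined
    diff = left_y - right_y         # positive when the right point is higher
    if abs(diff) <= threshold:
        return "R"                  # same or approximately the same height
    return "G" if diff > 0 else "B"  # G: right higher, B: left higher


example_labels = [
    classify_node(10, 10, 2),    # continuous second element
    classify_node(20, 10, 2),    # right intersection is higher
    classify_node(10, 20, 2),    # left intersection is higher
    classify_node(None, 20, 2),  # one intersection not detected
]
```

In the embodiment this decision is produced by the trained network rather than by an explicit rule; the sketch only restates the class definitions.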

本実施形態では、相対位置に基づいてノードを３個のクラスに分類する例を示したが、分類するクラスの数は３に限定されない。例えば、基準画像ＲＩにおいて、第１の要素および第２の要素のいずれもが、断続的に延在する要素を含む場合、隣接するノードのうち上に位置するノードが下に位置するノードよりも二次元平面において右（または、左）に位置することなどのクラスに分類されてもよい。なお、ラベルを上述した色で表すことは例示にすぎず、相対位置を示す任意の記号などがラベルとして付加されてもよい。 In this embodiment, an example has been shown in which nodes are classified into three classes based on their relative positions, but the number of classes to be classified is not limited to three. For example, if both the first element and the second element in the reference image RI include elements that extend intermittently, adjacent nodes may be classified into classes in which the upper node is located to the right (or left) of the lower node in a two-dimensional plane. Note that the use of colors to represent labels as described above is merely an example, and any symbol indicating relative position may be added as a label.

このようにして、撮影画像ＤＩがＵ－Ｎｅｔに入力され、撮影画像ＤＩから列（第１の要素）および行（第２の要素）が識別され、ノードが識別及び分類される。Ｕ－Ｎｅｔは、上述した手順に従って、あらゆる基準画像、および基準画像に対応するパターン光を投影した撮影対象を撮影することによって生成された撮影画像に基づいて、列および行を認識し、ノードを識別および分類するよう学習される。 In this way, the captured image DI is input to U-Net, columns (first elements) and rows (second elements) are identified from the captured image DI, and nodes are identified and classified. Following the procedure described above, U-Net is trained to recognize columns and rows and identify and classify nodes based on any reference image and a captured image generated by capturing an object onto which a pattern of light corresponding to the reference image is projected.

Ｕ－Ｎｅｔは、ノードごとに、第１の要素識別子および第２の要素識別子（ノードは、第１の要素識別子および第２の要素識別子の組によって識別される）、ならびにラベルを出力する。ノードを識別するための第１の要素識別子および第２の要素識別子は、２次元の特徴ベクトルとして表される。また、３個のクラス（ＲＧＢ）およびｕｎｋｎｏｗｎクラスのラベルは、４次元の特徴ベクトルＦとして表される。 For each node, U-Net outputs a first element identifier and a second element identifier (a node is identified by a pair of a first element identifier and a second element identifier), as well as a label. The first element identifier and the second element identifier used to identify the node are represented as a two-dimensional feature vector. The labels of the three classes (RGB) and the unknown class are represented as a four-dimensional feature vector F.

上述したように、ノードが識別および分類されると、制御装置１１は、ノードの分類（つまり、ノードの位置関係）および隣接するノードとの隣接関係に基づいて、ノード同士を接続するエッジを付与する。ステップＳ５０３乃至Ｓ５０５の処理では、撮影画像ＤＩから６次元（２次元（第１の要素識別子および第２の要素識別子）＋４次元（３個のクラス＋ｕｎｋｎｏｗｎクラス））のベクトルを含む特徴ベクトルＦが抽出される。また、ノードにラベルが付加された格子グラフＧが生成される。 As described above, once the nodes have been identified and classified, the control device 11 assigns edges connecting the nodes based on the node classification (i.e., the positional relationship of the nodes) and their adjacency with adjacent nodes. In the processing of steps S503 to S505, a feature vector F containing a six-dimensional vector (two dimensions (first element identifier and second element identifier) + four dimensions (three classes + unknown class)) is extracted from the captured image DI. In addition, a lattice graph G is generated in which labels are added to the nodes.
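One possible way to assemble such a lattice graph from identified nodes is sketched below. It is a simplified illustration under stated assumptions: first and second element identifiers are taken to be consecutive integers, and edges are created from adjacency alone (the embodiment also takes the node classification into account). The function name and the data layout are hypothetical.

```python
def build_lattice_graph(nodes):
    """Connect each node to its right and lower neighbour, when present.

    nodes: dict mapping (first_element_id, second_element_id) -> label.
    Returns a sorted list of undirected edges, each stored once.
    """
    edges = set()
    for (col, row) in nodes:
        for neighbour in ((col + 1, row), (col, row + 1)):
            if neighbour in nodes:  # skip nodes that were not detected
                edges.add(((col, row), neighbour))
    return sorted(edges)


# Toy 2x2 patch of labelled nodes (assumption).
toy_nodes = {(0, 0): "R", (1, 0): "G", (0, 1): "B", (1, 1): "R"}
toy_edges = build_lattice_graph(toy_nodes)
```

Because missing detections simply produce no edge, the resulting graph can contain holes, which is one reason a later correspondence step has to tolerate imperfect graphs.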

任意選択で、ステップＳ５０３乃至Ｓ５０５の処理では、対応付けの精度を高めるために、上述した基準画像ＲＩに配置されたマーカＭ１乃至Ｍ９（撮影画像ＤＩにもマーカＭ１乃至Ｍ９が配置される）が位置する所定の領域に基づいて、ノードを分類し、ノードに対してラベルを付加してもよい（分類値を付与する）。 Optionally, in the processing of steps S503 to S505, in order to improve the accuracy of the matching, the nodes may be classified and labels may be added to the nodes (assigned classification values) based on the predetermined areas in which the markers M1 to M9 placed in the reference image RI (markers M1 to M9 are also placed in the captured image DI) are located.

図７は、図３に示した基準画像ＲＩ内で、マーカＭ１乃至Ｍ９のそれぞれを中心とした所定の領域である領域Ｒ１乃至Ｒ９を示す。例えば、領域Ｒ１は、マーカＭ１が位置するノードを構成する第１の要素および第２の要素と、それらに隣接する第１の要素および第２の要素とによって構成された４個の格子を含む。領域Ｒ２乃至Ｒ９も同様に、４個の格子を含む。つまり、領域Ｒ１乃至Ｒ９はそれぞれ、マーカＭ１乃至Ｍ９がそれぞれ位置するノードと、そのノードに隣接するノードを含む。 Figure 7 shows regions R1 to R9, which are predetermined regions centered on markers M1 to M9, respectively, within the reference image RI shown in Figure 3. For example, region R1 includes four grids made up of the first and second elements that make up the node where marker M1 is located, and the first and second elements adjacent to those. Regions R2 to R9 also include four grids. In other words, regions R1 to R9 each include the nodes where markers M1 to M9 are located, and the nodes adjacent to those nodes.

領域Ｒ１乃至Ｒ９はそれぞれ、第１の要素および第２の要素から構成された格子の形状において、同一、異なる、左右対称、および上下左右対称などの関係を有する。つまり、領域Ｒ１乃至Ｒ９は相互に、領域内の中心のノードと隣接するノードとの間の関係が、同一であり、異なり、左右対称、および上下左右対称などである。よって、領域Ｒ１乃至Ｒ９内の中心ノードと隣接するノードとの関係に基づいて、ノードが分類される。 Regions R1 to R9 each have relationships such as identical, different, left-right symmetric, and vertically symmetric in the shape of a lattice made up of first and second elements. In other words, regions R1 to R9 have relationships such as identical, different, left-right symmetric, and vertically symmetric between the center node and adjacent nodes within the regions. Therefore, nodes within regions R1 to R9 are classified based on the relationship between the center node and adjacent nodes.

図７に示すように、領域Ｒ１乃至Ｒ５は、相互に異なる形状を有する。領域Ｒ６は、領域Ｒ４と、二次元平面において上下左右対称の形状を有する。領域Ｒ７は、領域Ｒ２と、二次元平面において左右対称の形状を有する。領域Ｒ８は、領域Ｒ３と、二次元平面において左右対称の形状を有する。領域Ｒ９は、領域Ｒ１と、二次元平面において上下左右対称の形状を有する。これらの分類も、Ｕ－Ｎｅｔによって学習データとして認識される。 As shown in Figure 7, regions R1 to R5 have mutually different shapes. Region R6 has a shape that is vertically and horizontally symmetrical with region R4 in a two-dimensional plane. Region R7 has a shape that is horizontally symmetrical with region R2 in a two-dimensional plane. Region R8 has a shape that is horizontally symmetrical with region R3 in a two-dimensional plane. Region R9 has a shape that is vertically and horizontally symmetrical with region R1 in a two-dimensional plane. These classifications are also recognized by U-Net as learning data.

ステップＳ５０３乃至Ｓ５０５の処理では、撮影画像ＤＩがＵ－Ｎｅｔに入力され、上述した学習によって生成された学習データに基づいて、撮影画像ＤＩ内のマーカＭ１乃至Ｍ９が５個のクラスに分類される。図７の例では、例えば、領域Ｒ１およびＲ９がクラス１、領域Ｒ２およびＲ７がクラス２、領域Ｒ３およびＲ８がクラス３、領域Ｒ４およびＲ６がクラス４、領域Ｒ５がクラス５に分類される。 In steps S503 to S505, the captured image DI is input to the U-Net, and markers M1 to M9 in the captured image DI are classified into five classes based on the learning data generated by the learning described above. In the example of Figure 7, for example, regions R1 and R9 are classified into class 1, regions R2 and R7 into class 2, regions R3 and R8 into class 3, regions R4 and R6 into class 4, and region R5 into class 5.

なお、実際には、マーカの周囲の領域またはその形状を識別することができないこともあるので、そのようなマーカをｕｎｋｎｏｗｎクラスとして分類してもよい。この場合、マーカは、６個のクラスに分類される。本実施形態では、マーカを６個のクラスに分類する。 In practice, it may be impossible to identify the area surrounding a marker or its shape, so such markers may be classified as unknown. In this case, markers are classified into six classes. In this embodiment, markers are classified into six classes.

これらのクラスは、それぞれのマーカが位置するノードとそのノードに隣接するノードとの関係を識別することができる。例えば、領域Ｒ１では、マーカＭ１が位置するノードの下のノード（交点Ｉ１およびＩ２から構成されるノード（図７ではノードを表していない））は、Ｇラベルが付加される（右に位置するノードが左に位置するノードよりも二次元平面において高いと分類される）。よって、分類されたクラスごとにこれらの位置関係を学習することによって、基準画像と、撮影対象に投影するパターン光に対応するパターン光が投影された撮影対象を撮影することによって生成された画像との間の対応関係を分析することを容易にする。 These classes make it possible to identify the relationship between the node where each marker is located and the nodes adjacent to that node. For example, in region R1, the node below the node where marker M1 is located (the node formed by intersections I1 and I2; this node is not depicted in Figure 7) is given the G label (the node on the right is classified as being higher in the two-dimensional plane than the node on the left). Therefore, learning these positional relationships for each class makes it easier to analyze the correspondence between a reference image corresponding to the pattern light projected onto the subject and an image generated by photographing the subject onto which that pattern light is projected.

上述した任意選択の処理によって、ステップＳ５０３乃至Ｓ５０５の処理では、撮影画像ＤＩから１２次元（２次元＋４次元＋６次元（５個のクラス＋ｕｎｋｎｏｗｎクラス））のベクトルを含む特徴ベクトルＦが抽出される。 Through the optional processing described above, in steps S503 to S505, a feature vector F containing a 12-dimensional (2-dimensional + 4-dimensional + 6-dimensional (5 classes + unknown class)) vector is extracted from the captured image DI.
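The 6-dimensional and 12-dimensional layouts described above can be illustrated with a small sketch. The source states only the dimensionalities; the one-hot encoding, the function names, and the treatment of nodes that carry no marker (encoded here as six zeros) are assumptions of this sketch.

```python
NODE_CLASSES = ("R", "G", "B", "unknown")    # 4 dimensions
MARKER_CLASSES = (1, 2, 3, 4, 5, "unknown")  # 6 dimensions


def one_hot(value, classes):
    """Return a one-hot list; a value outside `classes` yields all zeros."""
    return [1.0 if value == c else 0.0 for c in classes]


def node_feature(col_id, row_id, node_label, marker_class=None,
                 use_markers=False):
    """Build the per-node feature vector.

    Without markers: 2 (identifiers) + 4 (node class) = 6 dimensions.
    With markers:    6 + 6 (marker class)             = 12 dimensions.
    """
    vec = [float(col_id), float(row_id)] + one_hot(node_label, NODE_CLASSES)
    if use_markers:
        vec += one_hot(marker_class, MARKER_CLASSES)
    return vec


basic_vector = node_feature(3, 5, "G")
marker_vector = node_feature(3, 5, "G", marker_class=2, use_markers=True)
```

A trained network would more plausibly output per-class scores than exact one-hot values; the sketch only shows how the stated dimensions add up.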

なお、本実施形態では、Ｕ－Ｎｅｔにおいて第１の要素識別子および第２の要素識別子を割り当てているが、そのような方式に限定されない。各々の第１の要素を識別する番号、および各々の第２の要素を識別する番号が事前に割り当てられ、それらの番号（要素を識別する任意の記号）が基準画像ＲＩに埋め込まれてもよい。 In this embodiment, the first element identifiers and the second element identifiers are assigned by the U-Net, but the embodiment is not limited to this approach. Numbers identifying each first element and numbers identifying each second element may be assigned in advance, and those numbers (or any symbols identifying the elements) may be embedded in the reference image RI.

また、本実施形態では、画像内のそれぞれのノードと、対応する相対位置との対応関係を学習したＵ－Ｎｅｔから、対応するノードを示す情報の集合と、および対応する相対位置を示す情報の集合を抽出しているが、使用するニューラルネットワークは、Ｕ－Ｎｅｔに限定されない。例えば、画像内のそれぞれのノードと、対応する相対位置との対応関係を学習したＵ－Ｎｅｔ以外の公知のニューラルネットワークを使用してもよい。 Furthermore, in this embodiment, a set of information indicating corresponding nodes and a set of information indicating corresponding relative positions are extracted from a U-Net that has learned the correspondence between each node in an image and its corresponding relative position, but the neural network used is not limited to a U-Net. For example, a known neural network other than a U-Net that has learned the correspondence between each node in an image and its corresponding relative position may also be used.

また、ニューラルネットワークを使用せず、画像内の隣接する２つのノードを認識し、２つのノードの相対位置を判定してもよい（例えば、ハフ変換、射影変換、および／または二値化などの画像処理によって）。つまり、ステップＳ５０３乃至Ｓ５０５の処理はいずれも、ニューラルネットワークを使用するか否かに関わらず、画像内のそれぞれの列および行を識別し、ノードを分類する。 Alternatively, two adjacent nodes in an image may be recognized and their relative positions determined (e.g., by image processing such as Hough transform, projective transformation, and/or binarization) without using a neural network. That is, steps S503 through S505 all identify respective columns and rows in an image and classify nodes, regardless of whether a neural network is used.

図７の説明に戻ると、制御装置１１は、特徴ベクトルＦおよび格子グラフＧに基づいて、格子グラフＧ内の各々のノードを、対応する基準画像ＲＩから生成された格子グラフ内のノードと対応付ける（ステップＳ５０６）。 Returning to the explanation of Figure 7, the control device 11 associates each node in the grid graph G with a node in the grid graph generated from the corresponding reference image RI based on the feature vector F and the grid graph G (step S506).

上述したＵ－Ｎｅｔは、基準画像ＲＩなどを学習した結果に基づいて、グラフＧを生成している。しかしながら、Ｕ－Ｎｅｔは、必ずしも正確なグラフを生成することができるとは限らず、誤ったグラフからは、上述した対応付けを正確に行うことはできない。 The above-mentioned U-Net generates graph G based on the results of learning the reference image RI, etc. However, U-Net is not always able to generate an accurate graph, and an incorrect graph will not allow the above-mentioned correspondence to be performed accurately.

本実施形態では、制御装置１１は、学習のためにグラフ畳み込みネットワーク（ＧＣＮ：Graph Convolutional Network）を実装し、ステップＳ５０６の処理は、ＧＣＮによる深層学習において学習された学習データに基づいて実行される。ＧＣＮは、深層学習をグラフデータに適用するニューラルネットワークであり、グラフデータに対して畳み込み演算を行う。ＧＣＮにおける畳み込み演算では、各々のノードに対し、隣接ノードとの関係ごとに重み値に従って尤度が計算される。上述した特徴ベクトルＦは、撮影画像ＤＩ内の各々のノード自体の性質（相対位置）を表し、格子グラフＧは、ノード間の隣接関係を表す。 In this embodiment, the control device 11 implements a graph convolutional network (GCN) for learning, and the processing of step S506 is executed based on learning data learned through deep learning using the GCN. A GCN is a neural network that applies deep learning to graph data, and performs convolution operations on the graph data. In convolution operations in a GCN, likelihood is calculated for each node according to a weight value for each relationship with an adjacent node. The feature vector F described above represents the properties (relative position) of each node itself within the captured image DI, and the lattice graph G represents the adjacent relationships between nodes.

ＧＣＮは、あらゆる基準画像およびあらゆる撮影画像から抽出および生成された、上述したような特徴ベクトルＦおよび格子グラフＧに基づいて学習される。よって、上述した基準画像ＲＩから生成された格子グラフＧに基づいて、基準画像ＲＩ内のノードに対応するノードを抽出するようＧＣＮが学習される。よって、ＧＣＮからの出力は、入力した格子グラフＧ内のノードを表す情報である。 The GCN is trained based on the feature vectors F and lattice graph G, as described above, extracted and generated from all reference images and all captured images. Therefore, based on the lattice graph G generated from the reference image RI described above, the GCN is trained to extract nodes that correspond to nodes in the reference image RI. Therefore, the output from the GCN is information representing the nodes in the input lattice graph G.

図８は、本実施形態で実装されるＧＣＮ８００の構成を示す。ＧＣＮ８００は、入力された特徴ベクトルＦおよび格子グラフＧを所定の回数の演算を行うため、その演算の回数に従った階層構造を採用している。ＧＣＮ８００は、ＧＣＮ層８０１、全結合層８０２、および出力層８０３を含む。 Figure 8 shows the configuration of the GCN 800 implemented in this embodiment. The GCN 800 performs a predetermined number of calculations on the input feature vector F and lattice graph G, and therefore employs a hierarchical structure according to the number of calculations. The GCN 800 includes a GCN layer 801, a fully connected layer 802, and an output layer 803.

ＧＣＮ層８０１では、特徴ベクトルＦおよび格子グラフＧに基づいて、グラフ畳み込み演算部８０１ａがグラフ畳み込み演算を実行し、正規化演算部８０１ｂが正規化演算（Batch Normalization）を実行し、活性化演算部８０１ｃが活性化演算（ＲｅＬＵ）を実行する。この処理では、１２次元の特徴ベクトルＦに対しノードごとに上記演算が実行される。よって、Ｄ×Ｎ（Ｎは、ノード数、Ｄは次元数（１２））の行列Ｈが生成される。 In the GCN layer 801, based on the feature vector F and the lattice graph G, the graph convolution calculation unit 801a performs a graph convolution calculation, the normalization calculation unit 801b performs a normalization calculation (Batch Normalization), and the activation calculation unit 801c performs an activation calculation (ReLU). In this process, the above calculations are performed for each node on the 12-dimensional feature vector F. Therefore, a matrix H of D x N (N is the number of nodes, and D is the number of dimensions (12)) is generated.

ＧＣＮは、隣接するノードとの関係に基づいてノードごとに演算を行う。本実施形態では、格子グラフＧ内の各々のノードは、４方向（上方向、下方向、右方向、および左方向）に隣接するノードを有する（４方向（以下、方向数をｌで表す）に隣接するノードと隣接関係を有する）。よって、方向ごとに行列Ｈ⁽¹⁾乃至Ｈ^(l)が生成される。これらの行列は、層データ行列に累積されるので、最終的に、行列Ｈ^(l+1)が生成される。行列Ｈ^(l+1)は、式（１）によって表すことができる。 The GCN performs calculations for each node based on the relationship with adjacent nodes. In this embodiment, each node in the grid graph G has adjacent nodes in four directions (upward, downward, rightward, and leftward) (having an adjacent relationship with adjacent nodes in four directions (hereinafter, the number of directions is represented by l)). Therefore, matrices H ⁽¹⁾ to H ^(l) are generated for each direction. These matrices are accumulated in the layer data matrix, and finally, matrix H ^(l+1) is generated. Matrix H ^(l+1) can be expressed by equation (1).

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right) \qquad (1)$$

ここで、$\tilde{A} = A + I$ は、自己結合を追加した格子グラフＧの隣接行列であり、Ｉは、単位行列であり、$\tilde{D}$ は、$\tilde{A}$ の次数行列であり、Ｗ^(l)は、この層の重み行列であり、σは、活性化関数（ＲｅＬＵ）である。隣接行列Ａは、｛Ａ₀（上方向）、Ａ₁（下方向）、Ａ₂（右方向）、Ａ₃（左方向）｝である。 Here, $\tilde{A} = A + I$ is the adjacency matrix of the lattice graph G with self-connections added, I is the identity matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, W^(l) is the weight matrix of this layer, and σ is the activation function (ReLU). The adjacency matrix A consists of {A₀ (upward), A₁ (downward), A₂ (rightward), A₃ (leftward)}.
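The propagation rule that the text associates with equation (1) can be illustrated with a short NumPy sketch. It assumes the standard symmetrically normalized form (adjacency with self-connections, scaled by the inverse square root of the degree matrix) and an N x D layout for the node features; the helper names `grid_adjacency` and `gcn_propagate`, the 3 x 3 lattice, and the random weights are illustrative only and do not come from the patent.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Adjacency matrix A of a rows x cols lattice graph (4-neighbourhood)."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:              # right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < rows:              # lower neighbour
                A[i, i + cols] = A[i + cols, i] = 1.0
    return A

def gcn_propagate(A, H, W):
    """One step sigma(D~^-1/2 A~ D~^-1/2 H W) with sigma = ReLU."""
    A_tilde = A + np.eye(A.shape[0])               # add self-connections
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # diagonal of D~^-1/2
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)

rng = np.random.default_rng(0)
A = grid_adjacency(3, 3)            # N = 9 nodes
H = rng.normal(size=(9, 12))        # one 12-dimensional feature row per node
W = rng.normal(size=(12, 12))       # placeholder for a learned weight matrix
H_next = gcn_propagate(A, H, W)
```

Each output row mixes a node's own features with those of its lattice neighbours, which is the sense in which the convolution takes adjacent nodes into account.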

行列Ｈ^(l+1)の計算は、式（２）に従って実行される。式（２）における重み行列は、方向ｄ∈｛０、１、２、３｝に応じた重み行列である。 The matrix H^(l+1) is computed according to equation (2), in which the weight matrix depends on the direction d ∈ {0, 1, 2, 3}.

式（２）を実行した後、正規化演算および活性化演算が実行される。この処理が５回繰り返される。このようにして、ＧＣＮ層８０１は、行列Ｈ^(l+1)を出力する。 After executing equation (2), normalization and activation operations are performed. This process is repeated five times. In this way, the GCN layer 801 outputs the matrix H ^(l+1) .
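The following sketch shows one way the direction-dependent convolution, normalization, and activation could be chained five times. It is an assumption-laden illustration: it supposes that equation (2) sums the products A_d H W_d over the four directions, it uses a parameter-free batch normalization, and the weights are random stand-ins for learned values. The names `directional_adjacency`, `batch_norm`, and `gcn_layer` are hypothetical.

```python
import numpy as np

def directional_adjacency(rows, cols):
    """Return [A_up, A_down, A_right, A_left].
    A_d[i, j] = 1 when node j is the neighbour of node i in direction d."""
    n = rows * cols
    A = [np.zeros((n, n)) for _ in range(4)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r > 0:
                A[0][i, i - cols] = 1.0   # up
            if r + 1 < rows:
                A[1][i, i + cols] = 1.0   # down
            if c + 1 < cols:
                A[2][i, i + 1] = 1.0      # right
            if c > 0:
                A[3][i, i - 1] = 1.0      # left
    return A

def batch_norm(H, eps=1e-5):
    """Normalize each feature column over the nodes (no learned scale/shift)."""
    return (H - H.mean(axis=0)) / np.sqrt(H.var(axis=0) + eps)

def gcn_layer(A_dirs, H, W_dirs):
    """Assumed form of equation (2), followed by normalization and ReLU."""
    Z = sum(A_d @ H @ W_d for A_d, W_d in zip(A_dirs, W_dirs))
    return np.maximum(batch_norm(Z), 0.0)

rng = np.random.default_rng(0)
A_dirs = directional_adjacency(4, 5)          # N = 20 nodes
H = rng.normal(size=(20, 12))                 # 12-dimensional features
for _ in range(5):                            # the process is repeated five times
    W_dirs = [rng.normal(scale=0.3, size=(12, 12)) for _ in range(4)]
    H = gcn_layer(A_dirs, H, W_dirs)
```

Keeping a separate weight matrix per direction lets the layer treat an upper neighbour differently from a left neighbour, which a single shared adjacency matrix cannot express.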

全結合層８０２では、行列Ｈ^(l+1)に基づいて、線形変換演算部８０２ａが線形変換演算を実行し、活性化演算部８０２ｂが活性化演算（ＲｅＬＵ）を実行する。この処理では、Ｄ個の次元（本実施形態では１２次元）およびＮ個のノードごとに上記演算が実行される。よって、特徴ベクトルＦの特徴埋め込み行列（Ｎ×Ｄのサイズを有する）が生成される。このようにして、全結合層８０２は、特徴ベクトルＦの特徴埋め込み行列を出力する。 In the fully connected layer 802, a linear transformation operation unit 802a performs a linear transformation operation based on the matrix H ^(l+1) , and an activation operation unit 802b performs an activation operation (ReLU). In this process, the above operation is performed for each of D dimensions (12 dimensions in this embodiment) and N nodes. Thus, a feature embedding matrix (having a size of N×D) for the feature vector F is generated. In this way, the fully connected layer 802 outputs the feature embedding matrix for the feature vector F.

上述した演算によって、ノードごとに、自身の特徴ベクトル（つまり、隣接ノードとの相対位置を示す値）、隣接ノードとの関係、および重み付けを考慮した畳み込み演算によって、元の基準画像ＲＩ内のそれぞれのノードとの対応付けに対する精度を高めることができる。図９は、ノードの特徴ベクトル、隣接ノードとの関係、および重み値を考慮した演算を視覚的に示す。 The above-described calculations allow for convolution calculations that take into account each node's own feature vector (i.e., a value indicating its relative position with respect to adjacent nodes), its relationship with adjacent nodes, and weighting, thereby improving the accuracy of the correspondence with each node in the original reference image RI. Figure 9 visually illustrates calculations that take into account a node's feature vector, its relationship with adjacent nodes, and its weighting.

出力層８０３では、特徴埋め込み行列について、Ｓｏｆｔｍａｘ演算部８０３ａがノードごとにＳｏｆｔｍａｘ演算を実行し、ノードごとの対数尤度ベクトルを出力する。 In the output layer 803, the Softmax calculation unit 803a performs Softmax calculations for each node on the feature embedding matrix and outputs a log-likelihood vector for each node.

ＧＣＮは、Ｕ－Ｎｅｔから出力される格子グラフＧに基づいて学習される。具体的には、Ｕ－Ｎｅｔによって識別され、ラベルが付加された格子グラフＧ内のノードが学習データとして認識される。また、格子グラフ内の各々の第１の要素を識別するための第１の要素識別子が割り当てられ、各々の第２の要素を識別するための第２の要素識別子が割り当てられ、第１の要素識別子および第２の要素識別子が教師データとして使用される。 The GCN is trained based on the lattice graph G output from the U-Net. Specifically, the nodes in the lattice graph G that are identified and labeled by the U-Net are recognized as training data. In addition, a first element identifier is assigned to identify each first element in the lattice graph, and a second element identifier is assigned to identify each second element, and the first element identifier and second element identifier are used as training data.

ＧＣＮによって、ノードごとの対数尤度ベクトルが出力されると、対数尤度ベクトルに基づいて対応するノードが判定される。このような判定において、学習データから一定の確率を有すると判断されたノードが、基準画像ＲＩ内のノードに対応するノードとして抽出される。ＧＣＮは、上述した演算を行った後のノードを反映した第１の要素識別子の集合および第２の要素識別子の集合を出力する。第１の要素識別子の集合および第２の要素識別子の集合により、画像内の列および行を認識することができるので、その交点であるノードを識別することができる。 When the GCN outputs a log-likelihood vector for each node, the corresponding node is determined based on the log-likelihood vector. In this determination, nodes determined to have a certain probability from the training data are extracted as nodes corresponding to nodes in the reference image RI. The GCN outputs a set of first element identifiers and a set of second element identifiers that reflect the nodes after the above-mentioned calculations have been performed. The first set of element identifiers and the second set of element identifiers make it possible to recognize columns and rows in the image, and therefore identify nodes that are their intersections.
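As a rough illustration of how a per-node log-likelihood vector could be turned into an identifier, the sketch below applies a log-softmax to each node's scores and keeps the most likely identifier only when its probability reaches a threshold. The threshold value `min_prob`, the sentinel `-1` for rejected nodes, and the function names are assumptions made for this example; the patent only states that nodes judged to have a certain probability are extracted.

```python
import numpy as np

def log_softmax(Z):
    """Row-wise log-softmax, shifted by the row maximum for stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    return Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))

def assign_ids(logits, min_prob=0.5):
    """Pick the most likely identifier per node; -1 marks rejected nodes."""
    logp = log_softmax(logits)
    ids = logp.argmax(axis=1)
    ids[logp.max(axis=1) < np.log(min_prob)] = -1
    return ids, logp

# Three nodes scored against three candidate identifiers (made-up values).
logits = np.array([[4.0, 0.0, 0.0],
                   [0.1, 0.0, 0.2],
                   [0.0, 3.0, 0.0]])
ids, logp = assign_ids(logits)
```

In this toy input the second node has nearly uniform scores, so it falls below the threshold and is left unassigned.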

なお、本実施形態では、ＧＣＮが全てのノードを出力する負荷を考慮して、第１の要素識別子の集合および第２の要素識別子の集合を出力しているが、全ノードに識別子を割り当て（ノード識別子）、ノード識別子を出力してもよい。 In this embodiment, the GCN outputs a set of first element identifiers and a set of second element identifiers, taking into account the load of outputting all nodes. However, it is also possible to assign identifiers to all nodes (node identifiers) and output the node identifiers.

なお、パターン内のノードについての特徴ベクトルおよびノードによって構成された格子グラフに基づいて学習したＧＣＮから、基準画像内のノードに対応するノードを抽出しているが、使用するニューラルネットワークは、ＧＣＮに限定されない。例えば、特徴ベクトルおよびノードによって構成された格子グラフに基づいて学習したＧＣＮ以外の公知のニューラルネットワークを使用してもよい。グラフの接続に基づいて、グラフのノードと関連付けられた特徴ベクトルを集約する機能があってもよく、他の手法および他のニューラルネットワークでも代替が可能である。ＧＣＮ以外のネットワークを使用する場合も、格子グラフ内の各々のノードに対し、隣接ノードとの関係および重み値を考慮して演算が実行される。 Note that nodes corresponding to nodes in the reference image are extracted from a GCN trained based on feature vectors for nodes in the pattern and a lattice graph formed by the nodes, but the neural network used is not limited to a GCN. For example, a known neural network other than a GCN trained based on feature vectors and a lattice graph formed by nodes may also be used. There may be a function to aggregate feature vectors associated with nodes in the graph based on graph connections, and other methods and other neural networks are also possible. Even when a network other than a GCN is used, calculations are performed for each node in the lattice graph, taking into account the relationship with adjacent nodes and weight values.

また、ニューラルネットワークを使用せず、撮影画像内のノードの隣接するノードに対する相対位置（相対位置に基づいた分類）および隣接ノードとの関係に基づいて、ノードによって構成された格子グラフから、基準画像内のノードに対応するノードを抽出してもよい（例えば、空間フィルタリングおよび畳み込み演算などの画像処理によって）。つまり、ステップＳ５０６の処理は、ニューラルネットワークを使用するか否かに関わらず、入力された格子グラフおよび特徴ベクトル（格子グラフ内のノードの隣接ノードに対する相対位置に基づいた分類）に基づいて、隣接ノードとの関係および重み値を考慮して基準画像内のノードに対応するノードを抽出する。 Alternatively, without using a neural network, nodes corresponding to nodes in the reference image may be extracted from a lattice graph formed by nodes based on the relative positions of nodes in the captured image to adjacent nodes (classification based on relative positions) and relationships with adjacent nodes (for example, by image processing such as spatial filtering and convolution operations). In other words, regardless of whether a neural network is used, the processing of step S506 extracts nodes corresponding to nodes in the reference image based on the input lattice graph and feature vectors (classification based on the relative positions of nodes in the lattice graph to adjacent nodes), taking into account relationships with adjacent nodes and weight values.

以上のようにして、ＧＣＮを使用して、基準画像ＲＩ内の各々のノードに対応する、撮影画像ＤＩ内の各々のノードが識別される。このようにして撮影画像ＤＩ内のより多くのノードが基準画像ＲＩ内のノードと対応付けられるので、その対応付けに基づいて、三角測量に基づいて撮像装置と撮影対象との間の距離を計算し、この距離に基づいて三次元画像を構築することができる。また、対応付けに基づいて、撮像装置２１と投光装置２２との間の位置情報などを示す外部パラメータ（エピポーラ幾何に使用される）を計算することができる。 As described above, the GCN is used to identify the node in the captured image DI that corresponds to each node in the reference image RI. Because more nodes in the captured image DI are thereby associated with nodes in the reference image RI, the distance between the imaging device and the subject can be calculated by triangulation from these associations, and a three-dimensional image can be constructed from that distance. The associations can also be used to calculate external parameters (used in epipolar geometry) that indicate, for example, the positional relationship between the imaging device 21 and the light projector 22.
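Once a node in the captured image is matched to a node in the reference image, each match defines one ray from the camera and one ray from the projector. A minimal sketch of triangulating such a pair with the midpoint method is given below. The midpoint method, the 0.1-unit baseline, and the test point are choices made for this example and are not specified in the patent; calibrated intrinsic and external parameters are assumed to be already folded into the ray directions.

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t1*d1 and o2 + t2*d2."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = o2 - o1
    # Normal equations of min |(o1 + t1*d1) - (o2 + t2*d2)|^2 over t1, t2.
    M = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    t1, t2 = np.linalg.solve(M, np.array([d1 @ b, d2 @ b]))
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

cam_origin = np.zeros(3)                     # camera centre
proj_origin = np.array([0.1, 0.0, 0.0])      # projector centre (assumed baseline)
P = np.array([0.02, -0.01, 0.5])             # ground-truth surface point
X = triangulate_midpoint(cam_origin, P - cam_origin,
                         proj_origin, P - proj_origin)
distance = np.linalg.norm(X - cam_origin)    # camera-to-subject distance
```

With noise-free rays the reconstructed point coincides with the true point; with real correspondences the two rays rarely intersect, which is why the midpoint of the closest approach is used here.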

上述したように、撮影対象の奥行に応じて反射したパターン光の形状が歪むので、従来技術の画像処理では、画像内の全てのノードを元のノードと対応付けることは困難であった。本実施形態では、二次元平面において縦および／または横に断続的に延在する要素によって、パターンの形状が歪んでも、撮像画像ＤＩにおいて隣接ノードとの関係が維持されるので、その関係に基づいて、ＧＣＮにより元のパターン内のノードと対応付ける精度を高めることができる。例えば、図３に示した基準画像ＲＩを投影して撮影した撮像画像ＤＩでは、パターンの形状が歪んでも、２つのノードとの間で二次元平面における横方向での高さの関係が維持される。 As described above, the shape of the reflected pattern light is distorted according to the depth of the subject, so conventional image processing had difficulty associating every node in an image with its original node. In this embodiment, the pattern uses elements that extend intermittently in the vertical and/or horizontal direction of the two-dimensional plane, so the relationship between adjacent nodes is maintained in the captured image DI even when the pattern shape is distorted. Based on that relationship, the GCN can associate nodes with those of the original pattern more accurately. For example, in a captured image DI obtained by projecting the reference image RI shown in Figure 3, the height relationship between two horizontally adjacent nodes in the two-dimensional plane is maintained even when the pattern shape is distorted.

また、基準画像ＲＩが、第１の要素または第２の要素の少なくとも一方において、断続的に延在する要素を含めることによって構成されるので、ノードを対応付けるための情報を少なくすることができる。更に、基準画像ＲＩに配置されたマークＭ１乃至Ｍ９に基づいた分類に基づいて、ノードの隣接関係を判定するので、対応付けの精度を更に高めることができる。 In addition, because the reference image RI is constructed by including elements that extend intermittently in at least one of the first and second elements, it is possible to reduce the amount of information required to match nodes. Furthermore, because the adjacency relationship between nodes is determined based on classification based on marks M1 to M9 placed in the reference image RI, the accuracy of matching can be further improved.

上述した処理に加え、撮像画像ＤＩ内の画素と隣接するノードとの相対位置を判定し、基準画像ＲＩ内の画素と隣接するノードとの相対位置を判定し、双方の相対位置に基づいて、画素ごとの対応付けを行ってもよい。この処理は、撮像画像ＤＩおよび基準画像ＲＩの両方に対し、画素ごとに隣接するノードとの相対位置を認識するので、処理負荷は高くなるが、対応付けの精度を更に高めることができる。このような画素ごとの対応付けも、ＧＣＮによって学習される。 In addition to the above processing, it is also possible to determine the relative position between pixels in the captured image DI and adjacent nodes, and the relative position between pixels in the reference image RI and adjacent nodes, and perform pixel-by-pixel correspondence based on the relative positions of both. This processing recognizes the relative position between adjacent nodes for each pixel in both the captured image DI and the reference image RI, which increases the processing load but can further improve the accuracy of correspondence. Such pixel-by-pixel correspondences are also learned by the GCN.

上述したＧＣＮからの出力およびノードの隣接ノードに対する相対位置に基づいたノード間の対応付けは、ＧＣＮもしくは他のニューラルネットワーク、またはニューラルネットワークを使用しない画像処理（例えば、空間フィルタリングおよび畳み込み演算など）によって行われてもよい。 The correspondence between nodes based on the output from the GCN and the relative position of the node to its neighbors described above may be performed by a GCN or other neural network, or by image processing that does not use a neural network (e.g., spatial filtering and convolution operations).

＜第２の実施形態＞ <Second Embodiment>
次に、第２の実施形態を説明する。第２の実施形態は、第１の実施形態と比較して、撮影画像ＤＩと共に、基準画像ＲＩもＵ－ＮｅｔおよびＧＣＮに入力し、双方のＧＣＮからの出力を比較する点で異なる。 Next, a second embodiment will be described. The second embodiment differs from the first embodiment in that the reference image RI is also input to the U-Net and the GCN together with the captured image DI, and the GCN outputs for the two images are compared.

図１０を参照して、第２の実施形態に従った、画像分析システム１００が実行する処理の例を説明する。図１０に示すステップＳ１００１乃至Ｓ１００５は、図５に示したステップＳ５０１乃至Ｓ５０５と同様であるので、説明を省略する。なお、ステップＳ１００５からは、撮影画像ＤＩから、特徴ベクトルＦ_dおよび格子グラフＧ_dが出力される。 An example of processing executed by the image analysis system 100 according to the second embodiment will be described with reference to Figure 10. Steps S1001 to S1005 shown in Figure 10 are the same as steps S501 to S505 shown in Figure 5, so their description is omitted. Note that step S1005 outputs a feature vector F_d and a grid graph G_d derived from the captured image DI.

ステップＳ１００６では、制御装置１１は、記憶装置１３に記憶された基準画像ＲＩがＵ－Ｎｅｔに入力され、基準画像ＲＩ内の第１の要素を構成するライン（列）および第２の要素を構成するライン（行）を識別する。列および行を識別する方式は、図５に示したステップＳ５０３について説明した方式と同様である。 In step S1006, the control device 11 inputs the reference image RI stored in the storage device 13 to the U-Net and identifies the lines (columns) that make up the first element and the lines (rows) that make up the second element in the reference image RI. The method for identifying columns and rows is the same as the method described for step S503 shown in Figure 5.

次に、制御装置１１は、ステップＳ１００６において識別した列と行との全ての交点をノードとして識別する（ステップＳ１００７）。ノードを識別する方式は、図５に示したステップＳ５０４について説明した方式と同様である。 Next, the control device 11 identifies all intersections between the columns and rows identified in step S1006 as nodes (step S1007). The method for identifying nodes is the same as the method described for step S504 in Figure 5.

次に、制御装置１１は、ノードごとに隣接するノードに対する相対位置に基づいて分類する（ステップＳ１００８）。ノードを分類する方式は、図５に示したステップＳ５０５について説明した方式と同様である。ステップＳ１００８からは、基準画像ＲＩから、特徴ベクトルＦ_pおよび格子グラフＧ_pが出力される。 Next, the control device 11 classifies each node based on its position relative to adjacent nodes (step S1008). The method for classifying the nodes is the same as that described for step S505 in Figure 5. Step S1008 outputs a feature vector F_p and a grid graph G_p derived from the reference image RI.

次に、制御装置１１は、ステップＳ１００５の出力（特徴ベクトルＦ_dおよび格子グラフＧ_d）をＧＣＮに入力し、撮影画像ＤＩから計算された特徴埋め込み行列Ｆ_dを出力する（ステップＳ１００９）。特徴埋め込み行列Ｆ_dを計算する方式は、図５に示したステップＳ５０６について説明した方式と同様である。 Next, the control device 11 inputs the output of step S1005 (the feature vector F_d and the lattice graph G_d) to the GCN and outputs the feature embedding matrix F_d calculated from the captured image DI (step S1009). The method for calculating the feature embedding matrix F_d is the same as that described for step S506 in Figure 5.

次に、制御装置１１は、ステップＳ１００８の出力（特徴ベクトルＦ_pおよび格子グラフＧ_p）をＧＣＮに入力し、基準画像ＲＩから計算された特徴埋め込み行列Ｆ_pを出力する（ステップＳ１０１０）。特徴埋め込み行列Ｆ_pを計算する方式は、図５に示したステップＳ５０６について説明した方式と同様である。 Next, the control device 11 inputs the output of step S1008 (the feature vector F_p and the lattice graph G_p) to the GCN and outputs the feature embedding matrix F_p calculated from the reference image RI (step S1010). The method for calculating the feature embedding matrix F_p is the same as that described for step S506 in Figure 5.

次に、制御装置１１は、特徴埋め込み行列Ｆ_dおよび特徴埋め込み行列Ｆ_pについて、ノードごとの特徴ベクトルＦ_dおよびＦ_pの内積を計算することによって、基準画像ＲＩと撮影画像ＤＩとの類似性を判定する（ステップＳ１０１１）。特徴ベクトルＦ_dおよびＦ_pの内積は、$\langle F_d, F_p \rangle = F_d^{\top} F_p$ によって表される。 Next, for the feature embedding matrix F_d and the feature embedding matrix F_p, the control device 11 determines the similarity between the reference image RI and the captured image DI by calculating the inner product of the per-node feature vectors F_d and F_p (step S1011). The inner product of the feature vectors F_d and F_p is written $\langle F_d, F_p \rangle = F_d^{\top} F_p$.

制御装置１１は、Ｓｏｆｔｍａｘ関数を使用して、内積のＳｏｆｔｍａｘ値を導出する。ＧＣＮは、格子グラフＧ_pおよびＧ_dに基づいて、上述した演算を実行して、２つの特徴ベクトルの内積と第１の要素識別子および第２の要素識別子との間の交差エントロピのコスト関数を使用することによって学習される。上述した学習から、内積のＳｏｆｔｍａｘ関数によってノードごとにＳｏｆｔｍａｘ値を導出し、ノードごとの対数尤度ベクトルを評価することによって、２つの画像の間で、ノードごとに対応付けることができる。 The control device 11 derives the Softmax value of the inner product using the Softmax function. The GCN is trained by performing the above-described operations on the lattice graphs G_p and G_d and using a cross-entropy cost function between the inner product of the two feature vectors and the first and second element identifiers. After this training, a Softmax value is derived for each node from the inner product, and evaluating the log-likelihood vector of each node makes it possible to associate nodes between the two images.
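The comparison of the two embeddings can be sketched as follows: every captured-image node embedding is scored against every reference-image node embedding by an inner product, a Softmax over the reference nodes turns the scores into log-likelihoods, and a cross-entropy cost measures agreement with the known match. For brevity this example uses a single node index as the training target instead of separate first and second element identifiers, and the embeddings are hand-made; both are simplifying assumptions, as are the function names.

```python
import numpy as np

def log_softmax_rows(S):
    """Row-wise log-softmax, shifted by the row maximum for stability."""
    S = S - S.max(axis=1, keepdims=True)
    return S - np.log(np.exp(S).sum(axis=1, keepdims=True))

def match_nodes(F_d, F_p):
    """Score captured-image nodes (rows of F_d) against reference nodes (rows of F_p)."""
    logp = log_softmax_rows(F_d @ F_p.T)      # inner products, shape (N_d, N_p)
    return logp.argmax(axis=1), logp

def cross_entropy(logp, target):
    """Mean negative log-likelihood of the true reference node."""
    return float(-logp[np.arange(len(target)), target].mean())

F_p = 5.0 * np.eye(4)                         # four reference-node embeddings
true_match = np.array([2, 0, 3, 1])           # reference node seen by each captured node
F_d = F_p[true_match] + 0.1                   # captured embeddings, slightly offset
pred, logp = match_nodes(F_d, F_p)
loss = cross_entropy(logp, true_match)
```

Because each captured embedding is closest to its own reference embedding, the predicted matches equal `true_match` and the loss is close to zero; during training the loss would be minimized with respect to the GCN weights.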

第２の実施形態に係る処理は、第１の実施形態に係る処理と基本的には同様であるが、基準画像に対してもＧＣＮを介して演算を行い、基準画像ＲＩおよび撮影画像ＤＩの双方の特徴ベクトルの内積を計算し、ノードごとの対数尤度ベクトルを評価する。図１１は、この処理を視覚的に表している。 The processing according to the second embodiment is basically the same as that according to the first embodiment, but also performs calculations on the reference image via GCN, calculates the inner product of the feature vectors of both the reference image RI and the captured image DI, and evaluates the log-likelihood vector for each node. Figure 11 visually represents this processing.

以上のように、第２の実施形態を説明した。第２の実施形態によっても、撮影画像内のノードと基準画像内のノードとの対応付けの精度を高めることができる。 The second embodiment has been described above. The second embodiment also makes it possible to improve the accuracy of matching nodes in a captured image with nodes in a reference image.

＜第３の実施形態＞ <Third Embodiment>
次に、第３の実施形態を説明する。第３の実施形態は、第１の実施形態および第２の実施形態と比較して、使用する基準画像ＲＩが異なる。 Next, a third embodiment will be described. The third embodiment differs from the first and second embodiments in the reference image RI that is used.

図１２を参照して、第３の実施形態に係る基準画像ＲＩを説明する。図１２に示すように、基準画像ＲＩは、二次元平面において縦方向に連続的に延在する長方形の要素の集合および二次元平面において縦方向に断続的に延在する任意の記号（本実施形態では、十字）の要素の集合を含む。 The reference image RI according to the third embodiment will be described with reference to Figure 12. As shown in Figure 12, the reference image RI includes a collection of rectangular elements extending continuously in the vertical direction on a two-dimensional plane and a collection of arbitrary symbol elements (in this embodiment, crosses) extending intermittently in the vertical direction on a two-dimensional plane.

図１２に示すように、第１の要素の集合ＳＥ１における第１の要素（図１２に示す網掛けの長方形）Ｅ１ａ乃至Ｅ１ｎ（ｎは任意の整数）は、基準画像ＲＩ内の二次元平面において縦方向に連続的に延在する。一方、第２の要素の集合ＳＥ２における第２の要素（図１２に示す十字記号）Ｅ２ａ乃至Ｅ２ｎ（ｎは任意の整数）は、基準画像ＲＩ内の二次元平面において縦方向に断続的に延在する。第２の要素Ｅ２ａ乃至Ｅ２ｎはそれぞれ、第１の要素Ｅ１ａ乃至Ｅ１ｎの各々に沿って配置される。図１２では、基準画像ＲＩの一部を囲む円Ｃ１に対応した円Ｃ２の内部に、円Ｃ１によって囲まれた基準画像ＲＩの一部の領域を拡大した状態を示す。 As shown in FIG. 12, the first elements (shaded rectangles shown in FIG. 12) E1a to E1n (n is any integer) in the set of first elements SE1 extend continuously vertically in a two-dimensional plane within the reference image RI. Meanwhile, the second elements (cross symbols shown in FIG. 12) E2a to E2n (n is any integer) in the set of second elements SE2 extend intermittently vertically in a two-dimensional plane within the reference image RI. The second elements E2a to E2n are each arranged along the first elements E1a to E1n. FIG. 12 shows an enlarged view of a portion of the reference image RI enclosed by circle C1, within circle C2 corresponding to circle C1 enclosing a portion of the reference image RI.

円Ｃ２内の領域では、第１の要素Ｅ１ａ乃至Ｅ１ｃはそれぞれ、縦方向に連続的に延在する。第２の要素Ｅ２ａ乃至Ｅ２ｎは、縦方向にランダムな間隔に配置される。例えば、第２の要素Ｅ２ｂと第２の要素Ｅ２ｅとの距離は、第２の要素Ｅ２ｅと第２の要素Ｅ２ｈとの距離とは異なる。同様に、第２の要素Ｅ２ｃと第２の要素Ｅ２ｆとの距離は、第２の要素Ｅ２ｆと第２の要素Ｅ２ｉとの距離とは異なる。 In the area within circle C2, first elements E1a to E1c each extend continuously in the vertical direction. Second elements E2a to E2n are arranged at random intervals in the vertical direction. For example, the distance between second element E2b and second element E2e is different from the distance between second element E2e and second element E2h. Similarly, the distance between second element E2c and second element E2f is different from the distance between second element E2f and second element E2i.

第２の要素Ｅ２ａ乃至Ｅ２ｎが二次元平面において縦方向にランダムな間隔に配置されるので、１つ目の第２の要素が二次元平面において隣接する第２の要素に対して位置する角度は、２つ目の第２の要素が二次元平面において隣接する第２の要素に対して位置する角度とは異なる。例えば、円Ｃ２内の領域では、第２の要素Ｅ２ｂが隣接する第２の要素Ｅ２ｃに対して位置する角度は、第２の要素Ｅ２ｃが隣接する第２の要素Ｅ２ｄに対して位置する角度とは異なる。同様に、第２の要素Ｅ２ｂが隣接する第２の要素Ｅ２ｃに対して位置する角度は、第２の要素Ｅ２ｈが隣接する第２の要素Ｅ２ｉに対して位置する角度とは異なる。 Because second elements E2a to E2n are arranged at random intervals vertically in a two-dimensional plane, the angle at which a first second element is positioned relative to an adjacent second element in the two-dimensional plane is different from the angle at which a second second element is positioned relative to an adjacent second element in the two-dimensional plane. For example, in the area within circle C2, the angle at which second element E2b is positioned relative to adjacent second element E2c is different from the angle at which second element E2c is positioned relative to adjacent second element E2d. Similarly, the angle at which second element E2b is positioned relative to adjacent second element E2c is different from the angle at which second element E2h is positioned relative to adjacent second element E2i.
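A small worked example may help show why random vertical spacing produces distinguishable angles. The coordinates below are invented for illustration (they are not taken from Figure 12), and `relative_angle_deg` is a hypothetical helper that measures the direction from one second element to its horizontal neighbour.

```python
import math

def relative_angle_deg(p, q):
    """Angle of the vector from p to q, in degrees, measured from the x axis."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

# Made-up (x, y) positions of three second elements on adjacent columns.
e2b, e2c, e2d = (0.0, 0.0), (10.0, 3.0), (20.0, -2.0)
angle_bc = relative_angle_deg(e2b, e2c)   # about 16.7 degrees
angle_cd = relative_angle_deg(e2c, e2d)   # about -26.6 degrees
```

Since the vertical offsets differ from column to column, the two angles differ, which is the property the third embodiment relies on when classifying nodes by relative angle.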

なお、本実施形態では、二次元平面において縦方向に連続的に延在する長方形の要素の集合および二次元平面において縦方向に断続的に延在する記号の要素の集合を含む基準画像を採用しているが、基準画像は、そのような構成に限定されない。例えば、図示しないが、基準画像は、二次元平面において横方向に連続的に延在する長方形の要素の集合および二次元平面において横方向に断続的に延在する記号の要素の集合を含んでもよい。この場合、記号の要素の集合は、横方向にランダムな間隔に配置される。 Note that, in this embodiment, a reference image is employed that includes a set of rectangular elements that extend continuously vertically in a two-dimensional plane and a set of symbol elements that extend intermittently vertically in the two-dimensional plane; however, the reference image is not limited to such a configuration. For example, although not shown, the reference image may include a set of rectangular elements that extend continuously horizontally in a two-dimensional plane and a set of symbol elements that extend intermittently horizontally in the two-dimensional plane. In this case, the set of symbol elements is arranged at random intervals in the horizontal direction.

なお、図示しないが、図１２に示した基準画像ＲＩにおいても、図３に示した基準画像ＲＩと同様に、ｎ個のマーカが配置されてもよい。 Although not shown, n markers may also be arranged in the reference image RI shown in Figure 12, similar to the reference image RI shown in Figure 3.

第３の実施形態で使用する基準画像ＲＩついても、対応するパターン光ＰＬが撮影対象に投影され、撮影画像ＤＩが生成される。第３の実施形態に従って撮影画像ＤＩを処理する方法は、第１の実施形態および第２の実施形態で説明したいずれかの方式と同様であるが、画像内の列のみを識別すること、および隣接するノードの間の相対角度に基づいてノードを分類する点で、第１の実施形態および第２の実施形態に係る処理とは異なる。 For the reference image RI used in the third embodiment, a corresponding pattern light PL is projected onto the subject to be photographed, generating a photographed image DI. The method for processing the photographed image DI according to the third embodiment is similar to either of the methods described in the first and second embodiments, but differs from the processing according to the first and second embodiments in that only columns within the image are identified and nodes are classified based on the relative angles between adjacent nodes.

上述したように、第３の実施形態で使用する基準画像ＲＩは、第１の要素および第２の要素のいずれもが二次元平面において縦方向に延在するので、撮影画像ＤＩでは、画像内の列のみが識別される。列は、例えば、第１の要素の端または第２の要素（記号）をなぞり、二次元平面において縦方向に線形になるラインを描くことによって識別される。これらの処理は、図６（ｂ）および図６（ｃ）について説明した方式と同様である。 As described above, in the reference image RI used in the third embodiment, both the first and second elements extend vertically in a two-dimensional plane, so in the captured image DI, only columns within the image are identified. Columns are identified, for example, by tracing the edge of the first element or the second element (symbol) and drawing a linear line that extends vertically in the two-dimensional plane. This processing is similar to the method described with reference to Figures 6(b) and 6(c).

また、ノードの識別については、例えば、図１２において符号ＮＲ１が付された矩形領域に示されるように、第１の要素と第２の要素との間の一定の領域がノードとして識別される。つまり、後述する分類されたラベル（隣接するノードに対する相対角度（特徴ベクトル））に基づいて、一定の領域に分割される。図１２におけるノード領域ＮＲ１は、第１の要素Ｅ１ｂ上の、第２の要素Ｅ２ｆに隣接する一定の領域として分割され、領域ＮＲ１がノードとして識別される。 When identifying nodes, for example, as shown in the rectangular area labeled NR1 in Figure 12, a certain area between a first element and a second element is identified as a node. In other words, it is divided into certain areas based on the classified labels (relative angles (feature vectors) to adjacent nodes) described below. The node area NR1 in Figure 12 is divided as a certain area on the first element E1b adjacent to the second element E2f, and area NR1 is identified as a node.

同様に、ノード領域ＮＲ２は、第１の要素Ｅ１ｂ上の、第２の要素Ｅ２ｉに隣接する一定の領域として分割され、領域ＮＲ２がノードとして識別される。ノード領域ＮＲ３は、第１の要素Ｅ１ｃ上の、第２の要素Ｅ２ｇに隣接する一定の領域として分割され、領域ＮＲ３がノードとして識別される。分割された領域は、各々がノードを含む複数の格子を構成する。このようなノード領域も、上述したＵ－Ｎｅｔが学習することによって識別されてもよい。図１２に示すノード領域は例示にすぎず、予め定められたルールに従って第１の要素と第２の要素との間の一定の領域がノードとして識別されてもよい。 Similarly, node region NR2 is divided as a fixed region on the first element E1b adjacent to the second element E2i, and region NR2 is identified as a node. Node region NR3 is divided as a fixed region on the first element E1c adjacent to the second element E2g, and region NR3 is identified as a node. The divided regions form multiple lattices, each containing a node. Such node regions may also be identified by the U-Net learning described above. The node regions shown in Figure 12 are merely examples, and a fixed region between the first element and the second element may be identified as a node according to predetermined rules.

第３の実施形態で使用する基準画像ＲＩに対応するパターン光ＰＬが投影された撮影対象から生成された撮影画像ＤＩも、上述したように第１の要素と第２の要素との間の一定の領域がノードとして識別される。第３の実施形態では、ノードを識別する際に、行を識別する必要がないので、Ｕ－Ｎｅｔなどによる演算処理を簡易化することができる。 In the captured image DI generated from the subject onto which pattern light PL corresponding to the reference image RI used in the third embodiment is projected, a certain area between the first element and the second element is identified as a node, as described above. In the third embodiment, there is no need to identify rows when identifying nodes, which simplifies calculation processing using U-Net, etc.

第３の実施形態では、第１の実施形態における処理のように、列と行との交点によりノードを識別しない。図１２のパターンを投影して撮影を行いつつ、撮影画像から、画素ごとに対応関係を計算することによって計測精度を高めることができる。このために、Ｕ－Ｎｅｔにより撮影画像から格子の位相、つまり格子を基準とした相対位置を各画素で抽出するように学習してもよい。図１３に、撮影画像から格子の位相（繰り返す格子と各画素の相対位置関係を、０以上２π未満の回転角度として表現したもの）を推定した例を示す。 Unlike the processing in the first embodiment, the third embodiment does not identify nodes as intersections of columns and rows. Instead, measurement accuracy can be improved by capturing an image while projecting the pattern of Figure 12 and calculating the correspondence for each pixel of the captured image. To this end, the U-Net may be trained to extract, at each pixel of the captured image, the phase of the grid, that is, the position relative to the grid. Figure 13 shows an example in which the grid phase (the position of each pixel within the repeating grid, expressed as a rotation angle of at least 0 and less than 2π) is estimated from a captured image.

図１３は、撮影画像の格子情報について、格子と各画素の相対位置関係を回転角度として表現する例を示す。この回転角度は格子の繰り返しと連動しており、一つの格子ごとに１回転する。つまり０から２πまで上昇し、その後０に戻る。図１３（ａ）は、図１２のパターンを投影した画像である。図１３（ｂ）は、上述したように、Ｕ－Ｎｅｔで格子の区切りの位置で回転角度が０になるような余弦信号を推定した画像である。図１３（ｃ）は、図１３（ａ）に示した撮影画像から、Ｕ－Ｎｅｔで、格子の区切りの位置で回転角度が４π／５になるような余弦信号を推定した画像を示す。図１３（ｄ）は、図１３（ｂ）および図１３（ｃ）を含む位相推定結果から、格子の位相情報を画素ごとに計算した結果を示す。図１３（ｅ）は、格子の縦方向の位相情報を画素ごとに計算した結果を示す。 Figure 13 shows an example in which the grid information of a captured image is expressed as a rotation angle representing the position of each pixel relative to the grid. The rotation angle follows the repetition of the grid and completes one rotation per grid cell; that is, it rises from 0 to 2π and then returns to 0. Figure 13(a) is a captured image onto which the pattern of Figure 12 is projected. Figure 13(b) is an image in which, as described above, the U-Net estimates a cosine signal whose rotation angle is 0 at the grid boundaries. Figure 13(c) is an image in which the U-Net estimates, from the captured image shown in Figure 13(a), a cosine signal whose rotation angle is 4π/5 at the grid boundaries. Figure 13(d) shows the grid phase information calculated for each pixel from the phase estimation results, including those of Figures 13(b) and 13(c). Figure 13(e) shows the vertical grid phase information calculated for each pixel.

図１３に示した撮影画像の位相情報（回転角度）の推定は例示にすぎず、Ｕ－Ｎｅｔによる格子の位相情報の検出を、余弦信号の推定を経ずに直接検出することも可能である。ただし、Ｕ－Ｎｅｔによって位相情報を直接検出するよりも、余弦信号の検出を学習するほうが、位相情報の推定精度が高くなる。また、ガボールフィルタ等を利用することもできる。上述したように、格子の位相情報が、Ｕ－Ｎｅｔ以外のニューラルネットワークまたは他の画像処理によって識別されてもよい。 The estimation of phase information (rotation angle) of the captured image shown in Figure 13 is merely an example, and it is also possible to detect the phase information of the lattice directly using U-Net without estimating the cosine signal. However, learning to detect the cosine signal will result in higher accuracy in estimating the phase information than directly detecting the phase information using U-Net. Gabor filters, etc. can also be used. As mentioned above, the phase information of the lattice may be identified using a neural network other than U-Net or other image processing.
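To make the step from cosine signals to a phase value concrete, the sketch below recovers the rotation angle from two cosine signals whose arguments are offset by 0 and 4π/5 at the grid boundary. It assumes the signals have the form cos(θ) and cos(θ + 4π/5) and that only these two are used; the patent mentions phase estimation results "including" the two signals, so the real procedure may combine more. The function name `phase_from_cosines` is hypothetical.

```python
import numpy as np

def phase_from_cosines(c0, c1, delta=4 * np.pi / 5):
    """Recover theta in [0, 2*pi) from c0 = cos(theta) and c1 = cos(theta + delta).

    cos(theta + delta) = cos(theta)cos(delta) - sin(theta)sin(delta), so
    sin(theta) = (c0*cos(delta) - c1) / sin(delta).
    """
    sin_theta = (c0 * np.cos(delta) - c1) / np.sin(delta)
    return np.mod(np.arctan2(sin_theta, c0), 2 * np.pi)

theta = 2 * np.pi * np.arange(50) / 50          # ground-truth phases over one cell
c0 = np.cos(theta)                              # signal with angle 0 at the boundary
c1 = np.cos(theta + 4 * np.pi / 5)              # signal with angle 4*pi/5 at the boundary
theta_est = phase_from_cosines(c0, c1)
```

A single cosine cannot distinguish θ from 2π - θ; the second, shifted signal supplies the sine component and removes that ambiguity, which is one plausible reason for estimating more than one cosine image.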

本実施形態では、グラフのノードは、図１３に示した画素ごとの位相情報を、位相の０度の部分を領域の区切りとして領域分割する（つまり、画素ごとの相対位置および撮影画像上の格子情報に基づいて撮影画像を複数の領域に分割する）ことによって識別される。また、ノードの隣接関係を、領域の隣接関係から抽出する。ノードが識別されると、制御装置１１は、Ｕ－Ｎｅｔなどを使用して、各々のノードに対し、ノード周辺の画像特徴から、ノードを分類し、ラベルを付加する（分類値を付与する）。ノードが識別および分類されると、制御装置１１は、隣接するノードとの隣接関係に基づいて、ノード同士を接続するエッジを付与する。 In this embodiment, graph nodes are identified by dividing the phase information for each pixel shown in Figure 13 into regions, using 0-degree phase sections as region separators (i.e., dividing the captured image into multiple regions based on the relative position of each pixel and the grid information on the captured image). The node adjacency relationships are also extracted from the region adjacency relationships. Once the nodes are identified, the control device 11 uses a U-Net or similar to classify each node based on the image features around the node and assigns a label (assigns a classification value). Once the nodes are identified and classified, the control device 11 assigns edges connecting the nodes based on the adjacency relationships with adjacent nodes.
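
The region division and adjacency extraction described above can be sketched as a flood fill over the wrapped phase maps. The sketch assumes two per-pixel phase maps (horizontal and vertical) stored as nested lists and treats a jump of more than π between neighbouring pixels as a region separator; the function names and the jump criterion are illustrative assumptions, not the patented procedure.

```python
import math
from collections import deque

def crosses_wrap(a, b):
    """True if the wrapped phase jumps by more than pi between two neighbours."""
    return abs(a - b) > math.pi

def segment_regions(phase_x, phase_y):
    """Label 4-connected regions bounded by phase wraps.

    Returns (labels, edges): labels[y][x] is the region (node) id of a pixel,
    and edges is the set of region pairs that touch across a wrap.
    """
    h, w = len(phase_x), len(phase_x[0])
    labels = [[-1] * w for _ in range(h)]
    boundaries = []  # pixel pairs separated by a phase wrap
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y, x + 1), (y, x - 1), (y + 1, x), (y - 1, x)):
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    # Horizontal steps test the horizontal phase, vertical steps the vertical phase.
                    phase = phase_x if ny == y else phase_y
                    if crosses_wrap(phase[y][x], phase[ny][nx]):
                        boundaries.append(((y, x), (ny, nx)))
                    elif labels[ny][nx] == -1:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    # Region adjacency follows from pixel pairs on opposite sides of a wrap.
    edges = set()
    for (y, x), (ny, nx) in boundaries:
        a, b = labels[y][x], labels[ny][nx]
        if a != b:
            edges.add((min(a, b), max(a, b)))
    return labels, edges

# Synthetic 6x6 phase maps with a grid period of 3 pixels in both directions.
H, W, PERIOD = 6, 6, 3
phase_x = [[2 * math.pi * (x % PERIOD) / PERIOD for x in range(W)] for y in range(H)]
phase_y = [[2 * math.pi * (y % PERIOD) / PERIOD for x in range(W)] for y in range(H)]
labels, edges = segment_regions(phase_x, phase_y)
```

On this synthetic input the four regions connect only to their horizontal and vertical neighbours, which corresponds to a grid graph.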

なお、ノードを領域分割によって抽出する際、位相情報以外の情報に基づいてもよい。例えば、Ｕ－Ｎｅｔによって格子から一定の範囲内をノードとして認識し、そのノードを中心としてドロネー分割などの技術を使用して領域分割を行ってもよい。さらに、位相情報に基づく領域分割と、ノードを中心とした領域分割を組み合わせてもよい。また、ノード同士の接続は、隣接関係以外に近接関係に基づいてもよい。その場合、出力されるグラフは格子グラフではなく一般のグラフとなる。図１４に近接関係による接続の例を示す。 Note that when nodes are extracted through region division, information other than phase information may be used. For example, U-Net may recognize a certain range from the grid as a node, and region division may then be performed around that node using a technique such as Delaunay triangulation. Region division based on phase information may also be combined with region division centered on nodes. Furthermore, connections between nodes may be based on proximity relationships instead of adjacency relationships. In that case, the output graph is a general graph rather than a grid graph. Figure 14 shows an example of connections based on proximity relationships.
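
To illustrate how proximity-based connection yields a general graph rather than a grid graph, the following sketch connects nodes whose centroids lie within a fixed distance of each other. The distance threshold, the centroid representation, and the name `proximity_edges` are assumptions made for illustration; the description does not state how proximity is measured.

```python
import math
from itertools import combinations

def proximity_edges(centroids, radius):
    """Connect every pair of nodes whose centroids lie within `radius` pixels."""
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(centroids), 2):
        if math.dist(p, q) <= radius:
            edges.add((i, j))
    return edges

# Four node centroids on a 10-pixel square (hypothetical coordinates).
centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

# A tight radius reproduces grid-like connections only.
grid_like = proximity_edges(centroids, radius=10.5)

# A larger radius also links the diagonals, giving a general graph.
general = proximity_edges(centroids, radius=15.0)
```

With the larger radius the diagonal pairs are connected as well, so the result is no longer a grid graph.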

本実施形態では、各ノードについて、そのノードに付随する２つの点の相対位置に基づいて、２つの点が二次元平面において水平または略水平にあること、２つの点のうち左に位置する点が右に位置する点に対し二次元平面において正の角度の位置にあり、その角度が予め定められた角度を上回ること、２つの点のうち左に位置する点が右に位置する点に対し二次元平面において正の角度の位置にあり、その角度が予め定められた角度以下であること、２つの点のうち左に位置する点が右に位置する点に対し二次元平面において負の角度の位置にあり、その角度が予め定められた角度以下であること、２つの点のうち左に位置する点が右に位置する点に対し二次元平面において負の角度の位置にあり、その角度が予め定められた角度を上回ること、の５個のクラスに分類される。 In this embodiment, each node is classified into one of five classes based on the relative position of the two points associated with that node: (1) the two points are horizontal or nearly horizontal in the two-dimensional plane; (2) the left-hand point is at a positive angle relative to the right-hand point in the two-dimensional plane, and that angle exceeds a predetermined angle; (3) the left-hand point is at a positive angle relative to the right-hand point, and that angle is equal to or less than the predetermined angle; (4) the left-hand point is at a negative angle relative to the right-hand point, and that angle is equal to or less than the predetermined angle; and (5) the left-hand point is at a negative angle relative to the right-hand point, and that angle exceeds the predetermined angle.

正の角度とは、隣接する２つの点のうち左に位置する点が右に位置する点よりも二次元平面において低い位置にある角度を意味する。負の角度とは、隣接する２つの点のうち左に位置する点が右に位置する点よりも二次元平面において高い位置にある角度を意味する。 A positive angle means that of two adjacent points, the point located to the left is lower on a two-dimensional plane than the point located to the right. A negative angle means that of two adjacent points, the point located to the left is higher on a two-dimensional plane than the point located to the right.

なお、実際には、あるノードについて、付随する２点の相対位置を認識することができないことがあるので、このようなノードをｕｎｋｎｏｗｎクラスとして分類してもよい。この場合、ノードは、６個のクラスに分類される。本実施形態では、ノードを６個のクラスに分類する。図１５は、ラベル付けされたノードを示す。 In practice, it may be impossible to recognize the relative positions of two points associated with a certain node, so such nodes may be classified as unknown. In this case, nodes are classified into six classes. In this embodiment, nodes are classified into six classes. Figure 15 shows labeled nodes.

２つの点のうち左に位置する点が右に位置する点に対し二次元平面において正の角度の位置にあり、その角度が予め定められた角度を上回ることとは、例えば、左に位置する点が右に位置する点よりも二次元平面において低い位置にあり、２つの点の二次元平面における高さの差が、予め定められた閾値を上回ることに等しい。このように分類される点は、緑（Ｇ）ラベルが付加され、図１５では、黒塗りの円がＧラベルを表す。図１５に示した例では、点Ｎ１が点Ｎ６よりも二次元平面において低い位置にあり、点Ｎ１の点Ｎ６に対する角度が閾値を上回ると仮定して、Ｇラベルが付加される。 When the left-hand point of two points is at a positive angle on a two-dimensional plane relative to the right-hand point, and that angle exceeds a predetermined angle, this is equivalent to, for example, the point on the left being lower on a two-dimensional plane than the point on the right, and the difference in height between the two points on the two-dimensional plane exceeding a predetermined threshold. Points classified in this way are labeled green (G), and in Figure 15, a black circle represents the G label. In the example shown in Figure 15, point N1 is lower on a two-dimensional plane than point N6, and the G label is added assuming that the angle of point N1 relative to point N6 exceeds the threshold.

２つの点のうち左に位置する点が右に位置する点に対し二次元平面において正の角度の位置にあり、その角度が予め定められた角度以下であることとは、例えば、左に位置する点が右に位置する点よりも二次元平面において低い位置にあり、２つの点の二次元平面における高さの差が、予め定められた閾値以下であることに等しい。このように分類される点は、青（Ｂ）ラベルが付加され、図１５では、網掛けの円がＢラベルを表す。図１５に示した例では、点Ｎ２が点Ｎ７よりも二次元平面において低い位置にあり、点Ｎ２の点Ｎ７に対する角度が閾値以下であると仮定して、Ｂラベルが付加される。 When the left-hand point of two points is at a positive angle on a two-dimensional plane relative to the right-hand point, and that angle is less than or equal to a predetermined angle, this is equivalent to, for example, the left-hand point being lower on a two-dimensional plane than the right-hand point, and the difference in height between the two points on the two-dimensional plane being less than or equal to a predetermined threshold. Points classified in this way are labeled blue (B), and in Figure 15, the shaded circle represents the B label. In the example shown in Figure 15, point N2 is lower on a two-dimensional plane than point N7, and the B label is assigned based on the assumption that the angle between point N2 and point N7 is less than or equal to the threshold.

２つの点が二次元平面において水平または略水平にあることとは、例えば、２つの点の二次元平面における高さの差が、予め定められた閾値範囲にあることに等しい。このように分類される点は、黄（Ｙ）ラベルが付加され、図１５では、網掛け（Ｂラベルよりも明るい）の円がＹラベルを表す。図１５に示した例では、点Ｎ３が点Ｎ８と水平または略水平の位置にあるので、Ｙラベルが付加される。 Two points being horizontal or nearly horizontal on a two-dimensional plane means, for example, that the difference in height between the two points on the two-dimensional plane is within a predetermined threshold range. Points classified in this way are labeled yellow (Y), and in Figure 15, a shaded circle (lighter than the B label) represents the Y label. In the example shown in Figure 15, point N3 is located horizontally or nearly horizontally with point N8, so the Y label is assigned to it.

２つの点のうち左に位置する点が右に位置する点に対し二次元平面において負の角度の位置にあり、その角度が予め定められた角度以下であることとは、例えば、左に位置する点が右に位置する点よりも二次元平面において高い位置にあり、２つの点の二次元平面における高さの差が、予め定められた閾値以下であることに等しい。このように分類される点は、紫（Ｐ）ラベルが付加され、図１５では、網掛け（Ｙラベルよりも明るい）の円がＰラベルを表す。図１５に示した例では、点Ｎ４が点Ｎ９よりも二次元平面において高い位置にあり、点Ｎ４の点Ｎ９に対する角度が閾値以下であると仮定して、Ｐラベルが付加される。 When the left-hand point of two points is at a negative angle on a two-dimensional plane relative to the right-hand point, and that angle is less than or equal to a predetermined angle, this is equivalent to, for example, the left-hand point being higher on a two-dimensional plane than the right-hand point, and the difference in height between the two points on the two-dimensional plane being less than or equal to a predetermined threshold. Points classified in this way are labeled purple (P), and in Figure 15, a shaded circle (lighter than the Y label) represents the P label. In the example shown in Figure 15, point N4 is higher on a two-dimensional plane than point N9, and the P label is assigned assuming that the angle of point N4 relative to point N9 is less than or equal to the threshold.

２つの点のうち左に位置する点が右に位置する点に対し二次元平面において負の角度の位置にあり、その角度が予め定められた角度を上回ることとは、例えば、左に位置する点が右に位置する点よりも二次元平面において高い位置にあり、２つの点の二次元平面における高さの差が、予め定められた閾値を上回ることに等しい。このように分類される点は、赤（Ｒ）ラベルが付加され、図１５では、白抜きの円がＲラベルを表す。図１５に示した例では、点Ｎ５が点Ｎ１０よりも二次元平面において高い位置にあり、点Ｎ５の点Ｎ１０に対する角度が閾値を上回ると仮定して、Ｒラベルが付加される。 When the left-hand point of two points is at a negative angle on a two-dimensional plane relative to the right-hand point, and that angle exceeds a predetermined angle, this is equivalent to, for example, the point on the left being higher on a two-dimensional plane than the point on the right, and the difference in height between the two points on the two-dimensional plane exceeding a predetermined threshold. Points classified in this way are labeled red (R), and in Figure 15, a white circle represents the R label. In the example shown in Figure 15, point N5 is higher on a two-dimensional plane than point N10, and the R label is added assuming that the angle of point N5 relative to point N10 exceeds the threshold.
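
The labelling rules for the Y, G, B, P, and R classes, together with the optional unknown class, can be summarised in a short sketch. The coordinate convention (height increasing upward), the two angle thresholds `flat_deg` and `steep_deg`, and the function name are assumptions chosen for illustration; the description only states that the thresholds are predetermined.

```python
import math

def classify_node(left, right, flat_deg=2.0, steep_deg=10.0):
    """Return 'Y', 'G', 'B', 'P', 'R', or 'unknown' for one node.

    left and right are (x, height) tuples for the two points associated with
    the node, or None when a point could not be detected. Height increases
    upward, so a left point lower than the right point gives a positive angle.
    """
    if left is None or right is None:
        return "unknown"
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    if dx <= 0:
        return "unknown"  # the points are not ordered left to right
    angle = math.degrees(math.atan2(dy, dx))
    if abs(angle) <= flat_deg:
        return "Y"  # horizontal or nearly horizontal
    if angle > 0:
        return "G" if angle > steep_deg else "B"  # left point lower
    return "R" if -angle > steep_deg else "P"  # left point higher

examples = {
    "flat": classify_node((0, 0), (10, 0)),
    "steep_up": classify_node((0, 0), (10, 5)),
    "gentle_up": classify_node((0, 0), (10, 1)),
    "gentle_down": classify_node((0, 1), (10, 0)),
    "steep_down": classify_node((0, 5), (10, 0)),
    "missing": classify_node(None, (10, 0)),
}
```

Using an angle rather than a raw height difference keeps the rule independent of the horizontal spacing between the two points; a height-difference threshold, as the text also mentions, would be an equivalent alternative when the spacing is fixed.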

第３の実施形態では、実施形態１と同様にノードごとに対応を求めた後、ノードごとの対応情報と、Ｕ－Ｎｅｔで求めた位相情報を組み合わせて、画素ごとの対応情報を求める。具体的には、画素に近接するノードごとの対応情報を整数値で、位相情報を０以上１以下の小数値とし、足し合わせることで、画素ごとに、対応情報を実数精度で求めることができる。 In the third embodiment, after the correspondence is determined for each node as in the first embodiment, the per-node correspondence information is combined with the phase information obtained by U-Net to determine correspondence information for each pixel. Specifically, the correspondence information of the node proximate to a pixel is treated as an integer value and the phase information as a fractional value between 0 and 1; adding the two yields correspondence information for each pixel with real-number precision.
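
As a worked illustration of this addition, the sketch below normalises a wrapped phase to a fraction of one grid period and adds it to the integer node correspondence. The function name and the assumption that the phase is supplied in radians are illustrative only.

```python
import math

def pixel_coordinate(node_index, phase):
    """Combine an integer node correspondence with a wrapped phase in radians.

    The phase is normalised to a fraction in [0, 1) of one grid period and
    added to the node index, giving a real-valued pattern coordinate.
    """
    fraction = (phase % (2 * math.pi)) / (2 * math.pi)
    return node_index + fraction

# A pixel a quarter of the way through the cell matched to grid line 7.
coordinate = pixel_coordinate(7, math.pi / 2)
```

In this example the pixel lies a quarter of a period past grid line 7, so its pattern coordinate is 7.25.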

本実施形態では、横方向において隣接する点の間の相対角度に基づいてノードを５個のクラスに分類する例を示したが、分類するクラスの数は５に限定されない。例えば、基準画像ＲＩにおいて、第１の要素および第２の要素のいずれもが、断続的に延在する要素を含む場合、縦方向および横方向に隣接する点の間の相対角度に基づいて更になるクラスに分類されてもよい。なお、ラベルを上述した色で表すことは例示にすぎず、相対位置を示す任意の記号などがラベルとして付加されてもよい。 In this embodiment, an example has been shown in which nodes are classified into five classes based on the relative angle between horizontally adjacent points, but the number of classes is not limited to five. For example, if both the first element and the second element in the reference image RI include elements that extend intermittently, they may be classified into further classes based on the relative angles between vertically and horizontally adjacent points. Note that representing labels with the above-mentioned colors is merely an example, and any symbol indicating relative position may be added as a label.

第３の実施形態で示した基準画像ＲＩについても、対応するパターン光ＰＬが撮影対象に投影され、パターンの形状が歪んでも、撮像画像ＤＩにおいて隣接ノードとの関係が維持される。よって、その関係に基づいて、ＧＣＮにより元のパターン内のノードと対応付ける精度を高めることができる。 For the reference image RI shown in the third embodiment, even if the corresponding pattern light PL is projected onto the subject and the shape of the pattern is distorted, the relationship with adjacent nodes is maintained in the captured image DI. Therefore, based on this relationship, the accuracy of matching with nodes in the original pattern using GCN can be improved.
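
The idea that preserved neighbour relationships help matching can be illustrated with a toy example. The sketch does not implement a trained GCN: a single hand-written neighbour-averaging step stands in for the network, and nodes are then matched by the largest inner product of their feature vectors, which is the evaluation named in claim 10. All names and the three-node graph are hypothetical.

```python
LABELS = ["Y", "G", "B", "P", "R"]

def one_hot(label):
    """Encode a class label as a one-hot list over LABELS."""
    return [1.0 if label == l else 0.0 for l in LABELS]

def node_features(labels, adjacency):
    """Concatenate each node's one-hot label with the mean of its neighbours' one-hots."""
    feats = {}
    for node, label in labels.items():
        nbrs = adjacency.get(node, [])
        mean = [0.0] * len(LABELS)
        for n in nbrs:
            for k, v in enumerate(one_hot(labels[n])):
                mean[k] += v / len(nbrs)
        feats[node] = one_hot(label) + mean
    return feats

def match(captured, reference):
    """For each captured node, pick the reference node with the largest inner product."""
    result = {}
    for c, fc in captured.items():
        result[c] = max(
            reference,
            key=lambda r: sum(a * b for a, b in zip(fc, reference[r])),
        )
    return result

# Reference pattern: a chain a - b - c labelled G, B, R.
ref_feats = node_features(
    {"a": "G", "b": "B", "c": "R"},
    {"a": ["b"], "b": ["a", "c"], "c": ["b"]},
)
# Captured graph: same labels and adjacency, but with different node ids.
cap_feats = node_features(
    {0: "G", 1: "B", 2: "R"},
    {0: [1], 1: [0, 2], 2: [1]},
)
matches = match(cap_feats, ref_feats)
```

Because the features depend only on labels and adjacency, they are unchanged when the projected pattern is geometrically distorted, which is the property the paragraph above relies on.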

なお、第の実施形態においても、ＧＣＮからの出力およびノードに付随する点同士の相対角度（相対角度に基づいて付与された分類値）に基づいたノード間の対応付けが行われてもよい（ＧＣＮもしくは他のニューラルネットワーク、またはニューラルネットワークを使用しない画像処理（例えば、空間フィルタリングおよび畳み込み演算など）によって）。 In the third embodiment, correspondence between nodes may also be performed based on the output from the GCN and the relative angles between points associated with the nodes (classification values assigned based on the relative angles) (by using a GCN or other neural network, or by image processing that does not use a neural network (e.g., spatial filtering and convolution operations)).

なお、図３、図４、および図１２に示した基準画像ＲＩに代えて、パターンを正方形の格子を含むパターンによって構成してもよい。このようなパターン自体は、上述した相対位置および相対角度における相違を表すことはできないが、格子内に、相対位置および相対角度を表す任意の記号を配置することによって、上述した相違を表してもよい。つまり、基準画像ＲＩは、第１の要素または第２の要素のいずれかにおいて、上述した相対位置および／または相対角度における相違を表す。 Instead of the reference image RI shown in Figures 3, 4, and 12, a pattern including a square grid may be used. Such a pattern cannot by itself represent the differences in relative position and relative angle described above, but those differences may be represented by placing arbitrary symbols indicating relative position and relative angle within the grid. In other words, the reference image RI represents the above-described differences in relative position and/or relative angle in either the first elements or the second elements.

上記実施形態で説明したハードウェアの構成要素は例示的なものにすぎず、その他の構成も可能であることに留意されたい。また、上記実施形態で説明した処理の順序は、必ずしも説明した順序で実行される必要がなく、任意の順序で実行されてもよい。更に、本発明の基本的な概念から逸脱することなく、追加のステップが新たに加えられてもよい。 Please note that the hardware components described in the above embodiments are merely exemplary, and other configurations are possible. Furthermore, the processing steps described in the above embodiments do not necessarily have to be performed in the order described, and may be performed in any order. Furthermore, additional steps may be added without departing from the basic concept of the present invention.

また、本発明の一実施形態に係る画像分析方法は、コンピュータデバイス１の制御装置１１（プロセッサ）によって実行されるコンピュータプログラムによって実装されるが、当該コンピュータプログラムは、非一時的記憶媒体に記憶されてもよい。非一時的記憶媒体の例は、リードオンリメモリ（ＲＯＭ）、ランダムアクセスメモリ（ＲＡＭ）、レジスタ、キャッシュメモリ、半導体メモリ装置、内蔵ハードディスクおよび取外可能ディスク装置などの磁気媒体、光磁気媒体、ならびにＣＤ－ＲＯＭディスクおよびデジタル多用途ディスク（ＤＶＤ）などの光学媒体などを含む。 The image analysis method according to one embodiment of the present invention is implemented by a computer program executed by the control device 11 (processor) of the computer device 1, and the computer program may be stored on a non-transitory storage medium. Examples of non-transitory storage media include read-only memory (ROM), random access memory (RAM), registers, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disk devices, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).

Claims

1. A computer-implemented method for analyzing a correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected, wherein the reference image includes a first set of elements extending vertically in a two-dimensional plane and a second set of elements extending horizontally in the two-dimensional plane, and the reference image represents that at least one of the first set of elements and the second set of elements includes elements extending intermittently, the method comprising:
identifying, within the captured image, respective nodes formed from the first set of elements and the second set of elements;
assigning a classification value to each of the nodes by classifying the node based on its relative position to adjacent nodes;
generating a graph from the classified nodes;
determining, for each node in the graph, its adjacency relationship with adjacent nodes; and
matching nodes in the graph with nodes in the reference image based on the classification values and the adjacency relationships,
wherein the reference image includes n markers, and the method further comprises:
identifying n regions respectively containing the n markers;
for each of the n regions, classifying the nodes based on a relationship between a central node and adjacent nodes within the region and assigning a second classification value; and
matching nodes in the graph with nodes in the reference image further based on the second classification value.

2. The method of claim 1, wherein the step of matching nodes in the graph with nodes in the reference image is performed using a neural network, the neural network being trained to output correspondences between nodes in the graph and nodes in the reference image based on the classification values and the adjacency relationships.

3. The method of claim 2, wherein the neural network is a graph convolutional network (GCN).

4. The method according to claim 1 or 2, wherein, when the second set of elements includes the intermittently extending elements, the reference image represents that:
a plurality of second elements in the second set of elements intersect a single first element in the first set of elements at a plurality of locations; and
a first height, in the two-dimensional plane, of the intersection between a first one of the plurality of second elements and the first element differs from a second height, in the two-dimensional plane, of the intersection between a second one of the plurality of second elements and the first element.

5. The method of claim 4, wherein the relative position is based on the first height being different from the second height, as represented by the reference image.

6. The method of claim 1, wherein, in each of the n regions, for the lattices formed by the central node and the adjacent nodes in the region, the second classification value is assigned based on whether the shapes of the lattices are the same, different, or symmetrical.

7. The method according to any one of claims 1 to 6, wherein the steps of identifying nodes, assigning classification values, and generating a graph are performed using a second neural network, the second neural network having been trained on reference images to identify nodes from the reference images, assign classification values, and generate a graph.

8. The method of claim 7, wherein the second neural network is a U-Net.

9. The method according to any one of claims 1 to 8, further comprising:
determining, for each pixel in the captured image, a relative position of the pixel with respect to those of the identified nodes that are proximate to the pixel; and
matching pixels in the captured image with pixels in the reference image based on the determined relative positions and on the relative positions of pixels in the reference image with respect to nodes proximate to the corresponding pixels.

10. The method according to any one of claims 1 to 9, further comprising:
identifying respective nodes in the reference image formed from the first set of elements and the second set of elements;
assigning a third classification value to each of the nodes by classifying the node based on its relative position to adjacent nodes;
generating a second graph from the classified nodes; and
determining, for each node in the second graph, an adjacency relationship with adjacent nodes,
wherein the step of matching nodes in the graph with nodes in the reference image comprises:
generating a first feature vector for each node in the graph by inputting the graph into a neural network;
generating a second feature vector for each node in the second graph by inputting the second graph into the neural network;
calculating, for each node, an inner product value of the first feature vector and the second feature vector; and
evaluating the inner product value.

11. A computer-implemented method for analyzing a correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected, wherein the reference image includes a first set of elements extending vertically or horizontally in a two-dimensional plane and a second set of elements extending intermittently in the same direction as the first set of elements, and the reference image represents that the elements of the second set are arranged at random intervals in the same direction, the method comprising:
identifying, within the captured image, respective nodes formed from the first set of elements and the second set of elements;
assigning a classification value to each of the nodes by classifying the node based on its relative angle to adjacent nodes;
generating a graph from the classified nodes;
determining, for each node in the graph, its adjacency relationship with adjacent nodes; and
matching nodes in the graph with nodes in the reference image based on the classification values and the adjacency relationships.

12. The method of claim 11, wherein the step of identifying the nodes is performed using a neural network, the neural network having been trained on the captured image and on the captured image rotated by a predetermined angle to identify the first elements or the second elements.

13. The method according to claim 11 or 12, wherein the reference image represents that an angle of a first one of the second set of elements relative to an adjacent second element differs from an angle of a second one of the second set of elements relative to an adjacent second element.

14. The method according to any one of claims 11 to 13, wherein the step of generating the graph includes dividing the captured image into a plurality of regions including the nodes based on a relative position of each pixel in the captured image and grid information on the captured image.

15. A computer device that analyzes a correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected, wherein the reference image includes a first set of elements extending vertically in a two-dimensional plane and a second set of elements extending horizontally in the two-dimensional plane, and the reference image represents that at least one of the first set of elements and the second set of elements includes elements extending intermittently, the computer device comprising a control device configured to:
identify, within the captured image, respective nodes formed from the first set of elements and the second set of elements;
assign a classification value to each of the nodes by classifying the node based on its relative position to adjacent nodes;
generate a graph from the classified nodes;
determine, for each node in the graph, an adjacency relationship with adjacent nodes; and
match nodes in the graph with nodes in the reference image based on the classification values and the adjacency relationships,
wherein the reference image includes n markers, and the control device is further configured to:
identify n regions respectively containing the n markers;
for each of the n regions, classify the nodes based on a relationship between a central node and adjacent nodes within the region and assign a second classification value; and
match nodes in the graph with nodes in the reference image further based on the second classification value.

16. A computer device that analyzes a correspondence between a reference image including a pattern and a captured image generated from a subject onto which pattern light corresponding to the reference image is projected, wherein the reference image includes a first set of elements extending vertically or horizontally in a two-dimensional plane and a second set of elements extending intermittently in the same direction as the first set of elements, and the reference image represents that the elements of the second set are arranged at random intervals in the same direction, the computer device comprising a control device configured to:
identify, within the captured image, respective nodes formed from the first set of elements and the second set of elements;
assign a classification value to each of the nodes by classifying the node based on its relative angle to adjacent nodes;
generate a graph from the classified nodes;
determine, for each node in the graph, an adjacency relationship with adjacent nodes; and
match nodes in the graph with nodes in the reference image based on the classification values and the adjacency relationships,
wherein the reference image includes n markers, and the control device is further configured to:
identify n regions respectively containing the n markers;
for each of the n regions, classify the nodes based on a relationship between a central node and adjacent nodes within the region and assign a second classification value; and
match nodes in the graph with nodes in the reference image further based on the second classification value.

17. A computer program comprising computer-executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 14.