JP7797003B2

JP7797003B2 - Class inference system, tree species map generation system, class inference method, and tree species map generation method

Info

Publication number: JP7797003B2
Application number: JP2022040304A
Authority: JP
Inventors: 信徳大西
Original assignee: Kyoto University NUC
Current assignee: Kyoto University NUC
Priority date: 2022-03-15
Filing date: 2022-03-15
Publication date: 2026-01-13
Anticipated expiration: 2042-03-15
Also published as: JP2023135210A

Description

The present invention relates to technology for classifying objects such as trees in a forest.

As described in Non-Patent Document 1, methods have conventionally been proposed that acquire forest information using specialized hardware such as hyperspectral cameras, multispectral cameras, or LiDAR (Light Detection and Ranging) sensors, and identify the species of trees growing in the forest from the acquired information.

However, such hardware is expensive, so this approach is costly. Methods that infer tree species using AI (Artificial Intelligence) have therefore been proposed.

For example, according to the method described in Patent Document 1, the contour of each of multiple objects present in a test area is identified based on a photograph of the test area taken from above, or on a first distribution, which is the distribution of heights at each point in the test area, and a second distribution, which is the distribution of gradients at each point. Each identified contour is given a label corresponding to the type, among multiple predetermined types, of the object present in that contour. For each contour, training data is generated that indicates the partial image of the photograph enclosed by the contour and the label assigned to the contour. A trained model is generated using the partial image and the label indicated in each piece of training data as the input and the correct answer, respectively. The species of a target tree is then inferred based on the generated trained model.

Patent Document 1: Japanese Patent Application Laid-Open No. 2020-91640

Non-Patent Document 1: Fassnacht, F. E., Latifi, H., Sterenczak, K., Modzelewska, A., Lefsky, M., Waser, L. T., ... & Ghosh, A. (2016). "Review of studies on tree species classification from remotely sensed data." Remote Sensing of Environment, 186, 64-87.

However, conventional AI methods such as the one described in Patent Document 1 cannot identify classes (types) on which they have not been trained. An existing trained model generated from training data acquired under one type of vegetation therefore cannot be used in a region with different vegetation.

One possible solution is to generate a new trained model suited to the vegetation of that region, but this requires acquiring a huge amount of new training data. In addition, a high-performance computer is needed for the machine learning computations.

In light of these problems, an object of the present invention is to make it easier than before to build AI that infers the classes of trees and other objects.

A class inference system according to one embodiment of the present invention comprises: a first acquisition means for acquiring a first feature vector, which represents the features of an image of the crown of a tree that grows in a specific region and belongs to each of a plurality of classes, by inputting the image into a trained model; a second acquisition means for acquiring a second feature vector, which represents the features of an image of the crown of an inference target tree that grows in the specific region and is the subject of inference, by inputting the image into the trained model; and an inference means for inferring the class, among the plurality of classes, to which the inference target tree belongs, based on the first feature vector of each of the plurality of classes and the second feature vector of the inference target tree. The trained model is the part of a CNN (Convolutional Neural Network) that calculates the feature vector of an input image by convolution processing, the CNN having been generated by deep learning using the class and the crown image of each of a plurality of trees as the objective variable and the explanatory variable, respectively.

A tree species map generation system according to one embodiment of the present invention generates a tree species map of an area that has a plurality of regions and in which trees belonging to each of a plurality of classes grow. The system comprises: a first acquisition means for acquiring, for each of the plurality of regions, a first feature vector, which represents the features of an image of the crown of a tree that grows in that region and belongs to each of the plurality of classes, by inputting the image into a trained model; a second acquisition means for acquiring a second feature vector, which represents the features of an image of the crown of each of a plurality of inference target trees growing in the area, by inputting the image into the trained model; an inference means for inferring the class to which each of the plurality of inference target trees belongs, based on the first feature vector of each of the plurality of classes in the region, among the plurality of regions, in which the inference target tree grows and on the second feature vector of the inference target tree; and a map generation means for generating the tree species map based on the position information of each of the plurality of inference target trees and the inference results of the inference means.

According to the present invention, AI that infers the classes of trees and other objects can be built more easily than before.

Fig. 1 is a diagram showing an example of the overall configuration of a tree classification system.
Fig. 2 is a diagram showing an example of the hardware configuration of a computer.
Fig. 3 is a diagram showing an example of the functional configuration of the computer.
Fig. 4 is a diagram showing an example of a tree classification network.
Fig. 5 is a diagram showing examples of an aerial photograph and a canopy map.
Fig. 6 is a diagram showing an example of object data.
Fig. 7 is a diagram showing an example of an inference method.
Fig. 8 is a diagram showing an example of the distribution of feature points in a 2640-dimensional space.
Fig. 9 is a diagram showing an example of a tree classification map.
Fig. 10 is a flowchart illustrating an example of the overall processing flow of a tree classification program.
Fig. 11 is a diagram showing an example of correct/incorrect results.
Fig. 12 is a diagram showing an example of a method for generalizing the feature extraction network and creating a nationwide tree map.

[Overall Configuration]
Fig. 1 is a diagram showing an example of the overall configuration of a tree classification system 3. Fig. 2 is a diagram showing an example of the hardware configuration of a computer 1. Fig. 3 is a diagram showing an example of the functional configuration of the computer 1.

As shown in Fig. 1, the tree classification system 3 is composed of a computer 1, a drone 2, and other components. The tree classification system 3 provides a service that uses AI (Artificial Intelligence) to classify the trees that appear in photographs taken from above.

The drone 2 is a UAV (Unmanned Aerial Vehicle) equipped with a digital camera. It is used to photograph a forest from above and thereby obtain the color photographs on which the learning data is based. The drone 2 may be a commercially available model; for example, a DJI PHANTOM 4 is used.

As shown in Fig. 2, the computer 1 includes a processor 10, a RAM (Random Access Memory) 11, a ROM (Read Only Memory) 12, an auxiliary storage device 13, a network adapter 14, a keyboard 15, a pointing device 16, an input/output board 17, a touch panel display 18, and an audio output unit 19, among other components.

An operating system and various programs are installed in the ROM 12 or the auxiliary storage device 13. In particular, in this embodiment, a tree classification program 4 (see Fig. 3) is installed. An SSD (Solid State Drive), a hard disk, or the like is used as the auxiliary storage device 13.

The RAM 11 is the main memory of the computer 1. The operating system and programs such as the tree classification program 4 are loaded into the RAM 11.

The processor 10 executes the programs loaded into the RAM 11. A GPU (Graphics Processing Unit), a CPU (Central Processing Unit), or the like is used as the processor 10.

The network adapter 14 is a device for communicating with other devices, such as the drone 2, using a protocol such as TCP/IP (Transmission Control Protocol/Internet Protocol).

The keyboard 15 and the pointing device 16 are input devices with which the operator enters commands, data, and the like.

The input/output board 17 communicates with the drone 2 over a wired or wireless connection. For example, an input/output board compliant with USB (Universal Serial Bus) or Bluetooth is used as the input/output board 17.

The touch panel display 18 displays screens for entering commands or data, maps generated by the processor 10, and the like.

The audio output unit 19 is composed of an audio board, a speaker, and the like, and outputs sounds such as warning tones.

The tree classification program 4 is a computer program for implementing the canopy map generation unit 41, labeling unit 42, feature vector calculation unit 43, teacher data storage unit 44, canopy detection unit 45, feature calculation unit 46, class inference unit 47, tree classification map generation unit 48, tree classification map output unit 49, and other units shown in Fig. 3. With the tree classification program 4, a classifier for inferring the class (type) of a tree can be generated, and the class of a target tree can be inferred based on this classifier.

The processing performed by the drone 2 and by each unit shown in Fig. 3 is described below, divided broadly into a learning phase and an inference phase, using as an example a case in which the tree classification system 3 is tuned and used to classify the trees of a forest 80 in a certain region (site).

[Learning Phase]
(1) Network and Data
Fig. 4 is a diagram showing an example of a tree classification network 5. Fig. 5 is a diagram showing examples of an aerial photograph 60 and a canopy map 61. Fig. 6 is a diagram showing an example of object data 62.

A feature extraction network 51 is prepared in the computer 1 in advance. The feature extraction network 51 is one part of the tree classification network 5 shown in Fig. 4.

The tree classification network 5 is a CNN (Convolutional Neural Network) that infers, from an image of a tree's canopy (crown), which of N classes the target tree belongs to. It is composed mainly of the feature extraction network 51 and a probability calculation network 52.

The feature extraction network 51 is a network having an input layer, convolutional layers, pooling layers, and an output layer, and calculates a feature vector F representing the features of the input image. The convolutional layers and pooling layers may alternate. In this embodiment, a 224 x 224 pixel RGB color image is input to the input layer, and through the computations of each layer a 2560-dimensional vector is output from the output layer as the feature vector.

The probability calculation network 52 is composed of an input layer, one or more fully connected layers, and an output layer. Softmax is used as the activation function of the output layer. When the feature vector F is input to the input layer, the computations of each layer cause the output layer to output the probabilities p_1, p_2, ..., p_N that the target tree belongs to the first, second, ..., Nth class, respectively.
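The softmax step at the output layer can be sketched as follows. This is only an illustration under stated assumptions: the `logits` values are made-up scores for N = 3 classes, and `softmax` is a hypothetical helper, not code from the tree classification program 4.

```python
import math

def softmax(logits):
    """Turn raw output-layer scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores from the last fully connected layer for N = 3 classes.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)  # corresponds to p_1, p_2, p_3

# Index of the most probable class (0-based, so 0 means the first class).
best = max(range(len(probs)), key=probs.__getitem__)
```

The class with the largest score always receives the largest probability, so taking the maximum of the probabilities and taking the maximum of the raw scores select the same class.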

The tree classification network 5 may be an existing network (trained model) trained on canopy images of trees in a forest other than the forest 80. For example, it may be one created by the method described in Japanese Patent Application Laid-Open No. 2020-91640 or by the watershed segmentation method. According to the method described in Japanese Patent Application Laid-Open No. 2020-91640, the computer 1 generates an orthophoto from the aerial photograph 60, generates a DSM (Digital Surface Model) based on the orthophoto, and generates a slope model based on the DSM. It then generates the canopy map 61 based on the red, green, and blue channel images of the orthophoto, the DSM, and the slope model.

The vegetation of the forest on which the tree classification network 5 was trained may differ from that of the forest 80. In other words, the feature extraction network of a CNN generated for a forest other than the forest 80, whatever its vegetation, can be used as the feature extraction network 51 in the computer 1.

The operator flies the drone 2 over the forest 80 and photographs part or all of the forest 80 from a predetermined altitude, thereby obtaining an aerial photograph 60 such as the one shown in Fig. 5(A). The operator then inputs the aerial photograph 60 into the computer 1. The aerial photograph 60 is an RGB color photograph.

The canopy map generation unit 41 (see Fig. 3) of the computer 1 then detects the position and shape of the canopies and other objects that appear in the aerial photograph 60 and generates a canopy map 61 such as the one shown in Fig. 5(B). In the canopy map 61, the position and shape of each object appear as a polygon 61A. The canopy map 61 can be created by a known method, for example, the method described in Japanese Patent Application Laid-Open No. 2020-91640.

The labeling unit 42 assigns, to the images of some of the objects in the aerial photograph 60, labels corresponding to the classes of those objects, as follows.

The labeling unit 42 displays the canopy map 61 on the touch panel display 18. Instead of the canopy map 61, the outline of each polygon 61A may be displayed superimposed on the aerial photograph 60.

The operator selects (samples) one to several polygons 61A representing canopies for each class of tree growing in the forest 80. For example, if three classes of tree grow in the forest 80, namely metasequoia, white pine, and cypress, the operator selects one to several polygons 61A representing metasequoia, one to several representing white pine, and one to several representing cypress. The operator also selects one to several polygons 61A representing objects that are not trees (for example, the ground). It is desirable to select the same number of polygons 61A for each class; for example, three polygons 61A each for metasequoia, white pine, cypress, and the ground.

The labeling unit 42 then extracts from the aerial photograph 60 the image of the object corresponding to each selected polygon 61A as an object image 62G, and assigns a label identifying that object to the object image 62G as a label 62L. For example, if the selected polygon 61A is that of a metasequoia, the labeling unit 42 extracts the image of the corresponding object from the aerial photograph 60 as the object image 62G and assigns the metasequoia label to it as the label 62L. It then generates object data 62 indicating the object image 62G and the label 62L.

Through the operator's operations and the processing of the labeling unit 42, sample object data 62 is generated for each class, as shown in Fig. 6.

In the example of Fig. 6, the values of the label 62L are "class_1", "class_2", and so on, but they may instead be species names such as "Japanese evergreen oak", "Japanese red pine", or "other". In that case, the operator enters the species name when selecting each polygon 61A.

(2) Learning
Fig. 7 is a diagram showing an example of an inference method. Fig. 8 is a diagram showing an example of the distribution of feature points in a 2640-dimensional space. Fig. 9 is a diagram showing an example of a tree classification map 67.

When the object data 62 has been generated, the feature vector calculation unit 43 (see Fig. 3) calculates the feature vector F of the object image 62G of each piece of object data 62 by inputting the object image 62G into the feature extraction network 51, as shown in Fig. 7. The object image 62G is reduced as needed to match the size of the input layer of the feature extraction network 51 before it is input.
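The reduction step could, for instance, be a nearest-neighbor downscale. The sketch below is only an illustration: the source does not say which interpolation method is used, and `resize_nearest` is a hypothetical helper that treats an image as a list of pixel rows.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D grid of pixels to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# Toy 4 x 4 "image" reduced to 2 x 2. A real object image 62G would be
# reduced to the 224 x 224 input size of the feature extraction network 51.
toy = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
small = resize_nearest(toy, 2, 2)
```

Each pixel may be a single value or an (R, G, B) tuple; the helper only copies entries, so both work unchanged.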

The feature vector calculation unit 43 then generates teacher data 63 by replacing the object image 62G in the object data 62 with the calculated feature vector F, and stores the teacher data 63 in the teacher data storage unit 44.

Through the processing of the canopy map generation unit 41 through the feature vector calculation unit 43, if, for example, the operator selects five polygons 61A for each of 11 classes of objects (trees or other objects), 55 pieces of teacher data 63 are generated and stored in the teacher data storage unit 44.
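The conversion from object data 62 to teacher data 63 can be sketched as below. All names here (`ObjectData`, `TeacherData`, `build_teacher_data`, `dummy_extract`) are hypothetical, and `dummy_extract` is only a stand-in for the feature extraction network 51, which in the embodiment would return a much longer vector.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class ObjectData:          # corresponds to object data 62
    image: Sequence[float]  # stand-in for object image 62G
    label: str              # label 62L

@dataclass
class TeacherData:         # corresponds to teacher data 63
    feature: List[float]    # feature vector F
    label: str              # label 62L, carried over unchanged

def build_teacher_data(objects: Sequence[ObjectData],
                       extract: Callable[[Sequence[float]], Sequence[float]]) -> List[TeacherData]:
    """Replace each object image with its feature vector, keeping the label."""
    return [TeacherData(list(extract(o.image)), o.label) for o in objects]

def dummy_extract(image: Sequence[float]) -> List[float]:
    # Placeholder only: a real extractor would be the CNN's convolutional part.
    return [float(sum(image)), float(len(image))]

# 11 classes x 5 selected polygons each, as in the example above.
objects = [ObjectData(image=[c, i], label=f"class_{c}")
           for c in range(1, 12) for i in range(5)]
teacher = build_teacher_data(objects, dummy_extract)
```

Because no weights are updated in this step, adding a class for a new site only means appending more entries to the list.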

The following description takes as an example a case in which P pieces of teacher data 63 are stored in the teacher data storage unit 44. The feature vectors F indicated in the respective pieces of teacher data 63 may be distinguished as "feature vector F_1", "feature vector F_2", ..., "feature vector F_P".

These feature vectors F can be regarded as position vectors whose starting point is the origin of a 2640-dimensional space (Euclidean space). Hereinafter, the end point of a feature vector F in the 2640-dimensional space is referred to as a "feature point". Plotting the feature points of the feature vectors F of the objects in the 2640-dimensional space reveals the distribution of the features of the objects present in the forest 80, as shown in Fig. 8(A).

[Inference Phase]
The operator inputs into the computer 1 an aerial photograph 65 of part or all of the forest 80, taken by the drone 2 from a predetermined altitude. Like the aerial photograph 60, the aerial photograph 65 is an RGB color photograph.

When the aerial photograph 65 is input, the canopy detection unit 45 (see Fig. 3) detects the position and shape of the canopies and other objects that appear in it. The detection method is the same as that used by the canopy map generation unit 41; for example, the method described in Japanese Patent Application Laid-Open No. 2020-91640 is used.

Once the canopy detection unit 45 has detected the position and shape of each object, such as a canopy, in the aerial photograph 65, the feature calculation unit 46 and the class inference unit 47 perform processing to infer the class of each object. The following description uses the inference of the class of a certain object α as an example.

The feature calculation unit 46 extracts the image of the object α from the aerial photograph 65 as an object image 66, based on the position and shape of the object α. It then calculates the feature vector F of the object α by inputting the object image 66 into the feature extraction network 51. This is the same feature extraction network 51 that was used in the learning phase to calculate the feature vectors F_1, F_2, ..., F_P. Hereinafter, the feature vector F calculated by the feature calculation unit 46 is referred to as "feature vector F_A".

The class inference unit 47 uses the k-nearest neighbor method to select, from among the feature points of the feature vectors F (F_1, F_2, ..., F_P) of the teacher data 63 stored in the teacher data storage unit 44, the k feature points closest to the feature point of the feature vector F_A. It then extracts the labels 62L of the teacher data 63 corresponding to the selected feature points and infers that the class indicated by the largest number of labels 62L is the class of the object α. The distance between two feature points is the Euclidean distance.

For example, when the end point of the feature vector F_A is located at "×" as in Fig. 8(B) and k is 3, the three end points closest to the end point of the feature vector F_A are the three points enclosed by the dotted line, and among these three points those of class_1 are the most numerous. The class inference unit 47 therefore infers that class_1 is the class of the object α.
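The k-nearest neighbor vote with Euclidean distance can be sketched as follows. This is a minimal illustration, not the program's actual code: `euclidean` and `knn_classify` are hypothetical names, the 2-D points stand in for the high-dimensional feature vectors, and the tie-breaking rule (the tied class whose nearest member is closest wins) is an assumption the source does not address.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two feature points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(query, teacher, k=3):
    """Return the majority label among the k teacher points nearest to query.

    teacher is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(teacher, key=lambda t: euclidean(query, t[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common keeps first-seen order among equal counts, so a tie goes
    # to the class whose nearest member is closest to the query.
    return votes.most_common(1)[0][0]

# Toy teacher data laid out like Fig. 8(B): two class_1 points and one
# class_2 point lie near the query, the remaining points are far away.
teacher = [
    ([0.0, 0.0], "class_1"),
    ([1.0, 0.0], "class_1"),
    ([0.0, 1.5], "class_2"),
    ([5.0, 5.0], "class_2"),
    ([6.0, 5.0], "class_3"),
]
predicted = knn_classify([0.5, 0.5], teacher, k=3)
```

With only a handful of teacher points per class, a brute-force sort over all P distances is inexpensive, so no spatial index is used in this sketch.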

The class inference unit 47 likewise infers, by the k-nearest neighbor method, the class of each of the other objects whose position and shape were determined by the canopy detection unit 45.

The tree classification map generation unit 48 generates a tree classification map 67 such as the one shown in Fig. 9, based on the class inferred by the class inference unit 47 for each object whose position and shape were determined by the canopy detection unit 45. Specifically, it generates the tree classification map 67 by replacing the image of each object in the aerial photograph 65 with a polygon in the color corresponding to that object's class.
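The mapping from inferred classes to colored polygons can be sketched as below. The data layout, the color table, and the function name `build_classification_map` are all hypothetical; the source only states that each object is replaced by a polygon in the color of its class.

```python
def build_classification_map(objects, classes, colors, default=(128, 128, 128)):
    """Pair each detected polygon with the color assigned to its inferred class."""
    return [
        {"polygon": obj["polygon"], "class": cls, "color": colors.get(cls, default)}
        for obj, cls in zip(objects, classes)
    ]

# Hypothetical RGB colors per class and two detected objects with their outlines.
colors = {"class_1": (0, 128, 0), "class_2": (255, 165, 0)}
objects = [
    {"polygon": [(0, 0), (4, 0), (4, 3)]},
    {"polygon": [(5, 5), (9, 5), (9, 8)]},
]
tree_map = build_classification_map(objects, ["class_2", "class_1"], colors)
```

The resulting list could then be handed to any drawing or GIS layer; a class without an entry in the color table falls back to gray in this sketch.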

The tree classification map output unit 49 then outputs the generated tree classification map 67 by displaying it on the touch panel display 18. Alternatively, it may transmit a file of the tree classification map 67 to another computer over a communication line.

[Overall Processing Flow and Effects of This Embodiment]
Fig. 10 is a flowchart illustrating an example of the overall processing flow of the tree classification program 4. Fig. 11 is a diagram showing an example of correct/incorrect results.

Next, the overall flow of processing by the computer 1 is described with reference to the flowchart in Fig. 10. The computer 1 executes processing according to the tree classification program 4, following the procedure shown in Fig. 10.

When an aerial photograph 60 such as the one shown in Fig. 5(A) is input, the computer 1 generates a canopy map 61 such as the one shown in Fig. 5(B) based on the aerial photograph 60 (#101 in Fig. 10) and displays it (#102).

When the operator selects one to several polygons 61A for each class, the computer 1 generates object data 62 such as that shown in Fig. 6 by assigning the label 62L of the corresponding class to the source image (object image 62G) of each selected polygon 61A (#103). It then calculates the feature vector F of each object image 62G (#104), and generates and stores teacher data 63 indicating the feature vector F and the label 62L of each object image 62G (#105). This completes the learning phase.

When the aerial photograph 65 is input, the computer 1 detects the position and shape of the objects that appear in the aerial photograph 65 (#121) and calculates the feature vector F of the object image 66 of each object in the aerial photograph 65 (#122).

The computer 1 then infers the class of each object in the aerial photograph 65 by the k-nearest neighbor method, based on the feature vectors F indicated in the teacher data 63 and the feature vectors F calculated in step #122 (#123). It generates a tree classification map 67 such as the one shown in Fig. 9 from the inference results (#124) and outputs it (#125).

According to this embodiment, the operator does not need to prepare a huge amount of training data, as is the case with conventional deep learning; it is sufficient to prepare one to several pieces of object data 62 for each class (type) of tree growing in a given region (site). The computer 1 generates the teacher data 63 from these pieces of object data 62, which completes the preparation for inference. Therefore, according to this embodiment, AI that infers the classes of trees and other objects can be built more easily than before.

An example of experimental results is given here to supplement the effects of this embodiment. A CNN was generated by deep learning to classify objects into one of 10 classes: nine types of trees, namely koshiabura, mizume (Japanese cherry birch), buna (beech), sawagurumi (Japanese wingnut), hoo (Japanese bigleaf magnolia), azukinashi, uwamizuzakura (Japanese bird cherry), itaya-kaede (painted maple), and tochi (Japanese horse chestnut), plus objects other than trees (gaps). The objects to be inferred were classified with this CNN, and tallying the actual classes against the inferred classes gave correct/incorrect result_1 shown in FIG. 11.

In contrast, following the method of this embodiment, a small amount of training data 63 (for example, 10 pieces per class) was generated using the existing feature extraction network 51, the objects to be inferred were classified by the k-nearest neighbor method based on this training data 63, and tallying the actual classes against the inferred classes gave correct/incorrect result_2.

The κ values of correct/incorrect result_1 and correct/incorrect result_2 are 0.414 and 0.456, respectively. In other words, even the k-nearest neighbor method using a set of only about 10 pieces of training data 63 per class, acquired for the region (site), can achieve classification accuracy comparable to that of a CNN obtained by deep learning. Furthermore, the classification accuracy for some classes (itaya-kaede) can be improved significantly over deep learning.
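For reference, the following sketch shows how a κ value can be computed from a tally of actual versus inferred classes, assuming that the κ value here refers to Cohen's kappa. The function name `cohen_kappa` and the 2-class tally are hypothetical and are not the data of FIG. 11.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square tally.

    confusion[i][j] is the number of objects whose actual class is i
    and whose inferred class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of objects on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical tally for two classes (not taken from the experiment).
tally = [
    [20, 5],
    [10, 15],
]
print(round(cohen_kappa(tally), 3))
```

A κ value of 1 corresponds to perfect agreement and 0 to agreement no better than chance, so the two values reported above can be compared directly even though the two classifiers were built differently.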

[Modifications and Applications]
If the distances between the feature point of the feature vector F of the object image 66 of the object to be inferred (for example, the feature vector F_A in FIG. 7) and the feature points of the feature vectors F_1, F_2, ..., F_P are all equal to or greater than a predetermined distance, it is highly likely that the object does not belong to any of the classes. In such a case, the computer 1 may infer the class of the object with the tree classification network 5 (see FIG. 4) instead of the k-nearest neighbor method. That is, the object image 66 of the object may be input to the feature extraction network 51, and the class corresponding to the highest of the probabilities p_1, p_2, ..., p_N output from the probability calculation network 52 may be inferred to be the class of the object.
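This fallback can be sketched as follows. The sketch assumes Euclidean distance and takes the probabilities p_1, ..., p_N as an already computed list rather than calling a real network; the names `infer_with_fallback`, `cnn_probs`, `class_names`, and `max_dist` are hypothetical.

```python
import math
from collections import Counter

def infer_with_fallback(training_data, query, cnn_probs, class_names, k, max_dist):
    """Use k-nearest neighbors unless every sample is at least max_dist away.

    cnn_probs: probabilities p_1..p_N for the same object, assumed to have been
    obtained from the classification network beforehand (hypothetical input).
    """
    # Distances from the query to every sample, nearest first.
    dists = sorted((math.dist(vec, query), label) for vec, label in training_data)
    if dists[0][0] >= max_dist:
        # Even the nearest sample is too far: take the most probable class instead.
        best = max(range(len(cnn_probs)), key=lambda i: cnn_probs[i])
        return class_names[best]
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

training_data = [
    ([0.0, 0.1], "beech"),
    ([0.2, 0.0], "beech"),
    ([1.0, 1.1], "maple"),
    ([0.9, 1.0], "maple"),
]
class_names = ["beech", "maple"]
# A query far from every sample falls back to the network's probabilities.
print(infer_with_fallback(training_data, [5.0, 5.0], [0.2, 0.8], class_names, k=3, max_dist=2.0))
```

Checking only the nearest sample is sufficient here, because all distances are at least the threshold exactly when the smallest one is.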

In this embodiment, as described with reference to FIG. 8, the computer 1 infers the class of the target object by the k-nearest neighbor method in 2640 dimensions, but the inference may instead be performed after reducing the number of dimensions. For example, the several (for example, 10) most important of the 2640 elements (components) of the feature vector F may be determined by the SFS (Sequential Forward Selection) method, and the class of the target object may be inferred by the k-nearest neighbor method based on the feature values of the determined elements. That is, for example, the feature vectors F_1, F_2, ..., F_P in FIG. 7 are converted into low-dimensional vectors G_1, G_2, ..., G_P by the SFS method, and the feature vector F_A is converted into a low-dimensional vector G_A by the SFS method. The distances between the feature point of the low-dimensional vector G_A and the feature points of the low-dimensional vectors G_1, G_2, ..., G_P are then calculated, and the class is inferred by the k-nearest neighbor method.

The several most important elements may be determined in the inference phase, after the feature vector F of the image of the object to be inferred, that is, the object image 66, has been calculated (after step #122 in FIG. 10), based on that feature vector F. Alternatively, they may be determined in the learning phase, after the feature vector F of the image of one of the samples, that is, an object image 62G, has been calculated (after #104), based on that feature vector F.

Alternatively, the important elements may be determined by a wrapper method other than the SFS method, such as the SBS (Sequential Backward Selection) method or an exhaustive search. The important elements may also be determined by a feature selection method other than a wrapper method.
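A minimal sketch of sequential forward selection as a wrapper method is shown below. The description does not specify the selection criterion, so this sketch assumes one: the leave-one-out accuracy of a 1-nearest-neighbor classifier on labelled samples, using Euclidean distance. The names `loo_accuracy` and `sequential_forward_selection` and the 3-element vectors are hypothetical.

```python
import math

def loo_accuracy(samples, dims):
    """Leave-one-out 1-nearest-neighbor accuracy using only the elements in dims."""
    def project(vec):
        return [vec[d] for d in dims]
    correct = 0
    for i, (vec, label) in enumerate(samples):
        # Classify sample i from all the other samples.
        others = [s for j, s in enumerate(samples) if j != i]
        nearest = min(others, key=lambda s: math.dist(project(s[0]), project(vec)))
        correct += nearest[1] == label
    return correct / len(samples)

def sequential_forward_selection(samples, n_select):
    """Greedily add the element index that most improves the criterion."""
    n_dims = len(samples[0][0])
    selected = []
    while len(selected) < n_select:
        candidates = [d for d in range(n_dims) if d not in selected]
        best = max(candidates, key=lambda d: loo_accuracy(samples, selected + [d]))
        selected.append(best)
    return selected

# Hypothetical samples in which only element 1 separates the two classes.
samples = [
    ([0.5, 0.0, 0.3], "a"),
    ([0.1, 0.1, 0.9], "a"),
    ([0.4, 1.0, 0.2], "b"),
    ([0.2, 1.1, 0.8], "b"),
]
selected = sequential_forward_selection(samples, 1)
print(selected)
# A low-dimensional vector keeps only the selected elements.
low_dim = [[vec[d] for d in selected] for vec, _ in samples]
```

The same index list would be applied to the feature vector of the object to be inferred, so that both sides of the distance calculation have the same reduced dimensionality.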

FIG. 12 is a diagram showing an example of a method for generalizing the feature extraction network 51 and creating a nationwide tree map.

By providing the tree classification program 4 to users throughout Japan and collecting data from each user, the provider of the tree classification program 4 can create a tree species map of the whole of Japan and also improve the accuracy of the tree classification network 5 (see FIG. 4). This mechanism is described below with reference to FIG. 12.

The provider of the tree classification program 4 prepares in advance a server 31 that has the tree classification network 5. The provider also standardizes in advance the identifiers (class codes or names) of the classes (types) of trees found in Japan and prepares a tree list in which they are recorded.

Each time the server 31 receives a request from a user, it transmits the tree classification program 4, together with the feature extraction network 51 and the tree list, to that user's computer (#131). The feature extraction network 51 is a neural network used in common across the users' regions (sites), that is, a general-purpose model, and is preferably the latest version.

On receiving the tree classification program 4 and the feature extraction network 51 on a computer, the user installs them on that computer. The computer thereby functions as the computer 1.

The user uses the tree classification program 4, the feature extraction network 51, the drone 2, and the like to tune the tree classification program 4 to the forests in the user's own region, and creates a tree classification map 67 (#132). Labeling follows the tree list. The user then transmits the tree classification map 67 from the user's computer to the server 31, together with position information on the photographing location and the object data 62 obtained in the course of creating the map (#133).

On receiving the object data 62, the tree classification map 67, and the position information, the server 31 further trains the tree classification network 5 by deep learning, using the object images 62G and the labels 62L indicated in the object data 62 as explanatory variables and objective variables, respectively (#134). Part of the object data 62 may be used as validation data or test data. The server 31 also overlays the tree classification map 67 on a national map based on the position information (#135).

Through the above processing, the training of the feature extraction network 51 and the creation of the tree species map of the whole of Japan progress. Thereafter, the feature extraction network 51 in its latest state is provided to each user. By repeating this series of processes, object data 62 is gathered from many locations and the feature extraction network 51 is trained to become a large-scale general-purpose model. The scope is not limited to Japan: a tree species map of the whole world may be created, and the tree classification network 5 may be trained by deep learning so that tree species worldwide can be classified.

In this embodiment, the k-nearest neighbor method is used to infer the class of an object such as a tree, but the k-nearest neighbor method may be modified, for example, as follows. From among the feature points of the feature vectors F (F_1, F_2, ...) of the pieces of training data 63, the k feature points closest to the feature point of the feature vector F_A of the object to be inferred are selected. The selected feature points are grouped by the class indicated in the label 62L of the training data 63 corresponding to each feature point. Then, for each class, the sum of the distances between the selected feature points and the feature point of the object to be inferred is calculated, and the class with the smallest sum is inferred to be the class of the object to be inferred. Alternatively, instead of a vote, each selected feature point may be given a score that is larger the shorter its distance, and the class with the largest total score may be inferred to be the class of the object to be inferred.
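The two modified rules can be sketched as follows. The sketch assumes Euclidean distance, and for the score-based rule it assumes the score 1/(distance + eps), which is only one possible choice of a score that grows as the distance shrinks. The function names and the one-dimensional sample data are hypothetical.

```python
import math
from collections import defaultdict

def k_nearest(training_data, query, k):
    """Return (distance, label) pairs for the k samples closest to the query."""
    return sorted((math.dist(vec, query), label) for vec, label in training_data)[:k]

def infer_by_distance_sum(training_data, query, k):
    """Pick the class whose selected feature points have the smallest total distance."""
    sums = defaultdict(float)
    for dist, label in k_nearest(training_data, query, k):
        sums[label] += dist
    return min(sums, key=sums.get)

def infer_by_score(training_data, query, k, eps=1e-9):
    """Pick the class with the largest total score; closer points score higher.

    The inverse-distance score is an assumption made for this sketch.
    """
    scores = defaultdict(float)
    for dist, label in k_nearest(training_data, query, k):
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)

training_data = [([0.0], "a"), ([0.2], "a"), ([1.0], "b"), ([3.0], "b")]
print(infer_by_distance_sum(training_data, [0.1], k=3))
print(infer_by_score(training_data, [0.1], k=3))
```

Note that under the distance-sum rule only classes that appear among the k selected points are compared, and a class represented by a single selected point can have a small sum; the score-based rule rewards both proximity and the number of nearby points.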

Alternatively, instead of the k-nearest neighbor method, a learner may be generated using an SVM (Support Vector Machine) or a random forest, and the class may be inferred with the generated learner. Whether an SVM or a random forest is used, the learner may be generated after reducing the dimensions of the feature vectors F_1, F_2, ..., F_P by the SFS method or the like, and the class may be inferred after reducing the dimensions of the feature vector F_A in the same manner.

In this embodiment, the functions shown in FIG. 3 are consolidated in the computer 1, but they may be distributed across a plurality of devices. For example, the canopy map generation unit 41 and the label assignment unit 42 may be provided in a first computer, the feature vector calculation unit 43 in a second computer, the teacher data storage unit 44 in a third computer, the canopy detection unit 45 in a fourth computer, the class inference unit 47 in a fifth computer, and the tree classification map generation unit 48 and the tree classification map output unit 49 in a sixth computer. A computer program for realizing the respective function is installed on each of the first to sixth computers, and the computers cooperate with one another via a communication line to execute the series of learning and inference processes for generating the tree classification map 67 from the aerial photograph 60.

In this embodiment, the computer 1 is used to classify trees, but it may be used to classify other objects. For example, it may be used to classify flowering plants distributed across a grassland, shrubs, animal dwellings (animal nests such as anthills and mole burrows), or landforms. It may also be used to classify rocks (volcanic rocks, hypabyssal rocks, plutonic rocks, sedimentary rocks, and the like) or buildings.

Whichever kind of object is classified, the computer 1 and the drone 2 basically perform the same processing as in this embodiment.

However, when classifying flowering plants, the drone 2 collects the aerial photographs 60 and 65 by photographing from a lower altitude than when classifying trees. Because the shapes of flowering plants, like those of buildings, produce steep slopes, using a slope model as in this embodiment is effective; however, because flowering plants are smaller than trees, it is desirable to lower the altitude to match their size. The same applies when classifying shrubs and when classifying lake and marsh plants (for example, reeds).

On the other hand, when classifying landforms (lakes and marshes, plains, hills, rivers), the drone 2 may collect the aerial photographs 60 and 65 by photographing from a higher altitude than when classifying trees. Alternatively, the aerial photographs 60 and 65 may be collected at a lower resolution.

In addition, the configuration of the whole or each part of the computer 1, the content of the processing, the order of the processing, the structure of the data, and the like may be changed as appropriate in accordance with the spirit of the present invention.

1 Computer (learning device)
3 Tree classification system (class inference system)
31 Server (learning means)
4 Tree classification program (computer program)
42 Label assignment unit (first acquisition means)
46 Feature calculation unit (second acquisition means)
47 Class inference unit (inference means)
5 Tree classification network (CNN)
51 Feature extraction network (trained model)
63 Training data (data for inference)
62G Object image
63 Training data (sample feature)
66 Object image (image of the object to be inferred)

Claims

1. A class inference system comprising:
a first acquisition means for acquiring a first feature vector representing features of an image of a crown of a tree that grows in a specific region and belongs to each of a plurality of classes, by inputting the image into a trained model;
a second acquisition means for acquiring a second feature vector representing features of an image of a crown of an inference target tree that is a target of inference and grows in the specific region, by inputting the image into the trained model; and
an inference means for inferring a class to which the inference target tree belongs from among the plurality of classes, based on the first feature vector of each of the plurality of classes and the second feature vector of the inference target tree,
wherein the trained model is a network that forms a part of a CNN (Convolutional Neural Network) generated by deep learning using the classes of a plurality of trees and the images of tree crowns as objective variables and explanatory variables, respectively, and that calculates the feature vector of an input image by convolution processing.

2. The class inference system according to claim 1, wherein the inference means infers the class to which the inference target tree belongs from among the plurality of classes by a k-nearest neighbor method.

3. The class inference system according to claim 1 or claim 2, wherein
the first acquisition means acquires, as the first feature vector of an image of a crown of a tree belonging to each of the plurality of classes, a first low-dimensional vector consisting of the values of a specific plurality of elements of a feature vector calculated by inputting the image into the trained model, and
the second acquisition means acquires, as the second feature vector, a second low-dimensional vector consisting of the values of the specific plurality of elements of a feature vector calculated by inputting an image of a crown of the inference target tree into the trained model.

4. The class inference system according to claim 3, wherein the specific plurality of elements are obtained by performing SFS (Sequential Forward Selection) processing on the feature vector calculated by the trained model.

5. The class inference system according to claim 1, further comprising a learning means for further training the CNN using images of crowns of trees belonging to each of the plurality of classes and the classes of those trees as explanatory variables and objective variables, respectively.

6. A tree species map generation system for generating a tree species map of an area that has a plurality of regions and in which trees belonging to each of a plurality of classes grow, comprising:
a first acquisition means for acquiring, for each of the plurality of regions, a first feature vector representing features of an image of a crown of a tree growing in the region and belonging to each of the plurality of classes, by inputting the image into a trained model;
a second acquisition means for acquiring a second feature vector representing features of an image of the crown of each of a plurality of inference target trees growing in the region, by inputting the image into the trained model;
an inference means for inferring a class to which each of the plurality of inference target trees belongs, based on the first feature vector of each of the plurality of classes in the region, among the plurality of regions, where the inference target tree grows and the second feature vector of the inference target tree; and
a map generating means for generating the tree species map based on the position information of each of the plurality of inference target trees and the inference result of the inference means.

7. A class inference method comprising:
acquiring a first feature vector representing features of an image of a crown of a tree that grows in a specific region and belongs to each of a plurality of classes, by inputting the image into a trained model;
acquiring a second feature vector representing features of an image of a crown of an inference target tree that is a target of inference and grows in the specific region, by inputting the image into the trained model; and
inferring a class to which the inference target tree belongs from among the plurality of classes, based on the first feature vector of each of the plurality of classes and the second feature vector of the inference target tree,
wherein a network that forms a part of a CNN (Convolutional Neural Network) generated by deep learning using the classes of a plurality of trees and the images of tree crowns as objective variables and explanatory variables, respectively, and that calculates the feature vector of an input image by convolution processing, is used as the trained model.

8. A tree species map generation method for generating a tree species map of an area that has a plurality of regions and in which trees belonging to each of a plurality of classes grow, comprising:
acquiring, for each of the plurality of regions, a first feature vector representing features of an image of a crown of a tree growing in the region and belonging to each of the plurality of classes, by inputting the image into a trained model;
acquiring a second feature vector representing features of an image of a crown of each of a plurality of inference target trees growing in the region, by inputting the image into the trained model;
inferring a class to which each of the plurality of inference target trees belongs, based on the first feature vector of each of the plurality of classes in the region, among the plurality of regions, where the inference target tree grows and the second feature vector of the inference target tree; and
generating the tree species map based on the position information and the inferred class of each of the plurality of inference target trees.