JP7597469B2 - Training Semi-Supervised Neural Networks for Uncertainty-Guided Image Classification

Info

Publication number: JP7597469B2
Application number: JP2022537678A
Authority: JP
Inventors: セダイ、スーマン; アントニー、バーヴナ、ジョセフィーン; ガルナビ、ラーヒル
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2020-01-09
Filing date: 2021-01-04
Publication date: 2024-12-10
Anticipated expiration: 2041-01-04
Also published as: JP2023510697A; US20210216825A1; US11386298B2; WO2021140426A1; CN114945951B; CN114945951A

Description

The present invention relates generally to image detection using convolutional neural networks, and more specifically to uncertainty-guided training of semi-supervised neural networks for image classification.

A convolutional neural network is a multi-layer deep learning algorithm. Deep learning is a type of machine learning that identifies features and classifies them. Classification refers to labeling each pixel of a set of pixels of an image with one of a set of classifications that identify a feature. Deep learning algorithms are well suited to, for example, making diagnoses based on medical imaging, which requires identifying and classifying features within an image. Supervised learning refers to training a neural network on a training dataset that includes ground truth, that is, a dataset for which the correct output corresponding to each input is known. Unsupervised learning refers to training a neural network with inputs only.

Segmentation of biomedical anatomy, such as in optical coherence tomography (OCT) retinal scans, is of great therapeutic importance, especially for disease diagnosis, progression analysis, and treatment planning. For example, progressive thinning of the circumpapillary retinal nerve fiber layer (cpRNFL), as measured by OCT, can be used to predict vision loss in patients with glaucoma.

Methods based on convolutional neural networks (CNNs) have achieved significant performance gains in segmenting medical and natural images. For example, such networks have been used to segment retinal structures in fundus and OCT images. Such fully supervised segmentation algorithms require a very large number of annotated images to achieve reasonable robustness and accuracy. However, obtaining pixel-wise ground truth can be time-consuming and costly in the medical imaging domain, where only experts can provide reliable annotations.

The scarcity of labeled data creates a need for effective semi-supervised learning methods that require only limited supervision. This disclosure describes a novel semi-supervised learning method that addresses this problem by leveraging large amounts of readily available unlabeled data together with limited labeled data.

Therefore, there is a need for techniques that address the above-mentioned problem.

Viewed from one aspect, the present invention provides a computer-implemented method including: training a teacher neural network using labeled images to obtain a trained teacher neural network, where each pixel of each of the labeled images is assigned a label indicating one of a set of classifications; providing a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, where each pixel of each of the soft-labeled images is assigned a soft label indicating one of the set of classifications and an uncertainty value associated with the soft label; training a student neural network with the labeled images and the soft-labeled set to obtain a trained student neural network; and using the trained student neural network to obtain student-labeled images from unlabeled images.

Viewed from a further aspect, the present invention provides a system including: a memory having computer-readable instructions; and one or more processors for executing the computer-readable instructions, the computer-readable instructions controlling the one or more processors to perform operations including: training a teacher neural network using labeled images to obtain a trained teacher neural network, where each pixel of each of the labeled images is assigned a label indicating one of a set of classifications; providing a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, where each pixel of each of the soft-labeled images is assigned a soft label indicating one of the set of classifications and an uncertainty value associated with the soft label; training a student neural network with a subset of the labeled images and the soft-labeled set to obtain a trained student neural network; and using the trained student neural network to obtain student-labeled images from unlabeled images.

Viewed from a further aspect, the present invention provides a computer program product including a computer-readable recording medium having program instructions embodied therein, the program instructions being executable by a processor to cause the processor to perform operations including: training a teacher neural network using labeled images to obtain a trained teacher neural network, where each pixel of each of the labeled images is assigned a label indicating one of a set of classifications; providing a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, where each pixel of each of the soft-labeled images is assigned a soft label indicating one of the set of classifications and an uncertainty value associated with the soft label; training a student neural network with a subset of the labeled images and the soft-labeled set to obtain a trained student neural network; and using the trained student neural network to obtain student-labeled images from unlabeled images.

Viewed from a further aspect, the present invention provides a computer program product for a neural network for image classification, the computer program product including a computer-readable recording medium that is readable by a processing circuit and that stores instructions for execution by the processing circuit to perform the steps of the present invention.

Viewed from a further aspect, the present invention provides a computer program stored on a computer-readable medium and loadable into the internal memory of a digital computer, the computer program including software code portions for performing the steps of the present invention when the program is run on a computer.

Embodiments of the present invention are directed to uncertainty-guided training of a semi-supervised neural network. A non-limiting example of a computer-implemented method includes training a teacher neural network using labeled images to obtain a trained teacher neural network, where each pixel of each of the labeled images is assigned a label indicating one of a set of classifications. The method also includes providing a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, where each pixel of each of the soft-labeled images is assigned a soft label indicating one of the set of classifications and an uncertainty value associated with the soft label, and training a student neural network with a subset of the labeled images and the soft-labeled set to obtain a trained student neural network. Student-labeled images are obtained from unlabeled images using the trained student neural network.

Other embodiments of the present invention implement features of the above-described method in computer systems and computer program products.

Additional technical features and benefits are realized through the techniques of the present invention. Embodiments and aspects of the invention are described in detail herein and are considered part of the claimed subject matter. For a better understanding, refer to the detailed description and the drawings.

The specifics of the exclusive rights described herein are particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features of the embodiments of the invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings.

FIG. 1 illustrates a process flow of a method for performing uncertainty-guided semi-supervised neural network training for image classification in accordance with one or more embodiments of the present invention.
FIG. 2 details the process of training a student neural network as part of performing uncertainty-guided semi-supervised neural network training for image classification in accordance with one or more embodiments of the present invention.
FIG. 3 illustrates a block diagram of a processing system for implementing uncertainty-guided semi-supervised neural network training for image classification in accordance with one or more embodiments of the present invention.

The diagrams depicted herein are illustrative. Many variations to the diagrams, or to the operations described in them, are possible without departing from the spirit of the invention. For example, operations may be performed in a different order, and operations may be added, deleted, or modified. Also, the term "coupled" and variations thereof describe having a communication path between two elements, and do not imply a direct connection between the elements with no intervening elements or connections between them. All of these variations are considered part of this specification.

As mentioned above, deep learning neural networks can be used to identify features in images and to classify those features. These neural networks can be trained to perform, for example, medical diagnoses based on classifying features in medical images. That is, features in medical images can be identified and labeled with one of a set of classifications. Supervised learning, which involves training a neural network using images that have already been labeled, can produce better results. However, obtaining a sufficiently large training dataset for supervised learning can be time-consuming and costly. In medical diagnostic applications, for example, only an expert can provide the pixel-wise ground truth for the images that supervised learning requires.

In a typical supervised learning scenario, a neural network is provided with an input set of already-classified images to produce a trained neural network. The larger the input set, the more accurate the resulting trained neural network tends to be. However, as also mentioned above, obtaining a large set of already-classified images is challenging. In medical diagnostic applications, for example, only an expert can determine the labels used to classify the input images.

One or more embodiments of the present invention relate to uncertainty-guided semi-supervised neural network training for image classification. As described in more detail below, a teacher neural network undergoes supervised learning using the available set of classified images. The resulting trained teacher neural network then classifies additional images to provide soft labels. Because unclassified images are more readily available than classified images, the set of soft-labeled images that can be generated from unclassified images using the trained teacher neural network can be larger than the set of classified images. The classified images and the soft-labeled images are then used to train a student neural network. Thus, the student neural network benefits from the use of both classified and soft-labeled images.

In accordance with an aspect of the present invention, the student neural network benefits from an uncertainty map that indicates the confidence level associated with each label generated by the trained teacher neural network. Although optical coherence tomography (OCT) retinal scans are specifically discussed for purposes of example, the one or more embodiments of the present invention detailed herein apply equally to any biomedical or other image having features that can be classified into particular categories. Other exemplary images include magnetic resonance images (MRIs), and other exemplary diagnoses relate to lung nodules and retinal blood vessels. In the exemplary case of images obtained from an OCT retinal scan, the OCT scan measures the thickness of the circumpapillary retinal nerve fiber layer (cpRNFL) and can therefore reveal thinning of the cpRNFL, which can be used to predict vision loss in patients with glaucoma.

FIG. 1 illustrates a process flow of a method 100 for performing uncertainty-guided semi-supervised neural network training for image classification according to one or more embodiments of the present invention. In block 110, labeled images D_l are obtained. As previously described, each pixel of each labeled image is labeled according to a classification applied to the image. For example, for labeled images D_l that are OCT scans, each pixel of each OCT scan can be labeled according to one of eight anatomical features of the eye. In block 120, a teacher neural network is trained on this set of labeled images D_l using Bayesian deep learning. The result of the training in block 120 is a trained teacher neural network that can receive an unlabeled image as input and output a soft label for each pixel of the image along with an uncertainty map. The uncertainty map indicates the uncertainty of each soft label. The labels output by the trained teacher neural network are referred to as soft labels, for explanatory purposes, to distinguish them from the labels in the labeled images D_l, which were annotated by an expert and therefore represent ground truth.
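As an illustration only, the following Python sketch shows one way a per-pixel soft label and uncertainty value could be derived from a Bayesian teacher. It assumes, beyond what the description states, that the teacher is evaluated several times stochastically (for example, with dropout left active) and that uncertainty is measured as the entropy of the averaged class probabilities. The function name and the toy data are hypothetical.

```python
import math

def soft_labels_and_uncertainty(samples):
    """Aggregate stochastic teacher outputs for one image.

    samples[s][t][c] is the probability of class c at pixel t in stochastic
    forward pass s. Returns (z, u): a soft label and an uncertainty per pixel.
    """
    n_passes = len(samples)
    n_pixels = len(samples[0])
    n_classes = len(samples[0][0])
    z, u = [], []
    for t in range(n_pixels):
        # Average the class probabilities over the stochastic passes.
        mean = [sum(samples[s][t][c] for s in range(n_passes)) / n_passes
                for c in range(n_classes)]
        # Soft label: most probable class under the averaged prediction.
        z.append(max(range(n_classes), key=lambda c: mean[c]))
        # Uncertainty (assumed measure): entropy of the averaged prediction.
        u.append(-sum(p * math.log(p) for p in mean if p > 0.0))
    return z, u

# Toy input: two passes, two pixels, two classes.
samples = [
    [[0.9, 0.1], [0.6, 0.4]],
    [[0.9, 0.1], [0.4, 0.6]],
]
z, u = soft_labels_and_uncertainty(samples)
```

In this toy input the passes agree on the first pixel and disagree on the second, so the second pixel receives the higher uncertainty value.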

The processes of blocks 130 to 160 are then performed iteratively to obtain a trained student neural network in block 170. These processes are summarized here and described in more detail below. The number of iterations can be based on the convergence of a validation loss, which is determined in each iteration using a particular validation dataset. In block 130, a subset x̂_u of the unlabeled images D_u is obtained. The number of unlabeled images in the subset x̂_u can be the same in each iteration. In block 140, the trained teacher neural network is applied to the subset x̂_u of the unlabeled images D_u, and soft-labeled images and uncertainty estimates are output in block 150. That is, for each unlabeled image in the subset x̂_u, the trained teacher neural network provides a soft label for each pixel and indicates the uncertainty of each soft label. Thus, a vector of soft labels z and a vector of uncertainty estimates (i.e., an uncertainty map u) are obtained in block 150 for each image in the subset x̂_u. Higher values in the uncertainty map u can indicate pixels whose soft labels are more likely to be inaccurate. These soft labels can then be correspondingly down-weighted when the student neural network is trained in block 160. In each iteration, the subset x̂_u with its soft labels and a subset x̂_l of the labeled images obtained in block 110 are used to train the student neural network in block 160. The process of training the student neural network in block 160 is detailed with reference to FIG. 2.
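The iteration over blocks 130 to 160 can be sketched as follows. This is a minimal outline under stated assumptions, not the patented implementation: `teacher_predict` and `student_update` are hypothetical callables standing in for the trained teacher and for one student training step, random sampling of the subsets is an assumption, and the toy teacher and data exist only to make the sketch runnable.

```python
import random

def run_iterations(labeled, unlabeled, teacher_predict, student_update,
                   batch_size, n_iters, seed=0):
    """Run a fixed number of iterations of blocks 130-160."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        x_u = rng.sample(unlabeled, batch_size)        # block 130: unlabeled subset
        soft = [teacher_predict(img) for img in x_u]   # blocks 140-150: (z, u) per image
        x_l = rng.sample(labeled, batch_size)          # labeled subset from block 110
        student_update(x_l, x_u, soft)                 # block 160: one training step

def toy_teacher(image):
    """Hypothetical teacher: thresholds each pixel and reports an uncertainty."""
    z = [1 if p > 0.5 else 0 for p in image]
    u = [1.0 - 2.0 * abs(p - 0.5) for p in image]      # largest near the threshold
    return z, u

calls = []

def toy_student_update(x_l, x_u, soft):
    """Hypothetical student step: only records the batch sizes it received."""
    calls.append((len(x_l), len(x_u), len(soft)))

labeled = [([0.1, 0.9], [0, 1]), ([0.8, 0.2], [1, 0]), ([0.7, 0.6], [1, 1])]
unlabeled = [[0.2, 0.7], [0.5, 0.95], [0.4, 0.1], [0.9, 0.3]]
run_iterations(labeled, unlabeled, toy_teacher, toy_student_update,
               batch_size=2, n_iters=3)
```

Keeping the teacher and the student behind plain callables leaves the network architecture and the optimizer unspecified, as they are in the description.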

FIG. 2 details the process of training the student neural network in block 160 (FIG. 1) as part of performing uncertainty-guided semi-supervised neural network training for image classification in accordance with one or more embodiments of the present invention. Uncertainty-guided semi-supervised training refers to using the soft labels z and the uncertainty map u provided in block 150 by the trained teacher neural network. As mentioned above, the uncertainty map u indicates the reliability of the soft label generated by the trained teacher neural network for each pixel of the subset x̂_u, and thus facilitates weighting of the soft labels in the training of the student neural network.

In block 210, obtaining a normalized confidence map ω refers to transforming the uncertainty map u output by the trained teacher neural network.
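The transformation itself is not written out in this text. One form that is consistent with the properties stated in the description (ω = 1 when α = 0, ω confined to the unit interval, and ω decreasing as the uncertainty grows) is an exponential decay. It is offered here as an assumption rather than as the equation of the original specification:

```latex
\omega = e^{-\alpha u}, \qquad \alpha \ge 0,\ u \ge 0
```

Here u is the per-pixel uncertainty and α is the positive scalar hyper-parameter discussed in the description.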

A positive scalar hyper-parameter α controls the information flow from the trained teacher neural network to the student neural network, and more specifically the use of the soft labels z. That is, if α is set to 0, all of the soft labels z are weighted equally (i.e., ω = 1), while α > 0 results in a probabilistic selection of the more reliable soft labels. Because the normalized confidence map ω ∈ [0, 1] provides a pixel-wise quality score for the soft labels z, higher uncertainty values produce lower quality scores (and a lower probability that the soft label will be used in training), and lower uncertainty values produce higher quality scores (and a higher probability that the soft label will be used in training). The value of α can be determined empirically. Using the soft labels z of the subset x̂_u obtained from the trained teacher neural network in addition to the expert-annotated labels of the subset x̂_l of labeled images enlarges the training dataset used to train the student neural network. Additionally, weighting the soft labels z based on the uncertainty map u obtained from the trained teacher neural network further improves the training of the student neural network.
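The effect of α can be illustrated with a short Python sketch. It assumes an exponential mapping from uncertainty to confidence, which satisfies the properties stated above but is not confirmed by this text; the function name and the sample values are hypothetical.

```python
import math

def confidence_map(u, alpha):
    """Map per-pixel uncertainties u to confidences in (0, 1].

    Assumed form: omega = exp(-alpha * u), with alpha >= 0 and u >= 0.
    """
    return [math.exp(-alpha * value) for value in u]

u = [0.0, 0.5, 2.0]                 # toy uncertainties for three pixels
w_equal = confidence_map(u, 0.0)    # alpha = 0: every soft label weighted equally
w_guided = confidence_map(u, 2.0)   # alpha > 0: uncertain pixels are down-weighted
```

With α = 0 every pixel keeps a weight of 1, while a larger α suppresses high-uncertainty pixels more strongly.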

In block 220, obtaining an output z^t_c from the student neural network refers to using the weighted soft labels, where t denotes the t-th pixel and c denotes the c-th class. A standard loss L_lab is used to describe the loss associated with the subset x̂_l of the labeled images, and the unlabeled loss L_unlab associated with the subset x̂_u is formulated as a confidence-weighted cross-entropy by:
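The equation is not reproduced in this text. A confidence-weighted cross-entropy that matches the surrounding description could take the following form, given as a hedged reconstruction rather than as the original formula:

```latex
L_{unlab} = -\sum_{c=1}^{C} \xi_c \sum_{t \in Z_c} \omega^{t} \log z^{t}_{c}
```

Here ω^t is assumed to denote the confidence of the t-th pixel, z^t_c the student output for pixel t and class c, and C, Z_c, and ξ_c are the quantities described in the text that follows.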

In the exemplary case of images that are OCT scans, the number of classes is eight (i.e., C = 8). Z_c denotes the pixel region of the c-th class in the soft label vector z, and ξ_c is given by:
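The expression for ξ_c is likewise missing from this text. Given that ξ_c is zero when the effective number of pixels in a class does not exceed P, one plausible form, stated as an assumption, normalizes each class by its total confidence:

```latex
\xi_c =
\begin{cases}
\dfrac{1}{\sum_{t \in Z_c} \omega^{t}} & \text{if } \sum_{t \in Z_c} \omega^{t} > P, \\[2ex]
0 & \text{otherwise.}
\end{cases}
```

In this reading, the sum of the confidences ω^t over the region Z_c plays the role of the effective number of pixels of class c.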

According to Equation 3, ξ_c = 0 if the effective number of pixels per class is less than or equal to P. This stabilizes the unlabeled loss when many of the pixels in Z_c are uncertain. The value of P is set empirically (e.g., P = 50 in the exemplary case of OCT scan images). The semi-supervised loss L_semisup is calculated in block 230 as the sum of the unlabeled loss L_unlab and the labeled loss L_lab: L_semisup = L_lab + L_unlab.
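A minimal Python sketch of the loss computation in block 230 follows. It assumes that the effective number of pixels of a class is the sum of the confidences over that class's region and that ξ_c is the reciprocal of that sum; both are assumptions, since the equations are not reproduced in this text. All function and parameter names are hypothetical.

```python
import math

def unlabeled_loss(student_probs, soft_labels, omega, n_classes, p_min):
    """Confidence-weighted cross-entropy over soft-labeled pixels.

    student_probs[t][c]: student probability of class c at pixel t.
    soft_labels[t]:      teacher soft label for pixel t.
    omega[t]:            confidence of the soft label at pixel t.
    p_min:               threshold P on the effective pixel count per class.
    """
    loss = 0.0
    for c in range(n_classes):
        region = [t for t, z in enumerate(soft_labels) if z == c]   # Z_c
        effective = sum(omega[t] for t in region)   # assumed effective pixel count
        if effective <= p_min:
            continue                                # xi_c = 0: class is skipped
        xi = 1.0 / effective                        # assumed per-class normalization
        loss -= xi * sum(
            omega[t] * math.log(max(student_probs[t][c], 1e-12)) for t in region
        )
    return loss

def semisup_loss(l_lab, l_unlab):
    """Semi-supervised loss: sum of the labeled and unlabeled losses."""
    return l_lab + l_unlab

probs = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
labels = [0, 0, 1]
weights = [1.0, 0.5, 0.2]
loss_strict = unlabeled_loss(probs, labels, weights, n_classes=2, p_min=1.0)
loss_loose = unlabeled_loss(probs, labels, weights, n_classes=2, p_min=0.0)
```

With `p_min=1.0` the second class is dropped because its total confidence is only 0.2, which illustrates the stabilizing role of the threshold P.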

The semi-supervised loss L_semisup is used in block 240 to update the parameters of the student neural network as part of the training process for the iteration.

Obtaining the output from the student neural network in block 220 and computing the semi-supervised loss in block 230 can be repeated for a particular training dataset to determine whether further iterations of blocks 130 to 160 (FIG. 1) are necessary. The particular training dataset, or validation images, includes unlabeled images and corresponding labeled images. The semi-supervised loss L_semisup described above is computed in block 230 using the image subsets x̂_u and x̂_l and is used in block 240 to update the parameters of the student neural network. The semi-supervised loss L_semisup computed in block 230 using the validation images is used to determine convergence. The iterations can be stopped when the semi-supervised loss L_semisup converges (e.g., when the difference in the semi-supervised loss L_semisup from one iteration to the next is less than or equal to a threshold). Iterations of the processes of blocks 130 to 160 can be performed until convergence is achieved or until a set number of iterations (e.g., 40,000) is reached.
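The stopping rule described above can be sketched in Python as follows. The threshold value, the function names, and the toy loss sequence are assumptions made for illustration; only the iteration cap of 40,000 and the comparison of consecutive losses against a threshold come from the description.

```python
def has_converged(losses, threshold):
    """True when the last two validation losses differ by at most threshold."""
    return len(losses) >= 2 and abs(losses[-1] - losses[-2]) <= threshold

def iterate_until_converged(run_one_iteration, threshold=1e-4, max_iters=40000):
    """Repeat training iterations until convergence or the iteration cap.

    run_one_iteration is a hypothetical callable that performs blocks 130-160
    once and returns the semi-supervised loss on the validation images.
    """
    losses = []
    for n in range(1, max_iters + 1):
        losses.append(run_one_iteration())
        if has_converged(losses, threshold):
            return n, losses
    return max_iters, losses

# Toy stand-in for training: the validation loss halves on every iteration.
state = {"loss": 1.0}

def toy_iteration():
    state["loss"] *= 0.5
    return state["loss"]

n_done, history = iterate_until_converged(toy_iteration)
```

Separating the convergence test from the loop makes it straightforward to substitute a different criterion, such as one based on a moving average of the loss.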

It is understood that one or more embodiments of the present invention can be implemented in conjunction with any other type of computing environment now known or later developed. FIG. 3 is a block diagram of a processing system 300 for implementing the techniques described herein. In the embodiment shown in FIG. 3, the processing system 300 has one or more central processing units (processors) 21a, 21b, 21c, etc. (collectively or generically referred to as processor(s) 21 and/or as processing device(s)). According to one or more embodiments of the present invention, each processor 21 can include a reduced instruction set computer (RISC) microprocessor. The processors 21 are coupled to system memory (e.g., random access memory (RAM) 24) and various other components via a system bus 33. A read-only memory (ROM) 22 is coupled to the system bus 33 and can include a basic input/output system (BIOS), which controls certain basic functions of the processing system 300.

Also shown are an input/output (I/O) adapter 27 and a communications adapter 26 coupled to the system bus 33. The I/O adapter 27 can be a small computer system interface (SCSI) adapter that communicates with a hard disk 23, a tape storage drive 25, or both. The I/O adapter 27, the hard disk 23, and the tape storage device 25 are collectively referred to herein as mass storage 34. An operating system 40 for execution on the processing system 300 can be stored in the mass storage 34. The RAM 24, the ROM 22, and the mass storage 34 are examples of memory 19 of the processing system 300. A network adapter 26 interconnects the system bus 33 with an external network 36, enabling the processing system 300 to communicate with other such systems.

A display (e.g., a display monitor) 35 is connected to the system bus 33 by a display adapter 32, which can include a graphics adapter to improve the performance of graphics-intensive applications and a video controller. According to one or more embodiments of the present invention, the adapters 26, 27, or 32, or a combination thereof, can be connected to one or more I/O buses that are connected to the system bus 33 via an intermediate bus bridge (not shown). Suitable I/O buses for connecting peripheral devices such as hard disk controllers, network adapters, and graphics adapters typically include common protocols such as Peripheral Component Interconnect (PCI). Additional input/output devices are shown connected to the system bus 33 via a user interface adapter 28 and the display adapter 32. A keyboard 29, a mouse 30, and a speaker 31 are interconnected with the system bus 33 via the user interface adapter 28, which can include, for example, a Super I/O chip that integrates multiple device adapters into a single integrated circuit.

In accordance with one or more embodiments of the present invention, processing system 300 includes a graphics processing unit 37. Graphics processing unit 37 is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display. In general, graphics processing unit 37 is very efficient at manipulating computer graphics and image processing, and has a highly parallel structure that makes it more effective than a general-purpose CPU for algorithms that process large blocks of data in parallel.

Thus, as configured herein, processing system 300 includes processing capability in the form of processor 21, storage capability including system memory (e.g., RAM 24) and mass storage 34, input means such as keyboard 29 and mouse 30, and output capability including speaker 31 and display 35. In accordance with one or more embodiments of the present invention, a portion of system memory (e.g., RAM 24) and mass storage 34 collectively store an operating system, such as the AIX® operating system from IBM Corporation, to coordinate the functions of the various components shown in processing system 300.

Various embodiments of the present invention have been described herein with reference to the related drawings. Alternative embodiments of the present invention can be devised without departing from the scope of the present invention. Various connections and positional relationships (e.g., over, below, adjacent, etc.) are set forth between elements in the description and in the drawings. These connections and/or positional relationships may be direct or indirect unless specified otherwise, and the present invention is not intended to be limiting in this respect. Accordingly, a coupling of entities may be either a direct or an indirect coupling, and a positional relationship between entities may be a direct or an indirect positional relationship. Moreover, the various tasks and process steps described herein may be incorporated into a more comprehensive procedure or process having additional steps or functionality not described herein.

One or more of the methods described herein may be implemented with any one or a combination of the following technologies, each of which is well known in the art: discrete logic circuit(s) having logic gates for implementing logic functions upon data signals, an application-specific integrated circuit (ASIC) having appropriate combinational logic gates, programmable gate array(s) (PGA), a field-programmable gate array (FPGA), etc.

For the sake of brevity, conventional techniques related to making and using aspects of the present invention may or may not be described in detail herein. In particular, various aspects of computing systems and specific computer programs for implementing the various technical features described herein are well known. Accordingly, in the interest of brevity, many conventional implementation details are only mentioned briefly herein or are omitted entirely without providing the well-known system and/or process details.

In some embodiments, various functions or acts can take place at a given location and/or in connection with the operation of one or more apparatuses or systems. In some embodiments, a portion of a given function or act can be performed at a first device or location, and the remainder of the function or act can be performed at one or more additional devices or locations.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step-plus-function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the present disclosure. The embodiments were chosen and described in order to explain the principles of the disclosure and its practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

The diagrams depicted herein are illustrative. There can be many variations to the diagrams or the steps (or operations) described therein without departing from the scope of the disclosure. For example, the actions can be performed in a differing order, or actions can be added or modified. Also, the term "coupled" describes having a signal path between two elements and does not imply a direct connection between the elements with no intervening elements or connections between them. All of these variations are considered a part of the present disclosure.

The following definitions and abbreviations are to be used for the interpretation of the claims and the specification. As used herein, the terms "comprises," "comprising," "includes," "including," "has," "having," "contains," or "containing," or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a composition, mixture, process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements, but can include other elements not expressly listed or inherent to such composition, mixture, process, method, article, or apparatus.

Additionally, the term "exemplary" is used herein to mean "serving as an example." Any embodiment or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. The terms "at least one" and "one or more" are understood to include any integer greater than or equal to one, i.e., 1, 2, 3, 4, etc. The term "a plurality" is understood to include any integer greater than or equal to two, i.e., 2, 3, 4, 5, etc. The term "connection" can include both an indirect "connection" and a direct "connection."

The terms "about," "substantially," "approximately," and variations thereof are intended to include the degree of error associated with measurement of a particular quantity based upon the equipment available at the time of filing the application. For example, "about" can include a range of ±8%, ±5%, or ±2% of a given value.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable recording medium (or media) having computer-readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.

The computer-readable recording medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable recording medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable recording medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable recording medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable recording medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable recording medium within the respective computing/processing device.

Computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable recording medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable recording medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special-purpose hardware and computer instructions.

The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the present disclosure. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A computer-implemented method, comprising:
training a teacher neural network using labeled images to obtain a trained teacher neural network, wherein each pixel of each of the labeled images is assigned a label indicating one of a set of classifications;
providing a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, wherein each pixel of the generated soft-labeled images is given a soft label indicating one of the set of classifications and an uncertainty value indicating a confidence level of the soft label, the soft label being provided by the trained teacher neural network;
training a student neural network using a subset of the labeled images and the set of soft-labeled images, based on a confidence map derived from the uncertainty values of the soft-labeled images, to obtain a trained student neural network; and
obtaining student-labeled images from unlabeled images using the trained student neural network.

2. The computer-implemented method of claim 1, wherein being based on the confidence map comprises weighting each class according to the number of significant pixels per class based on the confidence map.

3. The computer-implemented method of claim 1, further comprising: iteratively training the student neural network by providing unlabeled images to the trained teacher neural network to generate different sets of soft-labeled images, using a different set of soft-labeled images and a different subset of the labeled images in each iteration.

4. The computer-implemented method of claim 3, further comprising: calculating a loss value from an output of the student neural network at each iteration, wherein training the student neural network comprises updating parameters of the student neural network at each iteration based on the loss value.

5. The computer-implemented method of claim 4, wherein the iterative training of the student neural network is performed until convergence of the loss value calculated for the student neural network at each iteration is achieved or until a predefined number of iterations is reached.

6. The computer-implemented method of claim 1, further comprising weighting the soft-labeled images generated by the trained teacher neural network based on the uncertainty value associated with the soft label assigned to each pixel.

7. The computer-implemented method of claim 6, wherein training the student neural network with the set of soft-labeled images includes filtering out soft labels of pixels based on weights determined from the weighting.

8. The computer-implemented method of any one of claims 1 to 7, wherein the set of classifications is indicative of anatomical features of the eye, and the computer-implemented method further includes making an eye diagnosis based on the student-labeled images obtained by inputting an optical coherence tomography (OCT) retinal scan as the unlabeled image.

9. A system comprising:
a memory having computer-readable instructions; and
one or more processors for executing the computer-readable instructions, the computer-readable instructions controlling the one or more processors to:
train a teacher neural network using labeled images to obtain a trained teacher neural network, wherein each pixel of each of the labeled images is assigned a label indicating one of a set of classifications;
provide a set of unlabeled images to the trained teacher neural network to generate a set of soft-labeled images, wherein each pixel of the generated soft-labeled images is given a soft label indicating one of the set of classifications and an uncertainty value indicating a confidence level of the soft label, the soft label being provided by the trained teacher neural network;
train a student neural network using a subset of the labeled images and the set of soft-labeled images, based on a confidence map derived from the uncertainty values of the soft-labeled images, to obtain a trained student neural network; and
obtain student-labeled images from unlabeled images using the trained student neural network.

10. The system of claim 9, further comprising: iteratively training the student neural network by providing unlabeled images to the trained teacher neural network to generate different sets of soft-labeled images, using a different set of soft-labeled images and a different subset of the labeled images in each iteration.

11. The system of claim 10, further comprising: calculating a loss value from an output of the student neural network at each iteration, wherein training the student neural network comprises updating parameters of the student neural network at each iteration based on the loss value.

12. The system of claim 11, wherein the iterative training of the student neural network is performed until convergence of the loss value calculated for the student neural network at each iteration is achieved or until a predefined number of iterations is reached.

13. The system of any one of claims 9 to 12, further comprising weighting the soft-labeled images generated by the trained teacher neural network based on the uncertainty value associated with the soft label assigned to each pixel.

14. The system of claim 13, wherein training the student neural network with the set of soft-labeled images includes filtering out soft labels of pixels based on weights determined from the weighting.

15. A computer program stored on a computer-readable medium and loadable into the internal memory of a digital computer, the computer program comprising software code portions for carrying out the method according to any one of claims 1 to 8 when said program runs on a computer.
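
By way of illustration only, and without limiting the claims, the uncertainty-guided weighting, the filtering of soft labels, and the stopping condition recited above might be sketched as follows. The sketch is not the implementation of the disclosed embodiments. The function names, the mapping from uncertainty to confidence (here exp(-uncertainty)), the filtering threshold of 0.5, the convergence tolerance, and the use of a weighted cross-entropy loss are all assumptions introduced for this example; the claims do not specify them.

```python
import math
from typing import List, Sequence


def confidence_map(uncertainty: Sequence[float], threshold: float = 0.5) -> List[float]:
    """Turn per-pixel uncertainty values into confidence weights.

    Assumption: confidence = exp(-uncertainty). Pixels whose confidence is
    below `threshold` are filtered out by giving them a weight of zero.
    """
    weights = []
    for u in uncertainty:
        c = math.exp(-u)
        weights.append(c if c >= threshold else 0.0)
    return weights


def weighted_cross_entropy(
    probs: Sequence[Sequence[float]],
    soft_labels: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Confidence-weighted cross-entropy over the pixels of one image.

    probs[i][k] is the student's predicted probability of class k at pixel i;
    soft_labels[i] is the class index the teacher assigned to pixel i.
    """
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # every soft label was filtered out
    loss = 0.0
    for p, label, w in zip(probs, soft_labels, weights):
        loss -= w * math.log(max(p[label], 1e-12))  # clamp to avoid log(0)
    return loss / total_weight


def should_stop(losses: Sequence[float], max_iters: int, tol: float = 1e-4) -> bool:
    """Stop at a predefined iteration count or once the loss has converged."""
    if len(losses) >= max_iters:
        return True
    return len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol


# Hypothetical three-pixel, two-class example.
uncertainty = [0.1, 2.0, 0.3]            # teacher uncertainty per pixel
weights = confidence_map(uncertainty)    # the second pixel is filtered out
probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
soft_labels = [0, 1, 1]
loss = weighted_cross_entropy(probs, soft_labels, weights)
```

In this sketch a pixel with a high uncertainty value contributes nothing to the student's loss, while the remaining pixels contribute in proportion to their confidence; other mappings from uncertainty to weight would serve the same purpose.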