JP7732299B2

JP7732299B2 - Learning device, learning method, and program

Info

Publication number: JP7732299B2
Application number: JP2021152315A
Authority: JP
Inventors: 恭史国定
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2021-09-17
Filing date: 2021-09-17
Publication date: 2025-09-02
Anticipated expiration: 2041-09-17
Also published as: JP2023044336A

Description

The present invention relates to a learning device, a learning method, and a program.

Neural networks (hereinafter also referred to as "NNs") achieve high performance in tasks such as image recognition. However, an NN generally consists of a huge number of parameters and a complex model, which makes it difficult to interpret the relationship between the NN's parameters and its outputs. To address this problem, several methods have been proposed for obtaining highly interpretable NNs. Note that "high interpretability" can also be rephrased as "a high degree of agreement with human intuition."

For example, one known method manually attaches heat map labels indicating the regions that an NN model should focus on when making a decision, and then trains the model so that its heat map matches those labels, thereby obtaining a model that is easy for humans to interpret (see, for example, Non-Patent Document 1). Furthermore, if the heat map obtained from a model has low interpretability, a model with higher interpretability can be obtained by retraining the model so that its heat map does not match that heat map.

Another known method improves the accuracy of an NN by introducing into the network a mechanism that extracts, from the input data, the regions of interest that the NN uses to make its decision (see, for example, Non-Patent Document 2). A human can correct the regions of interest obtained by this method, and the NN can then be retrained so that its regions of interest match the corrected ones, which improves both the interpretability and the accuracy of the NN.

Non-Patent Document 1: Andrew Ross and two others, "Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations," [online], [Retrieved September 8, 2021], Internet <https://arxiv.org/abs/1703.03717>
Non-Patent Document 2: Masahiro Mitsuhara and six others, "Embedding Human Knowledge into Deep Neural Network via Attention Map," [online], [Retrieved September 8, 2021], Internet <https://arxiv.org/abs/1905.03540>
Non-Patent Document 3: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization," [online], [Retrieved September 8, 2021], Internet <https://arxiv.org/abs/1610.02391v3>

However, the methods described in Non-Patent Document 1 and Non-Patent Document 2, in which heat map labels are prepared manually, incur a high human cost for labeling.

On the other hand, one method that does not require labeling is the method described in Non-Patent Document 1 in which a model is retrained so that its heat map does not match the heat map of a trained model. However, one problem with this method is that retraining is likely to reduce accuracy. A further problem is that this method lowers the degree of heat map agreement uniformly across all data, so for some individual data items it may actually reduce the interpretability of the heat map.

It is therefore desirable to provide a technique that makes it possible to obtain a model with high interpretability and accuracy while keeping human cost low.

In order to solve the above problem, according to one aspect of the present invention, there is provided a learning device comprising: an input unit that acquires first input data and a correct value of the first input data; an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value; an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value; an explanation evaluation unit that obtains an explanation evaluation result based on the degree of agreement between the pieces of first explanation information corresponding to the respective inference models; and an update unit that updates first weight parameters of the plurality of inference models, based on the inference evaluation result and the explanation evaluation result, so that the degree of agreement between the pieces of first explanation information becomes smaller, wherein the input unit acquires second input data, the inference unit outputs, based on the second input data and a plurality of trained models that are the plurality of estimation models after the first weight parameters have been updated, a second inference value corresponding to each of the plurality of trained models, the explanation unit outputs second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value, and the learning device further comprises: a presentation control unit that controls presentation of the second inference value and the second explanation information to a user; and an operation unit that receives, from the user, selected model information indicating one or more trained models selected by the user from the plurality of trained models.

The learning device may include a recording control unit that controls the recording of information indicating the one or more trained models selected by the user from the plurality of trained models.

The explanation evaluation result may take a smaller value as the degree of agreement between the pieces of first explanation information corresponding to the respective inference models becomes greater.

The explanation evaluation unit may obtain the explanation evaluation result based on the inner product of vectors obtained by normalizing the first explanation information corresponding to each of the plurality of inference models.
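The inner-product measure described above can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the function names, the use of L2 normalization, and the averaging over all model pairs are assumptions made for the example. It computes the degree of agreement itself; how that agreement is mapped onto the explanation evaluation result is left open here.

```python
import math

def flatten(heatmap):
    """Flatten a 2-D heat map (list of rows) into a 1-D list."""
    return [v for row in heatmap for v in row]

def normalize(vec, eps=1e-12):
    """Scale a vector to (almost) unit L2 norm; eps guards an all-zero map."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def pairwise_agreement(heatmaps):
    """Mean inner product of the normalized heat maps over all model pairs.

    Assumes at least two heat maps of identical shape (n > 1 models).
    """
    vecs = [normalize(flatten(h)) for h in heatmaps]
    total, pairs = 0.0, 0
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            total += sum(a * b for a, b in zip(vecs[i], vecs[j]))
            pairs += 1
    return total / pairs

# Hypothetical 2x2 heat maps for illustration.
top_left = [[1.0, 0.0], [0.0, 0.0]]
bottom_right = [[0.0, 0.0], [0.0, 1.0]]
print(pairwise_agreement([top_left, top_left]))      # close to 1: identical maps
print(pairwise_agreement([top_left, bottom_right]))  # 0.0: no shared region
```

For non-negative heat maps the value lies between 0 and 1, so models that attend to different regions score near 0.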

The explanation evaluation unit may, for each of the plurality of inference models, binarize the first explanation information to generate a mask, calculate the product of the first explanation information corresponding to that inference model and a mask generated from the first explanation information corresponding to another inference model, and obtain the explanation evaluation result based on the sum of these products over the plurality of inference models.
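The mask-based calculation above might look like the following sketch. The threshold value, the element-wise interpretation of "product", and all function names are assumptions introduced for illustration; the text does not specify them.

```python
def binarize(heatmap, threshold=0.5):
    """Turn a heat map into a 0/1 mask (threshold is an assumed value)."""
    return [[1.0 if v >= threshold else 0.0 for v in row] for row in heatmap]

def masked_overlap(heatmaps, threshold=0.5):
    """Sum, over every model, of its own heat map multiplied element-wise
    by the masks generated from the other models' heat maps."""
    masks = [binarize(h, threshold) for h in heatmaps]
    total = 0.0
    for i, own in enumerate(heatmaps):
        for j, mask in enumerate(masks):
            if i == j:
                continue  # skip the mask generated from the model itself
            total += sum(
                m * v
                for mask_row, own_row in zip(mask, own)
                for m, v in zip(mask_row, own_row)
            )
    return total

# Hypothetical heat maps: the first two share the top-left region.
h1 = [[0.9, 0.1], [0.0, 0.0]]
h2 = [[0.8, 0.0], [0.0, 0.6]]
h3 = [[0.0, 0.0], [0.7, 0.0]]
print(masked_overlap([h1, h2]))  # about 1.7 (0.9 from h1, 0.8 from h2)
print(masked_overlap([h1, h3]))  # 0.0: the highlighted regions are disjoint
```

Because each mask keeps only the region another model considers important, the sum grows only where two models highlight the same region.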

The explanation unit may include a function through which error backpropagation is possible.

The explanation unit may have a second weight parameter, and the update unit may update the second weight parameter by error backpropagation.

At least one of the plurality of inference models may include a neural network. Note that a neural network is merely one example of a machine learning algorithm. Therefore, another machine learning algorithm may be used instead of a neural network.

The update unit may update the first weight parameters based on the sum of the inference evaluation result and the explanation evaluation result.
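As a rough illustration of adding the two evaluation results, the sketch below uses cross-entropy as the inference evaluation. That choice, the function names, and the placeholder value for the explanation evaluation result are all assumptions; the text only states that the two results are added.

```python
import math

def cross_entropy(probs, correct_class, eps=1e-12):
    """Assumed inference evaluation: negative log-probability of the correct class."""
    return -math.log(probs[correct_class] + eps)

def combined_evaluation(inference_evals, explanation_eval):
    """Sum of the per-model inference evaluations and the explanation evaluation."""
    return sum(inference_evals) + explanation_eval

# Hypothetical class probabilities from two inference models; correct class is 0.
outputs = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
inference_evals = [cross_entropy(p, correct_class=0) for p in outputs]
explanation_eval = 0.25  # placeholder value, e.g. from an agreement measure
total = combined_evaluation(inference_evals, explanation_eval)
print(round(total, 4))
```

The update unit would then adjust the first weight parameters based on this single scalar.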

The first explanation information may be a heat map indicating the magnitude of the contribution of the first input data to the first inference value.

According to another aspect of the present invention, there is provided a learning method including: acquiring first input data and a correct value of the first input data; outputting, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; outputting first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value; obtaining an inference evaluation result based on the correct value and the first inference value; obtaining an explanation evaluation result based on the degree of agreement between the pieces of first explanation information corresponding to the respective inference models; updating first weight parameters of the plurality of inference models, based on the inference evaluation result and the explanation evaluation result, so that the degree of agreement between the pieces of first explanation information becomes smaller; acquiring second input data; outputting, based on the second input data and a plurality of trained models that are the plurality of estimation models after the first weight parameters have been updated, a second inference value corresponding to each of the plurality of trained models; outputting second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value; controlling presentation of the second inference value and the second explanation information to a user; and receiving, from the user, selected model information indicating one or more trained models selected by the user from the plurality of trained models.

According to another aspect of the present invention, there is provided a program that causes a computer to function as a learning device comprising: an input unit that acquires first input data and a correct value of the first input data; an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value; an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value; an explanation evaluation unit that obtains an explanation evaluation result based on the degree of agreement between the pieces of first explanation information corresponding to the respective inference models; and an update unit that updates first weight parameters of the plurality of inference models, based on the inference evaluation result and the explanation evaluation result, so that the degree of agreement between the pieces of first explanation information becomes smaller, wherein the input unit acquires second input data, the inference unit outputs, based on the second input data and a plurality of trained models that are the plurality of estimation models after the first weight parameters have been updated, a second inference value corresponding to each of the plurality of trained models, the explanation unit outputs second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value, and the learning device further comprises: a presentation control unit that controls presentation of the second inference value and the second explanation information to a user; and an operation unit that receives, from the user, selected model information indicating one or more trained models selected by the user from the plurality of trained models.

As described above, the present invention provides a technique that makes it possible to obtain a model with high interpretability and accuracy while keeping human cost low.

FIG. 1 is a diagram showing an example of the functional configuration of a learning device according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a method of obtaining an explanation evaluation result by multiplying a mask, obtained by binarizing a heat map, with another heat map.
FIG. 3 is a flowchart showing an example of the operation of the learning device according to the embodiment in the learning stage.
FIG. 4 is a flowchart showing an example of the operation of the learning device according to the embodiment in the testing stage.
FIG. 5 is a diagram showing the hardware configuration of an information processing device as an example of the learning device.

Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings. In this specification and the drawings, components that have substantially the same functional configuration are given the same reference numerals, and redundant descriptions are omitted.

In addition, in this specification and the drawings, multiple components having substantially the same functional configuration may be distinguished by appending different numbers to the same reference numeral. However, when there is no particular need to distinguish between such components, only the same reference numeral is used. Similarly, similar components in different embodiments may be distinguished by appending different letters to the same reference numeral. However, when there is no particular need to distinguish between similar components in different embodiments, only the same reference numeral is used.

(0. Overview of the embodiment)
An overview of an embodiment of the present invention will be described. The embodiment describes a learning device that trains a neural network based on combinations of input data (training data) and correct values. However, a neural network is merely one example of a machine learning algorithm. Therefore, another machine learning algorithm may be used instead of a neural network. For example, an SVM (Support Vector Machine) may be used as another example of a machine learning algorithm.

(1. Details of the embodiment)
An embodiment of the present invention will now be described in detail.

(1.1. Example of the configuration of the learning device)
FIG. 1 is a diagram showing an example of the functional configuration of a learning device 10 according to an embodiment of the present invention. As shown in FIG. 1, the learning device 10 includes an input unit 101, an inference unit 102, an explanation unit 103, an inference evaluation unit 104, an explanation evaluation unit 105, an update unit 106, a presentation control unit 107, a recording control unit 108, a display unit 121, and an operation unit 122.

The embodiment of the present invention mainly assumes that the inference unit 102 includes n inference models (n is an integer greater than 1), that is, a "first inference model" through an "n-th inference model." The embodiment also mainly assumes that each of the first to n-th inference models includes a neural network. Hereinafter, a neural network is also referred to as an "NN."

The NN included in each of the first to n-th inference models uses weight parameters 110 (first weight parameters). The NNs included in the first to n-th inference models may share a common structure and differ only in the weight parameters 110 (first weight parameters) they use. Alternatively, the NNs included in the first to n-th inference models may have different structures.

Note that it is sufficient for at least one of the first to n-th inference models to include an NN. For example, some of the first to n-th inference models may include an NN, while the others may include another machine learning algorithm instead of an NN.

Furthermore, the embodiment mainly assumes that the explanation unit 103 includes an NN. The NN included in the explanation unit 103 uses weight parameters (second weight parameters).

The dataset 100, the weight parameters 110 (first weight parameters) of the first to n-th inference models, and the weight parameters (second weight parameters) of the explanation unit 103 are stored in a storage unit (not shown). This storage unit may be configured with memory such as RAM (Random Access Memory), a hard disk drive, or flash memory.

The input unit 101, inference unit 102, explanation unit 103, inference evaluation unit 104, explanation evaluation unit 105, update unit 106, presentation control unit 107, and recording control unit 108 include a computing device such as a CPU (Central Processing Unit) or GPU (Graphics Processing Unit), and their functions can be realized when a program stored in ROM (Read Only Memory) is loaded into RAM and executed by the computing device. A computer-readable recording medium on which the program is recorded can also be provided. Alternatively, these blocks may be configured with dedicated hardware or with a combination of multiple pieces of hardware. Data required for calculations by the computing device is stored as appropriate in a storage unit (not shown).

In the initial state, initial values are set for the weight parameters 110 of the first to n-th inference models and for the weight parameters of the explanation unit 103. These initial values may be random values, but any values may be used. For example, they may be trained values obtained in advance through learning.

(Dataset 100)
The dataset 100 includes a plurality of pieces of input data (first input data) used in the learning stage and the correct value for each of them. The input data used in the learning stage corresponds to training data. The dataset 100 further includes a plurality of pieces of input data (second input data) used in the testing stage. The input data used in the testing stage corresponds to test data.

It is mainly assumed that the test data is prepared separately from the training data. However, the test data may include part of the training data.

The embodiment also mainly assumes that the input data is image data (in particular, still image data). However, the type of input data is not particularly limited, and data other than image data can also be used. For example, the input data may be video data containing multiple frames, or audio data.

(Input unit 101)
In the learning stage, the input unit 101 sequentially acquires from the dataset 100 the combinations of input data and correct values used in the learning stage, and sequentially outputs these combinations to the inference unit 102. In the testing stage, the input unit 101 sequentially acquires from the dataset 100 the input data used in the testing stage, and sequentially outputs this input data to the inference unit 102.

For example, when the input unit 101 has acquired and output all of the combinations of input data and correct values used in the learning stage from the dataset 100, it may repeat, a predetermined number of times, the operation of acquiring the combinations again from the beginning and outputting them again. In this case, the blocks downstream of the input unit 101 may also repeat their respective processes in sequence based on the repeated input. On the other hand, when the input unit 101 has acquired and output all of the input data used in the testing stage from the dataset 100, it may end the acquisition of input data.
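The acquisition order described above can be sketched with two generators. The function names and the toy dataset are hypothetical; the sketch only illustrates repeating the training pairs a predetermined number of times and passing over the test inputs once.

```python
def iterate_training_pairs(dataset, num_repeats):
    """Yield every (input data, correct value) pair, restarting from the
    beginning until the dataset has been traversed num_repeats times."""
    for _ in range(num_repeats):
        for input_data, correct_value in dataset:
            yield input_data, correct_value

def iterate_test_inputs(test_inputs):
    """Yield each test input exactly once, then stop."""
    for input_data in test_inputs:
        yield input_data

# Hypothetical dataset: two labelled items for training, two test inputs.
training_set = [("img_a", 0), ("img_b", 1)]
test_set = ["img_c", "img_d"]

emitted = list(iterate_training_pairs(training_set, num_repeats=3))
print(len(emitted))                         # 6
print(list(iterate_test_inputs(test_set)))  # ['img_c', 'img_d']
```

Each full traversal of the training pairs corresponds to what is commonly called one epoch.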

(Inference unit 102)
In the learning stage, the inference unit 102 obtains an inference value (first inference value) corresponding to each of the first to n-th inference models, based on the input data received from the input unit 101 and on the first to n-th inference models. Similarly, in the testing stage, the inference unit 102 obtains an inference value (second inference value) corresponding to each of the first to n-th inference models, based on the input data received from the input unit 101 and on the first to n-th inference models.

The weight parameters 110 used by the first to n-th inference models are stored in a storage unit (not shown). The inference unit 102 therefore acquires the weight parameters 110 from the storage unit and performs inference with the first to n-th inference models based on the acquired weight parameters 110 and the input data received from the input unit 101.

In this specification, obtaining an output from an NN based on an input to the NN is broadly referred to as "inference."

As an example, if Fi (i is an integer from 1 to n) denotes the function representing the i-th inference model and x denotes the input to the i-th inference model, the output of the i-th inference model can be expressed as Fi(x).
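To make the notation Fi(x) concrete, the following sketch holds n models that share one structure and differ only in their weights, and applies all of them to the same input. The class name, the linear form of the models, and the weights are assumptions chosen to keep the example small.

```python
def make_linear_model(weights):
    """Build a toy model F(x) = sum_k w_k * x_k with the given weights."""
    def model(x):
        return sum(w * v for w, v in zip(weights, x))
    return model

class InferenceUnit:
    """Holds n inference models (n > 1) and runs them on the same input."""

    def __init__(self, models):
        if len(models) < 2:
            raise ValueError("at least two inference models are expected")
        self.models = list(models)

    def infer(self, x):
        """Return [F1(x), ..., Fn(x)] for the input x."""
        return [f(x) for f in self.models]

# Same structure, different (hypothetical) weight parameters.
unit = InferenceUnit([
    make_linear_model([1.0, 0.0]),
    make_linear_model([0.0, 2.0]),
])
print(unit.infer([3.0, 4.0]))  # [3.0, 8.0]
```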

As will be described later, some of the explanation methods (that is, methods of generating explanation information) that the explanation unit 103 may use require, in addition to the inference value, information such as the features (intermediate features) output by each of the first to n-th inference models. In such cases, the inference unit 102 may output to the explanation unit 103, together with the inference values, the features output from the intermediate layers of each of the first to n-th inference models.

The specific configuration of the first to n-th inference models is not particularly limited. However, the output format of each of the first to n-th inference models should preferably match the format of the correct values corresponding to the input data. For example, if the correct value is a class in a classification problem, the output of each of the first to n-th inference models should preferably be a one-hot vector whose length equals the number of classes.
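The output format mentioned above can be illustrated with a short sketch. The helper names are hypothetical, and reading the predicted class as the index of the largest output element is an assumption about how such a vector would be interpreted.

```python
def one_hot(class_index, num_classes):
    """Encode a class as a vector of length num_classes with a single 1."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    return [1.0 if k == class_index else 0.0 for k in range(num_classes)]

def predicted_class(output):
    """Index of the largest element of a model output vector."""
    return max(range(len(output)), key=lambda k: output[k])

print(one_hot(2, 4))                     # [0.0, 0.0, 1.0, 0.0]
print(predicted_class([0.1, 0.7, 0.2]))  # 1
```

Keeping the model output and the correct value in the same shape lets the inference evaluation compare them element by element.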

In the learning stage, the inference unit 102 outputs the inference value corresponding to each of the first to n-th inference models to both the explanation unit 103 and the inference evaluation unit 104. In the testing stage, the inference unit 102 outputs the inference value corresponding to each of the first to n-th inference models to both the explanation unit 103 and the presentation control unit 107.

（説明部１０３）
説明部１０３は、第１推論モデルから第ｎ推論モデルまでのそれぞれについて、推論部１０２から入力された推論値の判断根拠を説明する説明情報を生成する。 (Explanation unit 103)
The explanation unit 103 generates explanation information that explains the basis for determining the inference values input from the inference unit 102 for each of the first to n-th inference models.

ここで、説明情報は、推論部１０２から入力された推論値に対する入力データの寄与の大きさを示す情報である。以下では、説明情報が推論値に対する入力データの寄与の大きさを領域（例えば、画像を構成するピクセルなど）または変数ごとに示すヒートマップである場合について主に説明する。ヒートマップによれば、入力データのうち判断に寄与した重要な領域または変数が示され得る。 Here, the explanatory information is information indicating the magnitude of the contribution of the input data to the inferred value input from the inference unit 102. Below, we will mainly explain the case where the explanatory information is a heat map that indicates the magnitude of the contribution of the input data to the inferred value for each region (e.g., pixels that make up an image) or variable. The heat map can indicate important regions or variables of the input data that contributed to the judgment.

入力データが画像データなどである場合には、ヒートマップは２次元ベクトルによって表現され得る。あるいは、入力データが表形式データなどである場合には、ヒートマップは１次元ベクトルによって表現され得る。 If the input data is image data, the heat map can be represented by a two-dimensional vector. Alternatively, if the input data is tabular data, the heat map can be represented by a one-dimensional vector.

ヒートマップはどのように生成されてもよい。例えば、説明部１０３は、推論部１０２から入力された推論値に基づいて、ヒートマップを生成してもよい。あるいは、上記したように、推論部１０２から説明部１０３に推論値だけではなく特徴量も入力される場合があり得る。かかる場合には、説明部１０３は、推論部１０２から入力された推論値と特徴量とに基づいて、ヒートマップを生成してもよい。 The heat map may be generated in any manner. For example, the explanation unit 103 may generate a heat map based on the inferred values input from the inference unit 102. Alternatively, as described above, there may be cases where not only the inferred values but also feature quantities are input from the inference unit 102 to the explanation unit 103. In such cases, the explanation unit 103 may generate a heat map based on the inferred values and feature quantities input from the inference unit 102.

例えば、説明部１０３は、誤差逆伝播が可能な関数を含んでいてもよい。このとき、後に説明するように、更新部１０６によって説明部１０３が有する重みパラメータが誤差逆伝播法によって更新され得る。すなわち、説明部１０３は、誤差逆伝播法による更新後の重みパラメータによってヒートマップを生成してもよい。 For example, the explanation unit 103 may include a function capable of backpropagation. In this case, as will be described later, the weight parameters held by the explanation unit 103 may be updated by the update unit 106 using the backpropagation method. In other words, the explanation unit 103 may generate a heat map using the weight parameters updated using the backpropagation method.

誤差逆伝播法による更新後の重みパラメータによってヒートマップを生成する説明手法としては、非特許文献３に記載された、いわゆるＧｒａｄ－ＣＡＭなどが適用され得る。Ｇｒａｄ－ＣＡＭは、ＮＮへの入力のうち推論値への寄与度が高い領域を示すヒートマップを出力する説明手法である。その他にも、ＶａｎｉｌｌａＧｒａｄｉｅｎｔ、ＳｍｏｏｔｈＧｒａｄといった各種の説明手法が適用され得る。 The so-called Grad-CAM method described in Non-Patent Document 3 can be used as an explanation method for generating a heat map using weight parameters updated using the backpropagation method. Grad-CAM is an explanation method that outputs a heat map that shows the areas of the input to a neural network that have a high contribution to the inferred value. Other explanation methods that can be used include Vanilla Gradient and SmoothGrad.
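
As a minimal sketch of the idea behind a gradient-based explanation method such as Vanilla Gradient, the following treats the absolute gradient of the inference value with respect to each input variable as that variable's contribution. Real implementations obtain the gradient by backpropagation. The central-difference approximation used here, the function name `vanilla_gradient_heatmap`, and the linear `toy_model` are assumptions made only for illustration. This is not Grad-CAM.

```python
from typing import Callable, List


def vanilla_gradient_heatmap(
    model: Callable[[List[float]], float],
    x: List[float],
    eps: float = 1e-5,
) -> List[float]:
    """Approximate |d model(x) / d x_k| for every input variable k.

    Central differences stand in for backpropagation (an assumption made
    so that the sketch needs no autodiff library).
    """
    heatmap = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += eps
        x_minus[k] -= eps
        grad_k = (model(x_plus) - model(x_minus)) / (2.0 * eps)
        heatmap.append(abs(grad_k))
    return heatmap


# Hypothetical toy inference model: a fixed linear score over three variables.
def toy_model(x: List[float]) -> float:
    return 2.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]


# One-dimensional heat map, as for tabular input data.
heatmap = vanilla_gradient_heatmap(toy_model, [1.0, 2.0, 3.0])
```

For this toy model the heat map assigns the largest contribution to the first variable and none to the third, which the model ignores.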

上記したように、ｉ番目の推論モデルに対応する推論値はＦｉ（ｘ）と表現され得るため、一例として、ヒートマップの生成処理を示す関数をＧとすると、説明部１０３によって生成されるｉ番目の推論モデルに対応するヒートマップＴｉ（ｘ）は、以下の式（１）のように表現され得る。 As mentioned above, the inference value corresponding to the i-th inference model can be expressed as Fi(x). As an example, if the function indicating the heat map generation process is G, the heat map Ti(x) corresponding to the i-th inference model generated by the explanation unit 103 can be expressed as in the following equation (1).

Ｔｉ（ｘ）＝Ｇ（Ｆｉ（ｘ））・・・（１） Ti(x)=G(Fi(x))...(1)

説明部１０３は、学習段階において、生成したｎ個のヒートマップ（第１の説明情報）を説明評価部１０５に出力する。一方、説明部１０３は、テスト段階において、生成したｎ個のヒートマップ（第２の説明情報）を提示制御部１０７に出力する。 In the learning stage, the explanation unit 103 outputs the generated n heat maps (first explanation information) to the explanation evaluation unit 105. Meanwhile, in the testing stage, the explanation unit 103 outputs the generated n heat maps (second explanation information) to the presentation control unit 107.

（推論評価部１０４）
推論評価部１０４は、推論部１０２から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と入力部１０１によって取得された正解値とに基づいて、推論評価結果を得る。より詳細に、推論評価部１０４は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と入力部１０１によって取得された正解値とを比較することによって、推論評価結果を得る。 (Inference evaluation unit 104)
The inference evaluation unit 104 obtains an inference evaluation result based on the inference values corresponding to each of the first to nth inference models input from the inference unit 102 and the correct answer value acquired by the input unit 101. More specifically, the inference evaluation unit 104 obtains an inference evaluation result by comparing the inference values corresponding to each of the first to nth inference models with the correct answer value acquired by the input unit 101.

本発明の実施形態では、推論評価部１０４が、推論値と正解値とに応じた損失関数の第１推論モデルから第ｎ推論モデルまでについての和を推論評価結果の例としての損失関数Ｌ１として算出する場合を想定する。ここで、推論値と正解値とに応じた損失関数は特定の関数に限定されず、一般的なニューラルネットワークにおいて用いられる損失関数と同様の損失関数が用いられてよい。例えば、推論値と正解値とに応じた損失関数は、正解値と推論値との差分に基づくクロスエントロピー誤差であってもよい。 In an embodiment of the present invention, it is assumed that the inference evaluation unit 104 calculates the sum of loss functions corresponding to the inferred value and the correct value for the first to nth inference models as loss function L1, which is an example of an inference evaluation result. Here, the loss function corresponding to the inferred value and the correct value is not limited to a specific function, and a loss function similar to that used in general neural networks may be used. For example, the loss function corresponding to the inferred value and the correct value may be a cross-entropy error based on the difference between the correct value and the inferred value.
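
The loss function L1 described above could be sketched as follows. The sketch assumes a classification task in which each inference value is a vector of class probabilities and the correct value is a class index. The function names and the example probabilities are hypothetical.

```python
import math
from typing import List


def cross_entropy(probs: List[float], label: int, eps: float = 1e-12) -> float:
    """Cross-entropy error of one inference value against the correct class."""
    # eps guards against log(0); it is an implementation detail of this sketch.
    return -math.log(max(probs[label], eps))


def inference_loss_l1(per_model_probs: List[List[float]], label: int) -> float:
    """L1: sum of the per-model losses over the first to n-th inference models."""
    return sum(cross_entropy(p, label) for p in per_model_probs)


# Two inference models (n = 2), three classes, correct class 0.
l1 = inference_loss_l1([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]], label=0)
```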

推論評価部１０４は、推論評価結果を更新部１０６に出力する。 The inference evaluation unit 104 outputs the inference evaluation results to the update unit 106.

（説明評価部１０５）
説明評価部１０５は、説明部１０３から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応するヒートマップに基づいて説明評価結果を得る。より詳細に、説明評価部１０５は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応するヒートマップ同士を比較する。そして、説明評価部１０５は、比較結果としてのｎ個のヒートマップ同士の一致度に基づいて、説明評価結果を得る。 (Explanation evaluation unit 105)
The explanation evaluation unit 105 obtains an explanation evaluation result based on the heat maps corresponding to the first to n-th inference models input from the explanation unit 103. More specifically, the explanation evaluation unit 105 compares the heat maps corresponding to the first to n-th inference models with one another. It then obtains the explanation evaluation result based on the degree of agreement among the n heat maps found by that comparison.

本発明の実施形態では、ｎ個のヒートマップ同士の一致度が大きいほど説明評価結果が小さい値を取る損失関数である場合を主に想定する。なお、ｎ個のヒートマップ同士の一致度は、ｎ個のヒートマップ同士がどの程度乖離しているかを示す乖離度と換言されてもよい。かかる場合には、ｎ個のヒートマップ同士の乖離度が小さいほど説明評価結果が小さい値を取る損失関数であってよい。 In an embodiment of the present invention, a loss function is primarily assumed in which the explanation evaluation result takes a smaller value as the degree of agreement between n heat maps increases. Note that the degree of agreement between n heat maps may be rephrased as the degree of discrepancy indicating the degree to which the n heat maps differ from each other. In such a case, the loss function may take a smaller value as the degree of discrepancy between the n heat maps decreases.

ｎ個のヒートマップから説明評価結果を得る手法は限定されない。ここでは、説明評価結果を得る手法として、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法、および、正規化されたヒートマップ同士の内積によって説明評価結果を得る手法について順に説明する。 There are no limitations on the method for obtaining explanation evaluation results from n heat maps. Here, we will explain two methods for obtaining explanation evaluation results: one method for obtaining explanation evaluation results by multiplying a mask that binarizes a heat map with another heat map, and one method for obtaining explanation evaluation results by taking the dot product of normalized heat maps.

図２は、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法について説明するための図である。図２に示された例では、説明を簡便にするため、ｎ＝２である場合、すなわち、推論部１０２が、第１推論モデルおよび第２推論モデルを有する場合を想定する。 Figure 2 is a diagram illustrating a method for obtaining explanation evaluation results by multiplying a mask obtained by binarizing a heat map with another heat map. For ease of explanation, the example shown in Figure 2 assumes that n = 2, i.e., that the inference unit 102 has a first inference model and a second inference model.

図２を参照すると、第１推論モデルからは、推論値とヒートマップＨ１とが出力されている。一方、第２推論モデルからは、推論値とヒートマップＨ２とが出力されている。図２では、ヒートマップＨ１およびヒートマップＨ２において、入力データのうち推論値への寄与が大きい領域ほど濃い色によって示されている。 Referring to Figure 2, the first inference model outputs an inferred value and a heat map H1. Meanwhile, the second inference model outputs an inferred value and a heat map H2. In Figure 2, in heat maps H1 and H2, areas of the input data that contribute more to the inferred value are shown in darker colors.

説明評価部１０５は、ヒートマップＨ１の二値化を行ってマスクＭ１を生成するとともに、ヒートマップＨ２の二値化を行ってマスクＭ２を生成する。なお、二値化は、閾値ｃ以上である要素（例えば、ヒートマップを構成するピクセル）の値を１とし、閾値ｃよりも小さい要素の値を０とすることによって実行され得る。図２においては、二値のうち１が黒によって示され、０が白によって示されている。 The explanation evaluation unit 105 binarizes the heat map H1 to generate a mask M1, and binarizes the heat map H2 to generate a mask M2. Note that binarization can be performed by setting the value of elements (e.g., pixels constituting the heat map) that are equal to or greater than a threshold c to 1, and the value of elements that are less than the threshold c to 0. In Figure 2, the binary values 1 and 0 are represented by black and white, respectively.

説明評価部１０５は、第１推論モデルから出力されたヒートマップＨ１と、第２の推論モデルから出力されたヒートマップＨ２から生成したマスクＭ２との積を、要素ごとに計算する。同様に、説明評価部１０５は、第２推論モデルから出力されたヒートマップＨ２と、第１の推論モデルから出力されたヒートマップＨ１から生成したマスクＭ１との積を、要素ごとに計算する。これによって、各要素に対応する積の集合が推論モデルごとに得られる。 The explanation evaluation unit 105 calculates, for each element, the product of the heat map H1 output from the first inference model and the mask M2 generated from the heat map H2 output from the second inference model. Similarly, the explanation evaluation unit 105 calculates, for each element, the product of the heat map H2 output from the second inference model and the mask M1 generated from the heat map H1 output from the first inference model. This allows a set of products corresponding to each element to be obtained for each inference model.

説明評価部１０５は、各要素に対応する積を全部の推論モデルについて足し合わせることによって積の和を計算する。そして、説明評価部１０５は、このようにして計算した積の和を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 The explanation evaluation unit 105 calculates the sum of products by adding up the products corresponding to each element for all inference models. Then, the explanation evaluation unit 105 calculates a total value by adding up the sums of products calculated in this way for all elements. The explanation evaluation unit 105 sets this total value as the loss function L2, an example of the explanation evaluation result.

図２を参照しながらｎ＝２である場合について説明した。ｎを１より大きい任意の整数であるとして説明すると、以下の通りである。 The case where n=2 was explained with reference to Figure 2. If n is any integer greater than 1, the explanation is as follows:

すなわち、説明評価部１０５は、ｉ＝１～ｎについて、ヒートマップＴｉ（ｘ）の各要素の値を二値化したマスクＭｉ（ｘ）を生成する。次に、説明評価部１０５は、推論モデルごとに、自身の推論モデルから出力されたヒートマップＴｉ（ｘ）と、自身以外の推論モデルに対応するヒートマップから生成したマスクＭ１（ｘ）～Ｍｉ－１（ｘ）、Ｍｉ＋１（ｘ）～Ｍｎ（ｘ）の和との積を要素ごとに計算する。 That is, the explanation evaluation unit 105 generates a mask Mi(x) by binarizing the values of each element of the heat map Ti(x) for i = 1 to n. Next, for each inference model, the explanation evaluation unit 105 calculates the product of the heat map Ti(x) output from its own inference model and the sum of the masks M1(x) to Mi-1(x) and Mi+1(x) to Mn(x) generated from the heat maps corresponding to inference models other than its own, for each element.

説明評価部１０５は、各要素に対応する積を第１推論モデルから第ｎ推論モデルまでについて足し合わせることによって積の和を計算する。そして、説明評価部１０５は、このようにして計算した積の和に基づいて、説明評価結果を得る。より詳細に、説明評価部１０５は、積の和を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 The explanation evaluation unit 105 calculates the sum of products by adding up the products corresponding to each element for the first to nth inference models. The explanation evaluation unit 105 then obtains the explanation evaluation result based on the sum of products calculated in this way. More specifically, the explanation evaluation unit 105 calculates a total value by adding up the sums of products for all elements. The explanation evaluation unit 105 sets this total value as the loss function L2, an example of the explanation evaluation result.
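
The mask-based procedure described above can be sketched as follows for any n greater than 1. Heat maps are assumed to be flattened into lists of equal length, and the function names `binarize` and `mask_overlap_loss` are hypothetical.

```python
from typing import List


def binarize(heatmap: List[float], c: float) -> List[float]:
    """Mask: 1 where the element is at or above threshold c, otherwise 0."""
    return [1.0 if v >= c else 0.0 for v in heatmap]


def mask_overlap_loss(heatmaps: List[List[float]], c: float) -> float:
    """L2: each heat map times the sum of the other models' masks, summed."""
    n = len(heatmaps)
    masks = [binarize(h, c) for h in heatmaps]
    total = 0.0
    for e in range(len(heatmaps[0])):
        mask_sum = sum(m[e] for m in masks)
        for i in range(n):
            # Sum of the masks of every model except model i at element e.
            others = mask_sum - masks[i][e]
            total += heatmaps[i][e] * others
    return total


# n = 2: both models attend to element 0, so that element dominates the loss.
h1 = [0.9, 0.8, 0.1, 0.0]
h2 = [0.7, 0.1, 0.9, 0.0]
loss = mask_overlap_loss([h1, h2], c=0.5)
```

With threshold c = 0.5 the masks are M1 = [1, 1, 0, 0] and M2 = [1, 0, 1, 0], so H1 times M2 contributes 0.9 + 0.1 and H2 times M1 contributes 0.7 + 0.1.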

この損失関数Ｌ２は、各ヒートマップにおいて、自身以外のヒートマップにおいて閾値以上の値を持つ領域の合計値である。この損失関数Ｌ２の値を小さくするように学習が行われることによって、ヒートマップの一致度が小さいｎ個の推論モデルが得られる。なお、このときの損失関数Ｌ２は、以下の式（２）のように表現され得る。式（２）において、ｅは、要素番号を示す。ここで、ヒートマップＴｉ（ｘ）は、ヒートマップＴｉ（ｘ）の大きさ｜Ｔｉ（ｘ）｜で割るなどして正規化してもよい。また、ヒートマップＴｉ（ｘ）にはsigmoidなどの活性化関数をかけてもよい。 This loss function L2 is the total, taken over every heat map, of that heat map's values in the regions where the other heat maps are at or above the threshold. By training to reduce the value of this loss function L2, n inference models whose heat maps have a low degree of agreement with one another are obtained. Note that the loss function L2 in this case can be expressed as in equation (2) below. In equation (2), e represents the element number. Here, the heat map Ti(x) may be normalized, for example by dividing it by the magnitude |Ti(x)| of the heat map Ti(x). An activation function such as sigmoid may also be applied to the heat map Ti(x).
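
Equation (2) itself is not reproduced in this text. Based on the procedure described above, it can be reconstructed along the following lines. The index j, which runs over the inference models other than the i-th, is a symbol introduced here. The threshold c and the mask Mi(x) are as described above.

```latex
L_2 = \sum_{e} \sum_{i=1}^{n} T_i(x)_e \sum_{\substack{j=1 \\ j \neq i}}^{n} M_j(x)_e \qquad (2)

M_j(x)_e =
\begin{cases}
1 & \text{if } T_j(x)_e \geq c \\
0 & \text{otherwise}
\end{cases}
```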

図２を参照しながら、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法について説明した。続いて、正規化されたヒートマップ同士の内積によって説明評価結果を得る手法について説明する。 With reference to Figure 2, we explained a method for obtaining explanation evaluation results by multiplying a mask created by binarizing a heat map with another heat map. Next, we will explain a method for obtaining explanation evaluation results by taking the dot product of normalized heat maps.

説明評価部１０５は、ｉ＝１～ｎについて、ヒートマップＴｉ（ｘ）をヒートマップＴｉ（ｘ）の大きさ｜Ｔｉ（ｘ）｜で割ることによって正規化して、ｉ＝１～ｎについての正規化したベクトルを生成する。そして、説明評価部１０５は、ｉ＝１～ｎについての正規化したベクトルの内積に基づいて説明評価結果を得る。より詳細に、説明評価部１０５は、内積を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 For i = 1 to n, the explanation evaluation unit 105 normalizes the heat map Ti(x) by dividing it by the size of the heat map Ti(x), |Ti(x)|, to generate normalized vectors for i = 1 to n. The explanation evaluation unit 105 then obtains an explanation evaluation result based on the dot product of the normalized vectors for i = 1 to n. More specifically, the explanation evaluation unit 105 calculates a total value by adding up the dot products for all elements. The explanation evaluation unit 105 sets this total value as the loss function L2, an example of the explanation evaluation result.
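
A sketch of the dot-product approach follows. The text does not spell out how the dot products are combined when n is greater than 2, so summing over all pairs of inference models is an assumption of this sketch, as is taking the magnitude |Ti(x)| to be the Euclidean norm. The function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List


def normalize(heatmap: List[float], eps: float = 1e-12) -> List[float]:
    """Divide a flattened heat map by its Euclidean norm (assumed magnitude)."""
    norm = math.sqrt(sum(v * v for v in heatmap))
    return [v / max(norm, eps) for v in heatmap]


def dot_product_loss(heatmaps: List[List[float]]) -> float:
    """L2: dot products of normalized heat maps, summed over all model pairs."""
    unit = [normalize(h) for h in heatmaps]
    total = 0.0
    for a, b in combinations(unit, 2):
        total += sum(u * v for u, v in zip(a, b))
    return total


# Heat maps pointing the same way give the maximum value for one pair.
same = dot_product_loss([[3.0, 4.0], [6.0, 8.0]])
# Heat maps with no shared region give zero.
disjoint = dot_product_loss([[1.0, 0.0], [0.0, 2.0]])
```

Because each vector is normalized, the loss depends only on where the heat maps overlap and not on their overall scale.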

正規化したベクトルの内積が大きいほど、この損失関数Ｌ２は、大きい値となる。正規化したベクトルの内積が大きいことは、ヒートマップ同士の一致度が高いことを意味する。したがって、この損失関数Ｌ２の値を小さくするように学習が行われることによって、ヒートマップの一致度が小さいｎ個の推論モデルが得られる。なお、このときの損失関数Ｌ２は、以下の式（３）のように表現され得る。式（３）において、ｅは、要素番号を示す。 The larger the dot product of the normalized vectors, the larger the value of this loss function L2. A large dot product of normalized vectors means that the degree of match between heat maps is high. Therefore, by training to reduce the value of this loss function L2, n inference models with low degrees of match between heat maps are obtained. Note that the loss function L2 in this case can be expressed as in the following equation (3). In equation (3), e represents the element number.
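
Equation (3) itself is not reproduced in this text. One reading consistent with the description above is the following, where summing over pairs of inference models i < j is an assumption made here. For n = 2 it reduces to the single dot product of the two normalized heat maps.

```latex
L_2 = \sum_{1 \leq i < j \leq n} \sum_{e}
      \frac{T_i(x)_e}{\lvert T_i(x) \rvert} \cdot
      \frac{T_j(x)_e}{\lvert T_j(x) \rvert} \qquad (3)
```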

説明評価部１０５は、説明評価結果を更新部１０６に出力する。 The explanation evaluation unit 105 outputs the explanation evaluation results to the update unit 106.

（更新部１０６）
更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とに基づいて、第１推論モデルから第ｎ推論モデルまでのそれぞれが使用する重みパラメータ１１０の更新を行う。これによって、第１推論モデルから第ｎ推論モデルまでのそれぞれから出力される推論値が正解値に近づくように、かつ、説明部１０３から出力されるｎ個のヒートマップ同士の一致度が小さくなるように、重みパラメータ１１０が更新され得る。重みパラメータ１１０は、誤差逆伝播法（バックプロパゲーション）によって更新されてよい。 (Update unit 106)
The update unit 106 updates the weight parameters 110 used by each of the first to n-th inference models based on the inference evaluation result input from the inference evaluation unit 104 and the explanation evaluation result input from the explanation evaluation unit 105. This allows the weight parameters 110 to be updated so that the inference values output from each of the first to n-th inference models approach the correct value and so that the degree of agreement between the n heat maps output from the explanation unit 103 decreases. The weight parameters 110 may be updated by backpropagation.

例えば、更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とを加算し、加算結果に基づいて、重みパラメータ１１０の更新を行えばよい。このとき、更新部１０６は、計算した加算結果を誤差として、誤差逆伝播法（バックプロパゲーション）によって重みパラメータ１１０を更新すればよい。上記のように、推論評価結果が損失関数Ｌ１と表現され、説明評価結果が損失関数Ｌ２と表現される場合、加算結果は、Ｌ１＋Ｌ２である。 For example, the update unit 106 may add the inference evaluation result input from the inference evaluation unit 104 and the explanation evaluation result input from the explanation evaluation unit 105, and update the weight parameter 110 based on the addition result. At this time, the update unit 106 may update the weight parameter 110 by backpropagation using the calculated addition result as the error. As described above, if the inference evaluation result is expressed as loss function L1 and the explanation evaluation result is expressed as loss function L2, the addition result is L1 + L2.

さらに、更新部１０６は、説明部１０３が有する重みパラメータを更新してよい。より詳細に、説明部１０３が、誤差逆伝播が可能な関数を含む場合、更新部１０６は、推論評価結果と説明評価結果とに基づいて、誤差逆伝播法（バックプロパゲーション）によって、説明部１０３が有する重みパラメータを更新してよい。 Furthermore, the update unit 106 may update the weight parameters held by the explanation unit 103. More specifically, if the explanation unit 103 includes a function capable of backpropagation, the update unit 106 may update the weight parameters held by the explanation unit 103 by backpropagation based on the inference evaluation result and the explanation evaluation result.

なお、学習の終了条件（すなわち、重みパラメータ更新の終了条件）は特に限定されず、第１推論モデルから第ｎ推論モデルまでの学習がある程度行われたことを示す条件であればよい。具体的に、学習の終了条件は、損失関数Ｌ１＋Ｌ２の値が閾値よりも小さいという条件を含んでもよい。あるいは、学習の終了条件は、損失関数Ｌ１＋Ｌ２の値の変化が閾値よりも小さいという条件（損失関数Ｌ１＋Ｌ２の値が収束状態になったという条件）を含んでもよい。あるいは、学習の終了条件は、重みパラメータの更新が所定の回数行われたという条件を含んでもよい。あるいは、推論評価部１０４によって正解値と推論値とに基づいて精度（例えば、正答率など）が算出される場合、学習の終了条件は、精度が所定の割合（例えば、９０％など）を超えるという条件を含んでもよい。 The learning termination condition (i.e., the termination condition for weight parameter updates) is not particularly limited and may be any condition indicating that a certain degree of learning has been performed on the first through nth inference models. Specifically, the learning termination condition may include a condition that the value of the loss function L1 + L2 is smaller than a threshold. Alternatively, the learning termination condition may include a condition that the change in the value of the loss function L1 + L2 is smaller than a threshold (a condition that the value of the loss function L1 + L2 has converged). Alternatively, the learning termination condition may include a condition that the weight parameters have been updated a predetermined number of times. Alternatively, if the inference evaluation unit 104 calculates accuracy (e.g., the rate of correct answers) based on the correct value and the inference values, the learning termination condition may include a condition that the accuracy exceeds a predetermined percentage (e.g., 90%).
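
The termination conditions listed above could be combined as in the following sketch. Treating them as alternatives, any one of which ends learning, is an assumption, as are the function name and every default threshold except the 90% accuracy example.

```python
from typing import List, Optional


def learning_should_stop(
    loss_history: List[float],
    num_updates: int,
    accuracy: Optional[float] = None,
    loss_threshold: float = 0.05,
    delta_threshold: float = 1e-4,
    max_updates: int = 10_000,
    accuracy_target: float = 0.90,
) -> bool:
    """Return True if any termination condition holds.

    loss_history holds the successive values of L1 + L2.
    """
    # Condition 1: L1 + L2 is smaller than a threshold.
    if loss_history and loss_history[-1] < loss_threshold:
        return True
    # Condition 2: the change in L1 + L2 is smaller than a threshold.
    if len(loss_history) >= 2:
        if abs(loss_history[-1] - loss_history[-2]) < delta_threshold:
            return True
    # Condition 3: the weights have been updated a predetermined number of times.
    if num_updates >= max_updates:
        return True
    # Condition 4: accuracy, when it is computed, exceeds the target ratio.
    if accuracy is not None and accuracy > accuracy_target:
        return True
    return False
```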

（提示制御部１０７）
提示制御部１０７は、テスト段階において、推論部１０２から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と、説明部１０３から入力されたｎ個のヒートマップとが、ユーザに提示されるように制御する。より詳細に、提示制御部１０７は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と、ｎ個のヒートマップとが表示されるように表示部１２１を制御する。なお、ｎ個のヒートマップは表示されるが、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値は表示されない形態も想定され得る。 (Presentation control unit 107)
In the test phase, the presentation control unit 107 performs control so that the inference values corresponding to the first to nth inference models input from the inference unit 102 and the n heat maps input from the explanation unit 103 are presented to the user. More specifically, the presentation control unit 107 controls the display unit 121 so that these inference values and the n heat maps are displayed. Note that a configuration in which the n heat maps are displayed but the inference values are not displayed is also conceivable.

（表示部１２１）
表示部１２１は、ディスプレイによって構成され、提示制御部１０７による制御に従って各種情報の表示を行う機能を有する。例えば、表示部１２１は、ｎ個の推論値とｎ個のヒートマップとを表示することが可能である。ここで、表示部１２１の形態は特に限定されない。例えば、表示部１２１は、液晶ディスプレイ（ＬＣＤ）装置であってもよいし、ＯＬＥＤ（ＯｒｇａｎｉｃＬｉｇｈｔＥｍｉｔｔｉｎｇＤｉｏｄｅ）装置であってもよいし、ランプなどの表示装置であってもよい。 (Display unit 121)
The display unit 121 consists of a display and has a function of displaying various information under the control of the presentation control unit 107. For example, the display unit 121 can display n inference values and n heat maps. The form of the display unit 121 is not particularly limited. For example, the display unit 121 may be a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, or a display device such as a lamp.

（操作部１２２）
操作部１２２は、ユーザによる操作を受け付ける。例えば、ユーザがｎ個の推論値とｎ個のヒートマップとを参照しながら、ｎ個の推論モデルから解釈性の高い１または複数の推論モデル（以下、「選択モデル」とも言う。）を見つけたとする。このとき、ユーザは、選択モデルを示す情報（以下、「選択モデル情報」とも言う。）を操作部１２２に入力し、操作部１２２は、選択モデル情報１２３を受け付ける。例えば、選択モデル情報１２３は、選択モデルを示す番号であってよい。 (Operation unit 122)
The operation unit 122 accepts operations by the user. For example, suppose the user refers to n inference values and n heat maps and finds one or more inference models (hereinafter also referred to as "selection models") with high interpretability from the n inference models. At this time, the user inputs information indicating the selection model (hereinafter also referred to as "selection model information") into the operation unit 122, and the operation unit 122 accepts selection model information 123. For example, the selection model information 123 may be a number indicating the selection model.

なお、本発明の実施形態では、操作部１２２がマウスおよびキーボードである場合を主に想定する。しかし、操作部１２２の形態は特に限定されない。例えば、操作部１２２は、タッチパネルであってもよいし、他の入力装置であってもよい。 Note that in the embodiment of the present invention, it is primarily assumed that the operation unit 122 is a mouse and keyboard. However, the form of the operation unit 122 is not particularly limited. For example, the operation unit 122 may be a touch panel or another input device.

（記録制御部１０８）
記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３の記録を制御する。より詳細に、記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３を図示しない記憶部に記憶させる。選択モデル情報１２３は、図示しない記憶部から後に取得され、選択モデル情報１２３によって示される選択モデルが、解釈性の高い学習済みモデルとして用いられ得る。 (Recording control unit 108)
The recording control unit 108 controls the recording of the selection model information 123 received from the user by the operation unit 122. More specifically, the recording control unit 108 stores the selection model information 123 received from the user by the operation unit 122 in a storage unit (not shown). The selection model information 123 is later acquired from the storage unit (not shown), and the selection model indicated by the selection model information 123 can be used as a trained model with high interpretability.

なお、テストの終了条件は特に限定されず、ユーザにとって十分な回数のテストが行われたことを示す条件であればよい。具体的に、テストの終了条件は、テスト段階においてユーザによって推論結果の確認が所定の回数以上行われたという条件を含んでもよい。 The test termination conditions are not particularly limited, and may be any conditions that indicate that the user has performed the test a sufficient number of times. Specifically, the test termination conditions may include a condition that the user has confirmed the inference results a predetermined number of times or more during the test phase.

以上、本発明の実施形態に係る学習装置１０の構成例について説明した。 The above describes an example configuration of the learning device 10 according to an embodiment of the present invention.

（１．２．学習段階における動作）
図３を参照しながら、本発明の実施形態に係る学習装置１０の学習段階における動作の流れについて説明する。図３は、本発明の実施形態に係る学習装置１０の学習段階における動作例を示すフローチャートである。 (1.2. Operation in the learning stage)
The flow of operations in the learning stage of the learning device 10 according to the embodiment of the present invention will be described with reference to Figure 3. Figure 3 is a flowchart showing an example of operations in the learning stage of the learning device 10 according to the embodiment of the present invention.

まず、図３に示されたように、入力部１０１は、データセット１００から入力データ（すなわち、学習用データ）および正解値の組み合わせを取得する。さらに、推論部１０２は、ｎ個の推論モデルそれぞれに対応する重みパラメータ１１０を取得する（Ｓ１１）。推論部１０２は、入力部１０１によって取得された入力データとｎ個の推論モデルとに基づいて推論を行い（Ｓ１２）、推論によって得られたｎ個の推論値を推論評価部１０４および説明部１０３それぞれに出力する。 First, as shown in FIG. 3, the input unit 101 acquires a combination of input data (i.e., learning data) and correct values from the dataset 100. Furthermore, the inference unit 102 acquires weight parameters 110 corresponding to each of the n inference models (S11). The inference unit 102 performs inference based on the input data acquired by the input unit 101 and the n inference models (S12), and outputs the n inference values obtained by the inference to the inference evaluation unit 104 and the explanation unit 103, respectively.

説明部１０３は、推論部１０２から入力されたｎ個の推論値に基づいて、ｎ個の推論値それぞれの判断根拠を説明するヒートマップを生成する（Ｓ１３）。説明部１０３は、生成したｎ個のヒートマップを説明評価部１０５に出力する。 The explanation unit 103 generates a heat map that explains the basis for determining each of the n inference values based on the n inference values input from the inference unit 102 (S13). The explanation unit 103 outputs the generated n heat maps to the explanation evaluation unit 105.

推論評価部１０４は、入力部１０１によって取得された正解値に基づいて、推論部１０２から入力されたｎ個の推論値を評価して推論評価結果を得る。より詳細に、推論評価部１０４は、正解値とｎ個の推論値とに応じた損失関数を推論評価結果として算出する。推論評価部１０４は、算出した推論評価結果を更新部１０６に出力する。 The inference evaluation unit 104 evaluates the n inference values input from the inference unit 102 based on the correct answer value acquired by the input unit 101 to obtain an inference evaluation result. More specifically, the inference evaluation unit 104 calculates a loss function corresponding to the correct answer value and the n inference values as the inference evaluation result. The inference evaluation unit 104 outputs the calculated inference evaluation result to the update unit 106.

説明評価部１０５は、説明部１０３から入力されたｎ個のヒートマップの一致度に基づいて、説明評価結果を得る。より詳細に、説明評価部１０５は、説明部１０３から入力されたｎ個のヒートマップ同士の一致度に応じた損失関数を説明評価結果として算出する。説明評価部１０５は、算出した説明評価結果を更新部１０６に出力する（Ｓ１４）。 The explanation evaluation unit 105 obtains an explanation evaluation result based on the degree of agreement between the n heat maps input from the explanation unit 103. More specifically, the explanation evaluation unit 105 calculates a loss function according to the degree of agreement between the n heat maps input from the explanation unit 103 as the explanation evaluation result. The explanation evaluation unit 105 outputs the calculated explanation evaluation result to the update unit 106 (S14).

更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とに基づいて、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する重みパラメータ１１０の更新を行う（Ｓ１５）。より詳細に、更新部１０６は、推論評価結果と説明評価結果とに基づいて、誤差逆伝播法によって、重みパラメータ１１０を更新する。さらに、更新部１０６は、推論評価結果と説明評価結果とに基づく誤差逆伝播法によって説明部１０３が有する重みパラメータの更新を行う。 The update unit 106 updates the weight parameters 110 corresponding to each of the first to nth inference models based on the inference evaluation results input from the inference evaluation unit 104 and the explanation evaluation results input from the explanation evaluation unit 105 (S15). More specifically, the update unit 106 updates the weight parameters 110 using backpropagation based on the inference evaluation results and explanation evaluation results. Furthermore, the update unit 106 updates the weight parameters held by the explanation unit 103 using backpropagation based on the inference evaluation results and explanation evaluation results.

更新部１０６は、入力データに基づく重みパラメータの更新が終わるたびに、学習の終了条件が満たされたか否かを判断する（Ｓ１６）。学習の終了条件が満たされていないと判断した場合には（Ｓ１６において「ＮＯ」）、Ｓ１１に動作が移行され、入力部１０１によって次の入力データが取得され、推論部１０２、説明部１０３、推論評価部１０４、説明評価部１０５および更新部１０６それぞれによって、当該次の入力データに基づく各自の処理が再度実行される。一方、更新部１０６によって、学習の終了条件が満たされたと判断された場合には（Ｓ１６において「ＹＥＳ」）、学習が終了される。 Each time the update unit 106 finishes updating the weight parameters based on input data, it determines whether the learning termination condition has been met (S16). If it determines that the learning termination condition has not been met (NO in S16), operation proceeds to S11, the input unit 101 acquires the next input data, and the inference unit 102, explanation unit 103, inference evaluation unit 104, explanation evaluation unit 105, and update unit 106 each re-execute their respective processes based on the next input data. On the other hand, if the update unit 106 determines that the learning termination condition has been met (YES in S16), learning is terminated.
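
The control flow of steps S11 to S16 can be sketched as the loop below. The inference, explanation, evaluation, and update processing (S12 to S15) is abstracted into a single callable, `update_step`, that returns L1 + L2. That callable, the `should_stop` callable, and the canned loss values are hypothetical stand-ins. Stopping when the dataset is exhausted is also an assumption of this sketch.

```python
from typing import Callable, Iterable, List, Tuple


def run_learning_stage(
    dataset: Iterable[Tuple[object, object]],
    update_step: Callable[[object, object], float],
    should_stop: Callable[[List[float]], bool],
) -> List[float]:
    """Loop over S11 to S16 and return the history of L1 + L2 values."""
    history: List[float] = []
    for x, y in dataset:                # S11: next input data and correct value
        total_loss = update_step(x, y)  # S12 to S15: infer, explain, evaluate, update
        history.append(total_loss)
        if should_stop(history):        # S16: termination condition met?
            break
    return history


# Stand-in update step that replays a fixed sequence of loss values.
losses = iter([1.0, 0.6, 0.3, 0.1, 0.05])
history = run_learning_stage(
    dataset=[(k, 0) for k in range(5)],
    update_step=lambda x, y: next(losses),
    should_stop=lambda h: h[-1] < 0.2,
)
```

In this run the loop ends after the fourth update, the first at which the loss falls below the stand-in threshold of 0.2.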

以上、本発明の実施形態に係る学習装置１０の学習段階における動作の流れについて説明した。 The above describes the operational flow during the learning stage of the learning device 10 according to an embodiment of the present invention.

（１．３．テスト段階における動作）
図４を参照しながら、本発明の実施形態に係る学習装置１０のテスト段階における動作の流れについて説明する。図４は、本発明の実施形態に係る学習装置１０のテスト段階における動作例を示すフローチャートである。 (1.3. Operation in the test phase)
The flow of operations in the test phase of the learning device 10 according to the embodiment of the present invention will be described with reference to Figure 4. Figure 4 is a flowchart showing an example of operations in the test phase of the learning device 10 according to the embodiment of the present invention.

まず、図４に示されたように、入力部１０１は、データセット１００から入力データ（すなわち、テスト用データ）および正解値の組み合わせを取得する。さらに、推論部１０２は、ｎ個の推論モデルそれぞれに対応する重みパラメータ１１０を取得する（Ｓ２１）。推論部１０２は、入力部１０１によって取得された入力データとｎ個の推論モデルとに基づいて推論を行い（Ｓ２２）、推論によって得られたｎ個の推論値を説明部１０３および提示制御部１０７それぞれに出力する。 First, as shown in FIG. 4, the input unit 101 acquires a combination of input data (i.e., test data) and correct values from the dataset 100. Furthermore, the inference unit 102 acquires weight parameters 110 corresponding to each of the n inference models (S21). The inference unit 102 performs inference based on the input data acquired by the input unit 101 and the n inference models (S22), and outputs the n inference values obtained by the inference to the explanation unit 103 and the presentation control unit 107, respectively.

説明部１０３は、推論部１０２から入力されたｎ個の推論値に基づいて、ｎ個の推論値それぞれの判断根拠を説明するヒートマップを生成する（Ｓ２３）。説明部１０３は、生成したｎ個のヒートマップを提示制御部１０７に出力する。 The explanation unit 103 generates a heat map that explains the basis for determining each of the n inferred values based on the n inferred values input from the inference unit 102 (S23). The explanation unit 103 outputs the generated n heat maps to the presentation control unit 107.

提示制御部１０７は、推論部１０２から入力されたｎ個の推論値と、説明部１０３から入力されたｎ個のヒートマップとがユーザに提示されるように表示部１２１を制御する。表示部１２１は、提示制御部１０７による制御に従って、ｎ個の推論値と、ｎ個のヒートマップとを表示する（Ｓ２４）。 The presentation control unit 107 controls the display unit 121 so that the n inferred values input from the inference unit 102 and the n heat maps input from the explanation unit 103 are presented to the user. The display unit 121 displays the n inferred values and the n heat maps in accordance with the control of the presentation control unit 107 (S24).

操作部１２２は、ｎ個の推論モデルから解釈性が高いと判断された１または複数の推論モデルを示す情報（選択モデル情報１２３）をユーザから受け付ける。記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３の記録を制御する（Ｓ２５）。図示しない記憶部は、記録制御部１０８による制御に従って、選択モデル情報１２３を記憶する。 The operation unit 122 receives from the user information (selection model information 123) indicating one or more inference models determined to have high interpretability from the n inference models. The recording control unit 108 controls the recording of the selection model information 123 received from the user by the operation unit 122 (S25). A storage unit (not shown) stores the selection model information 123 in accordance with the control of the recording control unit 108.

記録制御部１０８は、入力データに基づく選択モデル情報１２３の記録制御が終わるたびに、テストの終了条件が満たされたか否かを判断する（Ｓ２６）。テストの終了条件が満たされていないと判断した場合には（Ｓ２６において「ＮＯ」）、Ｓ２１に動作が移行され、入力部１０１によって次の入力データが取得され、推論部１０２、説明部１０３、提示制御部１０７および記録制御部１０８それぞれによって、当該次の入力データに基づく各自の処理が再度実行される。一方、記録制御部１０８によって、テストの終了条件が満たされたと判断された場合には（Ｓ２６において「ＹＥＳ」）、テストが終了される。 The recording control unit 108 determines whether the test termination condition has been met (S26) each time it finishes recording control of the selection model information 123 based on the input data. If it determines that the test termination condition has not been met ("NO" in S26), operation proceeds to S21, the input unit 101 acquires the next input data, and the inference unit 102, explanation unit 103, presentation control unit 107, and recording control unit 108 each re-execute their respective processes based on the next input data. On the other hand, if the recording control unit 108 determines that the test termination condition has been met ("YES" in S26), the test is terminated.
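
The test phase (S21 to S26) can be sketched in the same way. Presentation and user input are abstracted into the hypothetical callables `present` and `ask_user`. Representing the selection model information as a list of zero-based model indices, and ending the test after a fixed number of user confirmations, are assumptions of this sketch.

```python
from typing import Callable, Iterable, List, Sequence


def run_test_stage(
    inputs: Iterable[object],
    present: Callable[[object], None],
    ask_user: Callable[[], Sequence[int]],
    required_confirmations: int,
) -> List[List[int]]:
    """Loop over S21 to S26 and return the recorded model selections."""
    records: List[List[int]] = []
    for x in inputs:                      # S21: next test input
        present(x)                        # S22 to S24: infer, explain, display
        records.append(list(ask_user()))  # S25: record selection model information
        if len(records) >= required_confirmations:  # S26: enough confirmations?
            break
    return records


# Stand-in user who picks model 0, then models 0 and 2, then model 2.
answers = iter([[0], [0, 2], [2]])
records = run_test_stage(
    inputs=range(10),
    present=lambda x: None,
    ask_user=lambda: next(answers),
    required_confirmations=3,
)
```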

以上、本発明の実施形態に係る学習装置１０のテスト段階における動作の流れについて説明した。 The above describes the operational flow during the test phase of the learning device 10 according to an embodiment of the present invention.

（１．４．実施形態の効果）
以上に説明したように、本発明の実施形態によれば、第１推論モデルから第ｎ推論モデルまでのそれぞれから出力される推論値が正解値に近づくように、かつ、説明情報として出力されるｎ個のヒートマップ同士の一致度が小さくなるように、学習が行われ得る。これによって、互いに異なる複数のヒートマップを出力する推論モデルを得ることができる。これによって、ユーザは、ｎ個のモデルの中からより解釈性の高いヒートマップを出力するモデルを選んで使用することができる。 (1.4. Effects of the embodiment)
As described above, according to an embodiment of the present invention, learning can be performed so that the inference values output from each of the first to nth inference models approach the correct value and so that the degree of agreement among the n heat maps output as explanation information decreases. As a result, inference models that output mutually different heat maps can be obtained, and a user can select and use, from among the n models, a model that outputs a heat map with higher interpretability.

以上、本発明の実施形態が奏する効果について説明した。 The above describes the effects achieved by embodiments of the present invention.

（２．ハードウェア構成例）
続いて、本発明の実施形態に係る学習装置１０のハードウェア構成例について説明する。以下では、本発明の実施形態に係る学習装置１０のハードウェア構成例として、情報処理装置９００のハードウェア構成例について説明する。なお、以下に説明する情報処理装置９００のハードウェア構成例は、学習装置１０のハードウェア構成の一例に過ぎない。したがって、学習装置１０のハードウェア構成は、以下に説明する情報処理装置９００のハードウェア構成から不要な構成が削除されてもよいし、新たな構成が追加されてもよい。 (2. Hardware configuration example)
Next, an example of the hardware configuration of the learning device 10 according to an embodiment of the present invention will be described. Below, the hardware configuration of an information processing device 900 is described as an example of the hardware configuration of the learning device 10. Note that the hardware configuration of the information processing device 900 described below is merely one example of the hardware configuration of the learning device 10. Therefore, unnecessary components may be removed from, or new components added to, the hardware configuration of the information processing device 900 described below to form the hardware configuration of the learning device 10.

FIG. 5 is a diagram showing the hardware configuration of an information processing device 900 as an example of the learning device 10 according to an embodiment of the present invention. The information processing device 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, a host bus 904, a bridge 905, an external bus 906, an interface 907, an input device 908, an output device 909, a storage device 910, and a communication device 911.

The CPU 901 functions as an arithmetic processing device and a control device, and controls the overall operation of the information processing device 900 in accordance with various programs. The CPU 901 may be a microprocessor. The ROM 902 stores programs, calculation parameters, and the like used by the CPU 901. The RAM 903 temporarily stores programs used in execution by the CPU 901 and parameters that change as appropriate during that execution. These components are interconnected by the host bus 904, which is composed of a CPU bus or the like.

The host bus 904 is connected via the bridge 905 to the external bus 906, such as a PCI (Peripheral Component Interconnect/Interface) bus. Note that the host bus 904, the bridge 905, and the external bus 906 do not necessarily need to be configured separately; their functions may be implemented on a single bus.

The input device 908 includes input means with which a user inputs information, such as a mouse, a keyboard, a touch panel, buttons, a microphone, switches, and levers, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 901. By operating the input device 908, a user of the information processing device 900 can input various data to the information processing device 900 and instruct it to perform processing operations.

The output device 909 includes, for example, display devices such as a CRT (Cathode Ray Tube) display device, a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, and a lamp, and audio output devices such as a speaker.

The storage device 910 is a device for storing data. The storage device 910 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 910 is configured with, for example, an HDD (Hard Disk Drive). The storage device 910 drives the hard disk and stores programs executed by the CPU 901 and various data.

The communication device 911 is, for example, a communication interface configured with a communication device or the like for connecting to a network. The communication device 911 may support either wireless communication or wired communication.

An example of the hardware configuration of the learning device 10 according to an embodiment of the present invention has been described above.

(3. Summary)
Although preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention pertains could conceive of various changes or modifications within the scope of the technical ideas set forth in the claims, and it is understood that these also naturally fall within the technical scope of the present invention.

For example, the above examples mainly assume that the learning device 10 trains the n inference models simultaneously. However, the learning device 10 does not have to train all n inference models simultaneously. For example, an already trained inference model may be used as some of the n inference models. In this case, the weight parameters of the trained inference model may be fixed at constant values without being updated.
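The variation in which some models are already trained and keep fixed weights can be sketched as below. This is an illustrative sketch only: the dictionary layout, the `frozen` flag, the plain gradient-descent step, and the numeric values are assumptions made for the example and do not come from the embodiment.

```python
def update_weights(models, gradients, learning_rate=0.1):
    """Apply one gradient-descent step, skipping models marked as frozen."""
    for model, grads in zip(models, gradients):
        if model["frozen"]:
            continue  # trained model: its weight parameters stay fixed
        model["weights"] = [
            w - learning_rate * g for w, g in zip(model["weights"], grads)
        ]
    return models

# Hypothetical setup: one already trained model and one model still being trained.
models = [
    {"name": "pretrained", "weights": [0.5, -0.2], "frozen": True},
    {"name": "new", "weights": [0.0, 0.0], "frozen": False},
]
gradients = [[1.0, 1.0], [1.0, -2.0]]
update_weights(models, gradients)
```

In this sketch the frozen model still contributes its heat map to the agreement term, so the remaining models are steered away from the explanation it already gives while its own weights are left untouched.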

The above examples also mainly assume that the explanation unit 103 uses a single type of heat map generation method. However, the explanation unit 103 may use multiple types of heat map generation methods. In this case, the explanation unit 103 may output to the update unit 106, as an example of the explanation evaluation result, the sum over the multiple heat map generation methods of the loss based on the degree of agreement between heat maps.
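Summing the agreement-based loss over several heat map generation methods can be sketched as follows. The agreement measure used here (a sum of element-wise products), the function names, and the method names in the usage note are hypothetical placeholders; any agreement measure could be substituted.

```python
from itertools import combinations

def overlap(map_a, map_b):
    """Hypothetical agreement measure: sum of element-wise products."""
    return sum(a * b for a, b in zip(map_a, map_b))

def multi_method_loss(heatmaps_by_method):
    """Sum the pairwise agreement loss over every heat map generation method.

    heatmaps_by_method maps a method name to the list of n flattened heat maps
    (one per inference model) produced by that method.
    """
    total = 0.0
    for maps in heatmaps_by_method.values():
        total += sum(overlap(a, b) for a, b in combinations(maps, 2))
    return total
```

For instance, calling `multi_method_loss({"method_a": [[1, 0], [1, 0]], "method_b": [[0, 1], [1, 0]]})` adds an overlap of 1 from the first method and 0 from the second.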

REFERENCE SIGNS LIST
10 Learning device
100 Data set
101 Input unit
102 Inference unit
103 Explanation unit
104 Inference evaluation unit
105 Explanation evaluation unit
106 Update unit
107 Presentation control unit
108 Recording control unit
110 Weight parameter
121 Display unit
122 Operation unit
123 Selected model information

Claims

A learning device comprising:
an input unit that acquires first input data and a correct answer value of the first input data;
an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models;
an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
an inference evaluation unit that obtains an inference evaluation result based on the correct answer value and the first inference value;
an explanation evaluation unit that obtains an explanation evaluation result based on the degree of agreement between the first explanation information corresponding to each of the plurality of inference models; and
an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result so as to reduce the degree of agreement between the first explanation information,
wherein the input unit acquires second input data,
the inference unit outputs, based on the second input data and a plurality of trained models, which are the plurality of inference models after the first weight parameters are updated, a second inference value corresponding to each of the plurality of trained models,
the explanation unit outputs second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value, and
the learning device further comprises:
a presentation control unit that controls presentation of the second inference value and the second explanation information to a user; and
an operation unit that receives, from the user, selected model information indicating one or more trained models selected by the user from among the plurality of trained models.

The learning device according to claim 1, further comprising a recording control unit that controls recording of information indicating the one or more trained models.

The learning device according to claim 1 or 2, wherein the explanation evaluation result takes a smaller value as the degree of agreement between the first explanation information corresponding to each of the plurality of inference models increases.

The learning device according to claim 3, wherein the explanation evaluation unit obtains the explanation evaluation result based on an inner product of vectors obtained by normalizing the first explanation information corresponding to each of the plurality of inference models.

The learning device according to claim 3, wherein the explanation evaluation unit generates a mask for each of the plurality of inference models by binarizing the first explanation information, calculates, for each inference model, the product of the first explanation information corresponding to that inference model and the mask generated from the first explanation information corresponding to another inference model, and obtains the explanation evaluation result based on the sum of the products over the plurality of inference models.

The learning device according to any one of claims 1 to 5, wherein the explanation unit includes a function through which backpropagation can be performed.

The learning device according to claim 6, wherein the explanation unit has second weight parameters, and the update unit updates the second weight parameters by backpropagation.

The learning device according to any one of claims 1 to 7, wherein at least one of the plurality of inference models includes a neural network.

The learning device according to any one of claims 1 to 8, wherein the update unit updates the first weight parameters based on a result of adding the inference evaluation result and the explanation evaluation result.

The learning device according to any one of claims 1 to 9, wherein the first explanation information is a heat map indicating the magnitude of contribution of the first input data to the first inference value.

A learning method comprising:
acquiring first input data and a correct answer value of the first input data;
outputting, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models;
outputting first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
obtaining an inference evaluation result based on the correct answer value and the first inference value;
obtaining an explanation evaluation result based on the degree of agreement between the first explanation information corresponding to each of the plurality of inference models;
updating first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result so as to reduce the degree of agreement between the first explanation information;
acquiring second input data;
outputting, based on the second input data and a plurality of trained models, which are the plurality of inference models after the first weight parameters are updated, a second inference value corresponding to each of the plurality of trained models;
outputting second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value;
controlling presentation of the second inference value and the second explanation information to a user; and
receiving, from the user, selected model information indicating one or more trained models selected by the user from among the plurality of trained models.

A program that causes a computer to function as a learning device comprising:
an input unit that acquires first input data and a correct answer value of the first input data;
an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models;
an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
an inference evaluation unit that obtains an inference evaluation result based on the correct answer value and the first inference value;
an explanation evaluation unit that obtains an explanation evaluation result based on the degree of agreement between the first explanation information corresponding to each of the plurality of inference models; and
an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result so as to reduce the degree of agreement between the first explanation information,
wherein the input unit acquires second input data,
the inference unit outputs, based on the second input data and a plurality of trained models, which are the plurality of inference models after the first weight parameters are updated, a second inference value corresponding to each of the plurality of trained models,
the explanation unit outputs second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of contribution of the second input data to the second inference value, and
the learning device further comprises:
a presentation control unit that controls presentation of the second inference value and the second explanation information to a user; and
an operation unit that receives, from the user, selected model information indicating one or more trained models selected by the user from among the plurality of trained models.