JP7763643B2 - Learning device, label estimation device and program

Info

Publication number: JP7763643B2
Application number: JP2021192928A
Authority: JP
Inventors: 有希安田
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2021-09-02
Filing date: 2021-11-29
Publication date: 2025-11-04
Anticipated expiration: 2041-11-29
Also published as: JP2023036503A

Description

本発明は、学習装置、ラベル推定装置及びプログラムに関する。 The present invention relates to a learning device, a label estimation device, and a program.

検索を容易にする等の理由から、ニュースなどの文章にその文章に関連するラベルを付与することが望ましい場合がある。例えば、感染症の影響で株価が変動した会社のニュースであれば、感染症や、株価、ビジネスなどの用語がラベルとして付与される。 For reasons such as making searches easier, it may be desirable to assign to a text, such as a news article, labels related to that text. For example, for a news article about a company whose stock price has fluctuated due to an infectious disease, terms such as "infectious disease," "stock price," and "business" are assigned as labels.

特開２０１９－５３７３０号公報JP 2019-53730 A

Grigorios Tsoumakas, Ioannis Katakis, “Multi-Label Classification: An Overview”
Ankit Pal, Muru Selvakumar and Malaikannan Sankarasubbu, “MAGNET: Multi-Label Text Classification using Attention-based Graph Neural Network” arXiv:2003.11644v1
Ashutosh Adhikari, Achyudh Ram, Raphael Tang, and Jimmy Lin, “Rethinking Complex Neural Network Architectures for Document Classification” Proceedings of NAACL-HLT 2019, pages 4046-4051

ラベルは１つの文章に１つが付与されれば充分な場合もあるが、文章は複数の言葉で構成される場合が多いので、付与されるラベルが１つでは不十分な場合もある。すなわち、上述の例に示したような、１つの文章に複数のラベルを付与することが望ましい場合もある。しかしながら、文章に付与すべきラベルの数が増えれば増えるほど、ラベルの推定の作業に要する労力は増大してしまう。 In some cases, it is sufficient to assign one label to a sentence, but since sentences often consist of multiple words, there are cases where a single label is insufficient. In other words, as shown in the example above, it may be desirable to assign multiple labels to a single sentence. However, the more labels that need to be assigned to a sentence, the greater the effort required to estimate the labels.

上記事情に鑑み、本発明は、ラベルの推定する作業に要する労力の増大を抑制する技術を提供することを目的としている。 In light of the above circumstances, the present invention aims to provide technology that reduces the increase in labor required for label estimation work.

本発明の一態様は、文章を示す文章情報と前記文章に付与されるラベルの候補として予め定められた複数のラベルについて前記文章のラベルとして適切である度合を示すラベル適正情報とを含むモデル学習用データを用いた機械学習の方法により、入力された文章情報が示す文章に付与されるべきラベルを推定する数理モデルであるラベル推定モデルを更新するモデル学習部、を備え、前記ラベル適正情報は、前記モデル学習用データの示す文章に対して付与される確率の高さに関する所定の条件を満たすラベルを示す正否情報と、前記文章に付与されるラベルの候補として予め定められた複数の各ラベルのうちの任意の２つの間の共起の確率を示す情報であるラベル共起情報と、に基づいて得られた情報である、学習装置である。 One aspect of the present invention is a learning device that includes a model learning unit that updates a label estimation model, which is a mathematical model that estimates a label to be assigned to a sentence indicated by input sentence information, using a machine learning method that uses model training data that includes sentence information indicating a sentence and label appropriateness information indicating the degree to which a plurality of labels, predetermined as candidate labels to be assigned to the sentence, are appropriate as labels for the sentence. The label appropriateness information is information obtained based on true/false information indicating a label that satisfies a predetermined condition regarding the likelihood of being assigned to the sentence indicated by the model training data, and label co-occurrence information, which is information indicating the probability of co-occurrence between any two of a plurality of labels predetermined as candidate labels to be assigned to the sentence.

本発明の一態様は、処理対象の文章を示す情報である対象情報を取得する対象取得部と、
文章を示す文章情報と前記文章に付与されるラベルの候補として予め定められた複数のラベルについて前記文章のラベルとして適切である度合を示すラベル適正情報とを含むモデル学習用データを用いた機械学習の方法により、入力された文章情報が示す文章に付与されるべきラベルを推定する数理モデルであるラベル推定モデルを更新するモデル学習部、を備え、前記ラベル適正情報は、前記モデル学習用データの示す文章に対して付与される確率の高さに関する所定の条件を満たすラベルを示す正否情報と、前記文章に付与されるラベルの候補として予め定められた複数の各ラベルのうちの任意の２つの間の共起の確率を示す情報であるラベル共起情報と、に基づいて得られた情報である学習装置が得た、学習済みのラベル推定モデルを用いて、前記文章取得部の取得した対象情報が示す文章に付与されるべきラベルを推定する、推定部と、を備えるラベル推定装置である。 One aspect of the present invention is a label estimation device comprising: a target acquisition unit that acquires target information, which is information indicating a sentence to be processed; and
an estimation unit that estimates a label to be assigned to the sentence indicated by the target information acquired by the target acquisition unit, using a trained label estimation model obtained by a learning device. The learning device includes a model learning unit that updates a label estimation model, which is a mathematical model that estimates a label to be assigned to a sentence indicated by input sentence information, by a machine learning method using model training data that includes sentence information indicating a sentence and label appropriateness information indicating the degree to which each of a plurality of labels, predetermined as candidate labels to be assigned to the sentence, is appropriate as a label for the sentence. The label appropriateness information is information obtained based on true/false information indicating a label that satisfies a predetermined condition regarding the likelihood of being assigned to the sentence indicated by the model training data, and label co-occurrence information, which is information indicating the probability of co-occurrence between any two of the plurality of labels predetermined as candidate labels to be assigned to the sentence.

本発明の一態様は、上記の学習装置としてコンピュータを機能させるためのプログラムである。 One aspect of the present invention is a program for causing a computer to function as the above-mentioned learning device.

本発明の一態様は、上記のラベル推定装置としてコンピュータを機能させるためのプログラムである。 One aspect of the present invention is a program for causing a computer to function as the above-mentioned label estimation device.

本発明により、ラベルの推定する作業に要する労力の増大を抑制する技術を提供することが可能となる。 This invention makes it possible to provide technology that reduces the increase in labor required for label estimation work.

実施形態のラベル推定システムを説明する説明図。 FIG. 1 is an explanatory diagram illustrating a label estimation system according to an embodiment.
実施形態におけるラベル共起情報の一例を示す図。 FIG. 2 is a diagram showing an example of label co-occurrence information according to the embodiment.
実施形態におけるラベル適正情報生成処理の一例を説明する図。 FIG. 3 is a diagram illustrating an example of label appropriateness information generation processing according to the embodiment.
実施形態における学習装置１のハードウェア構成の一例を示す図。 FIG. 4 is a diagram showing an example of the hardware configuration of a learning device 1 according to the embodiment.
実施形態における制御部１１の構成の一例を示す図。 FIG. 5 is a diagram showing an example of the configuration of a control unit 11 according to the embodiment.
実施形態における学習装置１が実行する処理の流れの一例を示すフローチャート。 FIG. 6 is a flowchart showing an example of the flow of processing executed by the learning device 1 according to the embodiment.
実施形態におけるラベル推定装置２のハードウェア構成の一例を示す図。 FIG. 7 is a diagram showing an example of the hardware configuration of a label estimation device 2 according to the embodiment.
実施形態における制御部２１の構成の一例を示す図。 FIG. 8 is a diagram showing an example of the configuration of a control unit 21 according to the embodiment.
実施形態におけるラベル推定装置２が実行する処理の流れの一例を示すフローチャート。 FIG. 9 is a flowchart showing an example of the flow of processing executed by the label estimation device 2 according to the embodiment.
実施形態のラベル推定システムを用いた実験結果の一例を示す第１の図。 FIG. 10 is a first diagram showing an example of experimental results using the label estimation system according to the embodiment.
実施形態のラベル推定システムを用いた実験結果の一例を示す第２の図。 FIG. 11 is a second diagram showing an example of experimental results using the label estimation system according to the embodiment.
実施形態のラベル推定システムを用いた実験結果の一例を示す第３の図。 FIG. 12 is a third diagram showing an example of experimental results using the label estimation system according to the embodiment.
変形例における制御部の構成の一例を示す図。 FIG. 13 is a diagram showing an example of the configuration of a control unit in a modified example.

（実施形態）
図１は、実施形態のラベル推定システム１００を説明する説明図である。ラベル推定システム１００は、文章に付与されるべきラベルを推定するシステムである。ラベル推定システム１００は、文章に付与されるべきラベルを推定する数理モデルを機械学習の方法により得る。ラベル推定システム１００は、取得した数理モデルを用いて、入力された文章に付与されるべきラベルを推定する。 (Embodiment)
FIG. 1 is an explanatory diagram illustrating a label estimation system 100 according to an embodiment. The label estimation system 100 is a system that estimates a label to be assigned to a sentence. The label estimation system 100 obtains a mathematical model for estimating a label to be assigned to a sentence using a machine learning method. The label estimation system 100 uses the obtained mathematical model to estimate a label to be assigned to an input sentence.

より具体的には、数理モデルの取得を終えたラベル推定システム１００は、文章情報が入力された際に、取得した数理モデルを用い、文章情報に基づき文章情報が示す文章に付与されるべきラベルを推定する。文章情報は、文章を示す情報である。より具体的にラベル推定システム１００を説明する。ラベル推定システム１００は、学習装置１とラベル推定装置２とを備える。 More specifically, when text information is input, the label estimation system 100, having acquired the mathematical model, uses the acquired mathematical model to estimate a label to be assigned to the text indicated by the text information, based on the text information. Text information is information that indicates a text. The label estimation system 100 will be described in more detail. The label estimation system 100 includes a learning device 1 and a label estimation device 2.

学習装置１は、機械学習の方法によりラベル推定モデルを更新することで学習済みラベル推定モデルを得る。ラベル推定モデルは、入力された文章情報に基づき入力された文章情報が示す文章（以下「対象文章」という。）に付与されるべきラベルを推定する数理モデルであって学習に関する所定の終了条件が満たされる前の数理モデルである。 The learning device 1 obtains a trained label estimation model by updating the label estimation model using a machine learning method. The label estimation model is a mathematical model that estimates, based on input text information, the label to be assigned to a sentence indicated by the input text information (hereinafter referred to as the "target sentence"), and is a mathematical model before a predetermined termination condition for learning is satisfied.

より具体的にはラベル推定モデルが推定する結果は、ラベル適正情報である。ラベル適正情報は、ラベル適正度を、ラベル候補それぞれについて示す情報である。ラベル適正度は、ラベルが対象文章のラベルとして適切である度合である。ラベル候補は、対象文章に付与されるラベルの候補として予め定められた複数の各ラベルである。ラベル候補は、例えば“感染症”、“ビジネス”、“スポーツ”、“株価”等の対象文章に関連付けられ得る用語である。 More specifically, the result estimated by the label estimation model is label appropriateness information. Label appropriateness information is information indicating the label appropriateness for each label candidate. Label appropriateness is the degree to which a label is appropriate as a label for the target sentence. Label candidates are each of multiple labels that have been predetermined as candidates for labels to be assigned to the target sentence. Label candidates are terms that can be associated with the target sentence, such as "infectious disease," "business," "sports," and "stock prices."

学習済みラベル推定モデルは、学習に関する所定の終了条件（以下「学習終了条件」という。）が満たされた時点のラベル推定モデルである。学習終了条件は、例えば、学習によるラベル推定モデルの変化が所定の変化より小さいという条件である。学習終了条件は、例えば、学習の回数が所定の回数に達した、という条件であってもよい。 A trained label estimation model is a label estimation model at the point in time when a predetermined termination condition for learning (hereinafter referred to as the "learning termination condition") is met. The learning termination condition is, for example, a condition that the change in the label estimation model due to learning is smaller than a predetermined change. The learning termination condition may also be, for example, a condition that the number of times learning has been performed has reached a predetermined number.
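The two example termination conditions above can be sketched as a single check. This is a minimal illustration only: the function name `learning_finished`, the flat parameter lists, and the default values of `tol` and `max_steps` are assumptions for this sketch and do not appear in the description.

```python
def learning_finished(old_params, new_params, step, tol=1e-4, max_steps=1000):
    """Return True when either example termination condition holds.

    old_params / new_params: model parameters before and after one update,
    flattened to lists of floats (a simplification for this sketch).
    step: number of updates performed so far.
    """
    # Condition 1: the change in the model caused by learning is smaller
    # than a predetermined change (here: the largest parameter change).
    change = max(abs(a - b) for a, b in zip(old_params, new_params))
    # Condition 2: the number of learning iterations reached a fixed count.
    return change < tol or step >= max_steps
```

Measuring the change by the largest absolute parameter difference is one choice among many; a norm of the difference or the change in the loss would serve the same purpose.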

以下、学習装置１が学習済みラベル推定モデルを得る処理をモデル学習処理という。機械学習の方法は、学習済みラベル推定モデルを得ることができればどのような方法であってもよい。機械学習の方法は、例えばＣＮＮ（Convolutional Neural Networks）を用いる方法であってもよいし、ＬＳＴＭ（Long short-term memory）を用いる方法であってもよいし、ＢＥＲＴ（Bidirectional Encoder Representations from Transformers）を用いる方法であってもよい。 Hereinafter, the process by which the learning device 1 obtains a trained label estimation model is referred to as a model learning process. Any machine learning method may be used as long as it can obtain a trained label estimation model. The machine learning method may, for example, be a method using CNN (Convolutional Neural Networks), a method using LSTM (Long short-term memory), or a method using BERT (Bidirectional Encoder Representations from Transformers).

学習済みラベル推定モデルを得るための機械学習の方法では、文章情報を説明変数として有するデータが用いられる。説明変数に対応する目的変数は、ラベル適正情報を示す。以下、説明変数として文章情報を有し、目的変数としてラベル適正情報を有するデータを、モデル学習用データという。モデル学習用データは学習済みラベル推定モデルの取得に用いられるデータである。すなわちモデル学習用データはラベル推定モデルの学習に用いられるデータである。以下、モデル学習用データの有するラベル適正情報を教師データという。 The machine learning method for obtaining a trained label estimation model uses data that has text information as the explanatory variable. The objective variable corresponding to the explanatory variable is label appropriateness information. Hereinafter, data that has text information as the explanatory variable and label appropriateness information as the objective variable is referred to as model training data. Model training data is the data used to obtain the trained label estimation model, that is, the data used to train the label estimation model. Hereinafter, the label appropriateness information contained in the model training data is referred to as the training data (teacher data).

＜ラベル適正情報の表現の具体例＞
ラベル適正情報の表現の具体例を説明する。ラベル候補がＮ個（Ｎは自然数）であるとき、ラベル適正情報は、例えばＮ次元のベクトルで表現される。Ｎ次元ベクトルの各要素はＮ個のラベル候補のいずれか１つに対応付けられており、インデックスｎの異なる要素は異なるラベル候補に対応付けられている。ｎは１以上Ｎ以下の自然数である。なお、インデクッスｎは、ラベル候補を区別する指標であり、なおかつ、Ｎ次元ベクトルのｎ番目の要素を示す指標である。以下、説明の簡単のため、ラベル候補がＮ個である場合を例にラベル推定システム１００を説明する。 <Examples of label appropriateness information>
A specific example of the representation of label appropriateness information will be described. When there are N label candidates (N is a natural number), the label appropriateness information is expressed, for example, as an N-dimensional vector. Each element of the N-dimensional vector corresponds to one of the N label candidates, and elements with different indexes n correspond to different label candidates. n is a natural number between 1 and N. Note that index n is an index that distinguishes between label candidates and also indicates the nth element of the N-dimensional vector. For ease of explanation, the label estimation system 100 will be described below using an example in which there are N label candidates.

ラベル適正情報を表現するＮ次元ベクトルの各要素は、対応する各ラベル候補のラベル適正度を示す。ラベル適正度は、例えば０以上１以下の値で示される。このような場合、ラベル適正情報を表現するＮ次元のベクトルの各要素の値は、例えば０に近いほど対応するラベル候補が文章情報の示す文章のラベルとして不適切であることを示す。一方、ラベル適正情報を表現するＮ次元のベクトルの各要素は、例えば値が１に近いほど対応するラベル候補が文章情報の示す文章のラベルとして適切であることを示す。 Each element of the N-dimensional vector representing the label appropriateness information indicates the label appropriateness of the corresponding label candidate. Label appropriateness is expressed, for example, as a value between 0 and 1. In that case, the closer the value of an element is to 0, the less appropriate the corresponding label candidate is as a label for the sentence indicated by the sentence information. Conversely, the closer the value of an element is to 1, the more appropriate the corresponding label candidate is as a label for that sentence.

＜モデル学習処理と損失関数とについて＞
モデル学習処理についてより詳細に説明する。モデル学習処理は、上述したように、モデル学習用データを用いた機械学習の方法により、学習終了条件が満たされるまでラベル推定モデルを更新する処理である。モデル学習処理では、損失関数を用いて計算された損失を小さくするようにラベル推定モデルの更新が行われる。なお損失関数を用いて計算された損失とは損失関数の値であり、例えばラベル推定モデルの出力と教師データとの不一致度を表す値である。 <Model learning process and loss function>
The model learning process will be described in more detail. As described above, the model learning process is a process of updating the label estimation model by a machine learning method using model learning data until the learning termination condition is met. In the model learning process, the label estimation model is updated so as to reduce the loss calculated using a loss function. Note that the loss calculated using the loss function is the value of the loss function, and is, for example, a value that represents the degree of mismatch between the output of the label estimation model and the training data.

損失関数は、教師データとラベル推定モデルの推定結果との一致度と不一致度とを用いて表現される指標である。損失関数は、例えば以下の式（１）で定義されるバイナリクロスエントロピーであってもよい。 The loss function is an index expressed using the degree of agreement and disagreement between the training data and the estimation results of the label estimation model. The loss function may be, for example, the binary cross-entropy defined by the following equation (1):
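Equation (1) is not reproduced in this text. Binary cross-entropy is usually written as follows, using the symbols y_n and ŷ_n defined in the description; the averaging factor 1/N is an assumption and may differ from the original equation.

```latex
\mathcal{L} = -\frac{1}{N} \sum_{n=1}^{N}
  \left[ y_n \log \hat{y}_n + (1 - y_n) \log\left(1 - \hat{y}_n\right) \right]
```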

ｙ_ｎは、教師データが示すラベル適正度であってインデクッスｎのラベル候補のラベル適正度を示す。ｙ｛＾｝_ｎは、ラベル推定モデルによって推定されたラベル適正度であってインデクッスｎのラベル候補のラベル適正度を示す。なお、Ａ｛＾｝は、記号Ａにサーカムフレックスが付与された記号を示す。したがって、ｙ｛＾｝_ｎは、記号ｙにサーカムフレックスが付与された記号に下付き文字ｎが付与された記号を意味する。より具体的にはｙ｛＾｝_ｎは以下の式（２）の記号を意味する。 y_n is the label appropriateness that the training data indicates for the label candidate with index n. y{^}_n is the label appropriateness that the label estimation model estimates for the label candidate with index n. Here, A{^} denotes the symbol A with a circumflex added, so y{^}_n denotes the symbol y with a circumflex and a subscript n. More specifically, y{^}_n denotes the symbol ŷ_n of formula (2).

式（１）における以下の式（３）で表現される項は、ラベル推定モデルの推定結果と教師データとが定性的に一致する場合におけるラベル推定モデルの推定結果と教師データとの間の定量的な違いを示す。 The term expressed in the following equation (3) in equation (1) indicates the quantitative difference between the estimation results of the label estimation model and the training data when the estimation results of the label estimation model and the training data qualitatively match.

式（１）における以下の式（４）で表現される項は、ラベル推定モデルの推定結果と教師データとが定性的に不一致である場合におけるラベル推定モデルの推定結果と教師データとの間の定量的な違いを示す。 The term expressed by the following equation (4) in equation (1) indicates the quantitative difference between the estimation results of the label estimation model and the training data when there is a qualitative mismatch between the estimation results of the label estimation model and the training data.

なお、損失関数を小さくするよう更新の具体的な一例は、式（１）が示す損失関数を用いて、ラベル推定モデルが不正解ラベル情報を推定する確率を増大させないようにラベル推定モデルを更新する処理である。不正解ラベル情報は、定性的に教師データと不一致なラベル適正情報である。 A specific example of updating to reduce the loss function is a process that uses the loss function shown in equation (1) to update the label estimation model so as not to increase the probability that the label estimation model will infer incorrect label information. Incorrect label information is label appropriateness information that qualitatively disagrees with the training data.
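As a hedged illustration of the loss described above, the following sketch computes binary cross-entropy for one sentence in plain Python. The function name, the clipping constant `eps`, and the averaging over N are assumptions for this sketch. The two summed terms presumably correspond to the terms that the description calls equations (3) and (4), which are not reproduced in this text.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy between training labels and model estimates.

    y_true: label appropriateness values from the training data (0..1).
    y_pred: label appropriateness values estimated by the model (0..1).
    eps:    clipping constant that avoids log(0); an illustrative choice.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        # Term weighted by y: grows in magnitude when the model assigns a
        # low value to a label that the training data marks as appropriate.
        agree_term = y * math.log(p)
        # Term weighted by (1 - y): grows in magnitude when the model assigns
        # a high value to a label that the training data marks as inappropriate.
        disagree_term = (1.0 - y) * math.log(1.0 - p)
        total += agree_term + disagree_term
    return -total / len(y_true)
```

Because the targets may be soft values between 0 and 1 rather than strictly 0 or 1, both terms contribute for every label candidate.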

ラベル推定装置２は、学習装置１が取得した学習済みラベル推定モデルを用いて、入力された文章情報が示す対象文章に付与されるべきラベルを推定する。より具体的には、ラベル推定装置２は、学習装置１が取得した学習済みラベル推定モデルを用いて、入力された文章情報が示す対象文章に対するラベル候補それぞれのラベル適正度を推定する。 The label estimation device 2 uses the trained label estimation model acquired by the learning device 1 to estimate the label to be assigned to the target sentence indicated by the input text information. More specifically, the label estimation device 2 uses the trained label estimation model acquired by the learning device 1 to estimate the label appropriateness of each label candidate for the target sentence indicated by the input text information.

＜モデル学習用データが含むラベル適正情報の生成について＞
モデル学習用データが含むラベル適正情報を生成する方法の一例について説明する。ラベル適正情報は、例えば人手又は装置により、正否情報とラベル共起情報とに基づいて生成される。 <Generating label appropriateness information included in model training data>
An example of a method for generating label appropriateness information included in model learning data will be described below. The label appropriateness information is generated, for example, manually or by a device, based on correct/incorrect information and label co-occurrence information.

正否情報は、モデル学習用データが含む文章情報の示す文章に対して付与される確率の高さに関する所定の条件を満たすラベルを示す情報である。すなわち、正否情報は、モデル学習用データの示す文章に対するラベル適正度の高さに関する所定の条件を満たすラベルを示す情報である。ラベル適正度の高さに関する所定の条件（以下「ラベル適正条件」という。）は、例えば、ラベル適正度が最も高い、という条件である。正否情報は、ラベル適正条件を満たすラベルが複数である場合には、複数のラベルを示してもよい。正否情報は、例えば、付与される確率の最も高いラベルに対応する要素の値のみ１であり、他の要素の値が０である、Ｎ次元のベクトルで表現される。以下、説明の簡単のためラベル適正条件が、ラベル適正度が最も高い、という条件である場合を例に、ラベル推定システム１００を説明する。 The correct/incorrect information is information indicating a label that satisfies a predetermined condition regarding the likelihood of being assigned to a sentence indicated by the sentence information included in the model training data. In other words, the correct/incorrect information is information indicating a label that satisfies a predetermined condition regarding the likelihood of label appropriateness for a sentence indicated by the model training data. The predetermined condition regarding the likelihood of label appropriateness (hereinafter referred to as the "label appropriateness condition") is, for example, the condition that the label appropriateness is the highest. If there are multiple labels that satisfy the label appropriateness condition, the correct/incorrect information may indicate multiple labels. The correct/incorrect information is expressed, for example, as an N-dimensional vector in which only the element corresponding to the label with the highest likelihood of being assigned has a value of 1, and the other elements have values of 0. For simplicity of explanation, the label estimation system 100 will be described below using an example in which the label appropriateness condition is the condition that the label appropriateness is the highest.

ラベル共起情報は、Ｎ個のラベル候補のうちの任意の２つのラベル候補の間の共起の確率を示す情報である。共起の確率とは、具体的には、一方のラベル候補が文章中に出現する場合に、他方のラベル候補が文章中に出現する確率である。なお、ラベル共起情報は、同一のラベル候補間の共起の確率を示してもよい。同一のラベル候補間の共起の確率とは自己相関のことなので、同一のラベル候補間の共起の確率は１である。なお、ラベル共起情報は必ずしも同一のラベル候補間の共起の確率を示す必要は無く、このような場合にはラベル共起情報が示す同一のラベル候補間の共起の確率は、例えば０である。 Label co-occurrence information is information that indicates the probability of co-occurrence between any two of N label candidates. Specifically, the probability of co-occurrence is the probability that one label candidate appears in a sentence when the other label candidate appears in the sentence. Note that label co-occurrence information may also indicate the probability of co-occurrence between identical label candidates. Since the probability of co-occurrence between identical label candidates refers to autocorrelation, the probability of co-occurrence between identical label candidates is 1. Note that label co-occurrence information does not necessarily have to indicate the probability of co-occurrence between identical label candidates; in such cases, the probability of co-occurrence between identical label candidates indicated by label co-occurrence information is, for example, 0.

図２は、実施形態におけるラベル共起情報の一例を示す図である。ラベル共起情報は、例えば要素の値が０以上１以下の正定値行列で表現される。図２の例では、縦と横はそれぞれラベル候補を示し、対角成分は自己相関を示す。 Figure 2 shows an example of label co-occurrence information in an embodiment. The label co-occurrence information is expressed, for example, as a positive definite matrix whose elements have values between 0 and 1. In the example of Figure 2, the rows and columns each represent label candidates, and the diagonal elements represent autocorrelation.

図２のラベル共起情報は、より具体的には、ラベル候補の同士のＰＰＭＩ（Positive Pointwise Mutual Information）スコアを示す行列である。なお、ＰＰＭＩスコアは以下の式（５）で定義される。 More specifically, the label co-occurrence information in Figure 2 is a matrix showing the PPMI (Positive Pointwise Mutual Information) scores between label candidates. The PPMI score is defined by the following equation (5):
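Equation (5) is not reproduced in this text. A common definition of PPMI, written with the counts C(·) that the description uses, is shown below. The total count D (for example, the number of sentences in the prior sentence set) is an assumption introduced here, and the original equation may be normalized differently.

```latex
\mathrm{PPMI}(l_n, l_m) =
  \max\left( 0,\; \log \frac{C(l_n, l_m)\, D}{C(l_n)\, C(l_m)} \right)
```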

式（５）において、ｌ_ｎはインデックスｎのラベル候補を示し、ｌ_ｍはインデックスｍのラベル候補を示す。なお、ｍは、１以上Ｎ以下の整数である。ｍはｎと同じ値であってもよいし異なってもよい。Ｃ（ｌ_ｎ）は、予め用意された複数の所定の文章の集合（以下「事前文章集合」という。）におけるインデクッスｎのラベル候補の出現回数を示す。Ｃ（ｌ_ｍ）は、事前文章集合におけるインデクッスｍのラベル候補の出現回数を示す。Ｃ（ｌ_ｎ、ｌ_ｍ）は、事前文章集合におけるインデックスｎのラベル候補とインデクッスｍのラベル候補との共起回数を示す。 In formula (5), l_n denotes the label candidate with index n, and l_m denotes the label candidate with index m. Here, m is an integer between 1 and N, and m may be equal to n or different from n. C(l_n) denotes the number of times the label candidate with index n appears in a set of predetermined sentences prepared in advance (hereinafter referred to as the "prior sentence set"). C(l_m) denotes the number of times the label candidate with index m appears in the prior sentence set. C(l_n, l_m) denotes the number of times the label candidate with index n and the label candidate with index m co-occur in the prior sentence set.
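The counts described above can be turned into a PPMI matrix along the following lines. This is a sketch under stated assumptions: the prior sentence set is represented as a list of label sets, one per sentence, the total number of sentences is used as the normalizer, and the function name `ppmi_matrix` is hypothetical. The diagonal is computed like any other entry here, whereas the description also allows it to be fixed at 1 or 0.

```python
import math


def ppmi_matrix(label_sets, labels):
    """Build a PPMI matrix from a prior sentence set.

    label_sets: one set of labels per sentence in the prior sentence set.
    labels:     the N label candidates, in index order.
    """
    total = len(label_sets)
    # C(l_n): number of sentences in which each label candidate appears.
    count = {l: sum(1 for s in label_sets if l in s) for l in labels}
    matrix = []
    for a in labels:
        row = []
        for b in labels:
            # C(l_n, l_m): number of sentences containing both labels.
            joint = sum(1 for s in label_sets if a in s and b in s)
            if joint == 0:
                row.append(0.0)  # no co-occurrence: PPMI is clipped to 0
            else:
                pmi = math.log(joint * total / (count[a] * count[b]))
                row.append(max(0.0, pmi))  # keep only positive PMI
        matrix.append(row)
    return matrix
```

For example, with four sentences labeled {business, stock}, {business, stock}, {sports}, and {sports, health}, the entry for "business" and "stock" is log 2 and the entry for "business" and "sports" is 0.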

以下、正否情報とラベル共起情報とに基づきラベル適正情報を生成する処理を、ラベル適正情報生成処理という。ラベル適正情報生成処理では、例えば、正否情報を表現するベクトルの要素の値が１であるラベル候補について、他のラベル候補が共起する確率がラベル共起情報を用いて取得される処理が実行される。正否情報を表現するベクトルの要素の値が１のラベル候補が複数の場合には、例えば、要素の値が１の複数のラベル候補について他のラベル候補が共起する確率を取得し、他のラベル候補ごとに共起する確率の和が算出される。ラベル適正情報生成処理では次に、シグモイド関数等の独立変数の値を０以上１以下の所定の値に制限する関数を用いて、他のラベル候補の共起する確率を０以上１以下の値に変換する処理が実行される。 Hereinafter, the process of generating label appropriateness information based on correct/incorrect information and label co-occurrence information is referred to as the label appropriateness information generation process. In the label appropriateness information generation process, for example, for a label candidate whose element of the vector representing the correct/incorrect information has a value of 1, the probability of co-occurrence with other label candidates is obtained using the label co-occurrence information. If there are multiple label candidates whose element of the vector representing the correct/incorrect information has a value of 1, for example, the probability of co-occurrence with other label candidates is obtained for the multiple label candidates whose element has a value of 1, and the sum of the co-occurrence probabilities for each other label candidate is calculated. Next, in the label appropriateness information generation process, a function that limits the value of an independent variable to a predetermined value between 0 and 1, such as a sigmoid function, is used to convert the probability of co-occurrence with other label candidates to a value between 0 and 1.

ラベル適正情報生成処理では、正否情報を表現するベクトルについて、正否情報を表現するベクトルの値が０であった要素の値が、変換後の値に置き換えられる処理（以下「置き換え処理」という。）が実行される。置き換え処理によって要素の値が変更された正否情報が、ラベル適正情報である。 In the label appropriateness information generation process, a process (hereinafter referred to as the "replacement process") is performed in which the element values of the vector representing the correct/incorrect information, which had a value of 0, are replaced with converted values. The correct/incorrect information whose element values have been changed by the replacement process is the label appropriateness information.

図３は、実施形態におけるラベル適正情報生成処理の一例を説明する図である。より具体的には、図３は、ラベル共起情報がラベル候補の同士のＰＰＭＩスコアを示す行列（以下「ＰＰＭＩ行列」という。）である場合を例に、ラベル適正情報生成処理の一例を説明する説明図である。 Figure 3 is a diagram illustrating an example of the label appropriateness information generation process in an embodiment. More specifically, Figure 3 is an explanatory diagram illustrating an example of the label appropriateness information generation process in an example where the label co-occurrence information is a matrix indicating the PPMI scores of label candidates (hereinafter referred to as the "PPMI matrix").

図３は画像Ｇ１～Ｇ５を示す。画像Ｇ１は、正否情報の一例を示す。画像Ｇ１の正否情報は、ラベル候補として、“スポーツ”、“ビジネス”、”健康”、“ワクチン”及び“感染症”の５つを示す。画像Ｇ１は、”ビジネス”と”感染症”とのラベル適正度が最も高いことを示す。図３において、”ビジネス”と”感染症”とは、ラベル適正条件を満たすラベル候補である。 Figure 3 shows images G1 to G5. Image G1 shows an example of correct/incorrect information. The correct/incorrect information for image G1 shows five label candidates: "sports," "business," "health," "vaccine," and "infectious disease." Image G1 shows that "business" and "infectious disease" have the highest label appropriateness. In Figure 3, "business" and "infectious disease" are label candidates that meet the label appropriateness conditions.

画像Ｇ２は、ＰＰＭＩ行列の一例を示す。画像Ｇ３は、シグモイド関数を示す。画像Ｇ４は、ＰＰＭＩ行列の行のうち、ラベル適正条件を満たす行のベクトル和を得る処理を表す。具体的には、ラベル候補が”ビジネス”のラベル候補と共起する確率を示す行と、ラベル候補が”感染症”のラベル候補と共起する確率を示す行と、のベクトル和を得る処理を表す。画像Ｇ５は、ラベル適正情報の一例を示す。 Image G2 shows an example of a PPMI matrix. Image G3 shows a sigmoid function. Image G4 shows the process of obtaining the vector sum of rows in the PPMI matrix that satisfy the label appropriateness conditions. Specifically, it shows the process of obtaining the vector sum of rows indicating the probability that a label candidate will co-occur with the label candidate "business" and rows indicating the probability that a label candidate will co-occur with the label candidate "infectious disease." Image G5 shows an example of label appropriateness information.

ラベル適正情報生成処理では、画像Ｇ４が示すように、ＰＰＭＩ行列における正解ラベルとラベル候補との間の共起の確率を示す行（以下「主共起行」という。）を足し合わせる処理が実行される。以下、ＰＰＭＩ行列における主共起行を足し合わせる処理を、足し合わせ処理という。図３の例では、”ビジネス”と“感染症”とがそれぞれ正解ラベルであり、”ビジネス”の行と”感染症”の行とを足し合わせる処理が足し合わせ処理である。正解ラベルとは、正否情報を示すＮ次元ベクトルの要素に対応するラベル候補のうち値が１の要素に対応するラベル候補である。すなわち、正解ラベルとは、ラベル適正条件を満たすラベル候補である。 As shown in image G4, the label appropriateness information generation process involves adding up rows in the PPMI matrix that indicate the probability of co-occurrence between the correct label and label candidates (hereinafter referred to as "major co-occurrence rows"). Hereinafter, the process of adding up major co-occurrence rows in the PPMI matrix is referred to as the addition process. In the example of Figure 3, "business" and "infectious disease" are correct labels, and the process of adding up the "business" row and the "infectious disease" row is the addition process. The correct label is the label candidate that corresponds to an element with a value of 1 among the label candidates corresponding to the elements of the N-dimensional vector that indicates correctness information. In other words, the correct label is the label candidate that satisfies the label appropriateness conditions.

足し合わせ処理の実行により、正解ラベルと共起しやすい不正解ラベルのＰＰＭＩスコアを不正解ラベルごとに足し合わせることが行われる。足し合わせの結果得られる情報は、例えばＮ次元のベクトルで表現される。図３の例あれば、“ビジネス”と共起しやすい不正解ラベルのＰＰＭＩスコアと”感染症”と共起しやすい不正解ラベルのＰＰＭＩスコアとを足し合わせることが、足し合わせ処理により行われる。 By performing the summation process, the PPMI scores of incorrect labels that tend to co-occur with correct labels are added together for each incorrect label. The information obtained as a result of the summation is expressed, for example, as an N-dimensional vector. In the example of Figure 3, the summation process adds together the PPMI scores of incorrect labels that tend to co-occur with "business" and the PPMI scores of incorrect labels that tend to co-occur with "infectious disease".

不正解ラベルは、正否情報を示すＮ次元ベクトルの要素に対応するラベル候補のうち値が０の要素に対応するラベル候補である。すなわち、不正解ラベルとは、ラベル候補のうち正解ラベルではないラベル候補である。図３の例では、”スポーツ”、“健康”、”ワクチン”である。 An incorrect label is a label candidate that corresponds to an element with a value of 0 among the label candidates corresponding to the elements of the N-dimensional vector indicating correct/incorrect information. In other words, an incorrect label is a label candidate that is not a correct label. In the example in Figure 3, the incorrect labels are "sports", "health", and "vaccine".

ラベル適正情報生成処理では、次に、足し合わせ処理の実行により得られたＮ次元のベクトル（以下「足し合わせ結果ベクトル」という。）を画像Ｇ３に示すシグモイド関数に入力することにより、足し合わせ結果ベクトルの各要素の値を０以上１以下の値に正規化する処理が実行される。以下、足し合わせ結果ベクトルをシグモイド関数に入力することにより、足し合わせ結果ベクトルの各要素の値を０～１に正規化する処理を、第１正規化処理という。第１正規化処理は以下の式（６）によって定義される。 Next, in the label appropriateness information generation process, the N-dimensional vector obtained by executing the addition process (hereinafter referred to as the "addition result vector") is input into the sigmoid function shown in image G3, whereby the value of each element of the addition result vector is normalized to a value between 0 and 1. Hereinafter, the process of normalizing the value of each element of the addition result vector to a value between 0 and 1 by inputting the addition result vector into the sigmoid function is referred to as the first normalization process. The first normalization process is defined by the following equation (6).

Ｐ_ｎｍはＰＰＭＩ行列を意味する。σ（・）は、シグモイド関数を表す。 P_nm denotes the PPMI matrix, and σ(·) denotes the sigmoid function.

ラベル適正情報生成処理では第１正規化処理の実行後に、得られたスコアＳ_ｎとベクトルｙ_ｎとを足し合わせる処理（以下「平滑化処理」という。）が実行される。ラベル適正情報生成処理では、第１正規化処理の実行後に、以下の式（７）及び（８）によって示される正規化の処理（以下「第２正規化処理」という。）も実行される。 In the label appropriateness information generation process, after the first normalization process is performed, a process of adding the obtained score S_n and the vector y_n (hereinafter referred to as the "smoothing process") is performed. After the first normalization process, a normalization process given by the following equations (7) and (8) (hereinafter referred to as the "second normalization process") is also performed.

p'_nm denotes the element in the n-th row and m-th column of the PPMI matrix P'. y_m is a vector indicating the row that gives the co-occurrence probabilities between the correct label with index m and the label candidates. s_n is a vector. α is a hyperparameter indicating the strength of the smoothing; it is a coefficient between 0 and 1 inclusive and is called, for example, a scaling rate. In this way, the replacement process includes the addition process, the first normalization process, the smoothing process, and the second normalization process. The vector y_n' obtained in this way is an example of label appropriateness information.
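The four steps of the replacement process can be illustrated with a minimal Python sketch. Equations (6) to (8) are not reproduced in this text, so the concrete forms below are assumptions: the addition process is taken to sum the PPMI rows of the correct labels, the smoothing process is taken to be y + α·s, and the second normalization is replaced by a hypothetical stand-in (division by the largest element). The function name `label_appropriateness` and the example values are likewise illustrative only.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic sigmoid, used for the first normalization process."""
    return 1.0 / (1.0 + math.exp(-x))


def label_appropriateness(y: List[int], ppmi: List[List[float]], alpha: float) -> List[float]:
    """Turn a 0/1 correct/incorrect vector into soft label appropriateness values.

    y     : N-dimensional correct/incorrect vector (1 = correct label)
    ppmi  : N x N label co-occurrence (PPMI) matrix
    alpha : smoothing strength, 0 <= alpha <= 1
    """
    n = len(y)
    # Addition process (assumed form): sum the PPMI rows of the correct labels.
    added = [sum(ppmi[m][j] for m in range(n) if y[m] == 1) for j in range(n)]
    # First normalization process: squash each element with the sigmoid.
    s = [sigmoid(v) for v in added]
    # Smoothing process (assumed form): add the scaled score to y.
    smoothed = [y[j] + alpha * s[j] for j in range(n)]
    # Second normalization process (hypothetical stand-in for equations (7), (8)).
    peak = max(smoothed)
    return [v / peak for v in smoothed] if peak > 0 else smoothed


# Hypothetical three-label example: label 0 is correct and co-occurs with label 1.
example = label_appropriateness(
    [1, 0, 0],
    [[0.0, 2.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    alpha=0.5,
)
print(example)
```

Under these assumptions, an incorrect label that frequently co-occurs with a correct label receives a larger value than an incorrect label that does not, which is the qualitative behavior the smoothing is meant to produce.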

Figure 4 is a diagram showing an example of the hardware configuration of the learning device 1 in the embodiment. The learning device 1 includes a control unit 11, which has a processor 91 such as a CPU (Central Processing Unit) and a memory 92 connected by a bus, and executes a program. By executing the program, the learning device 1 functions as a device including the control unit 11, an input unit 12, a communication unit 13, a storage unit 14, and an output unit 15.

More specifically, the processor 91 reads the program stored in the storage unit 14 and stores it in the memory 92. When the processor 91 executes the program stored in the memory 92, the learning device 1 functions as a device including the control unit 11, the input unit 12, the communication unit 13, the storage unit 14, and the output unit 15.

The control unit 11 controls the operation of the various functional units of the learning device 1. The control unit 11 executes, for example, the model learning process. The control unit 11 may also execute the label appropriateness information generation process. As described above, the label appropriateness information generation process may be performed manually or may be executed by a device. In the following, the label estimation system 100 is described taking as an example the case where the learning device 1 executes the label appropriateness information generation process.

The control unit 11 controls, for example, the operation of the output unit 15. The control unit 11 records, for example, various information generated by the model learning process in the storage unit 14. The control unit 11 also records, for example, the obtained label appropriateness information in the storage unit 14.

The input unit 12 includes input devices such as a mouse, a keyboard, and a touch panel. The input unit 12 may instead be configured as an interface that connects these input devices to the learning device 1. The input unit 12 accepts input of various information to the learning device 1.

The communication unit 13 includes a communication interface for connecting the learning device 1 to an external device, and communicates with the external device over a wired or wireless connection. The external device is, for example, a device that transmits the correct/incorrect information, a device that transmits the label co-occurrence information, a device that transmits the model training data, or the label estimation device 2. Note that the correct/incorrect information, the label co-occurrence information, and the model training data do not necessarily need to be input via the communication unit 13; they may instead be input to the input unit 12.

The storage unit 14 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 14 stores various information related to the learning device 1. The storage unit 14 stores, for example, information input via the input unit 12 or the communication unit 13, various information generated by the model learning process, and the label appropriateness information. The storage unit 14 stores the label estimation model in advance. The storage unit 14 may also store the obtained trained label estimation model.

The output unit 15 outputs various information. The output unit 15 includes a display device such as a CRT (Cathode Ray Tube) display, a liquid crystal display, or an organic EL (Electro-Luminescence) display. The output unit 15 may instead be configured as an interface that connects these display devices to the learning device 1. The output unit 15 outputs, for example, information input to the input unit 12. The output unit 15 may also display, for example, the results of the model learning process or the label appropriateness information.

Figure 5 is a diagram showing an example of the configuration of the control unit 11 in the embodiment. The control unit 11 includes a label appropriateness information acquisition unit 110, a model learning unit 120, a storage control unit 130, a communication control unit 140, and an output control unit 150.

The label appropriateness information acquisition unit 110 acquires the label appropriateness information. Specifically, it acquires the label appropriateness information by executing the label appropriateness information generation process based on the correct/incorrect information and the label co-occurrence information input to the input unit 12 or the communication unit 13.

The model learning unit 120 updates the label estimation model, using the label appropriateness information and the model training data input to the input unit 12 or the communication unit 13, until a learning termination condition is satisfied. In other words, the model learning unit 120 obtains the trained label estimation model by executing the model learning process using the label appropriateness information and the model training data.

The storage control unit 130 records various information in the storage unit 14. The communication control unit 140 controls the operation of the communication unit 13. The output control unit 150 controls the operation of the output unit 15.

Figure 6 is a flowchart showing an example of the flow of processing executed by the learning device 1 in the embodiment. The label appropriateness information acquisition unit 110 acquires the label appropriateness information (step S101). Next, the model training data is input to the input unit 12 or the communication unit 13 (step S102). Next, the model learning unit 120 estimates label appropriateness information by inputting the text information indicated by the model training data into the label estimation model (step S103). Next, the model learning unit 120 updates the label estimation model based on the label appropriateness information included in the model training data and the estimation result of step S103 (step S104). Next, the model learning unit 120 determines whether the learning termination condition is satisfied (step S105). If the learning termination condition is satisfied (step S105: YES), the processing ends. If the learning termination condition is not satisfied (step S105: NO), the processing returns to step S102.

The processing from step S102 to step S105, repeated until the learning termination condition is satisfied, is an example of the model learning process.
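The loop of steps S102 to S105 can be sketched in Python as follows. This is only an illustration under stated assumptions: the label estimation model is replaced by a hypothetical one-layer sigmoid classifier over fixed-length feature vectors, the update rule is plain gradient descent on the cross-entropy against the soft label appropriateness targets, and the learning termination condition is assumed to be either a small mean squared error or a maximum number of rounds. None of these choices is specified by the embodiment.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], List[float]]  # (text features, label appropriateness targets)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights: List[List[float]], features: List[float]) -> List[float]:
    """One sigmoid output per label candidate (stand-in for the label estimation model)."""
    return [sigmoid(sum(w * x for w, x in zip(row, features))) for row in weights]


def train(dataset: List[Sample], n_labels: int, n_features: int,
          lr: float = 0.5, max_rounds: int = 200, tol: float = 1e-3) -> List[List[float]]:
    weights = [[0.0] * n_features for _ in range(n_labels)]
    for _ in range(max_rounds):
        total_loss = 0.0
        for features, target in dataset:            # S102: training data is input
            estimate = predict(weights, features)   # S103: estimate label appropriateness
            for k in range(n_labels):               # S104: update the model
                err = estimate[k] - target[k]
                total_loss += err * err
                for j in range(n_features):
                    weights[k][j] -= lr * err * features[j]
        if total_loss / len(dataset) < tol:         # S105: termination condition (assumed)
            break
    return weights


# Hypothetical data: two documents, three features, two label candidates.
data = [([1.0, 0.0, 1.0], [1.0, 0.35]), ([0.0, 1.0, 1.0], [0.2, 1.0])]
model = train(data, n_labels=2, n_features=3)
print(predict(model, data[0][0]))
```

Because the targets are soft values rather than 0 or 1, the model is trained to reproduce intermediate appropriateness values such as 0.35 for labels that are not correct but tend to co-occur with correct ones.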

Figure 7 is a diagram showing an example of the hardware configuration of the label estimation device 2 in the embodiment. The label estimation device 2 includes a control unit 21, which has a processor 93 such as a CPU and a memory 94 connected by a bus, and executes a program. By executing the program, the label estimation device 2 functions as a device including the control unit 21, an input unit 22, a communication unit 23, a storage unit 24, and an output unit 25.

More specifically, the processor 93 reads the program stored in the storage unit 24 and stores it in the memory 94. When the processor 93 executes the program stored in the memory 94, the label estimation device 2 functions as a device including the control unit 21, the input unit 22, the communication unit 23, the storage unit 24, and the output unit 25.

The control unit 21 controls the operation of the various functional units of the label estimation device 2. The control unit 21, for example, executes the trained label estimation model, controls the operation of the output unit 25, and records various information generated by executing the trained label estimation model in the storage unit 24.

The input unit 22 includes input devices such as a mouse, a keyboard, and a touch panel. The input unit 22 may instead be configured as an interface that connects these input devices to the label estimation device 2. The input unit 22 accepts input of various information to the label estimation device 2.

The communication unit 23 includes a communication interface for connecting the label estimation device 2 to an external device, and communicates with the external device over a wired or wireless connection. The external device is, for example, a device that transmits text information, or the learning device 1. The communication unit 23 acquires the trained label estimation model by communicating with the learning device 1. Note that the text information does not necessarily need to be input to the communication unit 23; it may instead be input to the input unit 22.

The storage unit 24 is configured using a computer-readable storage medium device such as a magnetic hard disk device or a semiconductor storage device. The storage unit 24 stores various information related to the label estimation device 2. The storage unit 24 stores, for example, information input via the input unit 22 or the communication unit 23 and various information generated by executing the trained label estimation model. The storage unit 24 also stores the trained label estimation model.

The output unit 25 outputs various information. The output unit 25 includes a display device such as a CRT display, a liquid crystal display, or an organic EL display. The output unit 25 may instead be configured as an interface that connects these display devices to the label estimation device 2. The output unit 25 outputs, for example, information input to the input unit 22. The output unit 25 may also display, for example, the results of executing the trained label estimation model.

Figure 8 is a diagram showing an example of the configuration of the control unit 21 in the embodiment. The control unit 21 includes an object acquisition unit 210, an estimation unit 220, a storage control unit 230, a communication control unit 240, and an output control unit 250. The object acquisition unit 210 acquires the text information input to the input unit 22 or the communication unit 23.

The estimation unit 220 executes the trained label estimation model on the text information acquired by the object acquisition unit 210. By executing the trained label estimation model, the estimation unit 220 estimates the label appropriateness information for that text information.

The storage control unit 230 records various information in the storage unit 24. The communication control unit 240 controls the operation of the communication unit 23. The output control unit 250 controls the operation of the output unit 25.

Figure 9 is a flowchart showing an example of the flow of processing executed by the label estimation device 2 in the embodiment. The object acquisition unit 210 acquires the text information input to the input unit 22 or the communication unit 23 (step S201). Next, the estimation unit 220 executes the trained label estimation model to estimate the label appropriateness information for the text information acquired by the object acquisition unit 210 (step S202). Next, the output control unit 250 controls the operation of the output unit 25 to cause the output unit 25 to output the estimated label appropriateness information (step S203).
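The flow of steps S201 to S203 can be sketched in Python as follows. The trained label estimation model is replaced here by a hypothetical keyword-based stand-in (`toy_model`), and the label candidates, keywords, and score values are illustrative assumptions rather than anything defined by the embodiment.

```python
from typing import Callable, Dict, List

# Hypothetical label candidates, in the order of the model's output vector.
LABELS: List[str] = ["infectious disease", "stock price", "business", "sports"]


def toy_model(text: str) -> List[float]:
    """Hypothetical stand-in for the trained label estimation model."""
    keywords = {
        "infectious disease": "virus",
        "stock price": "shares",
        "business": "company",
        "sports": "match",
    }
    lowered = text.lower()
    return [0.9 if keywords[label] in lowered else 0.1 for label in LABELS]


def run_label_estimation(text: str, model: Callable[[str], List[float]]) -> Dict[str, float]:
    acquired = text.strip()                 # S201: acquire the text information
    scores = model(acquired)                # S202: estimate label appropriateness
    result = dict(zip(LABELS, scores))
    ranked = sorted(result.items(), key=lambda kv: kv[1], reverse=True)
    for label, score in ranked:             # S203: output the estimation result
        print(f"{label}: {score:.2f}")
    return result


estimated = run_label_estimation("Company shares fell as the virus spread.", toy_model)
```

The sketch returns one appropriateness value per label candidate, so several labels can be reported for a single text.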

(Experimental results)
The results of an experiment using the label estimation system 100 are described here. The experiment used benchmark datasets for multi-label classification. Specifically, three datasets were used: Reuters-21578, the Arxiv Academic Paper Dataset (AAPD), and 20Newsgroups. The machine learning models used were models employed in natural language processing. Specifically, BERT (Bidirectional Encoder Representations from Transformers), Bi-LSTM (bidirectional Long Short-Term Memory), and CNN (Convolutional Neural Network) were used. Micro-f1 and Macro-f1 were used as the evaluation metrics.

Figure 10 is a first diagram showing an example of experimental results obtained with the label estimation system 100 of the embodiment. In Figure 10, the rows whose "Method" column reads "BERT w/ALS", "LSTM w/ALS", or "CNN w/ALS" show results obtained with the label estimation system 100. The rows whose "Method" column reads "BERT only", "LSTM only", or "CNN only" show estimation results of a trained label estimation model obtained using the correct/incorrect information without the label appropriateness information.

Note that "BERT" in "BERT w/ALS" and "BERT only" indicates that the machine learning model used in the experiment was BERT. Likewise, "LSTM" in "LSTM w/ALS" and "LSTM only" indicates Bi-LSTM, and "CNN" in "CNN w/ALS" and "CNN only" indicates CNN.

"Macro-f1" under "Reuters-21578" indicates the Macro-f1 value when the dataset used was Reuters-21578, and "Micro-f1" under "Reuters-21578" indicates the Micro-f1 value for the same dataset. "Macro-f1" and "Micro-f1" under "AAPD" indicate the Macro-f1 and Micro-f1 values when the dataset used was AAPD. "Macro-f1" and "Micro-f1" under "20Newsgroups" indicate the Macro-f1 and Micro-f1 values when the dataset used was 20Newsgroups.
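For reference, the two evaluation metrics can be computed as in the following Python sketch, which uses the standard definitions: Micro-f1 pools true positives, false positives, and false negatives over all labels before computing F1, whereas Macro-f1 averages the per-label F1 values. The function names and the example predictions are illustrative and are not taken from the experiment.

```python
from typing import List, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> Tuple[float, float]:
    """Return (Micro-f1, Macro-f1) for binary multi-label indicator matrices."""
    n_labels = len(y_true[0])
    counts = []
    for k in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 values, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts) / n_labels
    # Micro: pool the counts over all labels, so frequent labels dominate.
    micro = f1(sum(c[0] for c in counts), sum(c[1] for c in counts), sum(c[2] for c in counts))
    return micro, macro


# Hypothetical example: three documents, three label candidates.
truth = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
preds = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
print(micro_macro_f1(truth, preds))
```

Because Macro-f1 gives every label equal weight, it is the more sensitive of the two metrics to performance on infrequent labels.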

The results in Figure 10 were obtained by running the experiment five times with different random seeds. The numbers in parentheses in Figure 10 are standard deviations. The results in Figure 10 show that the label estimation system 100 can estimate labels with high accuracy regardless of which machine learning model, such as CNN or Bi-LSTM, is used.
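A value reported as a mean with a standard deviation in parentheses can be obtained from five runs as in the following Python sketch. The scores are hypothetical, and the text does not state whether the sample or the population standard deviation was used; the sketch assumes the sample standard deviation.

```python
import statistics
from typing import List, Tuple


def summarize(scores: List[float]) -> Tuple[float, float]:
    """Return the mean and the sample standard deviation of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical Micro-f1 values from five runs with different random seeds.
runs = [0.80, 0.82, 0.81, 0.79, 0.83]
mean, std = summarize(runs)
print(f"{mean:.3f} ({std:.3f})")
```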

Figure 11 is a second diagram showing an example of experimental results obtained with the label estimation system 100 of the embodiment. More specifically, Figure 11 shows the results of evaluating the estimation accuracy for low-frequency label candidates. Figure 10, by contrast, shows the results of evaluating the estimation accuracy for both low-frequency and non-low-frequency label candidates. A low-frequency label candidate is a label candidate whose rank by number of occurrences in the dataset is below the middle of the ranking of all label candidates.
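The selection of low-frequency label candidates can be sketched in Python as follows. The handling of ties and of an odd number of label candidates is not specified in the text; the sketch assumes that ties keep the original candidate order and that, for an odd count, the middle-ranked candidate is not treated as low-frequency.

```python
from collections import Counter
from typing import Iterable, List, Set


def low_frequency_labels(label_sets: Iterable[List[str]], candidates: List[str]) -> Set[str]:
    """Return the label candidates ranked below the middle by occurrence count."""
    counts = Counter(label for labels in label_sets for label in labels)
    # Rank candidates from most to least frequent (stable for ties).
    ranked = sorted(candidates, key=lambda c: counts[c], reverse=True)
    # Everything below the middle of the ranking counts as low-frequency.
    return set(ranked[(len(ranked) + 1) // 2:])


# Hypothetical dataset: each inner list holds the labels of one document.
docs = [["a", "b"], ["a"], ["a", "b", "c"], ["a", "b", "c", "d"]]
print(low_frequency_labels(docs, ["a", "b", "c", "d"]))
```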

In Figure 11, the rows whose "Method" column reads "BERT w/ALS", "LSTM w/ALS", or "CNN w/ALS" show results obtained with the label estimation system 100. The rows whose "Method" column reads "BERT only", "LSTM only", or "CNN only" show estimation results of a trained label estimation model obtained using the correct/incorrect information without the label appropriateness information.

The results in Figure 11 were obtained by running the experiment five times with different random seeds. The numbers in parentheses in Figure 11 are standard deviations. The results in Figure 11 show that the label estimation system 100 can also estimate low-frequency label candidates with high accuracy, regardless of which machine learning model, such as CNN or Bi-LSTM, is used.

Figure 12 is a third diagram showing an example of experimental results obtained with the label estimation system 100 of the embodiment. The horizontal axis of Figure 12 indicates the number of training rounds, and the vertical axis indicates the Micro-f1 value. "CNN only(train)" shows the estimation results on the training data of a trained label estimation model obtained using the correct/incorrect information without the label appropriateness information. "CNN only(valid)" shows the estimation results of the same kind of model on the development data. "CNN with ALS(train)" shows the estimation results on the training data obtained with the label estimation system 100. "CNN with ALS(valid)" shows the estimation results on the development data obtained with the label estimation system 100. Note that the development data is data used during training in the experiment to measure the estimation accuracy of the label estimation system 100 after each training round.

The results in Figure 12 show that training with the label estimation system 100 suppresses overfitting in the early stage of training better than training that uses the correct/incorrect information without the label appropriateness information.

The learning device 1 of the embodiment configured in this manner obtains the trained label estimation model using the label appropriateness information. It can therefore obtain a mathematical model that estimates, with high accuracy, more of the labels that should be assigned than a device that obtains a trained label estimation model based only on the correct/incorrect information rather than the label appropriateness information. As a result, the learning device 1 can suppress the increase in effort required for the task of estimating labels.

Furthermore, the label estimation device 2 of the embodiment configured in this manner estimates the labels to be assigned to a text using the trained label estimation model obtained with the label appropriateness information. Therefore, compared with a device that uses a trained label estimation model obtained based only on the correct/incorrect information rather than the label appropriateness information, it can estimate more of the labels that should be assigned, with high accuracy. As a result, the label estimation device 2 can suppress the increase in effort required for the task of estimating labels.

(Modification)
As described above, the label appropriateness information may be generated manually. In that case, the label appropriateness information is input to the input unit 12 or the communication unit 13 in place of the correct/incorrect information and the label co-occurrence information. The label appropriateness information acquisition unit 110 then acquires the label appropriateness information input to the input unit 12 or the communication unit 13, instead of acquiring it based on the correct/incorrect information and the label co-occurrence information.

The output control unit 250 may cause the output unit 25 to output, as a genre, a predetermined label from the label appropriateness information estimated by the estimation unit 220.

As described above, the label appropriateness indicated by the label appropriateness information is expressed, as one example, by a value between 0 and 1 inclusive. However, the label appropriateness does not necessarily have to be expressed by such a value, and the elements of the N-dimensional vector representing the label appropriateness information may include negative values. The label appropriateness information generation process described above used, as one example, a function such as the sigmoid function that restricts its output to the range from 0 to 1. That example corresponds to the case where the label appropriateness is expressed by a value between 0 and 1. Accordingly, when the label appropriateness need not lie between 0 and 1, there is no need to use a function such as the sigmoid function that restricts its output to that range.

The control unit 21 may further include either or both of a sentence similarity estimation unit 260 and a key phrase extraction unit 270. Hereinafter, a control unit 21 that includes both the sentence similarity estimation unit 260 and the key phrase extraction unit 270 is referred to as the control unit 21a. Figure 13 is a diagram showing an example of the configuration of the control unit 21a in the modification. The control unit 21a differs from the control unit 21 in that it includes the sentence similarity estimation unit 260 and the key phrase extraction unit 270.

The sentence similarity estimation unit 260 estimates the degree of similarity between two pieces of text information (hereinafter, the "sentence similarity"). At least one of the two pieces of text information is text information acquired by the object acquisition unit 210. Accordingly, both pieces may be text information acquired by the object acquisition unit 210, or one may be text information acquired by the object acquisition unit 210 and the other text information stored in advance in the storage unit 24. The sentence similarity estimation unit 260 inputs each of the two pieces of text information to the estimation unit 220 and causes the estimation unit 220 to estimate the label appropriateness information for each. Based on the two pieces of label appropriateness information estimated by the estimation unit 220, the sentence similarity estimation unit 260 acquires the degree of agreement between them as the sentence similarity of the two pieces of text information. For example, the sentence similarity estimation unit 260 acquires, as the sentence similarity, the inner product of the vectors corresponding to the two pieces of label appropriateness information.

The sentence similarity estimation unit 260 may determine that the two pieces of text information are similar when the acquired sentence similarity is equal to or greater than a predetermined degree. In that case, the output control unit 250 may cause the output unit 25 to output one or both of the two pieces of text information determined to be similar by the sentence similarity estimation unit 260.
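The inner-product similarity and the threshold test can be sketched in Python as follows. The vectors and the threshold value are hypothetical; the text only states that the inner product is one example of the similarity and that the threshold is a predetermined degree.

```python
from typing import List


def sentence_similarity(a: List[float], b: List[float]) -> float:
    """Inner product of two label appropriateness vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_similar(a: List[float], b: List[float], threshold: float) -> bool:
    """Two texts are judged similar when the similarity reaches the threshold."""
    return sentence_similarity(a, b) >= threshold


# Hypothetical label appropriateness vectors estimated for three texts.
text_a = [0.9, 0.1, 0.8]
text_b = [0.8, 0.2, 0.7]
text_c = [0.1, 0.9, 0.0]

print(is_similar(text_a, text_b, threshold=1.0))
print(is_similar(text_a, text_c, threshold=1.0))
```

Because the inner product is not length-normalized, texts with many high appropriateness values score higher overall; a normalized measure such as cosine similarity would be an alternative, though it is not the one named in the text.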

The key phrase extraction unit 270 acquires key phrases in the text indicated by the text information, using GiNZA, an open-source library for Japanese natural language processing. The output control unit 250 may cause the output unit 25 to output the key phrases acquired by the key phrase extraction unit 270.

Note that the learning device 1 may be implemented using a plurality of information processing devices communicably connected via a network. In that case, the functional units of the learning device 1 may be distributed across the plurality of information processing devices.

Likewise, the label estimation device 2 may be implemented using a plurality of information processing devices communicably connected via a network. In that case, the functional units of the label estimation device 2 may be distributed across the plurality of information processing devices.

All or part of the functions of the learning device 1 and the label estimation device 2 may be implemented using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The program may be recorded on a computer-readable recording medium. Examples of computer-readable recording media include portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and storage devices such as hard disks built into computer systems. The program may also be transmitted via a telecommunication line.

Note that the text information acquired by the target acquisition unit 210 is an example of target information. The text indicated by the text information acquired by the target acquisition unit 210 is an example of a processing target.

An embodiment of the present invention has been described above in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and designs and the like that do not depart from the gist of the present invention are also included.

100...Label estimation system, 1...Learning device, 2...Label estimation device, 11...Control unit, 12...Input unit, 13...Communication unit, 14...Storage unit, 15...Output unit, 110...Label appropriateness information acquisition unit, 120...Model learning unit, 130...Storage control unit, 140...Communication control unit, 150...Output control unit, 21...Control unit, 22...Input unit, 23...Communication unit, 24...Storage unit, 25...Output unit, 210...Target acquisition unit, 220...Estimation unit, 230...Storage control unit, 240...Communication control unit, 250...Output control unit, 91...Processor, 92...Memory, 93...Processor, 94...Memory, 21a...Control unit, 260...Text similarity estimation unit, 270...Key phrase extraction unit

Claims

A learning device comprising:
a model learning unit that updates, by a machine learning method, a label estimation model, which is a mathematical model that estimates a label to be assigned to a sentence indicated by input text information, using model learning data that includes text information indicating a sentence and label appropriateness information indicating the degree to which each of a plurality of labels predetermined as candidates for labels to be assigned to the sentence is appropriate as a label for the sentence,
wherein the label appropriateness information is information obtained based on true/false information indicating a label that satisfies a predetermined condition regarding the probability of being assigned to the sentence indicated by the model learning data, and on label co-occurrence information, which is information indicating the probability of co-occurrence between any two of the plurality of labels predetermined as candidates for labels to be assigned to the sentence.

The learning device according to claim 1,
wherein the predetermined condition is a condition that the probability of being assigned to the sentence indicated by the model learning data is the highest.

A label estimation device comprising:
a target acquisition unit that acquires target information indicating a sentence to be processed; and
an estimation unit that estimates a label to be assigned to the sentence indicated by the target information acquired by the target acquisition unit, using a trained label estimation model obtained by a learning device,
wherein the learning device comprises a model learning unit that updates, by a machine learning method, a label estimation model, which is a mathematical model that estimates a label to be assigned to a sentence indicated by input text information, using model learning data that includes text information indicating a sentence and label appropriateness information indicating the degree to which each of a plurality of labels predetermined as candidates for labels to be assigned to the sentence is appropriate as a label for the sentence, and
wherein the label appropriateness information is information obtained based on true/false information indicating a label that satisfies a predetermined condition regarding the probability of being assigned to the sentence indicated by the model learning data, and on label co-occurrence information, which is information indicating the probability of co-occurrence between any two of the plurality of labels predetermined as candidates for labels to be assigned to the sentence.

A program for causing a computer to function as the learning device according to claim 1 or 2.

A program for causing a computer to function as the label estimation device according to claim 3.