JP7720766B2 - Machine learning device, natural language processing device, and program

Info

Publication number: JP7720766B2
Application number: JP2021174466A
Authority: JP
Inventors: 健小早川; 礼子齋藤
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2021-10-26
Filing date: 2021-10-26
Publication date: 2025-08-08
Anticipated expiration: 2041-10-26
Also published as: JP2023064283A

Description

The present invention relates to a machine learning device, a natural language processing device, and a program.

Automatically analyzing the large volume of text posted to social networking services (SNS) and similar platforms is an effective way to gauge trends in the opinions of individuals or of society as a whole. Manually analyzing tens of millions to hundreds of millions of texts, or more, is unrealistic, so the ability to perform accurate automatic analysis is strongly desired.

Conventional techniques automatically label each phrase or word contained in a text. For example, attempts have been made to use deep learning models to extract the opinion target section and the opinion section from a text.

Non-Patent Document 1 describes BERT (Bidirectional Encoder Representations from Transformers), a mechanism for analyzing text written in natural language using a deep learning model.

Non-Patent Document 2 describes "sequence labeling", a technique that uses machine learning to assign labels to a sequence of the components of a text.

Non-Patent Document 1: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Proceedings of NAACL-HLT 2019, pages 4171-4186, Association for Computational Linguistics, 2019.
Non-Patent Document 2: Takatomo Ishikawa, Machine Learning for Natural Language Processing, "5. Sequence Labeling", [online], July 22, 2016, Internet <URL: https://www.slideshare.net/Takatymo/ss-64274683>.

However, when a deep learning analysis model is used to label the components of a text, conventional techniques cannot always correctly segment the text into the phrases to be labeled. That is, errors can arise in phrase segmentation, which is the step that precedes label assignment. In the task of extracting opinion target sections and opinion sections from a text, this means that the boundaries (phrase breaks) of those sections can be wrong. When a section is extracted incorrectly, the label assignment can be wrong as well: part of an expression that belongs in a section may be dropped, or an expression that should lie outside the section may be extracted as part of it. Such labeling errors also degrade the accuracy of analysis in downstream processing, for example when opinion trends on an SNS are aggregated statistically.

The present invention was made in recognition of the above problem. It aims to provide a machine learning device, a natural language processing device, and a program that make it possible to extract more accurately the position of a subsequence to be analyzed within a word string (specific examples of such a subsequence are the opinion target section and the opinion section described above).

[1] To solve the above problem, a machine learning device according to one aspect of the present invention includes: a word embedding unit that receives a word string and outputs a word embedding sequence corresponding to the word string; a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence; a phrase break position prediction unit that receives the word embedding sequence output from the word embedding unit and outputs phrase break position information corresponding to the word embedding sequence; and a learning data supply unit that supplies a word string to be input to the word embedding unit, supplies a correct label sequence corresponding to the label sequence output by the sequence labeling unit, and further supplies correct phrase break position information corresponding to the phrase break position information output by the phrase break position prediction unit, wherein the sequence labeling unit adjusts the parameters of its internal model by backpropagation based on the error between the label sequence and the correct label sequence, the phrase break position prediction unit adjusts the parameters of its internal model by backpropagation based on the error between the phrase break position information and the correct phrase break position information, and the word embedding unit adjusts the parameters of its internal model by the backpropagation of error from the sequence labeling unit and the backpropagation of error from the phrase break position prediction unit.

[2] In another aspect of the present invention, in the machine learning device described above, the phrase break position prediction unit outputs the phrase break position information corresponding to the input word embedding sequence by using a fully connected regression model based on all of the word embeddings contained in the input word embedding sequence.

[3] In another aspect of the present invention, in the machine learning device described above, the sequence labeling unit outputs at least a label indicating that a predetermined subsequence contained in the original input word string is an opinion target, and a label indicating that the predetermined subsequence is an opinion.

[4] In another aspect of the present invention, in the machine learning device described above, the phrase break position prediction unit outputs, as the phrase break position information, numerical values representing the start position and the end position of the subsequence, and the correct phrase break position information supplied by the learning data supply unit is numerical information representing the correct start position and end position of the subsequence.

[5] Another aspect of the present invention is a natural language processing device that includes: a word embedding unit that receives a word string and outputs a word embedding sequence corresponding to the word string; and a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence, wherein at least the model internal to the word embedding unit has been trained by the machine learning device described in any one of claims 1 to 4.

[6] Another aspect of the present invention is a program for causing a computer to function as a machine learning device that includes: a word embedding unit that receives a word string and outputs a word embedding sequence corresponding to the word string; a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence; a phrase break position prediction unit that receives the word embedding sequence output from the word embedding unit and outputs phrase break position information corresponding to the word embedding sequence; and a learning data supply unit that supplies a word string to be input to the word embedding unit, supplies a correct label sequence corresponding to the label sequence output by the sequence labeling unit, and further supplies correct phrase break position information corresponding to the phrase break position information output by the phrase break position prediction unit, wherein the sequence labeling unit adjusts the parameters of its internal model by backpropagation based on the error between the label sequence and the correct label sequence, the phrase break position prediction unit adjusts the parameters of its internal model by backpropagation based on the error between the phrase break position information and the correct phrase break position information, and the word embedding unit adjusts the parameters of its internal model by the backpropagation of error from the sequence labeling unit and the backpropagation of error from the phrase break position prediction unit.

According to the present invention, the accuracy of assigning a label sequence (tag sequence) to a word sequence can be improved.

FIG. 1 is a block diagram showing the schematic functional configuration of an opinion analysis device according to an embodiment of the present invention.
FIG. 2 is a diagram showing in more detail the configuration of the model for predicting phrase break positions in the embodiment.
FIG. 3 is a schematic diagram showing an example of text input to the opinion analysis device according to the embodiment and an example of the labeling performed on that input text.
FIG. 4 is a schematic diagram showing the word sequence corresponding to the input text example (FIG. 3) in the embodiment and an example of the correct tag sequence corresponding to that word sequence.
FIG. 5 is a schematic diagram showing the word sequence corresponding to the input text example (FIG. 3) in the embodiment and another example of the correct tag sequence corresponding to that word sequence.
FIG. 6 is a schematic diagram showing an example of the correct answer data used to train the phrase break position prediction model in the embodiment.
FIG. 7 is a flowchart showing the processing procedure by which the opinion analysis device according to the embodiment trains its models.
FIG. 8 is a flowchart showing the procedure by which the opinion analysis device according to the embodiment performs sequence labeling using the trained models.
FIG. 9 is a block diagram showing an example of the internal configuration of the opinion analysis device according to the embodiment.

Next, an embodiment of the present invention will be described with reference to the drawings. The device according to this embodiment is an opinion analysis device for analyzing the opinions contained in input text. Specifically, the opinion analysis device assigns a tag (label) to each element (phrase) of the input text to distinguish whether it is an opinion target section, an opinion section, or a section that is neither.

The opinion analysis device of this embodiment is characterized by including a regression model for predicting the points (phrase breaks) at which the sections to be extracted are divided. This regression model, which is specific to this embodiment, is referred to below as the "phrase break position prediction model". The phrase break position prediction model shares the distributed representations of the lower layer with the model that performs sequence labeling (the sequence labeling model). Because the opinion analysis device performs machine learning on the models it contains, it is also called a "machine learning device". Because it also uses the trained models to produce a label sequence for a given unseen word string, it is also called a "natural language processing device".

In this embodiment, the phrase break position prediction model is trained with a loss function that expresses, as a cross entropy, the error between the phrase break positions predicted by the model and the correct phrase break positions.

FIG. 1 is a block diagram showing the schematic functional configuration of the opinion analysis device according to this embodiment. As illustrated, the opinion analysis device 1 includes a word embedding unit 11, a sequence labeling unit 12, a sequence labeling loss function calculation unit 17, a phrase break position prediction unit 22, a phrase break position loss function calculation unit 27, and a learning data supply unit 30. Each of these functional units can be implemented, for example, by a computer and a program. Each functional unit has storage means as needed. The storage means are, for example, program variables or memory allocated when the program runs. Non-volatile storage means such as a magnetic hard disk drive or a solid-state drive (SSD) may also be used as needed. At least some of the functions of each functional unit may be implemented as dedicated electronic circuits rather than as programs. The function of each unit is described below.

The word embedding unit 11 receives a word string and outputs a word embedding sequence corresponding to the word string. The word embedding unit 11 contains a model that can be trained by machine learning; this model is also called the "word embedding layer". The model is implemented, for example, using a neural network. In this embodiment, BERT is used as the deep learning model of the opinion target extraction device. BERT itself is an existing technology and is described in, for example, Non-Patent Document 1 above. The BERT used by the word embedding unit 11 may be a general-purpose model pre-trained on Japanese. In this embodiment, the BERT held by the word embedding unit 11 can be trained further. An input sentence is split into words by standard morphological analysis, so the input sentence is equivalent to a word string. The word string corresponding to the input sentence is input to the word embedding unit 11 and converted into a sequence of word embeddings.

The sequence labeling unit 12 receives the word embedding sequence output from the word embedding unit 11 and outputs a label sequence corresponding to that word embedding sequence. That is, the sequence labeling unit 12 assigns a label to each word. Specific examples of labels are described later. The sequence labeling unit 12 contains a model that can be trained by machine learning; this model is called the "sequence labeling model".

In this embodiment, the sequence labeling unit 12 outputs at least a label indicating that a predetermined subsequence contained in the original input word string is an opinion target, and a label indicating that the predetermined subsequence is an opinion. More specific examples of the labels assigned by the sequence labeling unit 12 are described later.

The sequence labeling loss function calculation unit 17 calculates the error between the label sequence output by the sequence labeling unit 12 and the corresponding correct label sequence. This error is the basis of the error backpropagation that the sequence labeling unit 12 performs to adjust its parameters.

The phrase break position prediction unit 22 receives the word embedding sequence output from the word embedding unit 11 and outputs phrase break position information corresponding to that word embedding sequence. The phrase break position prediction unit 22 contains a model that can be trained by machine learning; this model is called the "phrase break position prediction model".

The phrase break position prediction unit 22 may output the phrase break position information corresponding to the input word embedding sequence by using a fully connected regression model based on all of the word embeddings contained in that sequence.

The phrase break position loss function calculation unit 27 calculates the error between the phrase break position information output by the phrase break position prediction unit 22 and the corresponding correct phrase break position information. This error is the basis of the error backpropagation that the phrase break position prediction unit 22 performs to adjust its parameters.

When its model is being trained, the word embedding unit 11 adjusts the parameters of its internal model based on both the error backpropagated from the sequence labeling unit 12 and the error backpropagated from the phrase break position prediction unit 22.
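The way both error signals reach the shared lower layer can be illustrated with a toy model. The sketch below is not the embodiment's BERT-based network. It assumes a single scalar shared weight `w` (standing in for the word embedding unit 11), two scalar heads `u` and `v` (standing in for the sequence labeling unit 12 and the phrase break position prediction unit 22), and squared-error losses in place of the losses used in the embodiment. All function and variable names are hypothetical.

```python
# Toy multi-task model: one shared weight, two task heads (all names hypothetical).

def forward(w, u, v, x):
    """Return the shared representation and the outputs of the two heads."""
    h = w * x                  # shared lower layer
    return h, u * h, v * h     # (h, labeling-head output, span-head output)


def losses(w, u, v, x, t_seq, t_span):
    """Squared-error stand-ins for the two task losses."""
    _, y_seq, y_span = forward(w, u, v, x)
    return (y_seq - t_seq) ** 2, (y_span - t_span) ** 2


def gradients(w, u, v, x, t_seq, t_span):
    """Gradients of (loss_seq + loss_span) with respect to w, u and v."""
    h, y_seq, y_span = forward(w, u, v, x)
    d_seq = 2.0 * (y_seq - t_seq)      # d loss_seq / d y_seq
    d_span = 2.0 * (y_span - t_span)   # d loss_span / d y_span
    grad_u = d_seq * h                 # each head sees only its own error
    grad_v = d_span * h
    # The shared weight receives the sum of both backpropagated errors.
    grad_w = d_seq * u * x + d_span * v * x
    return grad_w, grad_u, grad_v


def train_step(params, x, t_seq, t_span, lr=0.01):
    """One plain gradient-descent update of all three parameters."""
    w, u, v = params
    grad_w, grad_u, grad_v = gradients(w, u, v, x, t_seq, t_span)
    return w - lr * grad_w, u - lr * grad_u, v - lr * grad_v
```

In this toy setting each head's gradient depends only on its own loss, whereas `grad_w` is the sum of two terms, one per head, which mirrors the statement that the word embedding unit is adjusted by both backpropagated errors.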

The learning data supply unit 30 supplies the learning data used to train the models contained in the opinion analysis device 1. Specifically, the learning data supply unit 30 supplies word strings to be input to the word embedding unit 11. For each such word string, it also supplies the correct label sequence, that is, the correct answer for the label sequence output by the sequence labeling unit 12. In addition, for each such word string, it supplies the correct phrase break position information, that is, the correct answer for the phrase break position information output by the phrase break position prediction unit 22.

The phrase break position prediction unit 22 outputs, as phrase break position information, numerical values representing the start position and the end position of a subsequence of the word string corresponding to the input sentence. There may be one subsequence or several. The correct phrase break position information supplied by the learning data supply unit 30 is numerical information representing the correct start and end positions of those subsequences. A start or end position is a number indicating which word in the sequence it is (first, second, and so on). The correct start and end positions of a subsequence are, as a rule, integers. The start and end positions predicted and output by the phrase break position prediction unit 22 are not necessarily integers, and in practice they are usually non-integers. These non-integer values are interpreted as predicted approximations of the start position and the end position.

FIG. 2 is a diagram showing in more detail the configuration of the model for predicting phrase break positions. As illustrated, the model includes a fully connected regression layer 220. The fully connected regression layer 220 is the model contained in the phrase break position prediction unit 22 shown in FIG. 1. It includes two nodes (y_1 and y_2 in the figure), each connected to all of the outputs of the word embedding layer 110 (the BERT embedding layer). The word embedding layer 110 is the model contained in the word embedding unit 11 shown in FIG. 1. The nodes of the word embedding layer 110 (x_1, x_2, ..., x_D in the figure) correspond to the distributed representations of the words in the input text. y_1 and y_2 are numerical values representing the start position (start_position) and the end position (end_position), respectively, of the phrase break of a section. Each value indicates which word in the text the position corresponds to. That is, the model for predicting phrase break positions predicts the start position and the end position of each of one or more sections. In other words, the fully connected regression layer 220 is a two-dimensional regression model for predicting these two values.

That is, the word embedding layer 110 outputs the D-dimensional continuous values x_1, x_2, ..., x_D. The fully connected regression layer 220 outputs y_1 and y_2, which are computed from x_1, x_2, ..., x_D. y_1 is the start position and y_2 is the end position, and they are given by equation (1) below.

$$ y_j = f\left( \sum_{i=1}^{D} a_{ij} x_i + b_j \right), \quad j = 1, 2 \qquad (1) $$

In equation (1), f is an activation function, and a_ij and b_j are internal parameters of the model of the fully connected regression layer 220. The values of these internal parameters are adjusted by training. The index j is either 1 or 2.

That is, the start position y_1 and the end position y_2 are each computed by applying the activation function to the weighted sum of the output values x_1, x_2, ..., x_D of the word embedding layer 110, with the parameter value b_j added to that sum.
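The weighted-sum-plus-bias computation described above can be sketched in a few lines of Python. The function name `span_regression_head`, the argument layout, and the default identity activation are assumptions made for illustration; the text does not say which activation function f is used, nor how the D inputs are laid out in memory.

```python
def span_regression_head(x, a, b, f=lambda z: z):
    """Compute (y_1, y_2) from D input values.

    x: list of D floats (outputs of the word embedding layer).
    a: D x 2 nested list, a[i][j] is the weight from x[i] to output j.
    b: list of 2 biases.
    f: activation function (identity by default, which is an assumption).
    """
    if len(a) != len(x):
        raise ValueError("a must have one row per input value")
    outputs = []
    for j in range(2):
        # Weighted sum over all D inputs, plus the bias for output j.
        z = sum(a[i][j] * x[i] for i in range(len(x))) + b[j]
        outputs.append(f(z))
    return outputs[0], outputs[1]


# Example with D = 3 and hand-picked (not learned) parameters.
x = [1.0, 2.0, 3.0]
a = [[0.5, 1.0], [0.0, 1.0], [0.5, 1.0]]
b = [-1.0, 1.0]
start_position, end_position = span_regression_head(x, a, b)
```

With these hand-picked values the two outputs are 1.0 and 7.0. Because both outputs read every one of the D inputs, the layer is fully connected, and the error on either output propagates back to all inputs.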

The outputs y_1 and y_2 of the fully connected regression layer 220 are compared with the sections (correct answers) contained in the learning data. Specifically, the opinion analysis device 1 trains the internal parameters of the model using a criterion that minimizes the cross entropy between the outputs of the fully connected regression layer 220 and the label sections that appear in the learning data. When this criterion is computed, the label sections that appear in the learning data are not narrowed down to any single one: all label sections are included in the criterion, except where pruning is applied for computational efficiency.

When the model is trained, the loss function L_span(S) expressed by equation (2) below is used.

In equation (2), SpanSet(S) is the set of annotation sections that appear in sentence S, the input text to the opinion analysis device 1. For a section l belonging to SpanSet(S), the start position is Begin(l) and the end position is End(l) in equation (2). In other words, one section (span) is defined by the value of its start position and the value of its end position. CrossEntropy(y, x) in equation (2) is the cross entropy between y, one of the outputs of the fully connected regression layer 220 (that is, y_1 or y_2), and the correct start or end position given by the learning data.

That is, L_span(S) is based on the total, over all annotation sections, of the sum of two terms for each section: the cross entropy between y_1 and the correct start position, and the cross entropy between y_2 and the correct end position.
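One form of the span loss that is consistent with this verbal description is written out below. It is a reconstruction from the surrounding definitions (SpanSet(S), Begin(l), End(l), CrossEntropy, y_1, y_2), not a copy of the published formula: the description says only that the loss is "based on" this total, so any normalization or weighting factor in the original equation (2) is unknown and is omitted here.

```latex
L_{\mathrm{span}}(S) \;=\; \sum_{l \,\in\, \mathrm{SpanSet}(S)}
  \Bigl[\, \mathrm{CrossEntropy}\bigl(y_1,\ \mathrm{Begin}(l)\bigr)
        \;+\; \mathrm{CrossEntropy}\bigl(y_2,\ \mathrm{End}(l)\bigr) \Bigr]
```

Summing over every section in SpanSet(S) reflects the stated design choice of keeping all label sections in the criterion rather than selecting a single one.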

The cross entropy CrossEntropy(y, x) is calculated by equation (3) below.

In equation (3), k is an index representing a position within sentence S, max_seq_length is a value corresponding to the sequence length of sentence S, and exp is the exponential function.

In addition to the loss function L_span(S) for the phrase break position prediction model, the opinion analysis device 1 uses a loss function L_seq(S) for the model held by the sequence labeling unit 12 (FIG. 1). L_seq(S) is the loss function used for sequence labeling problems in the prior art. The overall loss function L_total(S) of the model when one sentence S is input to the opinion analysis device 1 is calculated as in equation (4) below.

$$ L_{\mathrm{total}}(S) = L_{\mathrm{span}}(S) + L_{\mathrm{seq}}(S) \qquad (4) $$

That is, the overall loss function L_total(S) is calculated as the sum of the loss function L_span(S) for the phrase break position prediction model and the prior-art loss function L_seq(S) for the model held by the sequence labeling unit 12.

Of the model held by the sequence labeling unit 12 and the model held by the phrase break position prediction unit 22, the latter is used only during training and is not used at inference time. As training proceeds so that the value of the overall loss function (equation (4) above) decreases, the distributed representations of the lower layer (the word embedding layer 110 of the word embedding unit 11) are learned more accurately. This improves the accuracy of the labeling performed by the model of the sequence labeling unit 12, that is, the task of identifying opinion target sections and opinion sections.

図３は、意見分析装置１に入力されるテキストの実例と、その入力テキストに対して行われるラベル付けの例とを示す概略図である。言い換えれば、図３は、入力されるテキスト中から意見対象区間や意見区間を抽出する処理の実例を示す概略図である。 Figure 3 is a schematic diagram showing an example of text input to the opinion analysis device 1 and an example of labeling performed on the input text. In other words, Figure 3 is a schematic diagram showing an example of the process of extracting opinion target sections and opinion sections from the input text.

図示する入力テキスト例は、あるＳＮＳに投稿された単文である。このテキストは、「暴力はいけない、いけないよ＃アニメ」というものである。なお、このテキスト中の「＃アニメ」は、特定の話題を検索しやすくするためのタグ（「＃」を用いるため、ハッシュタグと呼ばれる）の記法にしたがった表現である。この入力テキストは、「暴力／は／いけない／、／いけない／よ／＃／アニメ」という８個の単語に分割される（スラッシュで単語の区切りを表している）。ここでは、句読点や、「＃」などという記号も、便宜的に単語の一つとして扱う。このような入力テキストから抽出される意見対象区間は、「暴力」（開始位置が１で、終了位置が１）と、「＃アニメ」（開始位置が７で、終了位置が８）である。また、抽出される意見区間は、「いけない」（開始位置が３で、終了位置が３）と、「いけない」（開始位置が５で、終了位置が５）である。 The example input text shown in the figure is a single sentence posted to an SNS. The text is "Violence is bad, it's bad #anime." The "#anime" in this text follows the notation for tags that make a specific topic easier to search for (called hashtags because they use "#"). The input text is divided into eight words, "violence/is/bad/,/bad/yo/#/anime" (the slashes mark the word boundaries). Here, punctuation marks and symbols such as "#" are treated as words for convenience. The opinion target sections extracted from this input text are "violence" (start position 1, end position 1) and "#anime" (start position 7, end position 8). The opinion sections extracted are the first "bad" (start position 3, end position 3) and the second "bad" (start position 5, end position 5).

図４は、図３に示した入力テキスト例に対応する単語系列と、その系列に対応するタグ系列（ラベル系列）の正解の例を示す概略図である。例示する単語系列は、「暴力／は／いけない／、／いけない／よ／＃／アニメ」というものである。この図では、単語の順序にしたがって番号（位置の番号）を付与している。このような単語系列に対応する正解のタグ系列は、「ＴＡＲＧＥＴ／Ｏ／ＯＰＩＮＩＯＮ／Ｏ／ＯＰＩＮＩＯＮ／Ｏ／ＴＡＲＧＥＴ／ＴＡＲＧＥＴ」である。「ＴＡＲＧＥＴ」というタグは、対応する単語が意見対象区間に属するものであることを表すタグである。「ＯＰＩＮＩＯＮ」というタグは、対応する単語が意見区間に属するものであることを表すタグである。「Ｏ」というタグは、対応する単語が意見対象区間にも意見区間にも属さないものであることを表すタグである。 Figure 4 is a schematic diagram showing the word sequence corresponding to the example input text of Figure 3 and an example of the correct tag sequence (label sequence) for that word sequence. The example word sequence is "violence/is/bad/,/bad/yo/#/anime." In this diagram, the words are numbered (position numbers) in order. The correct tag sequence for this word sequence is "TARGET/O/OPINION/O/OPINION/O/TARGET/TARGET." The tag "TARGET" indicates that the corresponding word belongs to an opinion target section. The tag "OPINION" indicates that the corresponding word belongs to an opinion section. The tag "O" indicates that the corresponding word belongs to neither an opinion target section nor an opinion section.

つまり、この正解のタグ系列は、単語系列内の「暴力」という表現（区間の開始位置が１で終了位置が１）と、「＃アニメ」という表現（区間の開始位置が７で終了位置が８）のそれぞれが意見対象区間であることを表す。また、単語系列内の「いけない」（区間の開始位置が３で終了位置が３）と、「いけない」（区間の開始位置が５で終了位置が５）のそれぞれが意見区間であることを表す。 In other words, this correct tag sequence indicates that the expression "violence" (section starting at 1 and ending at 1) and the expression "#anime" (section starting at 7 and ending at 8) in the word sequence are each an opinion target section. It also indicates that the first "bad" (section starting at 3 and ending at 3) and the second "bad" (section starting at 5 and ending at 5) in the word sequence are each an opinion section.
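
The reading of the Figure 4 tag sequence described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `tags_to_spans` and the tuple layout `(start, end, tag)` are assumptions made for this example, not part of the embodiment. Positions are 1-based and inclusive, as in the figures.

```python
from itertools import groupby


def tags_to_spans(tags, outside="O"):
    """Group runs of identical tags into (start, end, tag) sections.

    Positions are 1-based and inclusive. Runs of the `outside` tag are skipped.
    """
    spans = []
    position = 1
    for tag, run in groupby(tags):
        length = len(list(run))
        if tag != outside:
            spans.append((position, position + length - 1, tag))
        position += length
    return spans


# Word sequence and correct tag sequence from Figure 4.
words = ["暴力", "は", "いけない", "、", "いけない", "よ", "＃", "アニメ"]
tags = ["TARGET", "O", "OPINION", "O", "OPINION", "O", "TARGET", "TARGET"]

spans = tags_to_spans(tags)
for start, end, tag in spans:
    print(start, end, tag, "".join(words[start - 1:end]))
```

With tags of this form, two adjacent sections that carry the same tag cannot be told apart from one longer section, because the run is simply merged.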

図５は、図３に示した入力テキスト例に対応する単語系列と、その系列に対応するタグ系列の正解の別の例を示す概略図である。図４におけるタグとの違いとして、図５における例では、「ＴＡＲＧＥＴ」および「ＯＰＩＮＩＯＮ」のそれぞれのタグには、「Ｂ－」あるいは「Ｉ－」という接頭辞が付加される。このようなタグは、ＢＩＯ形式タグと呼ばれる。「Ｂ－」は、区間の始まりであることを表す。また、「Ｉ－」は、区間の途中であることを表す。なお、区間の終わりの単語に与えられるタグも「Ｉ－」で始まるものである。図５に示す例では、「暴力／は／いけない／、／いけない／よ／＃／アニメ」という単語系列に対応する正解のタグ系列は、「Ｂ－ＴＡＲＧＥＴ／Ｏ／Ｂ－ＯＰＩＮＩＯＮ／Ｏ／Ｂ－ＯＰＩＮＩＯＮ／Ｏ／Ｂ－ＴＡＲＧＥＴ／Ｉ－ＴＡＲＧＥＴ」である。 Figure 5 is a schematic diagram showing the word sequence corresponding to the example input text of Figure 3 and another example of the correct tag sequence for it. Unlike the tags in Figure 4, each "TARGET" and "OPINION" tag in Figure 5 carries the prefix "B-" or "I-". Such tags are called BIO-format tags. "B-" indicates the beginning of a section, and "I-" indicates the inside of a section. The tag given to the last word of a section also begins with "I-". In the example shown in Figure 5, the correct tag sequence for the word sequence "violence/is/bad/,/bad/yo/#/anime" is "B-TARGET/O/B-OPINION/O/B-OPINION/O/B-TARGET/I-TARGET."

つまり、この正解のタグ系列は、単語系列内の「暴力」という表現（区間の開始位置が１で終了位置が１）と、「＃アニメ」という表現（区間の開始位置が７で終了位置が８）のそれぞれが意見対象区間であることを表す。また、単語系列内の「いけない」（区間の開始位置が３で終了位置が３）と、「いけない」（区間の開始位置が５で終了位置が５）のそれぞれが意見区間であることを表す。ＢＩＯ形式のタグでは、「Ｂ－」あるいは「Ｉ－」を付けることによって、例えば図５の第７番目の単語と第８番目の単語とが同一の区間に属するものであることを明示的に表している。 In other words, this correct tag sequence indicates that the expression "violence" (section starting at 1 and ending at 1) and the expression "#anime" (section starting at 7 and ending at 8) in the word sequence are each an opinion target section. It also indicates that the first "bad" (section starting at 3 and ending at 3) and the second "bad" (section starting at 5 and ending at 5) in the word sequence are each an opinion section. In BIO-format tags, the "B-" and "I-" prefixes make explicit that, for example, the seventh and eighth words in Figure 5 belong to the same section.
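
The way BIO-format tags delimit sections can be sketched as a small decoder. This is a minimal sketch, not the embodiment's implementation: the function name `bio_to_spans` is hypothetical, and the handling of a stray "I-" tag (treated here as opening a new section) is an assumed recovery rule that the text does not specify.

```python
def bio_to_spans(tags):
    """Decode BIO-format tags into (start, end, label) sections.

    Positions are 1-based and inclusive. An "I-" tag that does not continue
    an open section of the same label opens a new one (assumed rule).
    """
    spans = []
    start, label = None, None
    for position, tag in enumerate(tags, start=1):
        prefix, _, name = tag.partition("-")
        continues = prefix == "I" and label == name
        if continues:
            continue
        # The open section (if any) ends just before this position.
        if label is not None:
            spans.append((start, position - 1, label))
        if prefix in ("B", "I"):
            start, label = position, name
        else:
            start, label = None, None
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


# Correct BIO tag sequence from Figure 5.
bio_tags = ["B-TARGET", "O", "B-OPINION", "O", "B-OPINION", "O",
            "B-TARGET", "I-TARGET"]
bio_spans = bio_to_spans(bio_tags)
print(bio_spans)
```

Because every section starts with "B-", two adjacent sections with the same label decode as two sections rather than one.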

図６は、句切れ位置予測モデルの学習を行うための正解データの例を示す概略図である。図示するように、この正解データは、区間の開始位置と終了位置のペアの集合として与えられる。図示する例では、正解データは、４つの区間に関するデータを含む。それらの区間は、開始位置が１で終了位置が１、開始位置が３で終了位置が３、開始位置が５で終了位置が５、開始位置が７で終了位置が８、の４つである。この正解データは、図４や図５で示した例に対応している。図６で示すデータは、図２にも示した開始位置と終了位置についての正解を表す。 Figure 6 is a schematic diagram showing an example of correct answer data for training a phrase break position prediction model. As shown, this correct answer data is provided as a set of pairs of section start and end positions. In the example shown, the correct answer data includes data for four sections. These sections are: start position 1 and end position 1, start position 3 and end position 3, start position 5 and end position 5, and start position 7 and end position 8. This correct answer data corresponds to the examples shown in Figures 4 and 5. The data shown in Figure 6 represents the correct answers for the start and end positions also shown in Figure 2.

図７は、意見分析装置１がモデルの学習を行う際の処理の手順を示すフローチャートである。以下、このフローチャートに沿って、モデルの学習の手順を説明する。 Figure 7 is a flowchart showing the processing steps performed by the opinion analysis device 1 when learning a model. The model learning steps will be explained below in accordance with this flowchart.

ステップＳ１１において、学習用データ供給部３０は、学習用データを供給する。学習用データは、入力テキスト（単語系列）と、その入力データに対応する正解データとのペアである。正解データは、タグ系列の正解と、区間の位置（開始位置および終了位置）とを含む。単語埋め込み部１１は、入力テキスト（単語系列）を読み込む。また、系列ラベリングの損失関数算出部１７は、正解のタグ系列を読み込む。そして、句切れ位置の損失関数算出部２７は、区間の開始位置および終了位置の正解データ（例えば、図６）を読み込む。 In step S11, the training data supply unit 30 supplies training data. The training data is a pair of input text (word sequence) and correct answer data corresponding to the input data. The correct answer data includes the correct answer for the tag sequence and the position of the section (start position and end position). The word embedding unit 11 reads the input text (word sequence). The sequence labeling loss function calculation unit 17 also reads the correct answer tag sequence. The phrase break position loss function calculation unit 27 then reads the correct answer data for the start position and end position of the section (e.g., Figure 6).
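
One unit of the training data described in step S11 could be held in a structure like the following. This is a sketch under assumptions: the class name `TrainingExample`, its field names, and the validation rules are hypothetical and only illustrate the pairing of a word sequence with a correct tag sequence and correct section positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingExample:
    """A word sequence paired with its correct answers (hypothetical layout)."""

    words: tuple   # input text as a word sequence
    tags: tuple    # correct tag sequence, one tag per word
    spans: tuple   # correct (start, end) pairs, 1-based and inclusive

    def __post_init__(self):
        if len(self.words) != len(self.tags):
            raise ValueError("one tag is required per word")
        for start, end in self.spans:
            if not 1 <= start <= end <= len(self.words):
                raise ValueError(f"span ({start}, {end}) is out of range")


# The example of Figures 3, 5, and 6.
example = TrainingExample(
    words=("暴力", "は", "いけない", "、", "いけない", "よ", "＃", "アニメ"),
    tags=("B-TARGET", "O", "B-OPINION", "O", "B-OPINION", "O",
          "B-TARGET", "I-TARGET"),
    spans=((1, 1), (3, 3), (5, 5), (7, 8)),
)
print(len(example.words), example.spans)
```

In this layout, `words` would go to the word embedding unit 11, `tags` to the sequence labeling loss function calculation unit 17, and `spans` to the phrase break position loss function calculation unit 27.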

ステップＳ１２において、意見分析装置１は、算出される損失関数値に基づいて、各モデルの内部パラメーターの調整を行う。具体的には、次の通りである。 In step S12, the opinion analysis device 1 adjusts the internal parameters of each model based on the calculated loss function value. Specifically, this is as follows:

単語埋め込み部１１は、入力される単語系列に対応する分散表現の系列を、系列ラベリング部１２と句切れ位置予測部２２とに渡す。系列ラベリング部１２は、その時点での内部パラメーター値を持つ系列ラベリングモデルを用いて、系列ラベルを求める。また、句切れ位置予測部２２は、その時点での内部パラメーター値に基づく句切れ位置予測モデルを用いて、句切れ位置を求める。系列ラベリングの損失関数算出部１７は、系列ラベリング部１２が出力する系列ラベルと、正解の系列ラベルとに基づいて、損失関数値を算出する。句切れ位置の損失関数算出部２７は、句切れ位置予測部２２が出力する句切れ位置と、正解の句切れ位置（開始位置および終了位置）とに基づいて、損失関数値を算出する。これら算出された損失関数値に基づいて、誤差逆伝播法により、各モデルの内部パラメーターの調整を行う。このようなモデルの内部パラメーターの調整を繰り返すことは、損失関数値が減少する方向に作用する。言い換えれば、パラメーターの調整を繰り返すことにより、各モデルは、より正解に近い出力値を算出できるようになる。なお、誤差逆伝播法によるモデルのパラメーターの調整の手法自体は、既存技術に属するものである。 The word embedding unit 11 passes the sequence of distributed representations corresponding to the input word sequence to the sequence labeling unit 12 and the phrase break position prediction unit 22. The sequence labeling unit 12 obtains sequence labels using the sequence labeling model with its internal parameter values at that point in time. The phrase break position prediction unit 22 obtains phrase break positions using the phrase break position prediction model with its internal parameter values at that point in time. The sequence labeling loss function calculation unit 17 calculates a loss function value from the sequence labels output by the sequence labeling unit 12 and the correct sequence labels. The phrase break position loss function calculation unit 27 calculates a loss function value from the phrase break positions output by the phrase break position prediction unit 22 and the correct phrase break positions (start and end positions). Based on these loss function values, the internal parameters of each model are adjusted by backpropagation. Repeating this adjustment acts to reduce the loss function value. In other words, by repeating the parameter adjustment, each model becomes able to calculate output values closer to the correct answers. Note that the technique of adjusting model parameters by backpropagation is itself existing technology.
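
The flow of errors in step S12 can be illustrated with a deliberately tiny numerical toy. Everything here is an assumption made for illustration: a single shared scalar `w` stands in for the word embedding model, scalars `a` and `b` stand in for the sequence labeling and phrase break position models, and squared errors stand in for L_seq and L_span (the embodiment's actual losses are not reproduced). The point shown is only that each head's parameter receives the gradient of its own loss, while the shared parameter receives the sum of both gradients.

```python
def joint_losses(w, a, b, x, y_seq, y_span):
    """Toy stand-ins for L_seq and L_span (squared errors, assumed)."""
    loss_seq = (a * w * x - y_seq) ** 2
    loss_span = (b * w * x - y_span) ** 2
    return loss_seq, loss_span


def train_step(w, a, b, x, y_seq, y_span, lr):
    """One gradient-descent step on L_total = L_span + L_seq."""
    err_seq = a * w * x - y_seq
    err_span = b * w * x - y_span
    grad_a = 2 * err_seq * w * x          # only the sequence labeling loss
    grad_b = 2 * err_span * w * x         # only the phrase break loss
    grad_w = 2 * err_seq * a * x + 2 * err_span * b * x  # both losses
    return w - lr * grad_w, a - lr * grad_a, b - lr * grad_b


w, a, b = 0.5, 0.5, 0.5                   # arbitrary initial parameters
x, y_seq, y_span = 1.0, 1.0, 2.0          # arbitrary toy input and targets
initial_total = sum(joint_losses(w, a, b, x, y_seq, y_span))

for _ in range(500):
    w, a, b = train_step(w, a, b, x, y_seq, y_span, lr=0.02)

final_total = sum(joint_losses(w, a, b, x, y_seq, y_span))
print(initial_total, final_total)
```

In a real implementation the gradients would be obtained by automatic differentiation over the full models rather than written by hand.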

ステップＳ１３において、意見分析装置１は、パラメーターを調整した後の各モデルの状態を保存する。具体的には、意見分析装置１は、調整後のパラメーターの値を不揮発性のメモリー等に書き込む。 In step S13, the opinion analysis device 1 saves the state of each model after the parameters have been adjusted. Specifically, the opinion analysis device 1 writes the adjusted parameter values to non-volatile memory or the like.

図８は、意見分析装置１が、学習済みのモデルを用いて、系列ラベリングの処理を行う際の手順を示すフローチャートである。図７に示した処理手順のステップＳ１３においてモデルの状態が保存されているため、意見分析装置１は、各モデルの状態を読み出すことができる。以下、このフローチャートに沿って、系列ラベルを推定する処理の手順を説明する。 Figure 8 is a flowchart showing the steps taken by the opinion analysis device 1 when performing sequence labeling processing using a trained model. Because the model state was saved in step S13 of the processing procedure shown in Figure 7, the opinion analysis device 1 can read the state of each model. Below, the processing steps for estimating sequence labels will be explained using this flowchart.
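
Saving the model state in step S13 and reading it back for inference might look like the following sketch. The JSON file format, the file name, the parameter names, and the toy values are all assumptions for illustration. Since the phrase break position prediction model is used only during training, the loader below returns only the other two models' parameters.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the adjusted parameters of every model to a file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)


def load_state(path, needed=("word_embedding", "sequence_labeling")):
    """Read back only the models that inference needs."""
    with open(path, encoding="utf-8") as f:
        state = json.load(f)
    return {name: state[name] for name in needed}


# Hypothetical parameter values after training.
trained = {
    "word_embedding": [0.1, 0.2],
    "sequence_labeling": [0.3],
    "phrase_break_prediction": [0.4],
}

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "model_state.json")
    save_state(path, trained)
    for_inference = load_state(path)

print(sorted(for_inference))
```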

ステップＳ２１において、単語埋め込み部１１は、入力テキスト（単語系列）を読み込む。 In step S21, the word embedding unit 11 reads the input text (word sequence).

ステップＳ２２において、意見分析装置１は、モデルを用いて、入力テキストに対応するタグ系列を推論する。具体的には、単語埋め込みモデルを有する単語埋め込み部１１は、単語埋め込みモデルによって算出したベクトルの系列を出力する。系列ラベリングモデルを有する系列ラベリング部１２は、単語埋め込み部１１から渡されるベクトルの系列に基づいて、タグ系列を求める。 In step S22, the opinion analysis device 1 uses the model to infer a tag sequence corresponding to the input text. Specifically, the word embedding unit 11, which has a word embedding model, outputs a sequence of vectors calculated using the word embedding model. The sequence labeling unit 12, which has a sequence labeling model, determines a tag sequence based on the sequence of vectors passed from the word embedding unit 11.

ステップＳ２３において、系列ラベリング部１２は、ステップＳ２２において求めたタグ系列を出力する。入力テキスト（単語列）に対応するタグ系列の例は、図４や図５に示した通りである。 In step S23, the sequence labeling unit 12 outputs the tag sequence determined in step S22. Examples of tag sequences corresponding to the input text (word string) are shown in Figures 4 and 5.

以上説明したように本実施形態の意見分析装置を用いることにより、入力される単語系列に対応してタグ系列を自動的に付与する処理において、付与するタグの精度を向上させることができる。具体例としては、入力される単語系列の各単語について、意見対象を表す単語であることを示すタグや、意見を表す単語であることを示すタグを、精度よく付与することができるようになる。つまり、入力される単語列中の、特定の性質を有する部分（例えば、意見対象や、意見など）を、精度よく抽出することが可能となる。 As described above, by using the opinion analysis device of this embodiment, the accuracy of tags assigned can be improved in the process of automatically assigning a tag sequence corresponding to an input word sequence. As a specific example, it becomes possible to accurately assign a tag indicating that the word represents an opinion subject or a tag indicating that the word represents an opinion to each word in the input word sequence. In other words, it becomes possible to accurately extract parts of the input word sequence that have specific properties (for example, an opinion subject or an opinion).

本実施形態のこのような装置を、ＳＮＳで投稿されるテキストの中から特定の性質の部分（例えば、意見対象の部分や、意見の部分など）を抽出するために利用することができる。例えば、ＳＮＳでの投稿内容に基づいて特定のコンテンツ（放送番組等）の視聴に対する反響を自動的に分析する場合に、投稿内容のテキスト自体にはどの部分が意見対象でどの部分が意見であるかが明示されていなくても、これらの部分を自動的に精度よく抽出することができるようになる。あるいは投稿内容のテキストにはどのコンテンツに対する反響であるかを表すハッシュタグ等が含まれていない場合にも、意見対象の部分や意見の部分等を自動的に抽出することが可能となる。つまり、本実施形態の装置によると、キーワード検索等の手法よりも強力かつ精度の高い分析が可能となる。つまり、本実施形態により、大規模な市場調査等を容易に且つ精度よく実施することが可能となる。 Such a device according to this embodiment can be used to extract portions of a particular nature (e.g., portions that are the subject of opinion or portions that contain opinions) from text posted on SNS. For example, when automatically analyzing reactions to viewing specific content (such as a broadcast program) based on the content posted on SNS, these portions can be automatically and accurately extracted even if the text of the posted content does not explicitly state which portions are the subject of opinion and which portions are opinions. Alternatively, even if the text of the posted content does not contain hashtags or the like that indicate which content the reaction is to, it is possible to automatically extract portions that are the subject of opinion or portions that contain opinions. In other words, the device according to this embodiment enables more powerful and accurate analysis than methods such as keyword search. In other words, this embodiment makes it possible to easily and accurately conduct large-scale market research, etc.

ＳＮＳ等に投稿された大量のテキスト（例えば、数千件から数十万件の程度）について、意見対象や意見を自動的に抽出することができると、どのような意見対象に対してどのような意見がどの程度発言されたかを自動的に集計できるようになる。この集計の際には、同一の意見対象についての集計を行ったり、似通った意見対象をまとめ上げるようなクラスタリング処理を行ったりすることもできる。集計の手法やクラスタリングの手法としては、従来技術による手法を用いることができる。例えば、放送番組に対する視聴者らの意見を自動的に集計することが可能となる。 If it were possible to automatically extract opinion subjects and opinions from large amounts of text (for example, several thousand to several hundred thousand items) posted on social media, etc., it would be possible to automatically tally what kind of opinions were expressed and to what extent for each opinion subject. When tallying, it would be possible to tally the same opinion subject or perform clustering processing to group together similar opinion subjects. Conventional technology methods can be used for tallying and clustering. For example, it would be possible to automatically tally viewer opinions on a broadcast program.
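
The tallying described above can be sketched with a counter over (opinion target, opinion) pairs. This is an assumption-laden sketch: the text does not say how a target is paired with an opinion, so every target is simply paired with every opinion found in the same post, and the second post in the sample data is hypothetical.

```python
from collections import Counter
from itertools import product


def tally(posts):
    """Count (target, opinion) pairs over posts.

    Each post is a (targets, opinions) pair of extracted expressions.
    A pair is counted at most once per post (assumed rule).
    """
    counts = Counter()
    for targets, opinions in posts:
        counts.update(product(set(targets), set(opinions)))
    return counts


posts = [
    (["暴力", "＃アニメ"], ["いけない", "いけない"]),  # extraction result of Figure 3
    (["暴力"], ["いけない"]),                          # hypothetical second post
]
counts = tally(posts)
for (target, opinion), n in counts.most_common():
    print(target, opinion, n)
```

Clustering of similar opinion targets, which the text also mentions, would be applied to the target strings before counting and is not shown here.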

図９は、上記実施形態の意見分析装置１の内部構成の例を示すブロック図である。意見分析装置１は、コンピューターを用いて実現され得る。図示するように、そのコンピューターは、中央処理装置９０１と、ＲＡＭ９０２と、入出力ポート９０３と、入出力デバイス９０４や９０５等と、バス９０６と、を含んで構成される。コンピューター自体は、既存技術を用いて実現可能である。中央処理装置９０１は、ＲＡＭ９０２等から読み込んだプログラムに含まれる命令を実行する。中央処理装置９０１は、各命令にしたがって、ＲＡＭ９０２にデータを書き込んだり、ＲＡＭ９０２からデータを読み出したり、算術演算や論理演算を行ったりする。ＲＡＭ９０２は、データやプログラムを記憶する。ＲＡＭ９０２に含まれる各要素は、アドレスを持ち、アドレスを用いてアクセスされ得るものである。なお、ＲＡＭは、「ランダムアクセスメモリー」の略である。入出力ポート９０３は、中央処理装置９０１が外部の入出力デバイス等とデータのやり取りを行うためのポートである。入出力デバイス９０４や９０５は、入出力デバイスである。入出力デバイス９０４や９０５は、入出力ポート９０３を介して中央処理装置９０１との間でデータをやりとりする。バス９０６は、コンピューター内部で使用される共通の通信路である。例えば、中央処理装置９０１は、バス９０６を介してＲＡＭ９０２のデータを読んだり書いたりする。また、例えば、中央処理装置９０１は、バス９０６を介して入出力ポートにアクセスする。 Figure 9 is a block diagram showing an example of the internal configuration of the opinion analysis device 1 of the above embodiment. The opinion analysis device 1 can be realized using a computer. As shown in the figure, the computer includes a central processing unit 901, a RAM 902, an input/output port 903, input/output devices 904 and 905, etc., and a bus 906. The computer itself can be realized using existing technology. The central processing unit 901 executes instructions contained in a program read from the RAM 902, etc. In accordance with each instruction, the central processing unit 901 writes data to the RAM 902, reads data from the RAM 902, and performs arithmetic and logical operations. The RAM 902 stores data and programs. Each element contained in the RAM 902 has an address and can be accessed using that address. Note that RAM is an abbreviation for "random access memory." The input/output port 903 is a port through which the central processing unit 901 exchanges data with external input/output devices, etc. The input/output devices 904 and 905 are input/output devices. The input/output devices 904 and 905 exchange data with the central processing unit 901 via the input/output port 903. The bus 906 is a common communication path used within the computer. For example, the central processing unit 901 reads and writes data in the RAM 902 via the bus 906. Also, for example, the central processing unit 901 accesses the input/output port via the bus 906.

なお、上述した意見分析装置１の少なくとも一部の機能をコンピューターで実現することができる。その場合、この機能を実現するためのプログラムをコンピューター読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピューターシステムに読み込ませ、実行することによって実現しても良い。なお、ここでいう「コンピューターシステム」とは、ＯＳや周辺機器等のハードウェアを含むものとする。また、「コンピューター読み取り可能な記録媒体」とは、フレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ－ＲＯＭ、ＤＶＤ－ＲＯＭ、ＵＳＢメモリー等の可搬媒体、コンピューターシステムに内蔵されるハードディスク等の記憶装置のことをいう。つまり、「コンピューター読み取り可能な記録媒体」とは、非一過性の（non-transitory）コンピューター読み取り可能な記録媒体であってよい。さらに「コンピューター読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムを送信する場合の通信線のように、一時的に、動的にプログラムを保持するもの、その場合のサーバーやクライアントとなるコンピューターシステム内部の揮発性メモリーのように、一定時間プログラムを保持しているものも含んでも良い。また上記プログラムは、前述した機能の一部を実現するためのものであっても良く、さらに前述した機能をコンピューターシステムにすでに記録されているプログラムとの組み合わせで実現できるものであっても良い。 At least some of the functions of the opinion analysis device 1 described above can be implemented by a computer. In this case, a program for implementing these functions may be recorded on a computer-readable recording medium, and the program may be loaded into a computer system and executed. Note that the term "computer system" here includes hardware such as an OS and peripheral devices. Furthermore, "computer-readable recording medium" refers to portable media such as flexible disks, optical magnetic disks, ROMs, CD-ROMs, DVD-ROMs, and USB memory, as well as storage devices such as hard disks built into computer systems. In other words, a "computer-readable recording medium" may be a non-transitory computer-readable recording medium. Furthermore, "computer-readable recording medium" may also include media that temporarily and dynamically store programs, such as communication lines used when transmitting programs over networks like the Internet or telephone lines, or media that store programs for a certain period of time, such as volatile memory within the computer systems that serve as servers or clients in such cases. The program may also be a program that implements some of the functions described above, or it may be a program that can be implemented in combination with a program already stored in the computer system.

以上、複数の実施形態を説明したが、本発明はさらに次のような変形例でも実施することが可能である。 Although several embodiments have been described above, the present invention can also be implemented in the following modified forms.

［第１変形例］ [First Modification]
以上の説明では、処理対象とするテキストは日本語で記述されたテキストであったが、本実施形態を日本語以外のデータに適用してもよい。後で説明する実証実験においては、日本語の他に英語のテキストのデータを処理対象としている。また、さらに、その他の言語によるテキストを処理対象としてもよい。 In the above explanation, the text to be processed is written in Japanese, but this embodiment may also be applied to data in languages other than Japanese. In the demonstration experiment described later, English text is processed in addition to Japanese text. Furthermore, text in other languages may also be processed.

［第２変形例］ [Second Modification]
上記実施形態では、１台の意見分析装置１が、モデルの学習も行い、学習済みのモデルを用いてラベル系列を付与する（推定する）処理も行うものであった。変形例として、意見分析装置１が、モデルの学習と、学習済みのモデルを用いたラベル系列の推定との、いずれかのみをおこなうものであってもよい。意見分析装置１が機械学習装置として機能する場合には、学習済みのモデルのパラメーターを他の装置に移植して、移植先の装置においてラベル系列を付与する（推定する）処理を行わせることができる。また、意見分析装置１が、自らは機械学習を行わず、学習済みのモデルのパラメーターを取得して、ラベル系列を付与する（推定する）処理を行うものであってもよい。 In the above embodiment, a single opinion analysis device 1 both trains the models and assigns (estimates) a label sequence using the trained models. As a variation, the opinion analysis device 1 may perform only one of model training and label sequence estimation using the trained models. When the opinion analysis device 1 functions as a machine learning device, the parameters of the trained models can be transferred to another device, and that device can perform the process of assigning (estimating) a label sequence. Alternatively, the opinion analysis device 1 may perform no machine learning itself, and may instead acquire the parameters of trained models and perform the process of assigning (estimating) a label sequence.

後者の場合には、意見分析装置１は、単語列を入力し前記単語列に対応する単語埋め込み表現列を出力する単語埋め込み部と、前記単語埋め込み部から出力される単語埋め込み表現列を入力し前記単語埋め込み表現列に対応するラベル系列を出力する系列ラベリング部と、を備える自然言語処理装置として機能する。このとき、少なくとも単語埋め込み部１１が内部に持つモデルとしては、上記実施形態で説明した機械学習の仕組み（機械学習装置）を用いて学習済みのものを利用する。 In the latter case, the opinion analysis device 1 functions as a natural language processing device that includes a word embedding unit that inputs a word string and outputs a word embedding sequence corresponding to the word string, and a sequence labeling unit that inputs the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence. In this case, at least the model that the word embedding unit 11 internally has is one that has been trained using the machine learning mechanism (machine learning device) described in the above embodiment.

以上、この発明の実施形態およびその変形例について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 Embodiments of the present invention and their variations have been described in detail above with reference to the drawings, but the specific configurations are not limited to these embodiments and include designs that do not deviate from the gist of the present invention.

［実装および評価］ [Implementation and Evaluation]
意見分析装置１を実装し、ＳＮＳ（ツイッター）における実際の投稿文や、映画に関するレビュー文を用いた実験により評価した。その実験での評価結果について説明する。 The opinion analysis device 1 was implemented and evaluated through an experiment using actual posts on an SNS (Twitter) and movie reviews. The evaluation results of the experiment are described below.

評価に用いたデータは、テレビ番組に関する日本語のツイートと、映画に関する英語のレビューとである。具体的には、日本語テキストとして、ツイッターにおける、テレビ番組（ＮＨＫ朝ドラ）「なつぞら」についての、９日間の日本語でのツイート（ツイート数は２４，６１０、単語数は９８４，８６１）を用いた。また、英語テキストとして、評論サイトＩＭＤｂのレビュー（レビュー数は９８６、文数は１０，３６０、単語数は３２１，８０７）を用いた。なお、学習用データには、人手で正解を付与した。 The data used for the evaluation consisted of Japanese tweets about television programs and English reviews of movies. Specifically, the Japanese text used was Japanese tweets from Twitter about the television program (NHK morning drama) "Natsuzora" over a nine-day period (24,610 tweets, 984,861 words). The English text used was reviews from the review site IMDb (986 reviews, 10,360 sentences, 321,807 words). Correct answers were manually annotated to the training data.

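評価尺度としては、precision（プレシジョン）、recall（リコール）、f1を用いた。これらは、それぞれ、式（５）、式（６）、式（７）で定義される。なお、ここで、ＴＰは真陽性、ＦＰは偽陽性、ＦＮは偽陰性である。 The evaluation measures used were precision, recall, and f1. These are defined by equations (5), (6), and (7), respectively, where TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.

$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{5}$$

$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{6}$$

$$\mathrm{f1} = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{7}$$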
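
As a sketch, and assuming that equations (5) to (7) are the standard definitions of these measures in terms of TP, FP, and FN, the three values can be computed as follows. The function name and the choice to return 0.0 when a denominator is zero are assumptions of this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from true/false positive/negative counts.

    A measure whose denominator is zero is reported as 0.0 (assumed convention).
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Arbitrary illustrative counts, not results from the experiment.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f)
```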

放送番組に関する日本語によるテキスト（ツイート）の分析タスクに対して、ラベルを最上位の第１階層に制限した場合の実験結果は、表１に示す通りである。 Table 1 shows the experimental results when limiting labels to the top level (first hierarchical level) for the task of analyzing Japanese text (tweets) related to broadcast programs.

なお、この実験において、ラベルは３階層に体系化され、第１階層で大きく２種類に分けられている。その２種類とは、放送番組の内容に言及した部分に付与されるREFERENCE系のラベルと、発信者の主観を述べた部分に付与されるOPINION系のラベルである。REFERENCE系のラベルは、第２階層におけるTITLE、MUSIC、PERSON、PROGRAM、SCENE、STORY、QUOTEという各ラベルを持つ。一方、OPINION系のラベルは、第２階層におけるEVALUATION、ACTION、INDEXという各ラベルを持つ。このうち、第２階層におけるEVALUATIONは、第３階層におけるPOSITIVE、NEGATIVE、NEUTRAL、REQUESTという各ラベルを持つ。 In this experiment, the labels were organized into three levels and divided into two broad types at the first level. The two types are REFERENCE labels, which are assigned to parts that refer to the content of the broadcast program, and OPINION labels, which are assigned to parts that state the poster's subjective view. The REFERENCE labels have the second-level labels TITLE, MUSIC, PERSON, PROGRAM, SCENE, STORY, and QUOTE. The OPINION labels have the second-level labels EVALUATION, ACTION, and INDEX. Of these, the second-level label EVALUATION has the third-level labels POSITIVE, NEGATIVE, NEUTRAL, and REQUEST.
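
The three-level label system described above can be written down as a nested mapping. The label names come from the text; the nested-dictionary layout, the representation of a label as a tuple path, and the helper names are assumptions of this sketch. Truncating a path to depth 1 corresponds to restricting labels to the first level.

```python
LABEL_HIERARCHY = {
    "REFERENCE": {
        "TITLE": {}, "MUSIC": {}, "PERSON": {}, "PROGRAM": {},
        "SCENE": {}, "STORY": {}, "QUOTE": {},
    },
    "OPINION": {
        "EVALUATION": {
            "POSITIVE": {}, "NEGATIVE": {}, "NEUTRAL": {}, "REQUEST": {},
        },
        "ACTION": {},
        "INDEX": {},
    },
}


def is_valid(path, tree=LABEL_HIERARCHY):
    """Check that a label path such as ("OPINION", "EVALUATION") exists."""
    for name in path:
        if name not in tree:
            return False
        tree = tree[name]
    return len(path) > 0


def leaf_paths(tree=LABEL_HIERARCHY, prefix=()):
    """Yield every most-specific label path in the hierarchy."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def truncate(path, depth):
    """Restrict a label path to its first `depth` levels."""
    return path[:depth]


full = ("OPINION", "EVALUATION", "NEGATIVE")
print(is_valid(full), truncate(full, 1), len(list(leaf_paths())))
```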

実験では、評価対象として、第１階層のREFERENCEラベルおよびOPINIONラベルを個別に評価し、そして、全体での評価も行った。比較対象である従来技術は、学習時にも句切れ位置予測部２２（図１を参照）を用いない場合である。 In the experiment, the first-layer REFERENCE and OPINION labels were evaluated individually, and also as a whole. The prior art used for comparison does not use the phrase break position prediction unit 22 (see Figure 1) even during training.

評価尺度recallのREFERENCEラベルに関して本実施形態の結果の値が従来技術の結果をわずかに下回ったのを除いて、他のすべての場合については、本実施形態の結果は従来技術の結果を上回ることが確認できた。 Except for the REFERENCE label in the evaluation metric recall, where the results of this embodiment were slightly lower than those of the prior art, it was confirmed that in all other cases the results of this embodiment outperformed those of the prior art.

なお、本実施形態を用いることの効果は、モデルを学習する際のエポック（epoch）数によって異なる。学習のはじめの段階（１０エポック以下程度の段階）では、従来技術と本実施形態との間で、ｆ１値で評価する性能には大きな違いは見られない。しかしながら、１０エポックを超えるあたりから本実施形態の効果が見られはじめ、２０エポック程度になると本実施形態の優位性（従来技術に対するｆ１値での性能差）が顕著となる。 The effect of using this embodiment varies depending on the number of epochs used when training the model. In the early stages of training (less than about 10 epochs), there is no significant difference in performance evaluated by the f1 value between the conventional technology and this embodiment. However, once the number of epochs exceeds 10, the effect of this embodiment begins to be seen, and once the number of epochs reaches about 20, the superiority of this embodiment (the difference in performance in the f1 value compared to the conventional technology) becomes apparent.

英語で記述された映画レビューのテキストの分析タスクに対して、ラベルを最上位の第１階層に制限した場合の実験結果は、表２に示す通りである。表１の場合と同様に、表２においても第１階層のREFERENCEラベルおよびOPINIONラベルの個別に評価と、全体での評価とを行った。 Table 2 shows the experimental results when the labels were limited to the top first level for the task of analyzing movie review text written in English. As with Table 1, in Table 2 we evaluated the REFERENCE and OPINION labels in the first level individually, as well as the overall evaluation.

この英語のテキストの分析においても、ほとんどの場合に本実施形態が従来技術の結果を上回ることが確認できた。本実施形態の評価結果が従来技術の評価結果を下回るのは、評価尺度recallのREFERENCEラベルに関してのみである。その他の場合には、本実施形態の評価結果が従来技術の評価を上回っている。 This analysis of English text also confirmed that this embodiment outperformed the results of the prior art in most cases. The only time the evaluation results of this embodiment fell short of those of the prior art was for the REFERENCE label in the recall evaluation scale. In all other cases, the evaluation results of this embodiment outperformed those of the prior art.

本実施形態における改善が従来技術と比較して有意なものであるか否かを表す有意水準（ｐ値）を、下の表３に示す。ここで示すのは、McNemar検定を行った場合の有意水準である。 Table 3 below shows the significance level (p-value) indicating whether the improvement in this embodiment is significant compared to the conventional technology. The significance level shown here is the result of a McNemar test.
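
For reference, one common form of the McNemar test is the exact two-sided version based on the binomial distribution, sketched below. The text does not state which variant was used or what the unit of comparison was, so both the variant and the counts here are assumptions. `b` is the number of items that only the first system got right, and `c` the number that only the second system got right.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis each discordant item favors either system
    with probability 1/2, so min(b, c) follows Binomial(b + c, 1/2).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Arbitrary illustrative counts, not values from the experiment.
p_value = mcnemar_exact(1, 9)
print(p_value)
```

Items that both systems got right or both got wrong do not enter the statistic.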

放送番組に関する日本語によるテキスト（ツイート）の分析タスクに対して、ラベルの階層を制限しない場合の実験結果は、下の表４に示す通りである。表４に示す数値は、全ラベルの総合的な評価結果（マイクロ平均）である。 The experimental results for the task of analyzing Japanese text (tweets) related to broadcast programs, without restricting the label hierarchy, are shown in Table 4 below. The values shown in Table 4 are the overall evaluation results (micro-average) for all labels.
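
Micro-averaging, as mentioned above, pools the counts of all labels before computing the measures, so frequent labels weigh more than rare ones. The sketch below assumes per-label counts stored in a dictionary with the keys `tp`, `fp`, and `fn`; that layout, the function name, and the sample counts are hypothetical.

```python
def micro_average(counts_by_label):
    """Return micro-averaged (precision, recall, f1) over all labels."""
    tp = sum(c["tp"] for c in counts_by_label.values())
    fp = sum(c["fp"] for c in counts_by_label.values())
    fn = sum(c["fn"] for c in counts_by_label.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Arbitrary illustrative counts, not results from the experiment.
counts_by_label = {
    "REFERENCE": {"tp": 8, "fp": 2, "fn": 4},
    "OPINION": {"tp": 2, "fp": 2, "fn": 0},
}
micro_p, micro_r, micro_f1 = micro_average(counts_by_label)
print(micro_p, micro_r, micro_f1)
```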

以上の、表１、表２、表４で示したように、本実施形態によるテキスト内の特定の位置づけの表現を抽出（意見対象の抽出）の精度が従来技術の場合よりもよいことが、実験によって確認できた。また、表３で示したように、精度の向上が有意なものであることを、検定によって確かめることができた。 As shown in Tables 1, 2, and 4 above, experiments have confirmed that the accuracy of extracting expressions with specific positions within text (extracting opinion targets) using this embodiment is better than that of conventional technology. Furthermore, as shown in Table 3, testing has confirmed that the improvement in accuracy is significant.

本発明は、例えば、自然言語で記述された文から特定の位置づけの要素を抽出するために利用することができる。一例として、ＳＮＳに投稿されるテキストから特定の位置づけの要素を抽出するために利用することができる。但し、本発明の利用範囲はここに例示したものには限られない。 The present invention can be used, for example, to extract elements of a specific position from sentences written in natural language. As an example, it can be used to extract elements of a specific position from text posted on social media. However, the scope of use of the present invention is not limited to the examples given here.

１意見分析装置（機械学習装置、自然言語処理装置） 1 Opinion analysis device (machine learning device, natural language processing device)
１１単語埋め込み部 11 Word embedding unit
１２系列ラベリング部 12 Sequence labeling unit
１７系列ラベリングの損失関数算出部 17 Sequence labeling loss function calculation unit
２２句切れ位置予測部 22 Phrase break position prediction unit
２７句切れ位置の損失関数算出部 27 Phrase break position loss function calculation unit
３０学習用データ供給部 30 Learning data supply unit
２２０全結合回帰層 220 Fully connected regression layer
１１０単語埋め込み層（ＢＥＲＴエンベディング層） 110 Word embedding layer (BERT embedding layer)
９０１中央処理装置 901 Central processing unit
９０２ＲＡＭ 902 RAM
９０３入出力ポート 903 Input/output port
９０４，９０５入出力デバイス 904, 905 Input/output devices
９０６バス 906 Bus

Claims

A machine learning device comprising:
a word embedding unit that receives a word sequence and outputs a word embedding sequence corresponding to the word sequence;
a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence;
a phrase break position prediction unit that receives the word embedding sequence output from the word embedding unit and outputs phrase break position information corresponding to the word embedding sequence; and
a learning data supply unit that supplies a word sequence to be input to the word embedding unit, supplies a correct label sequence corresponding to the label sequence output by the sequence labeling unit, and further supplies correct phrase break position information corresponding to the phrase break position information output by the phrase break position prediction unit,
wherein the sequence labeling unit adjusts parameters of its internal model by performing backpropagation based on an error between the label sequence and the correct label sequence,
the phrase break position prediction unit adjusts parameters of its internal model by performing backpropagation based on an error between the phrase break position information and the correct phrase break position information, and
the word embedding unit adjusts parameters of its internal model based on the error backpropagated from the sequence labeling unit and the error backpropagated from the phrase break position prediction unit.

The machine learning device according to claim 1, wherein the phrase break position prediction unit outputs the phrase break position information corresponding to the word embedding sequence by using a fully connected regression model based on all word embeddings included in the input word embedding sequence.

The machine learning device according to claim 1 or 2, wherein the sequence labeling unit outputs at least a label indicating that a predetermined subsequence included in the original input word sequence is an opinion target, and a label indicating that the predetermined subsequence is an opinion.

The machine learning device according to claim 3, wherein the phrase break position prediction unit outputs, as the phrase break position information, numerical values representing the start position and the end position of the subsequence, and
the correct phrase break position information supplied by the learning data supply unit is numerical information indicating the correct start position and end position of the subsequence.

A natural language processing device comprising:
a word embedding unit that receives a word sequence and outputs a word embedding sequence corresponding to the word sequence; and
a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence,
wherein at least the model internal to the word embedding unit has been trained by the machine learning device according to any one of claims 1 to 4.

A program that causes a computer to function as a machine learning device comprising:
a word embedding unit that receives a word sequence and outputs a word embedding sequence corresponding to the word sequence;
a sequence labeling unit that receives the word embedding sequence output from the word embedding unit and outputs a label sequence corresponding to the word embedding sequence;
a phrase break position prediction unit that receives the word embedding sequence output from the word embedding unit and outputs phrase break position information corresponding to the word embedding sequence; and
a learning data supply unit that supplies a word sequence to be input to the word embedding unit, supplies a correct label sequence corresponding to the label sequence output by the sequence labeling unit, and further supplies correct phrase break position information corresponding to the phrase break position information output by the phrase break position prediction unit,
wherein the sequence labeling unit adjusts parameters of its internal model by performing backpropagation based on an error between the label sequence and the correct label sequence,
the phrase break position prediction unit adjusts parameters of its internal model by performing backpropagation based on an error between the phrase break position information and the correct phrase break position information, and
the word embedding unit adjusts parameters of its internal model based on the error backpropagated from the sequence labeling unit and the error backpropagated from the phrase break position prediction unit.