JP7619481B2

JP7619481B2 - Model learning device, model learning method, and model learning program

Info

Publication number: JP7619481B2
Application number: JP2023563387A
Authority: JP
Inventors: 優太南部; 匡宏幸島; 隆二山本
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2021-11-24
Filing date: 2021-11-24
Publication date: 2025-01-22
Anticipated expiration: 2041-11-24
Also published as: WO2023095211A1; JPWO2023095211A1

Description

The present invention relates to a model learning device, a model learning method, and a model learning program.

Learning a model that represents input-output relationships from multivariate data is one of the problems in machine learning and artificial intelligence. In the usual setting, a set of pairs, each consisting of an input value and the output value obtained from that input, is given as numerical data. However, when respondents to a questionnaire decline to answer because of psychological resistance, or when the output value is difficult to observe, an order relationship rather than a numerical value may be given as the output.

Cases where numerical values cannot be obtained because of psychological resistance include the collection and analysis of physical information and information about private life. For example, in a questionnaire survey investigating the relationship between diet and health, respondents may be unwilling to report their weight. Similarly, when data is collected to predict electricity consumption from indoor conditions (temperature, humidity, and so on), measured values may be withheld because respondents do not want their time at home or their family composition to be inferred.

Examples of output values that are difficult to observe include emotions and affective states such as satisfaction and excitement. In general, the true value of a human emotion cannot be observed, and any evaluation includes the evaluator's subjectivity, which makes absolute evaluation difficult. For this reason, data on human emotions is often obtained through multi-level ratings such as Likert scales (Non-Patent Document 1) and SAM (self-assessment manikins) (Non-Patent Document 2). These rating scales are nonlinear and include evaluator bias, so it is not appropriate to treat them as interval scales or ratio scales. For instance, it is not valid to compare a rating of 2 points by evaluator A with a rating of 4 points by evaluator B and conclude that there is a two-fold difference, nor to treat that gap as equivalent to the gap between ratings of 1 point and 3 points merely because both differ by 2 points.

These issues are alleviated by posing questionnaire items on an ordinal scale, or by interpreting the answers as ordinal. Posing an item on an ordinal scale means asking "Is your weight at least __ kg?" rather than "Please enter your weight." Because no specific value has to be stated, psychological resistance is reduced and answers are easier to obtain. For emotions as well, a rating of 2 points by evaluator A and a rating of 4 points by evaluator B can be interpreted as "A < B". This treats the answers as an order relationship, which reduces the risk of misinterpretation compared with treating them as numerical values. In such a setting, it is necessary to consider the problem of predicting specific numerical values from order relationships when no specific values are available and only order relationships are given as label data.

Rank learning is a method for learning model parameters from order relationships. The general flow of rank learning is described below, based on the commonly used pairwise rank learning. For observable multivariate data X with n_x elements, a set D of n_k pairs indicating order relationships is defined as label data. The multivariate data X and the set D of pairs are as follows:
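A form of these definitions that is consistent with the surrounding description would be the following. The exact notation, in particular writing each pair with the symbol ≻ and indexing the pairs by k, is an assumption.

```latex
X = \{x_1, x_2, \dots, x_{n_x}\}, \qquad
D = \{\, v_k \succ u_k \mid v_k, u_k \in X,\ k = 1, \dots, n_k \,\}
```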

This notation does not indicate that v_k > u_k; it indicates that the objective variables y_vk and y_uk for the input values v_k and u_k satisfy y_vk > y_uk. Letting f be the model to be trained, learning is carried out by minimizing a loss such as L = exp(-(f(v_k) - f(u_k))), so that the outputs f(v_k) and f(u_k) obtained for the inputs v_k and u_k satisfy f(v_k) > f(u_k). In other words, the purpose of rank learning is to tune the parameters of a score-output model f so that its scores produce the intended ranking.
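As a minimal sketch of this pairwise setup, the snippet below evaluates an exponential pairwise loss for two hypothetical linear scoring models. The sign convention is chosen so that the loss falls as the margin f(v_k) - f(u_k) grows. The function name `pairwise_exp_loss`, the linear form of f, and the sample vectors are illustrative assumptions, not part of the original description.

```python
import numpy as np

def pairwise_exp_loss(f, pairs):
    """Mean of exp(-(f(v) - f(u))) over pairs (v, u) labelled y_v > y_u."""
    margins = np.array([f(v) - f(u) for v, u in pairs])
    return float(np.mean(np.exp(-margins)))

# Hypothetical linear scoring models f(x) = w . x
w_good = np.array([1.0, 0.0])   # scores agree with the labelled order
w_bad = np.array([-1.0, 0.0])   # scores reverse the labelled order

# Each pair (v, u) says: the objective variable for v exceeds that for u.
pairs = [
    (np.array([2.0, 0.5]), np.array([1.0, 0.3])),
    (np.array([3.0, 0.1]), np.array([0.5, 0.9])),
]

loss_good = pairwise_exp_loss(lambda x: w_good @ x, pairs)
loss_bad = pairwise_exp_loss(lambda x: w_bad @ x, pairs)
```

A model whose scores respect every pair gets a loss below 1, and a model that reverses every pair gets a loss above 1, which is what drives the parameter update.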

A well-known rank learning method is RankNet (Non-Patent Document 3). It defines the loss function L as follows, using a logistic function and a cross-entropy loss:

This loss function takes a smaller value when the order relationship of each pair is preserved, which makes it possible to generate scores that satisfy the order relationships in the label data.
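The behaviour described here can be illustrated with the commonly cited RankNet formulation, in which the probability that v ranks above u is the logistic function of the score difference and the loss is the cross-entropy against a target probability of 1. This is a sketch under that assumption and may differ in detail from Equation (1); the function name `ranknet_loss` is hypothetical.

```python
import numpy as np

def ranknet_loss(score_v, score_u):
    """Cross-entropy of sigmoid(score_v - score_u) against target 1.

    Equals log(1 + exp(-(score_v - score_u))), averaged over pairs.
    """
    margin = np.asarray(score_v, dtype=float) - np.asarray(score_u, dtype=float)
    # logaddexp(0, -m) = log(1 + exp(-m)), computed in a numerically stable way
    return float(np.mean(np.logaddexp(0.0, -margin)))

preserved = ranknet_loss([2.0, 3.0], [1.0, 0.0])   # f(v) > f(u) for every pair
violated = ranknet_loss([1.0, 0.0], [2.0, 3.0])    # order reversed
```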

There is also a rank learning method that uses a Gaussian distribution (Non-Patent Document 4). In this method, the loss function L is defined as follows:
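A minimal sketch of a Gaussian (probit) pairwise loss follows, assuming the preference likelihood used in Gaussian-process preference learning, where the probability that v ranks above u is the standard normal CDF of (f(v) - f(u)) / (sqrt(2) * sigma). The function name and the noise scale `sigma` are illustrative assumptions, and the sketch may differ in detail from Equation (2).

```python
import math

def probit_pair_loss(score_v, score_u, sigma=1.0):
    """-log Phi((score_v - score_u) / (sqrt(2) * sigma)) for one pair."""
    z = (score_v - score_u) / (math.sqrt(2.0) * sigma)
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return -math.log(phi)

loss_preserved = probit_pair_loss(2.0, 1.0)   # labelled order respected
loss_violated = probit_pair_loss(1.0, 2.0)    # labelled order reversed
```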

These methods adjust model parameters from order relationships and yield a model that generates scores from which a ranking can be constructed. In other words, as long as a ranking consistent with the label data can be constructed, the values of the scores themselves do not matter. For example, suppose that emotions during movie viewing are rated on a scale of 1 to 5 and that, for inputs v_k and u_k, v_k is found to be preferred over u_k. By learning from this label, f(v_k) > f(u_k) can be satisfied and a ranking consistent with the label data can be constructed. However, there is no guarantee that f(v_k) and f(u_k) fall within the closed interval [1, 5], so these scores cannot be regarded as emotion rating values.

Non-Patent Document 1: R. Likert. A technique for the measurement of attitudes. Archives of Psychology, Vol. 22, No. 140, pp. 5-55, 1932.
Non-Patent Document 2: Jon D. Morris. Observations: SAM: The Self-Assessment Manikin; An Efficient Cross-Cultural Measurement of Emotional Response. (This article originally appeared in the Journal of Advertising Research, November/December 1995.)
Non-Patent Document 3: Christopher Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. pp. 89-96, Jan. 2005.
Non-Patent Document 4: Wei Chu and Zoubin Ghahramani. Preference learning with Gaussian processes. pp. 137-144, Aug. 2005.

However, existing rank learning learns order relationships from label data and outputs scores that follow those relationships; its output does not represent the objective variable. That is, when only order relationships are given as label data, it is difficult to predict a specific value of the objective variable from them using existing rank learning alone.

The present invention has been made in view of the above circumstances, and its object is to provide a model learning device, a model learning method, and a model learning program that predict the objective variable from order relationships even when only order relationships are available as label data.

One aspect of the present invention is a model learning device. The model learning device includes an input data processing unit that acquires input data, a setting parameter processing unit that acquires setting parameters, a model parameter learning unit that takes the input data and the setting parameters as inputs and performs rank learning of model parameters using an objective function composed of a loss function that preserves order relationships and a regularization term that constrains the output of the model, and a model parameter processing unit that outputs the learned model parameters. The input data consists of observable multivariate data and label data indicating order relationships. The setting parameters include a hyperparameter of the regularization term. The model parameter learning unit obtains model parameters with which the model outputs scores that predict the objective variable.

Another aspect of the present invention is a model learning method executed by a computer. The model learning method includes acquiring input data, acquiring setting parameters, performing rank learning of model parameters, with the input data and the setting parameters as inputs, using an objective function composed of a loss function that preserves order relationships and a regularization term that constrains the output of the model, and outputting the learned model parameters. The input data consists of observable multivariate data and label data indicating order relationships. The setting parameters include a hyperparameter of the regularization term. The rank learning obtains model parameters with which the model outputs scores that predict the objective variable.

A model learning program according to one aspect of the present invention causes a computer to execute the functions of each component of the above model learning device.

According to the present invention, a model learning device, a model learning method, and a model learning program are provided that predict the objective variable from order relationships even when only order relationships are available as label data.

FIG. 1 is a block diagram showing an example of the functional configuration of the model learning device according to the embodiment.
FIG. 2 is a block diagram showing an example of the hardware configuration of the model learning device according to the embodiment.
FIG. 3 is a flowchart showing an example of the procedure and contents of the model parameter estimation process executed by the model learning device according to the embodiment.
FIG. 4 is a flowchart showing an example of the procedure and contents of the rank learning of model parameters executed by the model parameter learning unit of the model learning device according to the embodiment.

An embodiment of the present invention is described below with reference to the drawings.

[Configuration example]
First, the functional configuration of the model learning device 1 will be described with reference to FIG. 1. FIG. 1 is a block diagram showing an example of the functional configuration of the model learning device 1 according to the embodiment.

As shown in FIG. 1, the model learning device 1 has an input data processing unit 10, a setting parameter processing unit 20, a model parameter learning unit 30, a model parameter processing unit 40, a recording unit 50, and an input/output unit 60. The recording unit 50 has an input data recording unit 51, a setting parameter recording unit 52, and a model parameter recording unit 53.

The input/output unit 60 exchanges data with the external device 2. It receives input data and setting parameters from the external device 2, and it outputs model parameters to the external device 2.

The input data processing unit 10 acquires input data from the external device 2 via the input/output unit 60. The input data processing unit 10 outputs the acquired input data to the input data recording unit 51.

The input data recording unit 51 receives the input data from the input data processing unit 10 and records it.

The setting parameter processing unit 20 acquires setting parameters from the external device 2 via the input/output unit 60. The setting parameter processing unit 20 outputs the acquired setting parameters to the setting parameter recording unit 52.

The setting parameter recording unit 52 receives the setting parameters from the setting parameter processing unit 20 and records them.

The model parameter learning unit 30 takes as inputs the input data recorded in the input data recording unit 51 and the setting parameters recorded in the setting parameter recording unit 52, and performs rank learning of the model parameters using an objective function composed of a loss function that preserves order relationships and a regularization term that constrains the model output. The model parameter learning unit 30 outputs the learned model parameters to the model parameter recording unit 53.

The model parameter recording unit 53 receives the model parameters from the model parameter learning unit 30 and records them.

The model parameter processing unit 40 reads the model parameters from the model parameter recording unit 53 and outputs them to the external device 2 via the input/output unit 60.

The input data acquired by the input data processing unit 10 consists of observable multivariate data and label data indicating order relationships. For example, the label data is a set of pairs indicating order relationships.

The setting parameters acquired by the setting parameter processing unit 20 include hyperparameters of the regularization term of the objective function that the model parameter learning unit 30 uses for rank learning. The hyperparameters include quantiles of the objective variable. Here, a quantile is a number located at a position that divides a sorted set of numbers into equal parts, the number of parts being a positive integer. The hyperparameters also include a parameter that determines the weight of the regularization term. The setting parameters further include a learning rate parameter used to optimize the objective function.
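The setting parameters listed above could be held in a small container such as the one sketched below. The class name `SettingParameters`, its field names, and the validation rules are illustrative assumptions; the description only states which quantities the setting parameters include.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class SettingParameters:
    quantiles: Sequence[float]   # alpha_1..alpha_(m-1): m-quantiles of the objective variable
    reg_weight: float            # lambda: weight of the regularization term
    learning_rate: float         # gamma: step size used by the optimizer

    def __post_init__(self):
        # Quantiles of a distribution are non-decreasing by definition.
        if list(self.quantiles) != sorted(self.quantiles):
            raise ValueError("quantiles must be in ascending order")
        if self.reg_weight < 0 or self.learning_rate <= 0:
            raise ValueError("reg_weight must be >= 0 and learning_rate > 0")

# Example: quartiles (m = 4) on a 1-to-5 rating scale, chosen for illustration only
settings = SettingParameters(quantiles=[2.0, 3.0, 4.0], reg_weight=0.5, learning_rate=0.01)
```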

Next, the hardware configuration of the model learning device 1 will be described. The model learning device 1 is implemented as a computer, for example a personal computer or a server computer.

FIG. 2 is a block diagram showing an example of the hardware configuration of the model learning device 1 according to the embodiment. As shown in FIG. 2, the model learning device 1 has an input/output interface 110, a CPU 120, and a storage device 130.

The input/output interface 110, the CPU 120, and the storage device 130 are electrically connected to one another via a bus 140 and exchange data and instructions via the bus 140.

The input/output interface 110 is connected to the external device 2 via a signal cable or a network. The input/output interface 110 is used to receive data from the external device 2 and to output model parameters to the external device 2.

The storage device 130 stores the programs and data required for the processing executed by the CPU 120. The CPU 120 performs various kinds of processing by reading the necessary programs and data from the storage device 130 and executing them.

The storage device 130 has a main storage device 131 and an auxiliary storage device 132. The main storage device 131 and the auxiliary storage device 132 exchange programs and data with each other.

The main storage device 131 stores programs and data temporarily required for processing by the CPU 120. For example, the main storage device 131 is a volatile memory such as a RAM (Random Access Memory).

The auxiliary storage device 132 stores programs and data supplied from an external device or via a network, and provides the main storage device 131 with the programs and data temporarily required for processing by the CPU 120. For example, the auxiliary storage device 132 is a non-volatile memory such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive).

The CPU 120 is a processor, that is, hardware that processes data and instructions. The CPU 120 has a control device 121 and an arithmetic device 122.

The control device 121 controls the input/output interface 110, the arithmetic device 122, and the storage device 130.

The arithmetic device 122 reads programs and data from the main storage device 131, executes the programs to process the data, and provides the processed data to the main storage device 131.

In this hardware configuration, the input/output interface 110 constitutes the input/output unit 60. The CPU 120 and the main storage device 131 constitute the input data processing unit 10, the setting parameter processing unit 20, the model parameter learning unit 30, and the model parameter processing unit 40. The storage device 130 (for example, the auxiliary storage device 132) constitutes the recording unit 50.

[Operation example]
Next, the operation of the model learning device 1 will be described. The rank learning performed by the model learning device 1 restricts the distribution of output values by a regularization term based on quantiles. To bring the model output close to the objective variable, the distribution of the model output must be made to approximate the distribution of the objective variable. However, when only order relationships are available as label data, no specific numerical values are available, and so the distribution of the objective variable cannot be obtained either.

Therefore, instead of approximating the entire distribution, quantiles of the objective variable are given as hyperparameters, and only the points determined by those quantiles are matched. For a sorted set of numbers X = {x_1, x_2, ..., x_n | x_1 ≤ x_2 ≤ ... ≤ x_n}, the quantiles are the m-1 numbers {x_i_1, x_i_2, ..., x_i_(m-1)} located at the positions that divide the distribution into m equal parts, where m is an arbitrary positive integer; x_i_l is called the l-th m-quantile.
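As a small illustration of this definition, the sketch below computes the m-1 cut points of a sample with NumPy. The helper name `m_quantiles` is hypothetical. Note one assumption: `np.quantile` interpolates linearly between neighbouring values by default, so the returned cut points need not be members of the set, whereas the definition above picks elements of the sorted set.

```python
import numpy as np

def m_quantiles(values, m):
    """Return the m-1 cut points that split `values` into m equal parts."""
    probs = np.arange(1, m) / m          # 1/m, 2/m, ..., (m-1)/m
    return np.quantile(np.asarray(values, dtype=float), probs)

# Quartiles (m = 4) of eight sorted numbers
quartiles = m_quantiles([1, 2, 3, 4, 5, 6, 7, 8], m=4)
```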

When the m-quantiles α of the objective variable are given as hyperparameters, the model output is constrained by minimizing the error between α and the m-quantiles β of the model output values. Alternatively, the model output is constrained by minimizing an objective function that includes a term computed from α and β.
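One simple way to measure such an error is the sum of absolute differences between corresponding quantiles, sketched below. The choice of absolute difference and the name `quantile_gap` are assumptions; the description leaves the form of the error open at this point.

```python
import numpy as np

def quantile_gap(alpha, model_outputs):
    """Sum of |alpha_l - beta_l|, with beta the m-quantiles of the model outputs."""
    alpha = np.asarray(alpha, dtype=float)
    m = alpha.size + 1
    beta = np.quantile(np.asarray(model_outputs, dtype=float), np.arange(1, m) / m)
    return float(np.sum(np.abs(alpha - beta)))

alpha = [2.0, 3.0, 4.0]   # assumed quartiles of the objective variable
matched = quantile_gap(alpha, [1.0, 2.0, 3.0, 4.0, 5.0])
shifted = quantile_gap(alpha, [11.0, 12.0, 13.0, 14.0, 15.0])
```

Both output sets induce the same ranking, but only the first lies on the scale implied by α, so only the second is penalized.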

(Input data)
The input data acquired by the input data processing unit 10 consists of observable multivariate data and label data indicating order relationships. For example, the multivariate data is observable multivariate data X with n_x elements, and the label data is a set D of n_k pairs indicating order relationships. Hereinafter, the set D of pairs is also referred to as pair label data D. The multivariate data X and the pair label data D are expressed as follows:
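To make the pair label data concrete, the sketch below derives index pairs from ordinal answers by comparing every two samples and skipping ties. Comparing all pairs is an assumption made for illustration; the description does not prescribe how D is collected, and the function name `pairs_from_ratings` is hypothetical.

```python
from itertools import combinations

def pairs_from_ratings(ratings):
    """Build (v, u) index pairs meaning 'sample v is rated above sample u'.

    Ties carry no order information and are skipped.
    """
    pairs = []
    for i, j in combinations(range(len(ratings)), 2):
        if ratings[i] > ratings[j]:
            pairs.append((i, j))
        elif ratings[j] > ratings[i]:
            pairs.append((j, i))
    return pairs

ratings = [2, 4, 4, 1]          # hypothetical ordinal answers for four samples
D = pairs_from_ratings(ratings)
```

Only the order survives in D; the rating values themselves are not passed on to training.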

(Model)
As the model f used in the rank learning performed by the model parameter learning unit 30, any model that outputs a score serving as the basis for constructing a ranking can be used. The model f may be a linear model or a generalized linear model using a logistic or sigmoid function, or it may be a nonlinear model such as a Gaussian process model or a deep learning model. When X is multivariate time-series data, f may be a basic time-series model consisting of a function that obtains z_t from the latent variable z_(t-1) and x_t at a time t and a function that estimates the objective variable y_t from z_t, or an RNN architecture such as an LSTM. When the multivariate data X is a set of images, a model with a CNN structure expressed by the following formula may also be used.

(Output)
The output of the model learning device 1 is the estimate of the model parameters θ. With the estimated model parameters θ, the model outputs scores close to the objective variable.

(Objective function)
Next, the objective function used in rank learning will be described. The objective function consists of a loss function E that preserves order relationships and a regularization term Ω that constrains the model output. Accordingly, the objective function L is expressed as follows.

Here, λ is a hyperparameter that determines the weight of the regularization term, and α is a hyperparameter giving the m-quantiles of the objective variable.
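Based on this description, the objective function plausibly takes the additive form below. The argument lists and the exact placement of λ are assumptions consistent with the roles stated for E, Ω, λ, and α.

```latex
L(\theta) = E(\theta;\, X, D) + \lambda\, \Omega(\theta;\, X, \alpha)
```

With λ = 0 this reduces to ordinary rank learning, and larger λ gives more weight to keeping the model output on the scale implied by α.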

(Regularization)
Since regularization is the core of the objective function, the regularization term Ω is described first. A term Ω that brings the quantiles of two distributions close together can be defined in various ways; here, Ω_1 and Ω_2 are defined as follows.

First, Ω_1 is described. Let Φ(x) be the cumulative distribution function of the objective variable and Ψ(x) the cumulative distribution function of the model output. The L1 distance between the two cumulative distribution functions can then be defined as a regularization term that takes a smaller value the more similar the two distributions are.

In this expression, Φ(a) cannot be evaluated, because specific values of the objective variable are not available. Therefore, assuming that the m-quantiles α = {α_1, ..., α_(m-1)} of the objective variable are given, the following expression, obtained by discretizing the preceding expression with respect to a, is minimized.

Here, let y be the model outputs sorted in ascending order, and let y_αl be the subset of elements of y that are smaller than α_l. Then Ψ(α_l) can be calculated as follows.
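The discretized term can be sketched as follows, using the fact that Φ(α_l) = l/m by the definition of the l-th m-quantile and taking Ψ(α_l) as the fraction of model outputs smaller than α_l. Writing the discretization as an unweighted sum over l is an assumption, and the function names are hypothetical.

```python
import numpy as np

def empirical_cdf_at(outputs, alpha):
    """Psi(alpha_l): fraction of outputs strictly smaller than each alpha_l."""
    y = np.sort(np.asarray(outputs, dtype=float))
    # side="left" counts the elements strictly less than each alpha_l
    counts = np.searchsorted(y, np.asarray(alpha, dtype=float), side="left")
    return counts / y.size

def omega1(outputs, alpha):
    """Sum over l of |l/m - Psi(alpha_l)|."""
    alpha = np.asarray(alpha, dtype=float)
    m = alpha.size + 1
    target = np.arange(1, m) / m         # Phi(alpha_l) = l/m
    return float(np.sum(np.abs(target - empirical_cdf_at(outputs, alpha))))

alpha = [2.0, 3.0, 4.0]
even = omega1([1.5, 2.5, 3.5, 4.5], alpha)   # one output per interval
low = omega1([0.1, 0.2, 0.3, 0.4], alpha)    # every output below alpha_1
```

The counting step inside `empirical_cdf_at` is piecewise constant in the outputs, so it provides no useful gradient for training.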

However, n in Equation (7) is an operation that counts elements; because it is procedural, error backpropagation through it is generally not possible. Therefore, the regularization term Ω_2 is defined so that the elements belonging to the (l+1)-th quantile group of the model output y, which make up a fraction 1/m of all elements, fall within the range from α_l to α_(l+1). This achieves Ψ(α_l) → l/m and approximately minimizes Ω_1. Ω_2 is shown in Equation (8).

Here, b is the batch size (= n(y)), and α_0 and α_m are arbitrary values indicating the lower and upper limits of the objective variable, respectively. Let β_l be the l-th m-quantile of y. For an element y_i lying between two adjacent quantiles (β_l, β_(l+1)), the regularization term Ω_2 contributes 0 if y_i belongs to the interval [α_l, α_(l+1)], the difference from α_l if y_i is smaller than α_l, and the difference from α_(l+1) if y_i is larger than α_(l+1). This means that the loss becomes smaller the more the elements of y in the interval [β_l, β_(l+1)], defined by the quantiles of y, fall within the interval [α_l, α_(l+1)], defined by the quantiles of the objective variable. Hence Ω_2 = 0 when the elements of y are divided equally among the intervals defined by the quantiles α. That is, when b is sufficiently large and Ω_2 → 0, n(y_αl) → (l/m)b, and substituting this into Equations (5) and (7) gives Ω_1 → 0 as well. Minimizing Ω_2 therefore achieves the minimization of Ω_1.
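One expression that matches this description, written here as an assumption rather than as a transcription of Equation (8), is the following. The 1/b normalization and the convention that β_0 and β_m are the smallest and largest model outputs are both assumed.

```latex
\Omega_2 = \frac{1}{b} \sum_{l=0}^{m-1} \; \sum_{i \,:\, \beta_l \le y_i \le \beta_{l+1}}
  \Bigl[ \max(0,\ \alpha_l - y_i) + \max(0,\ y_i - \alpha_{l+1}) \Bigr]
```

Each bracket is zero when y_i lies in [α_l, α_(l+1)] and grows linearly with the distance to the nearer endpoint otherwise, so the term is differentiable almost everywhere in y_i.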

Next, the algorithm for minimizing Equation (8) is described. First, for the elements belonging to the (l+1)-th quantile group of the model output y, a vector c storing the (l+1)-th m-quantile of the objective variable and a vector d storing the l-th m-quantile of the objective variable are defined as follows.

Using c and d, the regularization term as actually computed is given by the following expression.
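A vectorized sketch of this computation is given below: the outputs are sorted, each element is assigned to a quantile group by its rank, and the vectors d and c hold the lower and upper bounds for each element. The function name `omega2`, the rank-based grouping rule, and the division by b are assumptions; NumPy is used only to show the arithmetic, and a training implementation would write the same expression in an automatic-differentiation framework.

```python
import numpy as np

def omega2(outputs, alpha_inner, alpha_min, alpha_max):
    """Hinge penalty pushing the l-th group of sorted outputs into [alpha_l, alpha_(l+1)]."""
    y = np.sort(np.asarray(outputs, dtype=float))
    b = y.size
    edges = np.concatenate(([alpha_min], alpha_inner, [alpha_max]))  # alpha_0..alpha_m
    m = edges.size - 1
    # Group index by rank: 0 for the lowest b/m elements, 1 for the next b/m, and so on.
    group = np.arange(b) * m // b
    d = edges[group]        # lower bound alpha_l for each element
    c = edges[group + 1]    # upper bound alpha_(l+1) for each element
    penalty = np.maximum(0.0, y - c) + np.maximum(0.0, d - y)
    return float(penalty.sum() / b)

alpha_inner = [2.0, 3.0, 4.0]    # assumed quartiles on a 1-to-5 scale
inside = omega2([1.2, 1.8, 2.1, 2.9, 3.3, 3.7, 4.5, 4.9], alpha_inner, 1.0, 5.0)
outside = omega2([10, 20, 30, 40, 50, 60, 70, 80], alpha_inner, 1.0, 5.0)
```

Here `inside` is 0 because every group already sits in its target interval, while `outside` is large even though those scores would produce a perfectly valid ranking.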

In this regularization term, the hyperparameter α can be set to any value and can be varied according to the properties of the batch input during training. That is, when a training data set is divided into mini-batches X_1, X_2, ..., X_n with different properties, the hyperparameter can be changed at the mini-batch level, for example α for mini-batch X_1 and a different value β for mini-batch X_2. For example, in the task of predicting an individual's weight from dietary habits, the hyperparameter can be set separately for input batches of men and input batches of women. Likewise, in the task of predicting human emotions while watching sports, the hyperparameter can be set separately for periods of high excitement and for other periods.
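Selecting quantiles per mini-batch can be as simple as a lookup keyed by a batch tag, as sketched below. The group names, the numeric values, and the helper `quantiles_for_batch` are purely illustrative assumptions.

```python
# Hypothetical quantile settings per mini-batch group (values are illustrative only).
quantiles_by_group = {
    "group_a": [55.0, 65.0, 75.0],
    "group_b": [45.0, 52.0, 60.0],
}

def quantiles_for_batch(batch_group):
    """Return the quantile hyperparameter to use for a mini-batch, by its group tag."""
    return quantiles_by_group[batch_group]

# Each mini-batch carries a tag describing its property, followed by its data.
batches = [("group_a", [0.3, 0.9]), ("group_b", [0.1, 0.7])]
selected = [quantiles_for_batch(group) for group, _ in batches]
```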

Any function whose output becomes small when the order relationship of the paired data is preserved can be used as the loss function E. For example, equation (1) may be used, as in RankNet (Non-Patent Document 3). Alternatively, equation (2) may be used, as in the rank learning method based on a Gaussian distribution (Non-Patent Document 4).
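As an illustration of a loss that becomes small when pair order is preserved, the sketch below uses the logistic pairwise loss commonly associated with RankNet. Whether it matches equation (1) term by term is an assumption, and the function name `pairwise_logistic_loss` and the pair encoding are hypothetical.

```python
import numpy as np


def pairwise_logistic_loss(scores, pairs):
    """Mean logistic loss over ordered pairs.

    scores : model outputs, shape (n,)
    pairs  : sequence of (i, j), meaning item i should score higher than item j
    """
    s = np.asarray(scores, dtype=float)
    i, j = np.asarray(pairs).T
    diff = s[i] - s[j]
    # log(1 + exp(-diff)), computed in a numerically stable way.
    return np.mean(np.logaddexp(0.0, -diff))
```

The loss shrinks toward zero as the higher-ranked item's score exceeds the other by a growing margin, and grows when the order is violated.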

(Optimization Method)
Any optimization method, such as the gradient method, the stochastic gradient method, or Adam, can be applied to optimize the objective function. When the gradient method is used, the parameters are updated repeatedly according to the following formula at the k-th optimization step.

Here, γ_k is the learning rate parameter. The gradient of the objective function, ∇_θ L(θ), may be obtained from an analytically derived function or may be computed numerically.
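The numerical option can be sketched as follows. The update formula is not reproduced in this text, so the sketch assumes the standard gradient descent update θ ← θ − γ_k ∇_θ L(θ) and approximates the gradient by central differences. The function names `numerical_gradient` and `gradient_step` and the step size `eps` are hypothetical.

```python
import numpy as np


def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step.flat[k] = eps
        grad.flat[k] = (f(theta + step) - f(theta - step)) / (2.0 * eps)
    return grad


def gradient_step(f, theta, learning_rate):
    """One assumed update: theta - learning_rate * grad f(theta)."""
    theta = np.asarray(theta, dtype=float)
    return theta - learning_rate * numerical_gradient(f, theta)
```

Central differences need two evaluations of the objective per parameter, so an analytically derived gradient is usually preferable for models with many parameters.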

(Model Parameter Estimation)
Next, the procedure and content of the model parameter estimation process executed by the model learning device 1 will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the procedure and content of the model parameter estimation process executed by the model learning device 1.

In step S1, the input data processing unit 10 acquires input data. The input data is the multivariate data X and the pair label data D described above. The input data processing unit 10 also stores the acquired input data in the input data recording unit 51.

In step S2, the setting parameter processing unit 20 acquires setting parameters. The setting parameters include hyperparameters of the regularization term of the objective function. The hyperparameters include the m-quantiles α of the objective variable and a parameter λ that determines the weight of the regularization term. The setting parameters also include the learning rate parameter γ_k used to optimize the objective function. The setting parameter processing unit 20 also stores the acquired setting parameters in the setting parameter recording unit 52.

In step S3, the model parameter learning unit 30 takes as inputs the input data recorded in the input data recording unit 51 and the setting parameters recorded in the setting parameter recording unit 52, and rank-learns the model parameters using an objective function having a loss function and a regularization term. The model parameter learning unit 30 also stores the learned model parameters in the model parameter recording unit 53.

In step S4, the model parameter processing unit 40 reads the model parameters from the model parameter recording unit 53 and outputs them to the external device 2.

(Rank Learning)
Next, the procedure and content of the rank learning of the model parameters executed by the model parameter learning unit 30 will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the procedure and content of the rank learning of the model parameters executed by the model parameter learning unit 30.

In step S11, the model parameter θ is initialized.

In step S12, the maximum number of iterations of rank learning is set. The iteration count of rank learning is also initialized, that is, set to 0.

In step S13, the model parameter θ is updated according to equations (3) and (10).

In step S14, the iteration count is updated, that is, incremented by 1.

In step S15, it is determined whether the iteration count has exceeded the maximum number of iterations. If it has not, the process returns to step S13. If it has, the learning of the model parameters ends, and the process returns to step S4 in FIG. 3.
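Steps S11 to S15 can be sketched as a simple loop. This is a minimal sketch under assumptions: the function name `rank_learn` is hypothetical, the gradient of the objective function is supplied by the caller as `grad_fn`, the learning rate is held constant rather than varying with k, and a plain gradient step stands in for the update of equations (3) and (10).

```python
import numpy as np


def rank_learn(grad_fn, theta_init, learning_rate, max_iterations):
    """Iterate parameter updates following the flow of steps S11 to S15."""
    theta = np.array(theta_init, dtype=float)  # S11: initialize parameters
    iteration = 0                              # S12: reset iteration count
    while True:
        # S13: update parameters (assumed plain gradient step).
        theta = theta - learning_rate * grad_fn(theta)
        iteration += 1                         # S14: increment the count
        if iteration > max_iterations:         # S15: stop once exceeded
            break
    return theta, iteration
```

Because step S15 tests whether the count has exceeded the maximum, a literal reading of the flow performs the update `max_iterations + 1` times; an implementation that wants exactly `max_iterations` updates would test with `>=` instead.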

[Effects]
According to the embodiment, even when only order relationships are available as label data, the objective variable can be predicted from those order relationships. Consequently, any rank learning model can be trained not only as a ranking generation model but also as a regression model that outputs numerical values approximating the distribution of the objective variable. This makes it possible to obtain estimates as numerical values, and not only as order relationships, in cases where questionnaire responses cannot be obtained as concrete numbers due to psychological resistance, and for objective variables such as emotions that are difficult to evaluate on an absolute scale.

The above embodiment shows an example in which the gradient method is used for optimization, but any method, such as the stochastic gradient method or Adam, can be used. Similarly, any loss function can be used for E in the objective function (equation (3)). The operation of each component of the model learning device 1 shown in FIG. 1 of the above embodiment can be implemented as a program, which can be installed and executed on a computer used as the model learning device, or distributed via a network. The present invention is not limited to the above embodiment, and various modifications and applications are possible.

For example, the above embodiment applies the present invention to a pairwise rank learning method, but the present invention is not limited to this and may be applied to other rank learning methods, such as a pointwise rank learning method or a listwise rank learning method.

Note that the present invention is not limited to the above embodiment and can be modified in various ways at the implementation stage without departing from its gist. The embodiments may also be combined as appropriate, in which case the combined effects are obtained. Furthermore, the above embodiment includes various inventions, and various inventions can be extracted by selecting combinations of the disclosed constituent elements. For example, if the problem can be solved and the effects can be obtained even when some constituent elements are removed from all of those shown in the embodiment, the configuration without those constituent elements can be extracted as an invention.

REFERENCE SIGNS LIST
1: model learning device
2: external device
10: input data processing unit
20: setting parameter processing unit
30: model parameter learning unit
40: model parameter processing unit
50: recording unit
51: data recording unit
52: setting parameter recording unit
53: model parameter recording unit
60: input/output unit
110: input/output interface
120: CPU
121: control device
122: arithmetic unit
130: storage device
131: main storage device
132: auxiliary storage device
140: bus

Claims

1. A model learning device comprising:
an input data processing unit that acquires input data;
a setting parameter processing unit that acquires setting parameters;
a model parameter learning unit that takes the input data and the setting parameters as inputs and rank-learns model parameters using an objective function composed of a loss function that maintains an order relationship and a regularization term that limits an output of a model; and
a model parameter processing unit that outputs the learned model parameters, wherein
the input data is observable multivariate data and label data indicating an order relationship,
the setting parameters include hyperparameters of the regularization term, and
the model parameter learning unit obtains model parameters that output a score for predicting an objective variable.

2. The model learning device according to claim 1, wherein
the hyperparameters include quantiles of the objective variable, a quantile being a value located at a position that divides a sorted set of numbers into equal parts by a positive integer, and
the model parameter learning unit brings the quantiles of the output values of the model close to the quantiles of the objective variable.

3. The model learning device according to claim 2, wherein
the model parameter learning unit minimizes an objective function including a term calculated using the quantiles of the objective variable and the quantiles of the output values of the model.

4. The model learning device according to claim 2 or 3, wherein
the hyperparameters further include a parameter that determines a weight of the regularization term.

5. The model learning device according to claim 1, wherein
the model parameter learning unit optimizes the objective function using a gradient method, a stochastic gradient method, or Adam.

6. The model learning device according to claim 5, wherein
the setting parameters further include a learning rate parameter used in optimizing the objective function.

7. A model learning method executed by a computer, comprising:
acquiring input data;
acquiring setting parameters;
rank-learning model parameters, with the input data and the setting parameters as inputs, using an objective function including a loss function that maintains an order relationship and a regularization term that limits an output of a model; and
outputting the learned model parameters, wherein
the input data is observable multivariate data and label data indicating an order relationship,
the setting parameters include hyperparameters of the regularization term, and
the rank learning obtains model parameters that output a score for predicting an objective variable.

8. A model learning program that causes a computer to execute the functions of each component of the model learning device according to any one of claims 1 to 6.