JP7630356B2

JP7630356B2 - Trained model generation system

Info

Publication number: JP7630356B2
Application number: JP2021092119A
Authority: JP
Inventors: 啓太横山; 茂樹田中; 紀久伊藤; 里恵執行
Original assignee: NTT Docomo Inc; Daiichikosho Co Ltd
Current assignee: NTT Docomo Inc; Daiichikosho Co Ltd
Priority date: 2021-06-01
Filing date: 2021-06-01
Publication date: 2025-02-17
Anticipated expiration: 2041-06-01
Also published as: JP2022184337A

Description

The present invention relates to a trained model generation system that generates a trained model.

Conventionally, it has been proposed to recommend songs to a user based on the user's karaoke singing history (see, for example, Patent Document 1).

Patent Document 1: JP 2012-78387 A

When singing karaoke, the key of a song is important. It is therefore conceivable to recommend a key for the user to sing in, in the same way as the song recommendations described above. Such key recommendations could be made using a trained model generated by machine learning. For example, the trained model could output, for each of 15 key levels from 0 to 14, a value indicating the degree to which that key is recommended.

Keys are ordered by magnitude, so the values output by the trained model should follow the order of the keys. For example, if the values output by the trained model indicate that key 7 has the highest degree of recommendation, it is not reasonable for the key with the next highest degree to be key 5, skipping key 6. However, if the trained model is simply a nonlinear model obtained by deep learning or the like, its outputs are not necessarily ordered in this way. In other words, the output of the trained model may not be reasonable.

The present invention has been made in view of the above, and aims to provide a trained model generation system capable of generating a trained model that outputs more reasonable values.

To achieve the above object, the trained model generation system according to the present invention is a system that generates a trained model which receives input data and outputs estimated values for a plurality of candidates ordered by magnitude. The system includes: a training data acquisition unit that acquires training input data corresponding to the input data and training output data corresponding to the estimated values, both used to generate the trained model; a teacher value generation unit that generates teacher values from the training output data acquired by the training data acquisition unit and a preset reference candidate; and a trained model generation unit that generates the trained model by performing machine learning using, as teacher data, the training input data acquired by the training data acquisition unit and the teacher values generated by the teacher value generation unit. In the trained model generated by the trained model generation unit, the candidates are divided into groups according to the reference candidate and the order of the candidates, and the estimated value for each candidate is generated and output from a bias term set for each candidate in each group and a per-group value calculated by the trained model.

In the trained model generation system according to the present invention, the above teacher values are generated and the above trained model is generated using them. A trained model generated in this manner can output estimated values that properly reflect the order of the candidates. In other words, the trained model generation system according to the present invention can generate a trained model that outputs more reasonable values.

According to the present invention, a trained model that outputs more reasonable values can be generated.

FIG. 1 is a diagram showing the functional configuration of a trained model generation system according to an embodiment of the present invention.
FIG. 2 is a table showing an example of output from a trained model.
FIG. 3 is a table showing an example of teacher labels used in machine learning.
FIG. 4 is a flowchart showing processing executed in the trained model generation system according to the embodiment of the present invention.
FIG. 5 is a diagram showing the hardware configuration of the trained model generation system according to the embodiment of the present invention.

An embodiment of the trained model generation system according to the present invention is described in detail below with reference to the drawings. In the description of the drawings, identical elements are given identical reference numerals, and duplicate descriptions are omitted.

FIG. 1 shows the functional configuration of the trained model generation system 10 according to this embodiment. The trained model generation system 10 is a system (device) that generates (infers) a trained model by machine learning. The trained model receives input data and outputs estimated values for a plurality of candidates ordered by magnitude. For example, the trained model is used to recommend a key (key change) for a user singing a song at karaoke. In this case, the candidates to which the estimated values output from the trained model relate are keys.

The keys that can be recommended to the user are ordered by magnitude. Karaoke keys have, for example, 15 levels from 0 to 14. Of these, key 7 is normally the standard key and the default key. Keys 0 to 6 lower the pitch relative to key 7. Key 6 is one step (one tone) below key 7, and lowering further gives keys 5, 4, 3, 2, 1, and 0 in turn. Keys 8 to 14 raise the pitch relative to key 7. Key 8 is one step (one tone) above key 7, and raising further gives keys 9, 10, 11, 12, 13, and 14 in turn.

A key is recommended, for example, by indicating how far to raise or lower the key from the default key, key 7. Normally, a single recommendation involves either raising the key or lowering it, but not both, because recommending both raising and lowering relative to the standard key is not reasonable.

For example, as shown in FIG. 2, the trained model outputs an estimated value (estimated probability) for each key. This estimated value indicates whether the corresponding key can be recommended to the user. For example, a key whose estimated value exceeds a preset threshold (e.g., 0.5) is treated as recommendable to the user. Note that a larger estimated value does not necessarily indicate a key better suited to the user; this is discussed later. In this way, the trained model outputs estimated values for a plurality of candidates ordered by magnitude. As also described later, the trained model outputs an estimated propensity score in addition to the per-key estimated values.

Any data that can be input to a conventional trained model may be used as the input data when the trained model outputs estimated values. For example, when the trained model is used for key recommendation as described above, the input data can include information about the user singing the song and information indicating the song for which a key is to be recommended, that is, the song the user is about to sing. As the information about the user, data based on the user's voice recorded while the user sang a song at karaoke (e.g., data indicating the pitch of the voice) can be used. Alternatively, information such as the user's age and gender may be used. The trained model may also convert the input data into features.

The estimated values output by the trained model should be consistent with the order of the keys. For example, it is not reasonable for the estimated values of keys 5 and 7 to indicate that they can be recommended to the user while the estimated value of key 6, which lies between them, indicates that it cannot. However, if the trained model is simply a nonlinear model obtained by deep learning or the like, such unreasonable estimated values may be output. The trained model generated by the trained model generation system 10 according to this embodiment prevents such unreasonable outputs or reduces their likelihood.

The trained model generated by the trained model generation system 10 need not output estimated values for keys to be recommended to a user. It only needs to output estimated values for a plurality of candidates ordered by magnitude in the same way as in this embodiment. Likewise, the input data to the trained model need not be the examples given above, and only needs to suit the candidates to which the output of the trained model relates.

The trained model generation system 10 is implemented by, for example, a computer such as a PC (personal computer) or a server device. It may also be implemented by a plurality of computers, that is, a computer system.

Next, the functions of the trained model generation system 10 according to this embodiment are described. As shown in FIG. 1, the trained model generation system 10 includes a training data acquisition unit 11, a teacher value generation unit 12, and a trained model generation unit 13.

The training data acquisition unit 11 is a functional unit that acquires training input data corresponding to the input data and training output data corresponding to the estimated values, both used to generate the trained model. The training output data may be data indicating one of the plurality of candidates. The input data may include data about a song to be sung, and the candidates may be the keys in which that song is sung. Specifically, the training data acquisition unit 11 acquires the training input data and training output data as follows.

Data from actual singing is used, for example, as the training input data and training output data. For key recommendation as in this embodiment, information indicating the key in which a song was sung (that is, data indicating one of the plurality of candidates) is used as the training output data. Information equivalent to the input data at that time, for example the above-described information about the user who sang the song and the song that was sung (information corresponding to the song for which a key is recommended), is used as the training input data. The users to whom the training input data and training output data relate (that is, the users who sang the songs) may differ from the user who receives the recommendation. Each item of training input data is paired with the item of training output data for the same singing of a song. In the trained model generation system 10, paired training input data and training output data can be identified, for example by assigning identifiers.

The training data acquisition unit 11 acquires pairs of training input data and training output data from, for example, a conventional karaoke system. It acquires enough pairs for the machine learning that generates the trained model. The training data acquisition unit 11 outputs the acquired training input data to the trained model generation unit 13 and the acquired training output data to the teacher value generation unit 12.

The training input data and training output data need not be those described above, and may be any data suited to the trained model to be generated. They also need not be acquired by the method described above, and may be acquired by any method.

The teacher value generation unit 12 is a functional unit that generates teacher values from the training output data acquired by the training data acquisition unit 11 and a preset reference candidate. For the candidates from the one indicated by the training output data through the reference candidate in the ordering, the teacher value generation unit 12 may use a first value as the teacher value, and for the remaining candidates it may use a second value different from the first value.

The teacher values generated by the teacher value generation unit 12 are used in the machine learning performed by the trained model generation unit 13. They correspond to the estimated values that the trained model outputs. In this embodiment, the keys corresponding to the estimated values output from the trained model are divided into groups according to the reference key and the order of the keys. The reference key is key 7, the standard key. There are two groups: a first group containing keys 0 to 6, which are lower than key 7, and a second group containing keys 8 to 14, which are higher than key 7. Key 7 belongs to both groups. For convenience, key 7 as a member of the second group is referred to as key 7'.
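The grouping described above can be sketched in a few lines of Python. This is only an illustration: the names `split_into_groups`, `REFERENCE_KEY`, and `KEYS` are hypothetical and do not appear in the patent; the key range 0 to 14 and the reference key 7 are taken from the text.

```python
REFERENCE_KEY = 7   # standard (default) key, as stated in the text
KEYS = range(15)    # keys 0..14


def split_into_groups(keys=KEYS, reference_key=REFERENCE_KEY):
    """Split the ordered keys into two groups that both contain the reference key."""
    first_group = [k for k in keys if k <= reference_key]   # keys 0..7
    second_group = [k for k in keys if k >= reference_key]  # keys 7 (i.e. 7')..14
    return first_group, second_group


first_group, second_group = split_into_groups()
```

Because the reference key is a member of both groups, each group ends up with eight keys.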

The teacher value generation unit 12 receives the training output data from the training data acquisition unit 11 and generates teacher values for each item of training output data as follows. It first determines whether the key indicated by the training output data belongs to the first group or the second group.

If the key indicated by the training output data belongs to the first group, the teacher value generation unit 12 sets the teacher values of the keys from that key through key 7 to 1 (the first value). It also sets the teacher value of key 7' in the second group to 1 (the first value), and sets the teacher values of all other keys to 0 (the second value). If the key indicated by the training output data belongs to the second group, the teacher value generation unit 12 sets the teacher values of the keys from key 7' through that key to 1 (the first value). It also sets the teacher value of key 7 in the first group to 1 (the first value), and sets the teacher values of all other keys to 0 (the second value).

If the indicated key is key 7 (that is, if it belongs to both the first group and the second group), the teacher value generation unit 12 sets the teacher values of key 7 in the first group and key 7' in the second group to 1 (the first value), and sets the teacher values of all other keys to 0 (the second value). In this way, the teacher value generation unit 12 uses the first value (1 in the above example) as the teacher value for the candidates from the one indicated by the training output data through the reference candidate in the ordering, and uses a second value different from the first value (0 in the above example) for the remaining candidates.

The teacher value generation unit 12 also generates a teacher value for the propensity score, which indicates the group containing the key indicated by the training output data. If that key belongs to the first group, the teacher value of the propensity score is set to 0. If it belongs to the second group, the teacher value of the propensity score is set to 1. If it is key 7 (that is, if it belongs to both groups), the teacher value of the propensity score is set to whichever of 0 or 1 has been chosen in advance.

The per-key teacher values and the propensity score teacher value generated by the teacher value generation unit 12 together form the teacher label used in machine learning. FIG. 3 shows an example of a teacher label. In this example, the key indicated by the training output data is key 6. The teacher values of keys 6 through 7 in the first group are therefore set to 1, the teacher value of key 7' in the second group is set to 1, and the teacher values of all other keys are set to 0. The teacher value of the propensity score is set to 0. The teacher value generation unit 12 outputs the generated teacher labels to the trained model generation unit 13.
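The label construction described above can be illustrated with a short Python sketch. The function name `make_teacher_label`, the list layout (index i means "i steps away from key 7"), and the parameter `trend_for_reference` are assumptions made for this illustration; the rules for which teacher values are 1 or 0 follow the text.

```python
REFERENCE_KEY = 7  # default key in the text
NUM_KEYS = 15      # keys 0..14


def make_teacher_label(sung_key, trend_for_reference=0):
    """Build the teacher label for one training example.

    Returns (lower, upper, trend) where
      lower[i] is the teacher value for key 7 - i (keys 7, 6, ..., 0),
      upper[i] is the teacher value for key 7 + i (keys 7', 8, ..., 14),
      trend is the propensity-score teacher value (0: first group, 1: second).
    trend_for_reference is the preset value used when the sung key is key 7.
    """
    if not 0 <= sung_key < NUM_KEYS:
        raise ValueError("sung_key must be in 0..14")
    group_size = REFERENCE_KEY + 1           # eight keys per group
    lower = [0] * group_size
    upper = [0] * group_size
    lower[0] = 1                             # key 7 is always labelled 1
    upper[0] = 1                             # key 7' is always labelled 1
    distance = abs(sung_key - REFERENCE_KEY)
    if sung_key < REFERENCE_KEY:             # sung key is in the first group
        for i in range(distance + 1):
            lower[i] = 1
        trend = 0
    elif sung_key > REFERENCE_KEY:           # sung key is in the second group
        for i in range(distance + 1):
            upper[i] = 1
        trend = 1
    else:                                    # key 7 belongs to both groups
        trend = trend_for_reference
    return lower, upper, trend


lower, upper, trend = make_teacher_label(6)
```

For a sung key of 6, this marks keys 7 and 6 in the first group and key 7' in the second group with 1, leaves every other key at 0, and sets the propensity score teacher value to 0, which matches the worked example in the text.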

The trained model outputs estimated values corresponding to the above teacher label, that is, an estimated value for each key and an estimated propensity score. The estimated propensity score indicates whether the key to be recommended to the user is in the first group (a key lower than the default key) or in the second group (a key higher than the default key). In other words, the propensity score indicates which group the key estimates tend toward.

When the teacher value of the propensity score is set as described above, the estimated propensity score is interpreted as follows. If it is equal to or less than a preset threshold (e.g., 0.5), the key to be recommended to the user is taken to be a key in the first group. In this case, only the estimated values of the first-group keys may be referred to, and only first-group keys may be recommended. If the estimated propensity score exceeds the preset threshold (e.g., 0.5), the key to be recommended to the user is taken to be a key in the second group. In this case, only the estimated values of the second-group keys may be referred to, and only second-group keys may be recommended.

Keys in the group selected by the estimated propensity score are then recommended to the user on the basis of the per-key estimated values, as follows. Keys in the group that was not selected are not recommended to the user. As noted above, this is because recommending both raising and lowering relative to the standard key is not reasonable.
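A minimal sketch of this group selection follows. The function names, the string labels "lower" and "upper", and the list layout (index i means "i steps away from key 7") are hypothetical choices for illustration; the threshold of 0.5 and the rule "at or below the threshold selects the first group" come from the text.

```python
def select_group(trend_estimate, threshold=0.5):
    """Pick the group indicated by the estimated propensity score."""
    return "upper" if trend_estimate > threshold else "lower"


def candidate_estimates(lower_estimates, upper_estimates, trend_estimate,
                        threshold=0.5, reference_key=7):
    """Return (key, estimate) pairs for the selected group only.

    lower_estimates[i] belongs to key 7 - i, upper_estimates[i] to key 7 + i.
    Estimates of the group that was not selected are ignored.
    """
    if select_group(trend_estimate, threshold) == "lower":
        return [(reference_key - i, p) for i, p in enumerate(lower_estimates)]
    return [(reference_key + i, p) for i, p in enumerate(upper_estimates)]


pairs = candidate_estimates([0.9, 0.8], [0.7, 0.1], trend_estimate=0.2)
```

Here the propensity score of 0.2 selects the first group, so only keys 7 and 6 remain as candidates.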

As described above, the per-key estimated value output by the trained model indicates whether the corresponding key can be recommended to the user. When the key teacher values are set as described above, a larger estimated value indicates a greater degree to which the corresponding key can be recommended. The estimated values output by the trained model in this embodiment follow the order of the keys. Specifically, in the first group the estimated value of key 7 is the highest, and the values become smaller the farther a key is from key 7. That is, the estimated values decrease in the order of keys 7, 6, 5, ..., 0. In the second group the estimated value of key 7' is the highest, and the values become smaller the farther a key is from key 7'. That is, the estimated values decrease in the order of keys 7', 8, 9, ..., 14.

As described above, a key whose estimated value exceeds a preset threshold (e.g., 0.5) is treated as recommendable to the user, so more than one key may be recommendable. Because the estimated values follow the order of the keys, the recommendable keys form a consecutive range. In this embodiment, this consecutive range of keys may be recommended to the user. For example, keys 5 to 7 may be recommended as the keys suited to the user, and the user can then refer to the recommended range when choosing how far to change the key. Alternatively, the key at the boundary of the range on the side away from the default key (key 7) may be recommended. For example, key 5 (two steps below the default key) may be recommended as the boundary of the keys suited to the user.
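The range and boundary recommendation can be sketched as follows. The function name `recommend_range`, the `direction` argument, and the returned tuple layout are assumptions for illustration; the threshold of 0.5 and the reference key 7 are from the text. The sketch walks outward from the reference key and stops at the first estimate that does not exceed the threshold.

```python
def recommend_range(group_estimates, direction, threshold=0.5, reference_key=7):
    """Find the consecutive range of recommendable keys in one group.

    group_estimates[i] is the estimate for the key i steps from the reference.
    direction is -1 for the first (lower) group and +1 for the second (upper).
    Returns (lowest_key, highest_key, boundary_key), or None if not even the
    reference key exceeds the threshold.
    """
    steps = None
    for i, estimate in enumerate(group_estimates):
        if estimate <= threshold:
            break                      # range ends at the first failing key
        steps = i
    if steps is None:
        return None
    boundary = reference_key + direction * steps
    return min(reference_key, boundary), max(reference_key, boundary), boundary


result = recommend_range([0.95, 0.8, 0.6, 0.3, 0.1, 0.05, 0.02, 0.01], direction=-1)
```

With these illustrative estimates the recommendable range is keys 5 to 7 and the boundary key is 5, mirroring the example in the text.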

The default key (key 7 or 7') has the highest estimated value, but this is a consequence of how the key teacher values are set. The default key is therefore not necessarily the key best suited to the user. The key best suited to the user is considered to lie within the above range, and when the range contains keys other than key 7, one of those keys is likely to be the best suited. In other words, in that case a key raised or lowered from the default key is likely to be the best suited to the user.

In this way, the trained model in this embodiment outputs estimated values that are ordered in both directions from the reference key. In other words, the distribution of the estimated values is unimodal. As described above, the trained model also smooths the estimated values so that this ordering is preserved.

The trained model generation unit 13 is a functional unit that generates the trained model by performing machine learning using, as teacher data, the training input data acquired by the training data acquisition unit 11 and the teacher values generated by the teacher value generation unit 12. In the trained model generated by the trained model generation unit 13, the candidates are divided into groups according to the reference candidate and the order of the candidates, and the estimated value for each candidate is generated and output from a bias term set for each candidate in each group and a per-group value calculated by the trained model.

The trained model generation unit 13 may perform machine learning using a loss function in which the groups are weighted according to the training output data acquired by the training data acquisition unit 11. Specifically, it may use a loss function in which the group containing the candidate indicated by the training output data is given a heavier weight than the other group. The trained model generated by the trained model generation unit 13 may output a propensity score indicating which group the estimated values tend toward, and may generate the estimated value for each candidate based on that propensity score.
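One way such a group-weighted loss might look is sketched below. The passage does not specify the loss, so everything here is an assumption made for illustration: binary cross-entropy as the per-key loss, the weight values `heavy` and `light`, and the use of the propensity score teacher value to identify the group containing the sung key.

```python
import math


def bce(p, t, eps=1e-7):
    """Binary cross-entropy for one estimate p against teacher value t."""
    p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
    return -(t * math.log(p) + (1 - t) * math.log(1.0 - p))


def mean_bce(estimates, teachers):
    return sum(bce(p, t) for p, t in zip(estimates, teachers)) / len(estimates)


def group_weighted_loss(lower_est, upper_est, lower_teacher, upper_teacher,
                        trend_teacher, heavy=1.0, light=0.1):
    """Weight the loss of the group that contains the sung key more heavily.

    trend_teacher is 0 when the sung key is in the first (lower) group and
    1 when it is in the second (upper) group. heavy and light are hypothetical
    weights chosen only for this sketch.
    """
    if trend_teacher == 0:
        w_lower, w_upper = heavy, light
    else:
        w_lower, w_upper = light, heavy
    return (w_lower * mean_bce(lower_est, lower_teacher)
            + w_upper * mean_bce(upper_est, upper_teacher))


loss_lower_heavy = group_weighted_loss([0.9, 0.9, 0.1], [0.5, 0.5, 0.5],
                                       [1, 1, 0], [1, 0, 0], trend_teacher=0)
loss_upper_heavy = group_weighted_loss([0.9, 0.9, 0.1], [0.5, 0.5, 0.5],
                                       [1, 1, 0], [1, 0, 0], trend_teacher=1)
```

In this toy case the first-group estimates are accurate and the second-group estimates are uninformative, so the loss is smaller when the first group carries the heavier weight.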

The trained model generated by the trained model generation unit 13 includes, for example, a neural network. If the input data includes time-series information (for example, the user's history of sung songs over time) and the time-series order is also to be taken into account, the trained model may include a recurrent neural network (RNN). However, the trained model need not include a neural network as long as it is generated by machine learning.

The trained model has a bias term set for each of the first group and the second group. A bias term is a vector whose elements are values corresponding to the keys (candidates) in the group, so its number of dimensions equals the number of keys in the group. The first group and the second group contain eight keys each, keys 0 to 7 and keys 7' to 14 respectively. The bias term of each group is therefore an eight-dimensional vector. For example, the bias term linear_1_bias for the first group is a vector such as the following, whose first to eighth elements correspond to keys 7 to 0, respectively.
linear_1_bias=[0.6408, 0.5496, 0.3330, -0.4081, -0.5623, -0.6018, -0.6167, -0.6210]

The values of the bias terms are fixed values set through the machine learning performed by the trained model generation unit 13. The bias term values have the same magnitude relationship as the estimated values of the keys described above. That is, in the first group, the value corresponding to key 7 is the largest, and the value decreases as the key goes down. In the second group, the value corresponding to key 7' is the largest, and the value decreases as the key goes up. Performing machine learning with the teacher labels described above gives the bias term values this magnitude relationship.
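The ordering property described above can be checked mechanically. The following is a minimal sketch, not part of the described system: the helper name `is_non_increasing` is hypothetical, and the bias values are the example values for the first group given in the text. Because the same scalar is later added to every element of the bias term, a non-increasing bias term stays non-increasing after the addition.

```python
def is_non_increasing(values):
    """Return True if every element is >= the element that follows it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Example bias term for the first group from the text.
# Elements 1 to 8 correspond to keys 7, 6, ..., 0.
linear_1_bias = [0.6408, 0.5496, 0.3330, -0.4081,
                 -0.5623, -0.6018, -0.6167, -0.6210]

# Adding one scalar to every element shifts the vector without reordering it.
# The scalar 0.65 is an arbitrary illustrative value.
shifted = [0.65 + b for b in linear_1_bias]

print(is_non_increasing(linear_1_bias))
print(is_non_increasing(shifted))
```

Both checks print `True` for these values, which matches the statement that the value for key 7 is the largest and decreases as the key goes down.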

In the trained model, a value logit_l for the first group, a value logit_u for the second group, and an estimated propensity score trend are calculated. The estimated propensity score trend may be used in calculating logit_l and logit_u. In other words, the estimated propensity score may be used inside the trained model to calculate the estimated value for each key.

Next, the trained model calculates logits1, the sum of the first group's value logit_l and the first group's bias term linear_1_bias. It likewise calculates logits2, the sum of the second group's value logit_u and the second group's bias term linear_2_bias. The sums logits1 and logits2 are vectors obtained by adding the group values logit_l and logit_u to each element of the bias terms linear_1_bias and linear_2_bias, respectively.

Next, the trained model applies a sigmoid function, which maps a number into the range 0 to 1, to each element of the sums logits1 and logits2 of the respective groups, thereby calculating the estimated values probas1 and probas2 for each key (= torch.sigmoid(logits1), torch.sigmoid(logits2)). Conceptually, the bias terms place a threshold for each key's estimated value along the sigmoid curve.

The above can be expressed mathematically as follows:
logit_l, logit_u, trend = f(x)
logits1 = logit_l + linear_1_bias
logits2 = logit_u + linear_2_bias
probas1 = torch.sigmoid(logits1)
probas2 = torch.sigmoid(logits2)
Here, f(x) denotes the computation (function) by which the trained model calculates logit_l, logit_u, and trend from the input data x.
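The per-group computation can be sketched in plain Python. This is an illustrative sketch only: the function names `sigmoid` and `group_estimates` are hypothetical, a hand-written sigmoid stands in for `torch.sigmoid` so that the example needs no third-party library, and f(x) is not modeled, so the group value is simply passed in. The bias vector and logit_l = 0.65 are the example values given in the text.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def group_estimates(group_logit, bias):
    """Add the scalar group value to each bias element, then apply the sigmoid.

    Returns (logits, probas), mirroring logits1/probas1 in the text.
    """
    logits = [group_logit + b for b in bias]
    probas = [sigmoid(z) for z in logits]
    return logits, probas


# Example values for the first group (elements correspond to keys 7..0).
linear_1_bias = [0.6408, 0.5496, 0.3330, -0.4081,
                 -0.5623, -0.6018, -0.6167, -0.6210]
logit_l = 0.65  # stands in for the first output of f(x)

logits1, probas1 = group_estimates(logit_l, linear_1_bias)
for key, p in zip(range(7, -1, -1), probas1):
    print(f"key {key}: {p:.4f}")
```

Because the sigmoid is monotonically increasing, the estimated values inherit the ordering of the bias term, so the estimate for key 7 is the largest and the estimates decrease toward key 0.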

Numerical examples of logit_l, linear_1_bias, and logits1 are shown below.
logit_l = 0.65
linear_1_bias = [0.6408, 0.5496, 0.3330, -0.4081, -0.5623, -0.6018, -0.6167, -0.6210]
logits1 = [1.2908, 1.1996, 0.9830, 0.2419, 0.0877, 0.0482, 0.0333, 0.0290]

For key 7, which is the default key, there are two estimated values: the estimated value for key 7 in the first group and the estimated value for key 7' in the second group. A single estimated value may therefore be calculated from the two. For example, their average may be used as the estimated value for key 7.
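Combining the two groups into one estimate per key can be sketched as follows. This is a sketch under stated assumptions: the function name `merge_group_estimates` is hypothetical, the first group's elements are ordered as keys 7 to 0 as stated in the text, and the second group's elements are assumed to be ordered as keys 7' to 14, which the text does not state explicitly. The averaging of 7 and 7' is the example rule mentioned above.

```python
def merge_group_estimates(probas1, probas2):
    """Merge two 8-element group outputs into 15 estimates indexed by key 0..14.

    probas1: estimates for keys 7, 6, ..., 0 (order stated in the text).
    probas2: estimates for keys 7', 8, ..., 14 (assumed order).
    """
    if len(probas1) != 8 or len(probas2) != 8:
        raise ValueError("each group must contain exactly 8 estimates")
    default_key = (probas1[0] + probas2[0]) / 2.0  # average of 7 and 7'
    lower = list(reversed(probas1[1:]))            # keys 0..6
    upper = list(probas2[1:])                      # keys 8..14
    return lower + [default_key] + upper


# Hypothetical estimates, chosen only to illustrate the indexing.
probas1 = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
probas2 = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01]
estimates = merge_group_estimates(probas1, probas2)
print(len(estimates), estimates[7])
```

Other combination rules (for example, taking the maximum of the two values) would fit the same interface; the text only requires that one estimate be derived from the two.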

The trained model generation unit 13 generates the trained model as follows. The trained model generation unit 13 receives the training input data from the training data acquisition unit 11 and the teacher labels from the teacher value generation unit 12. It then performs machine learning using the training input data and the teacher labels as teacher data. Specifically, it performs machine learning with the training input data as the input to the trained model being generated and the teacher labels as the output of that model, thereby generating the trained model. The machine learning itself can be performed in the same way as conventional machine learning methods.

The following loss function may be used for the machine learning:
Loss function = propensity score error + coral loss (0-7) + coral loss (7'-14)

The propensity score error is the error between the estimated propensity score output from the trained model during machine learning and the propensity score of the teacher label. Coral loss (0-7) is the error, for the first group (keys 0 to 7), between the estimated value of each key output from the trained model during machine learning and the value of the teacher label. Coral loss (7'-14) is the corresponding error for the second group (keys 7' to 14). Each error itself may be defined in the same way as in conventional machine learning methods, and the loss function may likewise be used in the same way as in conventional machine learning methods.
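The three-term loss can be sketched as follows. The text leaves the definition of each error to conventional methods, so this sketch makes two assumptions that are not from the source: each coral loss is taken to be a binary cross-entropy summed over the keys of the group, and the estimated propensity score is taken to be a value between 0 and 1 that is also scored with binary cross-entropy. All function names are hypothetical.

```python
import math


def binary_cross_entropy(estimate, target, eps=1e-12):
    """Binary cross-entropy for one estimate in [0, 1] and a 0/1 target."""
    p = min(max(estimate, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def group_error(estimates, labels):
    """Assumed form of 'coral loss': cross-entropy summed over a group's keys."""
    return sum(binary_cross_entropy(p, t) for p, t in zip(estimates, labels))


def total_loss(trend_estimate, trend_label, probas1, labels1, probas2, labels2):
    """Propensity score error + coral loss (0-7) + coral loss (7'-14)."""
    trend_error = binary_cross_entropy(trend_estimate, trend_label)
    return (trend_error
            + group_error(probas1, labels1)
            + group_error(probas2, labels2))


# Tiny illustration with one key per group; each term equals log(2).
print(total_loss(0.5, 0, [0.5], [1], [0.5], [0]))
```

A squared error would be an equally plausible choice for the propensity score term; the structure of the sum is what the text specifies.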

The loss function used in the machine learning may be one in which the groups are weighted according to the training output data. The trained model generation unit 13 may perform machine learning using a loss function that sets a heavier weight for the group that includes the key indicated by the training output data than for the other groups.

For example, of the components of the above loss function, either coral loss (0-7) or coral loss (7'-14) may be left unused, that is, given a weight of 0. If the key indicated by the training output data is included in the first group, that is, if the teacher value of the propensity score is 0, coral loss (7'-14) is not used. In this case, the error for the second group is ignored during machine learning. If the key indicated by the training output data is included in the second group, that is, if the teacher value of the propensity score is 1, coral loss (0-7) is not used. In this case, the error for the first group is ignored during machine learning. The reason is that, for the group that does not include the key indicated by the training output data, machine learning with that training output data is considered to have little significance.
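The group weighting described above amounts to masking one of the two group terms per training example. The following is a minimal sketch, assuming the three error terms have already been computed by whatever error definition is in use; the function name `weighted_loss` and its parameter names are hypothetical.

```python
def weighted_loss(trend_error, coral_loss_lower, coral_loss_upper, trend_label):
    """Combine the loss terms, zeroing the group that lacks the labeled key.

    trend_label is the teacher value of the propensity score:
    0 means the labeled key is in the first group (keys 0 to 7),
    1 means it is in the second group (keys 7' to 14).
    """
    if trend_label == 0:
        weight_lower, weight_upper = 1.0, 0.0  # ignore the second group
    elif trend_label == 1:
        weight_lower, weight_upper = 0.0, 1.0  # ignore the first group
    else:
        raise ValueError("trend_label must be 0 or 1")
    return (trend_error
            + weight_lower * coral_loss_lower
            + weight_upper * coral_loss_upper)


# Hypothetical error values, chosen only to show which term is dropped.
print(weighted_loss(0.2, 1.5, 3.0, trend_label=0))
print(weighted_loss(0.2, 1.5, 3.0, trend_label=1))
```

Weights other than 0 and 1 (for example, a small non-zero weight for the other group) would also satisfy the more general statement that the group containing the labeled key is weighted more heavily.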

The trained model generation unit 13 outputs the generated trained model. For example, it transmits or outputs the trained model to another device or module that uses the trained model for key recommendation. Alternatively, the trained model generation unit 13 may store the generated trained model in the trained model generation system 10 so that another device or module that uses the trained model can access it.

The trained model generated by the trained model generation system 10 is intended to be used as a program module that is part of artificial intelligence software. The trained model is used, for example, in a computer equipped with a CPU (Central Processing Unit) and memory, and the CPU of the computer operates according to instructions from the trained model stored in the memory. For example, according to those instructions, the CPU inputs information to the trained model, performs the calculations defined by the trained model, and outputs the result from the trained model. Specifically, according to those instructions, the CPU inputs information to the input layer of the neural network, performs calculations based on the trained weighting coefficients and other parameters of the neural network, and outputs the result from the output layer of the neural network. These are the functions of the trained model generation system 10 according to this embodiment.

Next, the processing executed by the trained model generation system 10 according to this embodiment (the operating method performed by the trained model generation system 10) will be described with reference to the flowchart in FIG. 4.

In this processing, the training data acquisition unit 11 acquires the training input data and the training output data (S01). Next, the teacher value generation unit 12 generates teacher values from the training output data and a preset reference candidate (S02). For example, as described above, teacher labels such as those shown in FIG. 3 are generated as the teacher values.

Next, the trained model generation unit 13 performs machine learning using the training input data and the teacher values as teacher data to generate a trained model (S03). The generated trained model receives input data and outputs estimated values for multiple candidates that are ordered by a magnitude relationship. In the trained model, the candidates are divided into groups according to the reference candidate and the order of each candidate, and the estimated value for each candidate is generated and output from the bias term set for each candidate in each group and from the per-group value calculated by the trained model. Next, the trained model generation unit 13 outputs the generated trained model (S04). This concludes the processing executed by the trained model generation system 10 according to this embodiment.

In this embodiment, the teacher values described above are generated, and the trained model described above is generated. A trained model generated in this way can output appropriate estimated values that respect the ordering. For example, in the case of a trained model used for key recommendation as described above, it is possible to prevent the output of inappropriate estimated values (for example, estimated values that recommend keys 5 and 7 but not key 6). By using a separate bias term for each of the two groups as in this embodiment, the estimated values can be smoothed appropriately when they have an ordering that runs in both directions from the reference. That is, according to this embodiment, a trained model that outputs more appropriate values can be generated. As a result, when recommendations such as the key recommendation of this embodiment are made using the generated trained model, the acceptance rate of the recommendation results can be increased.

As described above, the input data may include data related to the song to be sung, and the candidates may be the keys in which the song is sung. With this configuration, key recommendation and the like can be performed appropriately. However, the generated trained model is not limited to the above and may be any model that outputs estimated values for multiple candidates ordered by a magnitude relationship.

As described above, the training output data may be data indicating one of the multiple candidates, and the teacher value generation unit 12 may set, as the teacher values, a first value (1 in the above example) for the candidates from the one indicated by the training output data through the reference candidate (key 7 or 7' in the above example) in the ordering, and a second value different from the first value (0 in the above example) for the other candidates. With this configuration, a trained model that outputs more appropriate values can be generated appropriately and reliably. However, the teacher values do not necessarily need to be generated in this way, as long as they are generated from the training output data and a preset reference candidate.
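The labeling rule above can be sketched as follows. FIG. 3 is not reproduced here, so several details are assumptions rather than the described method: the function name `make_teacher_labels` is hypothetical, a selected key of exactly 7 is assigned to the first group, every element of the group that does not contain the selected key is set to 0, and the second group's elements are ordered as keys 7' to 14. The first group's element order (keys 7 to 0) and the 0/1 teacher value of the propensity score follow the text.

```python
REFERENCE_KEY = 7  # default key used as the reference candidate


def make_teacher_labels(selected_key):
    """Return (trend_label, labels_first_group, labels_second_group).

    labels_first_group is ordered as keys 7, 6, ..., 0.
    labels_second_group is ordered as keys 7', 8, ..., 14 (assumed order).
    """
    if not 0 <= selected_key <= 14:
        raise ValueError("selected_key must be between 0 and 14")
    first_keys = list(range(7, -1, -1))
    second_keys = list(range(7, 15))
    if selected_key <= REFERENCE_KEY:  # assumption: key 7 counts as first group
        trend_label = 0
        labels_first = [1 if k >= selected_key else 0 for k in first_keys]
        labels_second = [0] * 8        # assumption: other group is all zeros
    else:
        trend_label = 1
        labels_first = [0] * 8         # assumption: other group is all zeros
        labels_second = [1 if k <= selected_key else 0 for k in second_keys]
    return trend_label, labels_first, labels_second


print(make_teacher_labels(5))
print(make_teacher_labels(10))
```

For a selected key of 5, keys 7, 6, and 5 receive the first value and the remaining keys receive the second value, so the labels are contiguous from the reference key outward.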

As described above, the trained model generation unit 13 may perform machine learning using a loss function in which the groups are weighted according to the training output data. Furthermore, it may use a loss function that sets a heavier weight for the group that includes the candidate indicated by the training output data than for the other groups. This configuration enables more appropriate and efficient learning in accordance with the training output data. For example, as described above, excluding from the loss function the group that does not include the candidate indicated by the training output data enables efficient learning. However, this weighting is not required in the machine learning.

As described above, the trained model may output a propensity score that indicates which group the estimated values tend toward, and may generate the estimated value for each candidate based on that propensity score. When the group to be recommended matters, as in key recommendation, the propensity score can be used to first select the group to be recommended. In other words, the trained model can output useful information in addition to the estimated value for each candidate. However, the trained model does not have to calculate or output the propensity score.
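Selecting the group first and then a key within it could look like the following. This is a sketch under assumptions, not the described recommendation procedure: the function name `recommend_key`, the threshold of 0.5, and the rule of walking outward from key 7 while the estimate stays at or above the threshold are all hypothetical. The second group's elements are assumed to be ordered as keys 7' to 14.

```python
def recommend_key(trend_estimate, probas1, probas2, threshold=0.5):
    """Pick a group from the propensity score, then a key inside that group.

    probas1: estimates for keys 7, 6, ..., 0.
    probas2: estimates for keys 7', 8, ..., 14 (assumed order).
    """
    if trend_estimate >= threshold:
        keys, probas = list(range(7, 15)), probas2      # second group
    else:
        keys, probas = list(range(7, -1, -1)), probas1  # first group
    recommended = 7  # fall back to the default key
    for key, p in zip(keys, probas):
        if p >= threshold:
            recommended = key  # farthest key so far that is still recommended
        else:
            break
    return recommended


# Hypothetical model outputs, for illustration only.
lower = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
upper = [0.9, 0.7, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
print(recommend_key(0.2, lower, upper))
print(recommend_key(0.9, lower, upper))
```

With these inputs the first call selects the first group and returns key 5, and the second call selects the second group and returns key 8.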

The block diagrams used to describe the above embodiment show blocks in functional units. These functional blocks (components) are realized by any combination of at least one of hardware and software. The method of realizing each functional block is not particularly limited. That is, each functional block may be realized using one device that is physically or logically coupled, or may be realized using two or more physically or logically separate devices that are connected directly or indirectly (for example, by wire or wirelessly). The functional blocks may also be realized by combining software with the one device or the multiple devices.

Functions include, but are not limited to, judging, deciding, determining, calculating, computing, processing, deriving, investigating, searching, confirming, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, regarding, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating, mapping, and assigning. For example, a functional block (component) that performs transmission is called a transmitting unit or a transmitter. In either case, as described above, the method of realization is not particularly limited.

For example, the trained model generation system 10 in one embodiment of the present disclosure may function as a computer that performs the information processing of the present disclosure. FIG. 5 is a diagram showing an example of the hardware configuration of the trained model generation system 10 according to one embodiment of the present disclosure. The trained model generation system 10 described above may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.

In the following description, the term "device" can be read as a circuit, a device, a unit, or the like. The hardware configuration of the trained model generation system 10 may include one or more of each of the devices shown in the figure, or may omit some of the devices.

Each function of the trained model generation system 10 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, so that the processor 1001 performs calculations, controls communication by the communication device 1004, and controls at least one of reading and writing of data in the memory 1002 and the storage 1003.

The processor 1001, for example, runs an operating system to control the entire computer. The processor 1001 may be configured as a central processing unit (CPU) that includes an interface with peripheral devices, a control unit, an arithmetic unit, registers, and the like. For example, each function of the trained model generation system 10 described above may be realized by the processor 1001.

The processor 1001 also reads programs (program code), software modules, data, and the like from at least one of the storage 1003 and the communication device 1004 into the memory 1002, and executes various processes according to them. The program used is one that causes a computer to execute at least some of the operations described in the above embodiment. For example, each function of the trained model generation system 10 may be realized by a control program that is stored in the memory 1002 and runs on the processor 1001. Although the various processes above have been described as being executed by one processor 1001, they may be executed simultaneously or sequentially by two or more processors 1001. The processor 1001 may be implemented with one or more chips. The program may be transmitted from a network via a telecommunications line.

The memory 1002 is a computer-readable recording medium and may be composed of at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory). The memory 1002 may also be called a register, a cache, a main memory (primary storage device), or the like. The memory 1002 can store executable programs (program code), software modules, and the like for carrying out the information processing according to one embodiment of the present disclosure.

The storage 1003 is a computer-readable recording medium and may be composed of at least one of, for example, an optical disk such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (e.g., a compact disk, a digital versatile disk, a Blu-ray (registered trademark) disk), a smart card, a flash memory (e.g., a card, a stick, a key drive), a floppy (registered trademark) disk, and a magnetic strip. The storage 1003 may also be called an auxiliary storage device. The storage medium provided in the trained model generation system 10 may be, for example, a database, a server, or another suitable medium that includes at least one of the memory 1002 and the storage 1003.

The communication device 1004 is hardware (a transmitting and receiving device) for communicating between computers via at least one of a wired network and a wireless network, and is also called, for example, a network device, a network controller, a network card, or a communication module.

The input device 1005 is an input device (e.g., a keyboard, a mouse, a microphone, a switch, a button, or a sensor) that accepts input from the outside. The output device 1006 is an output device (e.g., a display, a speaker, or an LED lamp) that performs output to the outside. The input device 1005 and the output device 1006 may be integrated into a single unit (e.g., a touch panel).

The devices such as the processor 1001 and the memory 1002 are connected by a bus 1007 for communicating information. The bus 1007 may be configured as a single bus, or as different buses between different pairs of devices.

The trained model generation system 10 may also include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), and some or all of the functional blocks may be realized by that hardware. For example, the processor 1001 may be implemented using at least one of these pieces of hardware.

The processing procedures, sequences, flowcharts, and the like of each aspect or embodiment described in this disclosure may be reordered as long as no inconsistency arises. For example, the methods described in this disclosure present the elements of the various steps in an exemplary order and are not limited to the specific order presented.

Input and output information and the like may be stored in a specific location (e.g., a memory) or may be managed using a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.

A determination may be made based on a value represented by one bit (0 or 1), based on a Boolean value (true or false), or based on a numerical comparison (e.g., a comparison with a predetermined value).

Each aspect or embodiment described in this disclosure may be used alone, in combination, or switched during execution. In addition, notification of predetermined information (e.g., notification that "X is the case") is not limited to explicit notification and may be performed implicitly (e.g., by not performing notification of the predetermined information).

Although the present disclosure has been described in detail above, it is clear to those skilled in the art that the present disclosure is not limited to the embodiments described herein. The present disclosure can be implemented with modifications and alterations without departing from the spirit and scope of the present disclosure as defined by the claims. Therefore, the description in the present disclosure is intended for illustrative purposes and has no restrictive meaning with respect to the present disclosure.

Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.

Software, instructions, information, and the like may also be transmitted and received via a transmission medium. For example, if software is transmitted from a website, a server, or another remote source using at least one of wired technologies (such as coaxial cable, fiber optic cable, twisted pair, or Digital Subscriber Line (DSL)) and wireless technologies (such as infrared or microwave), then at least one of these wired and wireless technologies is included within the definition of a transmission medium.

The terms "system" and "network" as used in this disclosure are used interchangeably.

The information, parameters, and the like described in this disclosure may be expressed using absolute values, using values relative to a predetermined value, or using other corresponding information.

The terms "judgment" and "decision" (both "determining") as used in this disclosure may encompass a wide variety of actions. "Judgment" and "decision" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., looking up in a table, a database, or another data structure), or ascertaining as having made a "judgment" or "decision." They may also include regarding receiving (e.g., receiving information), transmitting (e.g., transmitting information), input, output, or accessing (e.g., accessing data in memory) as having made a "judgment" or "decision." They may further include regarding resolving, selecting, choosing, establishing, comparing, or the like as having made a "judgment" or "decision." In other words, "judgment" and "decision" may include regarding some action as having made a "judgment" or "decision." In addition, "judgment (decision)" may be read as "assuming," "expecting," "considering," or the like.

「接続された(connected)」、「結合された(coupled)」という用語、又はこれらのあらゆる変形は、２又はそれ以上の要素間の直接的又は間接的なあらゆる接続又は結合を意味し、互いに「接続」又は「結合」された２つの要素間に１又はそれ以上の中間要素が存在することを含むことができる。要素間の結合又は接続は、物理的なものであっても、論理的なものであっても、或いはこれらの組み合わせであってもよい。例えば、「接続」は「アクセス」で読み替えられてもよい。本開示で使用する場合、２つの要素は、１又はそれ以上の電線、ケーブル及びプリント電気接続の少なくとも一つを用いて、並びにいくつかの非限定的かつ非包括的な例として、無線周波数領域、マイクロ波領域及び光（可視及び不可視の両方）領域の波長を有する電磁エネルギーなどを用いて、互いに「接続」又は「結合」されると考えることができる。 The terms "connected" and "coupled," or any variation thereof, mean any direct or indirect connection or coupling between two or more elements, and may include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other. The coupling or connection between elements may be physical, logical, or a combination thereof. For example, "connection" may be read as "access." As used in this disclosure, two elements may be considered to be "connected" or "coupled" to each other by using at least one of one or more wires, cables, and printed electrical connections and, as some non-limiting and non-exhaustive examples, by using electromagnetic energy having wavelengths in the radio frequency, microwave, and optical (both visible and invisible) regions.

本開示において使用する「に基づいて」という記載は、別段に明記されていない限り、「のみに基づいて」を意味しない。言い換えれば、「に基づいて」という記載は、「のみに基づいて」と「に少なくとも基づいて」の両方を意味する。 As used in this disclosure, the phrase "based on" does not mean "based only on," unless expressly stated otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on."

本開示において使用する「第１の」、「第２の」などの呼称を使用した要素へのいかなる参照も、それらの要素の量又は順序を全般的に限定しない。これらの呼称は、２つ以上の要素間を区別する便利な方法として本開示において使用され得る。したがって、第１及び第２の要素への参照は、２つの要素のみが採用され得ること、又は何らかの形で第１の要素が第２の要素に先行しなければならないことを意味しない。 Any reference to elements using designations such as "first," "second," etc., used in this disclosure does not generally limit the quantity or order of those elements. These designations may be used in this disclosure as a convenient way to distinguish between two or more elements. Thus, a reference to a first and a second element does not imply that only two elements may be employed or that the first element must precede the second element in some way.

本開示において、「含む（include）」、「含んでいる（including）」及びそれらの変形が使用されている場合、これらの用語は、用語「備える（comprising）」と同様に、包括的であることが意図される。さらに、本開示において使用されている用語「又は（or）」は、排他的論理和ではないことが意図される。 When the terms "include," "including," and variations thereof are used in this disclosure, these terms are intended to be inclusive, similar to the term "comprising." Additionally, the term "or," as used in this disclosure, is not intended to be an exclusive or.

本開示において、例えば、英語でのa, an及びtheのように、翻訳により冠詞が追加された場合、本開示は、これらの冠詞の後に続く名詞が複数形であることを含んでもよい。 In this disclosure, where articles such as "a," "an," and "the" in English have been added through translation, this disclosure may include cases in which the nouns following these articles are plural.

本開示において、「ＡとＢが異なる」という用語は、「ＡとＢが互いに異なる」ことを意味してもよい。なお、当該用語は、「ＡとＢがそれぞれＣと異なる」ことを意味してもよい。「離れる」、「結合される」などの用語も、「異なる」と同様に解釈されてもよい。 In this disclosure, the phrase "A and B are different" may mean "A and B are different from each other." The phrase may also mean "A and B are each different from C." Terms such as "separated" and "coupled" may also be interpreted in the same way as "different."

１０…学習済みモデル生成システム、１１…学習用データ取得部、１２…教師値生成部、１３…学習済みモデル生成部、１００１…プロセッサ、１００２…メモリ、１００３…ストレージ、１００４…通信装置、１００５…入力装置、１００６…出力装置、１００７…バス。 10...trained model generation system, 11...learning data acquisition unit, 12...teacher value generation unit, 13...trained model generation unit, 1001...processor, 1002...memory, 1003...storage, 1004...communication device, 1005...input device, 1006...output device, 1007...bus.

Claims

1. A trained model generation system that generates a trained model which receives input data and outputs estimated values for a plurality of candidates ordered by magnitude relationship, the system comprising:
a learning data acquisition unit that acquires learning input data corresponding to the input data and learning output data corresponding to the estimated values, both used for generating the trained model;
a teacher value generation unit that generates teacher values from the learning output data acquired by the learning data acquisition unit and a preset reference candidate; and
a trained model generation unit that generates the trained model by performing machine learning using, as training data, the learning input data acquired by the learning data acquisition unit and the teacher values generated by the teacher value generation unit,
wherein the trained model generated by the trained model generation unit divides the candidates into groups according to the reference candidate and the order of each candidate, and generates and outputs the estimated value for each candidate from a bias term set for each candidate in each group and a value for each group calculated by the trained model.
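
The way the estimated values are formed in this claim can be illustrated with a minimal sketch. The claim does not state how the bias term and the group value are combined, so the sketch assumes they are summed and passed through a logistic sigmoid. It also assumes two groups split at the reference candidate, with the reference candidate placed in the lower group. The names `group_of`, `estimate`, `group_values`, and `biases`, and all numbers, are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function; strictly increasing in x."""
    return 1.0 / (1.0 + math.exp(-x))


def group_of(candidate, reference):
    # Assumption: candidates at or below the reference form the "lower"
    # group and all remaining candidates form the "upper" group.
    return "lower" if candidate <= reference else "upper"


def estimate(group_values, biases, reference):
    """Return one estimated value per candidate.

    group_values: one scalar per group, standing in for the model output.
    biases: one bias term per candidate (list index = candidate).
    """
    return [
        sigmoid(group_values[group_of(k, reference)] + bias)
        for k, bias in enumerate(biases)
    ]


# Hypothetical numbers for 8 candidates with reference candidate 4.
group_values = {"lower": 0.5, "upper": -1.0}
biases = [-3.0, -2.0, -1.0, 0.0, 1.0, 0.5, -0.5, -1.5]
scores = estimate(group_values, biases, reference=4)
```

Under these assumptions, all candidates in a group share one group value and the sigmoid is increasing, so the estimates within a group are ordered exactly as the bias terms are, whatever value the model computes for that group.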

2. The trained model generation system according to claim 1, wherein the input data includes data related to a song to be sung, and the candidates are keys in which the song is sung.

3. The trained model generation system according to claim 1 or 2, wherein the learning output data is data indicating any one of the plurality of candidates, and the teacher value generation unit sets a first value as the teacher value for each candidate, in the ordering, from the candidate indicated by the learning output data up to the reference candidate, and sets a second value different from the first value as the teacher value for the other candidates.
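
The teacher value rule in this claim can be sketched as follows. The sketch reads the claim as assigning the first value to every candidate lying between the indicated candidate and the reference candidate in the ordering. Treating both ends as inclusive, and using 1.0 and 0.0 as the first and second values, are assumptions made for illustration. The function name `make_teacher_values` is hypothetical.

```python
def make_teacher_values(num_candidates, indicated, reference,
                        first=1.0, second=0.0):
    """Build one teacher value per candidate.

    Assumed reading: candidates from the indicated candidate up to the
    reference candidate (both inclusive) receive `first`; all others
    receive `second`.
    """
    low, high = sorted((indicated, reference))
    return [
        first if low <= k <= high else second
        for k in range(num_candidates)
    ]


# 15 keys (0 to 14), reference key 7, user sang in key 10.
teacher = make_teacher_values(15, indicated=10, reference=7)
```

With these inputs, keys 7 to 10 receive the first value and every other key receives the second value.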

4. The trained model generation system according to any one of claims 1 to 3, wherein the trained model generation unit performs machine learning using a loss function that weights the groups according to the learning output data acquired by the learning data acquisition unit.

5. The trained model generation system according to claim 4, wherein the learning output data is data indicating any one of the plurality of candidates, and the trained model generation unit performs machine learning using a loss function in which a heavier weight is set for the group including the candidate indicated by the learning output data than for the other groups.
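
A group-weighted loss of the kind recited here might look like the sketch below. The claims do not name the per-candidate loss, so binary cross-entropy is an assumption, as are the weights 2.0 and 1.0 and the function name `weighted_group_loss`.

```python
import math


def weighted_group_loss(estimates, teacher, groups, indicated,
                        heavy=2.0, light=1.0):
    """Sum per-group binary cross-entropy, weighting groups unequally.

    estimates: estimated value per candidate, each strictly in (0, 1).
    teacher: teacher value per candidate (0.0 or 1.0).
    groups: list of lists of candidate indices.
    indicated: candidate indicated by the learning output data.
    """
    total = 0.0
    for members in groups:
        # The group containing the indicated candidate gets the heavier weight.
        weight = heavy if indicated in members else light
        bce = -sum(
            teacher[k] * math.log(estimates[k])
            + (1.0 - teacher[k]) * math.log(1.0 - estimates[k])
            for k in members
        )
        total += weight * bce
    return total


# Hypothetical example: 5 candidates in two groups, candidate 1 indicated.
loss = weighted_group_loss(
    estimates=[0.5] * 5,
    teacher=[0.0, 1.0, 1.0, 0.0, 0.0],
    groups=[[0, 1, 2], [3, 4]],
    indicated=1,
)
```

In this example every candidate contributes log 2 to its group's cross-entropy, so the first group (weight 2.0, three members) dominates the second (weight 1.0, two members).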

6. The trained model generation system according to any one of claims 1 to 5, wherein the trained model generated by the trained model generation unit outputs a propensity score indicating a tendency for the estimated value to be in a group, and generates an estimated value for each candidate based on the propensity score.