JP7464346B2

JP7464346B2 - Object attribute expression generation model capable of generating object attribute expressions, object attribute estimation device and method

Info

Publication number: JP7464346B2
Application number: JP2021034208A
Authority: JP
Inventors: 雄一石川
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2021-03-04
Filing date: 2021-03-04
Publication date: 2024-04-09
Anticipated expiration: 2041-03-04
Also published as: JP2022134801A

Description

The present invention relates to technology for handling distributed representations of attributes of a target, such as user attributes.

Personality is a characteristic that captures, as a whole, mental indicators such as a person's character, temperament, individuality, and values. It is known to have a considerable influence on behavior in various behavioral domains, such as purchasing and choosing products and services. For this reason, many attempts have been made in recent years to ascertain users' personalities in order to predict user behavior and to personalize the products and services offered.

Today, human personality is generally understood according to an approach called "trait theory." Trait theory holds that human personality consists of multiple traits and can be expressed by the level (score) of each trait. For example, the "Big Five," a representative personality model based on trait theory, holds that human personality consists of five traits: intellectual curiosity (O: Openness to experience), conscientiousness (C: Conscientiousness), extraversion (E: Extraversion), agreeableness (A: Agreeableness), and emotional stability (N: Neuroticism). It expresses a person's character as the set of scores for these five traits (OCEAN).

Questionnaire surveys are often used to measure such personality. For example, the "Big Five Scales," a representative questionnaire for measuring the Big Five, has subjects answer a set of 60 items in total. To measure intellectual curiosity (O), for example, subjects rate items such as "creative" and "versatile" on a seven-point scale from "not at all applicable" (1 point) to "very applicable" (7 points), and the score for intellectual curiosity (O) is the total of the points given to all of these items.
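The scoring rule described above amounts to a simple sum. The following is a minimal sketch of it; the function name `trait_score`, the sample answers, and the assumption that 12 of the 60 items belong to trait O are illustrative and do not come from the questionnaire itself.

```python
def trait_score(ratings):
    """Sum 7-point ratings (1 = not at all applicable, 7 = very applicable)."""
    for r in ratings:
        if not 1 <= r <= 7:
            raise ValueError(f"rating out of range: {r}")
    return sum(ratings)


# Hypothetical answers of one respondent to 12 items assumed to measure trait O.
openness_answers = [5, 6, 4, 7, 5, 3, 6, 5, 4, 6, 5, 4]
openness_score = trait_score(openness_answers)
print(openness_score)
```

The result is one number per trait, which is exactly the coarse granularity that the later paragraphs identify as a problem.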

On the other hand, because administering questionnaires to a very large number of users takes a great deal of time and cost, there is also extensive research into techniques that first administer a questionnaire to a small number of users and then use the responses together with users' behavioral data to estimate a user's personality.

Non-Patent Document 1 is a survey paper on such techniques. It introduces techniques that estimate a user's personality from various kinds of user behavioral data, for example the content posted to a social networking service (SNS), the number of friends connected on an SNS, the outgoing call history of a mobile terminal, movement patterns derived from terminal location information, and even the speed and intonation of speech and the loudness of the voice during calls.

Non-Patent Document 1: A. Vinciarelli and G. Mohammadi, "A Survey of Personality Computing," IEEE Transactions on Affective Computing, vol. 5, no. 3, pp. 273-291, 2014, <https://doi.org/10.1109/TAFFC.2014.2330816>
Non-Patent Document 2: T. Donkers, B. Loepp, and J. Ziegler, "Sequential user-based recurrent neural network recommendations," RecSys'17 - Proceedings of the 11th ACM Conference on Recommender Systems, pp. 152-160, 2017, <https://doi.org/10.1145/3109859.3109877>

As described above, various techniques have been developed for measuring personality. They share two problems that remain to be solved: (a) coarse measurement granularity (information granularity) and (b) low reliability.

First, the coarse measurement granularity of (a) is a pronounced problem for techniques that ascertain personality on the basis of "trait theory," including techniques that estimate a user's personality from the user's behavioral data. For example, the five Big Five traits were originally obtained by using statistical methods to consolidate a huge number of personality-descriptive words extracted from dictionaries into five groups of conceptually similar words, and were then defined on the basis of that consolidation so as to be easy for humans to interpret.

In this consolidation process, the fine or subtle differences in the original meanings of the personality-descriptive words are lost. For example, conscientiousness (C), one of the Big Five, consolidates personality descriptions such as "serious," "diligent," and "well-organized" into a single trait. Measuring conscientiousness (C) yields only a single score, and the differences in meaning among the individual descriptions before consolidation are lost from the measurement result.

Next, the low reliability of (b) is a pronounced problem especially for techniques that involve measuring personality by questionnaire. For example, subjects usually vary considerably in how they approach the questions. Specifically, it is not uncommon for subjects to answer according to how they think they "ought to be" rather than what they actually think, or to regard answering as a chore and give haphazard answers. As a result, the reliability of the personality measurement is reduced.

Even if the intended (honest and serious) response attitude were ensured, the question items and response formats used in questionnaires are unlikely to cover the entire range of the traits being measured, so the reliability of the measurement is inevitably low. For example, more than 20 kinds of Big Five questionnaire exist, and they differ considerably in the number of questions, the wording of the questions, and the response format (for example, a five-point scale versus yes/no). Partly for this reason, it is widely known that the same subject obtains different results depending on which questionnaire is used, and cases have been reported in which the correlation r between the agreeableness (A) results of two questionnaires is below 0.40.

Furthermore, even in techniques that estimate a user's personality from the user's behavioral data, the estimation model is usually trained and built using the results of questionnaire measurements, so the low reliability of (b) described above remains a serious problem to be solved.

Meanwhile, although it is not a technique for measuring personality, user representation learning (URL), which can represent user attributes without any questionnaire survey, is now being actively studied, particularly in the field of product and service recommendation.

Specifically, URL acquires distributed representations of user attributes, such as tastes in music and movies or tendencies in purchased items, from user behavioral data such as music playback, video viewing, ad clicks, and item purchases. A distributed representation is usually a vector of several hundred dimensions, that is, a sequence of many numerical values. Therefore, although what is measured is user attributes, the measurement granularity (information granularity) is sufficiently fine, and URL can be said to be capable of solving the coarse measurement granularity problem of (a) above.

URL techniques are broadly divided into Static URL techniques, of which matrix factorization is a representative example and which use non-time-series behavioral data that does not reflect the order of the behavior history, and Sequential URL techniques, which use time-series behavioral data that does reflect the order of the behavior history.

Unlike Static URL techniques, Sequential URL techniques can separately extract dynamic user attributes, which may change, and static user attributes, which do not. As an example of a Sequential URL technique, Non-Patent Document 2 discloses a technique in which time-series behavioral data on a user's movie viewing is input to a gated recurrent unit (GRU), a kind of recurrent neural network (RNN), to acquire a distributed representation of user attributes.

Specifically, the technique described in Non-Patent Document 2 uses a GRU cell with two inputs: the distributed representation of the user, output by a distributed representation extractor that takes the user's identification information as input, and the distributed representation of a movie, output by a distributed representation extractor that takes the identification information of a movie watched by the user as input. According to Non-Patent Document 2, when the GRU cell and the extractors are trained on a sufficient amount of data, the user's distributed representation output by the extractor comes to reflect user attributes that are static and do not change easily, such as the user's taste in movies.
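The data flow described above can be sketched as follows. This is an untrained, bias-free GRU step written from the standard gate equations, not the implementation of Non-Patent Document 2; concatenating the user and movie embeddings into one input, the dimensions, and all identifiers (`GRUCell`, `user_emb`, `movie_emb`) are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU update without bias terms (weights are random, untrained)."""

    def __init__(self, input_dim, hidden_dim, rng):
        d = input_dim + hidden_dim
        self.Wz = rand_matrix(hidden_dim, d, rng)  # update gate
        self.Wr = rand_matrix(hidden_dim, d, rng)  # reset gate
        self.Wh = rand_matrix(hidden_dim, d, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.Wz, x + h)]
        r = [sigmoid(a) for a in matvec(self.Wr, x + h)]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, x + gated_h)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


rng = random.Random(0)
EMB, HID = 4, 3
# Embedding tables stand in for the two distributed representation extractors.
user_emb = {"u1": [rng.uniform(-1, 1) for _ in range(EMB)]}
movie_emb = {m: [rng.uniform(-1, 1) for _ in range(EMB)] for m in ("m1", "m2", "m3")}

cell = GRUCell(input_dim=2 * EMB, hidden_dim=HID, rng=rng)
h = [0.0] * HID
for movie in ["m2", "m1", "m3"]:  # viewing history in time order
    h = cell.step(user_emb["u1"] + movie_emb[movie], h)
```

In this arrangement the hidden state `h` changes with every viewed movie, while the row of `user_emb` is the same at every step, which is why the user embedding is the natural place for static attributes to accumulate during training.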

However, the distributed representation of user attributes acquired by such a Sequential URL technique naturally depends strongly on the behavioral domain to which the behavioral data belongs, and it cannot be interpreted as the user's personality, which is a characteristic that captures mental indicators as a whole. For example, when behavioral data from the movie-viewing domain is used, only user attributes related to movie viewing, such as the user's taste in movies, can be extracted. For this reason, no attempt has so far been made to estimate a user's personality using a Sequential URL technique.

Furthermore, the Static URL techniques mentioned above are not designed in the first place to separate or extract user attributes according to their degree of dependence on a behavioral domain or according to whether they are static or dynamic. The distributed representation of a user generated by a Static URL technique therefore cannot be interpreted as that user's personality either.

Accordingly, an object of the present invention is to provide a target attribute representation generation model, a target attribute estimation device, and a target attribute estimation method capable of generating an attribute representation in which a predetermined attribute of a target, such as personality, is expressed with higher reliability.

According to the present invention, there is provided a target attribute representation generation model that causes a computer to function so as to map a representation vector of a target to another representation vector that is an element of another representation vector space,
the model causing the computer to function as a mapping unit that either applies a trained mapping operator to the input representation vector to generate the other representation vector and outputs it as information on a predetermined attribute of the target, or is a trained neural network algorithm that receives the representation vector as input and outputs the other representation vector as information on the predetermined attribute of the target,
wherein the mapping operator or the neural network algorithm is trained, with respect to a strength index of a predetermined relationship between targets that is related or may be related to the degree of similarity between the targets in the predetermined attribute, such that the more the strength index indicates a greater (or possibly greater) degree of similarity between the targets corresponding to two other representation vectors obtained by the mapping, the smaller the degree of separation between those two other representation vectors.
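One possible way to write such a training criterion is given here purely as an illustration. The pair set P, the strength index s_{ij}, the weight function w, and the separation measure d are symbols introduced for this sketch and are not defined in the source at this point.

```latex
% Illustrative objective only; P, s_{ij}, w and d are assumed symbols.
\mathcal{L}(M) \;=\; \sum_{(i,j) \in P} w(s_{ij}) \, d\bigl(M(r_i),\, M(r_j)\bigr)
```

Here r_i and r_j are the representation vectors of targets i and j before mapping, w is a non-negative weight that grows as s_{ij} indicates higher similarity in the predetermined attribute, and d is a degree of separation such as the Euclidean distance. Minimizing this sum pulls strongly related pairs together after mapping. Taken alone it is also minimized by mapping everything to one point, so a practical objective would need a counteracting term that keeps weakly related pairs apart.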

In one embodiment of the target attribute representation generation model according to the present invention, it is also preferable that the mapping operator or the neural network algorithm is trained, with respect to a strength index of the predetermined relationship that is positively correlated or may be positively correlated with the degree of similarity between targets in the predetermined attribute, such that the stronger the relationship between the targets corresponding to two other representation vectors obtained by the mapping, the smaller the degree of separation between those two other representation vectors.

In the above embodiment, it is also preferable that the mapping operator or the neural network algorithm is
trained such that, for two other representation vectors corresponding to targets whose relationship is stronger in terms of the strength index and that differ from or are distant from each other in an attribute other than the predetermined attribute, the degree of separation between those two other representation vectors becomes smaller, and/or
trained such that, for two other representation vectors corresponding to targets whose relationship is weaker in terms of the strength index and that are identical or close to each other in an attribute other than the predetermined attribute, the degree of separation between those two other representation vectors becomes larger.
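The pair selection implied by this embodiment can be sketched as follows. The user ids, the age bands used as the "attribute other than the predetermined attribute," the strength values, and the two thresholds are all hypothetical; the source does not specify how such pairs are chosen.

```python
from itertools import combinations

# Hypothetical data: another attribute per user (here an age band) ...
other_attr = {"u1": "20s", "u2": "50s", "u3": "20s", "u4": "50s"}
# ... and relationship strengths in [0, 1] for some user pairs.
strength = {("u1", "u2"): 0.9, ("u1", "u3"): 0.1, ("u2", "u4"): 0.2, ("u3", "u4"): 0.8}


def select_pairs(strength, other_attr, strong=0.7, weak=0.3):
    """Return (pull, push): pairs to bring closer and pairs to move apart."""
    pull, push = [], []
    for a, b in combinations(sorted(other_attr), 2):
        s = strength.get((a, b), strength.get((b, a)))
        if s is None:
            continue  # relationship unknown: pair not used for training
        same_other = other_attr[a] == other_attr[b]
        if s >= strong and not same_other:
            pull.append((a, b))  # strong relation despite differing other attribute
        elif s <= weak and same_other:
            push.append((a, b))  # weak relation despite shared other attribute
    return pull, push


pull_pairs, push_pairs = select_pairs(strength, other_attr)
print(pull_pairs, push_pairs)
```

A plausible reading of the design is that such pairs keep the mapped space from simply encoding the other attribute: closeness that survives a difference in age band, for instance, cannot be explained by age band alone.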

Furthermore, regarding the strength index of the predetermined relationship according to the present invention, when the target is a human being and the predetermined attribute is personality, it is also preferable that the strength index is
(a) an index under which the relationship between targets related by blood is stronger, or is stronger the closer their blood relationship;
(b) an index under which the relationship between targets who are friends or acquaintances is stronger, or is stronger the deeper their friendship or acquaintance;
(c) an index under which the relationship between targets on close terms is stronger, or is stronger the closer they are;
(d) an index under which the relationship between compatible targets is stronger, or is stronger the better their compatibility;
(e) an index under which the relationship between targets having an identical specific gene sequence is stronger, or is stronger the more similar their specific gene sequences; and/or
(f) an index under which the relationship between targets is stronger the more similar the features extracted from their electroencephalograms.

In another embodiment of the target attribute representation generation model according to the present invention, it is also preferable that the mapping operator or the neural network algorithm is trained, with respect to a strength index of the predetermined relationship that is negatively correlated or may be negatively correlated with the degree of similarity between targets in the predetermined attribute, such that the weaker the relationship between the targets corresponding to two other representation vectors obtained by the mapping, the smaller the degree of separation between those two other representation vectors.

Furthermore, in the above embodiment using a strength index with negative correlation, it is also preferable that the mapping operator or the neural network algorithm is
trained such that, for two other representation vectors corresponding to targets whose relationship is weaker in terms of the strength index and that differ from or are distant from each other in an attribute other than the predetermined attribute, the degree of separation between those two other representation vectors becomes smaller, and/or
trained such that, for two other representation vectors corresponding to targets whose relationship is stronger in terms of the strength index and that are identical or close to each other in an attribute other than the predetermined attribute, the degree of separation between those two other representation vectors becomes larger.

Also, regarding the representation vector before mapping according to the present invention, when the target is a user and the predetermined attribute is personality, it is also preferable that
the representation vector is a user attribute representation vector, generated by a user attribute information generation model, that is information representing user attributes including the user's personality, and
the user attribute information generation model is a model that causes a computer to function as:
a plurality of domain-specific recurrent neural network (RNN) cells, one provided for each of a plurality of behavioral domains, each of which receives domain behavior information, which is information on the user's behavior in that behavioral domain, and generates new domain-specific hidden state information by reflecting the domain behavior information in the domain-specific hidden state information, which is the hidden state information the cell itself generated at the previous time point;
a user representation generation unit that is trained together with the plurality of domain-specific RNN cells and generates the user attribute representation vector from information identifying the user; and
a domain-independent RNN cell that receives the generated domain-specific hidden state information and the generated user attribute representation vector, and generates new domain-independent hidden state information by reflecting them in the domain-independent hidden state information, which is the hidden state information the cell itself generated at the previous time point.
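The wiring of the three components can be sketched as follows. To keep the example short it uses the simplest tanh recurrent cell with random, untrained weights, treats the user representation generation unit as a lookup table, and invents the domain names and dimensions; none of these choices is taken from the source.

```python
import math
import random


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class TanhRNNCell:
    """Simplest recurrent cell: h_new = tanh(W [x; h]). Stand-in for any RNN cell."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.W = rand_matrix(hidden_dim, input_dim + hidden_dim, rng)

    def step(self, x, h):
        return [math.tanh(a) for a in matvec(self.W, x + h)]


rng = random.Random(0)
ACT, EMB, HID = 3, 4, 5
domains = ["music", "shopping"]  # hypothetical behavioral domains

domain_cells = {d: TanhRNNCell(ACT, HID, rng) for d in domains}  # one cell per domain
shared_cell = TanhRNNCell(HID + EMB, HID, rng)                   # domain-independent cell
user_table = {"u1": [rng.uniform(-1, 1) for _ in range(EMB)]}    # user id -> r_u


def run(user_id, events):
    r_u = user_table[user_id]
    h_dom = {d: [0.0] * HID for d in domains}
    h_shared = [0.0] * HID
    for domain, action in events:
        # Domain-specific cell folds the new action into its own hidden state.
        h_dom[domain] = domain_cells[domain].step(action, h_dom[domain])
        # Domain-independent cell sees that hidden state together with r_u.
        h_shared = shared_cell.step(h_dom[domain] + r_u, h_shared)
    return r_u, h_dom, h_shared


events = [("music", [1.0, 0.0, 0.0]), ("shopping", [0.0, 1.0, 0.0]), ("music", [0.0, 0.0, 1.0])]
r_u, h_dom, h_shared = run("u1", events)
```

Because `r_u` is fed to the domain-independent cell at every step regardless of which domain the event came from, it is the one quantity shared across domains, which matches its role as the carrier of domain-independent user attributes.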

Furthermore, regarding the representation vector before mapping according to the present invention, when the target is a user and the predetermined attribute is personality, it is also preferable that the representation vector is a user attribute representation vector that represents the user attributes of the user and is calculated from information on predetermined behavior of the user by a user representation learning (URL) method. It is also preferable that this user representation learning method is, specifically, matrix factorization.
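As an illustration of how matrix factorization yields a per-user vector, the following sketch factorizes a small user-item table by stochastic gradient descent. The ratings, dimensions, and hyperparameters are made up for the example, and plain SGD with L2 regularization is one common variant rather than a method prescribed by the source.

```python
import random


def factorize(ratings, n_users, n_items, dim=2, lr=0.02, reg=0.01, epochs=2000, seed=0):
    """Fit user factors U and item factors V so that U[u] . V[i] approximates y."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, y in ratings:
            err = y - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(dim):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


# Hypothetical (user, item, score) observations, e.g. play counts or ratings.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 2, 2.0), (2, 1, 5.0), (2, 2, 4.0)]
U, V = factorize(ratings, n_users=3, n_items=3)


def predict(u, i):
    return sum(a * b for a, b in zip(U[u], V[i]))
```

Each row `U[u]` plays the role of the user attribute representation vector before mapping. Note that the observations carry no order, which is what makes this a Static URL method.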

According to the present invention, there is also provided a target attribute estimation device that generates the other representation vector using the target attribute representation generation model described above and outputs the generated other representation vector as information on the predetermined attribute of the target.

According to the present invention, there is further provided a computer-implemented target attribute estimation method characterized by:
constructing the target attribute representation generation model according to any one of claims 1 to 9 by performing training so as to obtain the trained mapping operator or the trained neural network algorithm described above; and
generating the other representation vector using the constructed target attribute representation generation model, and outputting the generated other representation vector as information on the predetermined attribute of the target.

The target attribute representation generation model, target attribute estimation device, and target attribute estimation method of the present invention make it possible to generate an attribute representation in which a predetermined attribute of a target, such as personality, is expressed with higher reliability.

FIG. 1 is a schematic diagram showing an embodiment of a target attribute representation generation model and a user attribute information generation model according to the present invention, and an embodiment of a target attribute estimation device equipped with these models.
FIG. 2 is a schematic diagram for explaining an embodiment of the mapping process performed in the mapping unit of the target attribute representation generation model according to the present invention.
FIG. 3 is a schematic diagram showing an embodiment of the user attribute information generation model according to the present invention.
FIG. 4 is a schematic diagram showing another embodiment of the user attribute information generation model according to the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

[Target Attribute Representation Generation Model, User Attribute Information Generation Model]
FIG. 1 is a schematic diagram showing an embodiment of a target attribute representation generation model and a user attribute information generation model according to the present invention, and an embodiment of a target attribute estimation device equipped with these models.

First, the user attribute information generation model 2 shown in FIG. 1, which is described in detail later, is a machine learning model that generates and outputs a "user attribute representation vector r_u," which is information on the attributes of a target (user attributes), from "domain behavior information," which is information on the behavior of the target (a user in this embodiment) in a specific behavioral domain.

The output "user attribute representation vector r_u," as is also detailed later, contains "information on a predetermined attribute" (personality in this embodiment) of the target (user). Hereinafter, a "representation vector" refers to a multi-dimensional vector (for example, tens to hundreds of dimensions), that is, a sequence of many numerical values, of the kind handled by known distributed representation (embedding) techniques, of which distributed word representations are typical.

Also in FIG. 1, the target attribute representation generation model 1 of this embodiment is a machine learning model that
(a) receives the generated "user attribute representation vector r_u" from the user attribute information generation model 2, and
(b) generates and outputs, from the received "user attribute representation vector r_u," a "target attribute representation vector" (in this embodiment, a personality representation vector M(r_u)) that represents the predetermined attribute (personality) of the target (user).

The target attribute representation generation model 1 can thus also be regarded as a model that separates and extracts the "information on the predetermined attribute (personality)" contained in the received "user attribute representation vector r_u" as the "target attribute representation vector (personality representation vector) M(r_u)."

In this regard, the target attribute representation generation model 1 is characterized in that it causes a computer to function as
(A) a mapping unit 10 that outputs, as "information on the predetermined attribute (personality) of the target (user)," a "target attribute representation vector (personality representation vector) M(r_u)," which is another representation vector (in another representation vector space) obtained by mapping the "user attribute representation vector r_u" of the target (user), itself an element of a certain representation vector space.
This is why the model can be regarded as one that separates and extracts the target attribute representation (the user's personality representation) as described above.

またさらに、この写像部１０は、その特徴として、
（Ｂ）所定属性（パーソナリティ）についての対象間（ユーザ間）の類似の度合いに関連している若しくは関連する可能性のある「対象間の所定関係の強さ指標」を用いて特定の訓練（学習）を施されたもの
となっている。ちなみに、この「対象間の所定関係の強さ指標」としては、後に詳細に説明するが、例えば血縁関係指標、友人・知人関係指標、親しさ指標、相性指標、遺伝子配列指標や、脳波特徴量指標、さらには仲の悪さ指標等を採用することができる。 Furthermore, a characteristic of the mapping unit 10 is that
(B) it has undergone specific training (learning) using a "strength index of a predetermined relationship between objects" that is, or may be, related to the degree of similarity between objects (users) with respect to the predetermined attribute (personality).
As described in detail later, examples of this "strength index of a predetermined relationship between objects" include a blood relationship index, a friend/acquaintance relationship index, a closeness index, a compatibility index, a gene sequence index, an electroencephalogram (EEG) feature index, and even a bad relationship index.

ここでより具体的に、写像部１０は、
（Ｃ）この「対象間の所定関係の強さ指標」が、写像して得られた２つの「対象属性表現ベクトル（パーソナリティ表現ベクトル）」に係る対象同士（ユーザ同士）における所定属性（パーソナリティ）の類似の度合いをより大きくする若しくはより大きくする可能性のあるものとなっているほど、これら２つの「対象属性表現ベクトル（パーソナリティ表現ベクトル）」の間の離隔度合いが、より小さくなるように訓練されたものとなっているのである。 More specifically, the mapping unit 10 is trained so that
(C) the more this "strength index of a predetermined relationship between objects" indicates that the objects (users) associated with two mapped "object attribute expression vectors (personality expression vectors)" have, or may have, a greater degree of similarity in the predetermined attribute (personality), the smaller the degree of separation between those two vectors becomes.

またこのような訓練の結果、対象属性表現生成モデル１（写像部１０）は、「対象間の所定関係の強さ指標」を手掛かりにした写像Mを用いて、受け取った「ユーザ属性表現ベクトルr_u」を、所定属性（パーソナリティ）が類似しているほどその間の離隔度合いが小さくなるような「対象属性表現ベクトル（パーソナリティ表現ベクトル）M(r_u)」へと仕立てることができる。 Furthermore, as a result of such training, the object attribute representation generation model 1 (mapping unit 10) can use the mapping M based on the "strength index of a specified relationship between objects" to tailor the received "user attribute representation vector r_u" into an "object attribute representation vector (personality representation vector) M(r_u)" in which the degree of separation between the objects becomes smaller as the specified attributes (personalities) become more similar.

この「対象属性表現ベクトル（パーソナリティ表現ベクトル）M(r_u)」はまさに、所定属性（パーソナリティ）をより高い信頼性をもって表現した情報となっているのであり、すなわち、対象属性表現生成モデル１はまさに、所定属性（パーソナリティ）をより高い信頼性をもって表現した属性表現としての表現ベクトルを生成することができるのである。 This "target attribute expression vector (personality expression vector) M(r_u)" is precisely information that expresses a specified attribute (personality) with greater reliability. In other words, the target attribute expression generation model 1 can generate an expression vector as an attribute expression that expresses a specified attribute (personality) with greater reliability.

また、この「対象属性表現ベクトル（パーソナリティ表現ベクトル）M(r_u)」は、上述したように多次元の（例えば数十～数百次元の）ベクトルであるので、所定属性（パーソナリティ）の表現における測定粒度（情報粒度）は、十分に高い（細かい）ものとなっている。例えば、パーソナリティでいえば、その情報粒度は、（パーソナリティを５次元で表現する）Big Fiveに代表される従来のパーソナリティ指標と比較して各段に高いものとなっている。 In addition, since this "target attribute expression vector (personality expression vector) M(r_u)" is a multidimensional vector (e.g., tens to hundreds of dimensions) as described above, the measurement granularity (information granularity) in expressing a given attribute (personality) is sufficiently high (fine). For example, in the case of personality, the information granularity is much higher than that of conventional personality indices such as the Big Five (which express personality in five dimensions).

以下、「対象」がユーザ（人間）であって、「所定属性」がパーソナリティであるとして、以上に述べたことをより詳細に説明する。なお勿論、本発明は「対象」及び「所定属性」を上記のものに限定するものではなく、例えば「対象」を何らかの製品として、「所定属性」を当該製品の総合的な性能又は品質とすることも可能である（この場合、例えば関係指標として、当該製品の製造バージョン・時期や使用履歴・環境等を用いることができる）。しかしながら以下、説明の便宜のため、上記のものに限定することとする。 Below, the above will be explained in more detail, assuming that the "target" is a user (human) and the "predetermined attribute" is personality. Of course, the present invention does not limit the "target" and "predetermined attribute" to those mentioned above; for example, the "target" could be a product and the "predetermined attribute" could be the overall performance or quality of that product (in this case, for example, the product's manufacturing version, time, usage history, environment, etc. could be used as related indicators). However, for ease of explanation, we will limit it to the above.

ここでより詳細な説明を行うにあたり、図１の上方に示した、ユーザ属性を分類したユーザ属性分類グラフを用いる。一般に、（パーソナリティを含む広い意味での）ユーザ属性は、
（ａ）動的な（時間に依存する、又は時間変化し易い）属性か、静的な（時間に依存しない、又は時間変化し難い）属性か、及び
（ｂ）ドメイン依存的な（特定の行動ドメインにのみ影響を与える、又は特定の行動ドメインに係るものである）属性か、ドメイン非依存的な（様々な行動ドメインの行動に影響し得る、又は特定の行動ドメインにのみ係るものではない）属性か
といった観点から、４つのグループに分類することができる。 For a more detailed explanation, we use the user attribute classification graph shown in the upper part of Figure 1, which classifies user attributes. In general, user attributes (in a broad sense that includes personality) can be classified into four groups according to
(a) whether the attribute is dynamic (time-dependent, or prone to change over time) or static (time-independent, or resistant to change over time), and
(b) whether the attribute is domain-dependent (affecting only a specific behavioral domain, or relating to a specific behavioral domain) or domain-independent (able to affect behavior in various behavioral domains, or not relating only to a specific behavioral domain).

具体的に、当該４つのグループはそれぞれ、上記（ａ）に係る横軸と上記（ｂ）に係る縦軸とで構成される（図１上方の）ユーザ属性分類グラフにおける第１～第４象限に係るグループとなっている。ちなみに、ユーザ属性の１つであるパーソナリティは、静的であって且つドメイン非依存であるので、第１象限に係るグループに属することになる。 Specifically, the four groups are groups related to the first to fourth quadrants in the user attribute classification graph (top of Figure 1) consisting of the horizontal axis related to (a) above and the vertical axis related to (b) above. Incidentally, personality, which is one of the user attributes, is static and domain-independent, so it belongs to the group related to the first quadrant.

同じく図１において、ユーザ属性情報生成モデル２が生成し出力したユーザ属性表現ベクトルr_uは、後に詳細に説明するが、上記のユーザ属性分類グラフにおける第１象限に属するユーザ属性を表現したものとなっている。この第１象限に属するユーザ属性には、その主要な成分としてパーソナリティが含まれているが、他にも（ユーザの）性別、年代、職業、収入や居住地等のユーザ属性が、（各行動ドメインでの行動に影響を及ぼす程度をもって）含まれている。 Also in FIG. 1, the user attribute representation vector r_u generated and output by the user attribute information generation model 2 is described in detail later, but it represents the user attributes belonging to the first quadrant of the above user attribute classification graph. The user attributes belonging to this first quadrant include personality as their main component, but also include other user attributes such as (user) gender, age, occupation, income, and place of residence (to the extent that they affect behavior in each behavioral domain).

したがって、ユーザ属性情報生成モデル２が出力したこのユーザ属性表現ベクトルr_uは、ユーザのパーソナリティを表現した表現ベクトルとして取り扱うことも可能となっているが、その表現の信頼性を更に高めることが大いに望まれるものともなっているのである。 Thus, the user attribute expression vector r_u output by the user attribute information generation model 2 can be treated as an expression vector that represents the user's personality, but it is highly desirable to further improve the reliability of this expression.

一方、対象属性表現生成モデル１が生成し出力したパーソナリティ表現ベクトルM(r_u)は、上記（Ｃ）のように訓練された写像部１０が、このようなユーザ属性表現ベクトルr_uに対し写像Mを施した結果得られたものであるので、ユーザのパーソナリティを表現した表現ベクトルとしてより信頼性の高いものとなっている。言い換えれば、パーソナリティ表現ベクトルM(r_u)は、上記のユーザ属性分類グラフにおける第１象限に属するユーザ属性からパーソナリティ成分を分離抽出した（若しくは強調した）表現ベクトルとなっているのである。 On the other hand, the personality expression vector M(r_u) generated and output by the target attribute expression generation model 1 is obtained as a result of the mapping unit 10, trained as described above in (C), applying mapping M to such a user attribute expression vector r_u, and is therefore a more reliable expression vector that expresses the user's personality. In other words, the personality expression vector M(r_u) is an expression vector that separates and extracts (or emphasizes) personality components from the user attributes belonging to the first quadrant in the above user attribute classification graph.

ちなみに、このようなユーザのパーソナリティを表現したパーソナリティ表現ベクトル（personality embedding vector）は、今回、本願発明者が独自に創作したものである。 Incidentally, the personality embedding vector that represents the user's personality was created independently by the inventor of this application.

ここで、人工知能による自然言語処理において欠かせない技術となっている単語分散表現では、１つ１つの単語が、例えば数百次元の（多数の数値の羅列である）word embedding vectorで表現され、意味の近い単語同士ほどこのベクトル間の離隔度合いが小さくなる。すなわちword embedding vectorは、対応する単語の意味が埋め込まれたものと捉えることができる。また、word embedding vector同士の演算も可能となっており、例えば有名な例として、"king"－"man"＋"woman"＝"queen"といったような加減算を行うことが可能となっている。 Here, in word embedding representation, which is an essential technique in natural language processing using artificial intelligence, each word is represented by a word embedding vector of, for example, several hundred dimensions (a string of numerous numerical values), and the closer the meaning of words are, the smaller the degree of separation between these vectors. In other words, a word embedding vector can be thought of as having the meaning of the corresponding word embedded in it. It is also possible to perform calculations between word embedding vectors, and one famous example is addition and subtraction, such as "king" - "man" + "woman" = "queen."

これと同様にして、本願発明者の創作によるパーソナリティ表現ベクトル（personality embedding vector）も、対応するユーザのパーソナリティの特徴が埋め込まれたものと捉えることができるので、ベクトル間の距離の逆数や内積等を（ユーザ間における）パーソナリティの類似度としたり、またベクトル同士の演算を行ったりして、パーソナリティに関しさらに有益な情報が導出・生成可能となることも期待されるのである。 In a similar manner, the personality expression vector (personality embedding vector) created by the inventors of the present application can be considered to have the personality characteristics of the corresponding user embedded in it, so it is expected that even more useful information regarding personality can be derived and generated by using the inverse of the distance between vectors or the dot product as a measure of personality similarity (between users) or by performing calculations between vectors.
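The similarity measures mentioned above (the inverse of the distance between vectors, or the inner product) can be sketched as follows. This is a minimal illustration only: the function names, the three-dimensional vectors, and the small constant `eps` are assumptions made for the example and do not appear in the specification.

```python
import math

def dot(a, b):
    """Inner product of two personality expression vectors."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance, used here as one possible 'degree of separation'."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inverse_distance_similarity(a, b, eps=1e-9):
    """Similarity as the inverse of the distance (eps avoids division by zero)."""
    return 1.0 / (euclidean(a, b) + eps)

# Hypothetical personality expression vectors M(r_u) for three users.
p_a = [0.9, 0.1, 0.3]
p_b = [0.8, 0.2, 0.3]    # placed close to p_a
p_c = [-0.5, 0.9, -0.2]  # placed far from p_a

# Under both measures, user b comes out as more similar to user a than user c.
print(dot(p_a, p_b) > dot(p_a, p_c))
print(inverse_distance_similarity(p_a, p_b) > inverse_distance_similarity(p_a, p_c))
```

Which measure is appropriate would depend on how the degree of separation was defined when the mapping unit 10 was trained.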

さらに、分散表現としてのパーソナリティ表現ベクトル（personality embedding vector）は、人間が直接に解釈できるものではないが、例えば、様々な用途の機械学習モデルにおける入力データとして利用され、人工知能がユーザのパーソナリティを詳細に理解して、当該ユーザの行動予測やサービスのパーソナライズ等をより的確に実施するようなことも可能になると期待される。 Furthermore, although the personality embedding vector as a distributed representation cannot be directly interpreted by humans, it is expected that it can be used, for example, as input data in machine learning models for various applications, enabling artificial intelligence to gain a detailed understanding of a user's personality and more accurately predict the user's behavior and personalize services.

次いで、図２を用いて、上記（Ｃ）で述べた写像部１０における訓練についてより詳細に説明を行う。 Next, we will use Figure 2 to provide a more detailed explanation of the training in the mapping unit 10 described above in (C).

図２は、対象属性表現生成モデル１の写像部１０で実施される写像処理の一実施形態を説明するための模式図である。 Figure 2 is a schematic diagram for explaining one embodiment of the mapping process performed by the mapping unit 10 of the object attribute representation generation model 1.

図２に示した実施形態によれば、対象属性表現生成モデル１における上記（Ｃ）のように訓練された写像部１０において実施される写像Mは、（ユーザ属性情報生成モデル２から出力される）様々なユーザについてのユーザ属性表現ベクトルr_uが張るユーザ属性表現ベクトル空間V_uから、（対象属性表現生成モデル１で生成される）当該様々なユーザについてのパーソナリティ表現ベクトルM(r_u)が張るパーソナリティ表現ベクトル空間V_pへの写像（M：V_u→V_p）となっている。 According to the embodiment shown in FIG. 2, the mapping M performed by the mapping unit 10 trained as described above in (C) in the target attribute representation generation model 1 is a mapping (M: V_u → V_p) from the user attribute representation vector space V_u spanned by the user attribute representation vectors r_u for various users (output from the user attribute information generation model 2) to the personality representation vector space V_p spanned by the personality representation vectors M(r_u) for the various users (generated by the target attribute representation generation model 1).

ここで本実施形態において、上記（Ｃ）の写像部１０の訓練は、「所定関係の強さ指標」として、
（ア）血縁関係指標：血縁関係にある対象同士を、又は血縁関係が濃い対象同士ほどより強いものとする指標
を採用している。この血縁関係指標は、パーソナリティ心理学における「パーソナリティは遺伝によりその半分が決定される」との知見、「血縁関係にある（遺伝上のつながりがある）人同士はパーソナリティが類似する」との知見や、「共通する遺伝子が多いほどパーソナリティが類似する（例えば二卵性双生児よりも一卵性双生児の方がパーソナリティについて類似する）」との知見等に基づき設定されたものである。 In this embodiment, the training of the mapping unit 10 in (C) above adopts the following as the "strength index of a predetermined relationship":
(i) Blood relationship index: an index that is stronger for objects related by blood, or stronger the closer the blood relationship between the objects.
This blood relationship index is based on findings in personality psychology such as "about half of personality is determined by genetics," "people who are related by blood (who have a genetic connection) have similar personalities," and "the more genes people share, the more similar their personalities (for example, identical twins are more similar in personality than fraternal twins)."

実際、上記の知見からして、血縁関係指標は、パーソナリティについてのユーザ間の類似の度合いと、（強いものとなっているほど類似の度合いが大きくなる傾向にあるという意味で）正に相関する又は正に相関する可能性のある指標となっていることが理解される。 In fact, based on the above findings, it can be understood that the blood relationship index is an index that is positively correlated or has the potential to be positively correlated with the degree of similarity between users in terms of personality (in the sense that the stronger the index, the greater the degree of similarity tends to be).

本実施形態では、写像部１０は、このような血縁関係指標に関し、写像して得られた２つのパーソナリティ表現ベクトルM(r_u)に係るユーザ同士がより強いものとなっているほど、これら２つのパーソナリティ表現ベクトルM(r_u)の間の離隔度合い（例えば空間V_pで定義された距離）がより小さくなるように訓練されているのである。 In this embodiment, the mapping unit 10 is trained so that the stronger the blood relationship index between the users associated with two mapped personality expression vectors M(r_u), the smaller the degree of separation between those two vectors (e.g., the distance defined in the space V_p).

またその結果、図２に示したように、例えば互いに血縁関係にあるユーザ"父１"、ユーザ"兄１"及びユーザ"弟１"の３つのユーザ属性表現ベクトルr_uは、ユーザ属性表現ベクトル空間V_u上では（パーソナリティのみを表現しているわけではないので）互いに離隔しているが、写像Mを施された後の対応する（ユーザ"父１"、ユーザ"兄１"及びユーザ"弟１"の）３つのパーソナリティ表現ベクトルM(r_u)は、パーソナリティ表現ベクトル空間V_p上では（互いに血縁関係にあることを反映して）互いに近接している。また、彼らとは別の血縁関係にあるユーザ"父２"、ユーザ"姉２"及びユーザ"弟２"も同様の態様をとっているのである。 As a result, as shown in Figure 2, the three user attribute expression vectors r_u of users "Father 1", "Elder Brother 1", and "Younger Brother 1", who are related by blood, are separated from each other in the user attribute expression vector space V_u (because they do not express personality alone), whereas the three corresponding personality expression vectors M(r_u) obtained after mapping M is applied are close to each other in the personality expression vector space V_p (reflecting their blood relationship). The same pattern holds for users "Father 2", "Older Sister 2", and "Younger Brother 2", who form a separate group of blood relatives.

このように、写像Mを施す表現ベクトルに係るユーザの血縁関係情報を取得して、上記の血縁関係指標に基づき、写像Mを行う写像部１０を訓練することによって、パーソナリティの互いに類似するユーザのパーソナリティ表現ベクトルM(r_u)が、パーソナリティ表現ベクトル空間V_p上において互いに近接するようにすることができる。すなわち、ユーザのパーソナリティをより高い信頼性をもって表現しているパーソナリティ表現ベクトルM(r_u)が生成可能となるのである。 In this way, by acquiring blood relationship information for the users whose expression vectors are subjected to mapping M, and training the mapping unit 10 that performs mapping M based on the above blood relationship index, the personality expression vectors M(r_u) of users with similar personalities can be brought close to each other in the personality expression vector space V_p. In other words, it becomes possible to generate personality expression vectors M(r_u) that express a user's personality with higher reliability.

なお、このような写像Mを行う写像部１０は例えば、入力されたユーザ属性表現ベクトルr_uに対し、写像演算子としての行列を作用させる（積算する）ものであってもよく、このような行列の積算後、さらにバイアスベクトルを加算するものであってもよい。また、このようにして得られた値をsigmoid関数、tanh関数や、ReLU関数等に入力して関数値を出力するものとすることもできる。さらに、順伝播型ニューラルネットワーク、特に多層パーセプトロン（Multi-layer perceptron）であってもよい。 The mapping unit 10 that performs such a mapping M may, for example, apply a matrix serving as a mapping operator to the input user attribute expression vector r_u (that is, multiply the vector by the matrix), and may further add a bias vector after the matrix multiplication. The value thus obtained may also be input to a sigmoid function, a tanh function, a ReLU function, or the like, with the function value taken as the output. Furthermore, the mapping unit 10 may be a feedforward neural network, in particular a multi-layer perceptron.
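The simplest of the forms described above (matrix multiplication, bias addition, and an element-wise activation function) can be sketched as follows. The matrix `W`, the bias `b`, the dimensions, and the function names are hypothetical values chosen for illustration; in practice they would be parameters learned during training of the mapping unit 10.

```python
import math

def relu(x):
    """ReLU activation: passes positive values through, clips others to 0."""
    return max(0.0, x)

def apply_mapping(W, b, r_u, activation=math.tanh):
    """Compute M(r_u) = activation(W r_u + b), one output component per row of W."""
    out = []
    for row, bias in zip(W, b):
        z = sum(w * x for w, x in zip(row, r_u)) + bias
        out.append(activation(z))
    return out

# Hypothetical parameters mapping a 3-dimensional V_u to a 2-dimensional V_p.
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.0]]
b = [0.0, -1.0]
r_u = [2.0, 1.0, 0.5]  # hypothetical user attribute expression vector

p_u = apply_mapping(W, b, r_u, activation=relu)
print(p_u)  # [1.5, 0.5]
```

Stacking several such layers would give the multi-layer perceptron form also mentioned above.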

また、このような写像部１０の訓練（学習）は例えば、写像M後の表現ベクトルM(r_u)間の離隔度合いに関し、血縁関係にある血縁ユーザの方が血縁関係にない非血縁ユーザよりも小さく、且つ両者の離隔度合いの差ができるだけ大きくなるように損失関数を設定することにより実施することができる。 In addition, such training (learning) of the mapping unit 10 can be performed, for example, by setting a loss function so that the degree of separation between the representation vectors M(r_u) after mapping M is smaller for blood-related users than for unrelated users, and the difference in the degree of separation between the two is as large as possible.

＜血縁関係の有無を反映した損失関数＞
例えば、特定のユーザu1に係る損失関数lossとして、次式
（１） loss(u1,u2,u3)
＝Σ_u2Σ_u3 max[0, margin－M(r_u1)・M(r_u2)＋M(r_u1)・M(r_u3)]
を採用してもよい。ここで、u2はユーザu1の血縁ユーザであり、u3はユーザu1の非血縁ユーザである。またM(r_ui)は、ユーザuiのパーソナリティ表現ベクトルである。さらに、Σ_u2はユーザu1の血縁ユーザ全員についての総和であり、Σ_u3はユーザu1の非血縁ユーザ全員についての総和となっている。またmax[0, α]は、αが正値であればα、αが非正であれば0をとる関数であって、さらにmarginは、括弧[]内の後ろの値を調整するための正の定数である。
<Loss function reflecting the presence or absence of blood ties>
For example, the following may be adopted as the loss function loss for a specific user u1:
(1) loss(u1,u2,u3)
＝Σ_u2Σ_u3 max[0, margin－M(r_u1)・M(r_u2)＋M(r_u1)・M(r_u3)]
Here, u2 is a blood relative of user u1, and u3 is a user who is not a blood relative of user u1. M(r_ui) is the personality expression vector of user ui. Σ_u2 is the sum over all blood relatives of user u1, and Σ_u3 is the sum over all users who are not blood relatives of user u1. max[0, α] is a function that takes the value α if α is positive and 0 if α is non-positive, and margin is a positive constant for adjusting the second value inside the brackets [].

上式（１）の損失関数lossは、ユーザu1における血縁者との内積M(r_u1)・M(r_u2)が非血縁者との内積M(r_u1)・M(r_u3)よりも大きくなるほど、小さい値をとる。ここで、この内積は、写像M後の表現ベクトルM(r_u)間の離隔度合いの逆数とも言うべき量となっており、したがって損失関数lossは、写像部１０の訓練に好適な関数となっているのである。 The loss function loss in formula (1) above takes a smaller value the more the inner product M(r_u1)・M(r_u2) of user u1 with a blood relative exceeds the inner product M(r_u1)・M(r_u3) of user u1 with a non-relative. Here, the inner product behaves like an inverse of the degree of separation between the expression vectors M(r_u) after mapping M (it grows as the separation shrinks), and the loss function loss is therefore well suited to training the mapping unit 10.
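Formula (1) can be written out directly as a double sum of hinge terms. The following is a minimal sketch for a single user u1; the function names and the two-dimensional vectors are illustrative assumptions, and the gradient-based update of the mapping M that would minimize this loss is omitted.

```python
def dot(a, b):
    """Inner product M(r_ui)・M(r_uj)."""
    return sum(x * y for x, y in zip(a, b))

def kinship_margin_loss(anchor, relatives, non_relatives, margin=1.0):
    """Formula (1): sum over every (relative, non-relative) pair of
    max[0, margin - anchor・relative + anchor・non_relative]."""
    total = 0.0
    for rel in relatives:            # Σ_u2: blood relatives of u1
        for non in non_relatives:    # Σ_u3: non-relatives of u1
            total += max(0.0, margin - dot(anchor, rel) + dot(anchor, non))
    return total

m_u1 = [1.0, 0.0]  # hypothetical M(r_u1)

# Relative aligned with u1, non-relative orthogonal: the margin is met, loss is 0.
print(kinship_margin_loss(m_u1, [[1.0, 0.0]], [[0.0, 1.0]]))  # 0.0
# Reversed roles: the non-relative is closer than the relative, so the loss is positive.
print(kinship_margin_loss(m_u1, [[0.0, 1.0]], [[1.0, 0.0]]))  # 2.0
```

Each term reaches 0 once the inner product with the relative exceeds the inner product with the non-relative by at least margin, so pairs that are already well ordered contribute nothing.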

なお、写像部１０の訓練に用いることのできる損失関数は当然、上式（１）に限定されるものではない。例えば上式（１）では、写像M後の表現ベクトルM(r_u)間の離隔度合いに係る量として内積を用いているが、代わりにコサイン類似度やユークリッド距離等を用いて損失関数を設定することも可能である。また、血縁ユーザu2や非血縁ユーザu3は、ユーザu1の全ての血縁ユーザや非血縁ユーザをとるのではなく、その一部をとったものであってもよい。 The loss function that can be used to train the mapping unit 10 is not limited to the above formula (1). For example, in the above formula (1), the inner product is used as a quantity related to the degree of separation between the representation vectors M(r_u) after mapping M, but it is also possible to set the loss function using cosine similarity, Euclidean distance, or the like instead. In addition, the blood-related user u2 and the non-blood-related user u3 do not need to be all blood-related users or non-blood-related users of the user u1, but may be a part of them.
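The alternatives mentioned above can be sketched as follows. Note that with cosine similarity the signs are the same as in formula (1), whereas with Euclidean distance (a separation measure rather than a similarity) the signs are reversed. The function names, the margin value, and the use of random sampling to take only part of the relatives and non-relatives are assumptions made for this illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: larger means closer."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    """Euclidean distance: smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hinge_cosine(anchor, rel, non, margin=0.5):
    # Similarity-based term: same sign pattern as formula (1).
    return max(0.0, margin - cosine(anchor, rel) + cosine(anchor, non))

def hinge_euclidean(anchor, rel, non, margin=0.5):
    # Distance-based term: the relative should be nearer than the non-relative.
    return max(0.0, margin + euclidean(anchor, rel) - euclidean(anchor, non))

def sampled_loss(anchor, relatives, non_relatives, term, n_samples, rng):
    """Sum the chosen hinge term over randomly sampled (relative, non-relative) pairs."""
    total = 0.0
    for _ in range(n_samples):
        total += term(anchor, rng.choice(relatives), rng.choice(non_relatives))
    return total

anchor = [1.0, 0.0]
rng = random.Random(0)
print(sampled_loss(anchor, [[2.0, 0.0]], [[0.0, 3.0]], hinge_cosine, 4, rng))
print(sampled_loss(anchor, [[2.0, 0.0]], [[0.0, 3.0]], hinge_euclidean, 4, rng))
```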

また、写像部１０の訓練（学習）では、上記のような損失関数を計算するために、ユーザの血縁関係の情報を取得する必要があるが、一度訓練が完了すれば、パーソナリティ表現ベクトルM(r_u)を導出する際に、当該血縁関係の情報は不要となるのである。 In addition, in the training (learning) of the mapping unit 10, in order to calculate the loss function as described above, it is necessary to obtain information on the user's blood ties. However, once the training is completed, the blood ties information is no longer necessary when deriving the personality expression vector M(r_u).

＜血縁関係の強弱（濃さ）を反映した損失関数＞
上式（１）の損失関数lossはいわば、ユーザ間における血縁関係の有無を反映した損失関数となっているが、その変更態様として、ユーザ間における血縁関係の濃さを反映した損失関数を設定することも可能である。例えば血縁関係について、一卵性双生児の遺伝子情報は互いに一致するので、お互いのパーソナリティの類似度は最も大きくなる、若しくは最も大きくなる可能性が高いとすることができる。一方、従兄弟、従姉妹や叔父叔母等は、勿論血縁関係にはあるが、共有する遺伝子情報の割合は兄弟や親子よりも低く、お互いのパーソナリティの類似度は相対的に低くなると考えられる。したがって、血縁関係が濃いほどパーソナリティの類似度は大きくなる（傾向にある）ことに基づく損失関数を設定することができるのである。
<Loss function reflecting the strength (closeness) of blood ties>
The loss function loss in formula (1) above reflects, so to speak, the presence or absence of a blood relationship between users; as a modification, it is also possible to set a loss function that reflects the closeness of the blood relationship between users. For example, since the genetic information of identical twins matches, their personality similarity can be assumed to be the highest, or very likely the highest. On the other hand, cousins, uncles, aunts, and the like are of course blood relatives, but they share a smaller proportion of genetic information than siblings or parents and children, so their personality similarity is considered to be relatively lower. A loss function can therefore be set on the basis that the closer the blood relationship, the greater the personality similarity tends to be.

具体的に、このような損失関数loss'として、次式
（２） loss'(u1,u2,u2')
＝Σ_u2Σ_u2' max[0, margin－M(r_u1)・M(r_u2)＋M(r_u1)・M(r_u2')]
を採用してもよい。ここで、u2及びu2'はともにユーザu1の血縁ユーザであるが、u2はu2'よりもu1との血縁関係が濃いユーザとなっている。また、Σ_u2はユーザu2（u1との血縁関係が濃い血縁ユーザ）全員についての総和であり、Σ_u2'はユーザu2'（u1との血縁関係がu2ほどは濃くない血縁ユーザ）全員についての総和となっている。
Specifically, the following may be adopted as such a loss function loss':
(2) loss'(u1,u2,u2')
＝Σ_u2Σ_u2' max[0, margin－M(r_u1)・M(r_u2)＋M(r_u1)・M(r_u2')]
Here, u2 and u2' are both blood relatives of user u1, but u2 is more closely related to u1 than u2' is. Σ_u2 is the sum over all users u2 (blood relatives who are closely related to u1), and Σ_u2' is the sum over all users u2' (blood relatives who are not as closely related to u1 as u2).

このように、写像部１０の訓練（学習）に用いる損失関数に、血縁関係の濃さを反映させることにより、訓練後の写像部１０から導出されるパーソナリティ表現ベクトルM(r_u)の信頼性をより高めることも可能となる。すなわち、パーソナリティの表現としてより正確であることが期待される表現ベクトルを生成することができるのである。 In this way, by reflecting the strength of the blood relationship in the loss function used to train (learn) the mapping unit 10, it is possible to further increase the reliability of the personality expression vector M(r_u) derived from the mapping unit 10 after training. In other words, it is possible to generate an expression vector that is expected to be more accurate as a representation of personality.
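Formula (2) can be sketched as a ranking loss over pairs of blood relatives ordered by how closely they are related to u1. The example below assumes, purely for illustration, that the closeness of the blood relationship is given as a number (the coefficient of relatedness: 1.0 for identical twins, 0.125 for first cousins); the specification does not prescribe this encoding, and the function names and vectors are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def graded_kinship_loss(anchor, relatives, margin=0.5):
    """Formula (2): for every pair (u2, u2') of relatives where u2 is more
    closely related to u1 than u2', add
    max[0, margin - anchor・M(r_u2) + anchor・M(r_u2')].

    relatives: list of (personality_vector, relatedness) tuples.
    """
    total = 0.0
    for vec_close, r_close in relatives:
        for vec_far, r_far in relatives:
            if r_close > r_far:  # u2 is closer kin than u2'
                total += max(0.0, margin - dot(anchor, vec_close) + dot(anchor, vec_far))
    return total

m_u1 = [1.0, 0.0]  # hypothetical M(r_u1)

# Twin mapped next to u1, cousin further away: the ordering is respected.
well_ordered = [([1.0, 0.0], 1.0), ([0.2, 0.8], 0.125)]
# Twin mapped far away, cousin next to u1: the ordering is violated.
mis_ordered = [([0.2, 0.8], 1.0), ([1.0, 0.0], 0.125)]

print(graded_kinship_loss(m_u1, well_ordered))
print(graded_kinship_loss(m_u1, mis_ordered))
```

The first call yields zero loss and the second a positive loss, so minimizing it pushes more closely related users nearer to u1 than more distantly related ones.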

＜血縁関係以外の関係の利用＞
以上、「所定関係の強さ指標」として、上記（ア）の血縁関係指標を用いた訓練（損失関数の設定）について説明を行ったが、「所定関係の強さ指標」は当然、この血縁関係指標に限定されるものではない。例えば、
（イ）友人・知人関係指標：友人・知人関係にあるユーザ同士を、若しくは友人・知人関係が深いユーザ同士ほどより強いものとする指標、
（ウ）親しさ指標：親しい関係にあるユーザ同士を、若しくは親しいユーザ同士ほどより強いものとする指標、
（エ）相性指標：相性のよいユーザ同士を、若しくは相性がよいユーザ同士ほどより強いものとする指標、
（オ）遺伝子配列指標：特定の遺伝子配列が同一であるユーザ同士を、若しくは特定の遺伝子配列が類似しているユーザ同士ほどより強いものとする指標、又は、
（カ）脳波特徴量指標：脳波から抽出された特徴量が類似しているユーザ同士ほどより強いものとする指標
を「所定関係の強さ指標」として採用することもできる。すなわち上記（ア）～（カ）のいずれにしても、当該関係の有無に係る上式（１）や、当該関係の強弱に係る上式（２）のような損失関数を設定することが可能となるのである。
<Use of relationships other than blood ties>
The above describes training (setting of the loss function) using the blood relationship index (i) as the "strength index of a predetermined relationship," but the index is of course not limited to the blood relationship index. For example, any of the following can also be adopted as the "strength index of a predetermined relationship":
(ii) Friend/acquaintance relationship index: an index that is stronger for users who are friends or acquaintances, or stronger the deeper the friend/acquaintance relationship between the users;
(iii) Closeness index: an index that is stronger for users who are close to each other, or stronger the closer the users are;
(iv) Compatibility index: an index that is stronger for users who are compatible, or stronger the more compatible the users are;
(v) Gene sequence index: an index that is stronger for users who share an identical specific gene sequence, or stronger the more similar the users' specific gene sequences are; or
(vi) EEG feature index: an index that is stronger the more similar the features extracted from the users' electroencephalograms are.
In other words, for any of the indices (i) to (vi) above, it is possible to set a loss function like formula (1), which concerns the presence or absence of the relationship, or formula (2), which concerns the strength of the relationship.

このように「所定関係の強さ指標」に関し選択肢を揃えておくことによって、例えば、ユーザの血縁関係の情報が取得できない場合に、取得可能となっているユーザの友人・知人関係の情報を取得して、写像部１０の訓練に適用するといったことも可能となるのである。 By providing a range of options for the "specified relationship strength index" in this way, for example, if information on the user's blood ties cannot be obtained, it is possible to obtain information on the user's friend and acquaintance relationships, which is available, and apply this information to training the mapping unit 10.

ここで、上記（イ）の友人・知人関係指標における「友人・知人関係の深さ」は、例えば所定のＳＮＳ（Social Networking Service）上におけるコミュニケーション頻度（例えばメッセージ交換の頻度やチャットを行う頻度）、コミュニケーション形態（例えば、非リアルタイムのメッセージ交換のみを行う形態、リアルタイムのチャットも行う形態や、ビデオ通話も行う形態等）や、コミュニケーション時間（「友達」関係が登録されている期間や、総通話時間等）等の情報を定量化したものとすることもできる。すなわち、当該ＳＮＳ情報から自動的に生成することも可能となるのである。 Here, the "depth of the friend/acquaintance relationship" in the friend/acquaintance relationship index (ii) above can be a quantification of information on a given SNS (Social Networking Service), such as the frequency of communication (e.g., how often messages are exchanged or chats take place), the form of communication (e.g., non-real-time message exchange only, real-time chat as well, or video calls as well), and the duration of communication (e.g., the period for which the "friend" relationship has been registered, or the total call time). In other words, the index can also be generated automatically from the SNS information.
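One way such a quantification might look is sketched below. Everything in it is an assumption made for illustration: the weights assigned to each communication form, the saturation constants, and the equal-weight average are not taken from the specification, which only states that frequency, form, and duration can be quantified.

```python
import math

# Hypothetical weights: richer communication forms count as a deeper relationship.
FORM_WEIGHT = {"message_only": 0.3, "chat": 0.6, "video_call": 1.0}

def friendship_depth(messages_per_week, form, friend_days, call_minutes):
    """Return a depth score in [0, 1] from three kinds of SNS information."""
    # Communication frequency, saturating toward 1 for very frequent contact.
    frequency = 1.0 - math.exp(-messages_per_week / 20.0)
    # Communication form, looked up from the assumed weight table.
    richness = FORM_WEIGHT[form]
    # Communication duration: years as registered "friends" plus total call time.
    duration = 1.0 - math.exp(-(friend_days / 365.0 + call_minutes / 600.0))
    return (frequency + richness + duration) / 3.0

close_pair = friendship_depth(50, "video_call", friend_days=1000, call_minutes=3000)
distant_pair = friendship_depth(1, "message_only", friend_days=30, call_minutes=0)
print(close_pair > distant_pair)  # True
```

A score of this kind could then play the role of the relationship strength when ordering pairs in a loss function such as formula (2).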

また、上記（ア）～（カ）のうちの複数、例えば全部を「所定関係の強さ指標」として採用してもよい。例えば、血縁関係指標の強さも、遺伝子配列指標の強さもともに考慮して損失関数を設定することも可能である。 Moreover, more than one of the indices (i) to (vi) above, for example all of them, may be adopted together as the "strength index of a predetermined relationship." For example, it is also possible to set a loss function that takes into account both the strength of the blood relationship index and the strength of the gene sequence index.

ちなみに、上記（イ）～（エ）の強さ指標は、「パーソナリティの類似する者同士ほど、友人・知人関係や親しい関係になり易く、また、互いに相性がよい傾向にある」との知見に基づき設定されるものとなっている。さらに、上記（オ）及び（カ）の強さ指標は、パーソナリティと生体情報との関係についての、以下に述べるような近年の研究結果に基づき設定可能となっているのである。 Incidentally, the strength indices (ii) to (iv) above (the friend/acquaintance relationship, closeness, and compatibility indices) are based on the finding that "people with similar personalities are more likely to become friends, acquaintances, or close companions, and tend to be compatible with each other." The strength indices (v) and (vi) above (the gene sequence and EEG feature indices) can be set based on recent research results on the relationship between personality and biometric information, as described below.

例えば、ヒトのドーパミンＤ４受容体の遺伝子には、特定の48塩基の単位からなる繰り返し配列が存在しており、また、この繰り返しの回数については、2～11回の間で個人差（多型）がみられ、さらに、繰り返し配列内にも多くのＳＮＰｓ（一塩基多型）が存在していることが分かっている。ここで、この繰り返し配列については、当該分野の2つの研究グループによって、繰り返しの回数が多いヒトほど好奇心が強く、いわゆる怖いもの知らずの傾向にあり、また、ドーパミンＤ４受容体をノックアウトしたマウスは好奇心が低下し、探索行動が観察され難くなるとの報告がなされている。さらに、イヌのドーパミンＤ４受容体遺伝子にも、ある配列の欠損や挿入による8種の多型の存在することが明らかになっており、例えば攻撃性の強いシバイヌとおとなしい気質のゴールデンレトリバーとでは遺伝子型に相応の違いのあることも知られている。 For example, the human dopamine D4 receptor gene contains a repeat sequence consisting of a specific 48-base unit, and the number of repeats varies from person to person (polymorphism) between 2 and 11 times. Furthermore, it is known that there are many SNPs (single nucleotide polymorphisms) within the repeat sequence. Two research groups in the field have reported that people with a higher number of repeats tend to be more curious and fearless, and that mice with knockout dopamine D4 receptors have a lower curiosity and are less likely to exhibit exploratory behavior. Furthermore, it has been revealed that there are eight types of polymorphisms in the dog dopamine D4 receptor gene, caused by deletions or insertions of certain sequences. For example, it is known that there are corresponding differences in genotype between aggressive Shiba Inu and docile golden retrievers.

またさらに脳波についても、脳波から抽出された特徴量がパーソナリティにおける特定の特性との間で有意な相関を示すことが、数多くの研究の結果として報告されている。 Furthermore, numerous studies have shown that features extracted from brain waves show significant correlations with specific personality traits.

以上に述べたような研究結果・知見に基づいて、上記（オ）の遺伝子配列指標や上記（カ）の脳波特徴量指標を採用し、ユーザの遺伝子配列や脳波特徴量を予め測定しておいた上で、例えば、ドーパミンＤ４受容体の繰り返し回数や繰り返し配列のパターンが類似しているユーザ同士ほど、また、パーソナリティとの間で相関を示す脳波特徴量が類似するユーザ同士ほど、対応するパーソナリティ表現ベクトルの間の離隔度合いをより小さくするように写像部１０を訓練することも好ましいのである。 Based on the research results and findings described above, it is also preferable to employ the gene sequence index (e) and the electroencephalogram feature index (f) above, measure the user's gene sequence and electroencephalogram feature in advance, and then train the mapping unit 10 so that the degree of separation between corresponding personality expression vectors is smaller for users who, for example, have similar dopamine D4 receptor repetition numbers or repeat sequence patterns, or who have similar electroencephalogram feature values that show a correlation with personality.

なお、以上に説明した上記（ア）～（カ）の「所定関係の強さ指標」はいずれも、パーソナリティについてのユーザ間の類似の度合いと（強いものとなっているほど類似の度合いが大きくなる傾向にあるという意味で）正に相関する若しくは正に相関する可能性のある指標となっている。この場合、このような「所定関係の強さ指標」に関し、写像して得られた２つのパーソナリティ表現ベクトルに係るユーザ同士がより強いものとなっているほど、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより小さくなるように、写像部１０を訓練することができるのである。 All of the "strength indices of a predetermined relationship" (i) to (vi) described above are indices that are, or may be, positively correlated with the degree of personality similarity between users (in the sense that the stronger the index, the greater the degree of similarity tends to be). In this case, the mapping unit 10 can be trained so that the stronger such an index is between the users associated with two mapped personality expression vectors, the smaller the degree of separation between those two vectors becomes.

一方、上記（ア）～（カ）とは異なり、パーソナリティについてのユーザ間の類似の度合いと（強いものとなっているほど類似の度合いが小さくなる傾向にあるという意味で）負に相関する若しくは負に相関する可能性のある「所定関係の強さ指標」を採用することも可能である。例えば、
（キ）仲の悪さ指標：仲の悪いユーザ同士を、又は仲の悪いユーザ同士ほどより強いものとする指標
を採用してもよい。 On the other hand, unlike the indices (i) to (vi) above, it is also possible to adopt a "strength index of a predetermined relationship" that is, or may be, negatively correlated with the degree of personality similarity between users (in the sense that the stronger the index, the smaller the degree of similarity tends to be). For example, the following may be adopted:
(vii) Bad relationship index: an index that is stronger for users who are on bad terms with each other, or stronger the worse the terms between the users.

このような負の相関に係る「所定関係の強さ指標」を採用する場合は、この「所定関係の強さ指標」に関し、写像して得られた２つのパーソナリティ表現ベクトルに係るユーザ同士がより弱いものとなっているほど、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより小さくなるように、写像部１０を訓練することができる。 When adopting a "specified relationship strength index" related to such a negative correlation, the mapping unit 10 can be trained so that the degree of separation between the two personality expression vectors obtained by mapping becomes smaller the weaker the users related to the two personality expression vectors obtained by mapping are in terms of this "specified relationship strength index."
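The difference between positively and negatively correlated indices can be expressed by reversing which member of a pair is expected to lie closer. The sketch below assumes a hypothetical `correlation_sign` parameter (+1 for indices such as the blood relationship index, -1 for an index such as the bad relationship index) and hypothetical strength values; none of these names come from the specification.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def signed_ranking_loss(anchor, others, correlation_sign=1, margin=0.5):
    """Hinge ranking loss over pairs of other users.

    others: list of (personality_vector, index_strength) tuples.
    With correlation_sign=+1 the user with the stronger index should be closer
    to the anchor; with correlation_sign=-1 the user with the weaker index
    should be closer.
    """
    total = 0.0
    for vec_i, s_i in others:
        for vec_j, s_j in others:
            if correlation_sign * (s_i - s_j) > 0:  # user i should be the closer one
                total += max(0.0, margin - dot(anchor, vec_i) + dot(anchor, vec_j))
    return total

m_u1 = [1.0, 0.0]
# Hypothetical bad relationship index: 0.1 (gets along) versus 0.9 (on bad terms).
others = [([0.9, 0.1], 0.1), ([0.1, 0.9], 0.9)]

print(signed_ranking_loss(m_u1, others, correlation_sign=-1))
print(signed_ranking_loss(m_u1, others, correlation_sign=+1))
```

With `correlation_sign=-1` the arrangement above incurs no loss, because the user with the weaker bad relationship index already lies closer to u1; treating the same index as positively correlated would penalize it.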

また、「所定関係の強さ指標」の更なる変更態様として、パーソナリティの類似の度合いとの間で正や負の相関を示すものではないが、パーソナリティの類似の度合いに関連している若しくは関連する可能性のある「所定関係の強さ指標」を採用することも可能である。例えば、特定の指標値をとる場合にパーソナリティの類似の度合いが極大化又は極小化するような「所定関係の強さ指標」についても、写像部１０の訓練に用いることができるのである。 As a further modification of the "specified relationship strength index," it is also possible to employ a "specified relationship strength index" that does not show a positive or negative correlation with the degree of personality similarity, but that is related or potentially related to the degree of personality similarity. For example, a "specified relationship strength index" that maximizes or minimizes the degree of personality similarity when it takes a specific index value can also be used to train the mapping unit 10.

＜パーソナリティ以外のユーザ属性の利用＞
以上、様々な種類の「所定関係の強さ指標」を用いて写像部１０の訓練（学習）が実施可能となっていることを説明した。ここで、訓練についての好適な他の態様として、パーソナリティとは別のユーザ属性、例えば居住地、職業や、年代等を利用した訓練について説明を行う。 <Using user attributes other than personality>
It has been explained above that the mapping unit 10 can be trained using various types of "predetermined relationship strength index." Here, as another preferred form of training, training that uses user attributes other than personality, such as place of residence, occupation, and age group, will be described.

最初に、上記（ア）～（カ）のような正の相関に係る「所定関係の強さ指標」を用いる場合を説明する。この場合、写像部１０は、
（ａ）このような「所定関係の強さ指標」に関してより強いものとなっているユーザ同士に係る２つのパーソナリティ表現ベクトルであって、パーソナリティとは別のユーザ属性（例えば居住地）に関して異なる若しくは遠い関係にある当該ユーザ同士に係る２つのパーソナリティ表現ベクトルについて、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより小さくなるように訓練されることも好ましい。 First, the case of using a "predetermined relationship strength index" with a positive correlation, such as (A) to (F) above, will be described. In this case, the mapping unit 10
(a) is also preferably trained so that, for two personality expression vectors of users whose relationship is stronger in terms of such a "predetermined relationship strength index" and who differ from, or are distant from, each other in terms of a user attribute other than personality (e.g., place of residence), the degree of separation between these two personality expression vectors becomes smaller.

例えば、訓練を行うに当たり、血縁関係にあるユーザに係る（ユーザ属性表現ベクトルを含む）訓練データを選択する際には、できるだけ（パーソナリティ以外のユーザ属性としての）居住地の異なる若しくは遠く離れたユーザに係る訓練データを選択することも好ましいのである。 For example, when selecting training data (including user attribute expression vectors) for blood-related users during training, it is also preferable to select, as far as possible, training data for users whose places of residence (as a user attribute other than personality) are different or far apart.

また、写像部１０は、
（ｂ）このような「所定関係の強さ指標」に関してより弱いものとなっているユーザ同士に係る２つのパーソナリティ表現ベクトルであって、パーソナリティとは別のユーザ属性（例えば居住地）に関して同一の若しくは近い関係にある当該ユーザ同士に係る２つのパーソナリティ表現ベクトルについて、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより大きくなるように訓練されることも好ましい。 Moreover, the mapping unit 10
(b) is also preferably trained so that, for two personality expression vectors of users whose relationship is weaker in terms of such a "predetermined relationship strength index" and who are identical or close to each other in terms of a user attribute other than personality (e.g., place of residence), the degree of separation between these two personality expression vectors becomes larger.

例えば、訓練を行うに当たり、非血縁関係にあるユーザに係る（ユーザ属性表現ベクトルを含む）訓練データを選択する際には、できるだけ（パーソナリティ以外のユーザ属性としての）居住地が同じ若しくは近いユーザに係る訓練データを選択することも好ましいのである。 For example, when selecting training data (including user attribute expression vectors) for users who are not blood-related during training, it is also preferable to select, as far as possible, training data for users whose places of residence (as a user attribute other than personality) are the same or close.

ここで、写像部１０は、上記（ａ）の訓練、及び上記（ｂ）の訓練のいずれか一方又は両方を受けてもよい。いずれにしても、パーソナリティとは別のユーザ属性として、パーソナリティとは一般に関連しない（とされている）ユーザ属性を選択して訓練に用いることによって、パーソナリティ表現ベクトルにおけるパーソナリティとは別のユーザ属性の表現分を低減・除去することが可能となる。すなわち、より信頼性の高いパーソナリティ表現ベクトルを導出することができるのである。 Here, the mapping unit 10 may undergo either or both of the above-mentioned training (a) and training (b). In any case, by selecting a user attribute that is not generally related to personality (considered to be not related to personality) as a user attribute other than personality and using it for training, it becomes possible to reduce or remove the expression of the user attribute other than personality in the personality expression vector. In other words, it is possible to derive a more reliable personality expression vector.

例えば、写像部１０において、パーソナリティの異なるユーザ同士に係る２つのパーソナリティ表現ベクトルを、互いに離隔させるように訓練したとしても、当該ユーザ同士の居住地が異なっている（例えば居住地がそれぞれ北海道及び沖縄である）場合、これら２つのパーソナリティ表現ベクトルの離隔度合いには、居住地の違いも反映されてしまう可能性が生じてしまう。これに対し、このような２つのパーソナリティ表現ベクトルに係るユーザ同士として同じ居住地である（例えばともに北海道である）ユーザ同士を選択することによって、これら２つのパーソナリティ表現ベクトルにおける居住地の表現分を低減・除去することができるのである。 For example, even if the mapping unit 10 is trained to separate two personality expression vectors associated with users with different personalities, if the users live in different places (e.g., Hokkaido and Okinawa, respectively), the degree of separation between these two personality expression vectors may reflect the difference in place of residence. In response to this, by selecting users who live in the same place of residence (e.g., both in Hokkaido) as users associated with such two personality expression vectors, it is possible to reduce or eliminate the expression of place of residence in these two personality expression vectors.
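The pair selection described in (a) and (b) can be sketched as follows. This is only an illustration under assumptions: the function name, the dictionary and set inputs, and the strict filtering are hypothetical, whereas the description above only asks that such pairs be preferred "as far as possible."

```python
def select_training_pairs(users, related, max_pairs_per_kind=1000):
    """Pick pairs whose residence disagrees with their relationship label.

    users:   dict user_id -> residence (hypothetical attribute values)
    related: set of frozenset({id_a, id_b}) for pairs with a strong relationship
    Returns (pull_pairs, push_pairs):
      pull_pairs: related users living in different places  -> case (a)
      push_pairs: unrelated users living in the same place  -> case (b)
    """
    ids = sorted(users)
    pull_pairs, push_pairs = [], []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            is_related = frozenset((a, b)) in related
            same_place = users[a] == users[b]
            if is_related and not same_place:
                if len(pull_pairs) < max_pairs_per_kind:
                    pull_pairs.append((a, b))
            elif not is_related and same_place:
                if len(push_pairs) < max_pairs_per_kind:
                    push_pairs.append((a, b))
    return pull_pairs, push_pairs
```

Because residence then varies within the pairs that are pulled together and stays fixed within the pairs that are pushed apart, residence alone cannot explain the learned separation, which is the effect the passage above aims for.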

次いで、上記（キ）のような負の相関に係る「所定関係の強さ指標」を用いる場合を説明する。この場合、写像部１０は、
（ｃ）このような「所定関係の強さ指標」に関してより弱いものとなっているユーザ同士に係る２つのパーソナリティ表現ベクトルであって、パーソナリティとは別のユーザ属性（例えば居住地）に関して異なる若しくは遠い関係にある当該ユーザ同士に係る２つのパーソナリティ表現ベクトルについて、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより小さくなるように訓練されることも好ましい。 Next, a case where the "indicator of strength of a predetermined relationship" related to the negative correlation as in (G) above is used will be described. In this case, the mapping unit 10
(c) is also preferably trained so that, for two personality expression vectors of users whose relationship is weaker in terms of such a "predetermined relationship strength index" and who differ from, or are distant from, each other in terms of a user attribute other than personality (e.g., place of residence), the degree of separation between these two personality expression vectors becomes smaller.

また、写像部１０は、
（ｄ）このような「所定関係の強さ指標」に関してより強いものとなっているユーザ同士に係る２つのパーソナリティ表現ベクトルであって、パーソナリティとは別のユーザ属性（例えば居住地）に関して同一の若しくは近い関係にある当該ユーザ同士に係る２つのパーソナリティ表現ベクトルについて、これら２つのパーソナリティ表現ベクトルの間の離隔度合いがより大きくなるように訓練されることも好ましい。 Moreover, the mapping unit 10
(d) is also preferably trained so that, for two personality expression vectors of users whose relationship is stronger in terms of such a "predetermined relationship strength index" and who are identical or close to each other in terms of a user attribute other than personality (e.g., place of residence), the degree of separation between these two personality expression vectors becomes larger.

さらに、写像部１０は、上記（ｃ）の訓練、及び上記（ｄ）の訓練のいずれか一方又は両方を受けてもよいのである。いずれにしても、このような訓練を実施することによって、パーソナリティ表現ベクトルにおけるパーソナリティとは別のユーザ属性の表現分を低減・除去し、より信頼性の高いパーソナリティ表現ベクトルを導出可能とするのである。 Furthermore, the mapping unit 10 may undergo either or both of the training (c) and the training (d) above. In any case, by carrying out such training, the representation of user attributes other than the personality in the personality expression vector is reduced or removed, making it possible to derive a more reliable personality expression vector.

［ユーザ属性情報生成モデルの一実施形態］
図３は、本発明に係るユーザ属性情報生成モデルの一実施形態を示す模式図である。 [One embodiment of a user attribute information generation model]
FIG. 3 is a schematic diagram showing an embodiment of a user attribute information generation model according to the present invention.

図３に示した本実施形態のユーザ属性情報生成モデル２は、
（ａ）ユーザ属性推定対象のユーザにおける複数の行動ドメイン（図３では「アイテム購入」及び「動画閲覧」の２つ）に関する情報である「行動ドメイン情報」を用いて、
（ｂ）推定対象ユーザのユーザ属性を表現した情報である「ユーザ属性表現ベクトル」を生成する
機械学習モデルである。 The user attribute information generation model 2 of this embodiment shown in FIG. 3 is a machine learning model that
(a) uses "behavior domain information", which is information on multiple behavior domains (two in FIG. 3: "item purchase" and "video viewing") of the user whose user attributes are to be estimated, to
(b) generate a "user attribute expression vector", which is information expressing the user attributes of that estimation target user.

具体的に、ユーザ属性情報生成モデル２は、
（Ａ）複数の行動ドメインの行動ドメイン毎に設定されたドメイン特定回帰ニューラルネットワーク（ＲＮＮ）セルであって、当該行動ドメインにおける推定対象ユーザの「ドメイン行動情報」を受け取り、前の時点で自ら生成した隠れ状態情報である「ドメイン特定隠れ状態情報」に対し「ドメイン行動情報」を反映させて、新たな「ドメイン特定隠れ状態情報」を生成する複数のドメイン特定ＲＮＮセル（図３ではＤＳＬ(Domain Specific Layer)１セル２１及びＤＳＬ２セル２２の２つ）と、
（Ｂ）推定対象ユーザを識別する情報である「ユーザ識別情報」から、推定対象ユーザを表現する情報である「ユーザ属性表現ベクトル」を生成するユーザ表現生成部（図３ではユーザ分散表現抽出部２０ｕ）と、
（Ｃ）上記（Ａ）で生成された「ドメイン特定隠れ状態情報」と、上記（Ｂ）で生成された「ユーザ属性表現ベクトル」とを受け取り、前の時点で自ら生成した隠れ状態情報である「ドメイン非依存隠れ状態情報」に対し、「ドメイン特定隠れ状態情報」及び「ユーザ属性表現ベクトル」を反映させて、新たな「ドメイン非依存隠れ状態情報」を生成するドメイン非依存ＲＮＮセル（図３ではＤＩＬ(Domain Independent layer)２０）と
してコンピュータを機能させる機械学習モデルとなっている。 Specifically, the user attribute information generation model 2 is a machine learning model that causes a computer to function as:
(A) a plurality of domain-specific recurrent neural network (RNN) cells (two in FIG. 3: DSL (Domain Specific Layer) 1 cell 21 and DSL2 cell 22), one set for each of a plurality of behavior domains, each of which receives the "domain behavior information" of the estimation target user in its behavior domain and generates new "domain-specific hidden state information" by reflecting the "domain behavior information" in the "domain-specific hidden state information" that it generated itself at the previous time point;
(B) a user expression generation unit (in FIG. 3, the user distributed representation extraction unit 20u) that generates a "user attribute expression vector", which is information expressing the estimation target user, from "user identification information", which is information identifying the estimation target user; and
(C) a domain-independent RNN cell (in FIG. 3, DIL (Domain Independent Layer) cell 20) that receives the "domain-specific hidden state information" generated in (A) above and the "user attribute expression vector" generated in (B) above, and generates new "domain-independent hidden state information" by reflecting them in the "domain-independent hidden state information" that it generated itself at the previous time point.

ここで本実施形態において、上記（Ａ）の「行動ドメイン情報」は、推定対象ユーザの当該行動ドメインにおける（購入や閲覧といった）行動の内容を示す情報（例えば購入したアイテムや閲覧した動画の識別情報）を受け取った行動表現生成部（図３ではアイテム分散表現抽出部２１ｉや動画分散表現抽出部２２ｍ）において生成される行動表現情報となっている。したがって、「行動ドメイン情報」（行動表現情報）は、各行動ドメインにおいて行動が発生する度に生成され、その結果、全体として時系列データ群をなすものとなる。 In this embodiment, the "behavior domain information" in (A) above is behavior expression information generated by a behavior expression generation unit (in FIG. 3, the item distributed representation extraction unit 21i or the video distributed representation extraction unit 22m) that has received information indicating the content of the estimation target user's behavior (such as a purchase or a viewing) in that behavior domain (for example, identification information of a purchased item or a viewed video). The "behavior domain information" (behavior expression information) is therefore generated each time a behavior occurs in each behavior domain and, as a result, forms a time-series data group as a whole.

また、上記（Ｂ）のユーザ表現生成部（ユーザ分散表現抽出部２０ｕ）は、後に詳細に説明するが、複数のドメイン特定ＲＮＮセル（ＤＳＬ１セル２１及びＤＳＬ２セル２２）と合せて訓練されるのであり、結果的に当該訓練後は、特定の行動ドメインに偏らないドメイン非依存の表現生成演算（分散表現抽出演算）を実行する。これにより、ここで生成される「ユーザ属性表現ベクトル」は、推定対象ユーザのユーザ属性情報として把握されるものとなるのである。 The user expression generation unit (user distributed expression extraction unit 20u) in (B) above is trained together with multiple domain-specific RNN cells (DSL1 cell 21 and DSL2 cell 22), which will be described in detail later, and as a result, after the training, it executes a domain-independent expression generation calculation (distributed expression extraction calculation) that is not biased toward a specific behavioral domain. As a result, the "user attribute expression vector" generated here is understood as the user attribute information of the user to be estimated.

言い換えると、「ユーザ属性表現ベクトル」や上述した「ドメイン非依存隠れ状態情報」は、このように互いに異なる複数の又は多数の行動ドメインに係るセルからの「ドメイン特定隠れ状態情報」を受けてドメイン非依存化しているのであり、このうち特に「ユーザ属性表現ベクトル」は、推定対象ユーザにおける静的且つドメイン非依存のユーザ属性に係る情報、図１上方のユーザ属性分類グラフでいえばその第１象限に属するユーザ属性情報、と捉えることができるのである。したがって、「ユーザ属性表現ベクトル」はまさに、「パーソナリティ」を含む静的且つドメイン非依存のユーザ属性を表現したものとなっているのである。 In other words, the "user attribute representation vector" and the above-mentioned "domain-independent hidden state information" are made domain-independent by receiving "domain-specific hidden state information" from cells related to multiple or many different behavioral domains, and among these, the "user attribute representation vector" in particular can be considered as information related to static and domain-independent user attributes of the estimation target user, or, in terms of the user attribute classification graph at the top of Figure 1, user attribute information belonging to the first quadrant. Therefore, the "user attribute representation vector" is truly a representation of static and domain-independent user attributes including "personality".

さらに、この「ユーザ属性表現ベクトル」は、推定対象ユーザの「ユーザ識別情報」（例えば当該ユーザを示すone-hotベクトル）を受け取ったユーザ表現生成部（ユーザ分散表現抽出部２０ｕ）が出力する、多次元の（例えば数十～数百次元の）表現ベクトルであり、例えば多数の数値の羅列となっている。したがって、「ユーザ属性表現ベクトル」のユーザ属性に関する測定粒度（情報粒度）は、非常に高い（細かい）ものとなるのである。 Furthermore, this "user attribute expression vector" is a multidimensional (e.g., tens to hundreds of dimensions) expression vector, e.g., a list of many numerical values, output by the user expression generation unit (user distributed expression extraction unit 20u) that receives the "user identification information" of the user to be estimated (e.g., a one-hot vector indicating the user), and therefore the "user attribute expression vector" has a very high (fine) measurement granularity (information granularity) regarding the user attributes.

なお、以上に説明したドメイン特定ＲＮＮセル（ＤＳＬ１セル２１及びＤＳＬ２セル２２）、及びドメイン非依存ＲＮＮセル（ＤＩＬセル２０）は、ＧＲＵ（Gated Recurrent Unit）や、ＬＳＴＭ（Long-Short Term Memory）といったような公知のＲＮＮで構成されたものとすることができる。 The domain-specific RNN cells (DSL1 cell 21 and DSL2 cell 22) and the domain-independent RNN cell (DIL cell 20) described above can be configured with a well-known RNN such as a GRU (Gated Recurrent Unit) or an LSTM (Long Short-Term Memory).

また、ユーザ属性情報生成モデル２において、ドメイン特定ＲＮＮセルの数、すなわち取り扱う行動ドメインの数は、当然、図３に示したような２つに限定されるものではなく、３つ以上とすることも可能である。また、取り扱う行動ドメインも、図３に示したアイテム購入や動画閲覧に限定されるものではなく、行動主体のユーザ属性に依存する又はその影響を受け得る行動に係るドメインであれば種々様々なものが、行動ドメインとして採用可能である。例えば、行動ドメイン「広告クリック」の時系列データが取得される場合に、「広告クリック」に係るドメイン行動情報を取り込むドメイン特定ＤＮＮセルを追加して、ユーザ属性情報生成モデル２を構成してもよい。 In addition, in the user attribute information generation model 2, the number of domain-specific RNN cells, i.e., the number of behavioral domains handled, is naturally not limited to two as shown in FIG. 3, and can be three or more. In addition, the behavioral domains handled are not limited to the item purchases and video viewing shown in FIG. 3, and various domains related to behavior that depend on or can be influenced by the user attributes of the behavior subject can be adopted as behavioral domains. For example, when time-series data of the behavioral domain "ad click" is acquired, a domain-specific DNN cell that incorporates domain behavior information related to "ad click" may be added to configure the user attribute information generation model 2.

以下、本実施形態のユーザ属性情報生成モデル２の構成について、より詳細に説明を行う。同じく図３によれば、ユーザ属性情報生成モデル２は、
（ア）行動ドメイン「アイテム購入」に係るドメイン特定ＲＮＮセルとしてのＤＳＬ１セル２１と、ユーザ分散表現抽出部２１ｕと、アイテム分散表現抽出部２１ｉと、出力層２１ｏと、
（イ）行動ドメイン「動画閲覧」に係るドメイン特定ＲＮＮセルとしてのＤＳＬ２セル２２と、ユーザ分散表現抽出部２２ｕと、動画分散表現抽出部２２ｍと、出力層２２ｏと、
（ウ）ドメイン非依存ＲＮＮセルとしてのＤＩＬセル２０と、ユーザ表現生成部としてのユーザ分散表現抽出部２０ｕと、出力層２０ｏと
を、コンピュータ（に搭載されたプログラム）によって具現される機能構成部として備えている。 The configuration of the user attribute information generation model 2 of this embodiment will be described in more detail below. As shown in FIG. 3, the user attribute information generation model 2 includes:
(A) a DSL1 cell 21 as the domain-specific RNN cell for the behavior domain "item purchase", a user distributed representation extraction unit 21u, an item distributed representation extraction unit 21i, and an output layer 21o;
(B) a DSL2 cell 22 as the domain-specific RNN cell for the behavior domain "video viewing", a user distributed representation extraction unit 22u, a video distributed representation extraction unit 22m, and an output layer 22o; and
(C) a DIL cell 20 as the domain-independent RNN cell, a user distributed representation extraction unit 20u as the user expression generation unit, and an output layer 20o,
as functional components embodied by a computer (a program installed in the computer).

ここで図３には、上述した（ア）～（ウ）の機能構成部（図３の左端側の機能ブロック群）が実行する処理を、時間経過の向きが右向きとなっている時間軸上で展開した様子が示されている。なお、上記（ア）～（ウ）の各々について設定された計３つの時間軸は、それぞれ独自の値をとる時点についての時間軸となっている。 FIG. 3 shows the processing executed by the functional components (A) to (C) described above (the functional blocks at the left end of FIG. 3), unrolled along time axes on which time advances to the right. Note that the three time axes, one for each of (A) to (C) above, each have their own independent time-point values.

また、各時間軸における時点の表記であるが、例えばＤＳＬ２セル２２に係る時点(t2+1)は、ＤＳＬ２セル２２が時点t2でドメイン行動情報（動画embedding vector）を受け取った後、次にドメイン行動情報（動画embedding vector）を受け取った時点を意味するものとする。ここでその次に受け取った時点は当然、(t2+2)となる。さらに、時点(t2+1)から見て、時点t2は「前の時点」となるのである。また、この時点(t2+1)で処理を行うＤＳＬ２セル２２を、以後行う説明の便宜上、ＤＳＬ２セル２２(t2+1)と称することにする。またさらに、ＤＳＬ１セル２１やＤＩＬセル２０についても以後、同様の処理時点の表記、及び処理時点を含む表記を行うこととする。 As for the notation of time points on each time axis, for example, time point (t2+1) for the DSL2 cell 22 means the time point at which the DSL2 cell 22 next receives domain behavior information (video embedding vector) after receiving domain behavior information (video embedding vector) at time point t2. Here, the next reception time point is naturally (t2+2). Furthermore, from the perspective of time point (t2+1), time point t2 is the "previous time point." Furthermore, for the convenience of the following explanation, the DSL2 cell 22 that performs processing at this time point (t2+1) will be referred to as the DSL2 cell 22(t2+1). Furthermore, similar notations of processing time points and notations including the processing time will be used hereafter for the DSL1 cell 21 and the DIL cell 20.

以下、上述した各機能構成部について具体的に説明を行う。同じく図３において、時点t1におけるアイテム分散表現抽出部２１ｉであるアイテム分散表現抽出部２１ｉ(t1)は、時点t1において「（推定対象ユーザである）ユーザUserAによってアイテムitem1が購入された」とのイベントを受けて、
（ａ）アイテムitem1のアイテム識別情報であるone-hotベクトルi1^(t1)を受け取り、
（ｂ）受け取ったone-hotベクトルi1^(t1)に対し、アイテム分散表現抽出演算子としての行列W_i ^DSL1を作用させて（積算して）、アイテムitem1のアイテム表現ベクトル（ドメイン行動情報）r_i1^(t1)を生成する。 Each of the functional components described above will now be explained specifically. Also in FIG. 3, the item distributed representation extraction unit 21i(t1), i.e., the item distributed representation extraction unit 21i at time t1, responds to the event "item item1 was purchased by user UserA (the estimation target user)" at time t1 as follows:
(a) it receives the one-hot vector i1^(t1), which is the item identification information of item item1, and
(b) it applies the matrix W_i^DSL1, serving as the item distributed representation extraction operator, to (i.e., multiplies it by) the received one-hot vector i1^(t1) to generate the item representation vector (domain behavior information) r_i1^(t1) of item item1.
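Steps (a) and (b) amount to an embedding lookup: multiplying a matrix by a one-hot vector selects one column of that matrix. The following minimal sketch illustrates this; the matrix values, the catalogue size, and the index assigned to item1 are hypothetical, and the column-wise layout of W_i^DSL1 is an assumption.

```python
def one_hot(index, size):
    """Return a one-hot list with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if k == index else 0.0 for k in range(size)]

def apply_matrix(matrix, vector):
    """Multiply a (dim x vocab) matrix by a vocab-length vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Hypothetical 2-dimensional representations for a catalogue of 3 items.
W_i_DSL1 = [
    [0.1, 0.4, -0.3],
    [0.7, -0.2, 0.5],
]

item1 = 1  # hypothetical catalogue index of item1
r_i1 = apply_matrix(W_i_DSL1, one_hot(item1, 3))

# The product equals the column of W_i_DSL1 that belongs to item1.
column = [row[item1] for row in W_i_DSL1]
```

Training the matrix therefore trains one representation vector per item, and the same reading applies to the user matrices W_u^DSL1, W_u^DSL2, and W_u^DIL.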

また、時点t1におけるユーザ分散表現抽出部２１ｕであるユーザ分散表現抽出部２１ｕ(t1)は、上記（ａ）のone-hotベクトルi1^(t1)の受け取りに合わせ、
（ｃ）ユーザUserAのユーザ識別情報であるone-hotベクトルua^(t1)を受け取り、
（ｄ）受け取ったone-hotベクトルua^(t1)に対し、ユーザ分散表現抽出演算子としての行列W_u ^DSL1を作用させて（積算して）、ユーザUserAのユーザ属性表現ベクトルr_ua^(t1)を生成する。 In addition, the user distributed representation extraction unit 21u(t1), i.e., the user distributed representation extraction unit 21u at time t1, in step with the receipt of the one-hot vector i1^(t1) in (a) above,
(c) receives the one-hot vector ua^(t1), which is the user identification information of user UserA, and
(d) applies the matrix W_u^DSL1, serving as the user distributed representation extraction operator, to (i.e., multiplies it by) the received one-hot vector ua^(t1) to generate the user attribute expression vector r_ua^(t1) of user UserA.

同じく図３において、時点t1におけるＤＳＬ１セル２１であるＤＳＬ１セル２１(t1)は、
（ｅ）上記（ｂ）で生成されたアイテム表現ベクトル（ドメイン行動情報）r_i1^(t1)と、上記（ｄ）で生成されたユーザ属性表現ベクトルr_ua^(t1)とを受け取り、さらに、
（ｆ）（この後詳細に説明するが、）上記（ｅ）のドメイン行動情報（r_i1^(t1)）を受け取る時点t1からみて最近に（図３では時点tIに）生成されたドメイン非依存隠れ状態情報h^DIL,(tI)も受け取り、
（ｇ）前の時点(t1-1)で自ら生成したドメイン特定隠れ状態情報h^DSL1,(t1-1)に対し、受け取ったアイテム表現ベクトル（ドメイン行動情報）r_i1^(t1)、ユーザ属性表現ベクトルr_ua^(t1)及びドメイン非依存隠れ状態情報h^DIL,(tI)を反映させて、新たなドメイン特定隠れ状態情報h^DSL1,(t1)を生成する。 Also in FIG. 3, the DSL1 cell 21(t1), i.e., the DSL1 cell 21 at time t1,
(e) receives the item representation vector (domain behavior information) r_i1^(t1) generated in (b) above and the user attribute expression vector r_ua^(t1) generated in (d) above,
(f) also receives (as described in detail later) the domain-independent hidden state information h^DIL,(tI) generated most recently (at time tI in FIG. 3) as seen from time t1, at which the domain behavior information r_i1^(t1) of (e) above is received, and
(g) generates new domain-specific hidden state information h^DSL1,(t1) by reflecting the received item representation vector (domain behavior information) r_i1^(t1), user attribute expression vector r_ua^(t1), and domain-independent hidden state information h^DIL,(tI) in the domain-specific hidden state information h^DSL1,(t1-1) that it generated itself at the previous time point (t1-1).
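Steps (e) to (g) can be sketched as a single recurrent update. The sketch below uses a plain tanh RNN cell for brevity, which is a simplification: the description names GRU and LSTM as possible cell types. The function names, the concatenation of the three inputs, and the parameters W, U, and b are assumptions introduced for illustration.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def dsl_step(h_prev, r_item, r_user, h_dil, W, U, b):
    """One vanilla-RNN update of a domain-specific cell (hypothetical form).

    h_prev : own hidden state from the previous time point, h^DSL1,(t1-1)
    r_item : domain behavior information, r_i1^(t1)
    r_user : user attribute expression vector, r_ua^(t1)
    h_dil  : most recent domain-independent hidden state, h^DIL,(tI)
    W, U, b: hypothetical input weights, recurrent weights, and bias
    """
    x = r_item + r_user + h_dil  # list concatenation of the three inputs
    wx = matvec(W, x)
    uh = matvec(U, h_prev)
    return [math.tanh(a + c + d) for a, c, d in zip(wx, uh, b)]
```

The same shape of update would apply to the DSL2 cell 22 with video representation vectors, and to the DIL cell 20 with a user attribute expression vector and a domain-specific hidden state as inputs.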

以上、アイテム分散表現抽出部２１ｉ、ユーザ分散表現抽出部２１ｕ、及びＤＳＬ１セル２１における時点t1での処理について説明を行ったが、勿論他の時点での処理、例えば次の時点(t1+1)に係るアイテム分散表現抽出部２１ｉ(t1+1)、ユーザ分散表現抽出部２１ｕ(t1+1)、及びＤＳＬ１セル２１(t1+1)での処理、についても上記（ａ）～（ｇ）と同様の処理が実行される。 The processing at time t1 in the item distributed representation extraction unit 21i, the user distributed representation extraction unit 21u, and the DSL1 cell 21 has been described above. Processing similar to (a) to (g) above is of course also executed at other time points, for example in the item distributed representation extraction unit 21i(t1+1), the user distributed representation extraction unit 21u(t1+1), and the DSL1 cell 21(t1+1) at the next time point (t1+1).

さらに、他の行動ドメイン（動画閲覧）に係る動画分散表現抽出部２２ｍ、ユーザ分散表現抽出部２２ｕ、及びＤＳＬ２セル２２における各時点での処理、例えば時点t2に係る動画分散表現抽出部２２ｍ(t2)、ユーザ分散表現抽出部２２ｕ(t2)、及びＤＳＬ２セル２２(t2)での処理、についても、上記（ａ）～（ｇ）と同様の処理が実行される。ただし、この場合、例えば時点t2での動画分散表現抽出部２２ｍ(t2)は、
（ｂ’）受け取った（ユーザUserAによって閲覧された動画Mov1を示す）one-hotベクトルm1^(t2)に対し、動画分散表現抽出演算子としての行列W_m ^DSL2を作用させて（積算して）、動画Mov1の動画表現ベクトル（ドメイン行動情報）r_m1^(t2)を生成する
のである。また、ＤＳＬ２セル２２(t2)は、このドメイン行動情報r_m1^(t2)を取り入れて処理を行うことになる。 Furthermore, processing similar to (a) to (g) above is also executed at each time point in the video distributed representation extraction unit 22m, the user distributed representation extraction unit 22u, and the DSL2 cell 22 for the other behavior domain (video viewing), for example in the video distributed representation extraction unit 22m(t2), the user distributed representation extraction unit 22u(t2), and the DSL2 cell 22(t2) at time t2. In this case, however, the video distributed representation extraction unit 22m(t2) at time t2, for example,
(b') applies the matrix W_m^DSL2, serving as the video distributed representation extraction operator, to (i.e., multiplies it by) the received one-hot vector m1^(t2) (indicating the video Mov1 viewed by user UserA) to generate the video representation vector (domain behavior information) r_m1^(t2) of the video Mov1. The DSL2 cell 22(t2) then takes in this domain behavior information r_m1^(t2) and performs its processing.

同じく図３において、時点tIにおけるユーザ分散表現抽出部（ユーザ表現生成部）２０ｕであるユーザ分散表現抽出部２０ｕ(tI)は、
（ｈ）ユーザUserAのユーザ識別情報であるone-hotベクトルua^(tI)を受け取り、
（ｉ）受け取ったone-hotベクトルua^(tI)に対し、ユーザ分散表現抽出演算子としての行列W_u ^DILを作用させて（積算して）、ユーザUserAのユーザ属性表現ベクトルr_ua^(tI)を生成する。 Also in FIG. 3, the user distributed representation extraction unit 20u(tI), i.e., the user distributed representation extraction unit (user expression generation unit) 20u at time tI,
(h) receives the one-hot vector ua^(tI), which is the user identification information of user UserA, and
(i) applies the matrix W_u^DIL, serving as the user distributed representation extraction operator, to (i.e., multiplies it by) the received one-hot vector ua^(tI) to generate the user attribute expression vector r_ua^(tI) of user UserA.

また、時点tIにおけるＤＩＬセル２０であるＤＩＬセル２０(tI)は、
（ｊ）上記（ｉ）で生成されたユーザ属性表現ベクトルr_ua^(tI)を受け取り、さらに、
（ｋ）時点tIからみて最近に（図３では時点t2に）生成されたドメイン特定隠れ状態情報h^DSL2,(t2)も受け取り、
（ｌ）前の時点(tI-1)で自ら生成したドメイン非依存隠れ状態情報h^DIL,(tI-1)に対し、受け取ったユーザ属性表現ベクトルr_ua^(tI)及びドメイン特定隠れ状態情報h^DSL2,(t2)を反映させて、新たなドメイン非依存隠れ状態情報h^DIL,(tI)を生成する。 Furthermore, the DIL cell 20(tI), i.e., the DIL cell 20 at time tI,
(j) receives the user attribute expression vector r_ua^(tI) generated in (i) above,
(k) also receives the domain-specific hidden state information h^DSL2,(t2) generated most recently (at time t2 in FIG. 3) as seen from time tI, and
(l) generates new domain-independent hidden state information h^DIL,(tI) by reflecting the received user attribute expression vector r_ua^(tI) and domain-specific hidden state information h^DSL2,(t2) in the domain-independent hidden state information h^DIL,(tI-1) that it generated itself at the previous time point (tI-1).

なお、以上説明した処理（ｈ）～（ｌ）に係る時点tIは、例えば定期的に（所定時間経過毎に）設定された時点（の１つ）とすることも可能ではあるが、本実施形態ではより好適な設定として、いずれかのドメイン特定ＲＮＮセル（図３ではＤＳＬ１セル２１及びＤＳＬ２セル２２のいずれか）においてドメイン特定隠れ状態情報が生成された直後の時点となっている。 Note that the time tI related to the above-described processes (h) to (l) can be, for example, a time (one of) that is set periodically (every time a predetermined period of time has elapsed), but in this embodiment, a more preferable setting is the time immediately after domain-specific hidden state information is generated in one of the domain-specific RNN cells (in FIG. 3, either DSL1 cell 21 or DSL2 cell 22).

言い換えると、例えば以上に説明した場合においては、時点t2においてＤＳＬ２セル２２(t2)がドメイン特定隠れ状態情報h^DSL2,(t2)を生成したのを受けて、その直後（すなわち時点tI）に、上記処理（ｈ）～（ｌ）が発動するのである。またさらに、同じく図３に示したように、時点t1においてＤＳＬ１セル２１(t1)がドメイン特定隠れ状態情報h^DSL1,(t1)を生成したのを受け、その直後（時点tI+1）に、生成された直後のドメイン特定隠れ状態情報h^DSL1,(t1)を受け取って、上記処理（ｈ）～（ｌ）と同様の処理が実行されるのである。 In other words, in the case described above, for example, once the DSL2 cell 22(t2) has generated the domain-specific hidden state information h^DSL2,(t2) at time t2, the above processes (h) to (l) are triggered immediately afterwards (i.e., at time tI). Furthermore, as also shown in FIG. 3, once the DSL1 cell 21(t1) has generated the domain-specific hidden state information h^DSL1,(t1) at time t1, the DIL cell 20 receives this newly generated domain-specific hidden state information h^DSL1,(t1) immediately afterwards (at time tI+1), and processing similar to the above processes (h) to (l) is executed.

このように本実施形態では、ＤＩＬセル２０は、いずれかのドメイン特定ＲＮＮセルでドメイン特定隠れ状態情報が生成される度に、当該ドメイン特定隠れ状態情報を受け取って、「ドメイン非依存隠れ状態情報」を生成する。 In this manner, in this embodiment, each time domain-specific hidden state information is generated in any domain-specific RNN cell, the DIL cell 20 receives the domain-specific hidden state information and generates "domain-independent hidden state information."

ここで図３において、推定対象であるユーザUserAは、「(時点t2で)動画Mov1を閲覧」→「(時点t1で)アイテムitem1を購入」→「(時点t2+1で)動画Mov2を閲覧」をこの順で行っている。ＤＩＬセル２０は、これらの行動に係る表現ベクトル（ドメイン行動情報）を順次受け取ったドメイン特定ＲＮＮセルで生成されたドメイン特定隠れ状態情報を順次受け取り、これにより、これらのドメイン特定隠れ状態情報の生成（受け取り）順序や前後関係の情報も加味されており、特定の行動ドメインでの行動内容だけに依存することのない（ドメイン非依存化した）「ドメイン非依存隠れ状態」を生成することが可能となるのである。 In FIG. 3, the estimation target user UserA performs the following actions in this order: "views video Mov1 (at time t2)" → "purchases item item1 (at time t1)" → "views video Mov2 (at time t2+1)". The domain-specific RNN cells receive the representation vectors (domain behavior information) for these actions one after another, and the DIL cell 20 in turn receives the domain-specific hidden state information that those cells generate, one after another. The DIL cell 20 can thereby generate a (domain-independent) "domain-independent hidden state" that also takes into account the order in which the pieces of domain-specific hidden state information were generated (received) and their context, and that does not depend solely on the content of actions in any one behavior domain.
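The relation between the per-domain time axes and the time axis of the DIL cell can be sketched as a small scheduler. The function name, the tuple layout of an event, and the wall-clock timestamps are hypothetical; the sketch only shows the bookkeeping of time indices, with the DIL counter advancing once after every domain-specific update as in this embodiment.

```python
def schedule(events):
    """Assign per-domain and DIL time indices to a merged event stream.

    events: list of (timestamp, domain, payload) tuples, e.g.
            (10, "video", "Mov1"). Timestamps are hypothetical wall-clock
            values used only to order the events.
    Returns a list of (domain, domain_step, dil_step, payload).
    """
    domain_clock = {}
    dil_clock = 0
    trace = []
    for _, domain, payload in sorted(events, key=lambda e: e[0]):
        # Each domain-specific cell advances only its own time axis.
        domain_clock[domain] = domain_clock.get(domain, 0) + 1
        # The DIL cell is updated right after every domain-specific update.
        dil_clock += 1
        trace.append((domain, domain_clock[domain], dil_clock, payload))
    return trace
```

For the sequence in FIG. 3 (Mov1 viewed, item1 purchased, Mov2 viewed), the video domain advances twice and the purchase domain once, while the DIL index advances three times and so preserves the cross-domain order.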

さらに本実施形態では、上記（ｆ）で述べたように、このような特徴的な「ドメイン非依存隠れ状態」が適宜、ドメイン特定ＲＮＮセル（ＤＳＬ１セル２１やＤＳＬ２セル２２）への入力として用いられる。その結果、後にＤＩＬセル２０が受け取るドメイン特定隠れ状態情報も、上述したような特徴が反映されたものとなり、その結果、「ドメイン非依存隠れ状態」のドメイン非依存化が促進する。言い換えると、ＤＩＬセル２０はさらに、特定の行動ドメインに依存しない、推定対象ユーザの全体的な属性を反映した情報処理を実行することができるようになるのである。 Furthermore, in this embodiment, as described in (f) above, such characteristic "domain-independent hidden states" are appropriately used as inputs to domain-specific RNN cells (DSL1 cell 21 and DSL2 cell 22). As a result, the domain-specific hidden state information later received by DIL cell 20 also reflects the above-mentioned characteristics, which promotes the domain independence of the "domain-independent hidden states." In other words, DIL cell 20 can further perform information processing that reflects the overall attributes of the estimated target user, independent of a specific behavioral domain.

次いで以下、ユーザ属性情報生成モデル２の訓練（学習）について説明を行う。 Next, we will explain the training (learning) of user attribute information generation model 2.

最初に、ＤＳＬ１セル２１、アイテム分散表現抽出部２１ｉ及びユーザ分散表現抽出部２１ｕの訓練においては、多数のユーザについて各ユーザが購入したアイテムの識別情報であるone-hotベクトルの時系列データを準備し、これを順次アイテム分散表現抽出部２１ｉへ入力する。また合わせて、当該ユーザのユーザ識別情報であるone-hotベクトルをユーザ分散表現抽出部２１ｕへ入力する。これにより、ＤＳＬ１セル２１は、各時点において前の時点でのドメイン特定隠れ状態情報を更新して当該時点でのドメイン特定隠れ状態情報を生成し、これを受けて出力層２１ｏが、購入アイテムの予測結果（次の時点に当該ユーザが購入すると予測されるアイテムの識別情報、又は各候補アイテムの購入される確率）を出力する（本実施形態において、出力層２１ｏはそのような出力を行うように設定されている）。 First, in training the DSL1 cell 21, the item distributed representation extraction unit 21i, and the user distributed representation extraction unit 21u, time series data of one-hot vectors, which are identification information of items purchased by each user for a large number of users, are prepared and input sequentially to the item distributed representation extraction unit 21i. In addition, the one-hot vector, which is the user identification information of the user, is input to the user distributed representation extraction unit 21u. As a result, the DSL1 cell 21 updates the domain-specific hidden state information at the previous time point at each time point to generate domain-specific hidden state information at the current time point, and in response, the output layer 21o outputs a prediction result of the purchased item (identification information of the item predicted to be purchased by the user at the next time point, or the probability of each candidate item being purchased) (in this embodiment, the output layer 21o is configured to perform such output).

ここで、出力されたアイテムの予測結果と正解のアイテムの識別情報（実際に購入されたアイテムのone-hotベクトル）との差異を損失とし、公知のＲＮＮの誤差逆伝播法を用いて、ＤＳＬ１セル２１内の各パラメータ（重み行列やバイアスベクトルを決めるパラメータ等）や、行列W_i ^DSL1及び行列W_u ^DSL1を決めるパラメータを調整する訓練を行うのである。 Here, the difference between the output item prediction result and the identification information of the correct item (the one-hot vector of the item actually purchased) is taken as the loss, and training is performed using the well-known backpropagation method for RNNs to adjust the parameters in the DSL1 cell 21 (such as the parameters that determine the weight matrices and bias vectors) and the parameters that determine the matrices W_i^DSL1 and W_u^DSL1.
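One common way to measure the difference between a predicted probability distribution over candidate items and a one-hot correct answer is softmax cross-entropy. The sketch below assumes that choice; the description itself only speaks of a "difference" and does not name a specific loss function, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of output-layer scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_one_hot):
    """Negative log-probability assigned to the item actually purchased."""
    return -sum(t * math.log(p)
                for t, p in zip(target_one_hot, probs) if t > 0.0)
```

The loss is small when the output layer 21o assigns a high probability to the purchased item and grows as that probability falls, which gives the gradient signal that backpropagation then distributes over the cell parameters and the two matrices.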

なお、ＤＳＬ２セル２２、動画分散表現抽出部２２ｍ及びユーザ分散表現抽出部２２ｕの訓練についても、上記のＤＳＬ１セル２１等の訓練と同様にして実施することができる。具体的にこの場合、出力層２２ｏから出力された閲覧動画の予測結果（次の時点に当該ユーザが閲覧すると予測される動画の識別情報、又は各候補動画の閲覧される確率）と正解の動画の識別情報（実際に閲覧された動画のone-hotベクトル）との差異を損失とし、公知のＲＮＮの誤差逆伝播法を用いて、ＤＳＬ２セル２２内の各パラメータ（重み行列やバイアスベクトルを決めるパラメータ等）や、行列W_m ^DSL2及び行列W_u ^DSL2を決めるパラメータを調整する訓練を行うのである。さらに、３つ目又はそれ以降のドメイン特定ＲＮＮセル系が設定されている場合も、同様にして訓練することが可能である。 Training of the DSL2 cell 22, the video distributed expression extraction unit 22m, and the user distributed expression extraction unit 22u can be performed in the same manner as the training of the DSL1 cell 21, etc. Specifically, in this case, the difference between the prediction result of the viewed video output from the output layer 22o (identification information of the video predicted to be viewed by the user at the next time point, or the probability of viewing each candidate video) and the identification information of the correct video (one-hot vector of the video actually viewed) is treated as a loss, and training is performed to adjust each parameter in the DSL2 cell 22 (parameters determining the weight matrix and bias vector, etc.) and parameters determining the matrix W _m ^DSL2 and the matrix W _u ^DSL2 using a known RNN backpropagation method. Furthermore, training can be performed in the same manner when a third or subsequent domain-specific RNN cell system is set.

次いで、ＤＩＬセル２０及びユーザ分散表現抽出部２０ｕの訓練においては、各ドメイン特定ＲＮＮセル（ＤＳＬ１セル２１及びＤＳＬ２セル２２の各々）における上述した損失を、各ドメイン特定ＲＮＮセルとＤＩＬセル２０との間のリンクを通して、ＤＩＬセル２０及びユーザ分散表現抽出部２０ｕへ逆伝播させて訓練を行う。例えば、図３において、ＤＳＬ１セル２１での時点t1までの入力に基づく購入アイテムの予測結果と正解アイテムの識別情報との差異である損失は、ドメイン非依存隠れ状態情報h^DIL,(tI)の伝達されるリンクを通して逆伝播し、ＤＩＬセル２０内の各パラメータ（重み行列やバイアスベクトルを決めるパラメータ等）や、ユーザ分散表現抽出部２０ｕの行列W_u ^DILを決めるパラメータの訓練に用いられる。 Next, the DIL cell 20 and the user distributed representation extraction unit 20u are trained by back-propagating the above-mentioned loss of each domain-specific RNN cell (each of the DSL1 cell 21 and the DSL2 cell 22) to the DIL cell 20 and the user distributed representation extraction unit 20u through the link between that domain-specific RNN cell and the DIL cell 20. For example, in FIG. 3, the loss that is the difference between the purchase item prediction based on the inputs to the DSL1 cell 21 up to time t1 and the identification information of the correct item is back-propagated through the link that carries the domain-independent hidden state information h^DIL,(tI), and is used to train the parameters in the DIL cell 20 (such as the parameters that determine the weight matrices and bias vectors) and the parameters that determine the matrix W_u^DIL of the user distributed representation extraction unit 20u.

In other words, in this embodiment the DIL cell 20 and the user distributed representation extraction unit 20u are trained solely by error backpropagation from the domain-specific RNN cells (the DSL1 cell 21 and the DSL2 cell 22), without using any teacher data of their own for the domain-independent RNN cell.
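The idea that the domain-independent side receives its training signal only through the links from the domain-specific losses can be sketched with a deliberately tiny scalar model. Everything here is an illustrative assumption rather than the embodiment's architecture: a single DIL weight `w_dil`, a tanh hidden state, two hypothetical domains each with a fixed weight, and a squared-error loss.

```python
import math

def total_dsl_loss(w_dil, user_input, domains):
    """Sum of the losses measured at the domain-specific (DSL) outputs only."""
    h_dil = math.tanh(w_dil * user_input)  # domain-independent hidden state
    loss = 0.0
    for w_dsl, domain_input, target in domains:
        pred = w_dsl * (domain_input + h_dil)  # DSL side reads h_dil over the link
        loss += (pred - target) ** 2
    return loss

def dil_gradient(w_dil, user_input, domains):
    """Gradient of the DSL losses with respect to the DIL weight (chain rule)."""
    h_dil = math.tanh(w_dil * user_input)
    dh_dw = (1.0 - h_dil ** 2) * user_input
    grad = 0.0
    for w_dsl, domain_input, target in domains:
        pred = w_dsl * (domain_input + h_dil)
        # loss -> prediction -> h_dil (the link) -> w_dil
        grad += 2.0 * (pred - target) * w_dsl * dh_dw
    return grad

# Hypothetical (DSL weight, domain input, target) triples for two domains.
domains = [(0.8, 0.5, 1.0), (-0.4, 0.2, 0.0)]
initial_w, user_input = 0.3, 1.0

# The DIL weight has no loss of its own; it is updated only with DSL gradients.
w_dil = initial_w
for _ in range(200):
    w_dil -= 0.1 * dil_gradient(w_dil, user_input, domains)
```

Because every domain contributes a gradient to the same `w_dil`, the weight settles on a value that serves all domains at once, which is the intuition behind the domain-independent representation.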

The DIL cell 20 and the user distributed representation extraction unit 20u trained as described above become domain-independent through the process of receiving domain-specific hidden state information from the plurality of domain-specific RNN cells (the DSL1 cell 21 and the DSL2 cell 22) as appropriate and providing domain-independent hidden state information to each domain-specific RNN cell as appropriate. In particular, the user distributed representation extraction unit 20u (that is, its matrix W_u^DIL) comes to receive the identification information of a user and output information on that user's static and domain-independent user attributes, which encompass the user's personality.

[Another embodiment of the user attribute information generation model]
FIG. 4 is a schematic diagram showing another embodiment of the user attribute information generation model according to the present invention.

As shown in FIG. 4, the user attribute information generation model 3 of this embodiment uses matrix factorization to generate a user attribute representation vector for the user to be estimated and outputs it to the target attribute representation generation model 1'. Like the target attribute representation generation model 1, the target attribute representation generation model 1' includes a mapping unit 10' trained using, for example, a blood relationship index. It applies the mapping M' to the received user attribute representation vector to generate the personality representation vector of that user.

The matrix factorization method used here to generate the user attribute representation vector is a well-known technique. Within user representation learning (URL), which is being studied intensively in the field of product and service recommendation, it is a representative example of Static URL, the family of techniques that use non-time-series behavioral data.

Specifically, in this embodiment the user attribute information generation model 3 follows the matrix factorization method and:
(a) generates, from an item purchase history recording which of the various (n types of) items each of a large number (m) of users has purchased, an m × n "item purchase matrix" whose elements indicate whether the user of each row purchased the item of each column;
(b) decomposes the generated "item purchase matrix", by dimensionality reduction, into an m × k "user matrix" and a k × n "item matrix" (where 0 < k < m); and
(c) outputs, from the "user matrix" obtained by the decomposition, the vector formed by the elements of each row as the user attribute representation vector of the user corresponding to that row.
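Steps (a) to (c) can be sketched as follows. The text does not say which factorization algorithm is used, so this sketch assumes a plain stochastic-gradient factorization with L2 regularization; the purchase values, the hyperparameters, and the function names are hypothetical. The 4 × 5 matrix and k = 3 mirror the sizes used in FIG. 4.

```python
import random

def factorize(purchases, k, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Approximate an m x n matrix as (m x k user matrix) @ (k x n item matrix)."""
    rng = random.Random(seed)
    m, n = len(purchases), len(purchases[0])
    users = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(m)]
    items = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(k)]
    for _ in range(epochs):
        for i in range(m):
            for j in range(n):
                pred = sum(users[i][f] * items[f][j] for f in range(k))
                err = purchases[i][j] - pred
                for f in range(k):
                    u, v = users[i][f], items[f][j]
                    users[i][f] += lr * (err * v - reg * u)
                    items[f][j] += lr * (err * u - reg * v)
    return users, items

def reconstruction_error(purchases, users, items):
    """Squared error between the purchase matrix and its low-rank reconstruction."""
    k = len(items)
    return sum(
        (purchases[i][j] - sum(users[i][f] * items[f][j] for f in range(k))) ** 2
        for i in range(len(purchases))
        for j in range(len(purchases[0]))
    )

# (a) Hypothetical item purchase matrix: 4 users (rows) x 5 items (columns).
purchases = [
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
]
# (b) Decompose into a 4 x 3 user matrix and a 3 x 5 item matrix.
user_matrix, item_matrix = factorize(purchases, k=3)
# (c) The first row is the user attribute representation vector of user p1.
r_p1 = user_matrix[0]
```

Each row of `user_matrix` is a k-dimensional vector, so `r_p1` plays the role of the vector that would be handed to the mapping unit.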

FIG. 4 shows a 4 × 5 "item purchase matrix" in which the numbers of users and items are kept extremely small for ease of understanding. There, the vector formed by the elements of the first row of the 4 × 3 "user matrix" is output as the user attribute representation vector r_p1 of user p1, who corresponds to that first row.

Owing to the nature of the matrix factorization method (a Static URL technique), the user attribute representation vector r_p1 of user p1 output from the user attribute information generation model 3 has not undergone any separation or extraction of user attributes into dynamic versus static, or into domain-dependent versus domain-independent. In other words, it is a representation vector in which the grouping into the first to fourth quadrants of the user attribute classification graph in the upper part of FIG. 1 has not been made.

The target attribute representation generation model 1' receives such a user attribute representation vector r_p1 and applies the mapping M' to it. It can thereby generate the personality representation vector M'(r_p1), which is, so to speak, the result of extracting (or bringing to light) the personality component of user p1 from the user attribute representation vector r_p1.

In this embodiment, the user attribute information generation model 3 generates the user attribute representation using the "item purchase matrix" for the behavioral domain "item purchase". It is of course also possible to use a matrix for another behavioral domain, such as "video viewing" or "advertisement clicking".

Naturally, the user attribute representation vector input to the target attribute representation generation model according to the present invention is also not limited to one output from a model that performs matrix factorization as described above or from the user attribute information generation model 2 (FIG. 3) described earlier. For example, given a user attribute representation vector from any "URL-based user attribute information generation model" capable of outputting information that encompasses the first quadrant of the user attribute classification graph in the upper part of FIG. 1, the target attribute representation generation model that receives it can generate a personality representation vector.

[Personality estimation device, program, and method]
Returning to FIG. 1, the following describes the personality estimation device 9, which estimates the personality of an estimation target user and is equipped with the personality estimation model 8, a model that combines the user attribute information generation model 2 and the target attribute representation generation model 1 described in detail above.

Specifically, in the functional block diagram in the lower part of FIG. 1, the input unit 91 of the personality estimation device 9 has a communication function. It acquires time-series information on the behavior of a large number of users in each behavioral domain from, for example, externally installed management servers for the behavioral domains (such as a web shopping management server or a video distribution management server) and outputs it to the training unit 92. It also acquires blood relationship information on these users from, for example, an externally installed questionnaire survey result management server and outputs it to the training unit 92. In addition, the input unit 91 receives information on the estimation target user and outputs it to the personality estimation unit 93.

The training unit 92 generates training data from the received time-series information on the behavior of the many users in each behavioral domain and uses it to train the user attribute information generation model 2 portion of the personality estimation model 8. It also generates training data from the received blood relationship information on these users and the user attribute representation vectors output from the user attribute information generation model 2 portion, and uses it to train the target attribute representation generation model 1 portion of the personality estimation model 8.

The personality estimation unit 93 generates the user identification information (a one-hot vector) of the estimation target user from the received information on that user and inputs it to the trained personality estimation model 8 (specifically, its user attribute information generation model 2 portion). It then obtains the personality representation vector of the estimation target user from the personality estimation model 8 (specifically, its target attribute representation generation model 1 portion). As already explained, this personality representation vector can be regarded as a distributed representation of the estimation target user's personality and constitutes personality information with fine measurement granularity and high reliability.
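The estimation flow described above can be sketched as two matrix-vector products. This is a simplification under stated assumptions: the user attribute information generation model 2 is reduced to a lookup through a matrix standing in for W_u^DIL, the mapping is treated as a plain linear operator, and all matrix values and function names are hypothetical.

```python
def one_hot(index, size):
    """User identification information as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def estimate_personality(user_index, w_u_dil, mapping_m):
    """one-hot user ID -> user attribute representation -> personality representation."""
    user_id = one_hot(user_index, len(w_u_dil[0]))
    user_repr = matvec(w_u_dil, user_id)  # selects the column belonging to this user
    return matvec(mapping_m, user_repr)   # assumed linear mapping M

# Hypothetical trained parameters: 3 users, 2-dimensional user attribute
# representations, 3-dimensional personality representations.
w_u_dil = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
]
mapping_m = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
personality = estimate_personality(1, w_u_dil, mapping_m)
```

Multiplying by a one-hot vector simply selects one column, so at inference time the first stage is equivalent to an embedding lookup.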

The output unit 94 treats the received personality representation vector (personality embedding vector) as the personality information of the estimation target user and transmits it to an external information processing device (when it has a communication function) or displays it (when it has a display function).

The training unit 92 and the personality estimation unit 93 are the main functional components that carry out an embodiment of the personality estimation method according to the present invention. They can also be regarded as functions of a processor and memory that store an embodiment of the personality estimation program according to the present invention. Accordingly, the personality estimation device 9 may be a dedicated personality estimation device, but it may also be, for example, a cloud server, a non-cloud server device, a personal computer (PC), a notebook or tablet computer, or a smartphone on which the personality estimation program according to the present invention is installed.

Of course, the user attribute information generation model and the target attribute representation generation model constituting the personality estimation model 8 are not limited to those described above and may be, for example, the user attribute information generation model 3 (FIG. 4) and the target attribute representation generation model 1' (FIG. 4).
In addition, the installed personality estimation model 8 may be the target attribute representation generation model alone, in which case the personality estimation device 9 receives a user attribute representation vector from an external device equipped with a user attribute information generation model and generates the personality representation vector from it.

As described in detail above, according to the present invention, a representation vector of a target (for example, a user) can be mapped to another representation vector in another representation vector space. The mapping is trained in advance with respect to a strength index (for example, a blood relationship index) that is, or may be, related to the degree of similarity between targets in a predetermined attribute (for example, the user's personality): the more the strength index indicates, or may indicate, a greater degree of similarity, the smaller the degree of separation between the corresponding other representation vectors is made. As a result, the other representation vector obtained by the mapping can serve as an attribute representation that expresses the predetermined attribute (personality) of the target (user) with higher reliability.
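The training principle summarized above (a stronger relationship index should lead to a smaller separation between mapped vectors) can be sketched as a pairwise loss. The specific form below is an assumption, not the loss defined by the invention: it uses Euclidean distance, a strength index normalized to the range 0 to 1, and an added margin term that keeps weakly related pairs apart so that the mapping does not collapse all vectors to one point. All names and values are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance, used here as the degree of separation."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relationship_loss(mapped, strength, margin=1.0):
    """Assumed contrastive-style loss over pairs of mapped representation vectors.

    mapped:   user -> vector after mapping
    strength: (user, user) -> relationship strength index in [0, 1]
    """
    loss = 0.0
    for (i, j), s in strength.items():
        d = distance(mapped[i], mapped[j])
        loss += s * d ** 2                             # pull strongly related pairs together
        loss += (1.0 - s) * max(0.0, margin - d) ** 2  # assumed term: keep weak pairs apart
    return loss

# Hypothetical blood relationship index: a and b are relatives, c is unrelated.
strength = {("a", "b"): 1.0, ("a", "c"): 0.0, ("b", "c"): 0.0}

# A mapping that places the relatives close together ...
mapped_good = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [2.0, 0.0]}
# ... versus one that places the unrelated user next to a instead.
mapped_bad = {"a": [0.0, 0.0], "b": [2.0, 0.0], "c": [0.1, 0.0]}

loss_good = relationship_loss(mapped_good, strength)
loss_bad = relationship_loss(mapped_bad, strength)
```

Minimizing such a loss over the mapping parameters would favor the first arrangement, in which the separation between relatives is small.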

For example, as one application of the present invention, personality information with finer granularity and higher reliability can be generated from the behavioral domain information of the estimation target user, without using or depending on the results of the questionnaire-based personality measurement used conventionally. Furthermore, the user personality information generated in this way can be used, for example in the field of marketing, to personalize the products and services offered and to make suitable or effective recommendations.

In addition, the detailed and highly reliable personality information on children generated by the present invention (for example, estimated from their behavior) can be used to provide those children with high-quality education suited to their individual personalities. In other words, the present invention can contribute to Goal 4 of the United Nations-led Sustainable Development Goals (SDGs), "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all."

Furthermore, the detailed and highly reliable personality information on adults generated by the present invention (for example, estimated from their behavior) can be used to provide those adults with decent work (rewarding and humane work) that does not harm the environment, and with high-quality work suited to their individual personalities. In other words, the present invention can contribute to Goal 8 of the UN-led SDGs, "Promote inclusive and sustainable economic growth, employment and decent work for all."

Moreover, the detailed and highly reliable personality information on a consumer generated by the present invention (for example, estimated from the consumer's behavior), together with the consumer's lifestyle and consumption activity histories, can be used to provide that consumer with education on sustainable consumption and lifestyles that fits the consumer's personality and current living behavior. In other words, the present invention can contribute to Goal 12 of the UN-led SDGs, "Ensure sustainable consumption and production patterns."

Various changes, modifications, and omissions can readily be made by those skilled in the art to the embodiments of the present invention described above, within the scope of the technical idea and viewpoint of the present invention. The above description is merely an example and is not intended to be restrictive in any way. The present invention is limited only by the scope of the claims and their equivalents.

1, 1' Target attribute representation generation model
10, 10' Mapping unit
2, 3 User attribute information generation model
20 DIL cell (domain-independent RNN model)
20o, 21o, 22o Output layer
20u User distributed representation extraction unit (user representation generation unit)
21 DSL1 cell (domain-specific RNN model)
21i Item distributed representation extraction unit
21u, 22u User distributed representation extraction unit
22 DSL2 cell (domain-specific RNN model)
22m Video distributed representation extraction unit
8 Personality estimation model
9 Personality estimation device
91 Input unit
92 Training unit
93 Personality estimation unit
94 Output unit

Claims

An object attribute representation generation model that causes a computer to function to map a representation vector of an object to another representation vector that is an element of another representation vector space,
the model causing the computer to function as a mapping unit that applies a trained mapping operator to the input representation vector to generate the other representation vector and outputs the other representation vector as information related to a predetermined attribute of the object, or that is a trained neural network algorithm that receives the representation vector as an input and outputs the other representation vector as information related to the predetermined attribute of the object,
characterized in that the mapping operator or the neural network algorithm is trained such that, with respect to a strength index of a predetermined relationship between objects, the strength index being related or possibly related to the degree of similarity between the objects with respect to the predetermined attribute, the more the strength index for the objects associated with two other representation vectors obtained by the mapping is one that increases, or may increase, the degree of similarity, the smaller the degree of separation between the two other representation vectors becomes.

The object attribute representation generation model of claim 1, characterized in that the mapping operator or the neural network algorithm is trained such that, with respect to a strength index of the predetermined relationship that is positively correlated or possibly positively correlated with the degree of similarity between the objects with respect to the predetermined attribute, the stronger the strength index is for the objects associated with the two other representation vectors obtained by the mapping, the smaller the degree of separation between the two other representation vectors becomes.

The mapping operator or the neural network algorithm is
Two separate expression vectors relating to objects that are stronger in terms of the strength index of the predetermined relationship, and that are different or distant in terms of an attribute other than the predetermined attribute, are trained to have a smaller degree of separation between the two separate expression vectors; and/or
The object attribute representation generation model according to claim 2, characterized in that it is trained so that the degree of separation between two different representation vectors relating to objects that are weaker in terms of the strength index of the specified relationship, and that have the same or a close relationship with respect to an attribute other than the specified attribute, is greater.

The object attribute representation generation model according to any one of claims 1 to 3, characterized in that the object is a human being, the predetermined attribute is personality, and the strength index of the predetermined relationship is an index that is stronger between objects that are related by blood or more closely related by blood, an index that is stronger between objects that are friends or acquaintances or that have a closer friend or acquaintance relationship, an index that is stronger between objects that are on close terms or on closer terms, an index that is stronger between objects that are compatible or more compatible, an index that is stronger between objects that have the same specific gene sequence or similar specific gene sequences, and/or an index that is stronger between objects with similar features extracted from electroencephalograms.

The object attribute representation generation model of claim 1, characterized in that the mapping operator or the neural network algorithm is trained such that, with respect to a strength index of the predetermined relationship that is negatively correlated or possibly negatively correlated with the degree of similarity between the objects with respect to the predetermined attribute, the weaker the strength index is for the objects associated with the two other representation vectors obtained by the mapping, the smaller the degree of separation between the two other representation vectors becomes.

The mapping operator or the neural network algorithm is
Two separate expression vectors relating to objects that are weaker in terms of the strength index of the predetermined relationship, and that are different or distant in terms of an attribute other than the predetermined attribute, are trained to have a smaller degree of separation between the two separate expression vectors; and/or
The object attribute representation generation model of claim 5, characterized in that the model is trained so that the degree of separation between two different representation vectors relating to objects that are stronger in terms of the strength index of the specified relationship and that have the same or a close relationship in terms of an attribute other than the specified attribute becomes greater.

the subject is a user, and the predetermined attribute is a personality;
The expression vector is a user attribute expression vector that is generated by a user attribute information generation model and is information that expresses a user attribute including the personality of the user,
The user attribute information generation model is
A plurality of domain-specific recurrent neural network (RNN) cells, one set for each of a plurality of behavioral domains, each of which receives domain behavior information, which is information related to the behavior of the user in the behavioral domain, and generates new domain-specific hidden state information by reflecting the domain behavior information in domain-specific hidden state information, which is hidden state information that the cell itself generated at a previous point in time;
a user representation generator trained in conjunction with the plurality of domain-specific RNN cells to generate a user attribute representation vector of the user from information identifying the user;
The object attribute representation generation model according to any one of claims 1 to 6, characterized in that the model causes a computer to function as a domain-independent RNN cell that receives the generated domain-specific hidden state information and the generated user attribute representation vector, and generates new domain-independent hidden state information by reflecting the domain-specific hidden state information and the user attribute representation vector on domain-independent hidden state information, which is hidden state information generated by the computer itself at a previous point in time.

the subject is a user, and the predetermined attribute is a personality;
The target attribute representation generation model according to any one of claims 1 to 6, characterized in that the representation vector is a user attribute representation vector that expresses the user attributes of the user and is calculated by a method related to User Representation Learning (URL) technology using information related to a predetermined behavior of the user.

The target attribute expression generation model described in claim 8, characterized in that the method related to the user expression learning technology is a matrix factorization method.

A target attribute estimation device that generates the different expression vector using a target attribute expression generation model according to any one of claims 1 to 9, and outputs the generated different expression vector as information related to the predetermined attribute of the target.

Constructing the object attribute representation generation model according to any one of claims 1 to 9 by performing training to obtain a trained mapping operator or a trained neural network algorithm ;
A target attribute estimation method implemented by a computer, characterized in that a different expression vector is generated using the constructed target attribute expression generation model, and the generated different expression vector is output as information related to the specified attribute of the target.