JP7224315B2

JP7224315B2 - Information processing device, information processing method, and information processing program

Info

Publication number: JP7224315B2
Application number: JP2020058927A
Authority: JP
Inventors: 裕介吉沢; 迪王
Original assignee: TIS Inc
Current assignee: TIS Inc
Priority date: 2020-03-27
Filing date: 2020-03-27
Publication date: 2023-02-17
Anticipated expiration: 2040-03-27
Also published as: JP2021157650A

Description

The present disclosure relates to an information processing device, an information processing method, and an information processing program.

Skeleton diagnosis focuses on the differences in body style that stem from the skeleton a person is born with, and is used to coordinate fashion that better suits each style. In skeleton diagnosis, the skeleton can be classified from physical characteristics into, for example, three types: straight (S), wave (W), and natural (N).

There are two kinds of skeleton diagnosis: self-diagnosis and diagnosis by an expert. In self-diagnosis, for example, the person being diagnosed works through a Q&A flowchart about their physical characteristics to determine their skeleton type. In diagnosis by an expert, for example, an expert certified by the Skeletal Diagnosis Analyst Association directly examines the body of the person being diagnosed and determines the skeleton type.

However, self-diagnosis cannot be called very reliable: the questions are fine-grained, such as "the muscles of XX are firm, and ...", and many of them are difficult for the person being diagnosed to judge alone. Diagnosis by an expert, on the other hand, is highly reliable but inconvenient for the person being diagnosed, because the expert must examine them in person.

Therefore, the present disclosure proposes an information processing device, an information processing method, and an information processing program that allow a user, who is the person being diagnosed, to easily perform a highly reliable skeleton diagnosis by themselves.

According to the present disclosure, there is provided an information processing device comprising: an acquisition unit that acquires a plurality of first images showing a person and a first skeleton type of the person; and a learning unit that generates a learning model by learning teacher data in which the plurality of first images are the input and the first skeleton type is the correct answer.

Further, according to the present disclosure, there is provided a method in which an information processing device executes processing of: acquiring a plurality of first images showing a person and a first skeleton type of the person; and generating a learning model by learning teacher data in which the plurality of first images are the input and the first skeleton type is the correct answer.

Further, according to the present disclosure, there is provided a program that causes an information processing device to execute processing of: acquiring a plurality of first images showing a person and a first skeleton type of the person; and generating a learning model by learning teacher data in which the plurality of first images are the input and the first skeleton type is the correct answer.

According to the present disclosure, there is provided an information processing device comprising: an acquisition unit that acquires a plurality of second images showing a user; and a determination unit that determines a second skeleton type of the user based on a result output by inputting the plurality of second images into a learning model generated by learning teacher data in which a plurality of first images showing a person are the input and a first skeleton type of the person is the correct answer.

Further, according to the present disclosure, there is provided a method in which an information processing device executes processing of: acquiring a plurality of second images showing a user; and determining a second skeleton type of the user based on a result output by inputting the plurality of second images into a learning model generated by learning teacher data in which a plurality of first images are the input and a first skeleton type is the correct answer.

Further, according to the present disclosure, there is provided a program that causes an information processing device to execute processing of: acquiring a plurality of second images showing a user; and determining a second skeleton type of the user based on a result output by inputting the plurality of second images into a learning model generated by learning teacher data in which a plurality of first images are the input and a first skeleton type is the correct answer.

A diagram showing a configuration example of the information processing system according to the present embodiment.
A block diagram showing a functional configuration example of the information processing device 100 according to the embodiment.
A diagram showing an example of the learning model according to the embodiment.
A diagram showing an example of learning of the feature extraction layer 500 according to the embodiment.
A diagram showing an example of learning of the fully connected layer 600 according to the embodiment.
A diagram showing another example of learning of the feature extraction layer 500 according to the embodiment.
A diagram showing another example of learning of the fully connected layer 600 according to the embodiment.
A diagram showing an example of the user interface according to the embodiment.
A flowchart showing the flow of the learning model generation processing according to the embodiment.
A flowchart showing the flow of the learning processing according to the embodiment.
A flowchart showing the flow of the skeleton type determination processing according to the embodiment.
A flowchart showing another flow of the skeleton type determination processing according to the embodiment.
A block diagram showing a hardware configuration example of the information processing device 100 according to the embodiment.

The present embodiment is described in detail below with reference to the drawings. In this specification and the drawings, substantially identical parts are denoted by the same reference numerals, and redundant description is omitted.

The description proceeds in the following order.
1. Embodiment
1.1. System configuration example
1.2. Functional configuration example
1.3. Details of functions
1.4. Flow of functions
2. Hardware configuration example
3. Summary

<1. Embodiment>
<<1.1. System configuration example>>
First, a configuration example of the information processing system according to the present embodiment is described. FIG. 1 is a diagram showing a configuration example of the information processing system according to the present embodiment. As shown in FIG. 1, the information processing system includes an information processing device 100, a user terminal 200, and a store terminal 300. The information processing device 100 is connected to the user terminal 200 and the store terminal 300 via a network N so that they can communicate with one another. The network N may be any of various communication networks, such as the Internet, whether wired or wireless.

The information processing device 100 is, for example, a server device managed by a service provider that offers a fashion coordination service based on skeleton diagnosis. The information processing device 100 receives an image showing the user (a user image) from the user terminal 200 or the store terminal 300 and determines the user's skeleton type by inputting the user image into a learning model. The information processing device 100 also searches for recommended coordinations based on the determined skeleton type and presents them to the user via the user terminal 200 or the store terminal 300. The information processing device 100 may be a cloud server device or a distributed computing system composed of a plurality of computers.

The user terminal 200 is a terminal owned and used by the user. The user is, for example, a person who undergoes skeleton diagnosis in order to coordinate fashion that suits them. The user terminal 200 may be a mobile terminal such as a smartphone or a tablet PC, or a stationary terminal installed in the user's home or the like.

The user terminal 200 is intended for a usage pattern in which the user undergoes skeleton diagnosis at home. The user therefore uses the user terminal 200 or another camera device to photograph themselves from, for example, the front, back, and side. The body parts photographed are, for example, the whole body, the half body, the collarbones, the chest, the legs up to the waist, and the hands. The user then uploads the captured user images to the information processing device 100 via the user terminal 200.

The user terminal 200 also receives from the information processing device 100 the user's skeleton type, determined based on the captured user images, and the recommended coordinations, and presents them to the user. Uploading user images and displaying recommended coordinations can be done through a dedicated user interface (UI).

The store terminal 300 is, for example, digital signage installed in a store that sells clothing and the like. However, the store terminal 300 may instead be a stationary terminal installed in the store, or a mobile terminal such as an in-store smartphone or tablet PC handled by a store clerk.

The store terminal 300 is intended for a usage pattern in which the user undergoes skeleton diagnosis in the store. Like the user terminal 200, the store terminal 300 uploads captured user images to the information processing device 100, receives from the information processing device 100 the user's skeleton type determined based on those images and the recommended coordinations, and presents them to the user.

FIG. 1 shows the information processing system according to the present embodiment as including at least three devices: the information processing device 100, the user terminal 200, and the store terminal 300. However, as described above, depending on the usage pattern, the information processing system may include only two devices: the information processing device 100 and the user terminal 200, or the information processing device 100 and the store terminal 300. Alternatively, for example, an application installed on a single device, such as the user terminal 200 or the store terminal 300, can by itself perform everything from capturing the user images and diagnosing the skeleton to presenting the user's skeleton type and the recommended coordinations.

<<1.2. Functional configuration example>>
Next, a functional configuration example of the information processing device 100 according to the present embodiment is described. FIG. 2 is a block diagram showing a functional configuration example of the information processing device 100 according to the present embodiment. As shown in FIG. 2, the information processing device 100 includes a storage unit 110, an acquisition unit 120, a learning unit 130, a determination unit 140, a search unit 150, a transmission unit 160, an imaging unit 170, and a control unit 180.

(Storage unit 110)
The storage unit 110 according to the present embodiment is a storage area for temporarily or permanently storing various programs and data. The storage unit 110 may store programs and data that the information processing device 100 uses to execute its various functions. As a specific example, the storage unit 110 may store programs and data for uploading user images and displaying recommended coordinations, a learning model for determining a person's skeleton type, and management data for managing various settings. The above is of course only an example, and the types of data stored in the storage unit 110 are not particularly limited.

(Acquisition unit 120)
The acquisition unit 120 according to the present embodiment acquires images showing a person (corresponding to the "first images") and the skeleton type of that person (corresponding to the "first skeleton type") for use as teacher data for the learning model. The images show body parts such as the whole body, the half body, the collarbones, the chest, the legs up to the waist, and the hands, photographed from the front, back, or side of the person. The person's skeleton type is the type determined by an expert who diagnosed the person (for example, one of three types: straight (S), wave (W), and natural (N)).

To diagnose the user's skeleton, the acquisition unit 120 also acquires images showing the user (user images, corresponding to the "second images") uploaded via the user terminal 200. The user images are likewise front, back, or side images of the user that include body parts such as the whole body, the half body, the collarbones, the chest, the legs up to the waist, and the hands. In other words, the acquisition unit 120 acquires user images showing the same body parts as those used in the teacher data for the learning model.

(Learning unit 130)
The learning unit 130 according to the present embodiment generates a learning model by learning teacher data in which the images showing a person, acquired by the acquisition unit 120, are the input and the skeleton type of that person is the correct answer. A plurality of images taken of individual body parts, such as the whole body and the half body, can be grouped into one set and input to the learning model as a single piece of feature data. The skeleton type used as correct-answer data is one determined by an expert who diagnosed the person either directly or from images. If it is sufficiently reliable, a skeleton type that the person being diagnosed determined by working through a Q&A flowchart about physical characteristics may also be included in the correct-answer data.
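As a minimal sketch of grouping several per-part images into one input, the following Python function concatenates images side by side. The representation of an image as a list of rows of grayscale pixel values, the requirement of equal heights, and the side-by-side layout are all assumptions made for illustration; the source does not specify how the images are combined.

```python
def combine_images(images):
    """Concatenate equal-height grayscale images side by side into one image.

    Each image is a list of rows, and each row is a list of pixel values.
    The side-by-side layout is an illustrative assumption.
    """
    if not images:
        raise ValueError("at least one image is required")
    height = len(images[0])
    if any(len(image) != height for image in images):
        raise ValueError("all images must have the same height")
    # For every row index, join that row from each image in order.
    return [
        [pixel for image in images for pixel in image[row]]
        for row in range(height)
    ]


# Hypothetical 2x2 "front" image and 2x1 "side" image.
front = [[1, 2], [3, 4]]
side = [[5], [6]]
image_set = combine_images([front, side])
```

A real system would more likely resize each photograph to a common size first; that step is omitted here.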

As described in detail later, the learning unit 130 can use, for example, a convolutional neural network (CNN), which is specialized for learning from images. Furthermore, the learning unit 130 can perform transfer learning of the learning model using ImageNet, an image dataset containing a huge number of images. During this transfer learning, for example, the learning model can be separated into a feature extraction layer and a fully connected layer, and the feature quantity output from the feature extraction layer can be used to train the fully connected layer. By performing only the minimum necessary training rather than training every layer of the learning model each time, the learning time can be shortened and the load of the learning processing reduced.

The learning model of the present embodiment includes an input layer to which an image showing a user is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and a weight of the first element. The learning model causes the information processing device 100 to function so as to output the user's skeleton type from the output layer according to the image input to the input layer.

The generation device that generates the learning model of the present embodiment (for example, the information processing device 100, such as a server device) may generate the learning model using any learning algorithm. For example, the generation device may use a learning algorithm such as a neural network (NN), a support vector machine (SVM), clustering, or reinforcement learning. As an example, suppose the generation device generates the learning model of the present embodiment using an NN. In this case, the learning model may have an input layer containing one or more neurons, an intermediate layer containing one or more neurons, and an output layer containing one or more neurons.

Suppose the learning model according to the present embodiment is implemented as a regression model expressed by "y = a_1 * x_1 + a_2 * x_2 + ... + a_i * x_i". In this case, the first element included in the learning model corresponds to input data (x_i) such as x_1 and x_2, and the weight of the first element corresponds to the coefficient a_i associated with x_i. The regression model can be regarded as a simple perceptron having an input layer and an output layer. When the model is regarded as a simple perceptron, the first element corresponds to one of the nodes of the input layer, and the second element can be regarded as the node of the output layer.
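The regression model above can be sketched in a few lines of Python. The function name and the numeric values are hypothetical and chosen only for illustration; the code simply evaluates the weighted sum y = a_1 * x_1 + ... + a_i * x_i, with the inputs playing the role of the first elements and y the role of the second element.

```python
def weighted_sum(weights, inputs):
    """Compute y = a_1*x_1 + ... + a_i*x_i for equal-length sequences."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(a * x for a, x in zip(weights, inputs))


# Hypothetical values chosen only for illustration.
a = [0.5, -1.0, 2.0]    # weights a_i of the first elements
x = [2.0, 1.0, 0.5]     # first elements x_i (input-layer nodes)
y = weighted_sum(a, x)  # second element (output-layer node)
```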

Suppose instead that the learning model according to the present embodiment is implemented as an NN having one or more intermediate layers, such as a DNN (Deep Neural Network). In this case, the first element included in the learning model corresponds to one of the nodes of the input layer or an intermediate layer. The second element corresponds to a node in the next stage, that is, a node to which a value is passed from the node corresponding to the first element. The weight of the first element corresponds to the connection coefficient, which is the weight applied to the value passed from the node corresponding to the first element to the node corresponding to the second element.

The user's skeleton type is calculated using a learning model of any structure, such as the regression model or NN described above. More specifically, the coefficients of the learning model are set so that, when an image showing a user is input, the model outputs that user's skeleton type. The learning model according to the present embodiment may be a model generated based on results obtained by repeatedly inputting and outputting data.

The example above describes the learning model according to the present embodiment as a model that outputs a user's skeleton type when an image showing the user is input (referred to as model A). However, the learning model according to the present embodiment may instead be a model generated based on results obtained by repeatedly inputting data to and obtaining output from model A. For example, the learning model may be a learning model (referred to as model B) that takes an image showing a user as input and produces as output the skeleton type that model A outputs for that user. Alternatively, the learning model may be a learning model that takes an image showing a user as input and produces as output the skeleton type that model B outputs for that user.

(Determination unit 140)
The determination unit 140 according to the present embodiment determines the user's skeleton type (corresponding to the "second skeleton type") based on the result output by inputting the user images into the learning model. The result output from the learning model is the probability of each skeleton type, expressed as a value in the range [0, 1] with 1 as the maximum (100%), for example "S: 0.8, W: 0.1, N: 0.1".
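One simple way to turn such an output into a decision is to pick the type with the highest probability. The sketch below assumes the model output is available as a Python dictionary keyed by type label; the function name and that representation are illustrative assumptions, not part of the source.

```python
def decide_skeleton_type(probabilities):
    """Return the skeleton type label with the highest probability."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    # max() over the keys, ranked by their probability values.
    return max(probabilities, key=probabilities.get)


# Example output taken from the text: S: 0.8, W: 0.1, N: 0.1.
model_output = {"S": 0.8, "W": 0.1, "N": 0.1}
decided = decide_skeleton_type(model_output)
```

If two types tie, `max` returns the first one encountered; a production system might prefer to report the tie or ask for more images.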

(Search unit 150)
The search unit 150 according to the present embodiment searches for recommended coordinations for the user based on the user's skeleton type determined by the determination unit 140. Specifically, for example, the search unit 150 uses the skeleton type determined by the determination unit 140 as a search key to search for recommended coordinations that are stored in the storage unit 110 in association with skeleton types. The recommended coordinations can be narrowed down based on user attributes such as the user's height, weight, age, gender, and activity area, as well as on factors such as the current season. Recommended coordinations can include not only clothing, fashion goods, and accessories but also materials and patterns that suit the user. The search unit 150 can also search for sales information (stores that carry the item, sales floor guidance, online shops) associated with the recommended clothing and other items. In addition, the search unit 150 performs user searches based on user information to retrieve various information, such as the skeleton type and user attributes, of users who have already undergone skeleton diagnosis.
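The keyed search with optional narrowing can be sketched as a filter over an in-memory catalog. The `Coordination` record, its fields, the catalog entries, and the use of season as the only narrowing attribute are all hypothetical; the source states only that coordinations are stored in association with skeleton types and can be narrowed by user attributes and the season.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordination:
    """Hypothetical catalog entry associated with a skeleton type."""
    name: str
    skeleton_type: str  # "S", "W", or "N"
    season: str


# Hypothetical catalog contents, for illustration only.
CATALOG = [
    Coordination("tailored jacket set", "S", "autumn"),
    Coordination("v-neck knit set", "S", "winter"),
    Coordination("soft blouse set", "W", "spring"),
    Coordination("oversized linen set", "N", "summer"),
]


def search_coordinations(catalog, skeleton_type, season=None):
    """Return entries matching the skeleton type, optionally narrowed by season."""
    return [
        item
        for item in catalog
        if item.skeleton_type == skeleton_type
        and (season is None or item.season == season)
    ]


all_straight = search_coordinations(CATALOG, "S")
winter_straight = search_coordinations(CATALOG, "S", season="winter")
```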

(Transmission unit 160)
The transmission unit 160 according to the present embodiment transmits the user's skeleton type determined by the determination unit 140 and the recommended coordinations found by the search unit 150 to the user terminal 200 or the store terminal 300.

(Imaging unit 170)
The imaging unit 170 according to the present embodiment photographs the user. As described above, it is a function included when the information processing system according to the present embodiment is implemented on a single device. To this end, the imaging unit 170 includes an image sensor, a focus ring, a zoom lens, and the like. Photographs taken by the imaging unit 170 are converted into digital data and stored in the storage unit 110. When the user is photographed with a separate device, the information processing device 100 need not include the imaging unit 170.

(Control unit 180)
The control unit 180 according to the present embodiment is a processing unit that governs the information processing device 100 as a whole and controls each component of the information processing device 100. The functions of the control unit 180 are described in detail later.

A functional configuration example of the information processing device 100 according to the present embodiment has been described above. The functional configuration described with reference to FIG. 2 is only an example, and the functional configuration of the information processing device 100 is not limited to it. For example, the information processing device 100 need not include all of the components shown in FIG. 2, and components such as the learning unit 130 may be provided on a computer separate from the information processing device 100. The functional configuration of the information processing device 100 according to the present embodiment can be flexibly modified according to specifications and operation.

The functions of the components may also be performed by an arithmetic unit such as a CPU (Central Processing Unit) reading a control program, which describes the processing procedures that implement these functions, from a storage medium such as a ROM (Read Only Memory) or RAM (Random Access Memory), and interpreting and executing the program. The configuration to be used can therefore be changed as appropriate according to the technical level at the time the present embodiment is implemented. An example of the hardware configuration of the information processing device 100 is described later.

<<1.3. Details of functions>>
Next, the functions of the information processing device 100 according to the present embodiment are described in detail. One feature of the control unit 180 of the information processing device 100 is that it acquires user images, determines the user's skeleton type using a learning model, and presents recommended coordinations.

First, the learning model used by the information processing device 100 is described. FIG. 3 is a diagram showing an example of the learning model according to the present embodiment. As shown in FIG. 3, the learning model is, for example, a CNN. An image set 400, which is a single image obtained by combining a plurality of images showing the user, is input to the CNN and passes through feature extraction layers (convolutional layers) 500-1 to 500-5 (collectively, "feature extraction layer 500"), where it is structured from the visual features of the image into semantic features. The feature extraction layer 500 in FIG. 3 is an example, and its depth is not limited to five layers. The feature quantity output from the feature extraction layer 500 is then input to the fully connected layer 600, which outputs the result, namely the user's skeleton type. In the example of FIG. 3, the output result is "S: 0.8, W: 0.1, N: 0.1", so the determination unit 140 can determine the most probable type, straight (S), as the user's skeleton type. However, as shown in FIG. 3, there may be an error between the output result and the correct-answer data for the skeleton type, "S: 1.0, W: 0.0, N: 0.0". The learning unit 130 therefore trains the fully connected layer 600 on the resulting error by transfer learning so that the error approaches 0 as closely as possible.
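The error between the output and the correct-answer data can be quantified in several ways. The sketch below uses cross-entropy, a common choice for classification; this is an assumption for illustration, since the source does not name a specific error function. The function name and dictionary representation are also hypothetical.

```python
import math


def cross_entropy(predicted, target):
    """Cross-entropy between predicted probabilities and a target distribution.

    Both arguments are dictionaries keyed by skeleton type label.
    """
    eps = 1e-12  # guards against log(0) for zero-probability predictions
    return -sum(
        target[label] * math.log(max(predicted[label], eps))
        for label in target
    )


# Values from the FIG. 3 example in the text.
predicted = {"S": 0.8, "W": 0.1, "N": 0.1}
target = {"S": 1.0, "W": 0.0, "N": 0.0}
error = cross_entropy(predicted, target)  # equals -log(0.8)
```

With a one-hot target, the error depends only on the probability assigned to the correct type and reaches 0 when that probability is 1.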

In typical transfer learning, as shown in FIG. 3, the image set 400 is input to the learning model, and the fully connected layer 600 is trained on the error between the output and the correct data (in some cases only the latter part of the fully connected layer 600 is trained, or part of the feature extraction layers 500 is trained as well). However, rerunning the feature extraction layers 500 every time the fully connected layer 600 is trained is very wasteful. In this embodiment, the learning model is therefore separated into the feature extraction layers 500 and the fully connected layer 600, and the feature amount output from the feature extraction layers 500 is used to train the fully connected layer 600.

FIG. 4 shows an example of the processing of the feature extraction layers 500 according to this embodiment, after they have been separated from the fully connected layer 600. As shown in FIG. 4, the feature extraction layers 500 receive the image set 400 and output a feature amount 700 in which visual features have been structured into semantic features. By using this feature amount 700 when training the fully connected layer 600, the feature extraction layers 500 need to run only once and do not have to be rerun.
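The idea of running the feature extraction only once can be sketched as a small cache. This is a hedged illustration: `FeatureCache` and `toy_extractor` are hypothetical names, and the toy extractor (row means of a tiny pixel grid) merely stands in for the convolution layers.

```python
class FeatureCache:
    """Run the (expensive) feature extractor at most once per image set."""

    def __init__(self, extractor):
        self._extractor = extractor
        self._store = {}          # image_id -> cached feature vector
        self.extractor_calls = 0  # counts how often the extractor really ran

    def get(self, image_id, image):
        if image_id not in self._store:
            self._store[image_id] = self._extractor(image)
            self.extractor_calls += 1
        return self._store[image_id]


def toy_extractor(image):
    # Placeholder for the feature extraction layers: mean value of each row.
    return [sum(row) / len(row) for row in image]


cache = FeatureCache(toy_extractor)
image_set = [[0, 2, 4], [1, 1, 1]]
for _ in range(5):  # five training passes reuse the same cached features
    features = cache.get("user-001", image_set)
print(cache.extractor_calls, features)
```

However many training passes request the features, the extractor itself runs once per image set.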

FIG. 5 shows an example of the learning process for the fully connected layer 600 according to this embodiment, after it has been separated from the feature extraction layers 500. The learning process evaluates the error, adjusts the weights of the fully connected layer 600, and then inputs the feature amount 700 output from the feature extraction layers 500 (shown in FIG. 4) to the fully connected layer 600. This removes the need to rerun the feature extraction layers 500 every time the fully connected layer 600 is trained.
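As an illustration of training only the fully connected part on precomputed features, the following sketch trains a single softmax layer by gradient descent. It is an assumption-laden toy: the patent does not specify the loss, optimizer, or layer sizes, so cross-entropy with plain stochastic gradient descent and the names `train_head` and `forward` are chosen here purely for demonstration.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, bias, features):
    """One fully connected layer followed by softmax."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


def train_head(samples, n_classes, epochs=200, lr=0.5):
    """samples: list of (cached_feature_vector, class_index) pairs."""
    n_features = len(samples[0][0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for features, label in samples:
            probs = forward(weights, bias, features)
            for c in range(n_classes):
                # Cross-entropy gradient w.r.t. score c: output minus target.
                err = probs[c] - (1.0 if c == label else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * err * features[j]
                bias[c] -= lr * err
    return weights, bias


# Cached features (stand-ins for feature amount 700) with class indices
# 0 = S, 1 = W, 2 = N. No feature extraction happens inside the loop.
cached = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
weights, bias = train_head(cached, n_classes=3)
probs = forward(weights, bias, [1.0, 0.0, 0.0])
```

The training loop touches only the cached vectors, which is the point of separating the two parts of the model.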

In the learning process of FIGS. 3 to 5, the model is trained on the image set 400, a single image formed by combining a plurality of images of the user, so training time and processing load can be reduced compared with training on each of the images separately. However, combining the images may degrade image quality. As another example of the learning process according to this embodiment, a method of training the learning model on each of the plurality of images is therefore described.

FIG. 6 shows another example of the processing of the feature extraction layers 500 according to this embodiment. In the example of FIG. 6, a front image 400-1, a back image 400-2, a side image 400-3, and a hand image 400-4 are each input to the feature extraction layers 500 as the plurality of images of the user. As shown in FIG. 6, the feature extraction layers 500 then output a feature amount 700-1 to 700-4 for each input image. Because the feature amounts 700-1 to 700-4 are produced from the images 400-1 to 400-4 as they are, they are not affected by data degradation caused by image combination.

FIG. 7 shows another example of the training of the fully connected layer 600 according to this embodiment. The feature amounts 700-1 to 700-4 output from the feature extraction layers 500 are combined to generate a single feature amount 700. Unlike combining images, combining the feature amounts 700-1 to 700-4 merely joins sequences of numerical data end to end, so no data degradation occurs. By using this combined feature amount 700 when training the fully connected layer 600, the feature extraction layers 500 do not have to be rerun every time the fully connected layer 600 is trained, and the training is also unaffected by data degradation from image combination.
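Joining the per-image feature amounts can be pictured as plain list concatenation. The sketch below is illustrative only: the view names, the fixed ordering, and the two-element vectors are assumptions, not values from the patent.

```python
def combine_features(per_image_features):
    """Concatenate per-image feature vectors in a fixed view order."""
    view_order = ("front", "back", "side", "hand")  # assumed ordering
    combined = []
    for view in view_order:
        combined.extend(per_image_features[view])
    return combined


# Toy stand-ins for feature amounts 700-1 to 700-4.
features = {
    "front": [0.1, 0.2],
    "back": [0.3, 0.4],
    "side": [0.5, 0.6],
    "hand": [0.7, 0.8],
}
combined = combine_features(features)
print(combined)
```

Every input value appears unchanged in the combined vector, which is why this kind of combination does not degrade the data.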

Next, presentation to the user of the skeleton type determined using the learning model, and of recommended coordination based on that skeleton type, is described. The information processing apparatus 100 transmits the user's skeleton type and the recommended coordination to the user terminal 200 or the store terminal 300. The user terminal 200 or the store terminal 300 receives the user's skeleton type and the recommended coordination and presents them to the user via a UI.

FIG. 8 shows an example of a user interface according to this embodiment. It is an example of a UI displayed on the user terminal 200 or the store terminal 300 to present the user's skeleton type and recommended coordination to the user. The UI may be provided by an application downloaded from the information processing apparatus 100 or the like and installed on the user terminal 200 or the store terminal 300, or by a Web application on the information processing apparatus 100.

As shown in FIG. 8, the user terminal 200 and the store terminal 300 can display, as the result of the skeleton diagnosis, the user's skeleton type, an explanation of that skeleton type together with suitable materials and patterns, and recommended coordination suited to the user's skeleton type. The user terminal 200 and the store terminal 300 can also display sales information for the clothing, fashion goods, and accessories retrieved as recommended coordination. Furthermore, when an item of sales information is selected, for example by a touch operation by the user, the user terminal 200 or the store terminal 300 can display the stores that carry the selected item, sales floor guidance, and online shop information. This can promote sales of the clothing, fashion goods, and accessories retrieved as recommended coordination.

<<1.4. Function Flow>>
Next, the procedure of the learning model generation process according to this embodiment is described with reference to FIG. 9. FIG. 9 is a flowchart showing the flow of the learning model generation process. The information processing apparatus 100 executes this process to generate a learning model for determining a person's skeleton type.

First, as shown in FIG. 9, the acquisition unit 120 of the information processing apparatus 100 acquires images of a person for use as training data for the learning model (step S101). The images are, for example, a plurality of image sets 400 showing the body parts used for skeleton diagnosis, such as the whole body, half the body, and the hands. Such images may be captured specifically for the learning process, or existing images accumulated in ImageNet or the like may be used.

Next, the acquisition unit 120 acquires the skeleton type of the person shown in the images acquired in step S101 (step S102). This skeleton type has been determined in advance by an expert who diagnosed the person directly or from images. Steps S101 and S102 may be executed in the reverse order.

Next, the learning unit 130 of the information processing apparatus 100 generates a learning model by learning training data in which the images of the person acquired by the acquisition unit 120 are the input and the person's skeleton type is the correct answer (step S103). At this point, to allow the learning model to be trained by transfer learning, the learning model can be separated into the feature extraction layers 500 and the fully connected layer 600, and the feature amount 700 output from the feature extraction layers 500 can be stored in the storage unit 110.

After step S103, steps S101 to S103 are repeated once for each person to be learned, and the process ends.

Next, the procedure of the learning process according to this embodiment is described with reference to FIG. 10. FIG. 10 is a flowchart showing the flow of the learning process. The information processing apparatus 100 executes this process to evaluate the error between the result output from the learning model for determining a person's skeleton type and the correct answer, and to train the learning model by feeding back that error.

First, as shown in FIG. 10, the acquisition unit 120 of the information processing apparatus 100 acquires the feature amount data (feature amount 700) output from the feature extraction layers 500 of the learning model generated in step S103 (step S201). The feature amount 700 stays the same as long as the images input to the learning model do not change. When training is repeated using the error between the output and the correct answer, inputting the feature amount 700 directly to the fully connected layer 600 therefore avoids rerunning the feature extraction layers 500.

Next, the error between the output of the learning model generated in step S103 and the correct data is acquired (step S202). For example, if the learning model outputs "S: 0.8, W: 0.1, N: 0.1" and the correct data is "S: 1.0, W: 0.0, N: 0.0", the output is 0.2 too low for S and 0.1 too high for each of W and N. To improve the accuracy of the learning model, training is performed so that each of these errors approaches zero.
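The error in this example can be computed as in the sketch below. Two assumptions are made for illustration: the error is taken as output minus correct value (the sign convention is not fixed by the text), and a sum of squared errors is used as a single summary figure, which the patent does not prescribe.

```python
def per_class_error(output, correct):
    """Error per skeleton type, defined here as output minus correct value."""
    # Rounding hides floating-point noise such as -0.19999999999999996.
    return {label: round(output[label] - correct[label], 6) for label in correct}


def squared_error(output, correct):
    """Sum of squared per-class errors (an assumed summary measure)."""
    return sum((output[k] - correct[k]) ** 2 for k in correct)


output = {"S": 0.8, "W": 0.1, "N": 0.1}
correct = {"S": 1.0, "W": 0.0, "N": 0.0}
errors = per_class_error(output, correct)
loss = squared_error(output, correct)
print(errors, loss)
```

Under this convention the error is negative for S and positive for W and N; training aims to drive all three toward zero.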

Next, the learning unit 130 of the information processing apparatus 100 trains the fully connected layer 600 of the learning model using the error and the feature amount 700 acquired by the acquisition unit 120 (step S203). More specifically, it evaluates the error, adjusts the weights of the fully connected layer 600, and then inputs the feature amount 700 to the learning model. It then evaluates the error between the result output from the learning model and the correct data again and repeats the training of the fully connected layer 600, after which the process ends.

Next, the procedure for determining a user's skeleton type according to this embodiment is described with reference to FIG. 11. FIG. 11 is a flowchart showing the flow of the skeleton type determination process. This process assumes, for example, that the user receives a skeleton diagnosis at home using the user terminal 200. The process starts, for example, when the user accesses, via the user terminal 200, a skeleton diagnosis application installed on the user terminal 200 or a Web application on the information processing apparatus 100.

First, as shown in FIG. 11, the user terminal 200 transmits user information to the information processing apparatus 100 so that the information processing apparatus 100 can identify the accessing user (step S301). The user information is, for example, login information for each user.

If the information processing apparatus 100 has not received user information from the user terminal 200 (step S302: No), it waits to receive it. If it has received user information (step S302: Yes), the search unit 150 of the information processing apparatus 100 searches for the target user based on the user information (step S303).

Next, if the user search in step S303 shows that the target user has already received a skeleton diagnosis, the search unit 150 of the information processing apparatus 100 retrieves the target user's skeleton type, and the process proceeds to step S310.

If the target user has not yet received a skeleton diagnosis, the transmission unit 160 of the information processing apparatus 100 transmits a request for a user image to the user terminal 200 in order to perform the skeleton diagnosis (step S305).

If the user terminal 200 has not received a request for a user image from the information processing apparatus 100 (step S306: No), it waits to receive the request. If it has received the request (step S306: Yes), the user terminal 200 prompts the user to upload a user image, for example by displaying a message to that effect on its screen.

Next, when the user captures a user image using the user terminal 200 or the like, or selects a previously captured user image, the user terminal 200 uploads the user image to the information processing apparatus 100 (step S307).

If the information processing apparatus 100 has not received a user image from the user terminal 200 (step S308: No), it waits to receive it. If it has received a user image (step S308: Yes), the learning unit 130 of the information processing apparatus 100 inputs the user image to the learning model, and the determination unit 140 determines the target user's skeleton type based on the result output from the learning model (step S309).

Next, the search unit 150 searches for recommended coordination based on the target user's skeleton type (step S310). In doing so, the search unit 150 can also retrieve user attributes such as the target user's height, weight, age, gender, and activity area, as well as other information such as the current season, and search for recommended coordination further based on that information. It can also search for sales information (such as stores carrying the items and online shop information) associated with the clothing and other items retrieved as recommended coordination.
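A search of this kind can be pictured as filtering a catalog by skeleton type and, optionally, by further conditions such as season. The sketch below is hypothetical throughout: the `Coordination` record, its fields, and the catalog entries are invented for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordination:
    name: str
    skeleton_type: str  # "S", "W", or "N"
    season: str
    shop: str           # associated sales information


# Invented sample data for illustration only.
CATALOG = [
    Coordination("tailored jacket set", "S", "autumn", "online shop A"),
    Coordination("linen shirt set", "S", "summer", "store B, 2F"),
    Coordination("chiffon blouse set", "W", "summer", "online shop A"),
    Coordination("oversized knit set", "N", "autumn", "store C, 1F"),
]


def search_coordinations(catalog, skeleton_type, season=None):
    """Filter by skeleton type, then optionally narrow by season."""
    hits = [c for c in catalog if c.skeleton_type == skeleton_type]
    if season is not None:
        hits = [c for c in hits if c.season == season]
    return hits


summer_for_s = search_coordinations(CATALOG, "S", season="summer")
print([(c.name, c.shop) for c in summer_for_s])
```

Other user attributes mentioned in the text (height, age, activity area, and so on) could be added as further optional filters in the same way.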

Next, the transmission unit 160 transmits information such as the target user's skeleton type determined by the determination unit 140 and the recommended coordination retrieved by the search unit 150 to the user terminal 200 (step S311).

If the user terminal 200 has not received information such as the skeleton type from the information processing apparatus 100 (step S312: No), it waits to receive it. If it has received the information (step S312: Yes), the user terminal 200 displays the information such as the skeleton type on its screen (step S313). After step S313, the process ends. Even after the process ends, however, the user can, for example, use the sales information displayed with the recommended coordination in step S313 to visit an online shop or a store that sells the recommended clothing and other items, and shop or check product details there.
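The server-side branching of this flow (reuse a stored skeleton type, otherwise request an image and classify it) can be sketched as below. This is a simplified, hypothetical model: `DiagnosisService`, its method names, and the in-memory dictionary are assumptions, and storing the result after classification is inferred from the "already diagnosed" branch rather than stated as a step.

```python
class DiagnosisService:
    """Toy stand-in for the server-side part of the FIG. 11 flow."""

    def __init__(self, classify):
        self._classify = classify  # stand-in for the learning model and determination
        self._diagnosed = {}       # user_id -> stored skeleton type (assumed storage)

    def handle_user_info(self, user_id):
        """After the user search: reuse a stored type or ask for an image."""
        if user_id in self._diagnosed:
            return {"status": "diagnosed",
                    "skeleton_type": self._diagnosed[user_id]}
        return {"status": "image_required"}

    def handle_user_image(self, user_id, image):
        """On image upload: classify and remember the result."""
        skeleton_type = self._classify(image)
        self._diagnosed[user_id] = skeleton_type
        return {"status": "diagnosed", "skeleton_type": skeleton_type}


# A fixed classifier keeps the example deterministic.
service = DiagnosisService(classify=lambda image: "W")
first = service.handle_user_info("user-001")
after_upload = service.handle_user_image("user-001", image=b"...")
second = service.handle_user_info("user-001")
print(first, after_upload, second)
```

On the second access the stored type is returned directly, so no image request is needed.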

Next, another procedure for determining a user's skeleton type according to this embodiment is described with reference to FIG. 12. FIG. 12 is a flowchart showing another flow of the skeleton type determination process. This process assumes, for example, that the user visits a store that sells clothing and the like and receives a skeleton diagnosis in the store using the store terminal 300. The process starts, for example, when the store terminal 300 detects the user by face authentication or the like.

First, as shown in FIG. 12, the store terminal 300 transmits detection information for the detected user to the information processing apparatus 100 (step S401).

If the information processing apparatus 100 has not received the user's detection information from the store terminal 300 (step S402: No), it waits to receive it. If it has received the detection information (step S402: Yes), the search unit 150 of the information processing apparatus 100 searches for the target user based on the detection information (step S403).

Next, if the user search in step S403 shows that the target user has already received a skeleton diagnosis, the search unit 150 of the information processing apparatus 100 retrieves the target user's skeleton type, and the process proceeds to step S410.

If the target user has not yet received a skeleton diagnosis, the transmission unit 160 of the information processing apparatus 100 transmits a request to capture a user image to the store terminal 300 in order to perform the skeleton diagnosis (step S405).

If the store terminal 300 has not received a request to capture a user image from the information processing apparatus 100 (step S406: No), it waits to receive the request. If it has received the request (step S406: Yes), the store terminal 300 photographs the user and uploads the user image to the information processing apparatus 100 (step S407).

If the information processing apparatus 100 has not received a user image from the store terminal 300 (step S408: No), it waits to receive it. If it has received a user image (step S408: Yes), the learning unit 130 of the information processing apparatus 100 inputs the user image to the learning model, and the determination unit 140 determines the target user's skeleton type based on the result output from the learning model (step S409).

Next, the search unit 150 searches for recommended coordination based on the target user's skeleton type (step S410). As in step S310, the search unit 150 can also search for sales information (such as sales floor guidance) associated with the recommended coordination and with the clothing and other items retrieved as recommended coordination.

Next, the transmission unit 160 transmits information such as the target user's skeleton type determined by the determination unit 140 and the recommended coordination retrieved by the search unit 150 to the store terminal 300 (step S411).

If the store terminal 300 has not received information such as the skeleton type from the information processing apparatus 100 (step S412: No), it waits to receive it. If it has received the information (step S412: Yes), the store terminal 300 displays the information such as the skeleton type on its screen (step S413). After step S413, the process ends. Even after the process ends, however, the user can, for example, use the sales information displayed with the recommended coordination in step S413 to find the sales floor that sells the recommended clothing and other items, and shop or check product details there.

<2. Hardware Configuration Example>
Next, a hardware configuration example of the information processing apparatus 100 according to this embodiment is described. FIG. 13 is a block diagram showing a hardware configuration example of the information processing apparatus 100. Referring to FIG. 13, the information processing apparatus 100 includes, for example, a processor 801, a ROM 802, a RAM 803, a host bus 804, a bridge 805, an external bus 806, an interface 807, an input device 808, an output device 809, a storage 810, a drive 811, a connection port 812, and a communication device 813. The hardware configuration shown here is an example, and some of the components may be omitted. Components other than those shown here may also be included.

(Processor 801)
The processor 801 functions, for example, as an arithmetic processing device or a control device, and controls all or part of the operation of each component based on various programs recorded in the ROM 802, the RAM 803, the storage 810, or a removable recording medium 901.

(ROM 802, RAM 803)
The ROM 802 is a means for storing programs read by the processor 801, data used for computation, and the like. The RAM 803 temporarily or permanently stores, for example, programs read by the processor 801 and various parameters that change as appropriate while those programs are executed.

(Host Bus 804, Bridge 805, External Bus 806, Interface 807)
The processor 801, the ROM 802, and the RAM 803 are connected to one another via, for example, the host bus 804, which is capable of high-speed data transmission. The host bus 804 is in turn connected, for example via the bridge 805, to the external bus 806, which has a relatively low data transmission speed. The external bus 806 is connected to various components via the interface 807.

(Input Device 808)
The input device 808 is, for example, a mouse, a keyboard, a touch panel, a button, a switch, or a lever. A remote controller capable of transmitting control signals using infrared rays or other radio waves may also be used as the input device 808. The input device 808 also includes voice input devices such as a microphone.

(Output Device 809)
The output device 809 is a device capable of visually or audibly notifying the user of acquired information, for example a display device such as a CRT (Cathode Ray Tube), LCD, or organic EL display, an audio output device such as a speaker or headphones, a printer, a mobile phone, or a facsimile machine. The output device 809 according to this embodiment also includes various vibration devices capable of outputting tactile stimuli.

(Storage 810)
The storage 810 is a device for storing various data. For example, a magnetic storage device such as a hard disk drive (HDD), a semiconductor storage device, an optical storage device, or a magneto-optical storage device is used as the storage 810.

(Drive 811)
The drive 811 is a device that reads information recorded on the removable recording medium 901, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, or writes information to the removable recording medium 901.

(Connection Port 812)
The connection port 812 is a port for connecting an external connection device 902, for example a USB (Universal Serial Bus) port, an IEEE 1394 port, a SCSI (Small Computer System Interface) port, an RS-232C port, or an optical audio terminal.

(Communication Device 813)
The communication device 813 is a communication device for connecting to a network, for example a communication card for wired or wireless LAN, Bluetooth (registered trademark), or WUSB (Wireless USB), a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various types of communication.

(Removable Recording Medium 901)
The removable recording medium 901 is, for example, DVD (registered trademark) media, Blu-ray (registered trademark) media, HD DVD media, or any of various semiconductor storage media. The removable recording medium 901 may of course also be, for example, an IC card equipped with a contactless IC chip, or an electronic device.

(External Connection Device 902)
The external connection device 902 is, for example, a printer, a portable music player, a digital camera, a digital video camera, or an IC recorder.

The storage unit 110 according to this embodiment is realized by the ROM 802, the RAM 803, and the storage 810. The control unit 180 according to this embodiment, which is realized by the processor 801, reads from the ROM 802, the RAM 803, or the like the control programs that implement the acquisition unit 120, the learning unit 130, the determination unit 140, the search unit 150, and the imaging unit 170, and executes them. The transmission unit 160 according to this embodiment reads data from the ROM 802, the RAM 803, or the like, sends the data to the communication device 813 via the host bus 804, the bridge 805, the external bus 806, and the interface 807, and transmits the data to an external device.

<3. Summary>
As described above, the information processing apparatus 100 includes: the acquisition unit 120, which acquires a plurality of second images showing a user; the learning unit 130, which generates a learning model by learning teacher data that takes a plurality of first images showing a person as input and a first skeleton type of the person as the correct answer; and a determination unit, which determines a second skeleton type of the user based on the result output when the plurality of second images are input to the learning model.

As a result, it is possible to provide a learning model with which users can easily perform a self-diagnosis without having an expert diagnose their skeleton. In addition, because the provided learning model learns the diagnosis results of experts as correct-answer data, it can perform highly reliable skeletal diagnosis. Furthermore, skeletal diagnosis using the learning model differs greatly from skeletal diagnosis in which an expert determines the skeleton type by sight and palpation based on physical characteristics, in that it derives the probability of a particular skeleton type based on semantic features structured from the visual features of images.
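As an illustration of the final determination step, the following minimal Python sketch picks a skeleton type from the per-type probabilities output by a learning model. The three type names follow the straight, wave, and natural classification mentioned in the background. The function name `determine_skeleton_type` and the rule of simply taking the most probable type are assumptions made for this example, not details stated in the disclosure.

```python
from typing import Dict

# Type names follow the straight / wave / natural classification.
SKELETON_TYPES = ("straight", "wave", "natural")


def determine_skeleton_type(probabilities: Dict[str, float]) -> str:
    """Return the skeleton type with the highest probability.

    Taking the argmax is an assumed decision rule for illustration.
    """
    unknown = set(probabilities) - set(SKELETON_TYPES)
    if unknown or not probabilities:
        raise ValueError(f"unexpected or missing skeleton types: {sorted(unknown)}")
    return max(probabilities, key=probabilities.get)


# Hypothetical model output for one user.
model_output = {"straight": 0.2, "wave": 0.7, "natural": 0.1}
print(determine_skeleton_type(model_output))  # wave
```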

Although the preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, the technical scope of the present disclosure is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field of the present disclosure could conceive of various changes or modifications within the scope of the technical idea described in the claims, and these are naturally understood to fall within the technical scope of the present disclosure.

Also, the effects described herein are merely illustrative or exemplary, and are not limiting. In other words, the technology according to the present disclosure can produce other effects that are obvious to those skilled in the art from the description of this specification, in addition to or instead of the above effects.

Note that the present technology can also take the following configurations.
(1) An information processing apparatus 100 comprising: an acquisition unit 120 that acquires a plurality of first images showing a person and a first skeleton type of the person; and a learning unit 130 that generates a learning model by learning teacher data that takes the plurality of first images as input and the first skeleton type as the correct answer.

As a result, the information processing apparatus 100 can provide a learning model with which users can easily perform a self-diagnosis without having an expert diagnose their skeleton. Furthermore, although this is a self-diagnosis, the provided learning model learns the diagnosis results of experts as correct-answer data, so highly reliable skeletal diagnosis can be performed.

(2) The information processing apparatus 100, wherein the learning unit 130 generates the learning model by learning teacher data that takes, as the first image for input, one image obtained by combining the plurality of first images, and takes the first skeleton type as the correct answer.

As a result, the information processing apparatus 100 can shorten the learning time and reduce the load of the learning process compared with having the learning model learn each of the plurality of first images.
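Combining several images into one input, as in configuration (2), can be sketched as follows. This passage does not state how the images are combined, so horizontal concatenation of equally tall images is an assumption, as are the nested-list image representation and the function name `combine_side_by_side`.

```python
from typing import List

# A grayscale image as rows of pixel values (assumed representation).
Image = List[List[int]]


def combine_side_by_side(images: List[Image]) -> Image:
    """Concatenate images of equal height horizontally into one image."""
    if not images:
        raise ValueError("at least one image is required")
    height = len(images[0])
    if any(len(img) != height for img in images):
        raise ValueError("all images must have the same height")
    # Join row r of every image, left to right.
    return [sum((img[r] for img in images), []) for r in range(height)]


# Tiny 2x2 stand-ins for a front image and a rear image.
front = [[1, 2], [3, 4]]
rear = [[5, 6], [7, 8]]
print(combine_side_by_side([front, rear]))  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

With this approach the model sees one training sample per person instead of one per view, which is the source of the reduced training time described above.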

(3) The information processing apparatus 100, wherein the learning unit 130 further trains the learning model using the feature amount output from the feature amount extraction layer of the learning model, and the error between the result output from the fully connected layer of the learning model and the correct-answer data of the first skeleton type.

As a result, the information processing apparatus 100 can shorten the learning time and reduce the load of the learning process compared with performing the processing starting from the feature amount extraction layer of the learning model.

(4) The information processing apparatus 100, wherein the learning unit 130 generates the learning model by learning teacher data that takes each of the plurality of first images individually as the first image for input, and takes the first skeleton type as the correct answer.

As a result, the information processing apparatus 100 can generate the learning model without causing the image degradation that may occur when a plurality of first images are combined.

(5) The information processing apparatus 100, wherein the learning unit 130 further trains the learning model using one feature amount obtained by combining a plurality of feature amounts output from the feature amount extraction layer of the learning model, and the error between the result output from the fully connected layer of the learning model and the correct-answer data of the first skeleton type.
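The training described in configuration (5) can be sketched as follows: per-image feature amounts are combined into one feature amount, and only the fully connected layer is updated from the error between its output and the correct label. This is a minimal pure-Python sketch under stated assumptions. The feature vectors are hard-coded stand-ins for the output of a feature amount extraction layer, combination is done by concatenation, the error is a softmax cross-entropy, and the learning rate and function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fc_forward(weights, bias, features) -> List[float]:
    """Fully connected layer followed by softmax: one probability per type."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def fc_update(weights, bias, features, label, lr=0.1) -> float:
    """One gradient step on the fully connected layer only.

    Returns the cross-entropy loss measured before the update.
    """
    probs = fc_forward(weights, bias, features)
    for k, row in enumerate(weights):
        grad = probs[k] - (1.0 if k == label else 0.0)  # dLoss/dLogit_k
        for j, x in enumerate(features):
            row[j] -= lr * grad * x
        bias[k] -= lr * grad
    return -math.log(probs[label])


# Hypothetical feature amounts for a front image and a side image.
front_features = [0.2, 0.9]
side_features = [0.7, 0.1]
combined = front_features + side_features  # one combined feature amount

weights = [[0.0] * len(combined) for _ in range(3)]  # 3 skeleton types
bias = [0.0] * 3
losses = [fc_update(weights, bias, combined, label=0) for _ in range(20)]
print(losses[0] > losses[-1])
```

Because the feature amounts are treated as fixed inputs here, no computation passes back through the feature amount extraction layer during the update.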

(6) The information processing apparatus 100 according to any one of (1) to (5) above, wherein the acquisition unit 120 acquires, as the plurality of first images, at least two of a first front image, a first rear image, and a first side image of the person.

As a result, the information processing apparatus 100 can have the learning model learn images showing parts of the body where the physical characteristics relevant to skeletal diagnosis appear more readily. This makes it possible to provide a learning model capable of more accurate skeletal diagnosis.

(7) The information processing apparatus 100 according to any one of (1) to (6) above, wherein the acquisition unit 120 acquires, as the plurality of first images, at least two images each showing one of the person's clavicle, legs to waist, chest, and hands.

As a result, the information processing apparatus 100 can have the learning model learn images showing parts of the body where the physical characteristics relevant to skeletal diagnosis appear more readily. This makes it possible to provide a learning model capable of more accurate skeletal diagnosis.

(8) An information processing apparatus 100 comprising: an acquisition unit 120 that acquires a plurality of second images showing a user; and a determination unit 140 that determines a second skeleton type of the user based on the result output when the plurality of second images are input to a learning model generated by learning teacher data that takes a plurality of first images showing a person as input and a first skeleton type of the person as the correct answer.

(9) The information processing apparatus 100 according to (8) above, wherein the acquisition unit 120 acquires, as the plurality of second images, at least two of a second front image, a second rear image, and a second side image of the user.

As a result, the information processing apparatus 100 can have the learning model learn images showing parts of the body where the physical characteristics relevant to skeletal diagnosis appear more readily. This makes it possible to provide a learning model capable of more accurate skeletal diagnosis.

(10) The information processing apparatus 100 according to (8) or (9) above, wherein the acquisition unit 120 acquires, as the plurality of second images, at least two images each showing one of the user's clavicle, legs to waist, chest, and hands.

(11) The information processing apparatus 100 according to any one of (8) to (10) above, further comprising a search unit 150 that searches for a recommended coordination for the user based on the second skeleton type.

As a result, the information processing apparatus 100 can present the user with a recommended coordination that suits the user's skeleton type.

(12) The information processing apparatus 100 according to (11) above, wherein the search unit 150 further searches for sales information on at least one of the clothing, fashion goods, and accessories retrieved as the recommended coordination, and the materials and patterns that suit the user.

As a result, the information processing apparatus 100 can promote sales of the clothing, fashion goods, and accessories retrieved as the recommended coordination.
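The two-stage search in configurations (11) and (12) can be pictured as two keyed lookups: the determined skeleton type retrieves recommended coordinations, and each retrieved coordination retrieves its sales information. The sketch below is a hypothetical illustration only. The table contents, identifiers such as `coordination_A` and `item_001`, and the function names are invented placeholders and do not come from the disclosure.

```python
from typing import Dict, List

# Placeholder tables stored in advance; every entry is invented for illustration.
COORDINATIONS_BY_TYPE: Dict[str, List[str]] = {
    "straight": ["coordination_A"],
    "wave": ["coordination_B"],
    "natural": ["coordination_C", "coordination_D"],
}
SALES_INFO_BY_COORDINATION: Dict[str, List[dict]] = {
    "coordination_A": [{"item": "item_001", "store": "store_X"}],
    "coordination_C": [{"item": "item_002", "store": "store_Y"}],
}


def search_coordinations(skeleton_type: str) -> List[str]:
    """Look up recommended coordinations using the skeleton type as the key."""
    return COORDINATIONS_BY_TYPE.get(skeleton_type, [])


def search_sales_info(coordinations: List[str]) -> List[dict]:
    """Look up sales information using each retrieved coordination as the key."""
    results: List[dict] = []
    for name in coordinations:
        results.extend(SALES_INFO_BY_COORDINATION.get(name, []))
    return results


recommended = search_coordinations("natural")
print(recommended)                     # ['coordination_C', 'coordination_D']
print(search_sales_info(recommended))  # [{'item': 'item_002', 'store': 'store_Y'}]
```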

(13) A method in which the information processing apparatus 100 executes processing of: acquiring a plurality of first images showing a person and a first skeleton type of the person; and generating a learning model by learning teacher data that takes the plurality of first images as input and the first skeleton type as the correct answer.

(14) A program that causes the information processing apparatus 100 to execute processing of: acquiring a plurality of first images showing a person and a first skeleton type of the person; and generating a learning model by learning teacher data that takes the plurality of first images as input and the first skeleton type as the correct answer.

(15) A method in which the information processing apparatus 100 executes processing of: acquiring a plurality of second images showing a user; and determining a second skeleton type of the user based on the result output when the plurality of second images are input to a learning model generated by learning teacher data that takes a plurality of first images showing a person as input and a first skeleton type of the person as the correct answer.

(16) A program that causes the information processing apparatus 100 to execute processing of: acquiring a plurality of second images showing a user; and determining a second skeleton type of the user based on the result output when the plurality of second images are input to a learning model generated by learning teacher data that takes a plurality of first images showing a person as input and a first skeleton type of the person as the correct answer.

100 Information processing apparatus
110 Storage unit
120 Acquisition unit
130 Learning unit
140 Determination unit
150 Search unit
160 Transmission unit
170 Imaging unit
200 User terminal
300 Store terminal

Claims

An information processing apparatus comprising a control unit that executes processing of:
inputting a plurality of second images showing a user into a learning model for determining a skeleton type from an image of a person, the learning model being trained using teacher data having a plurality of first images showing a first person as input data and a first skeleton type of the first person as a correct label, the learning model having been trained by updating a fully connected layer of the learning model using (a) one feature amount obtained by combining a plurality of feature amounts that are output from a feature amount extraction layer of the learning model when the plurality of first images are input to the learning model and that correspond to the respective first images, (b) a probability of being a second skeleton type output from the fully connected layer, and (c) the first skeleton type; and
determining the skeleton type of the user based on the probability of being each skeleton type output from the learning model.

The information processing apparatus according to claim 1, wherein the plurality of first images includes at least two of a first front image, a first rear image, and a first side image of the first person, and
the control unit inputs to the learning model, as the plurality of second images, at least two of a second front image, a second rear image, and a second side image of the user, each showing the same part as the first images used as the teacher data.

The information processing apparatus according to claim 1, wherein the plurality of first images includes at least two images each showing one of the clavicle, legs to waist, chest, and hands of the first person, and
the control unit inputs to the learning model, as the plurality of second images, at least two images each showing one of the clavicle, legs to waist, chest, and hands of the user, each showing the same part as the first images used as the teacher data.

The information processing apparatus according to any one of claims 1 to 3, wherein the control unit searches, using the determined skeleton type as a search key, for a recommended coordination that is stored in advance in association with the skeleton type.

The information processing apparatus according to claim 4, wherein the recommended coordination includes at least one of clothing, fashion goods, accessories, and materials and patterns that suit the user, and
the control unit further executes processing of searching, using the retrieved recommended coordination as a search key, for sales information that is stored in advance in association with the recommended coordination.

The information processing apparatus according to claim 1, wherein the control unit trains the learning model using teacher data having the plurality of first images as input data and the first skeleton type as a correct label.

An information processing apparatus comprising a control unit that executes processing of:
inputting one image obtained by combining a plurality of second images showing a user into a learning model for determining a skeleton type from an image of a person, the learning model being trained using teacher data having one image obtained by combining a plurality of first images showing a first person as input data and a first skeleton type as a correct label, the learning model having been trained by updating a fully connected layer of the learning model using (a) a feature amount output from a feature amount extraction layer of the learning model when the one image obtained by combining the plurality of first images is input to the learning model, (b) a probability of being a second skeleton type output from the fully connected layer, and (c) the first skeleton type; and
determining the skeleton type of the user based on the probability of being each skeleton type output from the learning model.

The information processing apparatus according to claim 7, wherein the control unit trains the learning model using teacher data having each of the plurality of first images as input data and the first skeleton type as a correct label.

The information processing apparatus according to claim 7, wherein the plurality of first images includes at least two of a first front image, a first rear image, and a first side image of the first person, and
the plurality of second images includes at least two of a second front image, a second rear image, and a second side image of the user, each showing the same part as the first images used as the teacher data.

The information processing apparatus according to claim 7, wherein the plurality of first images includes at least two images each showing one of the clavicle, legs to waist, chest, and hands of the first person, and
the plurality of second images includes at least two images each showing one of the clavicle, legs to waist, chest, and hands of the user, each showing the same part as the first images used as the teacher data.

A method in which an information processing apparatus executes processing of:
inputting a plurality of second images showing a user into a learning model for determining a skeleton type from an image of a person, the learning model being trained using teacher data having a plurality of first images showing a first person as input data and a first skeleton type of the first person as a correct label, the learning model having been trained by updating a fully connected layer of the learning model using (a) one feature amount obtained by combining a plurality of feature amounts that are output from a feature amount extraction layer of the learning model when the plurality of first images are input to the learning model and that correspond to the respective first images, (b) a probability of being a second skeleton type output from the fully connected layer, and (c) the first skeleton type; and
determining the skeleton type of the user based on the probability of being each skeleton type output from the learning model.

The method according to claim 11, wherein the information processing apparatus trains the learning model using teacher data having the plurality of first images as input data and the first skeleton type as a correct label.

A program that causes an information processing apparatus to execute processing of:
inputting a plurality of second images showing a user into a learning model for determining a skeleton type from an image of a person, the learning model being trained using teacher data having a plurality of first images showing a first person as input data and a first skeleton type of the first person as a correct label, the learning model having been trained by updating a fully connected layer of the learning model using (a) one feature amount obtained by combining a plurality of feature amounts that are output from a feature amount extraction layer of the learning model when the plurality of first images are input to the learning model and that correspond to the respective first images, (b) a probability of being a second skeleton type output from the fully connected layer, and (c) the first skeleton type; and
determining the skeleton type of the user based on the probability of being each skeleton type output from the learning model.

The program according to claim 13, further causing the information processing apparatus to execute processing of training the learning model using teacher data having the plurality of first images as input data and the first skeleton type as a correct label.