JP7563451B2

JP7563451B2 - Information processing device, program, and information processing method

Info

Publication number: JP7563451B2
Application number: JP2022522566A
Authority: JP
Inventors: 哲哉加川
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2020-05-14
Filing date: 2021-04-12
Publication date: 2024-10-08
Anticipated expiration: 2041-04-12
Also published as: JPWO2021229973A1; WO2021229973A1

Description

The present invention relates to an information processing device, a program, and an information processing method.

Conventionally, prediction models such as deductive prediction models and learning models generated by machine learning with an inductive approach have been used to predict the functions of compounds. A deductive prediction model predicts the function of a compound from known principles and regularities concerning compounds. A learning model is a prediction model that represents the correlation between explanatory variables and an objective variable, obtained by inductive learning in which, for example, descriptors relating to the structure of a compound are the explanatory variables and the function exhibited by the compound is the objective variable (see, for example, Patent Document 1). Developing materials by means of information processing that includes machine learning is called materials informatics (hereinafter referred to as "MI").

Information on the compounds whose functions are to be predicted with such prediction models, and information on the compounds used to generate learning models, can be obtained from, for example, publicly available databases.

Patent Document 1: JP 2002-73626 A

However, only a limited number of compounds have the necessary information available in public databases. To increase the number of candidate compounds for function prediction, or to increase the number of compounds used to generate a learning model and thereby further improve prediction accuracy, compound information must also be obtained from private databases. Obtaining compound structure information from a private database, however, may lead to leakage of confidential information relating to the structures of the compounds.

An object of the present invention is to provide an information processing device, a program, and an information processing method that can increase the security of confidential information relating to the structures of compounds.

In order to achieve the above object, the invention of the information processing device according to claim 1 comprises:
an information providing unit that provides a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
a first data acquisition unit that acquires, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
a prediction unit that predicts, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to the structure of a compound according to the encryption algorithm, and function data relating to the function of the compound.

The invention according to claim 2 is the information processing device according to claim 1, further comprising:
an encryption unit that encrypts structure data relating to the structure of a compound according to the encryption algorithm to generate encrypted structure data; and
a learning model generation unit that generates a learning model as the prediction model based on the encrypted structure data and function data relating to the function of the compound.

The invention according to claim 3 is the information processing device according to claim 2, wherein
the first data acquisition unit acquires, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to the encrypted structure data of the learning target, and
the learning model generation unit generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the first data acquisition unit.

In order to achieve the above object, the invention of the information processing device according to claim 4 comprises:
an encryption unit that encrypts structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
a learning model generation unit that generates, based on the encrypted structure data and function data relating to the function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
an information providing unit that provides a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
a first data acquisition unit that acquires, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to that encrypted structure data,
wherein the learning model generation unit generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the first data acquisition unit.

The invention according to claim 5 is the information processing device according to claim 4, wherein
the first data acquisition unit acquires, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm, and
the information processing device further comprises a prediction unit that predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target acquired by the first data acquisition unit.

The invention according to claim 6 is the information processing device according to any one of claims 2, 3, and 5, further comprising
a structure data generation unit that generates the structure data, wherein
the encryption unit encrypts the structure data generated by the structure data generation unit to generate the encrypted structure data of a function prediction target, and
the prediction unit predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target generated by the encryption unit.

The invention according to claim 7 is the information processing device according to any one of claims 2, 3, and 5, further comprising
a second data acquisition unit that acquires structure data relating to the structure of a compound from a second external device that publishes the structure of the compound, wherein
the encryption unit encrypts the structure data acquired by the second data acquisition unit to generate the encrypted structure data of a function prediction target, and
the prediction unit predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target generated by the encryption unit.

The invention according to claim 8 is the information processing device according to any one of claims 2 to 7, further comprising
a third data acquisition unit that acquires the structure data and the function data from a predetermined external database, wherein
the encryption unit generates the encrypted structure data based on the structure data acquired by the third data acquisition unit, and
the learning model generation unit generates the learning model using at least that encrypted structure data and the function data acquired by the third data acquisition unit.

The invention according to claim 9 is the information processing device according to any one of claims 1 to 8, wherein
the encryption algorithm does not permit reverse conversion to the structure data before encryption.

In order to achieve the above object, the invention of the program according to claim 10 causes a computer provided in an information processing device to function as:
an information providing means for providing a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
a data acquisition means for acquiring, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
a prediction means for predicting, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to the structure of a compound according to the encryption algorithm, and function data relating to the function of the compound.

In order to achieve the above object, the invention of the program according to claim 11 causes a computer provided in an information processing device to function as:
an encryption means for encrypting structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
a learning model generation means for generating, based on the encrypted structure data and function data relating to the function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
an information providing means for providing a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
a data acquisition means for acquiring, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to that encrypted structure data,
wherein the learning model generation means generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the data acquisition means.

In order to achieve the above object, the invention of the information processing method according to claim 12 is
an information processing method executed by an information processing device, the method including:
an information providing step of providing a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
a data acquisition step of acquiring, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
a prediction step of predicting, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to the structure of a compound according to the encryption algorithm, and function data relating to the function of the compound.

In order to achieve the above object, the invention of the information processing method according to claim 13 is
an information processing method executed by an information processing device, the method including:
an encryption step of encrypting structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
a learning model generation step of generating, based on the encrypted structure data and function data relating to the function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
an information providing step of providing a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
a data acquisition step of acquiring, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to that encrypted structure data,
wherein, in the learning model generation step, the learning model is generated using at least the encrypted structure data of the learning target and the function data of the learning target acquired in the data acquisition step.

The present invention has the effect of increasing the security of confidential information relating to the structures of compounds.

FIG. 1 is a diagram showing a schematic configuration of a compound information processing system.
FIG. 2 is a diagram illustrating an example of an encryption algorithm for generating encrypted molecular structure data.
FIG. 3 is a block diagram showing the main functional configuration of an MI server.
FIG. 4 is a diagram showing an example of the contents of labeled encrypted molecular structure data.
FIG. 5 is a block diagram showing the main functional configuration of a client server.
FIG. 6 is a diagram illustrating a first method for predicting the function of a compound.
FIG. 7 is a diagram illustrating a second method for predicting the function of a compound.
FIG. 8 is a diagram illustrating a third method for predicting the function of a compound.
FIG. 9 is a diagram illustrating a fourth method for predicting the function of a compound.

Embodiments of the information processing device, program, and information processing method of the present invention are described below with reference to the drawings.

FIG. 1 is a diagram showing a schematic configuration of a compound information processing system 100.
The compound information processing system 100 includes an MI server 1 (information processing device), a public database server 2 (hereinafter referred to as "public DB server 2") (predetermined database), a reagent database server 3 (hereinafter referred to as "reagent DB server 3") (second external device), and a client server 4 (first external device). The MI server 1, the public DB server 2, the reagent DB server 3, and the client server 4 are connected via a communication network N so that they can communicate with one another. The communication network N is, for example, the Internet, but is not limited to this.

The MI server 1 is a device owned by a provider of an information provision service relating to materials informatics (MI), and performs various kinds of information processing relating to MI. Specifically, the MI server 1 uses machine learning to generate, from information on compounds, a learning model (also called a "trained model") that predicts the function of a compound, uses the learning model to acquire or generate information useful for MI-based material development, and transmits that information to the client server 4. More specifically, the MI server 1 acquires a target value for the function of a compound from the client server 4, searches for a compound that exhibits the function at that target value, and transmits information relating to the structure of the identified compound to the client server 4.

In detail, the MI server 1 uses a large number of combinations of encrypted molecular structure data (encrypted structure data) relating to the structure of a compound and function data relating to the function of that compound to generate, by machine learning with an inductive approach, a learning model that represents the correlation between the encrypted molecular structure data and the function data. The encrypted molecular structure data corresponds to the explanatory variables of the machine learning, and the function data corresponds to the objective variable. The encrypted molecular structure data is obtained by encrypting molecular structure data (structure data) relating to the structure of the compound according to a predetermined encryption algorithm. The number of data sets, each consisting of encrypted molecular structure data and function data, used to generate the learning model is, for example, tens of thousands or more. One way to improve the prediction accuracy of the learning model is to increase the number of these data sets.
Hereinafter, the encrypted molecular structure data and the function data used to generate the learning model are also referred to as "encrypted molecular structure data of the learning target" and "function data of the learning target", respectively. The learning model generated by the MI server 1 is one kind of prediction model for predicting the function of a compound.

The molecular structure data from which the encrypted molecular structure data is derived is not particularly limited, as long as it can identify the composition of the molecule, that is, the elements that make up the molecule and how they are bonded.

As the encryption algorithm for generating encrypted structure data from molecular structure data, it is possible to use, for example, an algorithm that extracts features of the molecular structure of a compound according to predetermined rules and converts them into numerical values.

FIG. 2 is a diagram illustrating an example of an encryption algorithm for generating encrypted molecular structure data.
In the encryption algorithm of FIG. 2, the structural formula shown in the upper part of the figure is converted, according to its features, into the code shown in the lower part of the figure. Each digit of the code is 0 or 1. The conversion rule of the encryption algorithm of FIG. 2 can be, for example, as follows.
First, each atom constituting the molecule is numbered by the Morgan method.
Next, atom information is assigned according to the Daylight rule, and information on the fragments contained in the molecule is added.
Next, duplicate fragments are removed.
Finally, each resulting fragment is assigned to a predetermined digit by a hash function. For example, if the molecule contains a particular fragment, the corresponding digit of the code is set to 1.
Encrypted molecular structure data generated by such an encryption algorithm can also be regarded as a kind of descriptor representing the features of the molecular structure. That is, the positions of the digits whose value is 1 make it possible to identify features of the molecular structure from multiple aspects. On the other hand, because the hash function is a one-way function, the encrypted molecular structure data cannot be converted back into the molecular structure data. In other words, this embodiment uses an irreversible encryption algorithm.
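The last two steps of the conversion rule (removing duplicate fragments and hashing each fragment to a digit) can be sketched as follows. This is a minimal illustration, not the algorithm of FIG. 2 itself: the fragment list is supplied by hand rather than enumerated with Morgan numbering and the Daylight rule, and the function name `fold_fragments`, the 64-digit code length, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def fold_fragments(fragments, n_bits=64):
    """Map a collection of fragment strings to a fixed-length 0/1 code.

    n_bits and SHA-256 are illustrative assumptions; the source only says
    that a hash function assigns each fragment to a predetermined digit.
    """
    bits = [0] * n_bits
    # set() drops duplicate fragments before hashing.
    for fragment in set(fragments):
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % n_bits
        bits[position] = 1  # the digit for this fragment is set to 1
    return bits


# Hypothetical, hand-written fragment list; a real system would derive
# the fragments from the numbered atoms of the structural formula.
ethanol = ["C", "O", "CC", "CO", "CCO", "CC"]
code = fold_fragments(ethanol)
print("".join(map(str, code)))
```

Because several fragments can hash to the same digit, and the digit records only presence, the code does not list the fragments that produced it. The same input always yields the same code, which is what allows codes produced on different machines to be compared.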

The type of learning model generated by the MI server 1 is not particularly limited, as long as it represents the correlation between the encrypted molecular structure data and the function data. Various known models, such as linear regression, principal component analysis, decision trees, random forests, and support vector machines, can be used as the learning model.

The MI server 1 also predicts the function of a compound by applying the generated learning model to the encrypted molecular structure data of the compound whose function is to be predicted (hereinafter also referred to as "encrypted molecular structure data of the function prediction target"). The MI server 1 performs function prediction for a large number of items of encrypted molecular structure data and identifies the encrypted molecular structure data whose prediction result matches the target value of the function received from the client server 4. The MI server 1 then transmits information relating to the identified encrypted molecular structure data to the client server 4.
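The prediction and search flow described above can be sketched in a few lines. The model here is a similarity-weighted nearest-neighbour regressor, chosen only because it needs no external library; it is a stand-in, not one of the model types named in the text. The function names, the toy 4-digit codes, the function values, and the `tolerance` used to decide what counts as matching the target value are all assumptions.

```python
def tanimoto(a, b):
    """Similarity of two 0/1 codes: shared set digits over all set digits."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def predict(code, training_set, k=2):
    """Similarity-weighted mean function value of the k most similar codes."""
    scored = sorted(
        ((tanimoto(code, c), y) for c, y in training_set),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(s for s, _ in scored)
    if total == 0:
        return sum(y for _, y in scored) / len(scored)
    return sum(s * y for s, y in scored) / total


def screen(candidates, training_set, target, tolerance):
    """Return IDs of candidates whose predicted function is near the target."""
    return [
        cid
        for cid, code in candidates.items()
        if abs(predict(code, training_set) - target) <= tolerance
    ]


# Made-up learning-target data: (encrypted code, function value) pairs.
training_set = [
    ([1, 1, 0, 0], 10.0),
    ([1, 0, 0, 0], 12.0),
    ([0, 0, 1, 1], 50.0),
    ([0, 0, 0, 1], 48.0),
]
# Made-up function-prediction targets, keyed by a hypothetical ID.
candidates = {
    "cand-1": [1, 1, 0, 0],
    "cand-2": [0, 0, 1, 1],
    "cand-3": [0, 1, 1, 0],
}
hits = screen(candidates, training_set, target=50.0, tolerance=2.0)
print(hits)
```

Both training and screening operate only on the 0/1 codes and the function values, so nothing in this flow requires the underlying structural formulas.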

The public DB server 2 stores molecular structure data relating to the molecular structures of a large number of compounds, together with function data relating to the functions exhibited by those compounds. The public DB server 2 provides this molecular structure data and function data in response to a request from another device (in this embodiment, the MI server 1). The molecular structure data provided from the public DB server 2 to the MI server 1 is encrypted in the MI server 1 and converted into encrypted molecular structure data. This encrypted molecular structure data is used, together with the function data, for the machine learning that generates the learning model, and can also be used as encrypted molecular structure data of a function prediction target.

The reagent DB server 3 stores molecular structure data relating to the molecular structures of a large number of compounds offered for sale. The reagent DB server 3 can also be said to provide a catalog of purchasable reagents (compounds). The reagent DB server 3 provides molecular structure data in response to a request from another device (in this embodiment, the MI server 1). In this embodiment, the reagent DB server 3 is assumed not to provide function data relating to the functions of the compounds. The molecular structure data provided from the reagent DB server 3 to the MI server 1 is encrypted in the MI server 1 and converted into encrypted molecular structure data. This encrypted molecular structure data is used as encrypted molecular structure data of a function prediction target.

The client server 4 is a device owned by a client who receives the MI-related information provision service provided by the MI server 1. The client server 4 transmits to the MI server 1 data such as data specifying the target value of the compound function desired by the client, and receives from the MI server 1 information relating to the structure of a compound that exhibits that function. In order to receive the necessary MI-related information provision service, the client server 4 also transmits to the MI server 1 encrypted molecular structure data of a function prediction target, or encrypted molecular structure data of a learning target together with function data of the learning target.

In this specification, the molecular structure data relating to the structures of compounds stored in the client server 4 is assumed to be confidential information. In this embodiment, the client server 4 encrypts this molecular structure data and transmits the resulting encrypted molecular structure data to the MI server 1, so that the client can receive the necessary information provision service without disclosing the confidential molecular structure data to the MI server 1. The mechanism for protecting the confidential information in the client server 4 in this way is described in detail later.
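The client-side half of this arrangement can be sketched as follows: the client receives a description of the encryption algorithm, applies it locally, and builds a payload that contains only encrypted codes and function values. The `AlgorithmInfo` fields, the record layout, and the use of a hashed-fragment encoding are hypothetical choices for illustration; the source does not specify the format of the encryption algorithm information.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmInfo:
    """Hypothetical stand-in for the encryption algorithm information."""
    n_bits: int
    hash_name: str


def encrypt(fragments, info):
    """Encode a fragment list as a 0/1 code using the provided parameters."""
    bits = [0] * info.n_bits
    for fragment in set(fragments):
        digest = hashlib.new(info.hash_name, fragment.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:8], "big") % info.n_bits] = 1
    return bits


def build_upload(private_records, info):
    """Client side: keep only fields that may leave the client server."""
    return [
        {
            "encrypted_structure": encrypt(record["fragments"], info),
            "function": record["function"],
        }
        for record in private_records
    ]


# Assumed to have been received from the MI server beforehand.
info = AlgorithmInfo(n_bits=32, hash_name="sha256")

# Made-up confidential record held only on the client server.
private_records = [
    {"name": "in-house compound X", "fragments": ["C", "CC", "CN"], "function": 3.2},
]
upload = build_upload(private_records, info)
print(upload)
```

Because both sides use the same algorithm parameters, codes produced on the client are directly comparable with codes the MI server produces from public data, while the compound name and fragment list never appear in the payload.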

Next, the detailed configurations of the MI server 1 and the client server 4 are described.
FIG. 3 is a block diagram showing the main functional configuration of the MI server 1.
The MI server 1 includes a control unit 11, an operation unit 12, a display unit 13, a communication unit 14, and the like, and these units are connected to one another via a bus 15.

The control unit 11 is a processor (computer) that performs overall control of the operation of the MI server 1. The control unit 11 has a CPU 111 (Central Processing Unit), a RAM 112 (Random Access Memory), and a storage unit 113.

The CPU 111 reads the various control programs 113c and setting data stored in the storage unit 113, stores them in the RAM 112, and executes the programs 113c to perform various kinds of arithmetic processing.
The RAM 112 provides the CPU 111 with a working memory space and stores temporary data. The RAM 112 may include non-volatile memory.

記憶部１１３には、上記のプログラム１１３ｃの他、ＭＩに係る情報処理を行うための各種データが記憶されている。記憶部１１３としては、例えばＨＤＤ（Hard Disk Drive）が用いられ、また、ＤＲＡＭ（Dynamic Random Access Memory）などが併用されてもよい。
記憶部１１３に記憶されるデータには、一般データ１１３ａ、クライアント由来データ１１３ｂ、暗号化アルゴリズム情報Ｄ１、及び学習モデルデータＤ２などがある。 In addition to the above-mentioned program 113c, various data for performing information processing related to the MI are stored in the storage unit 113. As the storage unit 113, for example, a hard disk drive (HDD) is used, and a dynamic random access memory (DRAM) or the like may also be used in combination.
The data stored in the storage unit 113 includes general data 113a, client-derived data 113b, encryption algorithm information D1, and learning model data D2.

一般データ１１３ａは、化合物の構造及び機能に係るデータのうち、クライアントサーバー４を介さずに取得されたもの、すなわち、公的ＤＢサーバー２や試薬ＤＢサーバー３から取得したデータ、又はＭＩサーバー１内で生成したデータ等である。
具体的には、一般データ１１３ａは、学習対象の分子構造データＡ１、その暗号化分子構造データＡ２、及び機能データＡ３を含む。また、一般データ１１３ａは、機能予測対象の分子構造データＢ１及びその暗号化分子構造データＢ２を含む。これらのうち分子構造データＡ１及び機能データＡ３は、公的ＤＢサーバー２から取得される。また、分子構造データＢ１は、公的ＤＢサーバー２又は試薬ＤＢサーバー３から取得される。また、後述するように、分子構造データＢ１は、ＭＩサーバー１内で生成される場合もある。 General data 113a is data relating to the structure and function of a compound that has been obtained without going through the client server 4, i.e., data obtained from the public DB server 2 or the reagent DB server 3, or data generated within the MI server 1.
Specifically, the general data 113a includes molecular structure data A1 of the learning target, its encrypted molecular structure data A2, and function data A3. The general data 113a also includes molecular structure data B1 of the function prediction target and its encrypted molecular structure data B2. Of these, the molecular structure data A1 and the function data A3 are obtained from the public DB server 2. Furthermore, the molecular structure data B1 is obtained from the public DB server 2 or the reagent DB server 3. Furthermore, as described below, the molecular structure data B1 may be generated within the MI server 1.

クライアント由来データ１１３ｂは、化合物の構造及び機能に係るデータのうち、クライアントサーバー４から取得したデータである。クライアント由来データ１１３ｂは、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌ、学習対象の暗号化分子構造データＣ２、及び学習対象の機能データＣ３を含む。The client-derived data 113b is data relating to the structure and function of a compound, which is acquired from the client server 4. The client-derived data 113b includes labeled encrypted molecular structure data C2L of the target for function prediction, encrypted molecular structure data C2 of the target for learning, and functional data C3 of the target for learning.

図４は、ラベル付き暗号化分子構造データＣ２Ｌの内容例を示す図である。
ラベル付き暗号化分子構造データＣ２Ｌは、機能予測対象の複数の暗号化分子構造データの各々に対して、固有のラベル（ここでは自然数）が対応付けられたデータである。 FIG. 4 is a diagram showing an example of the contents of the labeled encrypted molecular structure data C2L.
The labeled encrypted molecular structure data C2L is data in which a unique label (here, a natural number) is associated with each of a plurality of encrypted molecular structure data that are targets of function prediction.
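
The label association described above can be pictured with a small Python sketch. This is only an illustration: the exact layout shown in FIG. 4 is not reproduced here, and the function name `attach_labels` and the sample encrypted strings are hypothetical.

```python
from typing import Dict, List


def attach_labels(encrypted_items: List[str]) -> Dict[int, str]:
    """Associate each encrypted record with a unique natural-number label (1, 2, 3, ...)."""
    return {label: item for label, item in enumerate(encrypted_items, start=1)}


# Hypothetical encrypted records; the real encoding is determined by the
# encryption algorithm information D1, which is not modeled here.
c2l = attach_labels(["9f3a", "07bc", "e41d"])
```

Because only the label needs to be sent back to identify a record, the holder of the label table can look up the corresponding compound without the other party ever handling the unencrypted structure.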

図３に示す暗号化アルゴリズム情報Ｄ１は、分子構造データから暗号化分子構造データを生成するための暗号化アルゴリズムに係る情報である。暗号化分子構造データを生成するための所定の暗号化プログラムの実行の際に、暗号化アルゴリズム情報Ｄ１が参照されることで、特定の暗号化アルゴリズムに従った暗号化を行うことができる。あるいは、暗号化アルゴリズム情報Ｄ１は、暗号化プログラムそのものであってもよい。
暗号化アルゴリズム情報Ｄ１は、ＭＩサーバー１の制御部１１が分子構造データＡ１、Ｂ１を暗号化して暗号化分子構造データＡ２、Ｂ２を生成する際に用いられる。また、暗号化アルゴリズム情報Ｄ１は、クライアントサーバー４における暗号化処理のためにクライアントサーバー４に送信される。 The encryption algorithm information D1 shown in FIG. 3 is information related to an encryption algorithm for generating encrypted molecular structure data from molecular structure data. When a predetermined encryption program for generating encrypted molecular structure data is executed, the encryption algorithm information D1 is referenced so that encryption is performed according to a specific encryption algorithm. Alternatively, the encryption algorithm information D1 may be the encryption program itself.
The encryption algorithm information D1 is used when the control unit 11 of the MI server 1 encrypts the molecular structure data A1 and B1 to generate the encrypted molecular structure data A2 and B2. In addition, the encryption algorithm information D1 is transmitted to the client server 4 for encryption processing in the client server 4.

学習モデルデータＤ２は、学習対象の暗号化分子構造データＡ２、機能データＡ３、及び／又は学習対象の暗号化分子構造データＣ２、及び機能データＣ３に基づいて機械学習により生成された学習モデルに係るデータである。機能予測対象の暗号化分子構造データに対し、学習モデルデータＤ２により表される学習モデルを適用することで、当該暗号化分子構造データに対応する化合物の機能を予測することができる。本明細書では、学習モデルデータＤ２を生成することを「学習モデルを生成する」とも記す。The learning model data D2 is data relating to a learning model generated by machine learning based on the encrypted molecular structure data A2, functional data A3, and/or the encrypted molecular structure data C2, and functional data C3 of the learning target. By applying the learning model represented by the learning model data D2 to the encrypted molecular structure data of the function prediction target, it is possible to predict the function of the compound corresponding to the encrypted molecular structure data. In this specification, generating the learning model data D2 is also referred to as "generating a learning model."

これらの構成を有する制御部１１は、ＣＰＵ１１１がプログラム１１３ｃを実行することで、暗号化部（暗号化手段）、学習モデル生成部（学習モデル生成手段）、情報提供部（情報提供手段）、第１のデータ取得部（第１のデータ取得手段）、第２のデータ取得部（第２のデータ取得手段）、第３のデータ取得部（第３のデータ取得手段）、予測部（予測手段）、及び構造データ生成部（構造データ生成手段）として機能する。
暗号化部は、化合物の構造に係る分子構造データＡ１、Ｂ１を、暗号化アルゴリズム情報Ｄ１により示される暗号化アルゴリズムに従って暗号化して暗号化分子構造データＡ２、Ｂ２を生成する。
学習モデル生成部は、学習対象の暗号化分子構造データＡ２、機能データＡ３、及び／又は学習対象の暗号化分子構造データＣ２、及び機能データＣ３に基づいて機械学習を行って学習モデルデータＤ２を生成する。
情報提供部は、クライアントサーバー４に対して、上記暗号化アルゴリズムに従った暗号化を実行するための暗号化アルゴリズム情報Ｄ１を提供する（通信部１４により送信させる）。
第１のデータ取得部は、クライアントサーバー４から、通信部１４を介して、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌ、学習対象の暗号化分子構造データＣ２、及び学習対象の機能データＣ３を取得する。
第２のデータ取得部は、試薬ＤＢサーバー３から、通信部１４を介して分子構造データＢ１を取得する。
第３のデータ取得部は、公的ＤＢサーバー２から、通信部１４を介して分子構造データＡ１及び機能データＡ３を取得する。
予測部は、機能予測対象の暗号化分子構造データＡ２、Ｂ２、及びラベル付き暗号化分子構造データＣ２Ｌに対応する化合物の機能を、学習モデルデータＤ２により表される学習モデルに基づいて予測する。
構造データ生成部は、遺伝的アルゴリズムなどを用いて機械的に分子構造データＢ１を生成し、記憶部１１３に記憶させる。 When the CPU 111 executes the program 113c, the control unit 11 configured as described above functions as an encryption unit (encryption means), a learning model generation unit (learning model generation means), an information provision unit (information provision means), a first data acquisition unit (first data acquisition means), a second data acquisition unit (second data acquisition means), a third data acquisition unit (third data acquisition means), a prediction unit (prediction means), and a structure data generation unit (structure data generation means).
The encryption unit encrypts molecular structure data A1, B1 relating to the structure of a compound in accordance with an encryption algorithm indicated by encryption algorithm information D1 to generate encrypted molecular structure data A2, B2.
The learning model generation unit performs machine learning based on the encrypted molecular structure data A2 and functional data A3 of the learning target, and/or the encrypted molecular structure data C2 and functional data C3 of the learning target, to generate learning model data D2.
The information providing unit provides the client server 4 with encryption algorithm information D1 for performing encryption according to the encryption algorithm (causing the communication unit 14 to transmit the information).
The first data acquisition unit acquires, from the client server 4 via the communication unit 14, the labeled encrypted molecular structure data C2L of the function prediction target, the encrypted molecular structure data C2 of the learning target, and the functional data C3 of the learning target.
The second data acquisition unit acquires molecular structure data B1 from the reagent DB server 3 via the communication unit 14.
The third data acquisition unit acquires the molecular structure data A1 and the function data A3 from the public DB server 2 via the communication unit 14.
The prediction unit predicts, based on the learning model represented by the learning model data D2, the functions of the compounds corresponding to the function-prediction-target encrypted molecular structure data A2 and B2 and the labeled encrypted molecular structure data C2L.
The structure data generation unit mechanically generates molecular structure data B1 using a genetic algorithm or the like, and stores the data in the storage unit 113.

操作部１２は、キーボード及びマウスといった入力デバイスや、表示部１３と一体的に設けられたタッチパネル等により実現される。操作部１２は、これらの入力デバイスやタッチパネルからの操作入力を受け付けて、操作入力に応じた操作信号を制御部１１に出力する。The operation unit 12 is realized by input devices such as a keyboard and a mouse, and a touch panel that is integral with the display unit 13. The operation unit 12 accepts operation input from these input devices and the touch panel, and outputs an operation signal corresponding to the operation input to the control unit 11.

表示部１３は、液晶表示装置や有機ＥＬ表示装置などにより実現され、制御部１１による制御下で各種情報を表示する。The display unit 13 is realized by a liquid crystal display device or an organic EL display device, etc., and displays various information under the control of the control unit 11.

通信部１４は、制御部１１による制御下で、公的ＤＢサーバー２、試薬ＤＢサーバー３及びクライアントサーバー４との間で通信ネットワークＮを介したデータの送受信を行う。 Under the control of the control unit 11, the communication unit 14 transmits and receives data between the public DB server 2, the reagent DB server 3 and the client server 4 via the communication network N.

図５は、クライアントサーバー４の主要な機能構成を示すブロック図である。
クライアントサーバー４は、制御部４１と、操作部４２と、表示部４３と、通信部４４などを備え、これらの各部はバス４５により接続されている。 FIG. 5 is a block diagram showing the main functional configuration of the client server 4.
The client server 4 includes a control unit 41, an operation unit 42, a display unit 43, a communication unit 44, and the like, and these units are connected to one another via a bus 45.

制御部４１は、クライアントサーバー４の動作を統括制御するプロセッサーである。制御部４１は、ＣＰＵ４１１、ＲＡＭ４１２及び記憶部４１３を有する。The control unit 41 is a processor that controls the overall operation of the client server 4. The control unit 41 has a CPU 411, a RAM 412, and a storage unit 413.

ＣＰＵ４１１は、記憶部４１３に記憶された各種制御用のプログラム４１３ａや設定データを読み出してＲＡＭ４１２に記憶させ、当該プログラム４１３ａを実行して各種演算処理を行う。
ＲＡＭ４１２は、ＣＰＵ４１１に作業用のメモリー空間を提供し、一時データを記憶する。ＲＡＭ４１２は、不揮発性メモリーを含んでいてもよい。 The CPU 411 reads out various control programs 413a and setting data stored in the storage unit 413, stores them in the RAM 412, and executes the programs 413a to perform various arithmetic processing.
The RAM 412 provides a working memory space for the CPU 411 and stores temporary data. The RAM 412 may include a non-volatile memory.

記憶部４１３には、上記のプログラム４１３ａの他、分子構造データＣ１、ラベル付き暗号化分子構造データＣ２Ｌ、暗号化分子構造データＣ２、機能データＣ３及び暗号化アルゴリズム情報Ｄ１などが記憶されている。記憶部４１３としては、例えばＨＤＤが用いられ、また、ＤＲＡＭなどが併用されてもよい。In addition to the above program 413a, the storage unit 413 stores molecular structure data C1, labeled encrypted molecular structure data C2L, encrypted molecular structure data C2, functional data C3, and encryption algorithm information D1. As the storage unit 413, for example, a HDD is used, and a DRAM or the like may also be used in combination.

記憶部４１３に記憶されている暗号化アルゴリズム情報Ｄ１は、ＭＩサーバー１から送信されたものであり、ＭＩサーバー１の記憶部１１３に記憶されている暗号化アルゴリズム情報Ｄ１と同一である。クライアントサーバー４の制御部４１は、暗号化アルゴリズム情報Ｄ１を用いることで、ＭＩサーバー１と同一の暗号化アルゴリズムに従って分子構造データＣ１を暗号化し、暗号化分子構造データＣ２を生成することができる。The encryption algorithm information D1 stored in the storage unit 413 was transmitted from the MI server 1 and is identical to the encryption algorithm information D1 stored in the storage unit 113 of the MI server 1. By using the encryption algorithm information D1, the control unit 41 of the client server 4 can encrypt the molecular structure data C1 according to the same encryption algorithm as the MI server 1 and generate the encrypted molecular structure data C2.

分子構造データＣ１は、クライアントが保有する化合物の分子構造に係るデータである。また、分子構造データＣ１は、クライアントにより機密情報として管理されている。
ラベル付き暗号化分子構造データＣ２Ｌは、上述のとおり、複数の暗号化分子構造データに固有のラベルが対応付けられたデータである（図４参照）。ラベル付き暗号化分子構造データＣ２Ｌに含まれる暗号化分子構造データは、暗号化アルゴリズム情報Ｄ１により示される暗号化アルゴリズムに従って制御部４１が分子構造データＣ１を暗号化することにより生成されたものである。
暗号化分子構造データＣ２は、暗号化アルゴリズム情報Ｄ１により示される暗号化アルゴリズムに従って制御部４１が分子構造データＣ１を暗号化することにより生成されたデータである。暗号化分子構造データＣ２は、ラベル付き暗号化分子構造データＣ２Ｌに含まれる暗号化分子構造データと同一のものを含んでいてもよいし、互いに異なっていてもよい。
機能データＣ３は、分子構造データＣ１（及び暗号化分子構造データＣ２）に対応する化合物の機能に係るデータである。機能データＣ３は、機密情報とはされていないものとする。 The molecular structure data C1 is data relating to the molecular structure of a compound held by a client, and is managed by the client as confidential information.
As described above, the labeled encrypted molecular structure data C2L is data in which unique labels are associated with multiple pieces of encrypted molecular structure data (see FIG. 4). The encrypted molecular structure data included in the labeled encrypted molecular structure data C2L is generated by the control unit 41 encrypting the molecular structure data C1 according to the encryption algorithm indicated by the encryption algorithm information D1.
The encrypted molecular structure data C2 is data generated by the control unit 41 encrypting the molecular structure data C1 in accordance with the encryption algorithm indicated by the encryption algorithm information D1. The encrypted molecular structure data C2 may include some of the same encrypted molecular structure data as the labeled encrypted molecular structure data C2L, or the two may be entirely different.
The function data C3 is data related to the function of the compound corresponding to the molecular structure data C1 (and the encrypted molecular structure data C2). The function data C3 is not considered to be confidential information.

操作部４２、表示部４３及び通信部４４の構成は、ＭＩサーバー１の操作部１２、表示部１３及び通信部１４の構成と同様であるので説明は省略する。 The configurations of the operation unit 42, display unit 43 and communication unit 44 are similar to the configurations of the operation unit 12, display unit 13 and communication unit 14 of the MI server 1, so explanation is omitted.

次に、化合物情報処理システム１００において化合物の機能予測を行う方法について説明する。化合物の機能予測を行う方法には、学習対象及び機能予測対象の暗号化分子構造データとしてそれぞれ何を用いるかに応じて、複数の方法がある。Next, methods for predicting the function of a compound in the compound information processing system 100 will be described. There are several such methods, depending on which data are used as the learning-target and function-prediction-target encrypted molecular structure data.

ＭＩサーバー１における学習モデルデータＤ２の生成には、以下の２つの学習対象の暗号化構造データのうち少なくとも一方が用いられる。
（ａ１）一般データ１１３ａに含まれる暗号化分子構造データＡ２。
（ａ２）クライアント由来データ１１３ｂに含まれる暗号化分子構造データＣ２。 To generate the learning model data D2 in the MI server 1, at least one of the following two sets of learning-target encrypted molecular structure data is used.
(a1) Encrypted molecular structure data A2 included in general data 113a.
(a2) Encrypted molecular structure data C2 contained in the client-derived data 113b.

また、ＭＩサーバー１において機能予測対象とされる暗号化分子構造データには、以下の３つがある。
（ｂ１）クライアント由来データ１１３ｂに含まれるラベル付き暗号化分子構造データＣ２Ｌ。
（ｂ２）一般データ１１３ａに含まれる暗号化分子構造データＢ２のうち、外部（例えば試薬ＤＢサーバー３）から取得した分子構造データＢ１を暗号化して得られた暗号化分子構造データＢ２。
（ｂ３）一般データ１１３ａに含まれる暗号化分子構造データＢ２のうち、ＭＩサーバー１の内部で生成された分子構造データＢ１を暗号化して得られた暗号化分子構造データＢ２。 The encrypted molecular structure data that is the subject of function prediction in the MI server 1 includes the following three types:
(b1) Labeled encrypted molecular structure data C2L included in the client-derived data 113b.
(b2) Encrypted molecular structure data B2 included in the general data 113a, which is obtained by encrypting molecular structure data B1 acquired from an external source (for example, the reagent DB server 3).
(b3) Encrypted molecular structure data B2 included in general data 113a, which is obtained by encrypting molecular structure data B1 generated within MI server 1.

以下では、学習対象の暗号化分子構造データ、及び機能予測対象の暗号化分子構造データの組み合わせが異なる以下の＜第１の方法＞～＜第４の方法＞を例に挙げて説明する。第１～第４の方法では、いずれも、クライアントサーバー４から外部に機密情報である分子構造データＣ１を開示（送信）することなく、ＭＩによる化合物の機能の予測結果をクライアントサーバー４で受信することができる。
＜第１の方法＞
学習対象の暗号化分子構造データ：（ａ１）
機能予測対象の暗号化分子構造データ：（ｂ１）
＜第２の方法＞
学習対象の暗号化分子構造データ：（ａ１）＋（ａ２）
機能予測対象の暗号化分子構造データ：（ｂ２）
＜第３の方法＞
学習対象の暗号化分子構造データ：（ａ１）＋（ａ２）
機能予測対象の暗号化分子構造データ：（ｂ３）
＜第４の方法＞
学習対象の暗号化分子構造データ：（ａ１）＋（ａ２）
機能予測対象の暗号化分子構造データ：（ｂ１） The following describes, as examples, the <First Method> to <Fourth Method> below, which differ in the combination of learning-target and function-prediction-target encrypted molecular structure data. In each of the first to fourth methods, the client server 4 can receive the MI-based prediction results for the functions of compounds without disclosing (transmitting) the confidential molecular structure data C1 to any external party.
<First Method>
Encrypted molecular structure data to be learned: (a1)
Encrypted molecular structure data of a function prediction target: (b1)
<Second Method>
Encrypted molecular structure data to be learned: (a1)+(a2)
Encrypted molecular structure data of a function prediction target: (b2)
<Third Method>
Encrypted molecular structure data to be learned: (a1)+(a2)
Encrypted molecular structure data of a function prediction target: (b3)
<Fourth Method>
Encrypted molecular structure data to be learned: (a1)+(a2)
Encrypted molecular structure data of a function prediction target: (b1)
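
The four combinations listed above can be summarized as a small lookup table. The following Python sketch simply restates the list; the names `METHODS` and `uses_client_learning_data` are hypothetical and are not taken from the source.

```python
# Learning-target sources:   a1 = A2 (general data 113a), a2 = C2 (client-derived data 113b).
# Prediction-target sources: b1 = C2L (client-derived), b2 = B2 from externally acquired B1,
#                            b3 = B2 from B1 generated inside the MI server 1.
METHODS = {
    1: {"learning": ("a1",), "prediction": "b1"},
    2: {"learning": ("a1", "a2"), "prediction": "b2"},
    3: {"learning": ("a1", "a2"), "prediction": "b3"},
    4: {"learning": ("a1", "a2"), "prediction": "b1"},
}


def uses_client_learning_data(method: int) -> bool:
    """Return True if the method trains on client-derived encrypted data (a2)."""
    return "a2" in METHODS[method]["learning"]
```

Laid out this way, it is easy to see that the first and fourth methods share the same prediction target (b1) and differ only in whether client-derived data contributes to learning.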

＜第１の方法＞
図６は、化合物の機能予測に係る第１の方法を説明する図である。
図６では、ＭＩサーバー１、公的ＤＢサーバー２及びクライアントサーバー４により実行される各種データ処理の流れ、及び各サーバー間におけるデータの送受信の流れを示している。以下では簡便のため、制御部１１（４１）が通信部１４（４４）を制御して通信部１４（４４）によりデータを送信させる動作を、単に「制御部１１（４１）がデータを送信する」と記す。
第１の方法では、学習対象の暗号化分子構造データとして、「（ａ１）一般データ１１３ａに含まれる暗号化分子構造データＡ２」が用いられ、機能予測対象の暗号化分子構造データとして、「（ｂ１）クライアント由来データ１１３ｂに含まれるラベル付き暗号化分子構造データＣ２Ｌ」が用いられる。 <First Method>
FIG. 6 is a diagram illustrating a first method for predicting the function of a compound.
FIG. 6 shows the flow of various data processing executed by the MI server 1, the public DB server 2, and the client server 4, and the flow of data transmission and reception between the servers. For simplicity, the operation in which the control unit 11 (41) controls the communication unit 14 (44) to transmit data via the communication unit 14 (44) will be simply referred to as "the control unit 11 (41) transmits data."
In the first method, “(a1) encrypted molecular structure data A2 included in general data 113a” is used as the encrypted molecular structure data to be learned, and “(b1) labeled encrypted molecular structure data C2L included in client-derived data 113b” is used as the encrypted molecular structure data to be predicted in function.

第１の方法では、まずＭＩサーバー１の制御部１１は、公的ＤＢサーバー２から、学習対象の分子構造データＡ１及び対応する機能データＡ３を取得する（ステップＳ１０１：第３のデータ取得ステップ）。In the first method, the control unit 11 of the MI server 1 first acquires the molecular structure data A1 to be learned and the corresponding functional data A3 from the public DB server 2 (step S101: third data acquisition step).

ＭＩサーバー１の制御部１１は、取得した分子構造データＡ１と、記憶部１１３に予め記憶されている分子構造データＡ１とを、暗号化アルゴリズム情報Ｄ１により示される暗号化アルゴリズムに従って暗号化して暗号化分子構造データＡ２を生成する（ステップＳ１０２：暗号化ステップ）。The control unit 11 of the MI server 1 encrypts the acquired molecular structure data A1 and the molecular structure data A1 pre-stored in the storage unit 113 according to the encryption algorithm indicated by the encryption algorithm information D1 to generate encrypted molecular structure data A2 (step S102: encryption step).

ＭＩサーバー１の制御部１１は、生成した暗号化分子構造データＡ２と、ステップＳ１０１で取得した機能データＡ３及び記憶部１１３に予め記憶されている機能データＡ３と、に基づいて、機械学習により学習モデルデータＤ２を生成する（ステップＳ１０３：学習モデル生成ステップ）。既に学習モデルデータＤ２が記憶部１１３に記憶されている場合には、制御部１１は、学習モデルデータＤ２を、新たに生成した内容に更新する。
なお、機械学習には、公的ＤＢサーバー２から取得した分子構造データＡ１の暗号化分子構造データＡ２、及び記憶部１１３に予め記憶されていた分子構造データＡ１の暗号化分子構造データＡ２のうち一方のみを用いてもよい。 The control unit 11 of the MI server 1 generates learning model data D2 by machine learning based on the generated encrypted molecular structure data A2, the functional data A3 acquired in step S101, and the functional data A3 previously stored in the storage unit 113 (step S103: learning model generation step). If the learning model data D2 has already been stored in the storage unit 113, the control unit 11 updates the learning model data D2 to the newly generated content.
Note that the machine learning may use only one of the following: the encrypted molecular structure data A2 derived from the molecular structure data A1 obtained from the public DB server 2, or the encrypted molecular structure data A2 derived from the molecular structure data A1 pre-stored in the storage unit 113.
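
Steps S101 to S103 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each encrypted record is a fixed-length bit vector, and it uses a nearest-neighbor lookup as a stand-in for the learning model; the text above does not specify the model type, and `tanimoto`, `fit_nearest_neighbor`, and all sample values are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Vector = Tuple[int, ...]


def tanimoto(a: Vector, b: Vector) -> float:
    """Tanimoto similarity between two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def fit_nearest_neighbor(a2: Sequence[Vector], a3: Sequence[float]) -> Callable[[Vector], float]:
    """Return a predictor that copies the function value of the most similar training vector."""
    if not a2 or len(a2) != len(a3):
        raise ValueError("a2 and a3 must be non-empty and of equal length")
    pairs = list(zip(a2, a3))

    def predict(x: Vector) -> float:
        _, value = max(pairs, key=lambda pair: tanimoto(pair[0], x))
        return value

    return predict


# Hypothetical (encrypted vector, function value) pairs: one newly acquired,
# one already held in storage. Both sources are pooled before training.
acquired = [((1, 1, 0, 0), 0.9)]
stored = [((0, 0, 1, 1), 0.2)]
training = acquired + stored
model = fit_nearest_neighbor([v for v, _ in training], [y for _, y in training])
```

The point of the sketch is that training consumes only encrypted vectors and function values, so the model can later be applied to other encrypted vectors without any decryption step.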

一方、ＭＩサーバー１の制御部１１は、クライアントサーバー４に対して暗号化アルゴリズム情報Ｄ１を送信する（ステップＳ１０４：情報提供ステップ）。Meanwhile, the control unit 11 of the MI server 1 sends encryption algorithm information D1 to the client server 4 (step S104: information provision step).

暗号化アルゴリズム情報Ｄ１を受信したクライアントサーバー４の制御部４１は、機能予測対象の分子構造データＣ１を、暗号化アルゴリズム情報Ｄ１により示される暗号化アルゴリズムに従って暗号化するとともに、ラベルを付与して、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌを生成する（ステップＳ１０５）。また、制御部４１は、ＭＩサーバー１に対してラベル付き暗号化分子構造データＣ２Ｌを送信する。これに応じて、ＭＩサーバー１の制御部１１は、ラベル付き暗号化分子構造データＣ２Ｌを受信する（ステップＳ１０６：第１のデータ取得ステップ）。The control unit 41 of the client server 4 that has received the encryption algorithm information D1 encrypts the molecular structure data C1 of the function prediction target according to the encryption algorithm indicated by the encryption algorithm information D1, and assigns a label to generate labeled encrypted molecular structure data C2L of the function prediction target (step S105). The control unit 41 also transmits the labeled encrypted molecular structure data C2L to the MI server 1. In response, the control unit 11 of the MI server 1 receives the labeled encrypted molecular structure data C2L (step S106: first data acquisition step).

ＭＩサーバー１の制御部１１は、取得したラベル付き暗号化分子構造データＣ２Ｌに含まれる各暗号化分子構造データに対して、学習モデルデータＤ２により表される学習モデルを適用することで、各暗号化分子構造データに対応する化合物の機能を予測する（ステップＳ１０７：予測ステップ）。The control unit 11 of the MI server 1 predicts the function of the compound corresponding to each encrypted molecular structure data by applying the learning model represented by the learning model data D2 to each encrypted molecular structure data contained in the acquired labeled encrypted molecular structure data C2L (step S107: prediction step).

制御部１１は、機能の予測結果と、クライアントサーバー４から受信した機能の目標値とを比較し、機能の予測結果が目標値と一致した暗号化分子構造データを特定する（ステップＳ１０８）。ここで、機能の予測結果が目標値に一致するとは、機能を呈することを表す指標の値が目標値に一致する場合のほか、当該指標が所定範囲内であること、又は当該指標が所定値以上であること、等としてもよい。The control unit 11 compares the predicted function result with the target value of the function received from the client server 4, and identifies encrypted molecular structure data for which the predicted function result matches the target value (step S108). Here, the predicted function result matching the target value may mean that the value of an index representing the function matches the target value, or that the index is within a predetermined range, or that the index is equal to or greater than a predetermined value, etc.

制御部１１は、ステップＳ１０８で特定された暗号化分子構造データに対応付けられているラベルを、クライアントサーバー４に送信する（ステップＳ１０９）。これに応じて、クライアントサーバー４の制御部４１は、ラベル付き暗号化分子構造データＣ２Ｌにおいて、受信したラベルに対応する暗号化分子構造データを特定し、当該暗号化分子構造データに対応する化合物を、所望の機能を呈する化合物として特定する。The control unit 11 transmits the label associated with the encrypted molecular structure data identified in step S108 to the client server 4 (step S109). In response, the control unit 41 of the client server 4 identifies the encrypted molecular structure data corresponding to the received label in the labeled encrypted molecular structure data C2L, and identifies the compound corresponding to the encrypted molecular structure data as a compound exhibiting a desired function.
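
Steps S107 to S109 can be sketched in Python as follows. The sketch assumes that the trained model is available as a plain callable and that the target criterion is supplied as a predicate, so that exact match, range, and threshold criteria can all be expressed; `select_labels` and the sample values are hypothetical.

```python
from typing import Callable, Dict, List


def select_labels(
    c2l: Dict[int, str],
    predict: Callable[[str], float],
    meets_target: Callable[[float], bool],
) -> List[int]:
    """Predict each labeled record (S107), keep those meeting the target (S108),
    and return only their labels (S109)."""
    return sorted(label for label, enc in c2l.items() if meets_target(predict(enc)))


# Hypothetical labeled encrypted records and a lookup table standing in for the model.
c2l = {1: "9f3a", 2: "07bc", 3: "e41d"}
fake_predictions = {"9f3a": 0.42, "07bc": 0.81, "e41d": 0.77}

# Threshold criterion: predicted index value at or above 0.75.
labels = select_labels(c2l, fake_predictions.__getitem__, lambda v: v >= 0.75)
```

Only the labels are returned, so the receiving side can map them back to compounds using its own label table.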

以上のように、本実施形態に係るＭＩサーバー１は、制御部１１を備え、当該制御部１１は、上述の第１の方法においては、クライアントサーバー４に対して、所定の暗号化アルゴリズムに従った暗号化を実行するための暗号化アルゴリズム情報Ｄ１を提供し（情報提供部）、クライアントサーバー４から、上記暗号化アルゴリズムに従って暗号化された機能予測対象のラベル付き暗号化分子構造データＣ２Ｌを取得し（第１のデータ取得部）、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌに対応する化合物の機能を予測モデルとしての学習モデルに基づいて予測し（予測部）、予測モデルは、化合物の構造に係る構造データを上記暗号化アルゴリズムに従って暗号化して得られた暗号化構造データと、上記化合物の機能に係る機能データとの相関関係を表す。
このように、暗号化後のラベル付き暗号化分子構造データＣ２Ｌを受信して機能予測を行うことで、クライアントサーバー４から、機密情報である分子構造データＣ１を受信せずに必要な処理を行うことができる。よって、ＭＩサーバー１の内部には、クライアントの機密情報が記憶されないため、当該機密情報の安全性を高めることができる。
また、暗号化分子構造データと機能データとの相関関係を表す予測モデル（ここでは、学習モデル）を用いているため、機能予測のためには、機能予測対象の暗号化分子構造データがあれば足り、暗号化分子構造データを復号して分子構造データを生成する必要がない。よって、簡易な処理で化合物の機能予測を行うことができる。 As described above, the MI server 1 according to this embodiment includes a control unit 11. In the first method described above, the control unit 11 provides the client server 4 with encryption algorithm information D1 for performing encryption according to a predetermined encryption algorithm (information provision unit), acquires from the client server 4 the labeled encrypted molecular structure data C2L of the function prediction target encrypted according to that encryption algorithm (first data acquisition unit), and predicts the function of the compound corresponding to the labeled encrypted molecular structure data C2L of the function prediction target based on a learning model serving as a prediction model (prediction unit). The prediction model represents the correlation between encrypted structure data, obtained by encrypting structure data relating to the structure of a compound according to that encryption algorithm, and function data relating to the function of the compound.
In this way, by receiving the labeled encrypted molecular structure data C2L after encryption and performing function prediction on it, the necessary processing can be performed without receiving the molecular structure data C1, which is confidential information, from the client server 4. Because the client's confidential information is therefore never stored inside the MI server 1, the security of that confidential information can be improved.
In addition, since a prediction model (here, a learning model) that expresses the correlation between the encrypted molecular structure data and the function data is used, it is sufficient to have the encrypted molecular structure data of the target function prediction in order to predict the function, and there is no need to decrypt the encrypted molecular structure data to generate molecular structure data. Therefore, the function of a compound can be predicted by a simple process.

また、制御部１１は、化合物の構造に係る分子構造データＡ１を所定の暗号化アルゴリズムに従って暗号化して暗号化分子構造データＡ２を生成し（暗号化部）、暗号化分子構造データＡ２、及び化合物の機能に係る機能データＡ３に基づいて、予測モデルとしての学習モデルを生成する（学習モデル生成部）。これによれば、ＭＩサーバー１において学習モデルを生成することができる。また、暗号化分子構造データを用いて学習モデルを生成しているため、学習モデルを用いた機能予測のためには、機能予測対象の暗号化分子構造データがあれば足り、暗号化分子構造データを復号して分子構造データを生成する必要がない。よって、簡易な処理で化合物の機能予測を行うことができる。 The control unit 11 also encrypts molecular structure data A1 relating to the structure of the compound according to a predetermined encryption algorithm to generate encrypted molecular structure data A2 (encryption unit), and generates a learning model as a predictive model based on the encrypted molecular structure data A2 and function data A3 relating to the function of the compound (learning model generation unit). This allows the learning model to be generated in the MI server 1. Furthermore, since the learning model is generated using the encrypted molecular structure data, in order to predict function using the learning model, it is sufficient to have encrypted molecular structure data of the function prediction target, and there is no need to decrypt the encrypted molecular structure data to generate molecular structure data. Therefore, compound function prediction can be performed with simple processing.

また、制御部１１は、公的ＤＢサーバー２から分子構造データＡ１及び機能データＡ３を取得し（第３のデータ取得部）、取得した分子構造データＡ１に基づいて暗号化分子構造データＡ２を生成し（暗号化部）、当該暗号化分子構造データＡ２、及び公的ＤＢサーバー２から取得した機能データＡ３を少なくとも用いて学習モデルデータＤ２を生成する（学習モデル生成部）。これにより、公的ＤＢサーバー２が開示している多数の化合物の情報を用いて学習モデルを生成することができる。よって、学習モデルによる化合物の機能の予測精度を高めることができる。 Furthermore, the control unit 11 acquires molecular structure data A1 and functional data A3 from the public DB server 2 (third data acquisition unit), generates encrypted molecular structure data A2 based on the acquired molecular structure data A1 (encryption unit), and generates learning model data D2 using at least the encrypted molecular structure data A2 and the functional data A3 acquired from the public DB server 2 (learning model generation unit). This makes it possible to generate a learning model using information on a large number of compounds disclosed by the public DB server 2. This makes it possible to improve the accuracy of prediction of the functions of compounds by the learning model.

また、暗号化アルゴリズムは、暗号化前の構造データへの逆変換が不可能である。これによれば、ＭＩサーバー１において、クライアントサーバー４から受信したラベル付き暗号化分子構造データＣ２Ｌを復号して分子構造データＣ１を特定することができない。よって、クライアントは、ＭＩサーバー１の管理者を含む任意の部外者に対して機密情報である分子構造データＣ１を開示することなく、ＭＩによる情報提供サービスを受けることができる。 Furthermore, the encryption algorithm does not allow reverse conversion back to the structure data before encryption. This means that the MI server 1 cannot decrypt the labeled encrypted molecular structure data C2L received from the client server 4 to identify the molecular structure data C1. Therefore, the client can receive information provision services from MI without disclosing the molecular structure data C1, which is confidential information, to any outsider, including the administrator of the MI server 1.
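
To make the idea of a transformation with no inverse concrete, the following Python sketch hashes fragments of a structure string into a fixed-length bit vector. It is an assumed stand-in only: the text above does not specify the algorithm, the use of a SMILES-like string, character n-grams, and SHA-256 are illustrative choices, and the sketch makes no claim about resistance to brute-force guessing of short inputs.

```python
import hashlib
from typing import Tuple


def one_way_encode(structure: str, n_bits: int = 64, ngram: int = 3) -> Tuple[int, ...]:
    """Hash every character n-gram of a structure string into a fixed-length bit vector.

    Many different inputs can map to the same vector, so the original string
    cannot be uniquely reconstructed from the output alone.
    """
    bits = [0] * n_bits
    for i in range(max(1, len(structure) - ngram + 1)):
        fragment = structure[i:i + ngram]
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return tuple(bits)


# The encoding is deterministic, so two parties sharing the same parameters
# obtain identical vectors for the same (hypothetical) structure string.
server_side = one_way_encode("c1ccccc1O")
client_side = one_way_encode("c1ccccc1O")
```

Determinism is what lets a model trained on one party's encoded data be applied to another party's encoded data, while the many-to-one mapping is what prevents a direct reverse conversion.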

また、第１の方法では、プログラム１１３ｃは、ＭＩサーバー１に設けられたコンピューターとしての制御部１１を、クライアントサーバー４に対して、所定の暗号化アルゴリズムに従った暗号化を実行するための暗号化アルゴリズム情報Ｄ１を提供する情報提供手段、クライアントサーバー４から、上記暗号化アルゴリズムに従って暗号化された機能予測対象のラベル付き暗号化分子構造データＣ２Ｌを取得する第１のデータ取得手段（データ取得手段）、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌに対応する化合物の機能を予測モデルとしての学習モデルに基づいて予測する予測手段、として機能させ、予測モデルは、化合物の構造に係る構造データを上記暗号化アルゴリズムに従って暗号化して得られた暗号化構造データと、上記化合物の機能に係る機能データとの相関関係を表す。
このようなプログラムによりＭＩサーバー１を動作させることで、クライアントサーバー４から、機密情報である分子構造データＣ１を受信せずに必要な処理を行うことができる。よって、ＭＩサーバー１の内部には、クライアントの機密情報が記憶されないため、当該機密情報の安全性を高めることができる。また、学習モデルを用いた機能予測のためには、機能予測対象の暗号化分子構造データがあれば足り、暗号化分子構造データを復号して分子構造データを生成する必要がないため、簡易な処理で化合物の機能予測を行うことができる。 In addition, in the first method, the program 113c causes the control unit 11, which is a computer provided in the MI server 1, to function as an information providing means for providing the client server 4 with encryption algorithm information D1 for performing encryption according to a predetermined encryption algorithm, a first data acquiring means (data acquiring means) for acquiring, from the client server 4, labeled encrypted molecular structure data C2L of a function prediction target encrypted according to the above encryption algorithm, and a prediction means for predicting the function of a compound corresponding to the labeled encrypted molecular structure data C2L of a function prediction target based on a learning model as a prediction model, and the prediction model represents the correlation between the encrypted structural data obtained by encrypting structural data relating to the structure of the compound according to the above encryption algorithm and functional data relating to the function of the above compound.
By operating the MI server 1 with such a program, necessary processing can be performed without receiving the molecular structure data C1, which is confidential information, from the client server 4. Therefore, confidential information of the client is not stored inside the MI server 1, so the security of the confidential information can be improved. Furthermore, for function prediction using a learning model, it is sufficient to have encrypted molecular structure data of the function prediction target, and there is no need to decrypt the encrypted molecular structure data to generate molecular structure data, so that function prediction of a compound can be performed with simple processing.

また、情報処理方法としての第１の方法は、クライアントサーバー４に対して、所定の暗号化アルゴリズムに従った暗号化を実行するための暗号化アルゴリズム情報Ｄ１を提供する情報提供ステップと、クライアントサーバー４から、上記暗号化アルゴリズムに従って暗号化された機能予測対象のラベル付き暗号化分子構造データＣ２Ｌを取得する第１のデータ取得ステップ（データ取得ステップ）と、機能予測対象のラベル付き暗号化分子構造データＣ２Ｌに対応する化合物の機能を予測モデルとしての学習モデルに基づいて予測する予測ステップと、を含み、予測モデルは、化合物の構造に係る構造データを上記暗号化アルゴリズムに従って暗号化して得られた暗号化構造データと、上記化合物の機能に係る機能データとの相関関係を表す。
このような方法によれば、クライアントサーバー４から、機密情報である分子構造データＣ１を受信せずに必要な処理を行うことができる。よって、ＭＩサーバー１の内部には、クライアントの機密情報が記憶されないため、当該機密情報の安全性を高めることができる。また、学習モデルを用いた機能予測のためには、機能予測対象の暗号化分子構造データがあれば足り、暗号化分子構造データを復号して分子構造データを生成する必要がないため、簡易な処理で化合物の機能予測を行うことができる。
また、このような方法の一部は人の手により行うこともでき、これによれば、装置を作りこまなくても多様な化合物の機能予測に効率よく対応することができる。 Further, the first method as an information processing method includes an information providing step of providing a client server 4 with encryption algorithm information D1 for performing encryption according to a predetermined encryption algorithm, a first data acquisition step (data acquisition step) of acquiring from the client server 4 labeled encrypted molecular structure data C2L of a function prediction target encrypted according to the above encryption algorithm, and a prediction step of predicting the function of a compound corresponding to the labeled encrypted molecular structure data C2L of the function prediction target based on a learning model as a prediction model, wherein the prediction model represents a correlation between the encrypted structural data obtained by encrypting structural data relating to the structure of the compound according to the above encryption algorithm and functional data relating to the function of the above compound.
According to this method, necessary processing can be performed without receiving the molecular structure data C1, which is confidential information, from the client server 4. Therefore, confidential information of the client is not stored inside the MI server 1, so the security of the confidential information can be improved. Furthermore, for function prediction using a learning model, it is sufficient to have encrypted molecular structure data of the function prediction target, and there is no need to decrypt the encrypted molecular structure data to generate molecular structure data, so that function prediction of a compound can be performed with simple processing.
Furthermore, part of this method can also be performed manually, which makes it possible to handle function prediction for a variety of compounds efficiently without building a dedicated device.

<Second Method>
FIG. 7 is a diagram illustrating a second method for predicting the function of a compound.
FIG. 7 shows the flow of the various data processes performed by the MI server 1, the public DB server 2, the reagent DB server 3, and the client server 4, and the flow of data transmission and reception between the servers.
In the second method, "(a1) encrypted molecular structure data A2 included in the general data 113a" and "(a2) encrypted molecular structure data C2 included in the client-derived data 113b" are used as the encrypted molecular structure data of the learning target, and "(b2) of the encrypted molecular structure data B2 included in the general data 113a, the encrypted molecular structure data B2 obtained by encrypting molecular structure data B1 acquired from outside (e.g., from the reagent DB server 3)" is used as the encrypted molecular structure data of the function prediction target.

In the second method, the control unit 11 of the MI server 1 first acquires the molecular structure data A1 of the learning target and the corresponding function data A3 from the public DB server 2 (step S201: third data acquisition step).

The control unit 11 of the MI server 1 encrypts both the acquired molecular structure data A1 and the molecular structure data A1 stored in advance in the storage unit 113 according to the encryption algorithm indicated by the encryption algorithm information D1, thereby generating encrypted molecular structure data A2 (step S202: encryption step).

The control unit 11 of the MI server 1 transmits the encryption algorithm information D1 to the client server 4 (step S203: information providing step).

On receiving the encryption algorithm information D1, the control unit 41 of the client server 4 encrypts the molecular structure data C1 of the learning target according to the encryption algorithm indicated by the encryption algorithm information D1 to generate encrypted molecular structure data C2 (step S204). The control unit 41 then transmits the encrypted molecular structure data C2 of the learning target and the corresponding function data C3 of the learning target to the MI server 1. In response, the control unit 11 of the MI server 1 receives the encrypted molecular structure data C2 and the function data C3 of the learning target (step S205: first data acquisition step).
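The client-side part of steps S204 and S205 can be sketched as follows. This is only an illustration under stated assumptions: the structure is represented as a list of fragment strings, the encryption algorithm information D1 is modeled as a shared key, and the irreversible encryption is modeled as a keyed hash that sets bits in a fixed-length vector. The names `encrypt_structure`, `build_upload`, and `d1_key` are hypothetical and do not come from the specification.

```python
import hashlib
import hmac


def encrypt_structure(fragments, key, n_bits=64):
    """Irreversibly map fragment strings to a fixed-length bit vector.

    Assumption: each fragment is hashed with HMAC-SHA256 and the digest
    selects one bit position, so similar structures share set bits.
    """
    bits = [0] * n_bits
    for fragment in fragments:
        digest = hmac.new(key, fragment.encode("utf-8"), hashlib.sha256).digest()
        index = int.from_bytes(digest[:8], "big") % n_bits
        bits[index] = 1
    return bits


def build_upload(records, key):
    """Build the payload for the MI server: C2 and C3 only, never C1."""
    return [
        {"c2": encrypt_structure(fragments, key), "c3": value}
        for fragments, value in records
    ]


d1_key = b"key-shared-through-D1"  # hypothetical content of D1
c1_records = [  # hypothetical confidential structures with measured function values
    (["c1ccccc1", "C(=O)O"], 0.82),
    (["c1ccccc1", "N"], 0.35),
]
upload = build_upload(c1_records, d1_key)
```

In this sketch the plaintext fragments stay on the client; only the bit vectors and function values appear in `upload`.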

The control unit 11 of the MI server 1 generates learning model data D2 by machine learning based on the encrypted molecular structure data A2 generated in step S202, the function data A3 acquired from the public DB server 2 in step S201, the function data A3 stored in advance in the storage unit 113, and the encrypted molecular structure data C2 and function data C3 acquired from the client server 4 in step S205 (step S206: learning model generation step). If learning model data D2 is already stored in the storage unit 113, the control unit 11 updates it with the newly generated content.
Note that the machine learning may use only part of the following: the encrypted molecular structure data A2 derived from the molecular structure data A1 acquired from the public DB server 2, the encrypted molecular structure data A2 derived from the molecular structure data A1 stored in advance in the storage unit 113, and the encrypted molecular structure data C2 acquired from the client server 4.
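A minimal sketch of step S206 is shown below. The specification does not name a learning algorithm here, so a similarity-weighted nearest-neighbor predictor over encrypted bit vectors is swapped in purely for illustration; `tanimoto`, `generate_model`, `predict`, and the `storage` dictionary are hypothetical names, and the vectors are toy data.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return shared / union if union else 0.0


def generate_model(training_pairs):
    """Build stand-in learning model data D2 from (encrypted vector, function value) pairs."""
    return {"training_pairs": [(list(v), float(y)) for v, y in training_pairs]}


def predict(model, encrypted_vector, k=1):
    """Predict a function value as the similarity-weighted mean of the k nearest pairs."""
    scored = [(tanimoto(v, encrypted_vector), y) for v, y in model["training_pairs"]]
    scored.sort(key=lambda item: item[0], reverse=True)
    nearest = scored[:k]
    total = sum(score for score, _ in nearest)
    if total == 0:
        return sum(y for _, y in nearest) / len(nearest)
    return sum(score * y for score, y in nearest) / total


# Toy encrypted data: A2 with A3 from the public side, C2 with C3 from the client.
a2_pairs = [([1, 1, 0, 0], 1.0), ([0, 0, 1, 1], 0.0)]
c2_pairs = [([1, 1, 1, 0], 0.8)]

storage = {}
# Any subset of the pooled pairs could be passed here instead of all of them.
storage["D2"] = generate_model(a2_pairs + c2_pairs)  # overwrites an existing D2
```

Training and prediction both operate on the encrypted vectors directly; nothing in this sketch needs the plaintext structures.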

Meanwhile, the control unit 11 of the MI server 1 acquires molecular structure data B1 of the function prediction target from the reagent DB server 3 (step S207: second data acquisition step), and encrypts it according to the encryption algorithm indicated by the encryption algorithm information D1 to generate encrypted molecular structure data B2 (step S208).

The control unit 11 of the MI server 1 applies the learning model represented by the learning model data D2 to the encrypted molecular structure data B2 generated in step S208, thereby predicting the function of the compound corresponding to each item of encrypted molecular structure data (step S209: prediction step).

The control unit 11 compares the function prediction results with the target value of the function received from the client server 4, and identifies the encrypted molecular structure data B2 whose function prediction result matches the target value (step S210).

The control unit 11 transmits the molecular structure data B1 corresponding to the encrypted molecular structure data B2 identified in step S210 to the client server 4 (step S211). In response, the control unit 41 of the client server 4 identifies the compound represented by the received molecular structure data B1 as a compound exhibiting the desired function.
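Steps S210 and S211 can be sketched as a filter followed by a lookup. The text only says that the prediction result "matches" the target value; the numeric tolerance used below is an assumption, as are the candidate IDs, the SMILES-like strings, and the names `select_matches` and `b1_by_id`. Because the MI server itself encrypted B1 into B2, it can keep a table from each candidate back to its plaintext B1.

```python
def select_matches(predictions, target, tolerance):
    """Return the IDs of candidates whose predicted value lies within tolerance of target."""
    return [
        candidate_id
        for candidate_id, value in predictions.items()
        if abs(value - target) <= tolerance
    ]


# Server-side table kept when B1 was encrypted into B2 (hypothetical entries).
b1_by_id = {"r1": "CCO", "r2": "c1ccccc1O", "r3": "CC(=O)O"}

# Predicted function values for the encrypted candidates B2 (toy numbers).
predictions = {"r1": 0.31, "r2": 0.79, "r3": 0.84}

matched_ids = select_matches(predictions, target=0.8, tolerance=0.05)
response = [b1_by_id[candidate_id] for candidate_id in matched_ids]
print(response)  # ['c1ccccc1O', 'CC(=O)O']
```

Returning B1 in plaintext is unproblematic in this method because B1 comes from a public reagent database rather than from the client.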

As described above, in the second method, the control unit 11 of the MI server 1 encrypts structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data (encryption unit); generates, based on the encrypted structure data and function data relating to the function of the compound, learning model data D2 representing the correlation between the encrypted structure data and the function data (learning model generation unit); provides the client server 4 with encryption algorithm information D1 for performing encryption according to the encryption algorithm (information providing unit); acquires, from the client server 4, encrypted molecular structure data C2 of the learning target encrypted according to the encryption algorithm, together with function data C3 of the learning target relating to the function of the compound corresponding to that encrypted molecular structure data C2 (first data acquisition unit); and generates the learning model data D2 using at least the acquired encrypted molecular structure data C2 and function data C3 of the learning target (learning model generation unit).
Because the MI server 1 receives the already encrypted molecular structure data C2 and performs function prediction with it, the necessary processing can be carried out without receiving the molecular structure data C1, which is confidential information, from the client server 4. The client's confidential information is therefore not stored inside the MI server 1, which improves the security of that information.
In addition, improving the security of confidential information on compound structures in this way makes it easier to collect information on undisclosed compound structures (encrypted molecular structure data and function data of the learning target), so the prediction accuracy of the learning model can be raised by using information on a larger number of compounds.
Furthermore, a compound exhibiting the function desired by a client is often found among compounds whose structures resemble those of existing compounds that the client manages and owns. Generating the learning model from the encrypted molecular structure data C2 and function data C3 received from the client server 4, as in the second method, therefore yields a learning model that can predict with higher accuracy whether a compound exhibits the function desired by the client.
In addition, since the learning model is generated from encrypted molecular structure data, function prediction with the learning model requires only the encrypted molecular structure data of the function prediction target, and there is no need to decrypt the encrypted molecular structure data to recover molecular structure data. The function of a compound can therefore be predicted with simple processing.
This means:

The control unit 11 also acquires molecular structure data B1 relating to the structure of a compound from the reagent DB server 3, which publishes the structure of that compound (second data acquisition unit), encrypts the acquired molecular structure data B1 to generate encrypted molecular structure data B2 of the function prediction target (encryption unit), and predicts the function of the compound corresponding to the generated encrypted molecular structure data B2 based on the learning model data D2 (prediction unit). This makes it possible to identify a compound exhibiting the function desired by the client from among the many compounds published by the reagent DB server 3.

In addition, in the second method, the program 113c causes the control unit 11, as a computer provided in the MI server 1, to function as: encryption means for encrypting structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data; learning model generation means for generating, based on the encrypted structure data and function data relating to the function of the compound, learning model data D2 representing the correlation between the encrypted structure data and the function data; information providing means for providing the client server 4 with encryption algorithm information D1 for performing encryption according to the encryption algorithm; and first data acquisition means (data acquisition means) for acquiring, from the client server 4, encrypted molecular structure data C2 of the learning target encrypted according to the encryption algorithm and function data C3 of the learning target relating to the function of the compound corresponding to that encrypted molecular structure data C2. The learning model generation means generates the learning model data D2 using at least the encrypted molecular structure data C2 and function data C3 of the learning target acquired by the first data acquisition means.
By operating the MI server 1 with such a program, the necessary processing can be performed without receiving the molecular structure data C1, which is confidential information, from the client server 4. The client's confidential information is therefore not stored inside the MI server 1, which improves the security of that information. Improved security in turn makes it easier to collect information on undisclosed compound structures, so the prediction accuracy of the learning model can be raised by using information on a larger number of compounds. Generating the learning model from the encrypted molecular structure data C2 and function data C3 received from the client server 4 also yields a learning model that can predict with higher accuracy whether a compound exhibits the function desired by the client. Finally, function prediction with the learning model requires only the encrypted molecular structure data of the function prediction target, with no need to decrypt it to recover molecular structure data, so the function of a compound can be predicted with simple processing.

In addition, the second method as an information processing method includes: an encryption step of encrypting structure data relating to the structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data; a learning model generation step of generating, based on the encrypted structure data and function data relating to the function of the compound, learning model data D2 representing the correlation between the encrypted structure data and the function data; an information providing step of providing the client server 4 with encryption algorithm information D1 for performing encryption according to the encryption algorithm; and a first data acquisition step (data acquisition step) of acquiring, from the client server 4, encrypted molecular structure data C2 of the learning target encrypted according to the encryption algorithm and function data C3 of the learning target relating to the function of the compound corresponding to that encrypted molecular structure data C2. In the learning model generation step, the learning model data D2 is generated using at least the encrypted molecular structure data C2 and function data C3 of the learning target acquired in the first data acquisition step.
According to this method, the necessary processing can be performed without receiving the molecular structure data C1, which is confidential information, from the client server 4. The client's confidential information is therefore not stored inside the MI server 1, which improves the security of that information. Improved security in turn makes it easier to collect information on undisclosed compound structures, so the prediction accuracy of the learning model can be raised by using information on a larger number of compounds. Generating the learning model from the encrypted molecular structure data C2 and function data C3 received from the client server 4 also yields a learning model that can predict with higher accuracy whether a compound exhibits the function desired by the client. Finally, function prediction with the learning model requires only the encrypted molecular structure data of the function prediction target, with no need to decrypt it to recover molecular structure data, so the function of a compound can be predicted with simple processing.
Furthermore, part of this method can also be performed manually, which makes it possible to handle function prediction for a variety of compounds efficiently without building a dedicated device.

<Third Method>
FIG. 8 is a diagram illustrating a third method for predicting the function of a compound.
In the third method, "(a1) encrypted molecular structure data A2 included in the general data 113a" and "(a2) encrypted molecular structure data C2 included in the client-derived data 113b" are used as the encrypted molecular structure data of the learning target, and "(b3) of the encrypted molecular structure data B2 included in the general data 113a, the encrypted molecular structure data B2 obtained by encrypting molecular structure data B1 generated inside the MI server 1" is used as the encrypted molecular structure data of the function prediction target.

Steps S301 to S306 in the third method are the same as steps S201 to S206 in the second method, so their description is omitted.

The control unit 11 of the MI server 1 mechanically and randomly generates a plurality of items of molecular structure data B1 using a genetic algorithm or the like (step S307), and encrypts them according to the encryption algorithm indicated by the encryption algorithm information D1 to generate a plurality of items of encrypted molecular structure data B2 (step S308). In step S309 (prediction step), function prediction is performed using the encrypted molecular structure data B2 generated in step S308 as the encrypted molecular structure data of the function prediction target.
Steps S310 and S311 are the same as steps S210 and S211 in the second method, so their description is omitted.
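Step S307 can be illustrated with the crossover and mutation operators of a genetic algorithm. The sketch below is an assumption-laden toy: structures are fixed-length lists of fragment strings, the fragment vocabulary is invented, no fitness-based selection or chemical validity check is performed, and `mutate`, `crossover`, and `generate_candidates` are hypothetical names.

```python
import random

FRAGMENTS = ["C", "N", "O", "c1ccccc1", "C(=O)O", "Cl"]  # hypothetical vocabulary


def mutate(parent, rng):
    """Replace one randomly chosen fragment with a random fragment from the vocabulary."""
    child = list(parent)
    position = rng.randrange(len(child))
    child[position] = rng.choice(FRAGMENTS)
    return child


def crossover(left, right, rng):
    """Join the head of one parent to the tail of another at a random cut point."""
    cut = rng.randrange(1, len(left))
    return left[:cut] + right[cut:]


def generate_candidates(seeds, count, rng):
    """Grow a population of candidate structures B1 from seed structures."""
    population = [list(seed) for seed in seeds]
    while len(population) < count:
        left, right = rng.sample(population, 2)
        population.append(mutate(crossover(left, right, rng), rng))
    return population[:count]


rng = random.Random(0)  # fixed seed so the run is repeatable
seeds = [["C", "N", "O"], ["c1ccccc1", "C", "Cl"]]
b1_candidates = generate_candidates(seeds, count=8, rng=rng)
```

Each generated candidate would then be encrypted in step S308 in the same way as any other molecular structure data B1 before being passed to the learning model.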

As described above, in the third method, the control unit 11 of the MI server 1 generates molecular structure data B1 (structure data generation unit), encrypts the generated molecular structure data B1 to generate encrypted molecular structure data B2 of the function prediction target (encryption unit), and predicts the function of the compound corresponding to the generated encrypted molecular structure data B2 based on the learning model data D2 (prediction unit). This increases the likelihood of identifying a compound that exhibits the function desired by the client, even when sufficient data on function prediction target compounds cannot be obtained from outside the MI server 1.

<Fourth Method>
FIG. 9 is a diagram illustrating a fourth method for predicting the function of a compound.
In the fourth method, "(a1) encrypted molecular structure data A2 included in the general data 113a" and "(a2) encrypted molecular structure data C2 included in the client-derived data 113b" are used as the encrypted molecular structure data of the learning target, and "(b1) labeled encrypted molecular structure data C2L included in the client-derived data 113b" is used as the encrypted molecular structure data of the function prediction target.

Steps S401 to S403 and S406 of the fourth method are the same as steps S301 to S303 and S306 of the third method, so their description is omitted.

On receiving the encryption algorithm information D1 in step S403, the control unit 41 of the client server 4 encrypts the molecular structure data C1 according to the encryption algorithm indicated by the encryption algorithm information D1 to generate the encrypted molecular structure data C2 of the learning target and the labeled encrypted molecular structure data C2L of the function prediction target (step S404). The control unit 41 transmits the encrypted molecular structure data C2 of the learning target and the corresponding function data C3 of the learning target to the MI server 1, and the control unit 11 of the MI server 1 receives them (step S405: first data acquisition step). The control unit 41 also transmits the labeled encrypted molecular structure data C2L of the function prediction target to the MI server 1, and the control unit 11 of the MI server 1 receives it (step S407: first data acquisition step).

The control unit 11 of the MI server 1 applies the learning model represented by the learning model data D2 to each item of encrypted molecular structure data included in the labeled encrypted molecular structure data C2L acquired in step S407, thereby predicting the function of the compound corresponding to each item (step S408: prediction step).
The subsequent steps S409 and S410 are the same as steps S108 and S109 in the first method, so their description is omitted.
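The role of the labels in C2L can be sketched as follows. Steps S108 and S109 are not reproduced in this passage, so the flow below is an assumption: each encrypted record carries a client-chosen label, the MI server returns only the labels whose predictions match the target, and the client resolves those labels to its own plaintext structures. The names `predict_labeled`, `matching_labels`, `resolve_on_client`, and `toy_model` are hypothetical, and `toy_model` merely stands in for the learning model.

```python
def predict_labeled(c2l, predict_fn):
    """MI server side: predict a function value for every labeled encrypted record."""
    return {record["label"]: predict_fn(record["c2"]) for record in c2l}


def matching_labels(predictions, target, tolerance):
    """MI server side: keep the labels whose prediction is within tolerance of the target."""
    return [label for label, value in predictions.items() if abs(value - target) <= tolerance]


def resolve_on_client(labels, label_to_c1):
    """Client side: map returned labels back to the confidential structures C1."""
    return [label_to_c1[label] for label in labels]


def toy_model(bits):
    """Stand-in for the learning model: fraction of set bits."""
    return sum(bits) / len(bits)


c2l = [
    {"label": "L001", "c2": [1, 0, 1, 1]},
    {"label": "L002", "c2": [0, 0, 1, 0]},
]
label_to_c1 = {"L001": "c1ccccc1O", "L002": "CCO"}  # never leaves the client

predictions = predict_labeled(c2l, toy_model)
labels = matching_labels(predictions, target=0.75, tolerance=0.1)
compounds = resolve_on_client(labels, label_to_c1)
```

Under this assumption the MI server never needs to decrypt anything: it works with labels and encrypted vectors, and only the client can tell which compound a label refers to.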

As described above, in the fourth method, the control unit 11 of the MI server 1 acquires the encrypted molecular structure data C2 and the function data C3 of the learning target from the client server 4 (first data acquisition unit), generates learning model data D2 using at least the acquired encrypted molecular structure data C2 and function data C3 (learning model generation unit), acquires from the client server 4 the labeled encrypted molecular structure data C2L of the function prediction target encrypted according to the encryption algorithm (first data acquisition unit), and predicts the function of the compound corresponding to the acquired labeled encrypted molecular structure data C2L based on the learning model data D2 (prediction unit). This makes it possible to generate a learning model that can predict with higher accuracy whether a compound exhibits the function desired by the client, and to identify a compound exhibiting that function from among the encrypted molecular structure data C2 provided by the client.

The present invention is not limited to the above embodiment and modifications, and various changes are possible.
For example, the above embodiment illustrated, as the encrypted molecular structure data of the function prediction target, the encrypted molecular structure data C2 acquired from the client server 4 (first method), the encrypted molecular structure data B2 derived from the molecular structure data B1 acquired from the reagent DB server 3 (second method), and the encrypted molecular structure data B2 derived from the molecular structure data B1 generated inside the MI server 1 (third method), but this is not intended to be limiting. Any encrypted molecular structure data obtained by encrypting the molecular structure data of a compound can be used as the encrypted molecular structure data of the function prediction target, and the route by which it is acquired is not limited to those illustrated in this embodiment.
As one example, in the second or third method, the encrypted molecular structure data C2 of the function prediction target may be acquired from the client server 4.

The above embodiment was described using an irreversible encryption algorithm as an example, but the invention is not limited to this, and a reversible encryption algorithm may be used. In that case too, even if the encrypted molecular structure data C2 in the MI server 1 is accessed illegally from outside, the third party who accessed it cannot identify the encryption algorithm and therefore cannot decrypt the encrypted molecular structure data C2 to obtain the molecular structure data C1. Accordingly, the security of the client's confidential information (molecular structure data C1) is improved even when a reversible encryption algorithm is used.
Furthermore, the encryption algorithm is not limited to one that uses a hash function.
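As a toy illustration of a reversible encryption algorithm that still allows learning on the encrypted data, the sketch below permutes the positions of a descriptor vector with a key-derived permutation. This is an assumption made for illustration, not the algorithm of the embodiment, and the names `make_permutation`, `encrypt`, `decrypt`, and `hamming` are hypothetical. Python's `random` module is not cryptographically secure, so this shows only the principle.

```python
import random


def make_permutation(key, length):
    """Derive a position permutation from a key (stand-in for algorithm information D1)."""
    order = list(range(length))
    random.Random(key).shuffle(order)
    return order


def encrypt(vector, order):
    """Reorder the vector so that output position i holds input position order[i]."""
    return [vector[i] for i in order]


def decrypt(encrypted, order):
    """Invert the permutation; possible only for a party that knows the key."""
    original = [0] * len(order)
    for position, source in enumerate(order):
        original[source] = encrypted[position]
    return original


def hamming(a, b):
    """Number of positions at which two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


order = make_permutation("d1-secret", 6)
v = [1, 0, 0, 1, 1, 0]
w = [1, 1, 0, 0, 1, 0]
enc_v, enc_w = encrypt(v, order), encrypt(w, order)
```

Because every vector is permuted in the same way, distances between encrypted vectors equal the distances between the originals, so a model can be trained and applied without ever calling `decrypt`.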

In the above embodiment, the learning model is generated in the MI server 1, but the invention is not limited to this, and an existing learning model (e.g., a learning model generated in an external device) may be used as is. In that case, the MI server 1 need not have a learning model generation function (learning model generation unit). In this aspect, when the encrypted molecular structure data of the function prediction target is acquired from outside the MI server 1, the MI server 1 also need not have a function for encrypting molecular structure data (encryption unit).

The above embodiment was described using a learning model generated by machine learning with an inductive approach, but the prediction model used to predict the function of a compound is not limited to such a learning model. For example, a deductive prediction model that predicts the function of a compound from known principles and regularities about compounds may be used as the prediction model. When a deductive prediction model is used, the MI server 1 likewise need not have a learning model generation function (learning model generation unit). Furthermore, when the encrypted molecular structure data of the function prediction target is acquired from outside the MI server 1, the MI server 1 need not have a function for encrypting molecular structure data (encryption unit).

The above embodiment was described using an example in which the MI server 1, the public DB server 2, the reagent DB server 3, and the client server 4 each consist of a single server device, but the invention is not limited to this, and any of these server devices may be replaced with a system consisting of multiple devices. For example, at least part of the programs and data stored in the storage unit 113 of the MI server 1 may be stored in a storage device external to the MI server 1.

Although several embodiments of the present invention have been described, the scope of the present invention is not limited to the embodiments described above, and includes the scope of the invention described in the claims and its equivalents.

The present invention can be used in information processing devices, programs, and information processing methods.

1 MI server (information processing device)
11 Control unit (encryption unit, learning model generation unit, information providing unit, first to third data acquisition units, prediction unit, structure data generation unit)
111 CPU
112 RAM
113 Storage unit
113a General data
113b Client-derived data
113c Program
12 Operation unit
13 Display unit
14 Communication unit
15 Bus
2 Public DB server (database)
3 Reagent DB server (second external device)
4 Client server (first external device)
41 Control unit
411 CPU
412 RAM
413 Storage unit
413a Program
42 Operation unit
43 Display unit
44 Communication unit
45 Bus
100 Compound information processing system
A1, B1, C1 Molecular structure data (structure data)
A2, B2, C2 Encrypted molecular structure data (encrypted structure data)
C2L Labeled encrypted molecular structure data (encrypted structure data)
C3 Function data
D1 Encryption algorithm information
D2 Learning model data
N Communication network

Claims

An information processing device comprising:
an information providing unit that provides a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
a first data acquisition unit that acquires, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
a prediction unit that predicts, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to a structure of a compound according to the encryption algorithm, and function data relating to a function of the compound.

The information processing device according to claim 1, further comprising:
an encryption unit that encrypts structure data relating to a structure of a compound according to the encryption algorithm to generate encrypted structure data; and
a learning model generation unit that generates a learning model as the prediction model based on the encrypted structure data and function data relating to a function of the compound.

The information processing device according to claim 2, wherein
the first data acquisition unit acquires, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to the encrypted structure data of the learning target, and
the learning model generation unit generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the first data acquisition unit.

An information processing device comprising:
an encryption unit that encrypts structure data relating to a structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
a learning model generation unit that generates, based on the encrypted structure data and function data relating to a function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
an information providing unit that provides a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
a first data acquisition unit that acquires, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to the encrypted structure data of the learning target,
wherein the learning model generation unit generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the first data acquisition unit.

The information processing device according to claim 4, wherein
the first data acquisition unit acquires, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm, and
the information processing device further comprises a prediction unit that predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target acquired by the first data acquisition unit.

The information processing device according to claim 2, further comprising a structure data generation unit that generates the structure data, wherein
the encryption unit encrypts the structure data generated by the structure data generation unit to generate the encrypted structure data of a function prediction target, and
the prediction unit predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target generated by the encryption unit.

The information processing device according to claim 2, further comprising a second data acquisition unit that acquires structure data relating to a structure of a compound from a second external device that publishes the structure of the compound, wherein
the encryption unit encrypts the structure data acquired by the second data acquisition unit to generate the encrypted structure data of a function prediction target, and
the prediction unit predicts, based on the learning model, a function of a compound corresponding to the encrypted structure data of the function prediction target generated by the encryption unit.

The information processing device according to any one of claims 2 to 7, further comprising a third data acquisition unit that acquires the structure data and the function data from a predetermined external database, wherein
the encryption unit generates the encrypted structure data based on the structure data acquired by the third data acquisition unit, and
the learning model generation unit generates the learning model using at least the encrypted structure data and the function data acquired by the third data acquisition unit.

The information processing device according to any one of claims 1 to 8, wherein the encryption algorithm does not allow the encrypted structure data to be converted back into the structure data before encryption.

A program causing a computer provided in an information processing device to function as:
information providing means for providing a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
data acquisition means for acquiring, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
prediction means for predicting, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to a structure of a compound according to the encryption algorithm, and function data relating to a function of the compound.

A program causing a computer provided in an information processing device to function as:
encryption means for encrypting structure data relating to a structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
learning model generation means for generating, based on the encrypted structure data and function data relating to a function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
information providing means for providing a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
data acquisition means for acquiring, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to the encrypted structure data of the learning target,
wherein the learning model generation means generates the learning model using at least the encrypted structure data of the learning target and the function data of the learning target acquired by the data acquisition means.

An information processing method executed by an information processing device, the method comprising:
an information providing step of providing a first external device with encryption algorithm information for performing encryption according to a predetermined encryption algorithm;
a data acquisition step of acquiring, from the first external device, encrypted structure data of a function prediction target encrypted according to the encryption algorithm; and
a prediction step of predicting, based on a predetermined prediction model, a function of a compound corresponding to the encrypted structure data of the function prediction target,
wherein the prediction model represents a correlation between encrypted structure data, obtained by encrypting structure data relating to a structure of a compound according to the encryption algorithm, and function data relating to a function of the compound.

An information processing method executed by an information processing device, the method comprising:
an encryption step of encrypting structure data relating to a structure of a compound according to a predetermined encryption algorithm to generate encrypted structure data;
a learning model generation step of generating, based on the encrypted structure data and function data relating to a function of the compound, a learning model representing a correlation between the encrypted structure data and the function data;
an information providing step of providing a first external device with encryption algorithm information for performing encryption according to the encryption algorithm; and
a data acquisition step of acquiring, from the first external device, encrypted structure data of a learning target encrypted according to the encryption algorithm, and function data of the learning target relating to a function of a compound corresponding to the encrypted structure data of the learning target,
wherein, in the learning model generation step, the learning model is generated using at least the encrypted structure data of the learning target and the function data of the learning target acquired in the data acquisition step.