JP6506201B2

JP6506201B2 - System and method for determining explanatory variable group corresponding to objective variable

Info

Publication number: JP6506201B2
Application number: JP2016057676A
Authority: JP
Inventors: 淳平佐藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2016-03-22
Filing date: 2016-03-22
Publication date: 2019-04-24
Anticipated expiration: 2036-03-22
Also published as: JP2017174022A

Description

The present invention relates to a system and method for determining a group of explanatory variables corresponding to an objective variable.

In recent years, in the medical field, events indicated by an objective variable have been predicted by analyzing, with machine learning or statistical analysis, data sets that are built from accumulated data and consist of the measured values of a plurality of explanatory variables (for example, disease names and prescribed drugs).

JP 2000-20504 A (Patent Document 1) is background art for selecting the explanatory variables to include in a data set used for such prediction. Patent Document 1 states (see abstract): "The device comprises explanatory variable synthesis units S1 to Sn that generate explanatory variables based on candidate explanatory variables; a regression analysis execution unit 1 that generates a relational expression explaining changes in the objective variable based on the explanatory variables generated by the explanatory variable synthesis units S1 to Sn; an appropriateness determination unit 2 that quantitatively evaluates the appropriateness of the relational expression generated by the regression analysis execution unit 1; and a best regression equation determination unit 3 that searches for the explanatory variables and the regression equation with the highest appropriateness. The candidate explanatory variables input to the explanatory variable synthesis units S1 to Sn are set by explanatory variable generation parameters output from the best regression equation determination unit 3. The explanatory variable synthesis units S1 to Sn can output as explanatory variables not only the candidate explanatory variables themselves but also combinations of candidate explanatory variables and the results of applying some arithmetic operation to the candidate explanatory variables."

Patent Document 1: JP 2000-20504 A

Consider an example in which the technique described in Patent Document 1 selects the explanatory variables to include in the above data set from the disease names in the International Classification of Diseases. The International Classification of Diseases consists of a plurality of disease names, each of which belongs to a major, middle, or minor classification. The classification is defined as a tree structure running from superordinate concepts to subordinate concepts. That is, each disease name in the minor classification is a subordinate concept of some disease name in the middle classification, and each disease name in the middle classification is a subordinate concept of some disease name in the major classification.
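The tree structure described above can be sketched as a parent-pointer map. This is only an illustration: the node names, the `PARENT` map, and the helpers `ancestors` and `children` are hypothetical and do not come from the patent or from any real classification table.

```python
# Hypothetical three-level classification; all names are illustrative only.
PARENT = {
    "disease group alpha": "major class X",   # middle -> major classification
    "disease A": "disease group alpha",       # minor -> middle classification
    "disease B": "disease group alpha",
    "disease C": "disease group alpha",
}


def ancestors(node, parent=PARENT):
    """Return the superordinate concepts of `node`, nearest first."""
    result = []
    while node in parent:
        node = parent[node]
        result.append(node)
    return result


def children(node, parent=PARENT):
    """Return the direct subordinate concepts of `node`, sorted by name."""
    return sorted(child for child, p in parent.items() if p == node)


print(ancestors("disease B"))
print(children("disease group alpha"))
```

A parent-pointer map keeps the upward lookup (finding superordinate concepts) to a simple loop, which is the direction the later discussion relies on.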

When explanatory variables are selected from explanatory variable candidates defined in a tree structure, such as the International Classification of Diseases, the connections between the candidates in the tree structure must be taken into account. The technique described in Patent Document 1, however, does not consider these connections.

For example, consider a middle classification "disease name group α" that includes the minor classifications "disease name A", "disease name B", and "disease name C" as subordinate concepts. If "disease name B" strongly influences the objective variable, then "disease name group α", the superordinate concept of "disease name B", is also likely to influence the objective variable strongly, even if "disease name A" and "disease name C" influence it only weakly.

In this case, the technique described in Patent Document 1 is likely to select "disease name group α" as an explanatory variable in addition to "disease name B". If both "disease name B" and "disease name group α" are selected, however, the user cannot determine whether "disease name A" and "disease name C" influence the objective variable.

Moreover, if both "disease name B" and "disease name group α" are selected as explanatory variables, "disease name B" affects the prediction of the objective variable twice, which may lower the prediction accuracy. In such a case it is therefore desirable to take the tree structure into account and regenerate the candidate "disease name group α" as a new candidate disease name group obtained by excluding "disease name B" from the subordinate concepts of "disease name group α", that is, a new candidate consisting of "disease name A" and "disease name C".
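As a minimal sketch of this regeneration step, the following Python function rebuilds a group candidate without the member that was already selected. The source does not state how the regenerated candidate's measured values are computed; the sketch assumes 0/1 presence flags per record and combines the remaining members with a logical OR. The function name `regenerate` and the sample data are hypothetical.

```python
def regenerate(group_members, selected, measurements):
    """Rebuild a group candidate without the already selected member.

    group_members: names of the subordinate concepts of the group
    selected:      the member already chosen as an explanatory variable
    measurements:  name -> list of 0/1 flags, one per record (assumed encoding)
    Returns the remaining member names and the regenerated per-record values.
    """
    remaining = [m for m in group_members if m != selected]
    n_records = len(measurements[selected])
    # Assumption: the regenerated value is 1 if any remaining member is 1.
    values = [
        max((measurements[m][i] for m in remaining), default=0)
        for i in range(n_records)
    ]
    return remaining, values


# Hypothetical data: four records, one flag per disease name.
measurements = {
    "disease A": [1, 0, 0, 0],
    "disease B": [0, 1, 0, 1],
    "disease C": [0, 0, 1, 0],
}
remaining, values = regenerate(
    ["disease A", "disease B", "disease C"], "disease B", measurements
)
print(remaining, values)
```

After the call, the regenerated group no longer reflects any record that is explained only by "disease B", so the two variables do not overlap in meaning.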

In addition, the medical field, for example, has several thousand to several tens of thousands of explanatory variable candidates, such as disease names, prescribed drugs, and examination items. With the technique described in Patent Document 1, it is therefore difficult for a user to manually and appropriately specify which explanatory variables to select, which explanatory variable candidates to regenerate, and how to regenerate them.

To solve the above problem, one aspect of the present invention adopts the following configuration. A system for determining an explanatory variable group corresponding to an objective variable includes a processor and a storage device. The storage device holds conceptual structure information, which indicates one or more tree structures each having one or more levels together with a correspondence indicating that each of a plurality of explanatory variable candidates is one of the nodes of the one or more tree structures, and measured value information, which indicates the measured values of each of the plurality of explanatory variable candidates and the measured values of the objective variable. Each node of the one or more tree structures is a concept, and in the one or more tree structures a parent node is a superordinate concept of its child nodes. The processor repeats a determination process that determines, from an explanatory variable candidate group, the explanatory variables to be included in the explanatory variable group; in the first determination process, the explanatory variable candidate group is the plurality of explanatory variable candidates. In the determination process, the processor: calculates, based on the measured value information, a first degree of association between each explanatory variable candidate in the candidate group and the objective variable; selects from the candidate group, based on each first degree of association, an explanatory variable candidate to be included in the explanatory variable group; includes the selected candidate in the explanatory variable group; excludes the selected candidate from the candidate group; refers to the conceptual structure information to identify, in the candidate group, the explanatory variable candidates that are superordinate concepts of the selected candidate; performs a change process that changes each identified candidate into an explanatory variable candidate obtained by excluding the selected candidate from the subordinate concepts that the conceptual structure information indicates for that candidate; acquires from the measured value information the measured values of the sibling nodes that the conceptual structure information indicates for the selected candidate; calculates, based on the acquired measured values, the measured value of each explanatory variable candidate as changed by the change process; and includes the calculated measured values in the measured value information.
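The repeated determination process can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the degree of association is taken to be the absolute Pearson correlation, selection stops when the best score falls below a `threshold`, measured values are 0/1 flags, and a group's value is the logical OR of the leaf concepts it still covers. None of these choices is specified in the passage above, and all function and variable names are hypothetical.

```python
from math import sqrt


def association(x, y):
    """Assumed measure: absolute Pearson correlation (0.0 if a series is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / sqrt(sxx * syy))


def select_variables(parent, leaf_values, objective, threshold=0.5):
    """Greedy selection with regeneration of superordinate candidates.

    parent:      child -> parent map (the conceptual structure)
    leaf_values: leaf concept -> list of 0/1 measured values per record
    objective:   measured values of the objective variable per record
    """
    def ancestors(node):
        out = []
        while node in parent:
            node = parent[node]
            out.append(node)
        return out

    n = len(objective)
    # Leaves currently covered by each candidate; a leaf covers itself.
    cover = {leaf: {leaf} for leaf in leaf_values}
    for leaf in leaf_values:
        for anc in ancestors(leaf):
            cover.setdefault(anc, set()).add(leaf)

    def measured(name):
        # Assumption: a candidate is 1 for a record if any covered leaf is 1.
        return [max((leaf_values[l][i] for l in cover[name]), default=0)
                for i in range(n)]

    candidates = set(cover)
    selected = []
    while candidates:
        scores = {c: association(measured(c), objective) for c in candidates}
        best = max(sorted(candidates), key=lambda c: scores[c])
        if scores[best] < threshold:       # assumed stopping rule
            break
        selected.append((best, sorted(cover[best])))
        candidates.discard(best)
        # Change process: remove the selected concept from each ancestor candidate.
        for anc in ancestors(best):
            if anc in candidates:
                cover[anc] = cover[anc] - cover[best]
                if not cover[anc]:
                    candidates.discard(anc)
    return selected


# Hypothetical data: "B" tracks the objective; "A" and "C" are unrelated to it.
parent = {"A": "alpha", "B": "alpha", "C": "alpha"}
leaf_values = {
    "A": [1, 0, 0, 0, 1, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 0, 0, 0],
    "C": [0, 1, 0, 0, 0, 1, 0, 0],
}
objective = [1, 1, 1, 1, 0, 0, 0, 0]
result = select_variables(parent, leaf_values, objective)
print(result)
```

With this sample data, the group "alpha" initially scores above the threshold only because it contains "B". Once "B" is selected and "alpha" is regenerated from "A" and "C", its association drops to zero and it is not selected.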

According to one aspect of the present invention, explanatory variables with explanatory power for the objective variable can be selected, and overlap in meaning among the selected explanatory variables can be avoided.

Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

FIG. 1 is a block diagram showing a configuration example of the variable optimization system in Embodiment 1.
FIG. 2 is an example of the conceptual structure table in Embodiment 1.
FIG. 3 is an example of the patient information table in Embodiment 1.
FIG. 4 is an example of the examination information table in Embodiment 1.
FIG. 5 is an example of the prescription information table in Embodiment 1.
FIG. 6 is an example of the explanatory variable candidate table in Embodiment 1.
FIG. 7 is an example of the data set storage table in Embodiment 1.
FIG. 8 is an example of the data set storage table in Embodiment 1 after the explanatory variable extraction process and the regeneration process have been executed.
FIG. 9 is an example of the degree-of-association table in Embodiment 1.
FIG. 10 is an example of the degree-of-association table in Embodiment 1 after the explanatory variable extraction process and the regeneration process have been executed.
FIG. 11 is a flowchart showing an example of the variable optimization process in Embodiment 1.
FIG. 12 is a flowchart showing an example of the degree-of-association calculation process in Embodiment 1.
FIG. 13 is a flowchart showing an example of the explanatory variable extraction process and the explanatory variable regeneration process in Embodiment 1.
FIG. 14 is an example of the analysis option reception screen in Embodiment 1.
FIG. 15 is an example of the analysis result display screen in Embodiment 1.

Embodiments of the present invention are described below with reference to the accompanying drawings. Note that these embodiments are merely examples for realizing the present invention and do not limit its technical scope. Components common to the drawings are given the same reference numerals.

This embodiment describes a variable optimization system. The variable optimization system selects one or more explanatory variables corresponding to an objective variable from a plurality of explanatory variable candidates. Each of the explanatory variable candidates is defined as one of the nodes of a tree structure having one or more levels.

<System configuration>
FIG. 1 is a block diagram showing a configuration example of the information system of this embodiment. This embodiment describes an example in which the variable optimization system is applied to hospital information. The information system of this embodiment includes, for example, a variable optimization system 100, a hospital information system 120, and an input/output terminal 130, which are connected to one another via a network 140.

The network 140 uses wired communication over LAN (Local Area Network) cables or wireless communication over a wireless LAN. The network 140 can also use other wide area networks such as the Internet, a VPN, a mobile phone network, or a PHS network.

The variable optimization system 100 uses the accumulated measured values of the objective variable and the explanatory variable candidates to calculate the degree of association between each explanatory variable candidate and the objective variable, and determines, based on the degree of association, which of the candidates to extract as explanatory variables. It also regenerates the explanatory variable candidates that are selected on the basis of the extracted explanatory variables.

The hospital information system 120 is, for example, a computer having a storage device that stores a patient information database 121, an examination information database 122, and a prescription information database 123. Each database in the hospital information system 120 stores measured values of variables.

The patient information database 121 stores basic information and the like for each patient of the hospital. The examination information database 122 stores examination information for each patient. The prescription information database 123 stores prescription information for each patient. The information stored in the databases of the hospital information system 120 is provided to the variable optimization system 100 via the network 140.

The input/output terminal 130 is, for example, one or more personal computers, each including an input device (not shown) such as a keyboard, mouse, or touch panel, an output device (not shown) such as a display, and a communication unit (not shown) that communicates with the variable optimization system 100 and other devices. Alternatively, the input/output terminal 130 may be a portable terminal such as a PDA, PHS, mobile phone, smartphone, or tablet that includes an input device such as buttons or a touch panel, an output device such as a display, and a communication unit that communicates with the variable optimization system 100 and other devices.

The input/output terminal 130 is installed, for example, in a medical institution (healthcare provider) such as a hospital or clinic. The variable optimization system 100 is installed, for example, in a data center. Installing the variable optimization system 100 in a data center allows privacy information, such as patients' personal information and data collected from patients, to be managed centrally, which simplifies security management such as the prevention of information leaks. Depending on the mode of operation, the variable optimization system 100 may instead be installed at the healthcare provider.

Doctors, analysts, pharmacists, administrators, and executives of a medical institution are examples of users of the input/output terminal 130 (hereinafter, users). Following instructions that a user issues from the input/output terminal 130, the variable optimization system 100 extracts the explanatory variables corresponding to the objective variable and outputs a data set containing the measured values of the extracted explanatory variables and of the objective variable.

The variable optimization system 100 is configured, for example, as a computer including a control unit 101, an output unit 102, a memory 103, a communication unit 104, and an auxiliary storage device 116, which are connected to one another. The control unit 101 is, for example, a processor that executes programs stored in the memory 103, and it controls the variable optimization system 100.

The memory 103 includes a ROM, which is a non-volatile storage element, and a RAM, which is a volatile storage element. The ROM stores unchanging programs (for example, the BIOS). The RAM is a high-speed, volatile storage element such as DRAM (Dynamic Random Access Memory) and temporarily stores the programs executed by the control unit 101 and the data used while the programs run.

The auxiliary storage device 116 is a large-capacity, non-volatile storage device such as a magnetic storage device (HDD) or flash memory (SSD), and stores the programs executed by the control unit 101 and the data used while the programs run. That is, a program is read from the auxiliary storage device 116, loaded into the memory 103, and executed by the control unit 101. Some or all of the data stored in the auxiliary storage device 116 may be stored in the memory 103, and some or all of the data stored in the memory 103 may be stored in the auxiliary storage device 116.

The auxiliary storage device 116 stores, for example, an integrated database 113, a conceptual structure database 114, and an analysis data storage database 115. The integrated database 113 stores information acquired from the databases of the hospital information system 120.

Specifically, the integrated database 113 stores, for example, variables that can be candidates for the explanatory variables to be extracted, together with the measured values of those variables. Hereinafter, a candidate for an explanatory variable to be extracted is called an explanatory variable candidate. In this embodiment, "measured value" is a concept that includes actually measured values, predicted values, estimated values, and the like. The conceptual structure database 114 stores data indicating the hierarchical structure to which the variables belong. The analysis data storage database 115 stores, for example, data indicating the results of the variable optimization process.

The control unit 101 includes, for example, the following processing units, each of which executes processing to realize a function of the variable optimization system 100: a display screen generation unit 105, a data extraction unit 106, an objective variable generation unit 107, an explanatory variable generation unit 108, a data set creation unit 109, a degree-of-association calculation unit 110, an explanatory variable regeneration unit 111, a conceptual structure editing unit 112, and an accumulated data acquisition unit 119.

For example, the control unit 101 functions as the display screen generation unit 105 by operating according to a display screen generation program loaded into the memory 103, and functions as the data extraction unit 106 by operating according to a data extraction program loaded into the memory 103. The same applies to the other units of the control unit 101. Each unit of the control unit 101 may also be realized by dedicated hardware.

The display screen generation unit 105 generates the information for displaying the analysis options and analysis results, both described later, and outputs the generated information to a display device or the like via the output unit 102, or to the input/output terminal 130 or the like via the communication unit 104.

The data extraction unit 106 extracts data from the databases stored in the auxiliary storage device 116. The objective variable generation unit 107 generates the objective variable based on, for example, instructions from the user and the information stored in the conceptual structure database 114. The explanatory variable generation unit 108 generates the candidates for the explanatory variables to be extracted based on, for example, instructions from the user and the information stored in the conceptual structure database 114.

The data set creation unit 109 creates the data set storage table, which is stored in the analysis data storage database 115. The data set storage table is described in detail later. The degree-of-association calculation unit 110 calculates the degree of association between variables. The explanatory variable regeneration unit 111 regenerates the explanatory variable candidates that are superordinate concepts of an extracted explanatory variable. The conceptual structure editing unit 112 edits the conceptual structure table stored in the conceptual structure database 114. The conceptual structure table is described in detail later.

The accumulated data acquisition unit 119 acquires the data stored in the patient information database 121, the examination information database 122, and the prescription information database 123, and stores the acquired data in the tables of the integrated database 113, which are described later. The accumulated data acquisition unit 119 acquires the data, for example, in response to an instruction from the input/output terminal 130 or automatically at a time specified in advance by the user. The accumulated data acquisition unit 119 may also acquire the data when the hospital information system 120 updates the patient information database 121, the examination information database 122, or the prescription information database 123, for example in response to a notification from the hospital database.

The output unit 102 is an interface to which a display device, printer, or the like is connected, and which outputs the results of program execution in a form the operator can view. The variable optimization system 100 may further include an input unit. The input unit is an interface to which a keyboard, mouse, or the like is connected, and which receives input from the operator.

The communication unit 104 is a network interface device that is connected to the network 140 and controls communication with other devices according to a predetermined protocol. The communication unit 104 also includes, for example, a serial interface such as USB.

The programs executed by the control unit 101 are provided to the variable optimization system 100 via removable media (CD-ROM, flash memory, and the like) or via a network, and are stored in the non-volatile auxiliary storage device 116, which is a non-transitory storage medium. For this reason, the variable optimization system 100 preferably has an interface for reading data from removable media.

The variable optimization system 100, the input/output terminal 130, and the hospital information system 120 are each a computer system configured on one physical computer or on a plurality of logically or physically configured computers. They may run as separate threads on the same computer, or on virtual machines built on a plurality of physical computer resources.

Examples of the tables that make up the integrated database 113 are described below. In this embodiment, the information used by the variable optimization system 100 does not depend on a particular data structure and may be represented in any data structure. The examples below store the information in tables, but the information can also be stored in a data structure appropriately chosen from, for example, a list, a database, or a queue.

The integrated database 113 stores, for example, a patient information table that manages basic information and the like for each patient, an examination information table that stores examination information for each patient, and a prescription information table that stores prescription information for each patient. The information held in multiple tables in this embodiment may instead be held in a single table. For example, the integrated database 113 may manage the information using one table that combines the patient information table, the examination information table, and the prescription information table.

FIG. 3 is an example of the patient information table. The patient information table 300 includes, for example, a field 301 that stores a patient ID identifying the patient, a field 302 that stores the patient's sex, a field 303 that stores an admission flag indicating whether the record is inpatient data or outpatient data, a field 304 that stores the date of an outpatient diagnosis at the medical institution, a field 305 that stores the date of admission to the medical institution, a field 306 that stores the date of discharge from the medical institution, and fields 307 to 309 that store the names of the patient's diseases.

A record whose admission flag is "1" is inpatient data, and "0" is stored in the diagnosis date field 304. The admission date field 305 and the discharge date field 306 store the admission date and discharge date of that inpatient, respectively. For a patient who has not yet been discharged, "0" is stored in the discharge date field 306. A record whose admission flag is "0" is outpatient data: the diagnosis date field 304 stores the value for that outpatient, and "0" is stored in the admission date field 305 and the discharge date field 306.

Each record (patient information record) of the patient information table 300 stores information on one admission or one diagnosis of one patient. In other words, one record corresponds to a single admission or a single outpatient diagnosis of one patient.

For example, in FIG. 3, the patient information record 310 shows that, for the patient with patient ID "#1", the sex is "male", the admission flag is "1", the diagnosis date is "0", the admission date is "2014/04/01", the discharge date is "2014/04/26", disease name 1 is "disease name A", disease name 2 is "disease name B", and disease name 3 is "NULL", which indicates that no information is registered. That is, the patient information record 310 shows that the male patient identified by patient ID "#1" was hospitalized from April 1, 2014 to April 26, 2014 for disease name A and disease name B. The patient information table 300 may include four or more fields for storing disease names.
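The record layout described above can be sketched as a small data structure. This is only an illustrative sketch: the class name `PatientRecord`, its attribute names, and the helper methods are hypothetical, and `None` is used here in place of the "0" and "NULL" placeholders that the table itself stores.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    """One row of a patient information table (one admission or one outpatient diagnosis)."""
    patient_id: str                    # field 301
    sex: str                           # field 302
    admission_flag: int                # field 303: 1 = inpatient, 0 = outpatient
    diagnosis_date: Optional[date]     # field 304 (None stands in for "0")
    admission_date: Optional[date]     # field 305 (None stands in for "0")
    discharge_date: Optional[date]     # field 306 (None stands in for "0")
    diseases: Tuple[str, ...]          # fields 307 to 309, unregistered entries omitted

    def is_consistent(self) -> bool:
        """Check the flag/date rules stated for inpatient and outpatient records."""
        if self.admission_flag == 1:
            return self.diagnosis_date is None and self.admission_date is not None
        return (self.diagnosis_date is not None
                and self.admission_date is None
                and self.discharge_date is None)

    def length_of_stay(self) -> Optional[int]:
        """Days between admission and discharge; None for outpatients or undischarged patients."""
        if self.admission_flag != 1 or self.admission_date is None or self.discharge_date is None:
            return None
        return (self.discharge_date - self.admission_date).days


# Values taken from patient information record 310 in FIG. 3.
record_310 = PatientRecord("#1", "male", 1, None, date(2014, 4, 1), date(2014, 4, 26),
                           ("disease name A", "disease name B"))
```

With these values, `record_310.is_consistent()` holds and `record_310.length_of_stay()` gives the 25 days between the two dates.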

The fields included in the patient information table 300 are not limited to those illustrated in the drawing, and the patient information table 300 need not include all of the illustrated fields. The same applies to the other tables.

FIG. 4 is an example of the examination information table 400. The examination information table 400 includes, for example, a field 401 that stores a patient ID, a field 402 that stores information identifying an examination item, a field 403 that stores an examination date, and a field 404 that stores an examination result (for example, an examination value).

In the example of FIG. 4, for the patient with patient ID "#1", the value of "examination item X" was "41" on "2014/04/1", "62" on "2014/04/7", "180" on "2014/04/13", "220" on "2014/04/15", "196" on "2014/04/18", and "120" on "2014/04/25", and the value of "examination item Y" was "38" on "2014/04/1".

The field 404 that stores the examination value may also store, for example, image information that is the result of an imaging examination (for example, a CT examination). The field 404 may also store a character string, such as information on the patient's subjective symptoms, for example "nausea" or "vomiting".

FIG. 5 is an example of the prescription information table 500. The prescription information table 500 includes, for example, a field 501 that stores a patient ID, a field 502 that stores information identifying a drug (for example, a drug name), a field 503 that stores the prescription start date of each drug, and a field 504 that stores the prescription end date of each drug.

In the example of FIG. 5, the patient with patient ID "#1" was prescribed "drug A" from "2014/04/01" to "2014/04/07" and from "2014/04/08" to "2014/04/14", "drug B" from "2014/04/03" to "2014/04/8", "drug D" from "2014/04/16" to "2014/04/20", and "drug T" from "2014/04/14" to "2014/04/18".

The prescription information table 500 may further include, for example, a field that stores the usage of each prescribed drug and a field that stores the dose of each prescribed drug.

An example of the tables constituting the conceptual structure database 114 is described below. The conceptual structure database 114 includes, for example, a conceptual structure table that manages the names of conceptual structures and the concept levels corresponding to each conceptual structure.

FIG. 2 is an example of the conceptual structure table. The conceptual structure table 200 stores information specifying one or more tree structures and the concepts that are the nodes of each of those tree structures. Every concept stored in the conceptual structure table 200 can be an explanatory variable candidate.

The conceptual structure table 200 includes, for example, a field 201 that stores the name of a conceptual structure, a field 202 that stores the concepts belonging to concept level 1, which is the highest level of the conceptual structure with that name, a field 203 that stores the concepts belonging to concept level 2, which is one level below concept level 1, and a field 204 that stores the concepts belonging to concept level 3, which is one level below concept level 2.

The conceptual structure table 200 may be defined in advance, or may be created by processing of the conceptual structure editing unit 112 described later. A concept corresponding to an ancestor node (including the parent node) of a given concept is a superordinate concept of that concept. A concept corresponding to a descendant node (including a child node) of a given concept is a subordinate concept of that concept.

The conceptual structure records 200A, 200B, and 200C, which are records of the conceptual structure table 200 in FIG. 2, show that "disease name group 1" belongs to concept level 1 of the conceptual structure "disease name concept 1", that "disease name group α", which belongs to concept level 2, is a subordinate concept of "disease name group 1", and that "disease name A", "disease name B", and "disease name C", which belong to concept level 3, are subordinate concepts of "disease name group α".

The conceptual structure table 200 may further include fields that store concepts belonging to concept levels below concept level 3, that is, concept level 4 and lower. In other words, a conceptual structure may include four or more levels.

A single concept may be included in a plurality of conceptual structures. In the example of FIG. 2, the conceptual structure records 200D and 200E show that "drug group θ" is included in both the conceptual structure "drug concept 1" and the conceptual structure "knowledge concept 1". When one concept belongs to a plurality of conceptual structures, as in the example of FIG. 2, the concept level to which it belongs in one of those conceptual structures may differ from the concept level to which it belongs in another.
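The superordinate and subordinate relationships held in such a table can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the row tuples mirror fields 201 to 204, the rows for "drug concept 1" and "knowledge concept 1" are assumed values chosen only to show one concept at different levels, and the helper names `build_children`, `descendants`, and `ancestors` are hypothetical. For simplicity the sketch merges all conceptual structures into one parent-to-child map.

```python
from typing import Dict, List, Optional, Set, Tuple

# (structure name, concept level 1, concept level 2, concept level 3)
Row = Tuple[str, Optional[str], Optional[str], Optional[str]]

ROWS: List[Row] = [
    ("disease name concept 1", "disease name group 1", "disease name group α", "disease name A"),
    ("disease name concept 1", "disease name group 1", "disease name group α", "disease name B"),
    ("disease name concept 1", "disease name group 1", "disease name group α", "disease name C"),
    # Assumed rows: "drug group θ" sits at level 1 in one structure and level 2 in another.
    ("drug concept 1", "drug group θ", "drug A", None),
    ("knowledge concept 1", "treatment drug for disease name Z", "drug group θ", "drug A"),
]


def build_children(rows: List[Row]) -> Dict[str, Set[str]]:
    """Map each concept to its direct child concepts across all structures."""
    children: Dict[str, Set[str]] = {}
    for _structure, *levels in rows:
        path = [concept for concept in levels if concept is not None]
        for parent, child in zip(path, path[1:]):
            children.setdefault(parent, set()).add(child)
    return children


def descendants(children: Dict[str, Set[str]], concept: str) -> Set[str]:
    """All subordinate concepts (children, grandchildren, ...) of a concept."""
    found: Set[str] = set()
    stack = [concept]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def ancestors(children: Dict[str, Set[str]], concept: str) -> Set[str]:
    """All superordinate concepts of a concept."""
    return {parent for parent in children if concept in descendants(children, parent)}


CHILDREN = build_children(ROWS)
```

For instance, `descendants(CHILDREN, "disease name group 1")` contains "disease name group α" and the three disease names, and `ancestors(CHILDREN, "drug A")` contains both "drug group θ" and "treatment drug for disease name Z".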

An example of the tables constituting the analysis data storage database 115 is described below. The analysis data storage database 115 includes, for example, an explanatory variable candidate table, a data set storage table, and an association degree table.

FIG. 6 is an example of the explanatory variable candidate table. The explanatory variable candidate table 600 includes, for example, a field 601 that stores the name of an explanatory variable candidate, a field 602 that stores an extraction flag indicating whether the candidate has become an explanatory variable to be extracted, a field 603 that stores a regeneration flag indicating whether the candidate has become a target of regeneration, a field 604 that stores information on the subordinate-concept variable to be removed from the candidate when the candidate is regenerated, and a field 605 that stores an exclusion flag indicating whether the candidate has been excluded from extraction.

The initial value stored in the field 602, the field 603, and the field 605 is "0", and the initial value stored in the field 604 is "NULL". When an explanatory variable candidate is determined as an explanatory variable to be extracted, the value of the cell in the field 602 corresponding to that candidate is changed to "1".

When an explanatory variable candidate becomes a target of regeneration, the value of the cell in the field 603 corresponding to that candidate is changed to "1", and the value of the cell in the field 604 corresponding to that candidate is changed to a value indicating the explanatory variable that caused the regeneration. When an explanatory variable candidate is excluded from extraction, the value in the field 605 corresponding to that candidate is changed to "1".

The example of FIG. 6 means that the explanatory variable candidates "disease name B" and "treatment drug for disease name Z" have been determined as explanatory variables to be extracted, and that the explanatory variable candidate "disease name group α", which in the conceptual structure includes the extracted explanatory variable "disease name B" as a subordinate concept, is a target of regeneration. It also means that the explanatory variable "disease name group α" is regenerated as an explanatory variable that does not include disease name B.

The example of FIG. 6 also shows that the explanatory variable candidate "drug group θ", which is a subordinate concept of the extracted explanatory variable "treatment drug for disease name Z", has been excluded from extraction, and that the explanatory variable candidate "drug A", which is a subordinate concept of the excluded candidate "drug group θ", has been excluded from extraction as well.
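The flag updates described for FIG. 6 can be sketched as follows. The class `CandidateRow` and the function `mark_extracted` are hypothetical names, and the lists of superordinate and subordinate concepts are passed in directly rather than looked up from a conceptual structure table. Holding a single excluded name in field 604 is a simplification of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class CandidateRow:
    extracted: int = 0                   # field 602, initial value "0"
    regenerate: int = 0                  # field 603, initial value "0"
    excluded_sub: Optional[str] = None   # field 604, initial value "NULL"
    out_of_scope: int = 0                # field 605, initial value "0"


def mark_extracted(table: Dict[str, CandidateRow], name: str,
                   ancestors: Iterable[str], descendants: Iterable[str]) -> None:
    """Flag `name` as extracted, its superordinate candidates for regeneration,
    and its subordinate candidates as excluded from extraction."""
    table[name].extracted = 1
    for upper in ancestors:
        if upper in table:
            table[upper].regenerate = 1
            table[upper].excluded_sub = name   # variable that caused the regeneration
    for lower in descendants:
        if lower in table:
            table[lower].out_of_scope = 1


names = ["disease name A", "disease name B", "disease name C", "disease name group α",
         "treatment drug for disease name Z", "drug group θ", "drug A"]
table = {n: CandidateRow() for n in names}

# Reproduce the state described for FIG. 6.
mark_extracted(table, "disease name B",
               ancestors=["disease name group α"], descendants=[])
mark_extracted(table, "treatment drug for disease name Z",
               ancestors=[], descendants=["drug group θ", "drug A"])
```

After these two calls, "disease name group α" carries the regeneration flag with "disease name B" recorded as the variable to remove, and "drug group θ" and "drug A" carry the exclusion flag.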

FIG. 7 is an example of the data set storage table 700, which stores the data set created by the data set creation unit 109. The data set storage table 700 includes, for example, a field 701 that stores a record identifier, a field 702 that stores the values of one or more explanatory variables generated by the explanatory variable generation unit 108, and a field 703 that stores the value of the objective variable generated by the objective variable generation unit 107. In the example of FIG. 7, the record identifier is the patient ID, and the explanatory variables include "disease name A", "disease name B", "disease name C", "disease name group α", and "treatment drug for disease name Z".

FIG. 8 is an example of the data set storage table 700 storing the data set after the explanatory variable candidate regeneration process and the explanatory variable extraction process have been performed. In FIG. 8, the field 702 stores the values of the one or more explanatory variables determined as extraction targets. The explanatory variable "disease name group α (excluding disease name B)" in FIG. 8 denotes the disease name group obtained by removing disease name B from the disease name group α indicated by the conceptual structure table 200, and is an explanatory variable regenerated from "disease name group α" by the explanatory variable regeneration unit 111.
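One way to picture the regenerated variable is sketched below. The sketch assumes that a group-level explanatory variable takes the value 1 when the patient has at least one member disease and 0 otherwise; the text does not state how group values are encoded, so this encoding, the function name `group_indicator`, and the sample patients are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set


def group_indicator(patient_diseases: Dict[str, Set[str]],
                    members: Iterable[str],
                    excluded: Iterable[str] = ()) -> Dict[str, int]:
    """Return 1 for patients with at least one member disease that is not excluded."""
    effective: FrozenSet[str] = frozenset(members) - frozenset(excluded)
    return {pid: int(bool(effective & diseases))
            for pid, diseases in patient_diseases.items()}


# Hypothetical patients, not taken from the figures.
patients = {
    "#1": {"disease name A", "disease name B"},
    "#2": {"disease name B"},
    "#3": {"disease name C"},
}
alpha = ["disease name A", "disease name B", "disease name C"]

original = group_indicator(patients, alpha)
regenerated = group_indicator(patients, alpha, excluded=["disease name B"])
```

Under this encoding, patient "#2", who has only disease name B, is counted in the original "disease name group α" but not in "disease name group α (excluding disease name B)", so the regenerated variable no longer duplicates information already carried by the extracted variable.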

FIG. 9 is an example of the association degree table 900, which stores the degrees of association between variables calculated by the association degree calculation unit 110. The association degree table 900 includes, for example, a field 901 that stores the name of a not-yet-extracted explanatory variable candidate, a field 902 that stores the degree of association between that candidate and the objective variable, and a field 903 that stores the degree of association between that candidate and the already extracted explanatory variables. When no explanatory variable has been extracted yet, the field 903 stores "NULL", as in the example of FIG. 9.

FIG. 10 is an example of the association degree table 900 storing the degrees of association calculated by the association degree calculation unit 110 after the explanatory variable "disease name B" has been extracted by the explanatory variable regeneration unit 111 and the explanatory variable candidate "disease name group α", which is a superordinate concept of "disease name B", has been regenerated as "disease name group α (excluding disease name B)".

In the example of FIG. 10, the association degree table 900 does not include a record corresponding to the already extracted explanatory variable "disease name B". The example of FIG. 10 also differs from the example of FIG. 9 in that the explanatory variable candidate "disease name group α" has been replaced by the regenerated candidate "disease name group α (excluding disease name B)". In the example of FIG. 10, the field 903 stores the degree of association, calculated by the association degree calculation unit 110, between each explanatory variable candidate and the already extracted explanatory variable "disease name B".

An operation example of the variable optimization system 100 of this embodiment is described below. FIG. 11 is a flowchart showing an operation example of the variable optimization system 100 of this embodiment. First, the data extraction unit 106 reads the patient data accumulated in the integrated database 113 into the memory 103 (S1101). Patient data is a generic term for data relating to patients, such as information identifying a patient, information on the drugs prescribed to the patient, and information on the patient's examination results.

The following description uses an example in which the data extraction unit 106 uses the patient information table 300, the examination information table 400, and the prescription information table 500 as the patient data. In step S1101, the data extraction unit 106 may read all of the patient data in the integrated database 113, or may read only the patient data of patients designated by the user.

Next, the data extraction unit 106 reads, for example, the conceptual structure table 200 from the conceptual structure database 114 as conceptual structure data and stores it in the memory 103 (S1102). In step S1102, the data extraction unit 106 may read all of the data in the conceptual structure database 114, or may read only some of the conceptual structures. Next, the display screen generation unit 105 displays, for example, an analysis option reception screen on the input/output terminal 130 via the communication unit 104 and receives input of various analysis options (S1103).

FIG. 14 is an example of the analysis option reception screen. The analysis option reception screen 1400 includes, for example, an area 1401 for selecting the conceptual structures to be used, an area 1402 in which the user designates explanatory variables to be included in the extraction targets, an area 1403 for setting the method of calculating the degree of association, an area 1404 for setting the method of extracting and regenerating explanatory variables, an area 1405 for setting the method of creating the objective variable, an area 1406 for setting the number of explanatory variables to be extracted, a button 1407 for creating a conceptual structure using knowledge data, a button 1408 for editing a conceptual structure, and an analysis execution button 1409.

The check boxes in the area 1401 are used to select the conceptual structures to be used from among the plurality of conceptual structures displayed in the area 1401. The radio buttons in the area 1402 are used to specify whether there are explanatory variables that are to be forcibly included in the regenerated data set regardless of their degree of association. The pull-down list 14021 in the area 1402 is used, when the "Yes" radio button is selected, to select the conceptual structure that contains the explanatory variable the user wishes to designate.

The explanatory variable search box 14022 in the area 1402 receives input of the name of an explanatory variable to search for and is used to designate the explanatory variable that is found. The check boxes in the explanatory variable list display area 14023 in the area 1402 are used to decide which explanatory variables are designated. The example shown in FIG. 14 indicates that "disease name A" and "disease name group β" are designated as explanatory variables to be extracted. The area 1402 is configured so that, for example, when an explanatory variable is designated in the area 1402, explanatory variables that are subordinate concepts of that variable can no longer be selected.

The pull-down list 14031 in the area 1403 is used to select the method of calculating the degree of association. The example of FIG. 14 indicates that the degree of association is calculated using a correlation coefficient. The radio buttons in the area 1403 are used to specify whether to calculate the degree of association with the already extracted explanatory variables and to exclude explanatory variables whose degree of association with the already extracted explanatory variables is equal to or greater than a threshold. The up/down buttons in the area 14032, which is included in the area 1403, are used, when the "Yes" radio button is selected, to set the threshold of the degree of association used in the exclusion process. The area 14032 may also accept direct input of the threshold.
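The threshold-based exclusion can be sketched as follows. The text says only that a correlation coefficient is used; this sketch assumes the Pearson correlation coefficient and compares its absolute value with the threshold, both of which are assumptions. The function names `pearson` and `drop_redundant` and the sample values are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def drop_redundant(candidates: Dict[str, List[float]],
                   extracted_values: Sequence[float],
                   threshold: float) -> Dict[str, List[float]]:
    """Keep only candidates whose association with the extracted variable is below the threshold."""
    return {name: values for name, values in candidates.items()
            if abs(pearson(values, extracted_values)) < threshold}


# Hypothetical per-patient values of an already extracted variable and two candidates.
extracted = [1, 0, 1, 0]
candidates = {
    "candidate X": [1, 0, 1, 0],   # identical to the extracted variable
    "candidate Y": [1, 1, 0, 0],   # uncorrelated with the extracted variable
}
kept = drop_redundant(candidates, extracted, threshold=0.8)
```

Here "candidate X" duplicates the extracted variable and is removed, while "candidate Y" remains available for extraction.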

The check boxes in the area 1404 are used to specify, respectively, whether to perform the process "exclude subordinate concepts of an extracted explanatory variable" and whether to perform the process "for superordinate concepts of an extracted explanatory variable, target only the regenerated explanatory variable".

When the check box "exclude subordinate concepts of an extracted explanatory variable" is enabled, explanatory variable candidates that are subordinate concepts of an extracted explanatory variable are excluded from extraction. For example, when "treatment drug for disease name group Z" is extracted, the following are excluded from extraction: "drug group θ" and "drug group ψ", which are subordinate concepts of "treatment drug for disease name group Z"; "drug A" and "drug B", which are subordinate concepts of "drug group θ"; and the subordinate concepts of "drug group ψ".

When the check box "for superordinate concepts of an extracted explanatory variable, analyze only the regenerated explanatory variable" is not enabled, the explanatory variable candidates include both the superordinate-concept explanatory variable that contains the extracted explanatory variable and the candidate regenerated so as not to contain the extracted explanatory variable.

For example, when the explanatory variable "disease name B" is extracted, both the regenerated explanatory variable "disease name group α (excluding disease name B)" and the original explanatory variable "disease name group α" become explanatory variable candidates. In contrast, when the check box is enabled and the explanatory variable "disease name B" is extracted, the explanatory variable candidate group does not include "disease name group α".

The radio buttons in the area 1404 are used to select the extraction priority of explanatory variables. When the radio button corresponding to "degree of association with the objective variable" is selected, explanatory variables with a high degree of association with the objective variable are extracted preferentially. When the radio button corresponding to "degree of association with already extracted explanatory variables" is selected, explanatory variables with a low degree of association with the already extracted explanatory variables are extracted preferentially.
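The two extraction priorities can be sketched as a selection over rows shaped like the association degree table 900. The function name `pick_next`, the priority labels, and the numeric values are hypothetical. Treating a "NULL" entry in field 903 as 0.0 is also an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple

# candidate name -> (association with objective variable [field 902],
#                    association with extracted variables [field 903], None for "NULL")
Relevance = Dict[str, Tuple[float, Optional[float]]]


def pick_next(relevance: Relevance, priority: str) -> str:
    """Choose the next explanatory variable to extract under the selected priority."""
    if priority == "objective":
        # Prefer the candidate most strongly associated with the objective variable.
        return max(relevance, key=lambda name: relevance[name][0])
    if priority == "extracted":
        # Prefer the candidate least associated with already extracted variables.
        return min(relevance, key=lambda name: relevance[name][1] or 0.0)
    raise ValueError(f"unknown priority: {priority!r}")


# Hypothetical association degrees, not taken from FIG. 10.
relevance: Relevance = {
    "disease name A": (0.30, 0.10),
    "disease name C": (0.55, 0.70),
    "disease name group α (excluding disease name B)": (0.40, 0.25),
}
```

With these values, the "objective" priority selects "disease name C", whereas the "extracted" priority selects "disease name A", which illustrates how the two settings trade predictive strength against redundancy.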

The up/down buttons in the area 1406 are used to set the number of explanatory variables to be extracted. The area 1406 may also accept direct input of that number. The number of explanatory variables to be extracted that is specified in the area 1406 may or may not include the explanatory variables designated in the area 1402.

The radio buttons in the area 1405 are used to specify the type of objective variable to be created. The types of objective variable are, for example, a binary variable, a categorical variable, and a quantitative variable. A binary variable is a variable that takes one of two values, such as a variable that represents whether the event under analysis occurred as "0" or "1".

A categorical variable is a variable that can take a plurality of category values, such as a variable that represents the importance of the event under analysis as "low", "medium", or "high", or as "1", "2", or "3". A quantitative variable is a variable that represents quantitative information about the event under analysis with values such as "1", "10", or "100". For example, a categorical variable is on an ordinal or nominal scale, and when a categorical variable is used as the objective variable, its range includes three or more values. Likewise, for example, a quantitative variable is on an interval or ratio scale, and when a quantitative variable is used as the objective variable, its range includes three or more values.

When a categorical variable or a quantitative variable is designated as the objective variable, the association degree calculation unit 110 may, when calculating the degree of association between an explanatory variable and the objective variable, normalize the value of the objective variable to the range from "0" to "1", for example by using a predetermined function whose range is 0 or more and 1 or less.
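As one possible choice of such a function, min-max scaling maps values onto the range from 0 to 1. The text does not name the function, so min-max scaling and the function name `min_max_normalize` are assumptions of this sketch; a logistic function would be another candidate.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so that the minimum maps to 0.0 and the maximum to 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values equal: map everything to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# The quantitative example values "1", "10", "100" mentioned in the text.
normalized = min_max_normalize([1, 10, 100])
```

The smallest value becomes 0.0, the largest becomes 1.0, and 10 falls at 9/99 of the way between them.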

The plurality of pull-downs in the area 1405 are used to make detailed settings for the objective variable after its type has been specified. The example of FIG. 14 shows that, by setting three pull-downs, an objective variable indicating "presence or absence of event XX within 30 days after discharge" is created, with "1" stored as the objective variable when event XX occurred and "0" stored otherwise.
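A binary objective variable of this kind can be sketched as follows. The function name `event_within` is hypothetical, and whether the discharge day and the 30th day are counted is not stated in the text, so this sketch assumes that events strictly after discharge and up to and including day 30 count.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def event_within(discharge: Optional[date], event_dates: Iterable[date], days: int = 30) -> int:
    """Return 1 if any event falls within `days` days after discharge, otherwise 0."""
    if discharge is None:
        # Patient not yet discharged: no post-discharge window exists.
        return 0
    window_end = discharge + timedelta(days=days)
    return int(any(discharge < d <= window_end for d in event_dates))


# Discharge date of patient information record 310; the event dates are hypothetical.
discharge_310 = date(2014, 4, 26)
objective_value = event_within(discharge_310, [date(2014, 5, 20)])
```

An event on 2014/05/20 falls inside the 30-day window after the 2014/04/26 discharge, so the objective variable is 1; an event on 2014/05/27 would fall outside it and give 0.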

The items selectable in the pull-down 14051 in the area 1405 correspond to the type of objective variable specified with the radio buttons in the area 1405. For example, when the radio button corresponding to a binary variable is selected, only items representing binary variables can be selected in the pull-down 14051. In addition, the input area 14052 in the area 1405 accepts input of the value representing the "event present" state only when the radio button corresponding to a binary variable is selected.

The button 1407 causes the conceptual structure editing unit 112 to create the conceptual structure table 200 using knowledge data. Medical papers, drug package inserts, master information such as drug and disease name masters, and various guidelines are all examples of knowledge data. The knowledge data is stored, for example, in the auxiliary storage device 116, in the hospital information system 120, or in an external database connected to the variable optimization system 100.

An example of the conceptual structure editing process is described below, using as knowledge data a medical document that contains the statement "drug Z was used as a treatment action for disease name Z". The conceptual structure editing unit 112, for example, performs syntactic analysis on the statement based on predetermined rules and identifies the noun phrases in the statement. The conceptual structure editing unit 112 then determines whether a plurality of noun phrases in the statement are in a superordinate/subordinate relationship, based on predetermined rules such as whether the statement contains an expression included in a predetermined dictionary. The predetermined dictionary and the predetermined rules are stored in advance in, for example, the memory 103 or the auxiliary storage device 116. In this example, the conceptual structure editing unit 112 determines that "treatment action for disease name Z" is a superordinate concept of "drug Z".

The conceptual structure editing unit 112 determines a conceptual structure name, for example according to an instruction from the user, and stores the determined name in field 201, the concept "treatment for disease name Z" in field 202, the concept "therapeutic drug for disease name Z" in field 203, and the concept "drug Z" in field 204. As a result, the conceptual structure table 200 holds "therapeutic drug for disease name Z" as a subordinate concept of "treatment for disease name Z", and "drug Z" as a subordinate concept of "therapeutic drug for disease name Z".

Further, when the conceptual structure editing unit 112 determines, for example by referring to the conceptual structure table 200, that "drug Z" belongs to a specific drug group, it may create a conceptual structure that places a concept such as "drug group Z" between "therapeutic drug for disease name Z" and "drug Z". When a package insert is used as knowledge data, the conceptual structure editing unit 112 extracts, for example, the therapeutic category name and the drug name from the package insert and generates "drug name XX" as a subordinate concept of "therapeutic category XX".

Note that the superordinate/subordinate relationship indicated by the knowledge data may differ from the one recorded in the conceptual structure table 200, for example when "treatment for disease name Z" is already registered in the table as a subordinate concept of "drug Z". In that case, the conceptual structure editing unit 112 may edit the conceptual structure table 200 according to the user's instruction, or may edit it so that "drug Z" becomes a subordinate concept of "treatment for disease name Z".

Returning to the description of FIG. 14. Button 1408 causes, for example, the display screen generation unit 105 to create a conceptual structure editing screen containing the contents of the conceptual structure table 200 and to display it on the input/output terminal 130 or the like. Through this screen, the user can, for example, edit the concepts included in each concept level of a conceptual structure and add or delete conceptual structures.

Returning to the description of FIG. 11. On receiving notification from the input/output terminal 130 that the analysis execution button 1409 has been selected, the objective variable generation unit 107 generates the objective variable based on the analysis options accepted in step S1103 and the patient data read in step S1101 (S1104).

The following describes an example of step S1104 for the case where area 1405 specifies that the objective variable is a binary variable indicating "presence or absence of event XX within 30 days after discharge" and that the variable takes the value "1" when event XX is present.

First, the objective variable generation unit 107 acquires, from the patient information table 300, the discharge dates of patients who have been discharged. Next, it refers to the patient information table 300, the examination information table 400, the prescription information table 500, and the like, checks whether event XX occurred within 30 days of each patient's discharge date, and thereby determines the value of the objective variable for each patient.

Note that, for each objective variable that can be specified in area 1405, the patient data that the objective variable generation unit 107 refers to in order to determine each patient's value, and the method of determining that value, are, for example, defined in advance.

The objective variable generation unit 107 stores each patient's patient ID in field 701. For each patient, it stores "1" in the cell of field 703 corresponding to that patient if event XX occurred within 30 days of the patient's discharge date, and "0" otherwise. Information other than the tables described above may also be used to create the objective variable.
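As a minimal sketch of step S1104 for this example, the following Python function derives the binary objective variable from discharge dates and event dates. The function name `build_objective`, the dictionary-based inputs, and the inclusive treatment of day 0 and day 30 are assumptions made for illustration; the description only states that the value is "1" when event XX occurs within 30 days of discharge.

```python
from datetime import date, timedelta


def build_objective(discharges, events, window_days=30):
    """Return {patient_id: 0 or 1} for 'event within window_days of discharge'.

    discharges: {patient_id: discharge date}   (hypothetical input shape)
    events:     {patient_id: [event dates]}    (hypothetical input shape)
    """
    objective = {}
    for patient_id, discharged in discharges.items():
        deadline = discharged + timedelta(days=window_days)
        # Assumption: both the discharge day and day 30 count as "within".
        occurred = any(
            discharged <= event_day <= deadline
            for event_day in events.get(patient_id, [])
        )
        objective[patient_id] = 1 if occurred else 0
    return objective


discharges = {"P1": date(2016, 1, 1), "P2": date(2016, 1, 1), "P3": date(2016, 2, 1)}
events = {"P1": [date(2016, 1, 20)], "P2": [date(2016, 3, 1)]}
print(build_objective(discharges, events))
```

Patients with no recorded event, such as the hypothetical "P3" here, simply receive the value 0.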

Next, the explanatory variable generation unit 108 generates explanatory variable candidates (step S1105). Specifically, for example, it acquires from the conceptual structure table 200 the concepts included in the conceptual structure selected in area 1401. Of the acquired concepts, it adopts as explanatory variable candidates those that appear among the items of the patient data read in step S1101. It stores the names of the explanatory variable candidates in field 601 of the explanatory variable candidate table 600 and in field 901 of the degree-of-association table 900.

Next, the data set creation unit 109 creates a data set and registers it in the data set storage table 700 (S1106). Specifically, for example, the data set creation unit 109 stores the name of each explanatory variable candidate in the item name row of field 702 of the data set storage table 700. It then acquires each patient's value for each of these variables from the patient data and stores the value in the corresponding cell of field 702.

Next, the degree-of-association calculation unit 110 calculates the degree of association between the objective variable created in step S1104 and each of the explanatory variable candidates created in step S1105 (S1107).

FIG. 12 is a flowchart showing an example of the details of the degree-of-association calculation process performed by the degree-of-association calculation unit 110, that is, of step S1107. The degree-of-association calculation unit 110 reads the data set storage table 700 and the explanatory variable candidate table 600 into the memory 103 (S1201). It then determines whether the degree of association with the objective variable has been calculated for all explanatory variable candidates (S1202).

When the degree-of-association calculation unit 110 determines that there are explanatory variable candidates whose degree of association with the objective variable has not yet been calculated (S1202: unprocessed candidates remain), it selects one of them and calculates the degree of association between the selected candidate and the objective variable (S1203).

For example, the degree-of-association calculation unit 110 calculates, as the degree of association, the correlation coefficient between two column vectors: one formed by extracting, as is, the values of the column of the data set storage table 700 that corresponds to the selected explanatory variable candidate, and one formed in the same way from the column that corresponds to the objective variable. Alternatively, it may calculate a nonlinear correlation measure of the two column vectors, such as the maximal information coefficient (MIC), as the degree of association.

Next, the degree-of-association calculation unit 110 associates the explanatory variable candidate selected in step S1203 with the degree of association calculated in step S1203 and stores them in fields 901 and 902 of the degree-of-association table 900 (S1204). When it determines that the degree of association with the objective variable has been calculated for all explanatory variable candidates (S1202: all done), it ends the process of FIG. 12 and proceeds to step S1108 of FIG. 11.
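The loop of FIG. 12 can be sketched as follows, assuming the Pearson correlation coefficient as the degree of association (the description permits other measures). The names `pearson` and `association_table`, the column-dictionary layout, and the choice to return 0.0 for a constant column are illustrative assumptions, not part of the description.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: a constant column carries no association.
    return cov / sqrt(var_x * var_y)


def association_table(dataset, objective_name):
    """Compute the association of every candidate column with the objective.

    dataset: {column name: list of per-patient values} (hypothetical layout
    standing in for the data set storage table 700).
    """
    target = dataset[objective_name]
    return {
        name: pearson(column, target)
        for name, column in dataset.items()
        if name != objective_name
    }


dataset = {
    "objective": [1, 0, 1, 0],
    "disease name A": [1, 0, 1, 0],
    "disease name B": [0, 1, 0, 1],
    "disease name C": [1, 1, 1, 1],
}
print(association_table(dataset, "objective"))
```

Whether a strongly negative coefficient should count as a high degree of association (for instance by taking the absolute value) is left open here, since the description does not say.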

Returning to the description of FIG. 11. The explanatory variable regeneration unit 111 extracts and regenerates explanatory variable candidates (S1108) based on the created data set, the objective variable created in step S1104, the explanatory variable candidates created in step S1105, the degrees of association calculated in step S1107, and the conceptual structure read in step S1102.

FIG. 13 is a flowchart showing an example of the details of the explanatory variable extraction and regeneration process performed by the explanatory variable regeneration unit 111, that is, of step S1108. The explanatory variable regeneration unit 111 reads the conceptual structure table 200, the data set storage table 700, and the degree-of-association table 900 into the memory 103 (S1301).

Next, the explanatory variable regeneration unit 111 determines whether explanatory variables to be extracted were specified in area 1402 when the analysis options were accepted in step S1103 (S1302). If it determines that such a specification exists (S1302: Yes), it sets the explanatory variables specified in area 1402 as extraction targets (S1303) and proceeds to step S1308.

If the explanatory variable regeneration unit 111 determines that no explanatory variable was specified for extraction (S1302: No), it determines whether any explanatory variable has already been extracted, that is, whether step S1308 has been performed at least once (S1304). Hereinafter, an explanatory variable candidate whose extraction flag and non-target flag in the explanatory variable candidate table 600 are both 0 is referred to as an extraction explanatory variable candidate.

If the explanatory variable regeneration unit 111 determines that no explanatory variable has been extracted yet (S1304: No), it refers to the explanatory variable candidate table 600 and the degree-of-association table 900, determines, for example, the extraction explanatory variable candidate with the highest degree of association with the objective variable calculated in step S1107 as the explanatory variable to be extracted (S1305), and proceeds to step S1308.

If several extraction explanatory variable candidates share the highest degree of association in step S1305, the explanatory variable regeneration unit 111 determines, for example, one candidate selected at random from among them as the extraction target. Alternatively, the display screen generation unit 105 may output these candidates to the input/output terminal 130 and let the user select the extraction target.

If the explanatory variable regeneration unit 111 determines that an explanatory variable has already been extracted (S1304: Yes), it calculates the degree of association between the already-extracted explanatory variable and each extraction explanatory variable candidate and stores it in field 903 (S1306). When several explanatory variables have already been extracted, it calculates, for example, the sum of the degrees of association between each already-extracted explanatory variable and the candidate as the degree of association between the already-extracted explanatory variables and that candidate. A statistic such as the mean or the median may be used instead of the sum.
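The aggregation in step S1306 might look like the following sketch, in which the pairwise degrees of association are assumed to be available in a lookup keyed by unordered variable pairs. The function name `association_with_extracted` and the `pairwise` dictionary are hypothetical; the sum, mean, and median options come from the description.

```python
from statistics import mean, median


def association_with_extracted(candidate, extracted, pairwise, aggregate=sum):
    """Combine a candidate's association with every already-extracted variable.

    pairwise:  {frozenset({a, b}): degree of association}  (hypothetical lookup)
    aggregate: sum (the default in this sketch), statistics.mean,
               or statistics.median
    """
    scores = [pairwise[frozenset((candidate, name))] for name in extracted]
    return aggregate(scores)


pairwise = {
    frozenset(("disease name C", "disease name A")): 0.2,
    frozenset(("disease name C", "disease name B")): 0.6,
}
extracted = ["disease name A", "disease name B"]
print(association_with_extracted("disease name C", extracted, pairwise))
print(association_with_extracted("disease name C", extracted, pairwise, mean))
print(association_with_extracted("disease name C", extracted, pairwise, median))
```

Note that a sum grows with the number of extracted variables while a mean or median does not, so the threshold used in the following step would presumably need to be chosen with the aggregate in mind.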

Next, the explanatory variable regeneration unit 111 determines, as the explanatory variable to be extracted, the extraction explanatory variable candidate that has the highest degree of association with the objective variable among those whose degree of association calculated in step S1306 is at or below a predetermined threshold (S1307). If no extraction explanatory variable candidate has a degree of association calculated in step S1306 at or below the threshold, the explanatory variable regeneration unit 111, for example, ends the process.

If, in step S1307, several extraction explanatory variable candidates whose degree of association calculated in step S1306 is at or below the threshold share the highest degree of association with the objective variable, the explanatory variable regeneration unit 111 determines, for example, one candidate selected at random from among them as the extraction target. Alternatively, the display screen generation unit 105 may output these candidates to the input/output terminal 130 and let the user select the extraction target.

Note that when the radio button corresponding to "None" in area 1403 is selected, the explanatory variable regeneration unit 111 does not calculate the degree of association in step S1306 and, in step S1307, determines the extraction explanatory variable candidate with the highest degree of association with the objective variable as the explanatory variable to be extracted.
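The selection logic of steps S1305 and S1307 can be illustrated with a small sketch. The function name `pick_next` and its parameters are assumptions; passing `threshold=None` stands in for the "None" option of area 1403, and ties are broken by taking the first tied candidate purely to keep the example deterministic, whereas the description suggests a random choice or a choice by the user.

```python
def pick_next(candidates, with_objective, with_extracted=None, threshold=None):
    """Choose the next explanatory variable to extract, or None if none qualifies.

    candidates:     names still eligible for extraction
    with_objective: {name: association with the objective variable}
    with_extracted: {name: association with already-extracted variables}
    threshold:      upper bound on with_extracted; None disables the filter
    """
    eligible = list(candidates)
    if with_extracted is not None and threshold is not None:
        # Keep only candidates that are not too similar to what was extracted.
        eligible = [c for c in eligible if with_extracted[c] <= threshold]
    if not eligible:
        return None  # Corresponds to ending the process in step S1307.
    best = max(with_objective[c] for c in eligible)
    tied = [c for c in eligible if with_objective[c] == best]
    return tied[0]  # Assumption: deterministic tie-break for illustration.


with_objective = {"A": 0.9, "B": 0.8, "C": 0.5}
with_extracted = {"A": 0.95, "B": 0.2, "C": 0.1}
print(pick_next(["A", "B", "C"], with_objective))
print(pick_next(["A", "B", "C"], with_objective, with_extracted, threshold=0.5))
```

In the second call, the hypothetical candidate "A" is skipped despite its high association with the objective variable because it is too strongly associated with the variables already extracted.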

Next, the explanatory variable regeneration unit 111 changes to "1" the extraction flag, in the explanatory variable candidate table 600, of the explanatory variable determined as the extraction target; that is, it removes that variable from the extraction explanatory variable candidates (S1308). At this point, it may remove the information on that variable from the degree-of-association table 900. It may also store, in the explanatory variable candidate table 600, information indicating the order in which the variable was extracted.

Next, the explanatory variable regeneration unit 111 executes the regeneration process for explanatory variable candidates (S1309). Specifically, it refers, for example, to the conceptual structure table 200, determines every explanatory variable candidate that is a superordinate concept of the explanatory variable to be extracted as a candidate to be regenerated, and changes the regeneration flag of each such candidate in the explanatory variable candidate table 600 to 1.

The explanatory variable regeneration unit 111 regenerates each explanatory variable candidate to be regenerated as an explanatory variable that does not include the explanatory variable to be extracted. Specifically, it changes the name of the candidate to be regenerated in field 901 of the degree-of-association table 900 to the name of the regenerated explanatory variable. It also stores the name of the explanatory variable to be extracted in the cell of field 604 of the explanatory variable candidate table 600 that corresponds to the candidate to be regenerated.

The explanatory variable regeneration unit 111 also adds a column for each regenerated explanatory variable candidate to field 702 of the data set storage table 700. Referring, for example, to the conceptual structure indicated by the conceptual structure table 200, it identifies the explanatory variable candidates that are subordinate concepts of the candidate to be regenerated and sibling nodes of the explanatory variable candidate determined as the extraction target. It then determines the values of the added column based, for example, on the values of the identified candidates stored in field 702.

Note that when the check box "For superordinate concepts of an extracted explanatory variable, analyze only the regenerated explanatory variables" in area 1404 is not enabled, the explanatory variable regeneration unit 111 does not change the regeneration flag or the excluded-variable value (field 604) that correspond to the explanatory variable candidate to be regenerated in the explanatory variable candidate table 600. In this case, it adds a record corresponding to the explanatory variable candidate to be regenerated to the explanatory variable candidate table 600 and stores initial values in fields 602 to 605 of that record.

The following describes an example of the regeneration process for the case where "disease name B" has been determined as the extraction target. The explanatory variable regeneration unit 111 refers to the conceptual structure table 200 and determines the explanatory variable candidates "disease name group α" and "disease name group 1", which are superordinate concepts of "disease name B", as the candidates to be regenerated. It regenerates them as "disease name group α (excluding disease name B)", obtained by removing "disease name B" from "disease name group α", and "disease name group 1 (excluding disease name B)", obtained by removing "disease name B" from "disease name group 1".

In field 901 of the degree-of-association table 900, the explanatory variable regeneration unit 111 changes "disease name group α" to "disease name group α (excluding disease name B)" and "disease name group 1" to "disease name group 1 (excluding disease name B)". It also stores "disease name B", the explanatory variable to be extracted, in each of the cells of field 604 of the explanatory variable candidate table 600 that correspond to "disease name group α" and "disease name group 1".

The explanatory variable regeneration unit 111 also adds the columns "disease name group α (excluding disease name B)" and "disease name group 1 (excluding disease name B)" to field 702 of the data set storage table 700. Referring to the conceptual structure table 200, it identifies "disease name A" and "disease name C", which are subordinate concepts of "disease name group α" and sibling nodes of "disease name B".

The explanatory variable regeneration unit 111 acquires each patient's values of "disease name A" and "disease name C" from the data set storage table 700. For each patient, it sets the value of "disease name group α (excluding disease name B)" to 0 if the patient's values of "disease name A" and "disease name C" are all 0, and to 1 otherwise. It determines each patient's value of "disease name group 1 (excluding disease name B)" in the same way.
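The column computation in this example amounts to a per-patient logical OR over the remaining member variables. The sketch below assumes a column-dictionary layout for the data set and a hypothetical function name `regenerate`; the naming pattern of the new column follows the example, with "alpha" written out in place of the Greek letter.

```python
def regenerate(dataset, group_name, members, extracted):
    """Add a column for `group_name` that ignores the extracted variable.

    dataset:   {column name: list of 0/1 values per patient} (hypothetical layout)
    members:   subordinate variables of the group, including `extracted`
    Returns the name of the added column.
    """
    remaining = [m for m in members if m != extracted]
    new_name = f"{group_name} (excluding {extracted})"
    n_patients = len(next(iter(dataset.values())))
    # 0 only when every remaining member is 0 for the patient, otherwise 1.
    dataset[new_name] = [
        1 if any(dataset[m][i] != 0 for m in remaining) else 0
        for i in range(n_patients)
    ]
    return new_name


dataset = {
    "disease name A": [0, 1, 0],
    "disease name B": [1, 0, 0],
    "disease name C": [0, 0, 1],
}
name = regenerate(
    dataset,
    "disease name group alpha",
    ["disease name A", "disease name B", "disease name C"],
    "disease name B",
)
print(name, dataset[name])
```

The first patient, who has only "disease name B", gets 0 in the regenerated column, which is the point of the regeneration: the group variable no longer duplicates the information carried by the extracted variable.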

Next, the explanatory variable regeneration unit 111 calculates the degree of association between the objective variable and each explanatory variable candidate regenerated in step S1309, and registers it in the cell of field 902 of the degree-of-association table 900 that corresponds to that regenerated candidate (S1310).

Next, the explanatory variable regeneration unit 111 refers to the conceptual structure table 200, identifies all explanatory variable candidates that are subordinate concepts of the explanatory variable to be extracted, and excludes the identified candidates from extraction (S1311). Specifically, for example, it changes the non-target flag of each identified candidate in the explanatory variable candidate table 600 to "1". If the check box "Exclude subordinate concepts of extracted explanatory variables" in area 1404 is not enabled, the explanatory variable regeneration unit 111 does not execute step S1311.
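Step S1311 can be pictured as a traversal of the concept hierarchy below the extracted variable. In this sketch the hierarchy is assumed to be a parent-to-children dictionary and the non-target flags a plain name-to-flag dictionary; both layouts and the function names are hypothetical stand-ins for the conceptual structure table 200 and the explanatory variable candidate table 600.

```python
def descendants(children_of, node):
    """Return every concept below `node` in the hierarchy (any depth)."""
    found = []
    stack = list(children_of.get(node, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children_of.get(current, []))
    return found


def mark_non_target(flags, children_of, extracted, enabled=True):
    """Set the non-target flag to 1 for all subordinate concepts of `extracted`.

    enabled=False models the check box in area 1404 being off, in which
    case nothing is changed.
    """
    if enabled:
        for name in descendants(children_of, extracted):
            if name in flags:
                flags[name] = 1
    return flags


children_of = {
    "group 1": ["group alpha"],
    "group alpha": ["A", "B"],
    "B": ["B1", "B2"],
}
flags = {name: 0 for name in ["group 1", "group alpha", "A", "B", "B1", "B2"]}
print(mark_non_target(flags, children_of, "group alpha"))
```

Only concepts below the extracted variable are flagged; the extracted variable itself and its superordinate concepts keep a non-target flag of 0.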

Next, the explanatory variable regeneration unit 111 determines whether the explanatory variable candidate table 600 contains an explanatory variable candidate whose non-target flag is "0" (S1312). If it determines that no such candidate exists (S1312: No), it ends the process of FIG. 13 and proceeds to step S1109. If it determines that such a candidate exists (S1312: Yes), it proceeds to step S1313.

The explanatory variable regeneration unit 111 determines whether the number of already-extracted explanatory variables is less than the value set in area 1406 (S1313). If it determines that the number is equal to or greater than the set value (S1313: No), it ends the process of FIG. 13 and proceeds to step S1109. If it determines that the number is less than the set value (S1313: Yes), it returns to step S1304.

Note that when several explanatory variables are specified as extraction targets in area 1402, the explanatory variable regeneration unit 111 executes, for example, step S1303 and steps S1308 to S1313 for each of them in turn.

Returning to the description of FIG. 11. The data set creation unit 109 re-creates the data set based on the explanatory variable candidate table 600 and registers the re-created data set in the data set storage table 700 (S1109). Specifically, for example, the data set creation unit 109 creates a data set storage table 700 whose field 702 consists of the information on the explanatory variables whose extraction flag in the explanatory variable candidate table 600 is 1.

Next, the display screen generation unit 105 generates a display screen that shows the analysis results, based, for example, on the information in the conceptual structure table 200 and the explanatory variable candidate table 600 (S1110). The details of the display screen are described later. The display screen generation unit 105 displays the screen generated in step S1110 on the input/output terminal 130 via the output unit 102 (S1111).

Note that the data set creation unit 109 may store, for example in the memory 103 or the auxiliary storage device 116, a variable set containing the objective variable and the extracted explanatory variables included in the data set re-created in step S1109, together with the degree of association between each extracted explanatory variable and the objective variable.

If, in step S1104 of a later run of the process of FIG. 11, an objective variable identical to the one in the stored variable set is specified, the explanatory variable regeneration unit 111 may, for example, include the explanatory variables whose degree of association in that variable set is at or above a predetermined threshold among the explanatory variables specified for extraction in step S1303.

That is, the explanatory variable regeneration unit 111 determines the explanatory variables whose degree of association in the variable set is at or above the predetermined threshold as extraction targets without calculating their degree of association anew. If those explanatory variables include a regenerated explanatory variable, the extraction flag of the original explanatory variable candidate corresponding to the regenerated explanatory variable is changed to 1 in step S1308.
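The reuse of a stored variable set could be sketched as a simple filter. The dictionary layout of `stored_set` and the function name `preselected_variables` are assumptions for illustration; the description does not specify how the variable set is represented.

```python
def preselected_variables(stored_set, objective_name, threshold):
    """Return stored explanatory variables to treat as pre-specified for extraction.

    stored_set: {"objective": name, "association": {variable: degree}} or None
                (hypothetical representation of the stored variable set)
    """
    if stored_set is None or stored_set["objective"] != objective_name:
        return []  # No reuse unless the same objective variable is specified.
    return [
        name
        for name, degree in stored_set["association"].items()
        if degree >= threshold
    ]


stored_set = {
    "objective": "event XX within 30 days",
    "association": {"disease name B": 0.62, "drug Z": 0.18},
}
print(preselected_variables(stored_set, "event XX within 30 days", 0.5))
```

Variables returned by such a filter would skip the association calculation in the new run, which is where the speed-up described in the text comes from.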

As a result, when extracting explanatory variables again for the same objective variable, the explanatory variable regeneration unit 111 can extract explanatory variables with high explanatory power for that objective variable and can execute the process at high speed.
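The reuse of a stored variable set can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the dictionary layout, the function name `preselect_from_variable_set`, and the sample values are all assumptions introduced here.

```python
def preselect_from_variable_set(variable_set, objective, threshold):
    """Return stored explanatory variables that can be extracted without
    recomputing their degree of association.

    variable_set (hypothetical layout):
        {"objective": str, "relevance": {variable name: degree of association}}
    """
    # The stored set is only reused when the same objective variable is designated.
    if variable_set["objective"] != objective:
        return []
    # Keep variables whose stored degree of association meets the threshold.
    return [name for name, degree in variable_set["relevance"].items()
            if degree >= threshold]


# Hypothetical stored result of an earlier run.
stored = {
    "objective": "readmission",
    "relevance": {"disease name B": 0.8, "therapeutic drug for disease name Z": 0.4},
}
reused = preselect_from_variable_set(stored, "readmission", 0.5)
```

Variables returned by this step skip the association calculation, which is where the speed-up described above would come from.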

FIG. 15 shows an example of the analysis result display screen displayed on the input/output terminal 130 in step S1111. The display screen 1500 includes, for example, an area 1501 that displays a list of the extracted explanatory variables and an area 1502 that displays a diagram of the relationship between the extracted explanatory variables and the conceptual structure.

The area 1501 displays, for example, the extracted explanatory variables (that is, the explanatory variables whose extraction flag is 1), the order in which each was extracted, and the degree of association between each explanatory variable and the objective variable. In the example of FIG. 15, the area 1501 displays, in order of extraction, "disease name B", "therapeutic drug for disease name Z", and "disease name group α (excluding disease name B)" as the extracted explanatory variables.

The area 1502 displays, for example, a conceptual structure that includes the extracted explanatory variables, the explanatory variables regenerated as a result of those extractions, and the explanatory variables that were excluded. In the area 1502, a dotted line connects the pre-regeneration explanatory variable corresponding to a regenerated explanatory variable with an explanatory variable that belongs to the same conceptual structure and to an adjacent layer. A solid line connects explanatory variables that belong to the same conceptual structure and to adjacent layers.

By referring to the display screen 1500, the user can easily grasp where in the conceptual structure each displayed extracted explanatory variable is located, which explanatory variables were excluded, and which explanatory variables were regenerated. This in turn lets the user easily verify whether the extraction and regeneration of explanatory variables were appropriate.

As described above, the variable optimization system 100 can extract explanatory variables with high explanatory power for the objective variable by determining explanatory variables that have a high degree of association with the objective variable as extraction targets. In addition, the variable optimization system 100 regenerates each explanatory variable candidate that is a superordinate concept of an extraction target as an explanatory variable candidate that does not include the extracted explanatory variable. Through this processing, the variable optimization system 100 can automatically generate a new concept and use it as an explanatory variable candidate.

Because a regenerated explanatory variable candidate is a concept that does not include the extracted explanatory variable, overlap in meaning among the extracted explanatory variables is suppressed even when the variable optimization system 100 selects the regenerated candidate as an extraction target. Moreover, a regenerated explanatory variable candidate, such as a disease name group that excludes a specific disease name, is a variable whose medical meaning is easy for the user to grasp.
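The regeneration of a superordinate concept can be illustrated with a short sketch. The claims state only that the measurement value of the regenerated candidate is calculated from the measurement values of the sibling nodes of the selected candidate; the logical OR over binary per-record values used below, the data layout, and the name `regenerate_parent` are assumptions made for illustration.

```python
def regenerate_parent(children, measurements, parent, selected):
    """Rebuild `parent` as a candidate that excludes `selected`.

    children:     {concept: [child concepts]}            (hypothetical layout)
    measurements: {concept: [0/1 value per record]}      (hypothetical layout)
    Returns the new candidate name and its per-record values.
    """
    # Sibling nodes of the selected candidate under the same parent.
    siblings = [c for c in children[parent] if c != selected]
    n_records = len(measurements[selected])
    # Assumed aggregation: a record has the regenerated concept
    # if any remaining sibling is present in that record.
    values = [int(any(measurements[s][i] for s in siblings))
              for i in range(n_records)]
    return f"{parent} (excluding {selected})", values


children = {"disease name group α": ["disease name A", "disease name B", "disease name C"]}
measurements = {
    "disease name A": [1, 0, 0, 0],
    "disease name B": [0, 1, 0, 1],
    "disease name C": [0, 0, 1, 0],
}
name, values = regenerate_parent(children, measurements,
                                 "disease name group α", "disease name B")
```

In this toy data the regenerated candidate no longer fires on the records that contain only disease name B, so it carries information that the already extracted variable does not.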

In addition, by removing the subordinate concepts of an extracted explanatory variable from the extraction targets, the variable optimization system 100 not only makes the extraction process more efficient but can also exclude explanatory variables that carry the same meaning. Consequently, the variable optimization system 100 can create a data set from which different explanatory variables having the same meaning have been excluded, that is, a data set whose medical meaning is easy to grasp.
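Excluding the subordinate concepts of an extracted explanatory variable amounts to removing its descendants in the tree from the candidate group. The following sketch assumes the conceptual structure is held as a parent-to-children dictionary; the function names and the sample tree are hypothetical.

```python
def descendants(children, node):
    """Collect every subordinate concept (descendant) of `node`."""
    found = []
    stack = list(children.get(node, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return found


def drop_descendants(candidates, children, selected):
    """Remove the subordinate concepts of `selected` from the candidate group."""
    excluded = set(descendants(children, selected))
    return [c for c in candidates if c not in excluded]


children = {
    "disease name group α": ["disease name A", "disease name B"],
    "disease name B": ["disease name B1", "disease name B2"],
}
candidates = ["disease name group α", "disease name A",
              "disease name B1", "disease name B2"]
remaining = drop_descendants(candidates, children, "disease name B")
```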

In addition, by excluding explanatory variables that have a high degree of association with already extracted explanatory variables, the variable optimization system 100 can extract explanatory variables that are less likely to cause multicollinearity, and can thereby generate a data set suited to predicting and verifying the target event. In other words, the variable optimization system 100 can select explanatory variables that improve the prediction accuracy and verification accuracy for the target event.
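One selection step that combines both degrees of association might look like the sketch below. The absolute Pearson correlation is used here as a stand-in for both degrees of association, and the second degree is checked pairwise against each already extracted variable; both choices, along with the names `pearson` and `select_next`, are assumptions and not the measure prescribed by the embodiment.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_next(candidates, selected, data, objective, max_overlap):
    """Pick the candidate most associated with the objective variable,
    skipping candidates too strongly associated with extracted variables."""
    # Second degree of association: candidate vs. already extracted variables.
    eligible = [c for c in candidates
                if all(abs(pearson(data[c], data[s])) <= max_overlap
                       for s in selected)]
    if not eligible:
        return None
    # First degree of association: candidate vs. objective variable.
    return max(eligible, key=lambda c: abs(pearson(data[c], data[objective])))


data = {
    "y":      [1, 1, 0, 0, 1, 0],
    "a":      [1, 1, 0, 0, 1, 0],
    "a_copy": [1, 1, 0, 0, 1, 0],
    "b":      [1, 0, 0, 0, 1, 1],
}
chosen = select_next(["a_copy", "b"], ["a"], data, "y", 0.9)
```

Here `a_copy` duplicates the already extracted `a`, so it is filtered out before the ranking by association with `y` takes place.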

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above are described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. In addition, for part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as integrated circuits. The configurations, functions, and the like described above may also be implemented in software, with a processor interpreting and executing a program that realizes each function. Information such as the programs, tables, and files that realize each function can be stored in a storage device such as a memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.

The drawings show the control lines and information lines considered necessary for explaining the embodiments, and do not necessarily show all of the control lines and information lines included in an actual product to which the present invention is applied. In practice, almost all of the components may be considered to be interconnected.

Reference Signs List: 100 variable optimization system, 101 control unit, 102 output unit, 103 memory, 104 communication unit, 105 display screen generation unit, 106 data extraction unit, 107 accumulated data acquisition unit, 108 explanatory variable generation unit, 109 data set creation unit, 110 degree-of-association calculation unit, 111 explanatory variable regeneration unit, 112 conceptual structure editing unit, 113 integrated database, 114 conceptual structure database, 115 analysis database, 116 auxiliary storage device, 119 accumulated data acquisition unit

Claims

A system for determining an explanatory variable group corresponding to an objective variable, the system
comprising a processor and a storage device, wherein
the storage device holds:
conceptual structure information indicating one or more tree structures each having one or more layers, and a correspondence indicating which node of the one or more tree structures each of a plurality of explanatory variable candidates corresponds to; and
measurement value information indicating measurement values of the plurality of explanatory variable candidates and measurement values of the objective variable,
each node of the one or more tree structures is a concept,
in the one or more tree structures, a parent node is a superordinate concept of its child nodes,
the processor repeats a determination process of determining, from an explanatory variable candidate group, an explanatory variable to be included in the explanatory variable group,
the explanatory variable candidate group in the first determination process is the plurality of explanatory variable candidates, and
in the determination process, the processor:
calculates, based on the measurement value information, a first degree of association between each explanatory variable candidate of the explanatory variable candidate group and the objective variable;
selects, from the explanatory variable candidate group and based on each first degree of association, an explanatory variable candidate to be included in the explanatory variable group;
includes the selected explanatory variable candidate in the explanatory variable group;
excludes the selected explanatory variable candidate from the explanatory variable candidate group;
identifies, from the explanatory variable candidate group and with reference to the conceptual structure information, explanatory variable candidates that are superordinate concepts of the selected explanatory variable candidate;
performs a change process of changing each identified explanatory variable candidate into an explanatory variable candidate whose subordinate concepts, as indicated by the conceptual structure information, exclude the selected explanatory variable candidate;
acquires, from the measurement value information, measurement values of the sibling nodes of the selected explanatory variable candidate indicated by the conceptual structure information;
calculates, based on the acquired measurement values, a measurement value of each explanatory variable candidate after the change in the change process; and
includes the calculated measurement values in the measurement value information.

The system according to claim 1, wherein
in the second and subsequent determination processes, the processor:
calculates, based on the measurement value information, a second degree of association between the explanatory variable group and each explanatory variable candidate of the explanatory variable candidate group; and
selects, based on each first degree of association, the explanatory variable candidate to be included in the explanatory variable group from among the explanatory variable candidates of the explanatory variable candidate group whose second degree of association is equal to or less than a threshold.

The system according to claim 1, wherein
in the determination process, the processor:
identifies, from the explanatory variable candidate group and with reference to the conceptual structure information, explanatory variable candidates that are subordinate concepts of the selected explanatory variable candidate; and
excludes the identified subordinate-concept explanatory variable candidates from the explanatory variable candidate group.

The system according to claim 1, wherein
a first explanatory variable candidate included in the plurality of explanatory variable candidates belongs to a plurality of tree structures.

The system according to claim 1, wherein
The storage device holds knowledge information indicating a first concept and a second concept which is a superordinate concept of the first concept,
the processor:
when none of the tree structures included in the conceptual structure information includes the first concept and the second concept, includes in the conceptual structure information a tree structure in which the second concept is the parent node of the first concept; and
when a first tree structure included in the conceptual structure information includes the first concept and a third concept that is a superordinate concept of the first concept, includes in the conceptual structure information a tree structure in which one of the second concept and the third concept is the parent node of the other, and the other is the parent node of the first concept.

The system according to claim 1, wherein
the processor:
stores, in the storage device, variable information indicating the explanatory variable group determined by repeating the determination process and the first degree of association between each explanatory variable of the determined explanatory variable group and the objective variable; and
performs a redetermination process of redetermining the explanatory variable group corresponding to the objective variable,
the redetermination process consists of repeating the determination process, and
in the first determination process of the redetermination process, the processor:
selects, based on each first degree of association indicated by the variable information, an explanatory variable indicated by the variable information as an explanatory variable candidate to be included in the explanatory variable group;
identifies, with reference to the conceptual structure information, the explanatory variable candidate from which the selected explanatory variable was changed by the change process; and
excludes the identified pre-change explanatory variable candidate from the explanatory variable candidate group.

The system according to claim 1, wherein
The processor, in the determination process, selects an explanatory variable candidate of the explanatory variable candidate group having the largest first degree of association as an explanatory variable candidate to be included in the explanatory variable group.

The system according to claim 1, wherein
the system is connected to a display device, and
the processor
outputs the explanatory variable group and the explanatory variable candidates after the change in the change process to the display device.

A method by which a system determines an explanatory variable group corresponding to an objective variable, wherein
the system holds:
conceptual structure information indicating one or more tree structures each having one or more layers, and a correspondence indicating which node of the one or more tree structures each of a plurality of explanatory variable candidates corresponds to; and
measurement value information indicating measurement values of the plurality of explanatory variable candidates and measurement values of the objective variable,
each node of the one or more tree structures is a concept,
in the one or more tree structures, a parent node is a superordinate concept of its child nodes,
in the method, the system repeats a determination process of determining, from an explanatory variable candidate group, an explanatory variable to be included in the explanatory variable group,
the explanatory variable candidate group in the first determination process is the plurality of explanatory variable candidates, and
in the determination process, the system:
calculates, based on the measurement value information, a first degree of association between each explanatory variable candidate of the explanatory variable candidate group and the objective variable;
selects, from the explanatory variable candidate group and based on each first degree of association, an explanatory variable candidate to be included in the explanatory variable group;
includes the selected explanatory variable candidate in the explanatory variable group;
excludes the selected explanatory variable candidate from the explanatory variable candidate group;
identifies, from the explanatory variable candidate group and with reference to the conceptual structure information, explanatory variable candidates that are superordinate concepts of the selected explanatory variable candidate;
performs a change process of changing each identified explanatory variable candidate into an explanatory variable candidate whose subordinate concepts, as indicated by the conceptual structure information, exclude the selected explanatory variable candidate;
acquires, from the measurement value information, measurement values of the sibling nodes of the selected explanatory variable candidate indicated by the conceptual structure information;
calculates, based on the acquired measurement values, a measurement value of each explanatory variable candidate after the change in the change process; and
includes the calculated measurement values in the measurement value information.

The method according to claim 9, wherein
in the second and subsequent determination processes, the system:
calculates, based on the measurement value information, a second degree of association between the explanatory variable group and each explanatory variable candidate of the explanatory variable candidate group; and
selects, based on each first degree of association, the explanatory variable candidate to be included in the explanatory variable group from among the explanatory variable candidates of the explanatory variable candidate group whose second degree of association is equal to or less than a threshold.

The method according to claim 9, wherein
in the determination process, the system:
identifies, from the explanatory variable candidate group and with reference to the conceptual structure information, explanatory variable candidates that are subordinate concepts of the selected explanatory variable candidate; and
excludes the identified subordinate-concept explanatory variable candidates from the explanatory variable candidate group.

The method according to claim 9, wherein
The first explanatory variable candidate included in the plurality of explanatory variable candidates belongs to a plurality of tree structures.

The method according to claim 9, wherein
The system holds knowledge information indicating a first concept and a second concept which is a superordinate concept of the first concept,
in the method, the system:
when none of the tree structures included in the conceptual structure information includes the first concept and the second concept, includes in the conceptual structure information a tree structure in which the second concept is the parent node of the first concept; and
when a first tree structure included in the conceptual structure information includes the first concept and a third concept that is a superordinate concept of the first concept, includes in the conceptual structure information a tree structure in which one of the second concept and the third concept is the parent node of the other, and the other is the parent node of the first concept.

The method according to claim 9, wherein
the system:
stores, in the system, variable information indicating the explanatory variable group determined by repeating the determination process and the first degree of association between each explanatory variable of the determined explanatory variable group and the objective variable; and
performs a redetermination process of redetermining the explanatory variable group corresponding to the objective variable,
the redetermination process consists of repeating the determination process, and
in the first determination process of the redetermination process, the system:
selects, based on each first degree of association indicated by the variable information, an explanatory variable indicated by the variable information as an explanatory variable candidate to be included in the explanatory variable group;
identifies, with reference to the conceptual structure information, the explanatory variable candidate from which the selected explanatory variable was changed by the change process; and
excludes the identified pre-change explanatory variable candidate from the explanatory variable candidate group.

The method according to claim 9, wherein
in the determination process, the system selects the explanatory variable candidate of the explanatory variable candidate group having the largest first degree of association as the explanatory variable candidate to be included in the explanatory variable group.