JP7656499B2

JP7656499B2 - Learning model generation method, program, information processing method, information processing device, and learning model data generation method

Info

Publication number: JP7656499B2
Application number: JP2021104303A
Authority: JP
Inventors: 英俊山下
Original assignee: HU Group Research Institute GK
Current assignee: HU Group Research Institute GK
Priority date: 2021-06-23
Filing date: 2021-06-23
Publication date: 2025-04-03
Anticipated expiration: 2041-06-23
Also published as: JP2023003235A

Description

The present invention relates to a method for generating a learning model, a program, an information processing method, an information processing device, and a method for generating learning model data.

Testing facilities such as hospital laboratories and testing centers constantly carry out large numbers of clinical tests, and clinicians use the test results to make diagnostic and treatment decisions regarding the patient's condition. Before a treatment plan is decided on the basis of the test results, the validity of the results is verified, and any value determined to be abnormal is retested in order to increase the reliability of the testing.

A delta verification method has been disclosed that increases the reliability of tests by setting, for each test item, a reference value for the difference between the previous value and the current value, and extracting abnormal values through data processing that compares each patient's specimen values with the previous values (see, for example, Patent Documents 1 and 2).
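As a rough illustration of the delta verification idea described above, the following Python sketch flags a test item when the absolute difference between the current and previous value exceeds a per-item limit. The item names, the limits, and the function name are hypothetical and are not taken from the cited patent documents.

```python
from typing import Dict, List

# Hypothetical per-item limits on |current - previous|; the values are illustrative only.
DELTA_LIMITS: Dict[str, float] = {"ALB": 1.0, "K": 1.5}


def delta_check(previous: Dict[str, float], current: Dict[str, float],
                limits: Dict[str, float] = DELTA_LIMITS) -> List[str]:
    """Return the test items whose change from the previous value exceeds the limit."""
    flagged = []
    for item, limit in limits.items():
        # Only items measured in both the previous and current test can be compared.
        if item in previous and item in current:
            if abs(current[item] - previous[item]) > limit:
                flagged.append(item)
    return flagged


# Potassium changed by 2.1, which exceeds the assumed limit of 1.5.
flagged_items = delta_check({"ALB": 4.2, "K": 4.0}, {"ALB": 4.0, "K": 6.1})
```

Each item is judged in isolation here, which is the single-item limitation that the description goes on to discuss.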

Patent Document 1: Japanese Unexamined Patent Application Publication No. H05-151282
Patent Document 2: Japanese Unexamined Patent Application Publication No. H11-296605

However, in the technologies described in Patent Documents 1 and 2, whether or not a result is abnormal is determined from the difference in test values of a single test item. Because fluctuations in other related test values, the test intervals, and the chronological order of the tests are not taken into consideration, there is a concern that detection accuracy may be reduced.

An object of the present disclosure is to provide a method for generating a learning model, and the like, that can improve the accuracy of detecting abnormalities in test values.

A method for generating a learning model according to one aspect of the present disclosure causes a computer to execute a process of: acquiring a plurality of specimen data in a clinical test; generating, based on the acquired plurality of specimen data, a first specimen data group that includes abnormal specimen data; and generating, based on training data that includes the generated first specimen data group, a second specimen data group that does not include the abnormal specimen data, and classification information regarding the normality or abnormality of each specimen data group, a learning model that outputs the classification information when a specimen data group is input.

A program according to one aspect of the present disclosure causes a computer to execute a process of: acquiring a specimen data group consisting of a plurality of specimen data arranged in chronological order; inputting the acquired specimen data group into a learning model that has been trained to output, when such a specimen data group is input, classification information regarding the normality or abnormality of the chronologically last specimen data in the group; and outputting the classification information.

A method for generating learning model data according to one aspect of the present disclosure causes a computer to execute a process of: acquiring a plurality of specimen data in a clinical test; generating, based on the acquired plurality of specimen data, a first specimen data group that includes abnormal specimen data; and storing in a storage unit, as training data for a learning model, the generated first specimen data group in association with a label indicating abnormality and a second specimen data group that does not include the abnormal specimen data in association with a label indicating normality.

According to the present disclosure, the accuracy of detecting abnormalities in test values can be improved.

FIG. 1 is a block diagram showing a configuration example of an information processing system.
FIG. 2 is a diagram showing an example of the contents of a specimen DB.
FIG. 3 is a diagram showing an example of the contents of a training DB.
FIG. 4 is an explanatory diagram showing an overview of a first learning model.
FIG. 5 is a flowchart showing an example of a procedure for generating training data.
FIG. 6 is a flowchart showing an example of a procedure for generating the first learning model.
FIG. 7 is a flowchart showing an example of a procedure for detecting an abnormality.
FIG. 8 is a schematic diagram showing an example of a screen displayed on a display device.
FIG. 9 is a schematic diagram showing an example of a screen displayed on a display device.
FIG. 10 is a schematic diagram showing an example of a screen displayed on a display device.
FIG. 11 is a flowchart showing an example of a procedure for generating the first learning model in a second embodiment.
FIG. 12 is an explanatory diagram showing an overview of a second learning model in a third embodiment.
FIG. 13 is a flowchart showing an example of a learning procedure for the first learning model in the third embodiment.
FIG. 14 is an explanatory diagram showing an overview of a third learning model in a fourth embodiment.
FIG. 15 is a flowchart showing an example of a procedure for acquiring disease information in the fourth embodiment.

The present disclosure will be described in detail with reference to the drawings showing its embodiments.

(First Embodiment)
FIG. 1 is a block diagram showing a configuration example of an information processing system. The information processing system includes an information processing device 1 and a testing facility device 2. The information processing device 1 and the testing facility device 2 are communicatively connected via a network N such as a LAN (Local Area Network) or the Internet.

The testing facility device 2 is installed in a testing facility such as a hospital laboratory or a testing center. The testing facility device 2 is, for example, a server computer or a personal computer, and is connected to a display device 3, an input device 4, a testing device 5, and the like. The display device 3 is, for example, a liquid crystal display or an organic EL (Electro Luminescence) display. The input device 4 is an operation unit that accepts user operations, and is, for example, a keyboard or a pointing device such as a touch panel. The testing device 5 performs various testing (analysis) processes on a specimen and generates clinical test results (specimen data). The specimen data obtained by the testing device 5 is transmitted to the information processing device 1 via the testing facility device 2.

The information processing device 1 is, for example, a server computer or a personal computer, and is capable of various kinds of information processing and of sending and receiving information. The information processing device 1 acquires, via the testing facility device 2, the specimen data obtained by the testing device 5. The information processing device 1 acquires classification information regarding the normality or abnormality of the acquired specimen data, and thus functions as an abnormality detection device for specimen data. The information processing device 1 displays the generated classification information on the display device 3. The information processing device 1 may be a local server installed in the same facility as the testing facility device 2, or a cloud server communicatively connected to the testing facility device 2 via the Internet or the like.

Classification information is information that indicates whether specimen data obtained by a clinical test is normal or abnormal (the presence or absence of an abnormality). The specimen data may include multiple test values corresponding to multiple types of test items. Classification information includes normal and abnormal, and abnormalities are further classified by cause into multiple types, such as specimen mix-up and dilution error. Here, normal means that the test values of the specimen data are not abnormal values caused by the testing itself, that is, the clinical test result is appropriate. Abnormal means that the test values of the specimen data are abnormal values caused by the testing process (a testing error), that is, the clinical test result is not appropriate.

Based on the classification information displayed on the display device 3, the person in charge at the testing facility can determine whether the specimen data contains an abnormality and what caused it. If the specimen data is normal, it is reported to the clinical side as correct specimen data. If the specimen data is abnormal, a retest is performed in a manner appropriate to the cause of the abnormality. Appropriate testing can thereby be carried out efficiently.

Test results in clinical testing vary under the influence of various factors, including not only the physiological state of the patient but also the test conditions mentioned above, human error during testing, and abnormalities in test reagents or testing equipment. In the data handled routinely in clinical testing, test values from the same patient are generally closely related over time. Furthermore, the range that test values related to the human body can take is limited, although to varying degrees, and the test values are considered to be highly related to one another. Differences between successive values of the same test, and values across multiple test items, are therefore considered to be correlated. Conventionally, the validity of test results has been judged using statistical methods such as the delta verification method.

However, it is difficult to distinguish by statistical inference alone whether an abnormal test value originates from the patient's condition, as with a panic value, or from the testing process, as with a testing error. For this reason, when a test value is determined to be abnormal, a retest is conducted to determine whether there was an abnormality in the testing process. Retesting frequently reduces the chance of overlooking abnormalities in the testing process, but it consumes additional cost and time. The present information processing system achieves efficient testing by using a first learning model 113, described later, to provide classification information indicating the cause of the abnormality.

The information processing device 1 includes a control unit 10, a storage unit 11, and a communication unit 12. The information processing device 1 may be a multi-computer system consisting of multiple computers, or a virtual machine constructed virtually by software.

The control unit 10 is a processor that uses one or more CPUs (Central Processing Units), GPUs (Graphics Processing Units), or the like. The control unit 10 uses built-in memory such as ROM (Read Only Memory) or RAM (Random Access Memory) to control each component and execute processing.

The storage unit 11 is a non-volatile storage device such as a hard disk or an SSD (Solid State Drive). The storage unit 11 stores programs and data referenced by the control unit 10, including a program (program product) 1P. By reading and executing the program 1P, the control unit 10 causes a general-purpose server computer to function as the information processing device specific to the present disclosure. The storage unit 11 may be composed of multiple storage devices, or may be an external storage device connected to the information processing device 1.

The program 1P stored in the storage unit 11 may be provided recorded in a computer-readable manner on a recording medium 1A. In that case, the storage unit 11 stores the program 1P read from the recording medium 1A by a reading device (not shown). Alternatively, the program 1P may be downloaded from an external computer (not shown) connected to a communication network (not shown) and stored in the storage unit 11. The program 1P can be deployed to run on a single computer, or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The storage unit 11 also stores a specimen DB (database) 111, a training DB 112, and a plurality of first learning models 113. The specimen DB 111 is a database that stores specimen data relating to specimens in clinical tests. The training DB 112 is a database that stores the training data (learning data) used to train the first learning model 113. The first learning model 113 is a machine learning model that has been trained on the training data, and is expected to be used as a program module constituting artificial intelligence software. The storage unit 11 may further store a second learning model 114 and a third learning model 115, which are described in detail in other embodiments.

The communication unit 12 is a communication module that performs communication-related processing. The control unit 10 can send information to and receive information from the testing facility device 2 via the communication unit 12.

In the present embodiment, the information processing device 1 is not limited to the above configuration and may include, for example, a display unit that displays images and an input unit that accepts operation input.

The testing facility device 2 includes a control unit 20, a storage unit 21, a communication unit 22, and an input/output unit 23.
The control unit 20 is a processor that uses one or more CPUs, GPUs, or the like. The control unit 20 uses built-in memory such as ROM or RAM to control each component and execute processing.

The storage unit 21 is a non-volatile storage device such as a hard disk or an SSD. The storage unit 21 stores programs and data referenced by the control unit 20. The communication unit 22 is a communication module that performs communication-related processing. The control unit 20 can send information to and receive information from the information processing device 1 via the communication unit 22.

The input/output unit 23 is an input/output I/F (interface) for connecting external devices. The display device 3, the input device 4, the testing device 5, and the like described above are connected to the input/output unit 23. Via the input/output unit 23, the control unit 20 causes the display device 3 to display classification information regarding the normality or abnormality of specimen data, accepts information entered on the input device 4, and accepts specimen data output from the testing device 5.

FIG. 2 is a diagram showing an example of the contents of the specimen DB 111. The specimen DB 111 stores, for example, a specimen ID, a testing facility name, patient attribute information, a test date and time, and test values for each of a plurality of specimen data, in association with one another. The specimen ID is identification information for identifying a specimen. The testing facility name is information identifying the testing facility that tested the specimen. The patient attribute information is attribute information on the patient (subject) from whom the specimen was taken, and includes, for example, a patient ID, a patient name, an age, and a sex. Each patient can be uniquely identified by, for example, the patient ID and patient name. The test date and time is the date and time at which the specimen was tested, or, for a blood test, the date and time at which the blood was drawn. The test values are the test values for the specimen, and the test value column stores each test value in association with the corresponding test item. The specimen data stored in the specimen DB 111 may further include information such as the department in charge (the type of hospital clinical department), a testing equipment ID, the lot number of the test reagent, the date of receipt of the specimen data, the test method, and the name of the person who performed the test.
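One way to picture a row of the specimen DB is as a simple record type. The sketch below is only an illustration under assumed field names and types; the description does not specify a schema, and the optional fields listed at the end of the paragraph are omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class SpecimenRecord:
    """Hypothetical in-memory form of one specimen DB row (field names are assumptions)."""
    specimen_id: str
    facility_name: str
    patient_id: str
    patient_name: str
    age: int
    sex: str
    tested_at: datetime  # test date and time, or blood collection time for a blood test
    test_values: Dict[str, float] = field(default_factory=dict)  # test item -> value


record = SpecimenRecord(
    specimen_id="S0001", facility_name="Facility A",
    patient_id="P0001", patient_name="Patient A", age=54, sex="M",
    tested_at=datetime(2021, 6, 23, 9, 0),
    test_values={"ALB": 4.2, "K": 4.0},
)
```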

Each time the information processing device 1 receives specimen data from the testing facility device 2, it stores the received specimen data in the specimen DB 111. The specimen DB 111 may store specimen data in chronological order for each patient. The contents stored in the specimen DB 111 are not limited to the example shown in FIG. 2.

FIG. 3 is a diagram showing an example of the contents of the training DB 112. The training DB 112 stores the training data used to train the first learning model 113 described later. The training DB 112 stores, for example, specimen data group information and classification information for a specimen data group in association with a training data ID that identifies the training data.

The specimen data group information column contains all or some of the items of the specimen data stored in the specimen DB 111, for example the test date and time of each specimen data and the test values for multiple test items. The specimen data group information contains the information of the multiple specimen data that make up one specimen data group. A specimen data group is composed of specimen data from the same patient in chronological order. Specifically, a specimen data group consists of a predetermined number of chronologically adjacent specimen data among the specimen data acquired for the same patient. FIG. 3 shows an example in which one specimen data group is composed of four specimen data, but the number of specimen data that make up a specimen data group is not limited to four.
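The grouping described above can be sketched as follows. This is a minimal illustration that assumes overlapping sliding windows and a simple tuple layout for each record; the description states only that a group consists of a predetermined number of chronologically adjacent specimen data from the same patient, so both choices are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A record is (patient_id, ISO 8601 test date-time, test values); this layout is an assumption.
Record = Tuple[str, str, Dict[str, float]]


def build_groups(records: List[Record], size: int = 4) -> List[List[Record]]:
    """Form windows of `size` chronologically adjacent records for each patient."""
    by_patient: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        by_patient[rec[0]].append(rec)

    groups: List[List[Record]] = []
    for recs in by_patient.values():
        recs.sort(key=lambda r: r[1])  # same-format ISO 8601 strings sort chronologically
        # Patients with fewer than `size` records yield no group.
        for start in range(len(recs) - size + 1):
            groups.append(recs[start:start + size])
    return groups


example_records: List[Record] = [
    ("P1", f"2021-06-0{day}T09:00", {"ALB": 4.0 + day / 10}) for day in range(1, 6)
] + [("P2", "2021-06-01T09:00", {"ALB": 3.9})]
example_groups = build_groups(example_records)
```

With five records for one patient and a window of four, two overlapping groups are produced; the patient with a single record contributes none.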

As will be described in more detail later, specimen data groups are divided into those that do not contain abnormal specimen data resulting from a testing error (abnormal specimen data) and those that do. In the example shown in FIG. 3, training data ID 001-1 and training data ID 001-2 are specimen data groups that do not contain abnormal specimen data, and training data ID 001-3 is a specimen data group that contains abnormal specimen data. Training data ID 001-3 is obtained by replacing all the test values in the fourth-row specimen data of training data ID 001-2 with abnormal values.

The classification information column stores classification information regarding whether the specimen data group is normal or abnormal. The classification information includes, for example, normal and abnormal. Abnormalities resulting from testing errors are classified by cause into multiple types, such as mix-up, dilution error, unit misidentification, reagent error, and specimen clogging in the testing equipment.

A mix-up is an abnormality in test values caused by specimens being swapped between different patients tested on the same reception date (that is, all the test values of the specimen data are swapped). A mix-up may also include the swapping of the test value of a specific test item between different patients tested on the same reception date. A dilution error is an abnormality caused by an incorrect dilution ratio or dilution volume when a specimen is diluted with saline, buffer solution, reagent, or the like. A unit misidentification is an abnormality caused by mistaking the unit of a test value. A reagent error is an abnormality caused by using the wrong reagent, or too little or too much reagent, in a reagent-based test. Specimen clogging is an abnormality caused by a specimen clogging a tube or similar part inside the testing equipment. The classification information column may store normal or abnormal (the presence or absence of the abnormality) for each type of abnormality. The contents stored in the training DB 112 are not limited to the example shown in FIG. 3.
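To make the relationship between a normal group and its abnormal counterpart concrete, the sketch below derives an abnormal specimen data group from a normal one by altering only the chronologically last specimen data. The mix-up case follows the description (all test values replaced with another patient's). Modelling a dilution error as a uniform scaling factor is purely an assumption made for illustration, and all function names are hypothetical.

```python
import copy
from typing import Dict, List, Tuple

Sample = Dict[str, float]   # test item -> value
Group = List[Sample]        # one patient's specimen data in chronological order


def make_mixup(group: Group, other_patient_sample: Sample) -> Group:
    """Replace every test value of the last sample with another patient's values."""
    abnormal = copy.deepcopy(group)  # leave the normal group untouched
    abnormal[-1] = dict(other_patient_sample)
    return abnormal


def make_dilution_error(group: Group, factor: float) -> Group:
    """Scale every test value of the last sample by a wrong dilution factor (assumed model)."""
    abnormal = copy.deepcopy(group)
    abnormal[-1] = {item: value * factor for item, value in abnormal[-1].items()}
    return abnormal


normal_group: Group = [{"ALB": 4.0}, {"ALB": 4.1}, {"ALB": 4.0}, {"ALB": 4.2}]
training_rows: List[Tuple[Group, str]] = [
    (normal_group, "normal"),
    (make_mixup(normal_group, {"ALB": 2.6}), "mix-up"),
    (make_dilution_error(normal_group, 0.5), "dilution error"),
]
```

Pairing each group with its label in this way mirrors the association of specimen data groups and classification information stored in the training DB.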

FIG. 4 is an explanatory diagram showing an overview of the first learning model 113. The first learning model 113 is a machine learning model that takes as input a specimen data group consisting of multiple specimen data in chronological order, and outputs classification information regarding the normality or abnormality of the chronologically last (that is, the most recent) specimen data in the group. Specifically, the first learning model 113 takes a specimen data group as input and outputs a classification result of normal or abnormal (the presence or absence of an abnormality).

A separate first learning model 113 is stored for each type of abnormality. That is, the following models, each taking a specimen data group as input, are stored: a first model that outputs whether there is a mix-up, a second model that outputs whether there is a dilution error, a third model that outputs whether there is a unit misidentification, a fourth model that outputs whether there is a reagent error, and a fifth model that outputs whether there is specimen clogging. Because all the first learning models 113 have the same configuration, the following describes the configuration of the first learning model 113 that outputs whether there is an abnormality caused by a mix-up.
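The arrangement of one binary model per abnormality type can be sketched as a dictionary of detectors that are all run on the same input. The detectors here are stand-in functions returning fixed values; in practice each would wrap a trained model. The names and the callable interface are assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

ABNORMALITY_TYPES = [
    "mix-up", "dilution error", "unit misidentification",
    "reagent error", "specimen clogging",
]

# A detector maps a feature vector to a confidence in [0, 1] that its abnormality is present.
Detector = Callable[[Sequence[float]], float]


def run_detectors(features: Sequence[float],
                  detectors: Dict[str, Detector]) -> Dict[str, float]:
    """Run every available per-type detector on the same specimen data group features."""
    return {name: detectors[name](features)
            for name in ABNORMALITY_TYPES if name in detectors}


# Stand-in detectors with fixed outputs, used only to show the data flow.
stub_detectors: Dict[str, Detector] = {
    "mix-up": lambda features: 0.9,
    "dilution error": lambda features: 0.1,
}
confidences = run_detectors([0.0, 4.2, 7.0, 4.0], stub_detectors)
```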

The information processing device 1 generates the first learning model 113 in advance by performing machine learning on predetermined training data. The information processing device 1 then inputs a specimen data group acquired from the testing device 5 into the first learning model 113 and outputs classification information for the specimen data group.

The first learning model 113 is constructed using, for example, XGBoost (eXtreme Gradient Boosting). XGBoost is an ensemble learning method based on gradient boosting of decision trees, and it also incorporates random-forest-style techniques such as feature subsampling. The first learning model 113 is configured to perform boosting, in which multiple decision trees (weak learners) are built one after another, each new tree using information from the trees built before it. Specifically, each new decision tree is built with the error that the preceding trees could not predict (the gradient of the loss function) as its target variable. In each decision tree, the input data is split according to conditions on the way from the root through the branches, and when it reaches a terminal leaf node, the value assigned to that leaf node is output as the predicted value.
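The boosting loop described above can be illustrated with a small pure-Python sketch. This is plain gradient boosting with one-split trees (stumps) and a squared-error loss, for which the negative gradient is simply the residual. It is not XGBoost itself, which additionally uses second-order gradient information and regularisation, and all names are hypothetical.

```python
from typing import List, Optional, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)


def fit_stump(X: List[List[float]], residuals: List[float]) -> Optional[Stump]:
    """Find the single split that minimises squared error on the residuals."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must send data to both leaves
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best, best_err = (j, t, lv, rv), err
    return best


def predict_stump(stump: Stump, row: List[float]) -> float:
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv  # the leaf value is the stump's prediction


def fit_boosting(X, y, rounds: int = 20, lr: float = 0.5):
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        # Error left by the trees built so far (negative gradient of squared loss).
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, lr, stumps


def predict(model, row: List[float]) -> float:
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)


toy_model = fit_boosting([[1.0], [2.0], [3.0], [4.0]], [0.0, 0.0, 1.0, 1.0])
```

Each round fits a new stump to what the previous rounds got wrong, so the summed prediction moves steadily toward the training labels.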

The input of the first learning model 113 is a specimen data group. The specimen data group includes, for example, the current specimen data and the three preceding specimen data (the previous, second-previous, and third-previous) for the same patient at the same testing facility. Each specimen data includes the test date and time (the interval since the previous test date and time) and the test values for multiple test items. The multiple specimen data are input to the first learning model 113 as specimen data group data arranged in chronological order. The number of specimen data included in a specimen data group is not limited to four. The specimen data group may include the specimen data to be judged for the presence or absence of an abnormality (the chronologically last specimen data) and at least one specimen data acquired before it. In general, what most urgently needs to be checked in clinical practice for testing errors or pathological abnormalities is the most recent data rather than past data. This method therefore assumes that the past data is correct and verifies whether the most recent (chronologically latest) data is correct.
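One plausible way to turn a chronologically ordered specimen data group into a fixed-length model input is to flatten it into a single vector holding, for each specimen data, the interval since the previous test followed by its test values. The layout below (interval in days, 0.0 for the first entry, NaN for a missing item) is an assumption; the description does not define the encoding.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Sample = Tuple[datetime, Dict[str, float]]  # (test date-time, test item -> value)


def to_feature_vector(group: List[Sample], items: List[str]) -> List[float]:
    """Flatten a chronologically ordered group into [interval_days, values...] per sample."""
    features: List[float] = []
    previous_time = None
    for tested_at, values in group:
        if previous_time is None:
            interval_days = 0.0  # no earlier test inside this group
        else:
            interval_days = (tested_at - previous_time).total_seconds() / 86400.0
        features.append(interval_days)
        # A missing test item is encoded as NaN (an assumed convention).
        features.extend(values.get(item, float("nan")) for item in items)
        previous_time = tested_at
    return features


example_group: List[Sample] = [
    (datetime(2021, 6, 1, 9, 0), {"ALB": 4.2, "K": 4.0}),
    (datetime(2021, 6, 8, 9, 0), {"ALB": 4.0, "K": 6.1}),
]
example_vector = to_feature_vector(example_group, ["ALB", "K"])
```

Keeping the entries in chronological order means the last block of the vector always corresponds to the specimen data under judgment.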

第１学習モデル１１３の出力は、検体データ群に含まれる時系列で最後の検体データの正常又は異常の分類結果である。第１学習モデル１１３は、検体データ群を入力した場合に、正常及び異常の各クラスに対する確度を出力する。第１学習モデル１１３は、確度が閾値以上であるクラスを出力値とすることができる。 The output of the first learning model 113 is the classification of the last sample data in chronological order in the sample data group as normal or abnormal. When a sample data group is input, the first learning model 113 outputs a confidence score for each of the normal and abnormal classes. The first learning model 113 can take as its output value the class whose confidence score is equal to or greater than a threshold.
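The thresholding of class confidence scores described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `classify`, the dictionary layout, and the default threshold of 0.5 are all assumptions.

```python
def classify(confidences, threshold=0.5):
    """Return the class whose confidence is at or above the threshold.

    `confidences` maps a class name to its confidence score (hypothetical layout).
    Returns None when no class reaches the threshold.
    """
    label, score = max(confidences.items(), key=lambda kv: kv[1])
    return label if score >= threshold else None


result = classify({"normal": 0.12, "abnormal": 0.88})
undecided = classify({"normal": 0.40, "abnormal": 0.30})
```

Returning `None` when no class reaches the threshold is one possible design choice; the text does not say how that case is handled.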

なお、第１学習モデル１１３はマルチクラス分類モデルであり、１つの第１学習モデル１１３により、正常又は複数種類の異常を分類するものであってもよい。この場合、第１学習モデル１１３は、時系列順の複数の検体データからなる検体データ群を入力として、当該検体データ群に含まれる時系列で最後の検体データの正常又は複数種類の異常を示す分類情報を出力する。 The first learning model 113 may be a multi-class classification model, so that a single first learning model 113 distinguishes between normal and multiple types of abnormality. In this case, the first learning model 113 receives as input a sample data group consisting of multiple sample data in chronological order, and outputs classification information indicating whether the last sample data in chronological order in that group is normal or exhibits one of the multiple types of abnormality.

上記では、第１学習モデル１１３がＸＧＢｏｏｓｔである例を説明したが、第１学習モデル１１３の構成は限定されるものではなく、検体データ群に含まれる時系列で最後の検体データの正常又は異常に関する分類情報を識別可能であればよい。第１学習モデル１１３は、例えば、Ｔｒａｎｓｆｏｒｍｅｒ、ＣＮＮ（Convolution Neural Network）、ＲＮＮ（Recurrent Neural Network）、ＬＳＴＭ（Long Short Term Memory）等のニューラルネットワークであってもよく、サポートベクタマシン、ロジスティクス回帰、ランダムフォレスト等の他の学習アルゴリズムを用いてもよい。 Although an example in which the first learning model 113 is XGBoost has been described above, the configuration of the first learning model 113 is not limited, as long as it can identify classification information regarding the normality or abnormality of the last sample data in chronological order in the sample data group. The first learning model 113 may be, for example, a neural network such as a Transformer, a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or an LSTM (Long Short-Term Memory), or may use another learning algorithm such as a support vector machine, logistic regression, or a random forest.

情報処理装置１は、異常検出を行う運用フェーズの前段階である学習フェーズにおいて、訓練データを用いて第１学習モデル１１３を生成し、生成した第１学習モデル１１３を記憶部１１に記憶する。そして、運用フェーズにおいて、記憶する第１学習モデル１１３を用いて検体データの異常を検出する。 In the learning phase, which is a stage preceding the operational phase in which anomaly detection is performed, the information processing device 1 generates a first learning model 113 using training data and stores the generated first learning model 113 in the storage unit 11. Then, in the operational phase, the stored first learning model 113 is used to detect anomalies in the sample data.

図５は、訓練データの生成処理手順の一例を示すフローチャートである。以下の処理は、情報処理装置１の記憶部１１に記憶されるプログラム１Ｐに従って制御部１０により実行される。 Figure 5 is a flowchart showing an example of a training data generation process. The following process is executed by the control unit 10 in accordance with the program 1P stored in the storage unit 11 of the information processing device 1.

情報処理装置１の制御部１０は、検体ＤＢ１１１に記憶する情報に基づき、同一患者に係る複数の時系列順の検体データを取得する（ステップＳ１１）。制御部１０は、例えば検体ＤＢ１１１に記憶する大量の検体データから、同一検査施設名且つ同一患者に係る検体データを抽出し、時系列順に並べる。 The control unit 10 of the information processing device 1 acquires multiple pieces of chronologically ordered specimen data relating to the same patient based on the information stored in the specimen DB 111 (step S11). The control unit 10 extracts specimen data relating to the same testing facility and the same patient from the large amount of specimen data stored in the specimen DB 111, for example, and arranges them in chronological order.

制御部１０は、取得した複数の検体データから、共起頻度の高い検査項目の検査値を含む検体データを抽出する（ステップＳ１２）。同じ時期に高頻度で実施されている検査項目、すなわち共起頻度の高い検査項目は、例えばアソシエーション分析やマーケットバスケット分析、共起クラスタマイニング（福井健一ら、人工知能３０巻２号、２０１５年、を参照）を用いることで、抽出することができる。一方で簡易に共起頻度の高い検査項目を抽出する方法が複数あり、例えば次のように抽出できる。まず、ある検査項目と共起する検査項目の頻度表を作成し、それを全検査項目について、２項目間の総当たりの頻度表を作成する。次にその頻度表において、その上位の検査項目に限定した検査項目セットについてセット内の全項目を含む共起頻度を確認し、ある閾値以上の共起頻度を持つ検査項目を絞り込むことで、共起頻度の高い検査項目グループ（複数の検体データ）を抽出できる。これにより、欠損値の少ない検体データ群を生成できる。 The control unit 10 extracts, from the acquired specimen data, specimen data that include test values for test items with a high co-occurrence frequency (step S12). Test items that are frequently performed at the same time, i.e., test items with a high co-occurrence frequency, can be extracted using, for example, association analysis, market basket analysis, or co-occurrence cluster mining (see Kenichi Fukui et al., Artificial Intelligence, Vol. 30, No. 2, 2015). There are also several simpler ways to extract such test items, for example the following. First, a frequency table of the test items that co-occur with a given test item is created, and this is repeated for all test items to obtain a pairwise (all-against-all) frequency table. Next, a test item set restricted to the top-ranked items in that table is formed, the frequency with which all items in the set occur together is checked, and the set is narrowed down to test items whose co-occurrence frequency is at or above a certain threshold. This yields a group of test items with a high co-occurrence frequency (and the corresponding specimen data), which makes it possible to generate specimen data groups with few missing values.
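The simple extraction procedure in step S12 can be sketched roughly as follows. This is only one reading of the description: the function names, the `anchor`, `top_n`, and `min_ratio` parameters, and the representation of each specimen as a list of test item names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pairwise_cooccurrence(orders):
    """Count how often each pair of test items appears in the same specimen."""
    counts = Counter()
    for items in orders:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


def frequent_item_group(orders, anchor, top_n, min_ratio):
    """Take the top_n items co-occurring most often with `anchor`, then keep the
    group only if all of its items appear together in at least min_ratio of orders."""
    partners = Counter()
    for (a, b), n in pairwise_cooccurrence(orders).items():
        if a == anchor:
            partners[b] = n
        elif b == anchor:
            partners[a] = n
    group = {anchor} | {item for item, _ in partners.most_common(top_n)}
    joint = sum(1 for items in orders if group <= set(items))
    ratio = joint / len(orders)
    return (sorted(group), ratio) if ratio >= min_ratio else (None, ratio)


# Hypothetical specimens, each listing the test items that were measured.
orders = [
    ["AST", "ALT", "GGT"],
    ["AST", "ALT", "GGT"],
    ["AST", "ALT"],
    ["AST", "ALT", "GGT", "CRP"],
    ["CRP"],
]
group, ratio = frequent_item_group(orders, anchor="AST", top_n=2, min_ratio=0.5)
```

Specimens that contain every item of the selected group can then be kept, which is what keeps the number of missing values low.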

制御部１０は、取得した複数の時系列順の共起頻度の高い検体データに基づき、異常検体データを含まない検体データ群を生成する（ステップＳ１３）。詳細には、制御部１０は、例えば取得した複数の時系列順の検体データに対し、最も新しい検体データを含む４つの検体データからなる第１の検体データ群を生成する。同様に、制御部１０は、時系列で２番目に新しい検体データを含む４つの検体データからなる第２の検体データ群、…、のように、基準となる検体データを順にスライドさせ、複数の検体データ群を生成する。このようにして、臨床検査で得られた検査値をそのまま用いた、異常検体データを含まない検体データ群が生成される。なお制御部１０は、第１学習モデル１１３への入力要素に必要な所定の管理項目のみを抽出して検体データを生成してもよい。 The control unit 10 generates a sample data group that does not include abnormal sample data based on the multiple chronologically ordered sample data with high co-occurrence frequency obtained (step S13). In detail, the control unit 10 generates a first sample data group consisting of four sample data including the most recent sample data from the multiple chronologically ordered sample data obtained. Similarly, the control unit 10 slides the reference sample data in order to generate multiple sample data groups, such as a second sample data group consisting of four sample data including the second most recent sample data in chronological order, .... In this way, a sample data group that does not include abnormal sample data is generated using the test values obtained in the clinical test as they are. The control unit 10 may generate sample data by extracting only predetermined management items necessary for input elements to the first learning model 113.
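The sliding of the reference specimen data in step S13 can be sketched as below. The function name `sliding_groups` and the use of a plain list (oldest first) for one patient's specimen data are assumptions; the group size of four follows the example in the text.

```python
def sliding_groups(records, size=4):
    """Build specimen data groups from one patient's records (oldest first).

    The first group ends with the most recent record, the second with the
    second most recent, and so on. Each group stays in chronological order,
    so its last element is the record to be judged.
    """
    groups = []
    for end in range(len(records), size - 1, -1):
        groups.append(records[end - size:end])
    return groups


# Hypothetical records t1 (oldest) to t6 (newest).
groups = sliding_groups(["t1", "t2", "t3", "t4", "t5", "t6"])
```

A patient with fewer records than the group size simply yields no groups in this sketch.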

制御部１０は、取得した検体データの一部を異常値に置き換えることにより、異常検体データを含む検体データ群を生成する（ステップＳ１４）。詳細には、制御部１０は、検体データ群に含まれる検体データのうち、時系列で最後の検体データの一部を異常値に置き換えることにより異常検体データを生成する。制御部１０は、生成した異常検体データと、臨床検査で得られた検査値をそのまま用いた残り３つの検体データとからなる検体データ群を生成する。 The control unit 10 generates a specimen data group including abnormal specimen data by replacing a portion of the acquired specimen data with an abnormal value (step S14). In detail, the control unit 10 generates abnormal specimen data by replacing a portion of the last specimen data in the chronological order among the specimen data included in the specimen data group with an abnormal value. The control unit 10 generates a specimen data group consisting of the generated abnormal specimen data and the remaining three specimen data that use the test values obtained in the clinical test as is.

異常検体データの生成方法の具体例を以下に示す。制御部１０は、第１患者に係る所定検査日時における検体データに含まれる全ての検査値を、当該所定検査日時に検査を行った、第１患者とは異なる第２患者に係る検体データの検査値に入れ替えることで、取り違えに起因する異常検体データを生成する。制御部１０は、第１患者に係る所定検査日時における検体データに含まれる特定の検査項目に対する検査値のみを、第２患者に係る検査値に入れ替えてもよい。 A specific example of a method for generating abnormal sample data is shown below. The control unit 10 generates abnormal sample data resulting from a mix-up by replacing all test values included in the sample data for a first patient at a specified test date and time with test values for sample data for a second patient, different from the first patient, who was tested at the specified test date and time. The control unit 10 may replace only test values for a specific test item included in the sample data for the first patient at a specified test date and time with test values for the second patient.

制御部１０は、検体データに含まれる一又は複数の検査項目に対する検査値を、規定と異なる希釈倍率による希釈を行ったと仮定し算出される検査値に入れ替えることで、希釈ミス（計算ミス）に起因する異常検体データを生成する。制御部１０は、検体データに含まれる一又は複数の検査項目に対する検査値を、単位を誤認したと仮定し算出される検査値に入れ替えることで、単位誤認（計算ミス）に起因する異常検体データを生成する。 The control unit 10 generates abnormal specimen data resulting from a dilution error (calculation error) by replacing the test values for one or more test items included in the specimen data with test values calculated on the assumption that dilution was performed at a dilution ratio different from the specified one. The control unit 10 generates abnormal specimen data resulting from a unit misidentification (calculation error) by replacing the test values for one or more test items included in the specimen data with test values calculated on the assumption that a unit was misidentified.

制御部１０は、検体データに含まれる一又は複数の検査項目に対する検査値を、規定と異なる試薬を用いたと仮定し算出される検査値に入れ替えることで、試薬間違いに起因する異常検体データを生成する。制御部１０は、検体データに含まれる一又は複数の検査項目に対する検査値を、検体詰まりが発生したと仮定し算出される検査値（例えば、ゼロ、実際の値の数分の一又は特定の最大値等）に入れ替えることで、検体詰まりに起因する異常検体データを生成する。このように、制御部１０は、実際に得られた検体データを用い、異常な検体データを生成する。なお、異常検体データの生成方法及び種類は上記の例に限定されるものではない。 The control unit 10 generates abnormal specimen data caused by an incorrect reagent by replacing the test values for one or more test items included in the specimen data with test values calculated under the assumption that a reagent other than the specified one was used. The control unit 10 generates abnormal specimen data caused by specimen clogging by replacing the test values for one or more test items included in the specimen data with test values calculated under the assumption that a specimen clogging has occurred (e.g., zero, a fraction of the actual value, or a specific maximum value, etc.). In this way, the control unit 10 generates abnormal specimen data using actually obtained specimen data. Note that the method and type of abnormal specimen data generation are not limited to the above examples.
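The ways of generating abnormal specimen data described above (mix-up, dilution or unit errors, specimen clogging) can be sketched as follows. The record layout (a dictionary with a `"values"` mapping), the function names, and the numeric factors are all hypothetical; the text does not specify how the substituted values are calculated.

```python
import copy


def swap_values(record, other, items=None):
    """Mix-up: take all (or only the listed) test values from another patient's record."""
    out = copy.deepcopy(record)
    for item in (list(out["values"]) if items is None else items):
        if item in other["values"]:
            out["values"][item] = other["values"][item]
    return out


def scale_values(record, items, factor):
    """Dilution or unit error: multiply the listed test values by a wrong factor."""
    out = copy.deepcopy(record)
    for item in items:
        out["values"][item] = out["values"][item] * factor
    return out


def clog_values(record, items, replacement=0.0):
    """Specimen clogging: replace the listed test values (e.g. with zero or a maximum)."""
    out = copy.deepcopy(record)
    for item in items:
        out["values"][item] = replacement
    return out


def inject_into_last(group, mutate):
    """Return a copy of the group in which only the last (most recent) record is altered."""
    return [copy.deepcopy(r) for r in group[:-1]] + [mutate(group[-1])]


# Hypothetical group of four records for one patient, oldest first.
group = [
    {"values": {"GLU": 95.0, "CRE": 0.8}},
    {"values": {"GLU": 97.0, "CRE": 0.8}},
    {"values": {"GLU": 96.0, "CRE": 0.9}},
    {"values": {"GLU": 98.0, "CRE": 0.8}},
]
other = {"values": {"GLU": 210.0, "CRE": 2.4}}

diluted = inject_into_last(group, lambda r: scale_values(r, ["GLU"], 10))
swapped = inject_into_last(group, lambda r: swap_values(r, other))
clogged = inject_into_last(group, lambda r: clog_values(r, ["GLU", "CRE"]))
```

Only the last record is altered, in line with the assumption that past data are correct; the original group is left untouched so it can still serve as a normal example.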

制御部１０は、生成した異常検体データを含まない検体データ群及び異常検体データを含む検体データ群それぞれに対し、分類情報が正解値としてラベル付けされたデータセットである訓練データを生成する（ステップＳ１５）。具体的には、制御部１０は、各検体データ群に対し、異常の種類毎に正常又は異常を示すラベルが付与された訓練データを生成する。訓練データは、各検体データ群に対し、正常又は複数種類の異常を示すラベルが付与されたものであってもよい。 The control unit 10 generates training data, which is a data set in which classification information is labeled as a correct answer value, for each of the generated specimen data groups that do not include abnormal specimen data and the generated specimen data groups that include abnormal specimen data (step S15). Specifically, the control unit 10 generates training data in which a label indicating normality or abnormality is assigned to each specimen data group for each type of abnormality. The training data may be data in which a label indicating normality or multiple types of abnormalities is assigned to each specimen data group.

制御部１０は、生成した訓練データを訓練ＤＢ１１２に記憶し（ステップＳ１６）、一連の処理を終了する。制御部１０は、検体ＤＢ１１１に収集した大量の検体データに基づき複数の情報群を生成し、生成した情報群を訓練データとして訓練ＤＢ１１２に蓄積する。通常、臨床検査において検体データが異常と判断された場合には、再検査後の正常な検体データのみが最終的にデータベースに記録されることが多く、異常な検体データを収集することが困難である。上述の処理によれば、異常検体データを仮想的に生成することで、効率的に訓練データを蓄積することができる。なお、異常検体データは、実際の臨床検査における異常の発生頻度を考慮して生成されることが異常検知の精度を客観的に評価する上で好ましい。 The control unit 10 stores the generated training data in the training DB 112 (step S16), and ends the series of processes. The control unit 10 generates multiple information groups based on the large amount of specimen data collected in the specimen DB 111, and accumulates the generated information groups in the training DB 112 as training data. Usually, when specimen data is determined to be abnormal in a clinical test, only normal specimen data after retesting is often ultimately recorded in the database, making it difficult to collect abnormal specimen data. According to the above-mentioned process, abnormal specimen data is virtually generated, and training data can be accumulated efficiently. Note that, in order to objectively evaluate the accuracy of anomaly detection, it is preferable that the abnormal specimen data is generated taking into account the frequency of occurrence of abnormalities in actual clinical tests.

上述の処理において、検体データは、例えば検査施設名、患者属性情報、検査装置ＩＤ等を含むものであってもよい。すなわち、第１学習モデル１１３への入力要素に検査施設名、患者属性情報、検査装置ＩＤ等を含むものであってもよい。この場合、検体データ群の生成時に、同一検査施設や同一患者毎に検体データを区分する処理を不要とすることができる。 In the above-mentioned processing, the specimen data may include, for example, the name of the testing facility, patient attribute information, a testing device ID, etc. In other words, the input elements to the first learning model 113 may include the name of the testing facility, patient attribute information, a testing device ID, etc. In this case, when generating a specimen data group, it is possible to eliminate the need for processing to classify specimen data for the same testing facility or the same patient.

図６は、第１学習モデル１１３の生成処理手順の一例を示すフローチャートである。以下の処理は、例えば図５の処理の終了後に、情報処理装置１の記憶部１１に記憶されるプログラム１Ｐに従って制御部１０により実行される。 Figure 6 is a flowchart showing an example of a process for generating the first learning model 113. The following process is executed by the control unit 10 according to the program 1P stored in the storage unit 11 of the information processing device 1, for example, after the process of Figure 5 is completed.

情報処理装置１の制御部１０は、訓練ＤＢ１１２に記憶された情報に基づき、情報群より抽出された１組の訓練データを取得する（ステップＳ２１）。制御部１０は、取得した訓練データを用いて、検体データ群を入力した場合に、当該検体データ群に含まれる時系列で最後の検体データの正常又は異常を示す情報を出力する第１学習モデル１１３を生成する（ステップＳ２２）。 The control unit 10 of the information processing device 1 acquires a set of training data extracted from the information group based on the information stored in the training DB 112 (step S21). The control unit 10 uses the acquired training data to generate a first learning model 113 that outputs information indicating whether the last specimen data in the time series included in the specimen data group is normal or abnormal when a specimen data group is input (step S22).

具体的には、制御部１０は、訓練データに含まれる検体データ群を用い、第１の決定木を構築する。制御部１０は、入力データを第１の決定木に当てはめ、入力データと葉ノードに含まれる予測値との第１の残差を求める。次いで制御部１０は、得られた第１の残差を用い、同様の手順により第２の決定木を構築し、第１の残差と葉ノードに含まれる予測値との第２の残差を求める。第１の決定木と第２の決定木とを合成することでより精度が高くなるよう、第２の決定木が構築される。制御部１０は、以下同様の手順により決定木を逐次的に構築する。 Specifically, the control unit 10 constructs a first decision tree using the sample data groups included in the training data. The control unit 10 applies the input data to the first decision tree and obtains a first residual between the input data and the predicted value held in the leaf node. Next, the control unit 10 uses the first residual to construct a second decision tree by the same procedure, and obtains a second residual between the first residual and the predicted value held in the leaf node. The second decision tree is constructed so that combining the first and second decision trees gives higher accuracy. The control unit 10 then constructs further decision trees sequentially by the same procedure.

制御部１０は、第１学習モデル１１３における損失関数を最適化（最小化）するよう、例えば勾配降下法を用いパラメータを調整し、決定木を逐次的に学習・統合する。なお、制御部１０は、いわゆるオーバーフィッティング（過学習）を防止するため、損失関数を最適化する際、全ての説明変数を使用するのではなく、ランダムに決定された割合で説明変数の数を選定して用いる。制御部１０は、損失関数が所定基準を満たすことにより学習を完了する。 The control unit 10 adjusts the parameters, for example by gradient descent, so as to optimize (minimize) the loss function of the first learning model 113, and sequentially trains and integrates the decision trees. To prevent overfitting, the control unit 10 does not use all explanatory variables when optimizing the loss function. Instead, it selects and uses a number of explanatory variables given by a randomly determined proportion. The control unit 10 completes training when the loss function satisfies a predetermined criterion.
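The sequential construction of trees on residuals, with a random subset of explanatory variables per tree, can be illustrated with a toy sketch. This is not XGBoost and not the patent's implementation: it uses depth-1 trees (stumps), a squared-error loss on 0/1 labels, and a fixed number of trees instead of a stopping criterion. All names and parameters (`fit_boosting`, `lr`, `colsample`) are assumptions.

```python
import random


def fit_stump(X, residuals, features):
    """Find the single-feature threshold split that minimises squared error on the residuals."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    return best  # None when no valid split exists


def predict_stump(stump, row):
    _, f, t, lm, rm = stump
    return lm if row[f] <= t else rm


def fit_boosting(X, y, n_trees=20, lr=0.5, colsample=0.5, seed=0):
    """Fit stumps one after another, each on the residuals left by the previous ones."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, round(colsample * n_features))  # explanatory variables tried per tree
    base = sum(y) / len(y)
    preds = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        # Residual = negative gradient of the loss 1/2 * (y - p)^2.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals, rng.sample(range(n_features), k))
        if stump is None:
            continue
        trees.append(stump)
        preds = [p + lr * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, lr, trees


def predict(model, row):
    base, lr, trees = model
    return base + lr * sum(predict_stump(s, row) for s in trees)


def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Hypothetical data: two explanatory variables, label 1 = abnormal.
X = [[1, 5], [2, 3], [8, 4], [9, 6]]
y = [0, 0, 1, 1]
model = fit_boosting(X, y)
baseline = (sum(y) / len(y), 0.5, [])  # constant predictor with no trees
```

Each stump predicts the mean residual of its leaf, so adding it with a learning rate of at most 1 cannot increase the training error. A real model would also need regularisation and a validation-based stopping rule.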

制御部１０は、学習済みの第１学習モデル１１３として、学習済みの第１学習モデル１１３に関する定義情報を記憶部１１に記憶させ（ステップＳ２３）、一連の処理を終了する。制御部１０は、上述の処理により、検体データ群に対し、正常又は異常に関する分類情報を適切に出力可能に学習された第１学習モデル１１３を構築することができる。制御部１０は、各第１学習モデル１１３について上述の処理を実行する。 The control unit 10 stores definition information related to the trained first learning model 113 in the storage unit 11 as the trained first learning model 113 (step S23), and ends the series of processes. Through the above-mentioned processes, the control unit 10 can construct a trained first learning model 113 capable of appropriately outputting classification information related to normality or abnormality for a group of sample data. The control unit 10 executes the above-mentioned processes for each first learning model 113.

上述のように生成された第１学習モデル１１３を用いて、情報処理装置１は、検体データの異常を検出する。以下、運用フェーズにおいて情報処理装置１が実行する処理手順について説明する。 Using the first learning model 113 generated as described above, the information processing device 1 detects anomalies in the sample data. Below, the processing procedure executed by the information processing device 1 in the operation phase is described.

図７は、異常検出の処理手順の一例を示すフローチャートである。以下の処理は、情報処理装置１の記憶部１１に記憶されるプログラム１Ｐに従って制御部１０によって実行される。制御部１０は、例えば検査施設装置２から検体データが送信される都度、リアルタイムで以下の処理を行ってもよく、収集された検体データに基づき、任意のタイミングで処理を行ってもよい。 Figure 7 is a flowchart showing an example of a processing procedure for detecting an anomaly. The following processing is executed by the control unit 10 in accordance with the program 1P stored in the memory unit 11 of the information processing device 1. The control unit 10 may perform the following processing in real time, for example, each time sample data is transmitted from the testing facility device 2, or may perform the processing at any timing based on the collected sample data.

情報処理装置１の制御部１０は、検査施設装置２から検体データを取得し（ステップＳ３１）、取得した検体データを検体ＤＢ１１１に記憶する。 The control unit 10 of the information processing device 1 acquires the specimen data from the testing facility device 2 (step S31) and stores the acquired specimen data in the specimen DB 111.

制御部１０は、検体ＤＢ１１１に記憶する情報に基づき、取得した検体データを含む検体データ群を取得する（ステップＳ３２）。検体データ群は、例えば受信した検体データと同一の患者に係る検体データであって、受信した検体データと時系列で隣接する４つの検体データを、時系列順に配列したものである。なお制御部１０は、過去に取得された複数の検体データから、今回（最新）の検体データと共起頻度の高い検査項目の検査値を含む検体データを抽出し、抽出した共起頻度の高い検査項目の検査値を含む検体データを含む検体データ群を取得してもよい。共起頻度の高い検体データの抽出方法は、訓練データの生成時と同様の手法であってよい。 Based on the information stored in the specimen DB 111, the control unit 10 acquires a specimen data group that includes the acquired specimen data (step S32). The specimen data group consists of, for example, four specimen data for the same patient as the received specimen data that are chronologically adjacent to it, arranged in chronological order. The control unit 10 may instead extract, from the multiple specimen data acquired in the past, specimen data that include test values for test items that frequently co-occur with those of the current (latest) specimen data, and acquire a specimen data group that includes the extracted specimen data. The method of extracting frequently co-occurring specimen data may be the same as that used when generating the training data.

制御部１０は、取得した検体データ群を入力データとして第１学習モデル１１３に入力する（ステップＳ３３）。制御部１０は、第１学習モデル１１３から出力される分類情報を取得する（ステップＳ３４）。 The control unit 10 inputs the acquired sample data group as input data to the first learning model 113 (step S33). The control unit 10 acquires the classification information output from the first learning model 113 (step S34).

制御部１０は、第１学習モデル１１３に入力される検体データにおける各検査項目に対応する検査値の寄与度を算出することにより、出力される分類情報に寄与した検査値を特定する（ステップＳ３５）。寄与度の算出方法は限定されるものではないが、例えばＳＨＡＰ（SHapley Additive exPlanation）、ＬＩＭＥ（Local Interpretable Model-Agnostic Explanations）等を利用することができる。検査値の寄与度が大きい程、当該検査値の分類に及ぼす影響が大きいことを示す。 The control unit 10 calculates the contribution of the test value for each test item in the sample data input to the first learning model 113, and thereby identifies the test values that contributed to the output classification information (step S35). The method for calculating the contribution is not limited; for example, SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-Agnostic Explanations) can be used. The larger the contribution of a test value, the greater the influence of that test value on the classification.

制御部１０は、算出した寄与度に基づき、例えば寄与度が所定値以上である検査値を、分類情報に寄与した検査値と特定する。制御部１０は、例えば寄与度が高い順に所定数の検査値を選択することにより、分類情報に寄与した検査値と特定してもよい。なお、第１学習モデル１１３にＡｔｔｅｎｔｉｏｎ機構を用いた場合には、Ａｔｔｅｎｔｉｏｎの値を寄与度としてもよい。制御部１０は、複数の第１学習モデル１１３それぞれに対しステップＳ３３からステップＳ３５の処理を実行する。制御部１０は、分類情報に寄与した検査値に係る検査項目を特定してもよい。 Based on the calculated contribution degree, the control unit 10 identifies, for example, test values whose contribution degree is equal to or greater than a predetermined value as test values that contributed to the classification information. The control unit 10 may identify test values that contributed to the classification information by, for example, selecting a predetermined number of test values in order of decreasing contribution degree. Note that, if an Attention mechanism is used for the first learning model 113, the Attention value may be used as the contribution degree. The control unit 10 executes the processes of steps S33 to S35 for each of the multiple first learning models 113. The control unit 10 may identify the test item related to the test value that contributed to the classification information.
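The two selection rules described above (a contribution at or above a predetermined value, or a predetermined number of top-ranked values) can be sketched as follows. The function name, the parameter names, and the use of the absolute value of signed contributions (as SHAP produces) are assumptions; the contribution scores themselves are taken as given.

```python
def select_contributors(contributions, min_value=None, top_k=None):
    """Return test items ranked by contribution, filtered by threshold and/or count.

    `contributions` maps a test item to its contribution score (hypothetical layout).
    Magnitude is used for ranking, assuming signed scores.
    """
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    if min_value is not None:
        ranked = [(item, v) for item, v in ranked if abs(v) >= min_value]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [item for item, _ in ranked]


# Hypothetical contribution scores for one classified specimen.
scores = {"GLU": 0.42, "CRE": -0.05, "K": 0.31}
by_threshold = select_contributors(scores, min_value=0.3)
by_rank = select_contributors(scores, top_k=1)
```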

制御部１０は、他の手法による検体データの適否の検証結果を取得する（ステップＳ３６）。制御部１０は、検体データの適否の検証方法として、例えば公知のデルタ検証法及び出現実績ゾーン検証法を用いてよい。 The control unit 10 obtains the results of the verification of the appropriateness of the sample data using other methods (step S36). The control unit 10 may use, for example, the well-known delta verification method and occurrence history zone verification method as a method for verifying the appropriateness of the sample data.

デルタ検証法は、取得した今回の検体データに含まれる所定検査項目の検査値と、同一患者の前回値とを比較し、今回の検査値がリミット上下限値（例えば±２ＳＤ）以内か否かを判定するものである。今回の検査値がリミット上下限値の範囲外である場合、今回の検査値は異常と判定される。なおデルタ検証法による検証方法については、例えば特開平５－１５１２８２号公報や特開平１１－２９６６０５号公報を参照されたい。 The delta verification method compares the test values of a specified test item contained in the currently acquired sample data with the previous values for the same patient, and determines whether the current test value is within the upper and lower limits (e.g., ±2SD). If the current test value is outside the upper and lower limits, the current test value is determined to be abnormal. For details on the verification method using the delta verification method, see, for example, JP-A-5-151282 and JP-A-11-296605.
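The delta check can be sketched as below. This is a simplified illustration rather than the method of the cited publications: it assumes the limit is k times the standard deviation of past value-to-value differences for the test item, centred on zero change, and the function names are hypothetical.

```python
import statistics


def delta_limit(past_differences, k=2.0):
    """Limit on the allowed change, taken here as k times the SD of past differences."""
    return k * statistics.pstdev(past_differences)


def delta_check(current, previous, limit):
    """Return True (abnormal) when the change from the previous value exceeds the limit."""
    return abs(current - previous) > limit


# Hypothetical differences between consecutive values of one test item.
limit = delta_limit([-10, 10, -10, 10])
flag_large = delta_check(current=135, previous=100, limit=limit)
flag_small = delta_check(current=112, previous=100, limit=limit)
```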

出現実績ゾーン検証法は、検査値の出現実績に基づいてその頻度分布を求め、前回値チェックの許容範囲、すなわち正常ゾーンを非線形に設定し、単項目チェック、項目間チェック、前回値チェックを行うものである。今回の検査値が正常ゾーンの範囲外である場合、今回の検査値は異常と判定される。なお出現実績ゾーン検証法による検証方法については、例えば特許公報第２８２８６０９号公報を参照されたい。 The occurrence history zone verification method obtains the frequency distribution of test values from their occurrence history, sets the allowable range for the previous-value check, i.e., the normal zone, non-linearly, and performs single-item checks, inter-item checks, and previous-value checks. If the current test value is outside the normal zone, it is determined to be abnormal. For details of the occurrence history zone verification method, see, for example, Japanese Patent No. 2828609.

制御部１０は、取得した分類情報及び検証結果を含む画面情報を生成する（ステップＳ３７）。制御部１０は、生成した画面情報を表示装置３へ表示させ（ステップＳ３８）、一連の処理を終了する。 The control unit 10 generates screen information including the acquired classification information and verification results (step S37). The control unit 10 displays the generated screen information on the display device 3 (step S38), and ends the series of processes.

図８から図１０は、表示装置３に表示される画面の一例を示す模式図である。図８は、結果一覧画面３１の例を示す。結果一覧画面３１は、表示対象となる検体データに対する、第１学習モデル１１３による分類情報と、デルタ検証法及び出現実績ゾーン検証法による検証結果とを含む検出結果を一覧で表示する。結果一覧は、例えば表形式であり、横軸方向に異常検出の手法名（例えばデルタ検証法を意味するデルタ法、出現実績ゾーン検証法を意味するゾーン法、第１学習モデル１１３を意味する機械学習法等）、初期検査値、及び再検査した場合の最新の検査値が表示される。また、縦軸方向に検体データの各項目（例えば検体ＩＤ、患者ＩＤ、検査値、検査項目等）が表示される。さらに機械学習法の下行には、異常の種類を示す複数項目が表示される。異常が検出された手法（異常の種類を示す項目）における一又は複数の検査値又は検査項目の欄は、所定の表示態様により、検体データの異常を認識可能に示す。具体的には、異常の有無を表す記号、機械学習法により予測された異常有無の確度の数値等が表示される。 FIGS. 8 to 10 are schematic diagrams showing examples of screens displayed on the display device 3. FIG. 8 shows an example of a result list screen 31. The result list screen 31 displays a list of detection results for the specimen data to be displayed, including the classification information from the first learning model 113 and the verification results from the delta verification method and the occurrence history zone verification method. The result list is, for example, in table form. Along the horizontal axis it shows the names of the abnormality detection methods (e.g., "delta method" for the delta verification method, "zone method" for the occurrence history zone verification method, and "machine learning method" for the first learning model 113), the initial test value, and the latest test value when a retest has been performed. Along the vertical axis it shows the items of the specimen data (e.g., specimen ID, patient ID, test values, test items). In addition, multiple items indicating the types of abnormality are displayed in the row below the machine learning method. For a method (or an item indicating the type of abnormality) in which an abnormality was detected, the cells for the one or more relevant test values or test items are shown in a predetermined display mode so that the abnormality in the specimen data can be recognized. Specifically, a symbol indicating the presence or absence of an abnormality, the numerical confidence of the abnormality prediction made by the machine learning method, and the like are displayed.

結果一覧はさらに、例えばホテリング理論等の他の異常検出の手法による検出結果（異常度）や、異常判定の統合結果（ホテリング理論の異常度や機械学習法による予測の確度に基づく自動推定結果）等が含まれてもよい。なお、ホテリング理論等による推定方法については、他の実施形態で詳述する。 The result list may further include detection results (degrees of anomaly) from other anomaly detection methods such as Hotelling's theory, and integrated results of the anomaly determination (automatic estimation results based on the degree of anomaly from Hotelling's theory and the confidence of the prediction made by the machine learning method). The estimation method based on Hotelling's theory and the like is described in detail in another embodiment.

情報処理装置１の制御部１０は、第１学習モデル１１３による分類情報と、分類情報に寄与した検査値の特定結果とに基づき、分類情報に対応する異常の種類列の検査値又は検査項目欄に、例えば色付け、枠付け、点灯／点滅、異常の有無を表す記号表示等の処理を施す。この場合において、制御部１０は、例えば表示色の濃度を検査値又は検査項目の寄与度に応じて変更する、表示色を分類情報（異常の種類）に応じて変更するなど、異常の表示態様を分類情報や寄与度に応じて異ならせてもよい。同様に、制御部１０は、デルタ検証法及び出現実績ゾーン検証法による検証結果に基づき、異常が検出された手法名列の検査値又は検査項目欄に、例えば色付け、枠付け、点灯／点滅、異常の有無を表す記号表示等の処理を施す。デルタ検証法及び出現実績ゾーン検証法については、検証に用いた検査値又は検査項目欄に対し各種処理が施されてもよい。 The control unit 10 of the information processing device 1 performs processing such as coloring, framing, lighting/flashing, and symbol display indicating the presence or absence of an abnormality on the test value or test item column of the abnormality type column corresponding to the classification information based on the classification information by the first learning model 113 and the identification result of the test value that contributed to the classification information. In this case, the control unit 10 may change the display mode of the abnormality depending on the classification information or the contribution degree, for example, by changing the density of the display color depending on the contribution degree of the test value or test item, or by changing the display color depending on the classification information (type of abnormality). Similarly, the control unit 10 performs processing such as coloring, framing, lighting/flashing, and symbol display indicating the presence or absence of an abnormality on the test value or test item column of the method name column in which an abnormality was detected based on the verification results by the delta verification method and the occurrence history zone verification method. For the delta verification method and the occurrence history zone verification method, various processes may be performed on the test value or test item column used in the verification.

結果一覧画面３１は、表示対象となる患者ＩＤの検査値の時系列データを表示してもよい。また結果一覧画面３１は、検体データ（検体ＩＤ）毎に表示されるものに限定されず、同一検査日に取得された複数の検体データに関する検出結果を一覧で表示するものであってもよい。これにより、例えば取り違えが検出された場合、その取り違え対象を推定し易くなる。 The result list screen 31 may display time series data of test values for the patient ID to be displayed. Furthermore, the result list screen 31 is not limited to displaying each sample data (sample ID), but may display a list of detection results for multiple sample data acquired on the same test date. This makes it easier to estimate the subject of the mix-up, for example, if a mix-up is detected.

制御部１０は、図８に示す如く結果一覧が表示されている状態で、いずれかの検査値のタップ操作を受け付けた場合、図９に示す第１の詳細画面３２を表示装置３へ表示させる。第１の詳細画面３２は、選択された検査値の時系列変化を示すグラフを含む。グラフの縦軸は検査値、横軸は検査日である。制御部１０は、例えば検体ＤＢ１１１を参照し、同一患者の今回及び過去所定回数分（図９の例では過去４回分）の検体データにおける検査値を取得する。検査値は、タップ操作により選択を受け付けた検査値に係る検査項目に対応するものである。制御部１０は、取得した各検査値をグラフ上にプロットする。 When the control unit 10 receives a tap operation on any test value while the list of results is displayed as shown in FIG. 8, it causes the display device 3 to display a first details screen 32 shown in FIG. 9. The first details screen 32 includes a graph showing the time series change of the selected test value. The vertical axis of the graph is the test value, and the horizontal axis is the test date. The control unit 10, for example, refers to the specimen DB 111 and acquires test values for the current specimen data and a predetermined number of previous specimen data (the past four specimens in the example of FIG. 9) for the same patient. The test value corresponds to the test item related to the test value selected by the tap operation. The control unit 10 plots each acquired test value on the graph.

グラフには、今回の検査値に対する正常変動範囲及び異常候補範囲が表示されている。制御部１０は、例えば過去４回分の検査値に基づき、今回の検査値の正常変動範囲を推定する。正常変動範囲の推定方法は限定されるものではないが、制御部１０は、例えばガウス過程回帰等の学習アルゴリズムにより学習済みのモデルを用いて、正常変動範囲を推定してもよい。制御部１０は、グラフ上の正常変動範囲に対応する領域と、正常範囲外である異常候補範囲に対応する領域とに、異なる色付け等の処理を施す。正常変動範囲及び異常候補範囲が可視化されることで、今回の検査値の状態を視覚的に容易に把握できる。図９では、１つの検査項目に関する情報を含む第１の詳細画面３２の例を示したが、第１の詳細画面３２は、複数の検査項目に関する情報を同時に（並列に）表示するものであってもよい。 The graph displays the normal variation range and the candidate abnormality range for the current test value. The control unit 10 estimates the normal variation range of the current test value based on, for example, the past four test values. The method of estimating the normal variation range is not limited, but the control unit 10 may estimate the normal variation range using a model that has been trained using a learning algorithm such as Gaussian process regression. The control unit 10 performs processing such as coloring differently on the area on the graph that corresponds to the normal variation range and the area that corresponds to the candidate abnormality range that is outside the normal range. By visualizing the normal variation range and the candidate abnormality range, the state of the current test value can be visually and easily grasped. Although FIG. 9 shows an example of the first detailed screen 32 that includes information about one test item, the first detailed screen 32 may display information about multiple test items simultaneously (in parallel).

また、制御部１０は、図８に示す如く結果一覧が表示されている状態で、「デルタ法」又は「ゾーン法」のタップ操作を受け付けた場合、図１０に示す第２の詳細画面３２を表示装置３へ表示させる。第２の詳細画面３２は、図１０Ａに示すように、「デルタ法」の選択に応じて表示されるデルタ検証法による検査値の頻度分布マップや、図１０Ｂに示すように、「ゾーン法」の選択に応じて表示される出現実績ゾーン検証法による検査値の頻度分布マップを含む。頻度分布マップはいずれも、デルタ検証法及び出現実績ゾーン検証法により特定される今回の検査値の正常範囲及び異常範囲を、異なる色付けや境界線表示等の表示態様により識別可能に表示する。制御部１０は、検証結果に基づき頻度分布マップを生成する。さらに制御部１０は、生成した頻度分布マップ上の対応する位置に、今回の検査値をプロットする。 In addition, when the control unit 10 receives a tap operation for the "delta method" or "zone method" while the result list is displayed as shown in FIG. 8, the control unit 10 causes the display device 3 to display the second detailed screen 32 shown in FIG. 10. The second detailed screen 32 includes a frequency distribution map of test values by the delta verification method, which is displayed in response to the selection of the "delta method," as shown in FIG. 10A, and a frequency distribution map of test values by the occurrence history zone verification method, which is displayed in response to the selection of the "zone method," as shown in FIG. 10B. Both frequency distribution maps display the normal range and abnormal range of the current test value identified by the delta verification method and the occurrence history zone verification method in a distinguishable manner, such as by different colors or boundary display. The control unit 10 generates a frequency distribution map based on the verification results. Furthermore, the control unit 10 plots the current test value at the corresponding position on the generated frequency distribution map.

本実施形態によれば、情報処理装置１は、第１学習モデル１１３を用いて検体データの異常を精度よく検出することができる。第１学習モデル１１３は、共起頻度の高い検査項目に係る時系列の複数の検査値に基づき、検体データの正常又異常を精度よく分類する。各種手法による検出結果が統合して表示されるため、検出結果を容易に比較することができる。情報処理装置１は、複数種類の異常を分類できるため、異常の原因を切り分けることで、その後の再検査等の対応を効率的に行うことができる。 According to this embodiment, the information processing device 1 can accurately detect abnormalities in the sample data using the first learning model 113. The first learning model 113 accurately classifies the sample data as normal or abnormal based on multiple test values in a time series related to test items that frequently co-occur. Since the detection results obtained by various methods are integrated and displayed, the detection results can be easily compared. Since the information processing device 1 can classify multiple types of abnormalities, subsequent measures such as retesting can be efficiently taken by isolating the cause of the abnormality.

本実施形態によれば、検査工程における異常、すなわち検査過誤の可能性が高い検査を推定できるため、再検査の頻度を最小限に抑制しつつ、効率的に検査結果の信頼性を高める（検査過誤の見逃しを抑制する）ことができる。また、一般的に再検査の実施要否は、経験豊かな臨床検査技師により判断されることが多いところ、本実施形態によれば、再検査を行う検査対象を自動的に抽出し、再検査の理由を自動的に提示できるため、検査担当者の経験に依存しない形で、再検査を効率よく実施することができる。異常の原因が特定されることにより、異常の原因に応じて再検査の有無が判定でき、再検査の自動化が可能となる。 According to this embodiment, abnormalities in the testing process, i.e. tests with a high probability of testing errors, can be estimated, so that the frequency of retesting can be minimized while efficiently increasing the reliability of test results (reducing oversight of testing errors). Furthermore, while the need for retesting is generally determined by experienced clinical laboratory technicians, this embodiment can automatically extract test subjects for retesting and automatically present the reason for retesting, allowing retesting to be performed efficiently without relying on the experience of the tester. By identifying the cause of the abnormality, it is possible to determine whether or not to retest depending on the cause of the abnormality, making it possible to automate retesting.

上記の各フローチャートでは、一連の処理を情報処理装置１の制御部１０が実行する例を説明したが、各処理の処理主体は限定されない。上記の処理は、一部又は全部が検査施設装置２の制御部２０で実行されるものであってもよく。検査装置５で実行されるものであってもよい。第１学習モデル１１３は、情報処理装置１により生成され、検査施設装置２で学習されたものであってもよい。情報処理装置１が生成した第１学習モデル１１３を検査施設装置２にインストールし、検査施設装置２で異常検出を行ってもよい。また、分類情報は、検査施設装置２を介し表示装置３へ出力されるものに限定されない。制御部１０は、表示装置３以外の装置に分類情報を出力し、表示させてもよい。情報処理装置１は、検査施設装置２と別体で構成されるものに限定されず、検査施設装置２に内包されるものであってもよい。 In each of the above flowcharts, an example in which a series of processes is executed by the control unit 10 of the information processing device 1 has been described, but the processing entity of each process is not limited. The above processes may be executed in part or in whole by the control unit 20 of the testing facility device 2. They may also be executed by the testing device 5. The first learning model 113 may be generated by the information processing device 1 and learned by the testing facility device 2. The first learning model 113 generated by the information processing device 1 may be installed in the testing facility device 2, and anomaly detection may be performed by the testing facility device 2. In addition, the classification information is not limited to being output to the display device 3 via the testing facility device 2. The control unit 10 may output the classification information to a device other than the display device 3 and display it. The information processing device 1 is not limited to being configured separately from the testing facility device 2, and may be included in the testing facility device 2.

(Second Embodiment)
In the second embodiment, the first learning model 113 is trained using actual abnormal specimen data obtained through clinical testing. The following mainly describes the differences from the first embodiment; components shared with the first embodiment are given the same reference numerals, and their detailed description is omitted.

The information processing device 1 of the second embodiment generates training data using actual abnormal sample data obtained by clinical testing. The information processing device 1 treats the training of the first learning model 113 described in the first embodiment as pre-training, and then trains the first learning model 113 in a second stage using the newly generated training data (fine-tuning).

図１１は、第２実施形態における第１学習モデル１１３の生成処理手順の一例を示すフローチャートである。以下の処理は、情報処理装置１の記憶部１１に記憶されるプログラム１Ｐに従って制御部１０により実行される。 Figure 11 is a flowchart showing an example of a process for generating the first learning model 113 in the second embodiment. The following process is executed by the control unit 10 according to the program 1P stored in the storage unit 11 of the information processing device 1.

情報処理装置１の制御部１０は、検査施設装置２から異常値を含む異常検体データと、異常の種類（異常の原因）とを対応付けて取得する（ステップＳ４１）。得られた異常検体データは、臨床検査において実際に得られた異常値を含む。 The control unit 10 of the information processing device 1 acquires abnormal specimen data including abnormal values from the testing facility device 2 in association with the type of abnormality (cause of the abnormality) (step S41). The acquired abnormal specimen data includes abnormal values actually obtained in a clinical test.

制御部１０は、検体ＤＢ１１１を参照して、取得した異常検体データと同一患者に係る過去の１回以上の検体データ、例えば過去３回分の検体データを取得し、実際の異常検体データと過去の検体データとを含む検体データ群を生成する（ステップＳ４２）。 The control unit 10 refers to the specimen DB 111 to acquire one or more past specimen data relating to the same patient as the acquired abnormal specimen data, for example, the past three specimen data, and generates a specimen data group including the actual abnormal specimen data and the past specimen data (step S42).

制御部１０は、生成した検体データ群に対し、異常の種類を示す分類情報が正解値としてラベル付けされたデータセットである訓練データを生成する（ステップＳ４３）。制御部１０は、生成した訓練データを訓練ＤＢ１１２に記憶する（ステップＳ４４）。 The control unit 10 generates training data, which is a data set in which classification information indicating the type of abnormality is labeled as a correct answer value for the generated sample data group (step S43). The control unit 10 stores the generated training data in the training DB 112 (step S44).
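
Steps S41 to S43 can be sketched as a small data-assembly routine. This is a minimal illustration, assuming each specimen is a mapping from test item name to test value and that past specimens are ordered oldest first; the type names, the item names `AST` and `ALT`, and the label string are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Sample = Dict[str, float]  # test item name -> test value (assumed layout)


@dataclass
class TrainingRecord:
    sample_group: List[Sample]  # chronological order, abnormal specimen last
    label: str                  # type (cause) of the abnormality


def build_training_record(abnormal_sample, past_samples, abnormality_type, n_past=3):
    """Combine an actual abnormal specimen with the same patient's history.

    past_samples is assumed to be ordered oldest first; the most recent
    n_past entries are kept (the text gives three as an example).
    """
    history = past_samples[-n_past:]
    if not history:
        raise ValueError("at least one past specimen is required")
    return TrainingRecord(sample_group=history + [abnormal_sample],
                          label=abnormality_type)


# Hypothetical values for one patient.
history = [{"AST": 20.0, "ALT": 18.0}, {"AST": 22.0, "ALT": 19.0},
           {"AST": 21.0, "ALT": 17.0}, {"AST": 23.0, "ALT": 20.0}]
abnormal = {"AST": 95.0, "ALT": 110.0}
record = build_training_record(abnormal, history, "sample_mixup")
```

Storing `record` in the training DB 112 would correspond to step S44.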

制御部１０は、新たに取得した訓練データを用いて、検体データ群を入力した場合に、当該検体データ群に含まれる時系列で最後の検体データの正常又は異常を示す情報を出力する第１学習モデル１１３を生成する（ステップＳ４５）。この場合において、制御部１０は、第１実施形態において説明した処理により、臨床検査における実際の異常検体データを含まない訓練データを用いて事前に第１学習モデル１１３を学習しておく。そして制御部１０は、事前学習によるパラメータを初期値とし、新たな訓練データを用いてパラメータを最適化するよう、第１学習モデル１１３の学習を実行する。 The control unit 10 uses the newly acquired training data to generate a first learning model 113 that outputs information indicating whether the last sample data in the time series contained in the sample data group is normal or abnormal when the sample data group is input (step S45). In this case, the control unit 10 trains the first learning model 113 in advance using training data that does not include actual abnormal sample data from clinical testing, by the process described in the first embodiment. The control unit 10 then performs training of the first learning model 113, using parameters from the advance training as initial values and optimizing the parameters using the new training data.
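
The two-stage training in step S45 can be illustrated with a deliberately small model. The sketch below uses logistic regression trained by gradient descent as a stand-in for the first learning model 113, whose actual architecture is not assumed here; the feature vectors, learning rates, and epoch counts are hypothetical. The point it shows is only that the second stage starts from the parameters produced by the first stage.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Probability that feature vector x is abnormal."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(data, weights, bias, lr=0.1, epochs=200):
    """Stochastic gradient descent starting from the given parameters."""
    w, b = list(weights), bias
    for _ in range(epochs):
        for x, y in data:
            err = predict(x, w, b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Stage 1 (pre-training): data without actual abnormal specimens,
# e.g. artificially generated abnormal values (label 1) and normal ones (label 0).
pretrain_data = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([2.0, 2.2], 1), ([2.5, 1.9], 1)]
w0, b0 = train(pretrain_data, [0.0, 0.0], 0.0)

# Stage 2 (fine-tuning): the pre-trained parameters are the initial values.
finetune_data = [([0.3, 0.2], 0), ([1.2, 1.1], 1)]
w1, b1 = train(finetune_data, w0, b0, lr=0.05, epochs=100)
```

Passing `w0, b0` rather than zeros into the second call is the whole of the fine-tuning idea in this sketch; a smaller learning rate in the second stage is a common but optional choice.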

制御部１０は、学習済みの第１学習モデル１１３に関する定義情報を記憶部１１に記憶させ（ステップＳ４６）、一連の処理を終了する。 The control unit 10 stores definition information regarding the trained first learning model 113 in the memory unit 11 (step S46) and ends the series of processes.

According to this embodiment, training the first learning model 113 in two stages makes training efficient and improves the accuracy of the first learning model 113. For example, by pre-training the first learning model 113 on specimen data groups acquired from multiple testing facilities and then fine-tuning it on abnormal specimen data actually obtained at each testing facility, a first learning model 113 suited to each testing facility can be generated efficiently and effectively, even when a sufficient amount of abnormal data cannot be prepared at each facility.

(Third Embodiment)
In the third embodiment, the first learning model 113 is trained in combination with an unsupervised learning model. The following mainly describes the differences from the first embodiment; components shared with the first embodiment are given the same reference numerals, and their detailed description is omitted.

The information processing device 1 of the third embodiment stores a second learning model 114, which is an unsupervised learning model, in the memory unit 11. The information processing device 1 trains the first learning model 113 using the classification results of the second learning model 114.

FIG. 12 is an explanatory diagram showing an overview of the second learning model 114 in the third embodiment. The second learning model 114 is a machine learning model configured to take a sample data group as input and output information indicating the presence or absence of an abnormality (abnormal or normal) in that sample data group. The second learning model 114 outputs a classification result for the presence or absence of an abnormality in the sample data group using, for example, the LOF (Local Outlier Factor) method. The second learning model 114 calculates the degree of anomaly of the sample data group and classifies the group as abnormal when the degree of anomaly exceeds a threshold.

情報処理装置１は、過去に収集した大量の異常検体データを含まない検体データ群、すなわち正常な検体データ群のみを用いて、教師なし学習を行う。第２学習モデル１１４は、異常検体データを含まない検体データ群を正常データとし、新たな検体データ群の異常度が正常の範囲内か否かを識別するように訓練される。第２学習モデル１１４は、異常検体データを含まない検体データ群以外、すなわち異常検体データを含む検体データ群を異常値として識別する。第２学習モデル１１４は、正常データに対する、入力データである検体データ群の外れ度合いを評価した外れ値スコア（ＬＯＦ）を用いて、異常の有無を分類する。これにより、検体データ群が入力された場合、異常の有無を示す情報を適切に出力可能に学習された第２学習モデル１１４が構築される。 The information processing device 1 performs unsupervised learning using a large amount of previously collected specimen data groups that do not contain abnormal specimen data, i.e., normal specimen data groups. The second learning model 114 is trained to identify whether the degree of abnormality of a new specimen data group is within the normal range, with specimen data groups that do not contain abnormal specimen data as normal data. The second learning model 114 identifies specimen data groups other than those that do not contain abnormal specimen data, i.e., specimen data groups that contain abnormal specimen data, as abnormal values. The second learning model 114 classifies the presence or absence of an abnormality using an outlier score (LOF) that evaluates the degree of deviation of the specimen data group, which is the input data, from the normal data. This allows the second learning model 114 to be constructed so that, when a specimen data group is input, it can appropriately output information indicating the presence or absence of an abnormality.
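
The outlier scoring described above can be sketched in plain Python. This is a minimal illustration of the standard LOF definition used in a novelty-detection setting: the reference set contains only normal data, and a new input is scored against it. It assumes each sample data group has already been reduced to a numeric feature vector; the class name, the neighbor count `k`, the threshold of 1.5, and the grid of reference points are hypothetical.

```python
import math


def _knn(point, data, k, skip=None):
    """Return the k nearest (distance, index) pairs of point within data."""
    pairs = [(math.dist(point, q), i) for i, q in enumerate(data) if i != skip]
    return sorted(pairs)[:k]


class LofNovelty:
    """Local Outlier Factor fitted on normal data only."""

    def __init__(self, normal_data, k=3):
        self.data = normal_data
        self.k = k
        neigh = [_knn(p, normal_data, k, skip=i) for i, p in enumerate(normal_data)]
        # k-distance: distance from each reference point to its k-th neighbor.
        self.kdist = [n[-1][0] for n in neigh]
        # Local reachability density of each reference point.
        self.lrd = [self._lrd(n) for n in neigh]

    def _lrd(self, neighbours):
        # Reachability distance to neighbor o is max(k-distance(o), d(p, o)).
        reach = [max(self.kdist[j], d) for d, j in neighbours]
        return len(reach) / max(sum(reach), 1e-12)

    def score(self, point):
        """LOF of a new point: about 1 for inliers, larger for outliers."""
        neighbours = _knn(point, self.data, self.k)
        lrd_p = self._lrd(neighbours)
        return sum(self.lrd[j] for _, j in neighbours) / (len(neighbours) * lrd_p)

    def is_abnormal(self, point, threshold=1.5):
        return self.score(point) > threshold


# Hypothetical normal feature vectors on a 3 x 3 grid.
normal = [(float(x), float(y)) for x in range(3) for y in range(3)]
model = LofNovelty(normal, k=3)
```

A point far from the reference set has a much lower local density than its neighbors and therefore a score well above 1, which the threshold turns into the abnormal/normal decision.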

The second learning model 114 may include a feature extractor that extracts features of the specimen data group and the classification model described above, and may classify the presence or absence of anomalies in the specimen data group based on the features extracted by the feature extractor. The feature extractor is, for example, a convolutional neural network (CNN) model that extracts features of the specimen data group and outputs the extracted features. The CNN may be a model trained in advance, obtained for example by transfer learning.

第２学習モデル１１４の構成は上記の例に限定されるものではなく、教師なしの学習アルゴリズムにより、検体データ群の異常の有無を識別可能であればよい。第２学習モデル１１４は、例えばｋ近傍法、Ｏｎｅ－ｃｌａｓｓＳＶＭ、ホテリング理論のような異常度を定義した異常検知方法等により構成されてもよい。 The configuration of the second learning model 114 is not limited to the above example, and it is sufficient if the presence or absence of an abnormality in a group of sample data can be identified by an unsupervised learning algorithm. The second learning model 114 may be configured by an anomaly detection method that defines the degree of anomaly, such as the k-nearest neighbor method, one-class SVM, or Hotelling's theory.

The input data to the second learning model 114 is not limited to a specimen data group including multiple specimen data; it need only include at least the last specimen data in the time series of the specimen data group, i.e., the latest (most recent) specimen data.

Figure 13 is a flowchart showing an example of a training procedure for the first learning model 113 in the third embodiment. The following process is executed by the control unit 10 according to a program 1P stored in the storage unit 11 of the information processing device 1. The storage unit 11 stores a first learning model 113 that has been trained in advance.

情報処理装置１の制御部１０は、検体データ群を取得する（ステップＳ５１）。取得する検体データ群は、例えば図７のステップＳ３２において取得した検体データ群と同一であってよい。 The control unit 10 of the information processing device 1 acquires a sample data group (step S51). The acquired sample data group may be the same as the sample data group acquired in step S32 of FIG. 7, for example.

制御部１０は、取得した検体データ群を入力データとして第２学習モデル１１４に入力する（ステップＳ５２）。制御部１０は、第２学習モデル１１４から出力される異常の有無の推定結果を取得する（ステップＳ５３）。 The control unit 10 inputs the acquired sample data group as input data to the second learning model 114 (step S52). The control unit 10 acquires the estimation result of the presence or absence of an abnormality output from the second learning model 114 (step S53).

制御部１０は、取得した第２学習モデル１１４による分類結果と、第１学習モデル１１３による分類情報とを比較する（ステップＳ５４）。制御部１０は、比較結果を検査施設装置２へ出力する（ステップＳ５５）。 The control unit 10 compares the classification result obtained by the second learning model 114 with the classification information obtained by the first learning model 113 (step S54). The control unit 10 outputs the comparison result to the testing facility device 2 (step S55).

When the classification information from the first learning model 113 differs from the classification result from the second learning model 114, the control unit 10 acquires the classification result from the second learning model 114 as correction information (step S56). The control unit 10 retrains the first learning model 113 using the correction information for the classification information, and updates the first learning model 113 (step S57). Specifically, the control unit 10 retrains the model using, as training data, the sample data group that was input to the first learning model 113 together with the correction information. The control unit 10 optimizes the parameters so that the classification information output from the first learning model 113 approximates the corrected information, and thereby regenerates the first learning model 113.
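
Steps S54 to S57 amount to collecting the cases where the two models disagree and relabeling them with the second model's result. A minimal sketch, assuming both models can be called as functions that return a label string; the toy models and the sample values below are hypothetical stand-ins.

```python
def collect_corrections(sample_groups, first_model, second_model):
    """Return (sample_group, corrected_label) pairs for retraining.

    A pair is produced only where the first model's classification
    differs from the second model's result (steps S54 and S56).
    """
    corrections = []
    for group in sample_groups:
        first = first_model(group)
        second = second_model(group)
        if first != second:
            corrections.append((group, second))
    return corrections


# Hypothetical stand-ins for the two models.
def toy_first_model(group):
    return "normal"  # always answers normal, so it misses the jump below


def toy_second_model(group):
    return "abnormal" if max(group) - min(group) > 2.0 else "normal"


groups = [[1.0, 1.1, 1.2], [1.0, 1.1, 5.0]]
retraining_data = collect_corrections(groups, toy_first_model, toy_second_model)
```

The resulting pairs would then serve as the training data for the retraining in step S57.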

第２学習モデル１１４による分類結果と、第１学習モデル１１３による分類情報との比較結果は、例えば図８に示した結果一覧画面３１を用いて表示されてもよい。制御部１０は、第２学習モデル１１４による分類情報に加え、第２学習モデル１１４における各検査値（検査項目）に対する異常度を取得する。制御部１０は、第２学習モデル１１４の異常検知手法（図８の例ではホテリング理論）の検査値欄に、取得した各異常度を表示するとともに、異常と判定された検査値欄に対して、色付け等の表示態様を施す。また、制御部１０は、第２学習モデル１１４による分類結果と、第１学習モデル１１３による分類情報とを比較し、例えば第２学習モデル１１４における異常度と、第１学習モデル１１３による予測の確度とに基づいて、異常判定の統合結果（正常又は異常）を導出する。制御部１０は、導出した総合的な判定結果を異常判定列に表示する。 The comparison result between the classification result by the second learning model 114 and the classification information by the first learning model 113 may be displayed, for example, using the result list screen 31 shown in FIG. 8. In addition to the classification information by the second learning model 114, the control unit 10 acquires the degree of abnormality for each test value (test item) in the second learning model 114. The control unit 10 displays each acquired degree of abnormality in the test value column of the anomaly detection method of the second learning model 114 (Hotelling's theory in the example of FIG. 8), and applies a display mode such as coloring to the test value column determined to be abnormal. In addition, the control unit 10 compares the classification result by the second learning model 114 with the classification information by the first learning model 113, and derives an integrated result of anomaly judgment (normal or abnormal) based on, for example, the degree of abnormality in the second learning model 114 and the accuracy of prediction by the first learning model 113. The control unit 10 displays the derived overall judgment result in the anomaly judgment column.

なお、情報処理装置１は、第１学習モデル１１３と第２学習モデル１１４とを統合した１つの統合モデルを構築し、当該統合モデルを用いて検体データの異常を検出するよう構成されてもよい。 The information processing device 1 may be configured to construct a single integrated model by integrating the first learning model 113 and the second learning model 114, and to detect abnormalities in the sample data using the integrated model.

本実施形態によれば、第１学習モデル１１３と第２学習モデル１１４とを組み合わせることで第１学習モデル１１３をより最適化することができる。第１学習モデル１１３は、過去の異常とは異なる多様な異常を好適に検出できる。 According to this embodiment, the first learning model 113 can be further optimized by combining the first learning model 113 and the second learning model 114. The first learning model 113 can effectively detect a variety of anomalies that are different from past anomalies.

(Fourth Embodiment)
In the fourth embodiment, abnormalities caused by pathological conditions, diseases, or physiological conditions related to the state of a subject (hereinafter referred to as "abnormal test values caused by diseases") are detected. The following mainly describes the differences from the first embodiment; components shared with the first embodiment are given the same reference numerals, and their detailed description is omitted.

第４実施形態の情報処理装置１は、疾患に起因する検査値の異常を検出する第３学習モデル１１５を記憶部１１に記憶している。情報処理装置１は、第３学習モデル１１５を用い、検査過誤による検査時の異常とは異なる、患者の疾患に起因する検体データの異常を検出する。 The information processing device 1 of the fourth embodiment stores a third learning model 115 in the storage unit 11, which detects abnormalities in test values caused by a disease. The information processing device 1 uses the third learning model 115 to detect abnormalities in the specimen data caused by the patient's disease, which are different from abnormalities during testing caused by testing errors.

Figure 14 is an explanatory diagram showing an overview of the third learning model 115 in the fourth embodiment. The third learning model 115 is a machine learning model that receives, as input, sample data composed of test values corresponding to a plurality of test items and outputs disease information regarding whether the sample data is normal or abnormal. The third learning model 115 is generated and trained in advance, in the information processing device 1 or an external device, by deep learning using a neural network. When time-series data is acquired, a recurrent neural network (RNN) may be used.

第３学習モデル１１５は、検体データを入力する入力層と、疾患情報を出力する出力層と、特徴量を抽出する中間層（隠れ層）とを備える。中間層は、入力データの特徴量を抽出する複数のノードを有し、各種パラメータを用いて抽出された特徴量を出力層に受け渡す。入力層に、検体データが入力された場合、中間層で演算が行なわれ、出力層から、疾患情報が出力される。 The third learning model 115 includes an input layer that inputs specimen data, an output layer that outputs disease information, and an intermediate layer (hidden layer) that extracts features. The intermediate layer has multiple nodes that extract features of the input data, and passes the features extracted using various parameters to the output layer. When specimen data is input to the input layer, calculations are performed in the intermediate layer, and disease information is output from the output layer.

第３学習モデル１１５の入力は、複数の検査項目に対応する検査値を含む検体データである。検体データは、患者の年齢、性別等の属性情報をさらに含んでもよい。 The input of the third learning model 115 is sample data including test values corresponding to multiple test items. The sample data may further include attribute information such as the patient's age and gender.

The output of the third learning model 115 is disease information indicating whether the specimen data is normal or abnormal as a result of the patient's disease. Abnormalities may be classified into multiple types according to disease name (e.g., diabetes, hypertension). The output layer includes a node for each configured item of disease information (normal or a disease name) and outputs the confidence for each item as a score. The information processing device 1 can take, as the output value of the output layer, either the disease information with the highest score or the disease information whose score is equal to or greater than a threshold. Instead of multiple output nodes that each output the confidence of one item of disease information, the output layer may have a single output node that outputs the disease risk with the highest confidence.
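
The two ways of turning output-layer scores into an output value can be sketched as follows. The score dictionary and the threshold of 0.5 are hypothetical; only the two selection rules (highest score, or score at or above a threshold) come from the text.

```python
def select_disease_info(scores, threshold=None):
    """Pick disease information from per-label scores.

    With no threshold, return the single label with the highest score.
    With a threshold, return every label whose score is at or above it.
    """
    if threshold is None:
        return [max(scores, key=scores.get)]
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical output-layer scores for one specimen.
scores = {"normal": 0.1, "diabetes": 0.7, "hypertension": 0.6}
top = select_disease_info(scores)
above = select_disease_info(scores, threshold=0.5)
```

The threshold rule can return several labels, or none at all, so a caller that needs exactly one result would use the highest-score rule.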

The control unit 10 trains the third learning model 115 on training data collected in advance, consisting of input information obtained from a large number of past clinical tests, each annotated with a known result (normal or a disease name). The control unit 10 learns the parameters of the third learning model 115, for example by backpropagation, so that the model outputs the disease information corresponding to the specimen data. This yields a third learning model 115 trained to output appropriate disease information when specimen data is input.

図１５は、第４実施形態における疾患情報の取得処理手順の一例を示すフローチャートである。以下の処理は、情報処理装置１の記憶部１１に記憶されるプログラム１Ｐに従って制御部１０により実行される。 Figure 15 is a flowchart showing an example of a disease information acquisition processing procedure in the fourth embodiment. The following processing is executed by the control unit 10 according to the program 1P stored in the storage unit 11 of the information processing device 1.

情報処理装置１の制御部１０は、第１学習モデル１１３から出力される分類情報に基づき、検体データ群が正常と分類された検体データ群を特定し、特定した検体データ群における最新の（時系列で最後の）検体データを抽出する（ステップＳ６１）。 The control unit 10 of the information processing device 1 identifies a sample data group that is classified as normal based on the classification information output from the first learning model 113, and extracts the latest (last in chronological order) sample data from the identified sample data group (step S61).

制御部１０は、取得した検体データを入力データとして第３学習モデル１１５に入力する（ステップＳ６２）。制御部１０は、第３学習モデル１１５から出力される検体データの正常又は異常を示す疾患情報を取得する（ステップＳ６３）。 The control unit 10 inputs the acquired specimen data as input data to the third learning model 115 (step S62). The control unit 10 acquires disease information indicating whether the specimen data is normal or abnormal, which is output from the third learning model 115 (step S63).

制御部１０は、取得した疾患情報を検査施設装置２へ出力し（ステップＳ６４）、一連の処理を終了する。制御部１０は、取得した疾患情報を、分類情報等とともに表示する画面情報を生成してもよい。 The control unit 10 outputs the acquired disease information to the testing facility device 2 (step S64), and ends the series of processes. The control unit 10 may generate screen information that displays the acquired disease information together with classification information, etc.
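
The flow of steps S61 to S64 can be sketched as a short pipeline. It assumes both models are callables that return label strings and that each sample data group is a chronologically ordered list; the toy models and values are hypothetical stand-ins for the first learning model 113 and the third learning model 115.

```python
def acquire_disease_info(sample_groups, first_model, third_model):
    """Run the third model on the latest specimen of each normal group.

    Returns a list of (latest_specimen, disease_info) pairs that the
    caller can forward to the testing facility device (step S64).
    """
    results = []
    for group in sample_groups:
        if first_model(group) != "normal":  # S61: keep only normal groups
            continue
        latest = group[-1]                  # S61: last specimen in the series
        results.append((latest, third_model(latest)))  # S62, S63
    return results


# Hypothetical stand-ins; each specimen is a single glucose-like value.
def toy_first_model(group):
    return "abnormal" if abs(group[-1] - group[-2]) > 100 else "normal"


def toy_third_model(specimen):
    return "diabetes" if specimen >= 126 else "normal"


groups = [[95, 98, 101], [130, 135, 140], [90, 92, 400]]
disease_results = acquire_disease_info(groups, toy_first_model, toy_third_model)
```

Here the third group is excluded because the first model flags it as a likely testing error, so only specimens that passed the error check are examined for disease-related abnormalities.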

本実施形態によれば、臨床検査時の検査過誤に由来する異常に加え、被検者の生理状態と関連した疾患に関する異常を検出することで、より多くの情報を提供することができる。検体データの異常値に対し、疾患情報を加味することで、より臨床検査結果の妥当性を適正に判断することができる。 According to this embodiment, in addition to abnormalities resulting from testing errors during clinical testing, more information can be provided by detecting abnormalities related to diseases associated with the physiological state of the subject. By adding disease information to abnormal values in the sample data, the validity of the clinical test results can be more appropriately determined.

今回開示した実施の形態は、全ての点で例示であって、制限的なものではないと考えられるべきである。各実施例にて記載されている技術的特徴は互いに組み合わせることができ、本発明の範囲は、特許請求の範囲内での全ての変更及び特許請求の範囲と均等の範囲が含まれることが意図される。 The embodiments disclosed herein are illustrative in all respects and should not be considered limiting. The technical features described in each embodiment can be combined with each other, and the scope of the present invention is intended to include all modifications within the scope of the claims and equivalents to the scope of the claims.

REFERENCE SIGNS LIST
1 Information processing device
2 Testing facility device
3 Display device
4 Input device
5 Testing device
10, 20 Control unit
11, 21 Memory unit
12, 22 Communication unit
23 Input/output unit
111 Specimen DB
112 Training DB
113 First learning model
114 Second learning model
115 Third learning model
1P Program
1A Recording medium

Claims

A method for generating a learning model, the method causing a computer to execute processing comprising:
obtaining a plurality of pieces of sample data including test values corresponding to a plurality of test items in a clinical test;
generating a first sample data group including abnormal sample data by replacing at least one test value included in the acquired plurality of sample data with an abnormal value; and
generating, based on training data that includes the generated first sample data group, a second sample data group that does not include the abnormal sample data, and classification information regarding whether a sample data group is normal or abnormal, a learning model that outputs the classification information when a sample data group is input.

The method for generating a learning model according to claim 1, further comprising generating the first sample data group by replacing a portion of the sample data, among a plurality of sample data items from different test dates and times relating to a first patient, with the sample data of a second patient.

The method for generating a learning model according to claim 1 or 2, further comprising the step of generating the first sample data group by modifying the at least one test value in accordance with a predetermined rule.

The method for generating a learning model according to claim 3, wherein the predetermined rule includes changing a dilution ratio.

The method for generating a learning model according to claim 1, wherein the classification information includes normality and a plurality of types of abnormality including an abnormality caused by a mix-up of a sample.

6. The method for generating a learning model according to claim 1, further comprising: obtaining abnormal specimen data obtained in the clinical test; and training the learning model based on training data including a third sample data group that includes the obtained abnormal sample data, and a label indicating the abnormality of that sample data group.

7. The method for generating a learning model according to claim 1, further comprising: training the learning model using a result of classification of the sample data group as normal or abnormal by an unsupervised model that has been trained by unsupervised learning to classify the sample data group as normal or abnormal.

The method for generating a learning model according to claim 1, further comprising: extracting the specimen data having a high co-occurrence frequency from the plurality of specimen data; and generating the sample data group from the extracted specimen data with high co-occurrence frequency.

A program that causes a computer to execute processing comprising:
obtaining a specimen data group consisting of a plurality of specimen data arranged in chronological order; and
inputting the acquired specimen data group into a learning model that has been trained to output, when a specimen data group consisting of a plurality of specimen data arranged in chronological order is input, classification information regarding normality or abnormality of the last specimen data in the time series included in the specimen data group, and outputting the classification information,
wherein the learning model has been trained using training data including a first specimen data group including abnormal specimen data generated by replacing at least one test value included in a plurality of specimen data including test values corresponding to a plurality of test items in a clinical test with an abnormal value, a second specimen data group not including the abnormal specimen data, and classification information regarding normality or abnormality of the specimen data group.

The program according to claim 9, further comprising: acquiring the specimen data group consisting of a plurality of specimen data with high co-occurrence frequencies extracted from the plurality of specimen data.

The program according to claim 9 or 10, wherein the classification information includes an abnormality caused by a mix-up of a sample.

The program according to claim 9, wherein the classification information includes an abnormality caused by a calculation error when calculating any one of test values corresponding to a plurality of test items constituting the sample data.

13. The program according to claim 9, wherein the learning model outputs the classification information including a plurality of types of abnormalities, including an abnormality caused by a mix-up of samples and an abnormality caused by a calculation error when calculating one of the test values corresponding to a plurality of test items constituting the sample data.

Obtaining a verification result regarding normality or abnormality of the sample data by a delta verification method or an occurrence record zone verification method;
The program according to claim 9 , further comprising: outputting the acquired verification result and the classification information in association with each other.

The program according to claim 9 , further comprising: outputting, in an identifiable manner, from among test values corresponding to a plurality of test items constituting the specimen data, the test value or the test item having a high contribution to the classification information.

16. The program according to claim 9, further comprising: a graph showing time series changes in the latest sample data and past sample data in a clinical test; and an abnormality range of the latest sample data estimated based on the past sample data, the graph being output in association with the abnormality range of the latest sample data.

17. The program according to claim 9, further comprising: extracting the latest specimen data determined to be normal by the learning model; inputting the extracted latest specimen data into a second learning model that outputs disease information relating to abnormalities in the test values caused by a disease when specimen data consisting of test values corresponding to a plurality of test items is input; and outputting the disease information.

Obtaining a specimen data group consisting of a plurality of specimen data arranged in chronological order;
inputting the acquired specimen data group into a learning model that has been trained to output classification information regarding normality or abnormality of the last specimen data in chronological order when a specimen data group consisting of a plurality of specimen data arranged in chronological order is input, and outputting the classification information;
The learning model has been trained using training data including a first specimen data group including abnormal specimen data generated by replacing at least one test value included in a plurality of specimen data including test values corresponding to a plurality of test items in a clinical test with an abnormal value, a second specimen data group not including the abnormal specimen data, and classification information regarding normality or abnormality of the specimen data group.
An information processing method for causing a computer to execute processing.

an acquisition unit that acquires a sample data group consisting of a plurality of sample data arranged in chronological order;
and an output unit that inputs the acquired sample data group to a learning model that has been trained to output classification information regarding normality or abnormality of the last sample data in the time series when a sample data group consisting of a plurality of sample data arranged in chronological order is input, and outputs the classification information;
The learning model has been trained using training data including a first specimen data group including abnormal specimen data generated by replacing at least one test value included in a plurality of specimen data including test values corresponding to a plurality of test items in a clinical test with an abnormal value, a second specimen data group not including the abnormal specimen data, and classification information regarding normality or abnormality of the specimen data group.
An information processing device.

Obtaining a plurality of pieces of sample data including test values corresponding to a plurality of test items in a clinical test;
generating a first sample data group including abnormal sample data by replacing at least one test value included in the acquired plurality of sample data with an abnormal value;
A method for generating learning model data that causes a computer to execute a process of storing, in a storage unit as training data for a learning model, the generated first sample data group in association with a label indicating abnormality, and a second sample data group not including the abnormal sample data in association with a label indicating normality.
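
The generation of training data described above can be sketched as follows. This is a minimal sketch under stated assumptions: each sample data group is a chronological list of dictionaries mapping test items to values, the replaced value is always in the last sample of the group, and the abnormal value is produced by a caller-supplied function. The names `make_training_data` and `tenfold`, and the in-memory list used in place of a storage unit, are hypothetical.

```python
import random


def make_training_data(series_list, abnormal_value_fn, seed=0):
    """Return (sample data group, label) pairs.

    For each input group, the unmodified copy is stored with the label
    "normal" (second sample data group) and a copy with one replaced
    test value is stored with the label "abnormal" (first group)."""
    rng = random.Random(seed)
    training = []
    for series in series_list:
        training.append(([dict(s) for s in series], "normal"))

        corrupted = [dict(s) for s in series]
        item = rng.choice(sorted(corrupted[-1]))  # test item to overwrite
        corrupted[-1][item] = abnormal_value_fn(corrupted[-1][item], rng)
        training.append((corrupted, "abnormal"))
    return training


def tenfold(value, rng):
    """Illustrative abnormal value: ten times the original value."""
    return value * 10.0


source = [[{"AST": 20.0, "ALT": 18.0}, {"AST": 22.0, "ALT": 19.0}]]
training_data = make_training_data(source, tenfold)
```

Copying each sample before modification keeps the source data intact, so the same acquired data can serve as both the normal group and the basis for the abnormal group.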