JP7580448B2

JP7580448B2 - Gate area estimation program, gate area estimation method, and gate area estimation device

Info

Publication number: JP7580448B2
Application number: JP2022510573A
Authority: JP
Inventors: 圭伍河野; 晴彦二田
Original assignee: HU Group Research Institute GK
Current assignee: HU Group Research Institute GK
Priority date: 2020-03-25
Filing date: 2021-03-24
Publication date: 2024-11-11
Anticipated expiration: 2041-03-24
Also published as: WO2021193673A1; JPWO2021193673A1; CN115335681A

Description

The present invention relates to a gate region estimation program and the like for estimating a gate region in flow cytometry.

Flow cytometry (FCM) is a technique that can measure multiple features of each individual cell. In FCM, cells suspended in a flowing liquid are made to pass in single file. Light is directed at each cell as it passes, and the scattering and fluorescence of that light yield indicators such as the cell's size, internal complexity, and constituent substances. In medicine, flow cytometry is used, for example, in cellular immunity testing.

In cellular immunity testing, the multiple index values obtained by flow cytometry are analyzed and returned as test results. One of the analysis techniques is gating. Gating is a technique for selecting and analyzing only a specific population from the obtained data. Conventionally, the population to be analyzed has been specified by a technician drawing an ellipse or polygon (called a "gate") on a two-dimensional scatter plot. Setting such gates depends heavily on the technician's experience and knowledge. It is therefore difficult for a technician with little experience or knowledge to set appropriate gates.

In response, techniques for automating gate setting have been proposed (Patent Documents 1 and 2, among others). However, the conventional techniques set gates using cell density information or rule-based methods, and do not make full use of the experience and knowledge that technicians have accumulated.

Patent Document 1: Japanese Patent No. 6480918
Patent Document 2: Japanese Patent No. 5047803

One conceivable approach is to estimate the gate region with a learning model trained by deep learning, using gate setting data based on technicians' accumulated experience and knowledge as training data. However, gate region estimation by a learning model is not sufficiently accurate.

The present invention has been made in view of these circumstances. Its object is to provide a gate region estimation program and the like that output more accurate estimation results when a gate region is estimated by a learning model.

The gate region estimation program according to the present invention causes a computer to execute processing of: acquiring a scatter plot group including multiple scatter plots obtained from flow cytometry measurements with different measurement items; inputting the acquired scatter plot group into each of multiple learning models trained on training data that includes scatter plot groups and gate regions; and outputting the estimated gate region obtained from each of the multiple learning models.

According to the present invention, ensemble learning using multiple learning models makes it possible to estimate the gate region accurately.

FIG. 1 is an explanatory diagram showing an example of the configuration of a testing system.
FIG. 2 is a block diagram showing an example of the hardware configuration of the processing unit.
FIG. 3 is an explanatory diagram showing an example of the measurement value DB.
FIG. 4 is an explanatory diagram showing an example of the feature information DB.
FIG. 5 is an explanatory diagram showing an example of the gate DB.
FIG. 6 is an explanatory diagram showing an example of the threshold DB.
FIG. 7 is an explanatory diagram showing an example of the confidence level DB.
FIG. 8 is an explanatory diagram of the regression model generation process.
The remaining drawings, listed in order, are as follows.
A flowchart showing an example of the procedure for the regression model generation process.
A flowchart showing an example of the procedure for the threshold determination process.
An explanatory diagram showing an example of gate region estimation results.
An explanatory diagram showing an example of gate region estimation results.
An explanatory diagram showing an example of dispersion.
An explanatory diagram showing an example of dispersion.
An explanatory diagram showing an example of dispersion.
A flowchart showing an example of the procedure for the gate region estimation process.
A flowchart showing an example of the procedure for the confidence level determination process.
An explanatory diagram showing an example of gate region estimation results.
An explanatory diagram showing an example of gate region estimation results.
An explanatory diagram showing an example of dispersion.
An explanatory diagram showing an example of dispersion.
An explanatory diagram showing an example of gate region estimation results.
An explanatory diagram showing an example of the estimation result display screen.
An explanatory diagram showing an example of the estimation result display screen.
An explanatory diagram showing an example of the ID list screen.
A flowchart showing another example of the procedure for the gate region estimation process.
An explanatory diagram showing an example of excluding an outlier gate region.
An explanatory diagram showing an example of excluding an outlier gate region.
An explanatory diagram showing an example of 10 subpopulations.
A flowchart showing an example of the procedure for the gate selection process.
An explanatory diagram showing an example of gate region selection.
An explanatory diagram showing an example of gate region selection.
An explanatory diagram showing an example of gate region selection based on luminance information.
A flowchart showing another example of the procedure for the gate region selection process.
A flowchart showing another example of the procedure for the gate region selection process.

Embodiments are described below with reference to the drawings. The description uses CD45 gating in the Leukemia, Lymphoma Analysis (LLA) test as an example. First, the steps of the LLA test are described. The LLA test broadly comprises five steps: 1. dispensing, 2. pretreatment, 3. measurement and depiction, 4. analysis, and 5. reporting.

In the dispensing step, one sample (hereinafter referred to as an "ID") is divided. In the LLA test, one ID is dispensed into a maximum of 10 aliquots for testing. Each dispensed aliquot is referred to as a SEQ, and the 10 aliquots are referred to as SEQ1, SEQ2, ..., SEQ10. In the pretreatment step, processing common to all SEQs (such as adjusting the cell concentration) is performed, and surface markers are attached to each SEQ individually. SEQ1 is the negative control. A negative control refers either to testing a subject already known to give a negative result under the same conditions as the subject to be verified, or to that subject itself. In the test, the result for the subject to be verified is compared with the result for the negative control, and the test result is analyzed from the relative difference.

In the measurement and depiction step, the 10 SEQs are measured with a flow cytometer to obtain fluorescence values. For each cell in each SEQ, information consisting of five items, including the measured values, is obtained. The items are FSC, SSC, FL1, FL2, and FL3. FSC is the measured value of forward scattered light, that is, scattered light detected in the forward direction along the optical axis of the laser beam. Because FSC is approximately proportional to the surface area or size of the cell, it serves as an index of cell size. SSC is the measured value of side scattered light, that is, light detected at an angle of 90° to the optical axis of the laser beam. Most of the SSC is light scattered by striking substances inside the cell. Because SSC is approximately proportional to the granularity and internal structure of the cell, it serves as an index of those properties. FL denotes fluorescence, but here it refers to the multiple fluorescence detectors of the flow cytometer, and the number is the sequential number of the detector. FL1 denotes the first fluorescence detector, but here it is the name of the item in which the marker information of each SEQ is set. FL2 denotes the second fluorescence detector, but here it is likewise the name of an item in which the marker information of each SEQ is set. FL3 denotes the third fluorescence detector, but here it is the name of the item in which the CD45 marker information is set.

The flow cytometer creates two scatter plots for each SEQ and displays them on a display or the like. For example, one scatter plot has SSC on one axis and FL3 on the other. The other scatter plot has SSC on one axis and FSC on the other.
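The two scatter plots per SEQ can be pictured as small count images built from the per-cell values. The following is a minimal sketch only: the bin count, the 0 to 1024 value range, the random sample data, and the function name `scatter_image` are all assumptions for illustration and do not come from the patent.

```python
import numpy as np

def scatter_image(x, y, bins=64, value_range=(0.0, 1024.0)):
    """Rasterise per-cell measurements into a 2-D count image (hypothetical helper)."""
    counts, _, _ = np.histogram2d(
        x, y, bins=bins, range=[value_range, value_range]
    )
    # histogram2d indexes the result as [x_bin, y_bin];
    # transpose so that rows follow the vertical (y) axis.
    return counts.T

# Synthetic measurements for 1000 cells (assumed data, for illustration only).
rng = np.random.default_rng(0)
cells = {
    "FSC": rng.uniform(0, 1024, 1000),
    "SSC": rng.uniform(0, 1024, 1000),
    "FL3": rng.uniform(0, 1024, 1000),
}

# Scatter plot group for one SEQ: SSC vs. FL3 and SSC vs. FSC.
scatter_group = [
    scatter_image(cells["SSC"], cells["FL3"]),
    scatter_image(cells["SSC"], cells["FSC"]),
]
```

Each image in `scatter_group` holds the number of cells that fall into each bin, so dense populations appear as bright areas.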

In the analysis step, the technician estimates the disease from the appearance of the scatter plots and creates, on each scatter plot, gates that are useful for identifying the disease. Then, for each SEQ, an FL1-FL2 scatter plot consisting only of the cells within the gate range is created and observed as the marker reaction. In the reporting step, two particularly useful gates are chosen for reporting and a report is created.

(Embodiment 1)
FIG. 1 is an explanatory diagram showing an example of the configuration of a testing system. The testing system includes a flow cytometer (gate region estimation device) 10 and a learning server 3. The flow cytometer 10 and the learning server 3 are communicably connected via a network N. The flow cytometer 10 includes a processing unit 1 that performs various processes related to the operation of the entire device, and a measurement unit 2 that receives a sample and performs measurement by flow cytometry.

The learning server 3 is a server computer, a workstation, or the like. The learning server 3 is not an essential component of the testing system. It mainly serves to complement the flow cytometer 10, storing measurement data and learning models as backups. It may also generate and retrain learning models in place of the flow cytometer 10. In that case, the learning server 3 transmits the parameters and other data that characterize the learning model to the flow cytometer 10. The functions of the learning server 3 may be provided as a cloud service.

FIG. 2 is a block diagram showing an example of the hardware configuration of the processing unit. The processing unit 1 includes a control unit 11, a main storage unit 12, an auxiliary storage unit 13, an input unit 14, a display unit 15, a communication unit 16, and a reading unit 17, which are connected by a bus B. The processing unit 1 may be separate from the flow cytometer 10. The processing unit 1 is built from a PC (Personal Computer), a notebook computer, a tablet computer, or the like. The processing unit 1 may also be a multi-computer consisting of multiple computers, a virtual machine constructed virtually by software, or a quantum computer.

The control unit 11 has one or more arithmetic processing devices such as a CPU (Central Processing Unit), an MPU (Micro-Processing Unit), or a GPU (Graphics Processing Unit). The control unit 11 reads and executes an OS (Operating System, not shown) and a control program 1P (the gate region estimation program) stored in the auxiliary storage unit 13, thereby performing the various information processing and control processing of the flow cytometer 10. The control unit 11 also includes functional units such as an acquisition unit and an output unit.

The main storage unit 12 is an SRAM (Static Random Access Memory), a DRAM (Dynamic Random Access Memory), a flash memory, or the like. The main storage unit 12 mainly stores, temporarily, the data that the control unit 11 needs to execute arithmetic processing.

The auxiliary storage unit 13 is a hard disk, an SSD (Solid State Drive), or the like, and stores the control program 1P and the various DBs (databases) that the control unit 11 needs to execute processing. The auxiliary storage unit 13 stores a measurement value DB 131, a feature information DB 132, a gate DB 133, a first regression model 1341 through a fifth regression model 1345, a threshold DB 135, and a confidence level DB 136. The auxiliary storage unit 13 may be an external storage device connected to the flow cytometer 10. The various DBs and other data held in the auxiliary storage unit 13 may instead be stored on a database server or in cloud storage connected via the network N.

In this embodiment, ensemble learning using multiple learning models is performed. The outputs of the multiple learning models are used to obtain a confidence level for the gate region estimation result. This embodiment uses five learning models for the ensemble, the first regression model 1341 through the fifth regression model 1345, but the number is not limited to five. There may be two to four learning models, or six or more.
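As a rough illustration of how the outputs of several models can be gathered and compared, the sketch below runs each model on the same input and summarises every gate item. It is not the patented procedure: using the population standard deviation as the dispersion measure and the mean as a combined value are assumptions made here, and `make_stub_model` is a hypothetical stand-in for a trained regression model.

```python
import statistics

GATE_ITEMS = ("cx", "cy", "dx", "dy", "theta")

def ensemble_estimate(models, scatter_group):
    """Run every model on the same scatter plot group and summarise each gate item."""
    outputs = [model(scatter_group) for model in models]
    summary = {}
    for item in GATE_ITEMS:
        values = [out[item] for out in outputs]
        summary[item] = {
            "values": values,
            "mean": statistics.fmean(values),         # assumed combination rule
            "dispersion": statistics.pstdev(values),  # assumed dispersion measure
        }
    return summary

def make_stub_model(offset):
    """Hypothetical stand-in for one trained regression model."""
    def model(_scatter_group):
        return {"cx": 100.0 + offset, "cy": 200.0 - offset,
                "dx": 50.0, "dy": 30.0, "theta": 10.0}
    return model

# Five stub models, mirroring the five regression models of the embodiment.
models = [make_stub_model(o) for o in (-2.0, -1.0, 0.0, 1.0, 2.0)]
result = ensemble_estimate(models, scatter_group=None)
```

When the models agree on an item, its dispersion is small, which is the property a confidence level can be derived from.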

The input unit 14 is an input device such as a keyboard or a mouse. The display unit 15 includes a liquid crystal display panel or the like, and displays various information such as information for performing measurements, measurement results, and gate information. The display unit 15 may be a touch panel display integrated with the input unit 14. The information shown on the display unit 15 may instead be shown on a display device external to the flow cytometer 10.

The communication unit 16 communicates with the learning server 3 via the network N. The control unit 11 may also use the communication unit 16 to download the control program 1P from another computer via the network N or the like and store it in the auxiliary storage unit 13.

The reading unit 17 reads a portable storage medium 1a, such as a CD (Compact Disc)-ROM or a DVD (Digital Versatile Disc)-ROM. The control unit 11 may read the control program 1P from the portable storage medium 1a via the reading unit 17 and store it in the auxiliary storage unit 13. The control unit 11 may also download the control program 1P from another computer via the network N or the like and store it in the auxiliary storage unit 13. Furthermore, the control unit 11 may read the control program 1P from a semiconductor memory 1b.

The databases stored in the auxiliary storage unit 13 are described next. FIG. 3 is an explanatory diagram showing an example of the measurement value DB. The measurement value DB 131 stores the values measured by the flow cytometer 10. FIG. 3 shows an example of one record stored in the measurement value DB 131. Each record in the measurement value DB 131 includes a basic section 1311 and a data section 1312. The basic section 1311 includes a reception number column, a reception date column, a test number column, a test date column, a medical record number column, a name column, a sex column, an age column, and a collection date column. The reception number column stores the reception number (identification information) issued when a test request is received. The reception date column stores the date on which the test request was received. The test number column stores the test number issued when the test is performed. The test date column stores the date on which the test was performed. The medical record number column stores the number of the medical record corresponding to the test request. The name column stores the name of the subject who provided the sample. The sex column stores the sex of the subject: for example, M if the subject is male and F if the subject is female. The age column stores the age of the subject. The collection date column stores the date on which the sample was collected from the subject. In the data section 1312, each column stores the per-cell measured values for one measurement item, and each row stores the measured values of every measurement item for one cell.
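The record layout described above, a basic section plus a cell-by-item data section, can be sketched as a pair of data classes. The class and field names are hypothetical, and taking the five items FSC, SSC, FL1, FL2, and FL3 as the columns of the data section is an assumption based on the measurement items named earlier in the description.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class BasicSection:
    """Basic section 1311 (field names are illustrative)."""
    reception_number: str
    reception_date: date
    test_number: str
    test_date: date
    medical_record_number: str
    name: str
    sex: str  # "M" or "F"
    age: int
    collection_date: date

@dataclass
class MeasurementRecord:
    """One record: basic section plus data section 1312."""
    basic: BasicSection
    # Columns of the data section (assumed measurement items).
    items: tuple = ("FSC", "SSC", "FL1", "FL2", "FL3")
    # One row per cell, one value per measurement item.
    rows: list = field(default_factory=list)

    def add_cell(self, *values):
        if len(values) != len(self.items):
            raise ValueError("one value per measurement item is required")
        self.rows.append(tuple(float(v) for v in values))

    def column(self, item):
        """Return the per-cell values of one measurement item."""
        idx = self.items.index(item)
        return [row[idx] for row in self.rows]

# Illustrative record with made-up values.
record = MeasurementRecord(
    basic=BasicSection("R001", date(2021, 3, 24), "T001", date(2021, 3, 25),
                       "K123", "Example Subject", "M", 40, date(2021, 3, 23))
)
record.add_cell(512, 80, 10, 20, 700)
record.add_cell(300, 300, 15, 25, 400)
```

Reading a column, for example `record.column("SSC")`, gives the values that would be plotted along one axis of a scatter plot.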

FIG. 4 is an explanatory diagram showing an example of the feature information DB. The feature information DB 132 stores information indicating features obtained from the measured values (hereinafter also called "feature information"). Feature information is, for example, a scatter plot or a histogram. The feature information DB 132 includes a reception number column, a test number column, a sequence number column, a type column, a horizontal axis column, a vertical axis column, and an image column. The reception number column stores the reception number. The test number column stores the test number. The sequence number column stores the sequence number of the feature information within the same test. The type column stores the type of feature information, for example scatter plot or histogram as mentioned above. The horizontal axis column stores the item used as the horizontal axis of the scatter plot or histogram. The vertical axis column stores the item used as the vertical axis of the scatter plot. For a histogram, the vertical axis is the cell count, so the vertical axis column stores "cell count". The image column stores the scatter plot or histogram as an image.

FIG. 5 is an explanatory diagram showing an example of the gate DB. The gate DB 133 stores information on the gates set for a scatter plot (gate information). Gate information is the information that determines a gate region. It may be information on a figure representing the outline of the gate region, the value range of the measured values included in the gate region, the set of measured values included in the gate region, and so on. It may also be the pixel coordinate values of the points included in the gate region on the scatter plot image. Here, the gate information is a figure representing the outline of the gate region, and its shape is an ellipse, although the shape is not limited to this. The figure may be a polygon made up of multiple sides or a figure formed by connecting multiple curves. The gate DB 133 includes a reception number column, a test number column, a horizontal axis column, a vertical axis column, a gate number column, a CX column, a CY column, a DX column, a DY column, and a θ column. The reception number column stores the reception number. The test number column stores the test number. The horizontal axis column stores the item used as the horizontal axis of the scatter plot. The vertical axis column stores the item used as the vertical axis of the scatter plot. The gate number column stores the sequence number of the gate. The CX column stores the x-coordinate of the center of the ellipse. The CY column stores the y-coordinate of the center of the ellipse. The DX column stores the major axis of the ellipse. The DY column stores the minor axis of the ellipse. The θ column stores the inclination angle of the ellipse, for example the angle between the horizontal axis and the major axis of the ellipse. If polygons can be set as the gate shape, the gate DB 133 stores the sequence of coordinates of the points that form the polygon.
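The ellipse parameters in the gate DB are enough to decide whether a given cell lies inside a gate, which is what allows the FL1-FL2 values of only the gated cells to be extracted in the analysis step. The sketch below is one way to do this under stated assumptions: DX and DY are taken to be full axis lengths (so they are halved), θ is taken to be in degrees measured from the horizontal axis, and the function names and sample cells are hypothetical.

```python
import math

def in_ellipse_gate(x, y, cx, cy, dx, dy, theta_deg):
    """Return True if point (x, y) lies inside or on the ellipse gate."""
    t = math.radians(theta_deg)
    # Move the origin to the ellipse centre, then rotate by -theta
    # so that the major axis lies along the u direction.
    u = (x - cx) * math.cos(t) + (y - cy) * math.sin(t)
    v = -(x - cx) * math.sin(t) + (y - cy) * math.cos(t)
    a, b = dx / 2.0, dy / 2.0  # assumption: DX, DY are full axis lengths
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def gate_cells(cells, gate, x_item, y_item):
    """Keep only the cells whose (x_item, y_item) values fall inside the gate."""
    return [c for c in cells if in_ellipse_gate(c[x_item], c[y_item], **gate)]

# A gate on the SSC (horizontal) vs. FL3 (vertical) scatter plot,
# rotated 90 degrees so that its major axis is vertical.
gate = {"cx": 100.0, "cy": 100.0, "dx": 40.0, "dy": 20.0, "theta_deg": 90.0}
cells = [
    {"SSC": 100.0, "FL3": 118.0, "FL1": 1.0, "FL2": 2.0},  # inside
    {"SSC": 118.0, "FL3": 100.0, "FL1": 3.0, "FL2": 4.0},  # outside
]
gated = gate_cells(cells, gate, "SSC", "FL3")
fl1_fl2 = [(c["FL1"], c["FL2"]) for c in gated]
```

Here `fl1_fl2` contains the FL1 and FL2 values of the gated cells only, which is the data an FL1-FL2 scatter plot for that gate would be drawn from.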

FIG. 6 is an explanatory diagram showing an example of the threshold DB. The threshold DB 135 stores thresholds for an index (dispersion) that indicates the degree of variation in each item value describing the gate region. These thresholds are used when determining the confidence level of the regression models. The example in FIG. 6 is for an elliptical gate region. The threshold DB 135 includes an ID column, a horizontal axis column, a vertical axis column, a CX column, a CY column, a DX column, and a DY column. The ID column stores an ID that identifies a group of thresholds. The horizontal axis column stores the item used as the horizontal axis of the scatter plot. The vertical axis column stores the item used as the vertical axis of the scatter plot. The CX column stores the thresholds for the x-coordinate of the center of the ellipse. The CY column stores the thresholds for the y-coordinate of the center of the ellipse. The DX column stores the thresholds for the major axis of the ellipse. The DY column stores the thresholds for the minor axis of the ellipse. The CX, CY, DX, and DY columns each include an A column and a B column. The A column stores threshold A, and the B column stores threshold B. A "-" in the B column indicates that no value is set. If only threshold A is set, the confidence level of the regression models is either high or low. If threshold B is also set, the confidence level is expressed as a number. For example, the confidence level is 50 if the dispersion is smaller than threshold A, and 70 if it is also smaller than threshold B. There may be three or more thresholds.
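The threshold rule can be written as a small function. This is a sketch under assumptions: it treats a dispersion below threshold A as "high" when only threshold A is set, it assumes threshold B is smaller than threshold A, and the value 0 returned when the dispersion is not below threshold A in the numeric case is invented here, since the text gives no value for that case.

```python
def confidence_level(dispersion, threshold_a, threshold_b=None):
    """Map a dispersion value to a confidence level (illustrative rule)."""
    if threshold_b is None:
        # Only threshold A is set: two-valued confidence.
        return "high" if dispersion < threshold_a else "low"
    # Threshold B is set (assumed smaller than A): numeric confidence.
    if dispersion < threshold_b:
        return 70
    if dispersion < threshold_a:
        return 50
    return 0  # assumption: not specified in the source text

print(confidence_level(1.5, threshold_a=2.0))                   # high
print(confidence_level(1.5, threshold_a=2.0, threshold_b=1.0))  # 50
print(confidence_level(0.5, threshold_a=2.0, threshold_b=1.0))  # 70
```

The same function would be applied separately to the CX, CY, DX, and DY items, each with its own pair of thresholds.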

FIG. 7 is an explanatory diagram showing an example of the confidence level DB. The confidence level DB 136 stores the confidence level of the gate region estimation results produced by the regression models. The confidence level DB 136 includes a reception number column, a test number column, a gate number column, a CX column, a CY column, a DX column, a DY column, a whole-gate column, and an overall column. The reception number column stores the reception number. The test number column stores the test number. The gate number column stores the sequence number of the gate. The reception number, test number, and gate number columns make it possible to associate each entry with the gate DB 133. The CX column stores the confidence level of the x-coordinate of the center of the ellipse. The CY column stores the confidence level of the y-coordinate of the center of the ellipse. The DX column stores the confidence level of the major axis length of the ellipse. The DY column stores the confidence level of the minor axis length of the ellipse. The whole-gate column stores the confidence level for each gate. The overall column stores the confidence level for each test. In the example in FIG. 7, the confidence level values are high or low.

Next, the preparation step is described. The preparation step is carried out before actual operation begins. FIG. 8 is an explanatory diagram of the regression model generation process. Five learning models, the first regression model 1341 through the fifth regression model 1345, are generated. FIG. 8 shows the process of generating them by machine learning. The basic processing is the same for all of the learning models, so it is described here using the first regression model 1341 as a representative.

In the flow cytometer 10 of this embodiment, the processing unit 1 performs deep learning to learn the features of appropriate gates for scatter plot images created from the measurement results obtained by the measurement unit 2, thereby generating a first regression model 1341 that takes multiple scatter plot images (a scatter plot group) as input and outputs gate information. The multiple scatter plot images are scatter plot images that differ in the item of at least one axis. An example is a pair consisting of a scatter plot image with SSC on the horizontal axis and FL3 on the vertical axis and a scatter plot image with SSC on the horizontal axis and FSC on the vertical axis. Three or more scatter plot images may be input. The neural network is, for example, a CNN (Convolutional Neural Network). The first regression model 1341 has multiple feature extractors, each of which learns the features of one scatter plot image; a combiner that combines the features output by the feature extractors; and multiple estimators that, based on the combined features, estimate and output the individual items of the gate information (center X coordinate, center Y coordinate, major axis, minor axis, and inclination angle). Instead of scatter plot images, the set of measured values underlying the scatter plots may be input to the first regression model 1341.
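The structure described above, one feature extractor per scatter plot image, a combiner, and one estimator per gate item, can be sketched as a forward pass in plain NumPy. This is an untrained toy, not the model of the embodiment: the single 3x3 convolution per branch, the 2x2 max pooling, the linear estimators, the 16x16 image size, and all names are assumptions chosen to keep the example short.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 2-D cross-correlation without padding."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(x):
    """2x2 max pooling; odd trailing rows or columns are dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def extract_features(image, kernel):
    """One feature extractor: convolution, ReLU, pooling, flatten."""
    return max_pool2x2(np.maximum(conv2d_valid(image, kernel), 0.0)).ravel()

class MultiInputGateRegressor:
    """Toy multi-branch regressor with random (untrained) weights."""
    ITEMS = ("cx", "cy", "dx", "dy", "theta")

    def __init__(self, n_images, image_size, rng):
        # One convolution kernel per input scatter plot image.
        self.kernels = [rng.normal(size=(3, 3)) for _ in range(n_images)]
        pooled = (image_size - 2) // 2
        n_features = n_images * pooled * pooled
        # One linear estimator per gate information item.
        self.heads = {item: rng.normal(size=n_features) * 0.01
                      for item in self.ITEMS}

    def predict(self, images):
        feats = [extract_features(img, k)
                 for img, k in zip(images, self.kernels)]
        combined = np.concatenate(feats)  # the combiner
        return {item: float(w @ combined) for item, w in self.heads.items()}

rng = np.random.default_rng(0)
model = MultiInputGateRegressor(n_images=2, image_size=16, rng=rng)
images = [rng.random((16, 16)), rng.random((16, 16))]
gate = model.predict(images)
```

The returned dictionary has one value per gate item. With trained weights in place of the random ones, those values would be the estimated ellipse parameters.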

Each feature extractor includes an input layer and an intermediate layer. The input layer has multiple neurons that accept the pixel value of each pixel in the scatter plot image and passes the input pixel values to the intermediate layer. The intermediate layer has multiple neurons, extracts features from the scatter plot image, and passes them to the output layer. For example, when the feature extractor is a CNN, the intermediate layer consists of alternating convolution layers, which convolve the pixel values received from the input layer, and pooling layers, which map the convolved pixel values. The intermediate layer compresses the pixel information and finally extracts the image features. Instead of providing one feature extractor per scatter plot image, a configuration in which multiple scatter plot images are input to a single feature extractor may be used.

In this embodiment, the first regression model 1341 is described as a CNN. However, the first regression model 1341 is not limited to a CNN and may be a trained model built with another learning algorithm, such as a neural network other than a CNN, a Bayesian network, or a decision tree.

The processing unit 1 performs learning using training data in which multiple scatter plot images are associated with the correct values of the gate information for those scatter plots. For example, as shown in FIG. 8, the training data consists of multiple scatter plot images labeled with gate information. For simplicity, two types of scatter plots are treated here as one set of scatter plots. The description also assumes that one gate is provided for one set of scatter plots, but multiple gates may be provided. In that case, the gate information includes a value indicating usefulness.

The processing unit 1 inputs the two scatter plot images of the training data into separate feature extractors. The features output by the feature extractors are combined by the combiner. Methods of combination include simply joining the features (Concatenate), adding the values that represent the features (Add), and selecting the largest of the features (Maxpool).
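The three combination methods named above can be illustrated on plain feature vectors. The following is a minimal sketch, not the patent's implementation: the function names are hypothetical, features are assumed to be flat lists of numbers, and Maxpool is interpreted here as an element-wise maximum, which is one plausible reading of "selecting the largest of the features".

```python
from typing import List, Sequence


def concatenate(features: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate: join the feature vectors end to end."""
    return [value for vector in features for value in vector]


def _check_same_length(features: Sequence[Sequence[float]]) -> None:
    """Add and Maxpool need vectors of identical length."""
    if len({len(vector) for vector in features}) > 1:
        raise ValueError("feature vectors must have the same length")


def add(features: Sequence[Sequence[float]]) -> List[float]:
    """Add: element-wise sum of the feature vectors."""
    _check_same_length(features)
    return [sum(values) for values in zip(*features)]


def maxpool(features: Sequence[Sequence[float]]) -> List[float]:
    """Maxpool (assumed element-wise): keep the largest value per position."""
    _check_same_length(features)
    return [max(values) for values in zip(*features)]
```

Concatenate preserves every extractor's output at the cost of a longer vector, whereas Add and Maxpool keep the vector length fixed regardless of how many scatter plots are input.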

Based on the combined features, each estimator outputs an item of gate information as its estimation result. The combination of the values output by the estimators forms one set of gate information. Multiple sets of gate information may be output, in which case a corresponding number of estimators is provided. For example, to output gate information with the first priority and gate information with the second priority, the number of estimators in FIG. 8 increases from 5 to 10.

The processing unit 1 compares the gate information obtained from the estimators with the information with which the scatter plot images are labeled in the training data, that is, the correct values. It then optimizes the parameters used in the computations of the feature extractors and estimators so that the output values of the estimators approach the correct values. These parameters are, for example, the weights (coupling coefficients) between neurons and the coefficients of the activation functions used in each neuron. The optimization method is not particularly limited; for example, the processing unit 1 optimizes the parameters using backpropagation. The processing unit 1 performs the above processing on the data for each test included in the training data and generates the first regression model 1341.

Next, the processing performed by the control unit 11 of the processing unit 1 will be described. FIG. 9 is a flowchart showing an example of the procedure of the regression model generation process. The control unit 11 acquires the test history (step S1). The test history is the accumulation of past test results, namely the past measurement values stored in the measurement value DB 131. The control unit 11 selects one history entry to be processed (step S2). The control unit 11 acquires the feature information corresponding to the selected entry (step S3). The feature information is, for example, a scatter plot, and is acquired from the feature information DB 132. If the feature information is not stored, it may be generated from the measurement values. The control unit 11 acquires the gate information corresponding to the selected entry (step S4). The gate information is acquired from the gate DB 133. The control unit 11 trains the first regression model 1341 using the acquired feature information and gate information as training data (step S5). The control unit 11 determines whether any unprocessed test history remains (step S6). If so (YES in step S6), the control unit 11 returns to step S2 and processes the unprocessed test history. If not (NO in step S6), the control unit 11 stores the first regression model 1341 (step S7) and ends the process.
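The data-gathering part of FIG. 9 (steps S2 to S4 and the loop in S6) can be sketched as follows. This is an illustrative outline under stated assumptions: the three databases are modeled as plain dictionaries keyed by a test ID, and `make_scatter` is a hypothetical callback that builds feature information from raw measurement values when none is stored. The actual training in step S5 is outside the scope of this sketch.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def build_training_pairs(
    history_ids: Iterable[str],
    measurement_db: Dict[str, Any],   # stands in for measurement value DB 131
    feature_db: Dict[str, Any],       # stands in for feature information DB 132
    gate_db: Dict[str, Any],          # stands in for gate DB 133
    make_scatter: Callable[[Any], Any],
) -> List[Tuple[Any, Any]]:
    """Pair feature information with its gate label for every history entry."""
    pairs = []
    for test_id in history_ids:                      # S2, repeated via S6
        features = feature_db.get(test_id)           # S3
        if features is None:
            # Feature information not stored: generate it from measurements.
            features = make_scatter(measurement_db[test_id])
        gate = gate_db[test_id]                      # S4
        pairs.append((features, gate))               # handed to training in S5
    return pairs
```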

The second regression model 1342, the third regression model 1343, the fourth regression model 1344, and the fifth regression model 1345 are generated by processing similar to that described with reference to FIG. 8 and FIG. 9. However, the first regression model 1341 through the fifth regression model 1345 are generated under different conditions, for example by varying the training data, the network structure, or the hyperparameters. For the training data, the amount of data is increased by data augmentation or by sampling with replacement as used in the bootstrap method, so that the five learning models have different training data. For the network structure, the number of input layers and output layers is changed. Fine-tuning may also be used to generate a different learning model from an existing one. For the hyperparameters, settings such as the number of intermediate layers, the number of nodes in each layer, the weights, the loss function, the optimization function, the learning rate, and the batch size are varied.
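One way to give each of the five models different training data is sampling with replacement, as in the bootstrap method mentioned above. The sketch below is a minimal illustration; the function names, the fixed seed, and the choice to draw as many samples as the original data set contains are all assumptions rather than details from the source.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_sample(data: Sequence[T], rng: random.Random) -> List[T]:
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in range(len(data))]


def make_training_sets(data: Sequence[T], n_models: int = 5, seed: int = 0) -> List[List[T]]:
    """Build one resampled training set per regression model."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return [bootstrap_sample(data, rng) for _ in range(n_models)]
```

Because each draw is independent, some records appear several times in a given set and others not at all, which is what makes the resulting models differ.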

After the first regression model 1341 through the fifth regression model 1345 have been generated, the thresholds for judging the confidence level are determined. FIG. 10 is a flowchart showing an example of the procedure of the threshold determination process. The control unit 11 acquires test data (step S11). The control unit 11 inputs the acquired test data to each regression model (step S12). The control unit 11 acquires an estimated output from each regression model (step S13). The estimated output consists of the values of the parameters that describe the gate region estimated by each regression model. When the gate region is an ellipse, the parameters are the center coordinates (Cx, Cy), the lengths of the semi-major and semi-minor axes (Dx, Dy), and the angle (θ) between the semi-major axis and the x-axis. When the gate region is a polygon, the parameters are the coordinates of each vertex. The control unit 11 calculates the degree of dispersion for each parameter from the values output by the regression models (step S14). One example of the degree of dispersion is the standard deviation. The control unit 11 determines whether any unprocessed test data remains (step S15). If so (YES in step S15), the control unit 11 returns to step S11 and processes the unprocessed test data. If not (NO in step S15), the control unit 11 determines a threshold for each parameter (step S16). The threshold is the limit of acceptable variation in the output values. The threshold is determined by a statistical method. Alternatively, it may be decided by, for example, an experienced technician on the basis of the dispersion values for each item of test data. The control unit 11 stores the determined thresholds (step S17) and ends the threshold determination process. The thresholds may be adjusted for each environment in which the flow cytometer 10 operates, for example for each testing institution. Besides the standard deviation, the degree of dispersion may be the variance, the unbiased variance, or the mean deviation.
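Steps S14 and S16 can be sketched as follows. The per-parameter dispersion is computed as the standard deviation across the outputs of the regression models. The source leaves the "statistical method" for choosing thresholds unspecified, so the rule used here (mean of the per-test SDs plus `k` times their standard deviation) is purely a hypothetical placeholder, as are the function names. The population standard deviation is used, and the inclination angle is left out of `PARAMS`; both are assumptions of this sketch.

```python
import statistics
from typing import Dict, List, Sequence

PARAMS = ("Cx", "Cy", "Dx", "Dy")  # ellipse parameters considered in this sketch


def dispersion(model_outputs: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """S14: SD of each parameter across the outputs of all regression models."""
    return {
        p: statistics.pstdev([output[p] for output in model_outputs])
        for p in PARAMS
    }


def decide_thresholds(per_test_sd: List[Dict[str, float]], k: float = 2.0) -> Dict[str, float]:
    """S16 (hypothetical rule): mean + k * SD of the dispersions over all test data."""
    thresholds = {}
    for p in PARAMS:
        values = [sd[p] for sd in per_test_sd]
        thresholds[p] = statistics.mean(values) + k * statistics.pstdev(values)
    return thresholds
```

In practice the rule in `decide_thresholds` would be replaced by whatever statistical criterion, or expert judgment, the operating site adopts.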

A specific example of the threshold determination process will now be described. FIG. 11A and FIG. 11B are explanatory diagrams showing an example of estimated gate region outputs. FIG. 11 is an example of a scatter plot in CD45 gating. FIG. 11A shows an example of an input scatter plot. In FIG. 11A, the horizontal axis is SSC (Side Scattered Light) and the vertical axis is FL3 (FL stands for fluorescence and denotes a fluorescence detector; 3 means the third). FIG. 11B shows the gate region estimated for the input scatter plot by each of the five regression models. The axes in FIG. 11B are the same as in FIG. 11A. The example shown estimates one elliptical gate region. In FIG. 11B, the gate region is overlaid on the input scatter plot. AI-1 indicates the estimation result of the first regression model 1341, AI-2 that of the second regression model 1342, and so on, with AI-5 indicating that of the fifth regression model 1345. The parameters of the ellipse that forms the gate region are listed below each scatter plot. From top to bottom, they are the center X coordinate (Cx), center Y coordinate (Cy), major axis length (Dx), minor axis length (Dy), and inclination angle (θ).

FIG. 12 is an explanatory diagram showing an example of the degree of dispersion. The degree of dispersion here is the standard deviation (SD). The table on the left of FIG. 12 reproduces the values shown in FIG. 11B. The inclination angle is excluded from the SD calculation because a large SD in the angle does not necessarily affect the confidence of the estimation result, for example when the gate region is a perfect circle. The SD thresholds are determined from the SDs calculated for the results on multiple items of test data.

FIG. 13A and FIG. 13B are explanatory diagrams showing examples of the degree of dispersion. They show the estimation results for two different items of test data. In both FIG. 13A and FIG. 13B, the horizontal axis is SSC and the vertical axis is FL3. In each figure, the gate regions output by the five regression models are overlaid on the input scatter plot. The numbers to the right of each scatter plot are the SDs of the center coordinates and of the major and minor axis lengths that specify the ellipse. FIG. 13A is an example in which the estimation results vary little, and FIG. 13B is an example in which they vary widely. Judging from FIG. 13A and FIG. 13B, it appears reasonable to set the thresholds at 5.6 or more for Cx, between 10.9 and 36.8 for Cy, 12.3 or more for Dx, and 6.4 or more for Dy. It is desirable to determine the final threshold for each parameter by considering the SDs for other data as well, not only these two items of test data. Once the threshold for each parameter has been determined, the preparation process is complete. The threshold determination process above was described on the assumption that each regression model outputs one gate as its estimate of the gate region, but multiple gates may be output. In that case, thresholds are determined for each gate. When a first gate, a second gate, and a third gate are output as gate regions, the degree of dispersion is calculated for the first gate of each regression model, and the thresholds are determined.

Next, the operation process will be described. In the following description, the degree of dispersion is the standard deviation (SD). When the SD is equal to or less than the threshold, the confidence level is high. When the SD exceeds the threshold, the confidence level is low. Each regression model is assumed to output estimation results for multiple gate regions.

FIG. 14 is a flowchart showing an example of the procedure of the gate region estimation process. The control unit 11 acquires a scatter plot (step S31). The scatter plot here consists of the sequence of coordinates of the points representing the measurement results, together with the measurement item on the horizontal axis and the measurement item on the vertical axis. The control unit 11 inputs the acquired scatter plot to each regression model (step S32). The control unit 11 acquires the estimated gate region output by each regression model (step S33). The control unit 11 calculates the degree of dispersion, here the standard deviation, for each gate and each parameter from the outputs of the regression models (step S34). The control unit 11 judges the confidence level (step S35). The control unit 11 stores the results (step S36): the gate region estimation results are stored in the gate DB 133, and the confidence level is stored in the confidence DB 136. The control unit 11 then ends the gate region estimation process.

FIG. 15 is a flowchart showing an example of the procedure of the confidence level determination process, which corresponds to step S35 in FIG. 14. The control unit 11 selects the gate region to be processed (step S51). The control unit 11 selects the parameter to be processed (a variable such as Cx, Cy, Dx, or Dy) (step S52). The control unit 11 determines whether the standard deviation of the parameter is equal to or less than the threshold (step S53). If it is (YES in step S53), the control unit 11 determines whether all parameters have been processed (step S54). If not all parameters have been processed (NO in step S54), the control unit 11 returns to step S52 and processes the next unprocessed parameter. If all parameters have been processed (YES in step S54), the control unit 11 stores in a temporary storage area that the confidence level of the gate being processed is high (step S55). The temporary storage area is provided in the main storage unit 12 or the auxiliary storage unit 13. If the standard deviation of the parameter exceeds the threshold (NO in step S53), the control unit 11 stores in the temporary storage area that the confidence level of the gate being processed is low (step S56). The control unit 11 determines whether all gates have been processed (step S57). If not (NO in step S57), the control unit 11 returns to step S51 and processes the next unprocessed gate. If all gates have been processed (YES in step S57), the control unit 11 refers to the temporary storage area and determines whether the confidence levels of all gates are high (step S58). If they are (YES in step S58), the control unit 11 stores in the temporary storage area that the confidence level for the gate region estimation results of the scatter plot being processed (the overall confidence level) is high (step S59). If the confidence level of at least one gate is low (NO in step S58), the control unit 11 stores in the temporary storage area that the overall confidence level of the estimation results for the scatter plot being processed is low (step S60). The control unit 11 then returns control to the caller.
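The decision logic of FIG. 15 reduces to two nested checks, sketched below. The function names and the "high"/"low" string labels are illustrative assumptions; the inputs are the per-parameter SDs of each gate and the per-parameter thresholds determined in the preparation process.

```python
from typing import Dict, List, Tuple


def gate_confidence(sd: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """S52-S56: a gate is 'low' as soon as one parameter's SD exceeds its threshold."""
    for param, limit in thresholds.items():
        if sd[param] > limit:
            return "low"      # S56
    return "high"             # S55: every parameter was within its threshold


def overall_confidence(
    gate_sds: List[Dict[str, float]], thresholds: Dict[str, float]
) -> Tuple[List[str], str]:
    """S51-S60: judge every gate, then derive the overall confidence level."""
    per_gate = [gate_confidence(sd, thresholds) for sd in gate_sds]
    overall = "high" if all(c == "high" for c in per_gate) else "low"  # S58-S60
    return per_gate, overall
```

As a usage illustration with made-up SD values and a uniform threshold of 20 for every parameter, three gates of which only the third has a Cx SD above 20 yield per-gate results of high, high, and low, and therefore an overall confidence level of low.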

A specific example of the gate region estimation process will now be described. FIG. 16A and FIG. 16B are explanatory diagrams showing an example of gate region estimation results. Like FIG. 11, FIG. 16 is an example of a scatter plot in CD45 gating. FIG. 16A is an example of an input scatter plot, and FIG. 16B shows the gate region estimated for the input scatter plot by each of the five regression models. In FIG. 16A and FIG. 16B, the horizontal axis is SSC and the vertical axis is FL3. One of the gate regions obtained as estimation results is displayed here. In FIG. 16B, the gate region is overlaid on the input scatter plot. AI-1 indicates the estimation result of the first regression model 1341, AI-2 that of the second regression model 1342, and so on, with AI-5 indicating that of the fifth regression model 1345. The parameters of the ellipse that forms the gate region are listed below each scatter plot. From top to bottom, they are the center X coordinate (Cx), center Y coordinate (Cy), major axis length (Dx), minor axis length (Dy), and inclination angle (θ).

FIG. 17A and FIG. 17B are explanatory diagrams showing examples of the degree of dispersion. They show the estimation results for two different input scatter plots. In both FIG. 17A and FIG. 17B, the horizontal axis is SSC and the vertical axis is FL3. FIG. 17A combines the five plots of FIG. 16B into a single figure; that is, it overlays the gate regions output by the five regression models on the input scatter plot. FIG. 17B is drawn in the same way. The numbers to the right of each scatter plot are the SDs of the center coordinates and of the major and minor axis lengths that specify the ellipse. FIG. 17A is an example in which the estimation results vary little, and FIG. 17B is an example in which they vary widely. Suppose the confidence level is high when the SDs of Cx, Cy, Dx, and Dy are all 20 or less, and low when even one of them exceeds 20. All SDs in FIG. 17A are 20 or less, so its confidence level is judged to be high. In FIG. 17B, the SDs of Cx and Dx exceed 20, so its confidence level is judged to be low.

Next, an example of judging the confidence level when multiple gate regions are estimated is shown. FIG. 18 is an explanatory diagram showing an example of gate region estimation results. FIG. 18 shows the estimation results for three gate regions, gates G1 to G3. For each gate region, the gate regions output by the five regression models are overlaid on the scatter plot. The table below the scatter plot shows the SD of each parameter. For gates G1 and G2, all SDs are 20 or less, so their confidence levels are judged to be high. For gate G3, the SD of Cx exceeds 20, so its confidence level is judged to be low. When multiple gate regions are estimated, the overall confidence level is high if the confidence levels of all gate region estimates are high, and low if the confidence level of even one gate region estimate is low. By this definition, the estimation results shown in FIG. 18 are judged to have a low confidence level overall.

Next, the on-screen display of the gate region estimation results will be described. FIG. 19A and FIG. 19B are explanatory diagrams showing examples of the estimation result display screen. FIG. 19A is an example of the screen when the confidence level is high, and FIG. 19B is an example when it is low. The estimation result display screen includes a scatter plot 191, a confidence level 192, and a confidence level icon 193. The scatter plot 191 shows the estimated gate region on the scatter plot. The gate region displayed is one region selected by a predetermined algorithm from the five estimated regions output by the five regression models. The confidence level 192 shows the confidence level for the judgment result as a whole. In FIG. 19, a high confidence level is displayed as High and a low confidence level as Low. The confidence level icon 193 represents the confidence level as a face icon: a smiling face when the confidence level is high and a troubled face when it is low. In the scatter plot 191, all five estimated regions output by the five regression models may be displayed, as in FIG. 13 and other figures.

FIG. 20 is an explanatory diagram showing an example of the ID list screen. The ID list screen lists the ID assigned to each test together with the confidence level of its gate region estimation result. The ID list screen includes an ID display 201 and a confidence level display 202. The ID display 201 shows, for example, a reception number. The confidence level display 202 shows, for example, "A" when the confidence level is high and "a" when it is low. Selecting an ID in the ID display 201 opens the estimation result display screen shown in FIG. 19.

In this embodiment, the gate region estimation result is output together with a confidence level. This makes it possible to tailor operation to the on-site environment. For example, by referring to the confidence level display 202, results marked "A" can be given priority for careful checking by highly skilled technicians, while results marked "a", which are more likely to be wrong, can be analyzed with extra time.

In this embodiment, one scatter plot is input to each regression model, but two or more scatter plots may be input. The scatter plot is also not limited to two dimensions and may have three or more dimensions.

The degree of dispersion has been described as the standard deviation of the parameters of the figure representing the gate region (for an ellipse, the standard deviations of the center coordinates and of the semi-major axis length), but it is not limited to this. The degree of dispersion may instead be based on the areas of the gate regions estimated by the five regression models. For example, when the five estimated gate regions are superimposed on the scatter plot, the area of the region that contains all five regions and the area of the region where all five overlap are calculated, and the ratio of the latter to the former is taken as the degree of dispersion. In this case, a smaller value indicates greater variation. The maximum value is 1, which occurs when all five regions coincide.
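The area-based measure described above can be approximated by rasterizing the estimated ellipses on a pixel grid. The sketch below is one possible illustration, not the method of the source: it assumes the "region that contains all five regions" is their union, that Dx and Dy are given as semi-axis lengths, that θ is in degrees, and that plot coordinates run from 0 to `size` (256 is an arbitrary choice). All names are hypothetical.

```python
import math
from typing import Sequence, Tuple

# (Cx, Cy, semi-major length, semi-minor length, inclination angle in degrees)
Ellipse = Tuple[float, float, float, float, float]


def inside(e: Ellipse, x: float, y: float) -> bool:
    """True if point (x, y) lies within the rotated ellipse e."""
    cx, cy, a, b, theta = e
    t = math.radians(theta)
    dx, dy = x - cx, y - cy
    # Rotate the point into the ellipse's own axis-aligned frame.
    u = dx * math.cos(t) + dy * math.sin(t)
    v = -dx * math.sin(t) + dy * math.cos(t)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


def overlap_ratio(ellipses: Sequence[Ellipse], size: int = 256) -> float:
    """Area common to all ellipses divided by the area covered by any of them."""
    common = covered = 0
    for iy in range(size):
        for ix in range(size):
            hits = [inside(e, ix + 0.5, iy + 0.5) for e in ellipses]
            if any(hits):
                covered += 1
            if all(hits):
                common += 1
    return common / covered if covered else 0.0
```

A value of 1.0 means the estimated regions coincide exactly on the grid, and values near 0 mean the models disagree strongly, so a threshold on this ratio would flag low confidence when the ratio falls below it rather than above it.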

(Embodiment 2)
This embodiment relates to a method for selecting, from the five gate regions estimated by the five regression models, the one gate region to be presented to the user. In ensemble learning, the outputs of multiple learning models are combined to obtain a final result. Because multiple learning models are used, ensemble learning reduces fluctuation in the output. It is known to be particularly effective for neural networks, whose accuracy tends to fluctuate from one training run to the next.

Ensemble learning is regarded as particularly effective for neural networks, whose accuracy tends to fluctuate from one training run to the next, and it is a technique used in various competitions. However, a regression model that estimates a gate region has multiple outputs, which must be evaluated together. Simply combining the individual outputs, for example by averaging across models, is unlikely to improve accuracy. When multiple regression models trained under different conditions each estimate the gate region, their estimates differ because of the differences in training. If the final gate region to be displayed were determined by averaging each parameter over the estimates output by the regression models, it would very likely be unclear to the technician which part of the scatter plot the gate is meant to enclose. Therefore, in this embodiment, the single most suitable gate region to present to the user is selected from the five gate regions estimated by the five regression models. In this embodiment, the hardware configuration, the process of generating the first regression model 1341 through the fifth regression model 1345, and so on are the same as in Embodiment 1. The following description mainly covers the differences from Embodiment 1.

The preparation process is the same as in the first embodiment, so its description is omitted. The operation process is described below. FIG. 21 is a flowchart showing another example of the procedure of the gate region estimation process, in which processing for selecting a gate region is added to the process shown in FIG. 14. The control unit 11 acquires a scatter plot (step S71). The control unit 11 inputs the acquired scatter plot to each regression model (step S72). The control unit 11 acquires the estimated gate region output by each regression model (step S73). The control unit 11 excludes gate regions containing an outlier from the selection candidates (step S74). Specifically, the median of each gate region parameter is calculated across the outputs of the five regression models, and any gate region that deviates from the median in even one parameter is excluded from the selection candidates. Step S74 is not essential and may be omitted. The control unit 11 calculates features for each gate region (step S75). The features include the number of cells in the gate, the area of the gate region, the cell density in the gate, and the cell purity in the gate; they are described further below. The control unit 11 selects the optimal gate based on the features (step S76). The control unit 11 calculates the degree of dispersion (step S77). The control unit 11 determines the confidence level (step S78). Steps S77 and S78 are the same as in the first embodiment, so their description is omitted. The control unit 11 stores the selected gate region and the confidence level (step S79) and ends the gate region estimation process.

Next, an example of excluding a gate region containing an outlier from the selection candidates is shown. FIGS. 22A and 22B are explanatory diagrams showing an example of excluding an outlier gate region. FIG. 22A shows the gate regions output by the five regression models superimposed on a scatter plot. Among them, gate region Gj differs in size from the other gate regions and is therefore excluded from the selection candidates as a gate region containing an outlier. FIG. 22B is a scatter plot displaying only the excluded gate region Gj.
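The median-based exclusion of step S74 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not state how far a parameter may deviate from the median before it counts as an outlier, so the per-parameter `tolerance` is an assumption, as are the function name `exclude_outlier_gates`, the five-parameter ellipse representation, and all numeric values.

```python
from statistics import median


def exclude_outlier_gates(gates, tolerance):
    """Keep only gates whose every parameter is close to the per-parameter median.

    gates: dict mapping model name -> tuple of gate parameters
    tolerance: tuple of allowed absolute deviations, one per parameter
               (assumed; the source gives no threshold)
    """
    names = list(gates)
    n_params = len(tolerance)
    # Median of each parameter across all regression-model outputs.
    medians = [median(gates[name][i] for name in names) for i in range(n_params)]
    kept = []
    for name in names:
        params = gates[name]
        # A gate is dropped if even one parameter deviates too far.
        if all(abs(params[i] - medians[i]) <= tolerance[i] for i in range(n_params)):
            kept.append(name)
    return kept


# Hypothetical ellipse gates: (center_x, center_y, semi_major, semi_minor, angle_deg)
gates = {
    "AI-1": (100.0, 200.0, 50.0, 30.0, 10.0),
    "AI-2": (102.0, 198.0, 52.0, 29.0, 12.0),
    "AI-3": (98.0, 203.0, 49.0, 31.0, 9.0),
    "AI-4": (101.0, 201.0, 51.0, 30.0, 11.0),
    "AI-5": (100.0, 200.0, 95.0, 60.0, 10.0),  # much larger than the others
}
tolerance = (10.0, 10.0, 10.0, 10.0, 10.0)
kept = exclude_outlier_gates(gates, tolerance)
print(kept)
```

In this made-up data the AI-5 gate plays the role of gate region Gj in FIG. 22A: its size parameters lie far from the medians, so it is removed from the candidates while the other four remain.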

Next, cell purity, one of the features, is described. In testing, it is generally desirable that each gate contain a single cell type. The approximate cell type can be inferred from FSC, SSC, and CD45 information. The cell population is therefore roughly classified based on FSC, SSC, and FL3 information (FL3 being the channel on which CD45 is measured), and cell purity is defined in terms of which class is most abundant in the target gate and what fraction of that class's cells fall inside the gate. Specifically, k-means, an automatic clustering method, is applied in three dimensions to the distribution of FSC, SSC, and FL3 to form n subpopulations, where n is a natural number; here n = 10. FIG. 23 is an explanatory diagram showing an example of the 10 subpopulations. The pentagonal marks indicate the center of each subpopulation used by k-means. FIG. 23 is a two-dimensional display with SSC on the horizontal axis and FL3 on the vertical axis, but the clustering is actually three-dimensional, with FSC as the axis normal to the page. In FIG. 23, gate region G contains mostly cells of class Cb. The cell purity of gate region G is therefore the fraction of class Cb that is contained in gate region G. In other words, the class with the most cells in the target gate region is identified, and the cell purity is the number of cells of that class inside the target gate region divided by the total number of cells of that class.
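The definition above can be written out as a short calculation. The sketch below assumes the k-means clustering on FSC, SSC, and FL3 has already been run elsewhere and that each cell carries a class label and a flag saying whether it lies inside the gate; the function name `cell_purity`, the class names, and the counts are illustrative only.

```python
from collections import Counter


def cell_purity(labels, in_gate):
    """Return (dominant_class, purity) for one gate.

    labels[i]:  cluster class of cell i (from a clustering done beforehand)
    in_gate[i]: True if cell i lies inside the gate
    purity = (cells of the dominant class inside the gate)
             / (all cells of that class)
    """
    inside = Counter(lab for lab, flag in zip(labels, in_gate) if flag)
    if not inside:
        return None, 0.0  # empty gate: no dominant class
    dominant, n_inside = inside.most_common(1)[0]
    n_total = sum(1 for lab in labels if lab == dominant)
    return dominant, n_inside / n_total


# Hypothetical data: 4 cells of class Ca, 10 of Cb, 6 of Cc.
labels = ["Ca"] * 4 + ["Cb"] * 10 + ["Cc"] * 6
in_gate = (
    [True] + [False] * 3        # 1 of 4 Ca cells inside the gate
    + [True] * 8 + [False] * 2  # 8 of 10 Cb cells inside the gate
    + [False] * 6               # no Cc cells inside the gate
)
dominant, purity = cell_purity(labels, in_gate)
print(dominant, purity)
```

Note that the denominator is the size of the whole class, not the number of cells in the gate, so under this definition a gate that captures more of its dominant class scores higher.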

FIG. 24 is a flowchart showing an example of the procedure of the gate selection process, which corresponds to step S76 in FIG. 21. The control unit 11 clusters the cells (step S91). For example, as described above, three-dimensional automatic clustering by k-means is performed on the distribution of FSC, SSC, and FL3 to divide the cells into 10 classes. The control unit 11 selects a gate region to be processed from among the five gate regions output by the five regression models (step S92). The control unit 11 obtains the number of cells of each class contained in the selected gate region and identifies the class with the largest number of cells (step S93). The control unit 11 calculates the cell purity (step S94). The control unit 11 determines whether there is an unprocessed gate region (step S95). If the control unit 11 determines that there is an unprocessed gate region (YES in step S95), the process returns to step S92 and the unprocessed gate region is processed. If the control unit 11 determines that there is no unprocessed gate region (NO in step S95), it selects the gate region to be output (step S96); that is, it selects the gate region with the highest cell purity from among the five gate regions. The control unit 11 then ends the gate selection process.
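The loop of steps S92 to S96 can be sketched as below, assuming the clustering of step S91 has already produced per-class cell counts. The function name `select_gate` and its input layout are hypothetical, and the counts are invented; they are chosen only so that AI-3 comes out at 0.66 as in FIG. 25A, and the remaining purities are not taken from the source.

```python
def select_gate(gate_class_counts, class_totals):
    """Pick the model whose gate has the highest cell purity.

    gate_class_counts: model -> {class: number of cells of that class in the gate}
    class_totals:      class -> total number of cells in that class (from step S91)
    """
    purities = {}
    for model, counts in gate_class_counts.items():   # S92 / S95 loop
        if not counts:
            purities[model] = 0.0                     # empty gate
            continue
        dominant = max(counts, key=counts.get)        # S93: most frequent class
        purities[model] = counts[dominant] / class_totals[dominant]  # S94
    best = max(purities, key=purities.get)            # S96: highest purity wins
    return best, purities


# Invented counts for illustration only.
class_totals = {"Ca": 200, "Cb": 500, "Cc": 300}
gate_class_counts = {
    "AI-1": {"Cb": 250, "Ca": 20},
    "AI-2": {"Cb": 300, "Cc": 40},
    "AI-3": {"Cb": 330, "Ca": 10},
    "AI-4": {"Cb": 275},
    "AI-5": {"Cc": 120, "Cb": 100},
}
best, purities = select_gate(gate_class_counts, class_totals)
print(best, purities[best])
```

If step S74 has removed a gate as an outlier, it would simply be left out of `gate_class_counts` before this selection runs.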

FIGS. 25A and 25B are explanatory diagrams showing an example of gate region selection. FIG. 25A shows the gate regions output by the five regression models superimposed on a scatter plot. The numbers on the right side of FIG. 25A indicate the cell purity of each gate region. Here, the gate region output by AI-3, that is, the third regression model 1343, has the highest cell purity at 0.66, so that gate region is selected. FIG. 25B is an example of a scatter plot on which only the selected gate region is superimposed. Similar processing can be used when multiple gate regions are to be output; the details are described later.

Next, the features other than cell purity are described. The number of cells is the number of cells contained in the gate region. The area is the area of the figure representing the gate region on the two-dimensional scatter plot. The cell density is the number of cells divided by the area.
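As a sketch of these three features, the code below assumes the gate is an ellipse described by a center, two semi-axes, and a rotation angle. That parameterization and the names `in_ellipse` and `gate_features` are assumptions made for illustration; a polygonal gate would need a different membership test and area formula.

```python
import math


def in_ellipse(x, y, cx, cy, a, b, theta):
    """True if point (x, y) lies inside the ellipse (theta in radians)."""
    dx, dy = x - cx, y - cy
    # Rotate the point into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


def gate_features(cells, cx, cy, a, b, theta):
    """Return (cell count, gate area, cell density) for an elliptical gate."""
    count = sum(1 for x, y in cells if in_ellipse(x, y, cx, cy, a, b, theta))
    area = math.pi * a * b  # area of an ellipse with semi-axes a and b
    return count, area, count / area


# Hypothetical cells on the two analysis axes.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.4), (3.0, 0.0), (0.0, 2.0)]
count, area, density = gate_features(cells, cx=0.0, cy=0.0, a=2.0, b=1.0, theta=0.0)
print(count, area, density)
```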

Measured values on axes other than the analysis axes may also be used as features. The analysis axes are the horizontal and vertical axes of the two-dimensional scatter plot. In a flow cytometry test, the cell type is determined from the measured values in all dimensions. Measured values in the other dimensions (those other than the analysis axes) within a gate can therefore serve as an indicator for determining the optimal gate that narrows down the cell type.

In the example above, the analysis axes are SSC and FL3 (CD45). In this case, examples of measured values on other axes are FSC and FL1 (CD34). Here, CD34 is measured on FL1, and the optimal gate is selected using its mean value as the criterion. For example, suppose the mean value is 0.21 for AI-1, 0.16 for AI-2, 0.18 for AI-3, and 0.20 for AI-4, and that the gate region output by AI-5 has been excluded as an outlier gate region. In this case, the gate region output by AI-1, which has the largest mean value (0.21), is selected.
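The selection by an off-axis mean can be sketched as follows. The text gives only the worked example, in which the gate with the largest mean is chosen, so treating "largest mean wins" as the rule is an assumption; the function names `mean_in_gate` and `select_by_mean` are also hypothetical. The four mean values are the ones quoted in the text.

```python
from statistics import mean


def mean_in_gate(fl1_values, in_gate):
    """Mean FL1 (CD34) value over the cells inside one gate, or None if empty."""
    selected = [v for v, flag in zip(fl1_values, in_gate) if flag]
    return mean(selected) if selected else None


def select_by_mean(means):
    """Pick the model with the largest mean (assumed rule, see lead-in)."""
    candidates = {m: v for m, v in means.items() if v is not None}
    return max(candidates, key=candidates.get)


# Mean values from the example in the text; AI-5 was already excluded as an outlier.
means = {"AI-1": 0.21, "AI-2": 0.16, "AI-3": 0.18, "AI-4": 0.20}
best = select_by_mean(means)
print(best)
```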

Image information may also be used as a feature. When the optimal gate is selected based on the cell density within the gate, a skewed cell distribution may prevent an appropriate gate from being selected. To avoid this, the distribution is converted into an image and the feature is obtained from the image. A processing example is described below. To handle the content of the scatter plot as image information, here the luminance L, the parts where cells exist are represented by black pixels (L < 255) and all other parts by white pixels (L = 255).

FIG. 26 is an explanatory diagram showing an example of gate region selection based on luminance information. The upper left of FIG. 26 is a scatter plot on which the gate region output by AI-1 is superimposed, and the lower left shows the features of that gate region. The upper right of FIG. 26 is a scatter plot on which the gate region output by AI-2 is superimposed, and the lower right shows the features of that gate region. The gate regions output by AI-3 through AI-5 are assumed to have smaller feature values than those output by AI-1 and AI-2, so that none of them can be selected. In the example of FIG. 26, the gate region that should be selected is the one output by AI-1.

In the example of FIG. 26, if cell density is used as the feature for selecting the gate region, AI-1 scores 1.0 and AI-2 scores 1.1, so the gate region output by AI-2 would be selected. If instead the black-white ratio (= number of white pixels / number of black pixels) is used as the feature, AI-1 scores 0.7 and AI-2 scores 0.5, so the gate region output by AI-1 is selected. Because the cell distribution in the example of FIG. 26 is skewed, using cell density as the feature leads to an inappropriate selection. The black-white ratio reflects cell density to some extent while reducing the influence of the number of cells, and therefore yields an appropriate selection.
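One way to compute such a ratio is sketched below. The text defines the ratio as white pixels divided by black pixels but does not say how the scatter plot is rasterized, so the grid resolution `bins`, the choice to count only pixels whose center lies inside the gate, and the names `black_white_ratio` and `rect_gate` are all assumptions for illustration.

```python
def black_white_ratio(cells, inside, x_range, y_range, bins):
    """Return white/black pixel ratio inside a gate, or None if no black pixel.

    cells:   iterable of (x, y) positions on the analysis axes
    inside:  function (x, y) -> bool, True if the point lies in the gate
    bins:    number of pixels along each axis (assumed resolution)
    """
    (x0, x1), (y0, y1) = x_range, y_range
    dx, dy = (x1 - x0) / bins, (y1 - y0) / bins
    # Pixels belonging to the gate: those whose center is inside it.
    gate_pixels = {
        (i, j)
        for i in range(bins)
        for j in range(bins)
        if inside(x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy)
    }
    # Pixels that contain at least one cell.
    occupied = set()
    for x, y in cells:
        if x0 <= x < x1 and y0 <= y < y1:
            i = min(int((x - x0) / dx), bins - 1)
            j = min(int((y - y0) / dy), bins - 1)
            occupied.add((i, j))
    black = gate_pixels & occupied   # cell present (L < 255)
    white = gate_pixels - black      # no cell (L = 255)
    if not black:
        return None
    return len(white) / len(black)


def rect_gate(x, y):
    """Hypothetical rectangular gate used only to keep the example simple."""
    return 0.0 <= x < 4.0 and 0.0 <= y < 2.0


cells = [(0.2, 0.3), (0.7, 0.1), (1.5, 1.5), (3.9, 0.5), (2.5, 3.5)]
ratio = black_white_ratio(cells, rect_gate, (0.0, 4.0), (0.0, 4.0), bins=4)
print(ratio)
```

Because two cells that fall in the same pixel darken it only once, the pixel count is less sensitive to local pile-ups of cells than the raw cell count is, which is the property the text relies on.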

The description of this embodiment so far has covered the case where one gate region is obtained. Obtaining multiple gate regions requires extending the processing, and two extension methods are conceivable. Method 1 selects a single optimal output from among multiple AIs that each output all of the gates. For example, the estimation results output by AI-1 are selected for all gate regions.

FIG. 27 is a flowchart showing another example of the procedure of the gate region selection process. FIG. 27 shows the selection process used when multiple gate regions are set on one scatter plot, and corresponds to Method 1 described above. The control unit 11 selects a regression model to be processed (step S111). The control unit 11 selects a gate region to be processed from among the multiple gate regions output by the selected regression model (step S112). The control unit 11 determines the confidence level of the selected gate region (step S113), in the manner described above. The control unit 11 stores the confidence level in a temporary storage area (step S114). The control unit 11 determines whether there is an unprocessed gate region (step S115). If the control unit 11 determines that there is an unprocessed gate region (YES in step S115), the process returns to step S112 and the unprocessed gate region is processed. If the control unit 11 determines that there is no unprocessed gate region (NO in step S115), it determines the overall confidence level of the gate regions output by the selected regression model (step S116). For example, when three gate regions are to be set and their confidence levels are high, high, and low, the number of highs, 2, is taken as the overall confidence level. The control unit 11 stores the overall confidence level in the temporary storage area (step S117). The control unit 11 determines whether there is an unprocessed regression model (step S118). If the control unit 11 determines that there is an unprocessed regression model (YES in step S118), the process returns to step S111 and the unprocessed regression model is processed. If the control unit 11 determines that there is no unprocessed regression model (NO in step S118), it selects a regression model based on the confidence level of each regression model (step S119). The control unit 11 outputs the gate regions output by the selected regression model (step S120) and ends the process.
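The Method 1 flow can be sketched as follows. The overall confidence level is the count of gates judged "high", as in the example given for step S116; choosing the model with the largest count in step S119, and breaking ties in favor of the first model, are assumptions, since the text only says the model is selected based on the per-model confidence level. The function name and the labels for AI-2 and AI-3 are hypothetical.

```python
def select_model_by_confidence(confidences):
    """Pick one regression model for all gates (Method 1).

    confidences: model -> list of per-gate confidence labels ("high" / "low")
    Returns (selected model, overall confidence per model).
    """
    overall = {}
    for model, levels in confidences.items():                 # S111 / S118 loop
        # S112-S116: overall confidence = number of gates judged "high".
        overall[model] = sum(1 for level in levels if level == "high")
    best = max(overall, key=overall.get)                       # S119 (assumed: max)
    return best, overall


confidences = {
    "AI-1": ["high", "high", "low"],   # the example from the text: overall 2
    "AI-2": ["high", "high", "high"],  # hypothetical
    "AI-3": ["low", "high", "low"],    # hypothetical
}
best, overall = select_model_by_confidence(confidences)
print(best, overall)
```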

Method 2 selects, for each gate, one optimal output from among multiple AIs that each output all of the gates. For example, the estimation result output by AI-1 is selected for the first gate, that output by AI-4 for the second gate, and that output by AI-5 for the third gate. Method 1 has the advantage that the gates rarely overlap, because a single AI outputs every gate, but the disadvantage that if one gate is off, the other gates are likely to be off as well. Method 2 has the advantage that a deviation in one gate has little effect on the others, because the selection is made gate by gate, but the disadvantage that the gates tend to overlap when each is selected without information about the other gates. This disadvantage can be mitigated, however, by suitably incorporating information about the other gates.

Processing that mitigates the disadvantage of Method 2 is described next. In gate selection, the following conditions are used to compare the gates selected with and without information about the other gates. Condition 1: the feature "cell purity" is used as the criterion for determining the optimal gate. Condition 2: gates are selected in order of usefulness (analysis gates generally differ in usefulness), and cells contained in the other gates are excluded from the feature calculation. Gate selection is performed both with Condition 1 alone and with Conditions 1 and 2 together. If one of the two results has no overlapping gates, that selection result is adopted. The feature in Condition 1 may be a feature other than cell purity, provided that it relates to the gate region.

FIG. 28 is a flowchart showing another example of the procedure of the gate region selection process. FIG. 28 shows the selection process used when multiple gate regions are set on one scatter plot, and corresponds to Method 2 described above. The control unit 11 selects a gate region to be processed from among the multiple gate regions (step S131). For example, when three gate regions are to be set, they are designated the first gate, the second gate, and the third gate. The order is determined by usefulness, for example in descending order of importance in reporting the test results. The control unit 11 then processes the first gate, the second gate, and the third gate in that order. The control unit 11 selects a regression model to be processed (step S132). The control unit 11 calculates a feature, for example the cell purity, of the gate region output by the selected regression model (step S133). The control unit 11 determines whether there is an unprocessed regression model (step S134). If the control unit 11 determines that there is an unprocessed regression model (YES in step S134), the process returns to step S132 and the unprocessed regression model is processed. If the control unit 11 determines that there is no unprocessed regression model (NO in step S134), it selects, based on the features, the gate region to be finally output from among the gate regions output by the regression models (step S135). The control unit 11 stores information on the selected gate region in a temporary storage area (step S136). The control unit 11 determines whether there is an unprocessed gate region (step S137). If the control unit 11 determines that there is an unprocessed gate region (YES in step S137), the process returns to step S131 and the unprocessed gate region is processed. If the control unit 11 determines that there is no unprocessed gate region (NO in step S137), it outputs all the gate regions based on the selection information stored in the temporary storage area (step S138) and ends the process.
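The per-gate selection of Method 2, including the option of excluding cells already claimed by a previously selected gate, can be sketched as follows. Each candidate gate is represented simply as the set of cell IDs it contains, and the feature is passed in as a function; both are illustrative assumptions, as is the function name. The demonstration uses the cell count (`len`) as a stand-in feature to keep the example short, whereas the text uses cell purity.

```python
def select_gates_per_target(candidates, feature, exclude_used=True):
    """Pick one regression model per gate, in priority order (Method 2).

    candidates:   list, ordered by usefulness, of dicts
                  model -> set of cell IDs inside that model's gate
    feature:      function(set of cell IDs) -> score, higher is better
    exclude_used: if True, cells inside already selected gates are left out
                  of the feature calculation
    """
    used = set()
    chosen = []
    for per_model in candidates:                       # S131 / S137 loop
        scores = {}
        for model, cell_ids in per_model.items():      # S132 / S134 loop
            effective = cell_ids - used if exclude_used else cell_ids
            scores[model] = feature(effective)         # S133
        best = max(scores, key=scores.get)             # S135
        chosen.append(best)                            # S136
        used |= per_model[best]
    return chosen                                      # S138


# Hypothetical candidates for a first gate and a second gate.
candidates = [
    {"AI-1": {1, 2, 3, 4, 5}, "AI-2": {1, 2, 3}},
    {"AI-1": {4, 5, 6, 7}, "AI-2": {8, 9, 10}},
]
with_exclusion = select_gates_per_target(candidates, len, exclude_used=True)
without_exclusion = select_gates_per_target(candidates, len, exclude_used=False)
print(with_exclusion, without_exclusion)
```

In this made-up data the second gate from AI-1 shares cells 4 and 5 with the first selected gate. Without exclusion it still wins on raw count and the two gates overlap; with exclusion the non-overlapping AI-2 gate is chosen instead. Running both variants and keeping a result without overlap mirrors the comparison described for Conditions 1 and 2.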

In this embodiment, the optimal gate region can be selected from the gate region estimates output by multiple regression models. Although the embodiments above use CD45 gating in LLA as an example, CD45 gating in a malignant lymphoma analysis (MLA) test can be performed by a similar procedure.

The technical features (constituent elements) described in the embodiments can be combined with one another, and new technical features can be formed by combining them.
The embodiments disclosed herein are illustrative in all respects and should not be considered restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

REFERENCE SIGNS LIST
10 Flow cytometer
1 Processing unit
11 Control unit
12 Main storage unit
13 Auxiliary storage unit
131 Measurement value DB
132 Feature information DB
133 Gate DB
1341 First regression model
1342 Second regression model
1343 Third regression model
1344 Fourth regression model
1345 Fifth regression model
135 Threshold DB
136 Confidence level DB
14 Input unit
15 Display unit
16 Communication unit
17 Reading unit
1P Control program
1a Portable storage medium
1b Semiconductor memory
2 Measurement unit
3 Learning server

Claims

1. A gate region estimation program for causing a computer to execute processing comprising:
acquiring a scatter plot group including a plurality of scatter plots obtained by flow cytometry measurements and having different measurement items;
inputting the acquired scatter plot group to each of a plurality of learning models that have been trained based on training data including a scatter plot group and a gate region; and
outputting the estimated gate regions obtained from each of the plurality of learning models.

2. The gate region estimation program according to claim 1, further comprising: determining a degree of confidence based on the plurality of estimated gate regions.

3. The gate region estimation program according to claim 2, further comprising: determining the degree of confidence based on a degree of dispersion of each of a plurality of variables representing each of the plurality of estimated gate regions.

4. The gate region estimation program according to claim 3, wherein the degree of dispersion of each variable is compared with a plurality of predetermined thresholds, a degree of confidence for each variable is determined in a plurality of stages, and the degree of confidence of the estimated gate region is determined based on the degree of confidence for each variable.

5. The gate region estimation program according to claim 2, further comprising: determining a degree of confidence of an estimated gate region for each measurement of a plurality of samples; and outputting identification information for identifying the samples in association with the determined degree of confidence.

6. The gate region estimation program according to claim 1, further comprising:
selecting one learning model based on the estimated gate regions obtained from each of the plurality of learning models; and
outputting the estimated gate region output by the selected learning model.

7. The gate region estimation program according to claim 6, further comprising: selecting a learning model based on the number of cells included in each of the estimated gate regions output by the plurality of learning models.

8. The gate region estimation program according to claim 6, further comprising: selecting a learning model based on the areas of the estimated gate regions output by the plurality of learning models.

9. The gate region estimation program according to claim 6, further comprising:
clustering the measured cells based on a scatter plot obtained from the plurality of measurement items; and
selecting a learning model based on a cell purity determined, using a result of the clustering, for each of the estimated gate regions output by the plurality of learning models.

10. The gate region estimation program according to claim 1, further comprising:
obtaining a plurality of estimated gate regions from each of the plurality of learning models;
selecting one learning model for each group including a plurality of the estimated gate regions related to each other; and
outputting the estimated gate region output by each of the selected learning models.

11. The gate region estimation program according to claim 1, further comprising:
determining a degree of confidence based on the plurality of estimated gate regions;
selecting one learning model based on the estimated gate regions obtained from each of the plurality of learning models; and
outputting the estimated gate region output by the selected learning model together with the degree of confidence.

12. A gate region estimation method in which a computer:
acquires a scatter plot group including a plurality of scatter plots obtained by flow cytometry measurements and having different measurement items;
inputs the acquired scatter plot group to each of a plurality of learning models that have been trained based on training data including a scatter plot group and a gate region; and
outputs the estimated gate regions obtained from each of the plurality of learning models.

13. A gate region estimation device comprising:
an acquisition unit that acquires a scatter plot group including a plurality of scatter plots obtained by flow cytometry measurements and having different measurement items; and
an output unit that inputs the acquired scatter plot group to each of a plurality of learning models that have been trained based on training data including a scatter plot group and a gate region, and outputs the estimated gate region obtained from each of the plurality of learning models.