JP4355281B2 - Peak extraction method and peak extraction apparatus

Info

Publication number: JP4355281B2
Application number: JP2004359754A
Authority: JP
Inventors: 育夫済木; 卓也尾山; ゆかり柴垣; 和生小川
Original assignee: INTEC SYSTEMS INSTITUTE, INC.
Current assignee: INTEC SYSTEMS INSTITUTE, INC.
Priority date: 2004-12-13
Filing date: 2004-12-13
Publication date: 2009-10-28
Anticipated expiration: 2024-12-13
Also published as: JP2006170647A

Description

The present invention relates to a technique for extracting peaks in spectral characteristics by computer-based information processing, and to a diagnostic apparatus that uses the peak extraction technique. More specifically, it relates to a method and an apparatus that take waveform data (hereinafter "waveform data" or "waveform") obtained by measuring various biological samples such as plasma, serum, urine, tissue extract, and cerebrospinal fluid with a mass spectrometer such as a time-of-flight mass spectrometer (TOF-MS) or a liquid chromatograph, and extract from it, by computer-based information processing, the "peak" portions corresponding to biologically derived molecules such as proteins contained in the sample.

Various techniques have been proposed for extracting peaks in spectral characteristics. A typical peak extraction method extracts peaks from a single waveform based on a condition such as the signal-to-noise (SN) ratio being at or above a certain value (referred to as the first step in this section).

When single waveforms of different origin are to be compared, for example, a second step is often required in which the peaks extracted from the different single waveforms in the first step are grouped and compared. In this peak grouping process, all peaks extracted in the first step are treated as equal (see Patent Document 1).

The peaks obtained from the single waveforms are superimposed across the multiple waveforms. When peaks of adjacent waveforms in the superimposed state lie closer to each other than a certain distance, they are regarded as belonging to the same peak group and are identified as peaks of the same origin. If, among the peaks belonging to one peak group, some peaks differ sufficiently in intensity from the others, the waveforms having those peaks show a specificity. It then becomes possible, for example, to identify the substance corresponding to the specific peak by comparison with known substances, or, if the specificity is unique to patients with a particular disease, to use it for diagnosis.

Patent Document 1: JP 2003-315341 A

However, with the peak extraction method described above, peak extraction accuracy may be reduced in the following cases.

(1) Quality degradation due to noise
Peaks obtained from a single waveform, whether they look like genuine peaks or like noise, are all treated as equal in the subsequent peak grouping process. FIG. 15 shows an example of a procedure in which peaks are extracted by the general processing described above and then grouped. When noise is present, as shown in the middle diagram on the left side of FIG. 15, a peak caused by this noise is also regarded as a peak group in the grouping process. As a result, peak groups consisting only or mostly of noise are easily generated, and a peak group containing much noise cannot be distinguished from one containing little noise.

(2) Quality degradation in peak-dense sections
The other problem, degradation of peak extraction quality in peak-dense sections, is described with reference to FIG. 16. FIG. 16 shows an example of a procedure in which peaks are extracted by the general processing described above and then grouped.

In a section where fine peaks are densely packed, as shown in FIG. 16, the spacing between adjacent peaks is small. The conventional simple grouping method, which groups all peaks lying within a certain distance of each other, therefore assigns many peaks to the same group, as shown in the right-hand diagram, and the peaks obtained from the individual waveforms are not matched correctly. The problem is especially likely in the high molecular weight region, where broad peaks are common, that is, where peak tops are indistinct or adjacent peaks overlap, because the result of peak extraction there is often the "densely packed fine peaks" state shown in FIG. 16.
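The chaining effect described above can be illustrated with a minimal sketch of distance-based grouping. The function name `group_by_distance`, the parameter `max_gap`, and the sample m/z values are assumptions made for illustration; they are not taken from Patent Document 1.

```python
def group_by_distance(mz_values, max_gap):
    """Group m/z positions: a peak joins the current group when it lies
    within max_gap of the previous peak (simple chaining)."""
    groups = []
    for mz in sorted(mz_values):
        if groups and mz - groups[-1][-1] <= max_gap:
            groups[-1].append(mz)   # close enough to the previous peak
        else:
            groups.append([mz])     # start a new group
    return groups

# Hypothetical, well-separated peaks: two distinct groups are found.
sparse = [1000.0, 1000.4, 1000.8, 1010.0, 1010.3, 1010.9]
# Hypothetical dense section: every peak chains onto its neighbour.
dense = [1000.0, 1000.8, 1001.6, 1002.4, 1003.2, 1004.0]

print(len(group_by_distance(sparse, max_gap=1.0)))
print(len(group_by_distance(dense, max_gap=1.0)))
```

In the dense case the single resulting group spans a range of 4.0 even though the allowed gap is 1.0, which is the kind of mismatch the text attributes to the conventional method.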

In view of the above problems, an object of the present invention is to improve the accuracy of peak extraction and of the matching of grouped peaks.

The present invention assigns a reliability to each individual peak, obtains the sum of the reliabilities of the peaks located at the same position in a plurality of waveforms as a peak density, and assigns a reliability to each peak group based on this peak density. A peak group consisting mostly of noise, or a group of exceptional peaks seen in only a small fraction of the waveforms, therefore receives a low reliability. Problem (1) above, quality degradation due to noise, can thus be addressed by removing peak groups with low reliability. For problem (2) above, neighboring peaks are not simply grouped together. Instead, the peak density is calculated for each small interval and grouping takes the rise and fall of the peak density into account, so that only the more peak-like portions are grouped (see FIG. 17).

Furthermore, assigning a reliability to peak groups amounts to ranking them. When a large number of peak groups is obtained, subsequent work can therefore be carried out only on the peak groups with high reliability, or starting from the peak groups with high reliability, which makes the work more efficient.

In this specification, peak extraction processing refers to processing that extracts peaks from single waveforms and groups the extracted peaks across a plurality of waveforms.

A peak extraction technique according to an embodiment of the present invention is described below with reference to the drawings. Before the specific embodiment is described, an outline of the invention is given with reference to FIGS. 1A and 1B, taking mass spectrometry using Surface-Enhanced Laser Desorption/Ionization TOF MS (SELDI TOF-MS) as an example.

FIG. 1A shows the positioning of the proteome analyzer. As shown in FIG. 1A, SELDI TOF-MS is a system that combines, for example, a metal chip with a time-of-flight mass spectrometer. By using a small amount of sample obtained from a living body, either directly or after rough fractionation, various proteins can be analyzed in a short time. The chip surface carries a chemical modification such as an ion-exchange or IMAC (metal ions such as Ni and Cu) surface. Proteins are captured on the chip surface by means of these chemical properties and measured by TOF-MS. This makes it possible to analyze protein expression in biological samples, identify proteins, analyze post-translational modifications, and so on. The proteome waveform data measured by SELDI TOF-MS is obtained as a spectrum in which the horizontal axis is the mass-to-charge ratio (m/z), representing the molecular weight or an equivalent quantity, and the vertical axis is the intensity, representing the number of molecules or an equivalent quantity. In this case, the peak at each m/z corresponds to a specific molecule such as a protein.

The system further includes a peak extraction apparatus that extracts, by computer-based information processing, the "peak" portions corresponding to biologically derived molecules such as proteins from the waveform data obtained by measuring a biological sample such as plasma with SELDI TOF-MS. For the waveform data obtained from each of a plurality of samples with SELDI TOF-MS, peaks presumed to originate from the same molecule are matched across the waveform data, and the peaks are grouped. The quality of the peak grouping strongly affects the accuracy of the subsequent data analysis and, in turn, the accuracy of any diagnosis based on that analysis.

FIG. 1B shows a schematic configuration example of the peak extraction apparatus according to the present embodiment. As shown in FIG. 1B, the apparatus has a waveform data acquisition unit 1 that acquires waveform data, a peak extraction unit 3 that extracts peaks from single waveform data, a grouping processing unit 5 that groups the peaks, and a data output unit 7 that outputs data. It further has a difference detection unit 11 that detects differences between single waveform data based on the grouped peaks. In practice, each processing unit is implemented by a ROM that stores a program for causing a computer to execute the method described below, a RAM into which the data in the ROM is loaded, and a CPU that executes the program loaded into the RAM.

The peak extraction apparatus according to the present embodiment extracts peaks from single waveform data and assigns a reliability to each peak. It then groups the peaks across multiple waveforms and assigns a reliability to each group.

Specifically, a parameter that changes the width of the peak region (described later) is first set for the waveform data. By varying this parameter, a variety of peaks, from fine to coarse, are extracted as "regions" (hereinafter "peak regions"). The degree to which the peak regions overlap (hereinafter the "inclusion level") is then obtained from the inclusion relations among them. The inclusion level can be used as an index of "peak-likeness", that is, of the reliability of an extracted peak.

When peaks are matched across multiple waveform data, the reliability of the peak group itself is quantified using the inclusion level described above. At a position where peaks with a high inclusion level (that is, high reliability) appear in many waveform data, a peak group with high reliability is obtained. In the opposite case, the resulting peak group can be judged to have low reliability.

As described above, quantifying the quality of the peak grouping makes it possible to know the accuracy of the subsequent data analysis and, in turn, the accuracy of the diagnosis or other result based on the final data analysis.

The peak extraction processing from single waveform data and the peak grouping processing across multiple waveform data are described below in that order. FIG. 2 shows an overview of the peak extraction processing from single waveform data (step 1) and the peak grouping processing (step 2) according to the present embodiment. As shown in FIG. 2, single waveforms 1 to N are obtained by measuring a plurality of subjects. In step 1, peaks are extracted from each of the single waveform data of waveforms 1 to N, and in step 2 the peaks are grouped.

(1) Step 1: Peak extraction processing from single waveform data
The peak extraction apparatus according to the present embodiment detects the portion where a peak exists as a peak region. In this detection, peak regions of various widths are obtained by setting the internally used parameter to various values. In other words, the parameter values are chosen so that peak regions of various widths are obtained, and peak region detection is performed for each value. When the parameter value is large, wide regions containing high peaks are obtained. When it is small, narrow regions containing low peaks are obtained. In general, a small parameter value yields many peaks but also includes much noise, whereas a large parameter value yields little noise but raises the probability of missing peaks.

The peak extraction apparatus according to the present embodiment changes the parameter value several times and obtains peak regions of different widths for each value. Inclusion relations are constructed from the way these peak regions overlap, and the depth of the inclusion relation is quantified as the "inclusion level". Treating the inclusion level as an index of "peak-likeness" assigns a reliability to each peak, so that peak regions are detected together with their reliability as peaks.

FIG. 3 shows the process of extracting peaks from single waveform data and obtaining the inclusion levels. In the single waveform shown in FIG. 3, the horizontal axis is m/z and the vertical axis is intensity, where m is mass and z is charge. As shown in FIG. 3, four peaks are extracted, numbered peaks 1 to 4 in order from the right. Peak regions are detected with the internal parameter set to 0.5, 1, 5, and 10. With the parameter set to 0.5, a peak region is detected at every peak from peak 1 to peak 4. With the parameter set to 1, peak regions are detected at peak 2 and peak 4. With the parameter set to 5 and to 10, a peak region is detected at peak 4.

In this way, peak regions are detected for four different internal parameter values, and the inclusion level of each of peaks 1 to 4 is determined from the structure of the detected peak regions.

Step 1 is described in more detail below. Step 1 is divided into the following two sub-steps, each of which is described in turn.

(1-1) Peak region detection
In the peak region detection step according to the present embodiment, as shown in FIG. 3, the value of the internal parameter P is changed several times, and peak regions are detected from the single waveform for each value of P. The specific values of P are given in advance, and both their number and their values are common to all waveform data. For example, if N parameter values are given in advance, peak region detection is performed N times on a single waveform, and a separate detection result is obtained for each parameter value.

Peak region detection is performed by estimating, for every point in the waveform, whether that point is near a peak. A set of consecutive points estimated to be near a peak forms one peak region. Whether a point is near a peak is estimated by searching step by step to the left and to the right of the point and checking whether the intensity rises or falls beyond preset thresholds. The specific procedure is described below with reference to FIGS. 4(a) to 4(d), which show the four cases in which the search ends.

(1) First, let the point under examination be point A.
(2) Next, search to the left of point A, that is, in the direction of decreasing m/z:
  (a) Set B to A.
  (b) Replace B with the point immediately to its left.
  (c) If B matches any of the search end cases below, go to (3); otherwise return to (b).
(3) Search to the right of point A. That is, perform the same processing as in (2) toward the right, in the direction of increasing m/z.
(4) From the search end cases of the left and right searches of point A, estimate whether point A is near a peak according to Table 1.

The search end cases are as follows.

Case 1 (FIG. 4(a)): the intensity at point B exceeds the upper threshold T1. In this case a higher peak is presumed to exist on the left, so point A can be estimated not to be near a peak.
Case 2 (FIG. 4(b)): the intensity at point B falls below the lower threshold T2. Since point B is lower than point A, point A may be near a peak, but this cannot be concluded until the right side has also been searched.
Case 3 (FIG. 4(c)): the search reaches a predetermined distance D from point A without entering the state of case 1 or case 2. D may be a variable value that depends on A or B.
Case 4 (FIG. 4(d)): the search reaches the end of the waveform without matching any of the above.

Note that the case of FIG. 4(c) can be omitted.
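The one-sided search and its four end cases can be sketched as follows. This is only an illustration under stated assumptions: the function name `search_end_case` is hypothetical, the thresholds T1 and T2 are passed in as fixed numbers, D is measured in sample points, and the final decision of step (4) is not implemented because the contents of Table 1 are not reproduced in this text.

```python
def search_end_case(y, a, step, t1, t2, max_dist=None):
    """Walk from index a in direction step (-1 = left, +1 = right) and
    return the search end case (1 to 4) described in the text.

    y        : list of intensities, one per sample point
    t1, t2   : upper and lower thresholds (held fixed here, an assumption)
    max_dist : distance D in sample points; None omits case 3
    """
    b = a
    while True:
        b += step                          # move B to the neighbouring point
        if b < 0 or b >= len(y):
            return 4                       # case 4: ran off the end of the waveform
        if y[b] > t1:
            return 1                       # case 1: intensity rose above T1
        if y[b] < t2:
            return 2                       # case 2: intensity fell below T2
        if max_dist is not None and abs(b - a) >= max_dist:
            return 3                       # case 3: reached distance D from A

# Hypothetical waveform with a single peak at index 3.
y = [1, 2, 6, 9, 6, 2, 1]
left = search_end_case(y, 3, -1, t1=12, t2=6)
right = search_end_case(y, 3, +1, t1=12, t2=6)
print(left, right)
```

For the peak top at index 3 both searches end in case 2, whereas a search to the right from index 1 with `t1=5` ends in case 1 because a higher point is encountered.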

Table 1 is the estimation table used to estimate whether a point is near a peak. "○" means that the point is near a peak. "×" means that it is not. "Left" means that the estimation result of the point immediately to the left of point A is reused (the previous result is substituted).

By searching to the left and to the right and determining which of cases 1 to 4 each search result falls under, it can be estimated whether the point is near a peak, and the peak regions can be determined from these estimation results. Table 1 and the calculation methods below are examples, and they can of course be changed or modified.
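Once every point carries a near-peak estimate, forming peak regions amounts to collecting runs of consecutive flagged points. A minimal sketch, in which the function name `regions_from_flags` and the inclusive index-pair representation are assumptions made for illustration:

```python
def regions_from_flags(near_peak):
    """Turn per-point flags (True = estimated to be near a peak) into
    peak regions, returned as inclusive (start, end) index pairs."""
    regions = []
    start = None
    for i, flag in enumerate(near_peak):
        if flag and start is None:
            start = i                        # a new run begins here
        elif not flag and start is not None:
            regions.append((start, i - 1))   # the run ended at the previous point
            start = None
    if start is not None:                    # run extends to the end of the waveform
        regions.append((start, len(near_peak) - 1))
    return regions

flags = [False, True, True, False, False, True, True, True]
print(regions_from_flags(flags))  # prints [(1, 2), (5, 7)]
```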

The "upper threshold T1" and "lower threshold T2" in the search end cases above are values that depend on the parameter P. There are various ways of determining the thresholds; Table 2 shows three representative methods.

Here, Y(A) is the Y coordinate of A, that is, its intensity; max{Y(A), ..., Y(B)} is the maximum intensity over all points from A to B; and n(A) is the noise strength at the position of A. As Table 2 shows, T1 and T2 may be variable values that depend on A or B.

As shown in Table 2, in method 1, T1 is defined as the intensity at point A plus the parameter P, and T2 as the intensity at point A minus P. In method 2, T1 is defined as the intensity at point A plus P, and T2 as the maximum intensity between positions A and B minus P. In method 3, T1 is the intensity at point A plus n(A) × 2^P, a function of the noise strength and the parameter P, and T2 is the maximum intensity between positions A and B minus n(A) × 2^P. For P = 1, for example, the function n(A) × 2^P equals twice the noise strength n(A). The following description assumes method 3.

Peak regions in a single waveform can be detected by the method described above.
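Table 2 itself is not reproduced in this text. Written out from the prose description only, and therefore to be read as a reconstruction rather than as the original table, the three threshold definitions are:

```latex
\begin{align*}
\text{Method 1:}\quad & T_1 = Y(A) + P, & T_2 &= Y(A) - P \\
\text{Method 2:}\quad & T_1 = Y(A) + P, & T_2 &= \max\{Y(A),\dots,Y(B)\} - P \\
\text{Method 3:}\quad & T_1 = Y(A) + n(A)\,2^{P}, & T_2 &= \max\{Y(A),\dots,Y(B)\} - n(A)\,2^{P}
\end{align*}
```

In methods 2 and 3 the lower threshold follows the running maximum between A and B, so it changes as the search proceeds.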

(1-2) Selection of peak regions and setting of inclusion levels
In sub-step (1-2), the groups of peak regions obtained in sub-step (1-1) for the several parameter values are screened, and their inclusion levels are set, based on the inclusion relations among them. FIG. 5 shows an example of the peak regions obtained in sub-step (1-1), and the processing of sub-step (1-2) is described below with reference to it. For the peak regions obtained by varying the parameter P, that is, for all peak regions drawn as thick lines in FIG. 5 for P = P_1, P_2, P_3, and P_4, the inclusion relation is examined for every pair of regions. FIG. 6 shows the result of this examination. As shown in FIG. 6, each inclusion relation is drawn as an arrow from the included region (for example, A) to the including region (for example, B). When the inclusion relation is examined, an "inclusion condition" described below is used so that the relation is judged to hold even if one region does not contain the other completely, as long as it contains it almost completely. This allows inclusion relations that tolerate a slight protrusion. Note that the inclusion relations do not necessarily follow the ordering of the parameter P.

FIG. 7 shows an example of an incomplete inclusion relation that can nevertheless be regarded as inclusion. Even if the complete inclusion relation of equation (1) below does not hold, peak region A is regarded as included in peak region B when the conditions of equations (2) and (3) are both satisfied.

The complete inclusion relation is:

c < 0    (1)

where the condition must hold at both the left end and the right end of the peak region.

The relation that is incomplete but is regarded as inclusion in the present embodiment is:

c < k·a    (2)

where the condition must hold at both the left end and the right end of the peak region, together with

p·b ≤ a < q·b    (3)

Here, a is the width of peak region A, b is the width of peak region B, c is the length by which peak region A protrudes beyond peak region B, and k, p, and q are predetermined constants (for example, k = 0.2, p = 0.2, q = 0.95). A suitable value for k lies between 0.2 and 0.3, and a suitable value for q is at most 1 and close to 1. Adjusting these constants to the situation allows the inclusion relations to be captured accurately.
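Conditions (1) to (3) can be checked with a few lines of code. The sketch below assumes that a region is a (left, right) pair on the m/z axis and that c is measured separately at each end as the signed distance by which A sticks out of B; the function name `is_included` is hypothetical, and the default constants are the example values quoted above.

```python
def is_included(region_a, region_b, k=0.2, p=0.2, q=0.95):
    """Return True when peak region A counts as included in peak region B.

    Regions are (left, right) positions. A is included if it lies completely
    inside B (condition 1), or if it protrudes only slightly (condition 2)
    and its width is in the allowed range relative to B (condition 3).
    """
    a = region_a[1] - region_a[0]          # width of A
    b = region_b[1] - region_b[0]          # width of B
    c_left = region_b[0] - region_a[0]     # protrusion of A past B's left end
    c_right = region_a[1] - region_b[1]    # protrusion of A past B's right end

    complete = c_left < 0 and c_right < 0                 # condition (1)
    small_overhang = c_left < k * a and c_right < k * a   # condition (2)
    width_ratio = p * b <= a < q * b                      # condition (3)
    return complete or (small_overhang and width_ratio)

print(is_included((10, 14), (9, 20)))     # completely inside
print(is_included((10, 14), (10.5, 20)))  # slight protrusion on the left
print(is_included((10, 14), (12, 20)))    # protrusion too large
```

With q below 1, two identical regions do not include each other, because condition (3) requires A to be strictly narrower than q times the width of B.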

Next, as shown in FIG. 8, the inclusion level (CL) of each peak region is set. Starting from each peak region, the regions that include it are followed one after another for as long as an including region exists, that is, along the arrows until no further arrow can be followed. The number of arrows followed is defined as the inclusion level CL. When there are several possible paths, the maximum number of arrows followed is taken as CL.
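Following the arrows and taking the maximum over all paths is a longest-path computation on the inclusion graph. A minimal sketch, assuming the graph is given as a dictionary from each region id to the ids of the regions that include it, and assuming it contains no cycles; the name `inclusion_levels` and the region ids are hypothetical:

```python
from functools import lru_cache

def inclusion_levels(includers):
    """Compute the inclusion level CL of every peak region.

    includers maps a region id to the ids of the regions that include it
    (the arrow targets). CL is the length of the longest chain of arrows
    that can be followed from the region.
    """
    @lru_cache(maxsize=None)
    def level(region):
        parents = includers.get(region, ())
        if not parents:
            return 0                       # no region includes this one
        return 1 + max(level(parent) for parent in parents)

    return {region: level(region) for region in includers}

# Hypothetical relations: r1 is inside r2 and r3, r2 inside r3, r3 inside r4.
includers = {"r1": ["r2", "r3"], "r2": ["r3"], "r3": ["r4"], "r4": [], "r5": []}
print(inclusion_levels(includers))
```

Here r1 can reach r4 via r2 and r3 (three arrows) or via r3 alone (two arrows), so its inclusion level is 3.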

Next, as shown in FIG. 9, peak regions that include any other peak region, and peak regions that have no inclusion relationship with any other peak region, are removed (that is, the regions on the side the arrows point to are removed). This leaves only peak regions that are included in another peak region and do not themselves include any other peak region.
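The removal rule can be expressed compactly on the same kind of inclusion table. The following is a sketch under the assumption that `included_in` maps every region to the list of regions that include it; the names are hypothetical.

```python
def keep_innermost(included_in: dict[str, list[str]]) -> list[str]:
    """Keep regions that are included in another region and include none."""
    # Every region that appears as an arrow target includes some other region.
    includers = {parent for parents in included_in.values() for parent in parents}
    return sorted(
        region
        for region, parents in included_in.items()
        if parents and region not in includers
    )
```

A region with an empty list that is also not an arrow target has no inclusion relationship at all and is therefore dropped.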

Finally, as shown in FIG. 10, when two peak regions overlap, only one of them is removed. There are various ways to decide which one to remove; FIG. 10 shows an example for the rule "remove the one with the smaller parameter P."
The above procedure allows peak regions to be extracted accurately. In addition, each extracted peak region carries an inclusion level, that is, a reliability for the peak region. Here, the point of greatest intensity within a peak region is taken as the vertex of that peak region.

(2) Step 2: Grouping of peaks from multiple waveform data
In step 2, peaks obtained from multiple waveform data are grouped. When peaks are extracted from each of several waveform data sets and compared, it is necessary to decide which peak in one waveform corresponds to which peak in another.

However, the m/z values of corresponding peaks rarely match exactly, so peaks are difficult to match by exact agreement of m/z values. Peaks that are relatively close must therefore be grouped appropriately as belonging to the same group even when their m/z values differ slightly. The peak extraction apparatus of the present embodiment uses the inclusion level of each peak region obtained in step 1 to group peaks with a reliability attached. The peak grouping process consists of the following sub-steps.
(2-1) Superimpose the vertex coordinates of the peak regions obtained from the multiple waveforms.
(2-2) Divide the m/z axis into small sections.
(2-3) Compute the sum of the inclusion levels for each section to obtain a peak density curve over all sections.
(2-4) Determine the peak grouping ranges from the peak density curve.
(2-5) Group the peaks that belong to the same grouping range.

Each sub-step is described below, but first the overall flow of step 2 is outlined with reference to FIG. 11. First, for each waveform data set, the peak regions extracted in step 1 and their peak vertex coordinates are prepared. The peak vertex coordinates of the individual waveforms are superimposed, and in this state the m/z axis is divided into small sections. Next, the sum of the inclusion levels is computed for each small section and plotted as a curve, which is called the peak density curve. On this peak density curve, the portions whose values exceed a threshold L are taken as grouping ranges. In this way a reliability can be assigned to each grouping range.
Each sub-step is described in more detail below.

(2-1) Superimposing the peak vertices
First, the peak vertices obtained from each single waveform are superimposed so that both the m/z axis and the intensity axis are aligned.

(2-2) Dividing the m/z axis
The m/z axis is divided into small sections. Possible ways of dividing it include a) division with a fixed section width, b) division with a varying section width, c) division that allows adjacent sections to overlap, and d) division that does not allow overlap. Representative division methods are described below with reference to Table 3 and FIG. 12. FIG. 12 shows concrete examples of division methods 1 to 4.

Among the methods shown in Table 3, division method 1 divides the axis into equal intervals starting from the left end of the waveform (no overlap between sections); it is the basic method of dividing m/z at equal intervals. Division method 2, by contrast, divides the axis from the left end of the waveform into sections whose width varies with the m/z value (no overlap between sections). In SELDI TOF-MS, for example, the resolution becomes coarser as m/z increases, so this method makes the division of the m/z axis effectively close to equally spaced. Division method 3 divides the axis into sections of fixed width (for example, 1.2) centered on positions whose fractional part is 0 (sections overlap), so that the centers of the sections are equally spaced. Division method 4 divides the axis into sections centered on positions whose fractional part is 0 and having a width that varies with the m/z value (sections overlap); it combines division methods 2 and 3. The present embodiment adopts division method 4, tuned for application to SELDI TOF-MS.
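Division methods 1 and 3 are simple enough to sketch directly. The functions below are illustrative only; their names and signatures are assumptions. Methods 2 and 4 are not sketched because the description does not state how the section width depends on m/z.

```python
import math


def divide_equal(start: float, end: float, width: float) -> list[tuple[float, float]]:
    """Division method 1: equal, non-overlapping sections from the left end."""
    count = math.ceil((end - start) / width)
    return [(start + i * width, start + (i + 1) * width) for i in range(count)]


def divide_centered(start: float, end: float, width: float = 1.2) -> list[tuple[float, float]]:
    """Division method 3: fixed-width sections centered on integer m/z values."""
    first, last = math.ceil(start), math.floor(end)
    half = width / 2
    # A width greater than 1 makes neighboring sections overlap.
    return [(center - half, center + half) for center in range(first, last + 1)]
```

With the example width of 1.2, each section in method 3 overlaps its neighbor by 0.2, so a vertex close to a half-integer m/z value contributes to both adjacent sections.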

(2-3) Deriving the peak density curve
For each section obtained in sub-step (2-2), all peak regions whose vertices fall within the section are listed, and the sum of their inclusion levels is taken as the peak density of that section. The peak density is higher for higher inclusion levels and for sections containing more peaks. Computing the peak density for each section and plotting it over all sections yields the peak density curve. Besides this basic method of summing the inclusion levels, various alternative peak density calculation methods exist. Table 4 shows representative examples.

Peak density calculation method 1 in Table 4 obtains the peak density from a monotonically increasing function F(x) as follows:

$$\text{peak density} = \sum_{i=1}^{n} F(CL_i) \tag{4}$$

In equation (4), $P_1$ to $P_n$ are the peaks belonging to the section and $CL_i$ is the inclusion level of $P_i$. Various peak density calculation methods can be realized by choosing a specific function F(x). Some examples follow.

$$F(x) = x \tag{5}$$

When F(x) is given by equation (5), peak density calculation method 1 coincides with the basic method; that is, the peak density is the sum of the inclusion levels.
The following function can also be used:

$$F(x) = x \cdot x \tag{6}$$

With equation (6), the peak density is the sum of the squares of the inclusion levels. Equation (6) is a weighting that reflects the greater influence a higher inclusion level has on the peak density.

$$F(x) = \begin{cases} 0 & (x < 3) \\ 1 & (x \ge 3) \end{cases} \tag{7}$$

Equation (7) treats peaks with a low inclusion level as presumably due to noise and ignores them, so that only peaks with an inclusion level of 3 or more are counted.
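Equations (4) to (7) translate almost directly into code. The sketch below is illustrative: the function names are hypothetical, peaks are assumed to be given as (m/z, CL) pairs, and a vertex is assumed to belong to a section when it lies in the half-open interval from the left end up to, but not including, the right end.

```python
from typing import Callable, Iterable


def f_identity(x: int) -> int:
    """Equation (5): plain sum of inclusion levels (the basic method)."""
    return x


def f_square(x: int) -> int:
    """Equation (6): higher inclusion levels weigh more."""
    return x * x


def f_threshold(x: int) -> int:
    """Equation (7): count only peaks with inclusion level 3 or more."""
    return 1 if x >= 3 else 0


def peak_density(levels: Iterable[int], f: Callable[[int], int] = f_identity) -> int:
    """Equation (4): sum of F(CL_i) over the peaks in one section."""
    return sum(f(cl) for cl in levels)


def density_curve(peaks, sections, f=f_identity):
    """Peak density of every section; peaks are (mz, CL) pairs."""
    return [
        peak_density((cl for mz, cl in peaks if left <= mz < right), f)
        for left, right in sections
    ]
```

Passing a different `f` switches between the variants without changing the summation itself, which mirrors how the description treats F(x) as a free choice.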

Next, peak density calculation method 2 is described. In this method, a reference section is defined as a section that contains the section of interest at its center and is S times as wide. The peak density is computed for the section of interest and for the reference section by any variant of peak density calculation method 1, giving D1 and D2 respectively, and the value S·D1/D2 is taken as the actual peak density. FIG. 13 is a conceptual diagram of this calculation. The value expresses how high the method-1 density of the section is relative to its surroundings. By referring to a wide range on both sides and comparing against it, this method yields a more accurate peak density. The present embodiment uses calculation method 2, with equation (6) as the monotonically increasing function F(x) in the underlying method-1 calculation.
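Calculation method 2 can be sketched as a ratio of two method-1 densities. The function below is an illustration under stated assumptions: peaks are (m/z, CL) pairs, section membership uses half-open intervals, F(x) defaults to equation (6) as in the embodiment, and a result of 0 is returned when the reference section contains no peaks (a case the description does not address).

```python
def relative_density(peaks, left, right, s, f=lambda x: x * x):
    """Return S * D1 / D2 for the section [left, right)."""
    # D1: method-1 density of the section of interest.
    d1 = sum(f(cl) for mz, cl in peaks if left <= mz < right)
    # Reference section: same center, S times as wide.
    center = (left + right) / 2
    half = s * (right - left) / 2
    d2 = sum(f(cl) for mz, cl in peaks if center - half <= mz < center + half)
    if d2 == 0:
        return 0.0  # assumption: an empty neighborhood gives zero density
    return s * d1 / d2
```

If the peaks are spread evenly over the reference section, D2 is about S times D1 and the value is near 1; if all peaks of the reference section fall inside the section of interest, the value reaches its maximum of S.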

(2-4) Determining the grouping ranges
The grouping ranges are determined from the peak density curve obtained in sub-step (2-3) by the following procedure. FIG. 14 outlines the method. First, if no section of the peak density curve has a peak density greater than the threshold L_0, the procedure ends. Second, if such a section exists, the section with the highest peak density is taken as the grouping center C, and its peak density is denoted L (a). Third, a range containing the center C is taken as the grouping range by a method specified in advance, as described later (a'), and the reliability of that grouping range is set to L (a). Fourth, the peak density of every section within the grouping range is set to 0 (b). The sections to the left and to the right of the grouping range are also examined in turn, and their peak density is set to 0 for as long as the peak density keeps decreasing (b). The procedure then returns to the first step and the same operations are applied to the remaining peaks (L changes each time: c, d, e). When no section with a peak density greater than the threshold L_0 remains, the procedure ends and the final grouping ranges are fixed (e).
Various methods can be used to choose the range containing C as the grouping range. Table 5 shows representative methods.
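The iterative procedure above can be sketched as a loop over a list of per-section peak densities. This is a simplified illustration, not the embodiment itself: the function name is hypothetical, sections are identified by their index, the range around C is chosen as a fixed ±w sections, and "keeps decreasing" is interpreted as strictly decreasing.

```python
def find_grouping_ranges(density, l0, w=1):
    """Return (first_section, last_section, reliability) triples."""
    work = list(density)  # densities are zeroed in this copy as ranges are claimed
    ranges = []
    while work:
        peak = max(work)
        if peak <= max(l0, 0):  # step 1: nothing above the threshold L_0 remains
            break
        c = work.index(peak)  # step 2: center C, whose peak density is L
        lo, hi = max(0, c - w), min(len(work) - 1, c + w)  # step 3: range around C
        ranges.append((lo, hi, peak))  # the reliability of the range is L
        # Step 4: clear the slopes outside the range while the density keeps falling.
        for start, step in ((lo, -1), (hi, 1)):
            prev, i = work[start], start + step
            while 0 <= i < len(work) and work[i] < prev:
                prev, work[i] = work[i], 0
                i += step
        # Step 4: clear the range itself, then repeat from step 1.
        for i in range(lo, hi + 1):
            work[i] = 0
    return ranges
```

The ranges come out in order of decreasing reliability, because each pass claims the highest remaining density.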

Grouping range determination method 1 in Table 5 takes the interval of ±w centered on point C as the grouping range. Here w may be a fixed value or a value that varies with the m/z value of point C. Grouping range determination method 2 examines the sections to the left and right of point C in turn and takes the grouping range to extend to where the peak density has fallen to L'. Here L' is determined from the peak density L at point C, for example as L' = L·α, where α is a constant with 0 ≤ α < 1, for example about 0.1. Alternatively, L' = L − β may be used, where β is a constant with 0 < β ≤ L. Grouping range determination method 3 combines methods 1 and 2. The present embodiment uses grouping range determination method 3.

Method 1 takes a predetermined range as the grouping range without considering changes in peak density, so where fine peaks are densely packed, adjacent but distinct peaks can be assigned to the same grouping range. Method 2, on the other hand, places no limit on the width of the grouping range, so an unrealistically wide grouping range may result. Method 3 avoids both problems by combining the two methods.
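The description does not spell out how methods 1 and 2 are combined in method 3. One plausible reading, sketched below purely as an assumption, is to extend the range outward from C while the density stays above L' = L·α (method 2) but never farther than w sections from C (method 1). The function name and the use of section indices are illustrative.

```python
def grouping_range(density, c, w, alpha=0.1):
    """Return (first_section, last_section) around the center index c."""
    limit = density[c] * alpha  # L' = L * alpha
    lo = c
    # Extend left while within w sections of C and still above L'.
    while lo > 0 and c - (lo - 1) <= w and density[lo - 1] > limit:
        lo -= 1
    hi = c
    # Extend right under the same two conditions.
    while hi < len(density) - 1 and (hi + 1) - c <= w and density[hi + 1] > limit:
        hi += 1
    return lo, hi
```

With a small w the cap of method 1 decides the range; with a large w the density threshold of method 2 decides it.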

(2-5) Grouping the peaks
Returning to the individual peak vertices, each vertex is checked to see which of the grouping ranges obtained in sub-step (2-4) it belongs to, and peaks belonging to the same grouping range are grouped together. Peaks that belong to no grouping range are left ungrouped, that is, they are regarded as noise.
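The assignment of peak vertices to grouping ranges can be sketched as follows. The data layout is an assumption for illustration: each peak is a (sample name, m/z) pair, each grouping range is a closed (low, high) interval on the m/z axis, and a vertex is assigned to the first range that contains it.

```python
def group_peaks(peaks, ranges):
    """Return (groups, noise) for the given peak vertices.

    groups maps the index of each grouping range to the peaks inside it;
    noise lists the peaks that fall in no grouping range.
    """
    groups = {index: [] for index in range(len(ranges))}
    noise = []
    for sample, mz in peaks:
        for index, (low, high) in enumerate(ranges):
            if low <= mz <= high:
                groups[index].append((sample, mz))
                break
        else:
            # No grouping range contains this vertex: regard it as noise.
            noise.append((sample, mz))
    return groups, noise
```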

As described above, in the embodiment of the present invention a reliability is assigned to each individual peak, the sum of the reliabilities of peaks at the same position across multiple waveforms is obtained as the peak density, and a reliability is assigned to each peak group on the basis of this peak density. As a result, a peak group consisting mostly of noise, or of exceptional peaks seen in only a small fraction of the waveforms, receives a low reliability. This prevents noise from degrading quality. Moreover, rather than simply grouping neighboring peaks, the method computes the peak density for each small section and groups peaks with the rise and fall of the peak density taken into account, so that only the portions most likely to be peaks are grouped.

Furthermore, because a reliability is assigned to each peak group, when a large number of peak groups is obtained, subsequent work can be performed only on the highly reliable peak groups, or starting from the most reliable ones, which makes the work more efficient.

An example using the plasma of 40 rats is described below. In the example of FIG. 18, the 40 rats used are assumed to be divided according to their characteristics into several groups, such as "normal rats" and "rats subjected to surgical treatment." FIG. 18 is a flowchart showing the flow of processing in this example. As shown in FIG. 18, measurement is performed by SELDI TOF-MS in step S1. The plasma collected from each of the 40 rats is measured twice by SELDI TOF-MS, and the results are output as waveform data in a form that can be loaded into a computer. One waveform data set is obtained for each measurement (step S1).

Next, the computer processing described below (step S2) is applied to each waveform data set loaded into the computer. In step S2, the waveform data are first prepared, baseline removal (step S2-1) and smoothing (step S2-2) are performed, and peak extraction is performed in step S2-3. The peak extraction of step S2-3 comprises peak extraction from a single waveform in step S2-3-1 (step 1 above) and grouping of peaks from multiple waveforms in step S2-3-2 (step 2 above). Then, in step S2-4, differences between groups are detected on the basis of the result of step S2-1 or step S2-2 together with the result of step S2-3.

FIG. 19 shows 10 sample waveform data sets (corresponding to point (a) in FIG. 18) obtained by baseline removal (step S2-1) and smoothing (step S2-2). When the waveform data of samples 1 to 10 in FIG. 19 are input to the peak extraction apparatus, peak extraction from a single waveform is applied to each individually. This yields the peaks of each waveform data set and the inclusion level of each peak (corresponding to point (b) in FIG. 18). FIG. 20 shows the peak extraction result for sample 1 of FIG. 19, and FIG. 21 shows an enlarged portion of FIG. 20. The waveforms in these figures are the data after smoothing, and the dots mark the peak vertices.

After peak extraction and inclusion level calculation for each waveform data set, "step 2: grouping of peaks from multiple waveforms" is performed. FIG. 22 shows the grouping result, and FIG. 23 shows an enlarged portion of it. The upper part of each figure shows the waveform data of samples 1 to 10 with the extracted peak vertices. The middle part shows the resulting peak density curve. The lower part shows the peaks grouped on the basis of the peak density curve; peak vertices drawn with the same mark belong to the same group.

Next, on the basis of the peak groups and grouping ranges obtained from the peak extraction apparatus, differences between the group of normal rats (hereinafter group A) and the group of rats subjected to surgical treatment (hereinafter group B) are detected. FIG. 24 shows the waveform data, with group A and group B drawn separately above and below, around a representative grouping range in which the two groups differ. FIG. 25 shows the peak vertices, or the representative points described later, around the same grouping range. Difference detection is described below with reference to FIGS. 24 and 25. First, for all peak vertices contained in a given peak group, the intensity of each vertex is obtained. From these intensities, those belonging to rats of group A (black circles in FIG. 25) or group B (crosses in FIG. 25) are taken out (intensity acquisition processing), and the significance probability of the difference in intensity between group A and group B is computed by an appropriate statistical test such as the Mann-Whitney U test. The same significance probability is computed for every peak group obtained from the peak extraction apparatus, and peak groups with a sufficiently small significance probability are detected as peak groups in which group A and group B differ. Points to note about the "intensity acquisition processing" are described below.

(i) When a peak group contains more than one peak from the same waveform data of a group A or group B rat, one of those peaks is chosen as the representative point and only the intensity of the representative point is taken. The representative point may be chosen, for example, as the peak with the greatest intensity.

(ii) When a peak group contains no peak from a given waveform data set of a group A or group B rat, one representative point on that waveform data is chosen within the grouping range of the peak group and its intensity is taken. The representative point may be chosen, for example, as the point with the greatest intensity.

(iii) When choosing representative points in (i) and (ii), the waveform data used may be either the data before smoothing or the data after smoothing ((e) and (f) in FIG. 18, respectively). Whichever is used, however, it must be used consistently across all data subjected to the statistical test.
- Intensities may be normalized as necessary.
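Cases (i) and (ii) of the intensity acquisition processing can be sketched for a single waveform and a single peak group as follows. The function name and data layout are assumptions for illustration, and the "greatest intensity" rule is the example choice mentioned in the text rather than the only possibility.

```python
from typing import Optional


def representative_intensity(
    group_peak_intensities: list[float],
    waveform: list[tuple[float, float]],
    low: float,
    high: float,
) -> Optional[float]:
    """Intensity that one waveform contributes to one peak group.

    group_peak_intensities: intensities of this waveform's peaks in the group.
    waveform: (mz, intensity) points of the same waveform.
    low, high: the grouping range of the peak group on the m/z axis.
    """
    # Case (i): one or more peaks in the group, so take the strongest peak.
    if group_peak_intensities:
        return max(group_peak_intensities)
    # Case (ii): no peak in the group, so take the strongest waveform point
    # inside the grouping range.
    in_range = [intensity for mz, intensity in waveform if low <= mz <= high]
    return max(in_range) if in_range else None
```

Collecting these values for every waveform of group A and of group B gives the two samples that are passed to the statistical test.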

Although rat plasma was used in this example, the method is equally applicable to various biological samples, such as plasma, serum, urine, tissue extracts, and cerebrospinal fluid, from other animals or from humans. If the groups compared are, for example, "healthy subjects" and "patients with disease S," grouping ranges that distinguish healthy subjects from patients can be found. Furthermore, by measuring a separately obtained biological sample such as plasma by SELDI TOF-MS to obtain waveform data and examining the intensity of that waveform data in the grouping ranges where a difference exists, the presence or absence of the disease in the person who provided the sample can be estimated, that is, a diagnosis can be made.

Although a statistical test was used in this example to detect differences between rat groups, any other method capable of detecting differences may be used. In this example differences between rat groups were detected for each peak group individually, but differences can also be detected from combinations of peak groups. Further, although this example detected the difference between two groups, A and B, differences among three or more groups can be detected by using an appropriate method such as the Kruskal-Wallis test.

The method of the present embodiment is not limited to SELDI TOF-MS and is applicable to any apparatus whose output is obtained as a waveform, such as liquid chromatographs and mass spectrometers. It is particularly effective when peaks are broad, that is, when the peak top is indistinct or adjacent peaks overlap.

The present invention relates to a technique for extracting, by information processing using an electronic computer, the waveform peaks corresponding to biologically derived molecules such as proteins contained in a biological sample, from waveform data of spectral characteristics obtained by measuring various biological samples such as plasma, serum, urine, tissue extracts, and cerebrospinal fluid with a mass spectrometer such as a time-of-flight mass spectrometer (TOF-MS) or a liquid chromatograph.

A diagram showing the positioning of the proteome analysis apparatus according to one embodiment of the present invention.
A diagram showing a schematic configuration example of the peak extraction apparatus according to the present embodiment.
A diagram outlining the peak extraction process from single waveform data (step 1) and the peak grouping process (step 2) according to the present embodiment.
A diagram showing the process of extracting peaks from single waveform data and obtaining the inclusion level according to the present embodiment.
FIGS. 4(a) to 4(d) show the four cases in which the peak search ends.
A diagram showing the groups of peak regions obtained in sub-step (1-1) for several parameter values.
A diagram showing the inclusion relationships examined for parameter values P_1, P_2, P_3, and P_4.
A diagram showing an example that, although an incomplete inclusion, can be regarded as inclusion under the conditions for establishing an inclusion relationship.
A diagram showing how the inclusion level (CL) of each peak region is set.
A diagram showing the procedure for removing peak regions that include any other peak region and peak regions that have no inclusion relationship with any other peak region.
A diagram showing the procedure for removing only one of two peak regions when they overlap.
A diagram showing the grouping of peaks obtained from multiple waveform data.
A diagram showing concrete examples of section division methods 1 to 4.
A conceptual diagram of peak density calculation method 2, which obtains the peak density by defining as the reference section a section that contains the section of interest at its center and is S times as wide.
A diagram outlining the grouping range determination method.
A diagram showing an example of a procedure that extracts peaks from single waveform data and then groups the peaks.
A diagram showing an example of a procedure that extracts peaks by the general processing described above and then groups the peaks.
A diagram showing a method that computes the peak density for each small section and groups peaks with increases and decreases in peak density taken into account.
A flowchart showing the flow of processing according to an example of the present invention.
A diagram showing 10 samples of waveform data (corresponding to point (a) in FIG. 18) obtained by baseline removal and smoothing.
A diagram showing the peak extraction result for sample 1 of FIG. 19.
A diagram showing an enlarged portion of FIG. 20.
A diagram showing the result of grouping peaks from multiple waveforms after peak extraction and inclusion level calculation for each waveform data set.
A diagram showing an enlarged portion of FIG. 22.
A diagram showing the waveform data, with group A and group B drawn separately above and below, around a representative grouping range in which the normal rat group (group A) and the surgically treated rat group (group B) differ.
A diagram showing the representative point of each waveform within each grouping range, around a representative grouping range in which the normal rat group (group A) and the surgically treated rat group (group B) differ.

Explanation of symbols

1: waveform data acquisition unit; 3: peak extraction unit; 5: grouping processing unit; 7: data output unit; 11: difference detection unit.

Claims

A peak extraction method for extracting peaks from a plurality of waveform data, each representing a spectral characteristic having a variable axis and an intensity axis, the method comprising:
a step of determining, for the single waveform data of each spectral characteristic, a different peak region for each of a plurality of parameter values, by estimating, for each point in the single waveform data, whether the point is in the vicinity of a peak on the variable axis and setting each set of consecutive points estimated to be in the vicinity of a peak as a peak region, the estimation being based on a parameter whose value is varied in a plurality of ways;
a step of obtaining inclusion relations among the peak regions determined for the respective parameter values, and assigning a first reliability to each peak region based on the number of inclusion relations;
a step of preparing a plurality of the single waveform data, and determining a second reliability on the variable axis based on the plurality of first reliabilities obtained from the plurality of waveform data; and
a step of determining a grouping range based on the second reliability, and grouping into one group those peak vertices that lie within the grouping range, among the peak vertices each obtained as the point of maximum intensity in a peak region of the single waveform data.
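For illustration only, and not as part of the claims, the first and last steps above (forming peak regions from consecutive flagged points for several parameter values, and taking the peak vertex as the point of maximum intensity in a region) can be sketched as follows. The function names are hypothetical, and the per-point estimator used here, a plain intensity cutoff `y >= p`, is a stand-in assumption chosen only so that the sketch is self-contained; it is not the estimation recited in the claims.

```python
def runs_of_true(flags):
    """Return (start, end) index pairs, inclusive, for each run of consecutive True values."""
    regions = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(flags) - 1))
    return regions


def regions_per_parameter(intensity, parameters):
    """Compute one list of peak regions per parameter value.

    Assumption: a point is flagged as 'near a peak' when its intensity is at
    least p. This is a stand-in estimator, not the one in the claims.
    """
    return {p: runs_of_true([y >= p for y in intensity]) for p in parameters}


def peak_vertex(intensity, region):
    """Index of the maximum-intensity point inside an inclusive (start, end) region."""
    start, end = region
    return max(range(start, end + 1), key=lambda i: intensity[i])


if __name__ == "__main__":
    waveform = [0, 2, 6, 9, 6, 2, 0, 3, 7, 3, 0]
    for p, regions in regions_per_parameter(waveform, [2, 5, 8]).items():
        print(p, regions, [peak_vertex(waveform, r) for r in regions])
```

With this stand-in, lowering the parameter widens the regions, so the regions found for different parameter values nest inside one another, which is the property the inclusion relations rely on.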

The peak extraction method according to claim 1, wherein the step of estimating whether a point is in the vicinity of a peak comprises:
a step of defining a point A in the waveform data; and
a step of sequentially searching, relative to the point A, for a point B lying in a search direction on the variable axis, the search being performed both in the direction of decreasing variable and in the direction of increasing variable from the point A,
wherein the search is terminated when the point B in the search direction corresponds to any of the following cases, that case being taken as the search end case, and whether or not the point A is in the vicinity of a peak is estimated based on the search end cases of the searches in the decreasing direction and the increasing direction.
Case 1: The intensity axis coordinate of the point B exceeds the upper threshold.
Case 2: The intensity axis coordinate of the point B is below the lower threshold.
Case 3: None of the above cases applies and the end of the waveform data is reached.
Here, the upper threshold and the lower threshold are values determined based on the parameter.
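The two-directional search and its three end cases can be sketched as below, purely as an illustration. The claim states only that the thresholds are determined from the parameter and that the estimate is based on the two search end cases. The specific choices here are therefore assumptions: the upper threshold is taken as `y_A * (1 + p)`, the lower threshold as `y_A * (1 - p)`, and point A is treated as near a peak when both searches end in Case 2. All names are hypothetical.

```python
from enum import Enum


class EndCase(Enum):
    ABOVE_UPPER = 1   # Case 1: B exceeds the upper threshold
    BELOW_LOWER = 2   # Case 2: B falls below the lower threshold
    END_OF_DATA = 3   # Case 3: neither happened before the data ended


def search_end_case(intensity, a, step, upper, lower):
    """Walk from index a in direction step (-1 or +1) and report how the search ends."""
    b = a + step
    while 0 <= b < len(intensity):
        if intensity[b] > upper:
            return EndCase.ABOVE_UPPER
        if intensity[b] < lower:
            return EndCase.BELOW_LOWER
        b += step
    return EndCase.END_OF_DATA


def is_near_peak(intensity, a, p):
    """Estimate whether point a is near a peak for parameter p.

    Assumptions (not taken from the claims): thresholds are relative to the
    intensity at a, and 'near a peak' means both searches end in Case 2.
    """
    upper = intensity[a] * (1 + p)
    lower = intensity[a] * (1 - p)
    left = search_end_case(intensity, a, -1, upper, lower)
    right = search_end_case(intensity, a, +1, upper, lower)
    return left is EndCase.BELOW_LOWER and right is EndCase.BELOW_LOWER


if __name__ == "__main__":
    waveform = [0, 1, 5, 9, 10, 9, 5, 1, 0]
    print([is_near_peak(waveform, a, 0.2) for a in range(len(waveform))])
```

Under these assumptions a larger `p` tolerates a higher neighbouring value before Case 1 is triggered, so more points on the flanks of a peak are flagged and the resulting peak region widens.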

The peak extraction method according to claim 2, wherein the step of assigning the first reliability includes a step of tracing, starting from a given peak region, successively through peak regions for as long as a peak region that includes the current one exists, and taking the number of tracing steps as the reliability.
However, when a plurality of routes exist in the tracing, the value for the route with the maximum number of tracing steps is taken as the reliability.
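The tracing described above amounts to finding the longest chain of successively enclosing regions. A minimal sketch follows, under the assumption that the parameter values are ordered so that a region may only be enclosed by a region from a later parameter value; this ordering is introduced here to keep the trace free of cycles and is not stated in the claims. The function name and data layout are hypothetical.

```python
from functools import lru_cache


def inclusion_levels(regions_by_param):
    """Map each region to the length of the longest chain of enclosing regions.

    regions_by_param: list of lists of inclusive (start, end) pairs, one list
    per parameter value. Assumption: a region at list position k can only be
    enclosed by regions at positions greater than k.
    """
    nodes = [(k, s, e)
             for k, regions in enumerate(regions_by_param)
             for (s, e) in regions]

    @lru_cache(maxsize=None)
    def level(node):
        k, s, e = node
        enclosing = [n for n in nodes if n[0] > k and n[1] <= s and e <= n[2]]
        # When several routes exist, keep the one with the most tracing steps.
        return max((1 + level(n) for n in enclosing), default=0)

    return {node: level(node) for node in nodes}


if __name__ == "__main__":
    nested = [[(3, 3)], [(2, 4), (8, 8)], [(1, 5), (7, 9)]]
    for node, count in inclusion_levels(nested).items():
        print(node, count)
```

In the example, region `(3, 3)` can be traced directly to `(1, 5)` in one step or through `(2, 4)` in two steps; the longer route determines its value.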

The peak extraction method according to any one of the preceding claims, wherein the step of determining the second reliability includes a step of dividing the variable axis into a plurality of sections and generating a peak density curve based on the first reliability for each section, and
the step of determining the grouping range includes a step of determining the grouping range according to the level of reliability, based on the peak density curve and a given threshold.
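As an illustration of the density-based grouping, the sketch below divides the variable axis into equal-width sections, sums the first reliabilities of the peaks pooled from all waveforms in each section, and takes each run of consecutive sections whose density reaches a threshold as one grouping range. Equal-width sections, summation as the density measure, and all function names are assumptions made for this example and are not recited in the claims.

```python
def peak_density(peaks, axis_min, axis_max, n_sections):
    """Sum first reliabilities per section.

    peaks: iterable of (position, first_reliability) pooled over all waveforms.
    """
    width = (axis_max - axis_min) / n_sections
    density = [0.0] * n_sections
    for position, reliability in peaks:
        if axis_min <= position <= axis_max:
            k = min(int((position - axis_min) / width), n_sections - 1)
            density[k] += reliability
    return density


def grouping_ranges(density, threshold, axis_min, width):
    """Return (low, high) ranges covering runs of sections with density >= threshold."""
    ranges = []
    start = None
    # A trailing sentinel below any finite threshold closes a run that reaches the end.
    for k, d in enumerate(density + [float("-inf")]):
        if d >= threshold and start is None:
            start = k
        elif d < threshold and start is not None:
            ranges.append((axis_min + start * width, axis_min + k * width))
            start = None
    return ranges


def group_vertices(positions, ranges):
    """Collect the peak vertex positions that fall inside each grouping range."""
    return [[x for x in positions if low <= x < high] for low, high in ranges]


if __name__ == "__main__":
    pooled = [(100.2, 3), (100.4, 2), (100.9, 3), (105.1, 1), (109.6, 2), (109.8, 3)]
    curve = peak_density(pooled, 100.0, 110.0, 10)
    found = grouping_ranges(curve, 4, 100.0, 1.0)
    print(curve, found, group_vertices([x for x, _ in pooled], found))
```

The isolated low-reliability peak near 105.1 stays below the threshold and is not assigned to any group, which shows how the reliability of each peak influences the grouping.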

The peak extraction method according to any one of claims 1 to 4, wherein the spectral characteristic is a spectral characteristic resulting from a biological molecule.

A program for causing a computer to execute the peak extraction method according to any one of claims 1 to 5.

A peak extraction apparatus comprising:
a waveform data acquisition unit that acquires a plurality of waveform data, each representing a spectral characteristic having a variable axis and an intensity axis;
a peak extraction unit that extracts peaks in single waveform data together with the reliability of each peak; and
a grouping processing unit that groups the peaks based on the reliability of the peaks,
wherein the peak extraction unit estimates, for each point in the single waveform data of each spectral characteristic, whether or not the point is in the vicinity of a peak on the variable axis, and sets each set of consecutive points so estimated as a peak region, this processing being performed based on a parameter whose value is varied in a plurality of ways so that a different peak region is obtained for each parameter value; obtains inclusion relations among the peak regions obtained for the respective parameter values and assigns a first reliability to each peak region based on the number of inclusion relations; and, for a plurality of the single waveform data, determines a second reliability on the variable axis based on the plurality of first reliabilities obtained from the plurality of waveform data, and
wherein the grouping processing unit determines a grouping range based on the second reliability, and groups into one group those peak vertices that lie within the grouping range, among the peak vertices each obtained as the point of maximum intensity in a peak region of the single waveform data.

The peak extraction apparatus according to claim 7, further comprising a difference detection unit that detects a difference between a plurality of different waveform data groups, each being a group comprising one or more of the single waveform data, based on at least one of a difference in intensity of the peaks included in the grouped peaks and a difference in intensity within the grouping range of each of the single waveform data.
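A difference detection of this kind can be sketched, for illustration only, by taking one representative intensity per waveform within a grouping range and comparing the two waveform data groups. Using the maximum intensity in the range as the representative value and the difference of group means as the comparison are assumptions of this example; the claim does not specify a statistic. The function names are hypothetical.

```python
from statistics import mean


def representative_intensity(waveform, grouping_range):
    """Maximum intensity of a waveform inside an inclusive (low, high) range.

    waveform: list of (variable, intensity) pairs. Returns 0.0 if no point
    falls in the range (an assumption made for this sketch).
    """
    low, high = grouping_range
    values = [y for x, y in waveform if low <= x <= high]
    return max(values) if values else 0.0


def group_difference(group_a, group_b, grouping_range):
    """Difference of mean representative intensities (group B minus group A)."""
    a = mean([representative_intensity(w, grouping_range) for w in group_a])
    b = mean([representative_intensity(w, grouping_range) for w in group_b])
    return b - a


if __name__ == "__main__":
    group_a = [[(100.0, 1.0), (100.5, 4.0), (101.0, 1.0)],
               [(100.0, 1.0), (100.5, 6.0), (101.0, 1.0)]]
    group_b = [[(100.0, 1.0), (100.5, 10.0), (101.0, 2.0)]]
    print(group_difference(group_a, group_b, (100.0, 101.0)))
```

In practice a statistical test would normally accompany such a comparison; the mean difference is used here only to keep the example short.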

The peak extraction apparatus according to claim 7 or 8, wherein peaks of a spectral characteristic originating from a biological molecule are extracted.