JP7767181B2

JP7767181B2 - Data processing system and data processing method

Info

Publication number: JP7767181B2
Application number: JP2022027097A
Authority: JP
Inventors: 進芹田
Original assignee: Hitachi Power Solutions Co Ltd
Current assignee: Hitachi Power Solutions Co Ltd
Priority date: 2022-02-24
Filing date: 2022-02-24
Publication date: 2025-11-11
Anticipated expiration: 2042-02-24
Also published as: JP2023123182A

Description

The present invention relates to data processing technology for detecting signs of abnormality in a target, and in particular to data processing for detecting signs of failure in industrial machinery.

It is common practice to build an evaluation model from time-series data measured from an analysis target and to use that model to evaluate the condition of the target. For example, condition-based maintenance (CBM), also called predictive maintenance, is a known maintenance method for maintenance targets such as facilities, machinery, equipment, and devices.

Under CBM, the condition of equipment is monitored even while the equipment is operating stably, and the equipment is maintained before it fails, according to its state of deterioration. For example, the operating state of equipment such as wind power generation facilities and gas turbines, including vibration and heat generation, is monitored, and failure prediction is performed based on the results.

A predictive maintenance system performs machine learning on time-series data obtained from devices and equipment to create a learning model, and applies time-series data from sensors to this model to predict equipment failures. The system takes in normal data from the sensor data as training data, builds a detection model from the training data, and verifies the detection performance of the model by applying the past failure history of the device to it as validation data.

In this type of system, it is desirable to improve the suitability of the training data in order to improve the performance of the evaluation model. For example, Japanese Patent Application Laid-Open No. 2020-102001 aims to make it easy to confirm, when measurement data is acquired from an industrial machine, whether the data was measured during the same operation. It discloses a training data confirmation support device for machine-learning-based anomaly detection in industrial machinery. The device makes it easier to check for contamination by inappropriate data when training data consisting only of normal data is acquired in advance, and it determines such contamination by matching against past data when the learning model is generated.

Furthermore, Japanese Patent Application Laid-Open No. 2021-33554 discloses a method for refining training data. For each of a plurality of pieces of training data, the method computes a score representing how strongly that piece affects the prediction accuracy of the model on one sample data set, identifies harmful training data based on the score, and removes it from the training data, thereby improving the prediction accuracy of the model.

Japanese Patent Application Laid-Open No. 2020-102001
Japanese Patent Application Laid-Open No. 2021-33554

Conventional systems select normal data from sensor data recorded during periods when the equipment was not in failure, and use the selected data as training data to generate a predictive maintenance model for the equipment. However, judging whether data is suitable as training data is not easy in the first place. For example, even in data from a period without a recorded failure, an unrecorded failure may have occurred, or the equipment may have been operating differently from usual without an obvious failure. Conventionally, this kind of data has been removed from the training data according to predetermined rules, but past rules are often not valid for new equipment or for a new operating environment.

On the other hand, if the selection of training data is left to human work, then when the time-series data spans a long period the number of possible selection patterns becomes enormous and the workload becomes heavy. In addition, the selection criteria become arbitrary and lack general applicability. Therefore, an object of the present invention is to improve the detection accuracy of a model by supporting the selection of training data when machine learning is performed on data measured from a target to create a model for detecting signs of abnormality in the target.

To achieve the above object, the present invention provides a data processing system including a controller that receives time-series data from a measurement sensor of a target and, by executing a program, performs data processing based on the time-series data to detect signs of abnormality in the target. The controller collects training data for machine learning from the received time-series data, the training data including a plurality of pieces of measurement data. The controller performs machine learning based on the training data to build a model for detecting signs of abnormality in the target, and applies the model to validation data to calculate the anomaly degree of the validation data. Among the measurement data included in the training data, the controller identifies measurement data that affects the anomaly degree as data that should be noted, and presents that data. The present invention also provides a data processing method having these features.

According to the present invention, when machine learning is performed on data measured from a target to create a model for detecting signs of abnormality in the target, the detection accuracy of the model can be improved by supporting the selection of training data.

FIG. 1 is a block diagram of an overall system that includes an embodiment of the data processing system and detects signs of abnormality in a target.
FIG. 2 is an example of a recording table of sensor data.
FIG. 3 is an example of a failure history table.
FIG. 4 is a waveform diagram showing an example of time-series data output from a measurement sensor of a device.
FIG. 5 is a flowchart showing an example of the data processing program.
FIG. 6A is a waveform diagram for explaining division of the time-series data output from a measurement sensor of a device at predetermined time intervals.
FIG. 6B is a diagram for explaining division of the time-series data output from a measurement sensor of a device by clustering.
FIG. 6C is a diagram showing the relationship between the division of the time-series data by clustering and the waveform of the time-series data.
FIG. 7 is a block diagram for explaining the process of identifying impact data (data that should be noted) from the data for machine learning.
FIG. 8 is a graph of the change over time of the target anomaly degree and the reference anomaly degree.
FIG. 9A is another graph of the change over time of the target anomaly degree and the reference anomaly degree.
FIG. 9B is yet another graph of the change over time of the target anomaly degree and the reference anomaly degree.
FIG. 9C is still another graph of the change over time of the target anomaly degree and the reference anomaly degree.
FIG. 10 is a block diagram for explaining how the data refinement module determines the priority of a data block.
FIG. 11 is an example of a display list as one form of displaying impact data information.
FIG. 12 is an example of a screen of impact-data-related information output to a user device.

Next, an embodiment of the data processing system according to the present invention will be described. FIG. 1 is a hardware block diagram of an overall system 100 that includes the data processing system 101 and detects signs of abnormality in a target. The target may be an industrial machine that requires detection of signs of failure, for example a wind power generator or a gas turbine engine.

The overall system 100 includes a device group 110 made up of a plurality of devices 111, each of which is a target; a sensor group 120 made up of the sensors 121 (sensor 1 to sensor N) of the respective devices; a storage 130 that stores sensor data 131, which is the measurement data from the sensors, and failure history data 132 of the devices; and the data processing system 101.

The data processing system 101 includes a data refinement server (data refinement device) 150 and a failure prediction server (failure sign detection device) 160 that detects signs of failure. The device group 110, the sensor group 120, the storage 130, the data refinement server 150, and the failure prediction server 160 are connected to one another via a network 165.

When the storage 130 receives sensor data from the sensor group, it records the measurement data 203, 204 for each device ID 201 in the device group and for each sensor in the sensor group in the sensor data recording area 131, as shown in FIG. 2 (sensor data recording table). Reference numeral 202 denotes the measurement time. The sensors output time-series data measured from the device, such as time-series data of control quantities related to control of the device, time-series data of state quantities related to control of the device, and time-series data of the environment in which the device operates.

When the storage 130 receives failure-related data from a device, it records, for each failure ID 301, the device ID 302 of the failed device, the failure date 303, the failure symptom 304, and the countermeasure 305 taken against the failure in the table of the failure history recording area 132, as shown in FIG. 3 (failure history table). As shown in FIG. 4, the data processing system 101 takes, from the output data 400 of the sensors of the device 111, normal data that does not include abnormal data such as the output data at the time of device failure 406, and sets it as the training data 402. In other words, the training data 402 consists only of normal data or mainly of normal data. The data processing system 101 then performs machine learning based on the training data 402 to build a model for detecting signs of failure in the device.
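The two tables and the choice of training and validation data described above can be sketched as follows. This is only an illustrative sketch: the class names, field names, and the `cutoff` rule for separating training data from validation data are assumptions that do not appear in the specification, which leaves the exact selection rule open.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SensorRecord:
    """One row of the sensor data recording table (FIG. 2)."""
    device_id: str         # device ID 201
    measured_at: datetime  # measurement time 202
    values: list[float]    # per-sensor measurement data 203, 204, ...


@dataclass
class FailureRecord:
    """One row of the failure history table (FIG. 3)."""
    failure_id: str        # failure ID 301
    device_id: str         # device ID 302
    failed_at: datetime    # failure date 303
    symptom: str           # symptom 304
    countermeasure: str    # countermeasure 305


def split_training_validation(records, failure, cutoff):
    """Split one device's records into training and validation data.

    Assumption (hypothetical rule): records before `cutoff` are treated as
    mostly normal training data, and records from `cutoff` up to the failure
    time are treated as validation data.
    """
    own = [r for r in records if r.device_id == failure.device_id]
    training = [r for r in own if r.measured_at < cutoff]
    validation = [r for r in own
                  if cutoff <= r.measured_at <= failure.failed_at]
    return training, validation
```

In practice the cutoff would be chosen by the user or derived from the failure history; the sketch only shows how the two tables relate to the data sets 402 and 404.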

Because the distinction between normal data and other data is relative and depends on the state of the device and its operating environment, the data processing system 101 preferably performs unsupervised learning on the sensor data to create the learning model. The data processing system 101 applies the failure sign detection model to validation data 404, which includes sensor data from before the time of device failure 406, calculates the anomaly degree of the validation data 404, and obtains an evaluation index of the detection performance of the failure sign detection model.

The data refinement server 150 includes a data refinement module 151 for optimizing the training data collected from the sensor data, and a model construction module 152 that performs machine learning based on the optimized training data to build a failure sign detection model for the device. The data refinement server 150 implements the data refinement module 151 and the model construction module 152 by having its controller execute a program in memory. A module may also be referred to as a means, a part, or a unit. In the failure prediction server 160 as well, a controller executes a program in memory to implement a model application module 161 that applies the failure sign detection model to sensor data.

The data refinement module 151 provides administrators, users, and others of the data processing system 101 with support processing for optimizing the training data 402. Within the set of sensor data collected as the training data 402, the data refinement module 151 identifies, in relative terms, data whose influence on the accuracy of failure sign detection is at or above a predetermined level, and presents it to the user together with its degree of influence as impact data, that is, data that the user should note.

Having received this presentation, the user can decide whether to perform machine learning with the impact data kept in the training data or with the impact data removed from it. The model construction module 152 receives the user's instruction and performs machine learning in the chosen manner.

The operation of the data processing system 101 will be described with reference to the flowchart of FIG. 5. The flowchart is executed by a processor or controller that implements the data refinement module 151 and the other modules by executing a program. To make the relationship between the modules and the flowchart clear, the following description treats the modules as the acting subjects.

The data refinement module 151 reads the sensor data for model training from the table in the recording area for sensor data 131 in the storage 130 (FIG. 2). The types of sensor data used for model training and the range to be read may be based on settings made by the user or others.

As preprocessing of the time-series data read in as training data, the data refinement module 151 divides the training data into a plurality of data blocks (FIG. 5: S500). The training data can be divided either by time or by clustering. FIG. 6A is a waveform diagram of the sensor time-series data that constitutes the training data 402. Reference numeral 600 denotes a sliding window. By moving the window regularly along the time axis, the data refinement module 151 can extract the one or more pieces of sensor data enclosed by the window as one data block. This is an example of dividing the training data by time. The feature quantity of one data block may be, for example, a statistic such as the mean or the standard deviation of the data contained in the block.
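The time-based division can be illustrated with a short sketch. The function names and the `window` and `step` parameters are assumptions introduced here for illustration; the specification only states that the window is moved regularly along the time axis and that a block may be summarized by statistics such as its mean and standard deviation.

```python
from statistics import mean, pstdev


def split_into_blocks(series, window, step=None):
    """Slide a fixed-size window along the series and cut out data blocks.

    With step == window (the default) the blocks do not overlap.
    Trailing samples that do not fill a whole window are dropped.
    """
    step = step or window
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, step)]


def block_features(block):
    """Summarize one data block by its mean and population standard deviation."""
    return (mean(block), pstdev(block))


# Usage: seven samples, window of three samples, non-overlapping.
blocks = split_into_blocks([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], window=3)
features = [block_features(b) for b in blocks]
```

Whether the blocks should overlap is a design choice the specification does not fix; non-overlapping blocks keep each measurement in exactly one block, which makes removing a block from the training data unambiguous.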

Clustering, by contrast, classifies the training data into a plurality of clusters according to its characteristics. FIG. 6B shows the training data classified into four clusters. Of these, the cluster denoted by reference numeral 610 contains three data blocks. As shown in FIG. 6C, the three data blocks 610A, 610B, and 610C are located at separate positions in the training data 402, which is time-series data.
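A minimal sketch of clustering data blocks by their feature quantities follows. The specification does not name a clustering algorithm; k-means is used here purely as an assumed stand-in, and the function name, the evenly spaced initialization, and the iteration count are all hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny k-means over feature tuples; returns one cluster label per point.

    Assumed stand-in for the unspecified clustering method.
    """
    # Deterministic initialization: k points evenly spaced through the input.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels


# Usage: (mean, std) features of five consecutive blocks.
features = [(0.0, 1.0), (10.0, 1.0), (0.3, 1.0), (10.5, 1.0), (0.1, 1.0)]
labels = kmeans(features, k=2)
```

Blocks 0, 2, and 4 end up in one cluster even though they are not adjacent in time, which mirrors how 610A, 610B, and 610C are scattered along the time axis in FIG. 6C.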

FIG. 7 is a block diagram for explaining the data refinement process performed by the data refinement module 151, and it is described here in correspondence with the flowchart of FIG. 5. As already described, the data refinement module 151 divides the training data 402 by time into data blocks 7000 of equal size (FIG. 5: S500).

Next, from all the data blocks 402BL of the training data, the data refinement module 151 narrows down the data blocks to be searched for impact data (FIGS. 5 and 7: S502). In FIG. 7, the data blocks denoted by 7001 to 7004 are the search targets. Checking every data block of the training data for impact data would impose a heavy computational load, so only some of the data blocks are made search targets. How the search-target data blocks are determined is described later. This does not preclude making all data blocks search targets.

Next, the data refinement module 151 selects one data block 7000 from among the search-target data blocks (FIGS. 5 and 7: S504). The model construction module 152 removes the selected data block from the training data (FIG. 7: S5041) and performs unsupervised machine learning based on the remaining data blocks (404BL) (FIGS. 5 and 7: S506) to create a failure sign detection model, referred to as the target model (FIG. 7: 7010).

In addition, the model construction module 152 performs unsupervised machine learning based on all the data blocks (402BL) of the training data (FIG. 7: S507) to create a failure sign detection model, referred to as the reference model (FIG. 5: S508; FIG. 7: 7012).

Next, the controller of the failure prediction server starts the model application module 161. The model application module 161 reads the target model 7010 from the model construction module 152 and reads the validation data 404 described above from the storage 130, that is, time-series sensor data from a predetermined period before the device failure. It applies the target model 7010 to the validation data to calculate the anomaly degree of the validation data (target anomaly degree) 7014 (FIGS. 5 and 7: S510) and stores the result in memory. In the same way, the model application module 161 applies the reference model 7012 to the validation data 404 to calculate the anomaly degree of the validation data (reference anomaly degree) 7016 (FIGS. 5 and 7: S512).
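The construction of the target and reference models and the calculation of the two anomaly degrees can be sketched as follows. The specification does not state which unsupervised model is used, so the model here is an assumed stand-in: it stores only the mean and standard deviation of the training values and scores each validation sample by its distance from the mean in units of standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev


def fit_model(blocks):
    """Fit a stand-in unsupervised model: mean and std of all training values."""
    values = [v for block in blocks for v in block]
    return mean(values), pstdev(values) or 1.0  # guard against zero spread


def anomaly_scores(model, series):
    """Anomaly degree per sample: distance from the mean in std units."""
    mu, sigma = model
    return [abs(v - mu) / sigma for v in series]


def target_and_reference(blocks, selected, validation):
    """Return (target anomaly degree, reference anomaly degree)."""
    remaining = [b for i, b in enumerate(blocks) if i != selected]
    target_model = fit_model(remaining)   # trained without the selected block
    reference_model = fit_model(blocks)   # trained on all data blocks
    return (anomaly_scores(target_model, validation),
            anomaly_scores(reference_model, validation))


# Usage: the third block looks different from the first two.
blocks = [[1.0, 1.2, 0.8], [1.1, 0.9, 1.0], [5.0, 5.2, 4.8]]
target, reference = target_and_reference(blocks, selected=2,
                                         validation=[1.0, 3.0, 5.0])
```

With the unusual block removed, the target model considers values near 5.0 far more anomalous than the reference model does, which is the kind of difference the subsequent evaluation step measures.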

Next, the model application module 161 compares the target anomaly degree 7014 with the reference anomaly degree 7016 and evaluates the difference between them (FIGS. 5 and 7: S514). This evaluation can be performed as follows. Reference numeral 800 denotes a period in which the former rises above the latter (increase period), and 802 denotes a period in which the former falls below the latter (decrease period). The model application module 161 takes as the evaluation index for the increase period the integral of the anomaly score increase 800A over the increase period 800, takes as the evaluation index for the decrease period the integral of the anomaly score decrease 802A over the decrease period 802, and uses the difference between the two as the evaluation result. Alternatively, the evaluation indices may be the number of data points whose anomaly value increased and the number of data points whose anomaly value decreased, with the difference between the two used as the evaluation result.
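Both evaluation variants can be written in a few lines. This is a sketch under assumptions: the anomaly degrees are taken to be equally spaced samples, the integral is approximated by a simple rectangle rule with sampling interval `dt`, and the function names are hypothetical.

```python
def impact_score(target, reference, dt=1.0):
    """Integral of the increases minus integral of the decreases.

    `target` and `reference` are anomaly scores sampled at the same times;
    the integral is approximated by a rectangle rule with spacing `dt`.
    """
    increase = sum(max(t - r, 0.0) for t, r in zip(target, reference)) * dt
    decrease = sum(max(r - t, 0.0) for t, r in zip(target, reference)) * dt
    return increase - decrease


def impact_count(target, reference):
    """Variant: number of samples that increased minus number that decreased."""
    up = sum(1 for t, r in zip(target, reference) if t > r)
    down = sum(1 for t, r in zip(target, reference) if t < r)
    return up - down


# Usage
score = impact_score([1.0, 3.0, 5.0, 2.0], [1.0, 2.0, 3.0, 4.0])
count = impact_count([1.0, 3.0, 5.0, 2.0], [1.0, 2.0, 3.0, 4.0])
```

A positive result means that removing the selected block raised the anomaly degree of the validation data more than it lowered it.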

When the model application module 161 has finished evaluating, in step S514, the degree of influence of the selected data block on the anomaly degree, it repeats the process until all search-target data blocks have been evaluated (S516). It then determines the data block with the greatest degree of influence among data blocks 7001 to 7004 as the impact data, and judges the effect of the impact data on the detection accuracy of the failure sign detection model (S518).

FIG. 9A is a graph comparing the change over time of the anomaly score of the target anomaly degree 7014 with that of the reference anomaly degree 7016. The anomaly score of the target anomaly degree 7014 begins to rise at an earlier stage before the device failure date than that of the reference anomaly degree 7016, and it is also larger on the failure date. The model application module 161 therefore determines that, to improve the accuracy of the failure sign detection model, it is preferable to build the model with the impact data removed from the training data.

In FIG. 9B, by contrast, the anomaly score of the target anomaly degree 7014 rises later before the device failure date than that of the reference anomaly degree 7016, and it is also smaller on the failure date. The model application module 161 therefore determines that, to maintain the accuracy of the failure sign detection model, it is preferable not to remove the impact data from the training data.

In FIG. 9C, there is almost no difference in how the anomaly scores of the target anomaly degree 7014 and the reference anomaly degree 7016 change, including on the device failure date. The model application module 161 therefore determines that the impact data has no effect on the accuracy of the failure sign detection model.
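The three judgments illustrated by FIGS. 9A to 9C could be automated along the following lines. This is a hedged sketch, not the method of the specification: it assumes both score series end on the failure date, and the alarm `threshold`, the relative `tolerance`, and the function names are hypothetical parameters introduced only for illustration.

```python
def first_crossing(scores, threshold):
    """Index of the first sample at or above the threshold (len if none)."""
    for i, s in enumerate(scores):
        if s >= threshold:
            return i
    return len(scores)


def judge_impact(target, reference, threshold, tolerance=0.05):
    """Return 'exclude', 'keep', 'no effect', or 'undetermined'.

    Assumes the last sample of each series is the failure date.
    """
    t_cross = first_crossing(target, threshold)
    r_cross = first_crossing(reference, threshold)
    t_last, r_last = target[-1], reference[-1]
    same_final = abs(t_last - r_last) <= tolerance * max(abs(r_last), 1e-12)
    if t_cross == r_cross and same_final:
        return "no effect"   # FIG. 9C: curves nearly identical
    if t_cross <= r_cross and t_last >= r_last:
        return "exclude"     # FIG. 9A: earlier rise, larger final score
    if t_cross >= r_cross and t_last <= r_last:
        return "keep"        # FIG. 9B: later rise, smaller final score
    return "undetermined"    # mixed evidence; left to the user


# Usage
reference = [0.1, 0.2, 0.5, 1.0, 2.0]
verdict = judge_impact([0.1, 0.6, 1.2, 2.0, 3.0], reference, threshold=1.0)
```

Cases where the timing and the final score point in opposite directions are returned as undetermined, since the specification describes only the three clear-cut situations.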

After step S518, the data processing system 101 ends the flowchart of FIG. 5. A plurality of data blocks may be determined as impact data, in descending order of degree of influence. The data processing system may start the flowchart of FIG. 5 when the failure sign detection model is built or when it is maintained.

In the description of FIG. 7, it was explained that the data refinement module 151 narrows down the search targets for impact data. To do so, the data refinement module 151 uses a priority assigned to each data block 7000 belonging to the training data 402. The data refinement module 151 treats data blocks whose priority is at or above a predetermined level (FIG. 7: 7001 to 7004) as the preferred targets when searching for impact data.

The data refinement module 151 calculates the priority of a data block from the anomaly degree of that data block within the training data (self-anomaly degree). The higher the self-anomaly degree of a data block, the more likely it is to be impact data.

As an example, the data refinement module 151 calculates the self-anomaly degree of the data blocks by the method illustrated in FIG. 10. The data refinement module 151 divides all the data blocks 402BL included in the training data into two groups BL800 and BL802, performs machine learning 810 based on the data blocks BL800 of one group to create a learning model, and calculates 812 the anomaly degree of the data blocks BL802 of the other group.

There are multiple ways to divide all the data blocks included in the training data into two groups, and the data refinement module 151 calculates the self-anomaly degrees of the data blocks for every such pattern. As a result, a self-anomaly degree is calculated for every data block.

The data refinement module 151 takes as the priority of a data block, for example, the sum or the average of the anomaly degrees obtained for that data block across all patterns. The model application module 161 makes a predetermined number of data blocks, in descending order of priority, the search targets for impact data. Compared with making all data blocks search targets, this reduces the load of determining the impact data.
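The priority calculation can be sketched as below. As in the specification, every way of splitting the blocks into a training group and a scored group is enumerated; the underlying model is again an assumed mean-and-standard-deviation stand-in, and the function names are hypothetical. The sketch requires at least two data blocks, and the number of splits grows as 2^n - 2 for n blocks, so it is only practical for a small number of blocks.

```python
from itertools import combinations
from statistics import mean, pstdev


def block_anomaly(train_blocks, block):
    """Mean anomaly score of `block` under a model fitted on `train_blocks`."""
    values = [v for b in train_blocks for v in b]
    mu, sigma = mean(values), pstdev(values) or 1.0
    return mean(abs(v - mu) / sigma for v in block)


def block_priorities(blocks):
    """Average self-anomaly degree of every block over all two-group splits."""
    n = len(blocks)  # must be at least 2
    scores = {i: [] for i in range(n)}
    for size in range(1, n):  # size of the training group
        for train_ids in combinations(range(n), size):
            train = [blocks[i] for i in train_ids]
            for i in set(range(n)) - set(train_ids):
                scores[i].append(block_anomaly(train, blocks[i]))
    return {i: mean(s) for i, s in scores.items()}


def search_targets(blocks, count):
    """Indices of the `count` highest-priority blocks."""
    priorities = block_priorities(blocks)
    return sorted(priorities, key=priorities.get, reverse=True)[:count]


# Usage: the last block differs from the rest and gets the top priority.
blocks = [[1.0, 1.1, 0.9], [1.0, 0.9, 1.1], [1.1, 1.0, 0.9], [5.0, 5.1, 4.9]]
top = search_targets(blocks, count=1)
```

Averaging rather than summing keeps priorities comparable between blocks; here every block is scored the same number of times, so either choice yields the same ranking.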

Once the model application module 161 has determined the impact data, it displays it to the user of the data processing system 101. FIG. 11 is an example of a display list, which is one form of displaying impact data information. This display example shows that there are four data blocks of impact data, distinguished by identifiers 1 to 4. For each piece of impact data it also shows the accuracy, that is, an indicator of whether the accuracy of failure sign detection improves or worsens when the impact data is removed from the training data, and the occurrence rate of the impact data in the training data, that is, the ratio of the size of the impact data to the size of the training data.
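The rows of such a display list might be assembled as in the following sketch. The field names, the use of the sign of an impact score as the accuracy indicator, and the function name are assumptions made for illustration; the specification describes only which items the list contains.

```python
def display_rows(impact_blocks, training_size):
    """Build display-list rows from (block, impact_score) pairs.

    Assumption: a positive impact score means that removing the block is
    expected to improve failure sign detection, a negative one to worsen it.
    """
    rows = []
    for ident, (block, score) in enumerate(impact_blocks, start=1):
        if score > 0:
            effect = "improves"
        elif score < 0:
            effect = "worsens"
        else:
            effect = "no change"
        rows.append({
            "id": ident,
            "effect_if_removed": effect,
            # occurrence rate: size of the impact data / size of training data
            "occurrence_rate": len(block) / training_size,
        })
    return rows


# Usage
rows = display_rows([([1.0, 2.0, 3.0], 2.5), ([4.0, 5.0], -1.0)],
                    training_size=50)
```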

FIG. 12 is another example of a screen 1000 of impact-data-related information output to a user device, and it includes the distribution of impact data for each of sensors 1 and 2. The horizontal axis is the scale of the sensor output value, and the vertical axis is the count of each output value. Reference numeral 1202 denotes the sensor output, and 1200 denotes the impact data within the sensor output. For both sensor 1 and sensor 2, the impact data appears where the sensor output value is large. The screen 1000 also includes the distribution over time of the appearance of impact data. Based on the displays of FIGS. 11 and 12, the user can learn the characteristics and attributes of the impact data in detail.

By knowing the impact data and the various kinds of analysis information about it described above, the administrative user can easily decide between two courses of action. If the impact data stems from, for example, a malfunction of the device or a change in its operating environment, it may be better to build the failure sign detection model with the impact data removed from the training data in order to improve the accuracy of failure sign detection. If, on the other hand, the timing of its appearance and similar evidence suggest that the impact data accurately represents the state of the device, it may be better to build the failure sign detection model without removing it from the training data.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments have been described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may also be added, deleted, or substituted.

The embodiments above describe detecting signs of failure in industrial machinery, but the present invention is not limited to industrial machinery and may be applied to any target in which signs of some abnormality can be detected, for example to detecting signs of abnormality in the vital signs of the human body. In addition, although the embodiments describe impact data as consisting of a single data block, a combination of multiple data blocks may also be treated as impact data.
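As an illustration of how impact data might be identified, the following Python sketch compares the degree of anomaly obtained with and without each block of learning data and picks the block with the largest difference, in the manner recited in the claims. The mean and standard deviation model, the z-score used as the degree of anomaly, the function names (`fit`, `anomaly_degree`, `find_impact_block`), and the sample values are all assumptions made for illustration; the embodiments do not prescribe them.

```python
from statistics import fmean, pstdev

def fit(samples):
    """Stand-in unsupervised model (assumption): mean and standard deviation."""
    return fmean(samples), pstdev(samples) or 1e-9  # avoid division by zero

def anomaly_degree(model, validation):
    """Mean absolute z-score of the validation samples under the model."""
    mean, std = model
    return fmean([abs(v - mean) / std for v in validation])

def find_impact_block(blocks, validation):
    """Return (index, difference) of the block whose exclusion changes
    the degree of anomaly of the validation data the most."""
    if len(blocks) < 2:
        raise ValueError("at least two blocks are needed to exclude one")
    all_samples = [v for block in blocks for v in block]
    # Second degree of anomaly: model trained without excluding anything.
    second = anomaly_degree(fit(all_samples), validation)
    best_index, best_diff = None, -1.0
    for i in range(len(blocks)):
        remaining = [v for j, b in enumerate(blocks) if j != i for v in b]
        # First degree of anomaly: model trained with block i excluded.
        first = anomaly_degree(fit(remaining), validation)
        diff = abs(first - second)
        if diff > best_diff:
            best_index, best_diff = i, diff
    return best_index, best_diff

# Hypothetical learning data in three blocks; the last block resembles the
# failure-time validation data, so excluding it changes the result most.
blocks = [[1.0, 1.1, 0.9, 1.0], [1.0, 0.9, 1.1, 1.0], [3.0, 3.2, 2.9, 3.1]]
validation = [3.0, 3.3, 3.1]  # time-series values at the time of a failure

index, diff = find_impact_block(blocks, validation)
print(index)  # prints 2
```

A combination of multiple blocks could be evaluated in the same way by excluding several blocks at once and comparing the resulting degree of anomaly against the same baseline.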

101: Data processing system, 130: Storage, 150: Data refinement server, 151: Data refinement module, 152: Model construction module, 160: Failure prediction server

Claims

1. A data processing system that receives time-series data from a measurement sensor of a target, the data processing system comprising:
a controller that executes a program to perform data processing for detecting a sign of abnormality in the target based on the time-series data,
wherein the controller:
collects learning data for machine learning from the received time-series data, the learning data including a plurality of measurement data;
performs machine learning based on the learning data to construct a model for detecting a sign of abnormality in the target;
applies the model to validation data to calculate a degree of anomaly of the validation data;
identifies, from among the measurement data included in the learning data, measurement data that affects the degree of anomaly as data to be noted; and
presents the data to be noted,
wherein the controller identifies the measurement data as the data to be noted by:
selecting at least one item of measurement data from the learning data;
setting, as a first degree of anomaly, the degree of anomaly resulting from performing the machine learning after excluding the selected measurement data from the learning data;
setting, as a second degree of anomaly, the degree of anomaly resulting from performing the machine learning without excluding the selected measurement data; and
comparing the first degree of anomaly with the second degree of anomaly and determining the data to be noted based on the comparison result,
and wherein, further:
the validation data is time-series data at the time of a failure, the failure being an abnormality of the target; and
the controller collects the learning data from time-series data of a period that does not include a failure of the target.

2. The data processing system of claim 1, wherein the controller
determines, as the data to be noted, the measurement data that maximizes the difference between the first degree of anomaly and the second degree of anomaly.

3. The data processing system of claim 2, wherein the controller:
preferentially selects a plurality of measurement data items from the learning data; and
determines the data to be noted from the selected plurality of measurement data items.

4. The data processing system of claim 1, wherein the controller
performs the machine learning by unsupervised learning.

5. The data processing system of claim 1, wherein the controller
detects a sign of failure of industrial machinery as the target, based on time-series data from the industrial machinery.

6. The data processing system of claim 1, wherein the controller:
divides the learning data into a plurality of blocks; and
identifies the data to be noted from among the plurality of blocks.