JP7440938B2 - Event prediction system, event prediction method and program

Info

Publication number: JP7440938B2
Application number: JP2021573071A
Authority: JP
Inventors: 崇人本田; 靖子櫻井; 光希川畑; 保志櫻井
Original assignee: Osaka University NUC
Current assignee: University of Osaka NUC
Priority date: 2020-01-22
Filing date: 2021-01-12
Publication date: 2024-02-29
Anticipated expiration: 2041-01-12
Also published as: WO2021149528A1; US20230058585A1; JPWO2021149528A1

Description

The present invention relates to event prediction technology based on time-series sensor data.

In recent years, the manufacturing industry has been pushing to make its plants smarter. By constantly monitoring the operating status of production lines with large numbers of sensors, and by accumulating and analyzing that status as time-series data, efforts are under way to improve productivity from every aspect, including equipment anomaly detection (Non-Patent Documents 25, 32) and quality control (Non-Patent Document 14). An important issue common to these efforts is the effective acquisition of knowledge from the large-scale data collected and the development of future prediction technology based on that knowledge. In particular, time-series data obtained from manufacturing plants are complex data spanning multiple domains (facility, sensor, time, and so on) and often contain multifaceted patterns. On a production line, the data contain not only temporal transitions between multiple work processes (patterns) but also patterns that are shared or that differ among work lines as a result of parallel operation on multiple lines. To capture the causes of defective products and equipment failures effectively, it is necessary to express such multifaceted, dynamic patterns flexibly and, at the same time, to reveal the causal relationships hidden among them.

In addition, for the tasks envisioned in smart factories, knowing in advance that an event such as a breakdown, a malfunction, or a loss of machining accuracy will occur widens the range of possible countermeasures. In other words, it is desirable that future prediction technology for large-scale sensor data be capable of longer-term prediction (Non-Patent Document 15).

Research on the analysis of sensor data is progressing in various fields such as databases and data mining (Non-Patent Documents 2, 17, 19, 22, 24, 25). Autoregressive (AR) models and linear dynamical systems (LDS) are representative techniques, and many sensor data analysis and prediction methods are based on them (Non-Patent Document 13).

Regime-Cast (Non-Patent Document 15) can estimate nonlinear dynamical systems in real time from multidimensional sensor data that are generated continuously in large volumes, and can keep predicting the future adaptively. However, although this method takes sensor streams as input and performs well in predicting the measured values of sensor data, it does not support the prediction of event data such as normal/abnormal states.

Pattern discovery and clustering for big time-series data are also important issues (Non-Patent Documents 8, 10, 11, 16, 28, 29, 31). Matsubara et al. (Non-Patent Document 18) proposed TriMine as a method for analyzing large-scale event tensors. TriMine classifies the given data into multiple topics and detects latent trends and patterns. However, it targets discrete event data such as click logs on the Web and cannot express the dynamic patterns of time-series sequences such as IoT sensor data, or groups of such patterns (regimes), so the problem it addresses is different. In addition, TriMine has no ability to predict events.

Research on the analysis of nonlinear dynamics based on deep neural networks is also active (Non-Patent Documents 3, 9, 26, 27). In Non-Patent Document 21, Qin et al. proposed a method that predicts stock prices with high accuracy by modeling, in two stages, the important dimensions of the input time series and the important dimensions of the feature space after dimensionality reduction. On the other hand, for the task addressed here, namely predicting events that occur discontinuously, the mainstream approach is to model the intensity of event occurrence (Non-Patent Documents 5, 6, 20, 30). For example, RMTPP (Non-Patent Document 5) is a nonlinear model for predicting the time and type of the next event from the past event history. However, these methods target categorical data consisting only of event histories, and cannot predict events from continuous data consisting of measured sensor values.

Non-Patent Document 1: C. M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, 2006.
Non-Patent Document 2: G. E. Box, G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. Prentice Hall, Englewood Cliffs, NJ, 3rd edition, 1994.
Non-Patent Document 3: P. Chen, S. Liu, C. Shi, B. Hooi, B. Wang, and X. Cheng. Neucast: Seasonal neural forecast of power grid time series. In IJCAI, pages 3315-3321, 2018.
Non-Patent Document 4: K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv e-prints, arXiv:1409.1259, Sep. 2014.
Non-Patent Document 5: N. Du, H. Dai, R. Trivedi, U. Upadhyay, M. Gomez-Rodriguez, and L. Song. Recurrent marked temporal point processes: Embedding event history to vector. In KDD, pages 1555-1564, 2016.
Non-Patent Document 6: N. Du, Y. Wang, N. He, and L. Song. Time-sensitive recommendation from recurrent user activities. In NIPS, pages 3492-3500, 2015.
Non-Patent Document 7: G. D. Forney, Jr. The Viterbi algorithm. Proceedings of the IEEE, pages 268-278, 1973.
Non-Patent Document 8: D. Hallac, S. Vare, S. Boyd, and J. Leskovec. Toeplitz inverse covariance-based clustering of multivariate time series data. In KDD, pages 215-223, 2017.
Non-Patent Document 9: S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Comput., 9(8):1735-1780, Nov. 1997.
Non-Patent Document 10: T. Honda, Y. Matsubara, R. Neyama, M. Abe, and Y. Sakurai. Multi-aspect mining of complex sensor sequences. In ICDM, 2019.
Non-Patent Document 11: K. Kawabata, Y. Matsubara, and Y. Sakurai. Automatic sequential pattern mining in data streams. In CIKM, pages 1733-1742, 2019.
Non-Patent Document 12: D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2015.
Non-Patent Document 13: L. Li, J. McCann, N. Pollard, and C. Faloutsos. DynaMMo: Mining and summarization of coevolving sequences with missing values. In KDD, 2009.
Non-Patent Document 14: Y. Li, J. Wang, J. Ye, and C. K. Reddy. A multi-task learning formulation for survival analysis. In KDD, pages 1715-1724, 2016.
Non-Patent Document 15: Y. Matsubara and Y. Sakurai. Regime shifts in streams: Real-time forecasting of co-evolving time sequences. In KDD, 2016.
Non-Patent Document 16: Y. Matsubara, Y. Sakurai, and C. Faloutsos. AutoPlait: Automatic mining of co-evolving time sequences. In SIGMOD, pages 193-204, 2014.
Non-Patent Document 17: Y. Matsubara, Y. Sakurai, and C. Faloutsos. The web as a jungle: Non-linear dynamical systems for co-evolving online activities. In WWW, pages 721-731, 2015.
Non-Patent Document 18: Y. Matsubara, Y. Sakurai, C. Faloutsos, T. Iwata, and M. Yoshikawa. Fast mining and forecasting of complex time-stamped events. In KDD, pages 271-279, 2012.
Non-Patent Document 19: Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos. Rise and fall patterns of information diffusion: Model and implications. In KDD, pages 6-14, 2012.
Non-Patent Document 20: H. Mei and J. Eisner. The neural Hawkes process: A neurally self-modulating multivariate point process. In NIPS, pages 6757-6767, 2017.
Non-Patent Document 21: Y. Qin, D. Song, H. Chen, W. Cheng, G. Jiang, and G. W. Cottrell. A dual-stage attention-based recurrent neural network for time series prediction. In IJCAI, pages 2627-2633, 2017.
Non-Patent Document 22: T. Rakthanmanon, B. J. L. Campana, A. Mueen, G. E. A. P. A. Batista, M. B. Westover, Q. Zhu, J. Zakaria, and E. J. Keogh. Searching and mining trillions of time series subsequences under dynamic time warping. In KDD, pages 262-270, 2012.
Non-Patent Document 23: J. Rissanen. A universal prior for integers and estimation by minimum description length. Ann. of Statist., 11(2):416-431, 1983.
Non-Patent Document 24: Y. Sakurai, Y. Matsubara, and C. Faloutsos. Mining and forecasting of big time-series data. In SIGMOD, pages 919-922, 2015.
Non-Patent Document 25: Y. Sakurai, S. Papadimitriou, and C. Faloutsos. BRAID: Stream mining through group lag correlations. In SIGMOD, pages 599-610, 2005.
Non-Patent Document 26: I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104-3112, 2014.
Non-Patent Document 27: T. Lin, B. G. Horne, P. Tino, and C. L. Giles. Learning long-term dependencies in NARX recurrent neural networks. IEEE Transactions on Neural Networks, 7(6):1329-1338, 1996.
Non-Patent Document 28: P. Wang, H. Wang, and W. Wang. Finding semantics in time series. In SIGMOD Conference, pages 385-396, 2011.
Non-Patent Document 29: S. Wang, K. Kam, C. Xiao, S. R. Bowen, and W. A. Chaovalitwongse. An efficient time series subsequence pattern mining and prediction framework with an application to respiratory motion prediction. In AAAI, pages 2159-2165, 2016.
Non-Patent Document 30: S. Xiao, J. Yan, X. Yang, H. Zha, and S. Chu. Modeling the intensity function of point process via recurrent neural networks, 2017.
Non-Patent Document 31: R. Zhao and Q. Ji. An adversarial hierarchical hidden Markov model for human pose modeling and generation. In AAAI, 2018.
Non-Patent Document 32: Y. Zhou, H. Zou, R. Arghandeh, W. Gu, and C. J. Spanos. Non-parametric outliers detection in multiple time series A case study: Power grid data analysis. In AAAI, 2018.

As explained above, no event prediction method or system has conventionally been proposed that targets time-series tensor data, requires no prior knowledge of the time-series patterns, and performs event prediction using the characteristic patterns of the time-series data.

The present invention has been made in view of the above, and provides an event prediction system, an event prediction method, and a program that target time-series tensor data and enable long-term, highly accurate event prediction through summarization of the data.

The event prediction system according to the present invention comprises: first feature extraction means that continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data continuously collected from multiple types of sensors respectively placed on multiple observation targets; second feature extraction means that uses the model parameters to sequentially convert the time-series sensor data into features in the form of summary information comprising modeling information and its error information; and prediction means that takes the summary information as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

In the event prediction method according to the present invention, a first feature extraction unit of a computer continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data that are continuously collected from multiple types of sensors respectively placed on multiple observation targets and stored in a storage unit, and stores the model parameters in the storage unit; a second feature extraction unit of the computer reads the model parameters and the time-series sensor data from the storage unit, sequentially converts the time-series sensor data into features in the form of summary information comprising modeling information and its error information, and stores the summary information in the storage unit; and a prediction unit of the computer reads the summary information from the storage unit as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

The program according to the present invention causes a computer to function as: first feature extraction means that continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data continuously collected from multiple types of sensors respectively placed on multiple observation targets; second feature extraction means that uses the model parameters to sequentially convert the time-series sensor data into features in the form of summary information comprising modeling information and its error information; and prediction means that takes the summary information as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

According to these inventions, time-series sensor data are continuously collected from multiple types of sensors respectively placed on multiple observation targets, and the first feature extraction means continuously extracts model parameters of multifaceted dynamic patterns from the collected data. Next, the second feature extraction means uses the model parameters to sequentially convert the time-series sensor data into features in the form of summary information comprising modeling information and its error information. The prediction means then takes the summary information as input and outputs the probability that a predetermined event occurs a predetermined time ahead. Accordingly, no prior knowledge of the time-series patterns contained in the sensor data is required, and the change points and latent behavior of the patterns (regimes) are captured from multiple viewpoints, for example temporal transitions and relationships among observation targets. Furthermore, characteristic patterns in large-scale time-series sensor data can be discovered and used to predict events far into the future. The sensors may be installed directly on the observation target or installed so that the observation target can be observed remotely.
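The flow from model parameters to summary information to an event probability can be sketched in a few lines. This is a minimal illustration under strong assumptions and not the model of the invention: each regime is reduced to a single mean value, the summary information is a (regime index, absolute error) pair per time step, and the predictor is a plain logistic function. The names `summarize`, `predict_probability`, `regime_means`, `regime_weights`, `error_weight`, and `bias`, and all numeric values, are hypothetical.

```python
import math

def summarize(series, regime_means):
    """Stand-in for the second feature extraction stage (assumption).

    For each reading, record which regime model fits best (modeling
    information) and how far the reading is from it (error information).
    """
    summary = []
    for x in series:
        k = min(range(len(regime_means)), key=lambda r: abs(x - regime_means[r]))
        summary.append((k, abs(x - regime_means[k])))
    return summary

def predict_probability(summary, regime_weights, error_weight, bias):
    """Stand-in for the prediction stage (assumption): logistic score over the summary."""
    n = len(summary)
    score = bias
    for k, err in summary:
        score += (regime_weights[k] + error_weight * err) / n
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical regime parameters, assumed to come from the first stage.
regime_means = [1.0, 5.0]
summary = summarize([1.0, 1.2, 5.1], regime_means)
p = predict_probability(summary, regime_weights=[-1.0, 2.0], error_weight=0.5, bias=0.0)
```

The point of the sketch is only the data flow: the predictor never sees the raw readings, only the compact summary derived from them.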

According to the present invention, extracting and summarizing features of time-series sensor data from multiple viewpoints enables long-term, highly accurate event prediction with a simple configuration.

FIG. 1 is an overall block diagram showing an embodiment of the event prediction system according to the present invention.
FIG. 2 shows an example of how information taken from smart factory data, an application example of the present invention, is processed: (a) shows the original sensor data, (b) shows the pattern detection result for the original data, and (c) and (d) show typical regime examples, based on the original data, for a case in which no emergency stop occurs after a predetermined time (c) and a case in which one does (d).
FIG. 3 is a diagram showing an outline of the proposed model according to the present invention.
FIG. 4 is a transition diagram for explaining the basic concept of the proposed algorithm according to the present invention.
FIG. 5 is a comparison of accuracy when the number of steps ahead to predict, l_s, is varied.
FIG. 6 is a comparison of prediction accuracy when the window width of the mini-batches used in network training is varied.
FIG. 7 is a comparison showing the precision and recall of the prediction results.
FIG. 8 is a diagram showing how the prediction accuracy of the present prediction system changes with the number of detected segments m.
FIG. 9 is a diagram showing the relationship between the number of training samples and prediction accuracy.
FIG. 10 is a diagram showing the computational cost of the present prediction system when the number of facilities w, the number of sensors d, and the sequence length n are each varied.

The present invention relates to an event prediction technique, preferably for large-scale time-series sensor data. As one example, the present invention relates to technology that, from factory facility sensor data composed of (facility, sensor, time) triplets, analyzes and summarizes multifaceted time-series patterns from multiple viewpoints in an integrated manner and predicts events over the long term. More specifically, given time-series data consisting of measured sensor values such as the rotation speed, operating voltage, and temperature of each facility installed in a factory, the technique (a) extracts basic time-series patterns, patterns common to the facilities, and facility-specific patterns and summarizes them statistically, and thereby (b) predicts future events. Furthermore, (c) these processes scale linearly with the data size. As described later, experiments on real data confirmed that the present prediction technique captures the characteristic time-series patterns contained in factory facility sensor data from multiple viewpoints and performs long-term event prediction, and showed that it achieves substantially better accuracy and performance than the latest existing methods (comparative examples).

That is, the present prediction system captures, from multiple viewpoints, the number and the change points of the typical patterns (hereinafter, regimes) contained in time-series data, and predicts future events by accurately grasping the operating status of the system. More specifically, given large-scale time-series sensor data collected from multiple sensors at multiple facilities, it predicts an event a predetermined time later, that is, l_s steps ahead.

More specifically still, the system (a) detects multifaceted patterns and their change points in the sensor data and compiles them into summary information, thereby (b) contributing to long-term, highly accurate prediction. In addition, (c) it performs these processes at high speed.

The present invention is described below with reference to the drawings. FIG. 1 is an overall block diagram of the event prediction system according to the present invention (hereinafter, prediction system 1). The prediction system 1 comprises a configuration that collects, over wired or wireless communication channels, large-scale time-series sensor data related to operating status from the sensor groups 21 installed on observation targets 20, ..., which are multiple facilities of, for example, a factory, and a computer having a control unit 10 consisting of a processor (CPU) that extracts features from each set of acquired time-series data and executes event prediction for a predetermined time ahead. This embodiment uses machine learning, and the parameters applied to the prediction processing are updated through machine learning. The details of FIG. 1 are described later.

First, to aid understanding of the prediction processing, the concrete example shown in FIG. 2 is described. FIG. 2 shows sensor data from a smart factory, an example of the observation target 20 (FIG. 1), which serve as the input to the prediction processing. FIG. 2(a) shows the original sensor data, which consist of three sensor values (rotation speed: Speed, operating voltage: Load, facility temperature: Temp) collected, as an example of each sensor group 21 (FIG. 1), from five facilities (#1 to #5). In FIG. 2(a), the portions filled with black rectangles indicate that the corresponding facility is in an emergency stop. Note that the waveform of the operating voltage (Load) in FIG. 2(a) largely overlaps the waveform of the rotation speed (Speed). FIG. 2(b) shows the result of pattern extraction from the original data by the present prediction system. The vertical lines in FIG. 2(b) indicate the times at which the time-series pattern changed, and segments belonging to the same regime are drawn in the same shade. By analyzing the time-series data obtained from multiple facilities simultaneously, the prediction system 1 can detect multifaceted patterns, that is, not only the temporal transitions of patterns within each facility but also patterns that are common to or differ among facilities.

FIGS. 2(c) and 2(d) show typical examples, taken from the original data, of a case in which no emergency stop occurs l_s = 200 steps (approximately 17 minutes) later (c) and a case in which one does (d). The left side of each figure shows the segmentation result. On the right, θ₁ to θ₅ each denote a common time-series pattern (that is, a regime), and the diagram visualizes the transitions among them. The value p200 is the probability of an emergency stop 200 steps ahead that the present prediction system outputs when given the partial sequence shown on the left and its pattern detection result. In the right-hand diagrams, thicker arrows are drawn between regimes for which more transitions were detected, and the size of each circle indicates how long the regime lasted. In FIG. 2(d), the rotation speed (Speed) rises (θ₅) before the facility makes an emergency stop, and this tendency is expressed by the appearance of transitions between regimes θ₄ and θ₅. Indeed, the prediction system 1 correctly predicts the emergency stop, and p200 takes a high value. In other words, detecting the latent patterns contained in the data not only allows the process leading to an emergency stop to be analyzed from multiple viewpoints, but also, through the use of the resulting summary information, enables long-term, highly accurate prediction. In FIG. 2(c), by contrast, transitions such as θ₂, θ₃, θ₂, θ₁, θ₂ that show no sign of an emergency stop are observed, and p200 takes a low value.
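The quantities visualized on the right of FIGS. 2(c) and 2(d), namely how long each regime lasts (circle size) and how often one regime is followed by another (arrow thickness), can be derived from a per-step regime label sequence. The following is a minimal sketch under that assumption; the function name `summarize_regimes` and the label sequence are hypothetical and only mimic the regime order θ₂, θ₃, θ₂, θ₁, θ₂ mentioned for FIG. 2(c).

```python
from collections import Counter

def summarize_regimes(labels):
    """Return (steps spent in each regime, number of transitions between regimes)."""
    durations = Counter(labels)
    # Count only changes of regime; staying in the same regime is not a transition.
    transitions = Counter(
        (a, b) for a, b in zip(labels, labels[1:]) if a != b
    )
    return durations, transitions

# Hypothetical per-step labels following the regime order 2, 3, 2, 1, 2.
labels = [2, 2, 3, 3, 3, 2, 1, 2, 2]
durations, transitions = summarize_regimes(labels)
```

Here `durations` corresponds to the circle sizes and `transitions` to the arrow weights in such a diagram.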

As an example of the factory facility sensor data handled by the prediction system 1, three types of sensor data from 55 facilities that operated on October 1, 2017 at Mitsubishi Heavy Industries Engine & Turbocharger, Ltd. are used. The data are expressed as (facility, sensor, time) triplets and consist of w facilities, d types of sensors, and n time periods (for example, in units of 5 seconds). Such sensor data can be expressed as a third-order tensor X ∈ R^(w×d×n), where the element x_ij(t) of the tensor X is the value measured by the j-th sensor of the i-th facility at time t. In this embodiment, such sensor data are called a multidimensional time-series tensor.
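The tensor layout can be made concrete with a small example. The sketch below uses nested Python lists and deliberately tiny, hypothetical sizes (the data described here have 55 facilities and 3 sensor types); the helper names `shape` and `steps_to_minutes` are assumptions for illustration. It also checks the step-to-time conversion under the assumed 5-second sampling unit.

```python
# Hypothetical sizes for illustration only.
w, d, n = 2, 3, 4  # facilities, sensor types, time steps

# X[i][j][t] holds x_ij(t): the reading of sensor j on facility i at time step t.
X = [[[0.0 for _ in range(n)] for _ in range(d)] for _ in range(w)]
X[1][2][3] = 41.5  # e.g. a temperature reading of facility index 1 at step 3

def shape(tensor):
    """Return (w, d, n) for a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))

def steps_to_minutes(steps, seconds_per_step=5):
    """Convert a number of time steps to minutes, assuming a fixed sampling unit."""
    return steps * seconds_per_step / 60
```

With a 5-second unit, 200 steps correspond to 1000 seconds, or roughly 17 minutes, which matches the prediction horizon quoted for FIG. 2.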

The prediction system 1 predicts a facility alert l_s steps ahead from a given time-series tensor X. The processing required for this is described below.

That is, given a time-series tensor X(t_s:t_e), the alert label y(t_e+l_s) l_s steps ahead is predicted based on the following equation (1).

y(t_e+l_s) ≈ F(X(t_s:t_e))   (1)
Here, t_s:t_e denotes the window of the sequence used for prediction (a predetermined period extending from the current time into the past), and F is the proposed model.
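How equation (1) pairs an input window with a future label can be sketched as follows. This is a simplified illustration for a single sensor channel; the function name `make_training_pairs` and the toy data are assumptions, not part of the patent.

```python
from typing import List, Tuple

def make_training_pairs(
    series: List[float], labels: List[int], window: int, l_s: int
) -> List[Tuple[List[float], int]]:
    """Pair each window X(t_s:t_e) with the label y(t_e + l_s)."""
    pairs = []
    # t_e is the last index of the window; t_e + l_s must stay inside the data
    for t_e in range(window - 1, len(series) - l_s):
        t_s = t_e - window + 1
        pairs.append((series[t_s:t_e + 1], labels[t_e + l_s]))
    return pairs

series = [float(t) for t in range(10)]
labels = [0] * 9 + [1]  # a single alert at the last time step
pairs = make_training_pairs(series, labels, window=3, l_s=2)
```

Each pair holds the readings of one window and the alert label observed l_s steps after the window ends, which is the supervision signal a model F would be trained on.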

Here, in order to predict the alert label y(t_e+l_s) with high accuracy, a model based on a probabilistic model and deep learning is constructed, and the high-dimensional, nonlinear dynamic characteristics that can, for example, cause a failure (alert) are extracted from the given sensor data. Specifically, the prediction system 1 executes the following three processes (P1), (P2), and (P3).

(P1) Multifaceted detection of latent dynamic patterns
(P2) Feature extraction based on dynamic patterns
(P3) Long-term prediction l_s steps ahead
First, the relationship between the processes (P1), (P2), and (P3) and FIG. 1 is explained. In FIG. 1, the control unit 10 is connected to a storage unit 100, a display unit 121 that displays, for example, a window described later, and an operation unit 122 that accepts external instructions. The storage unit 100 includes a control program storage unit 101, a data stream storage unit 102 that stores the time-series sensor data input from each sensor group 21, and a parameter storage unit 103 that stores the parameters (such as the weight of each edge) of the neural network model constituting the artificial intelligence (AI) applied to the prediction processing. The control program storage unit 101 stores the program data for executing the event prediction processing described later and the data of the various arithmetic expressions it requires. In addition to the data stream storage unit 102, the storage unit 100 has a work area (storage unit) that temporarily holds the data obtained during execution of each of the processes described later: "(P1) Multifaceted detection of latent dynamic patterns", "(P2) Feature extraction based on dynamic patterns", and "(P3) Long-term prediction l_s steps ahead".

By executing the control program, the control unit 10 functions as a data acquisition processing unit 11, a feature extraction unit 12, a prediction unit 13, and a parameter update unit 14. The data acquisition processing unit 11 acquires the time-series sensor data from the sensor group 21 of each observation target 20 (each facility in the factory) via the network 110.

The feature extraction unit 12 executes the processes "(P1) Multifaceted detection of latent dynamic patterns" and "(P2) Feature extraction based on dynamic patterns" described later. The prediction unit 13 executes the process "(P3) Long-term prediction l_s steps ahead". In this embodiment, the prediction unit 13 performs the prediction processing by applying the parameters from the parameter storage unit 103. Details of each process are described later.

The machine learning device 30 includes a control unit 300, which consists of a computer with a built-in processor, and a storage unit 310, as well as a display unit 321 and an operation unit 322. The storage unit 310 includes a learning program storage unit 311, a data stream storage unit 312, and a parameter storage unit 313. The data stream storage unit 312 stores the time-series sensor data from each sensor group 21, acquired either through communication or via an external memory, or takes in and stores data that was first written to the data stream storage unit 102.

By executing the learning program from the learning program storage unit 311, the control unit 300 functions as a data acquisition processing unit 301, a feature extraction unit 302, and a machine learning unit 303. The data acquisition processing unit 301 is similar to the data acquisition processing unit 11 and, in addition, allows the acquisition period of the data (for example, the most recent week) to be set automatically or manually as appropriate. The feature extraction unit 302 is provided as needed; for example, it adjusts the conditions of the processes (P1) and (P2) as appropriate in response to changes in factory facilities or other changes in circumstances, and verifies the processing.

The machine learning unit 303 performs machine learning, for example supervised learning, preferably on the time-series sensor data of the most recent predetermined period. It stores the resulting parameters in the parameter storage unit 313 and, as needed, updates the parameter storage unit 103 either via the parameter update unit 14 or in response to an instruction from the operation unit 322 of the machine learning device 30. Machine learning is not limited to the form of a separate machine learning device 30; various other forms can be adopted. For example, the input data for a predetermined period may be taken from the data stream storage unit 102. Alternatively, learning may be executed using the prediction unit 13 during a period in which the system is stopped (for example, at night), and the resulting parameters may then be updated.

Next, the outline of the "proposed model" and the necessary definitions are presented, as shown in Table 1.

<Proposed model>
(P1) Detection of latent dynamic patterns
Given a multidimensional time-series tensor X, the prediction system first captures its characteristics by dividing X into a set of m segments S = {s₁, ..., s_m}. Each s_i consists of the start point t_s, the end point t_e, and the facility number of the i-th segment (i.e., s_i = {t_s, t_e, facilityID}), and the segments do not overlap. The discovered segments are then classified into groups of similar segments. In this prediction system, these groups are called "regimes".

・Definition 1 (regime)
Let r be the optimal number of segment groups. Each segment s is assigned to one of the segment groups. In addition, segment membership is newly defined to express the regime to which each segment belongs.

・Definition 2 (segment membership)
Given a multidimensional time-series tensor X, let F = {f₁, ..., f_m} be a sequence of m integers, where f_i is the number of the regime to which the i-th segment belongs (1 ≤ f_i ≤ r).

The multidimensional time-series tensor X can thus be represented by m segments and r regimes as {m, r, S, Θ, F}. Next, based on the obtained regime information, the prediction system builds a statistical model of X and extracts important features.

(P2) Feature extraction based on dynamic patterns
The regimes are represented by the statistical model Θ = {θ₁, ..., θ_r, Δ_r×r}. In this work, a hidden Markov model (HMM) is used to represent the behavior of the multidimensional time-series tensor X. An HMM is a probabilistic model that assumes a Markov process with hidden states, and it is widely used for time-series processing in various fields, including speech recognition. An HMM is represented by a triplet of initial probabilities Π = {π_i} (i = 1, ..., k), transition probabilities A = {a_ij} (i, j = 1, ..., k), and output probabilities B = {b_i(x)} (i = 1, ..., k); that is, θ = {Π, A, B}. Here, k is the number of latent states of the HMM. In this prediction system, the output probability B is assumed to be generated from a multidimensional Gaussian distribution, so that a sequence of multidimensional vectors is represented by a probabilistic model (i.e., B ~ {N(μ_i, σ²_i)} (i = 1, ..., k)). Given the HMM model parameters θ = {Π, A, B} and a sequence x as input data, the likelihood P(x|θ) of x is calculated as in (Equation 1) below.

Here, p_i(t) is the maximum probability of latent state i at time t, and n is the length of the sequence x. This likelihood is calculated using the Viterbi algorithm (Non-Patent Document 7), a type of dynamic programming, based on the transition diagram shown in FIG. 4. In addition, the regime transition matrix Δ_r×r is introduced as a new concept.
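The Viterbi computation described above can be sketched for the one-dimensional case as follows. This is a minimal illustration working in log space, not the patent's implementation; the function names and the example parameters are assumptions, and `A[j][i]` is taken here to mean the transition probability from state j to state i.

```python
import math
from typing import List

def log_gauss(x: float, mu: float, sigma: float) -> float:
    """Log density of the Gaussian N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi_log_likelihood(
    xs: List[float], pi: List[float], A: List[List[float]],
    mus: List[float], sigmas: List[float]
) -> float:
    """Return the log-probability of the best state path for xs.

    All entries of pi and A must be strictly positive because logs are taken.
    """
    k = len(pi)
    # p[i] holds log p_i(t): best log-probability of being in state i at time t
    p = [math.log(pi[i]) + log_gauss(xs[0], mus[i], sigmas[i]) for i in range(k)]
    for x in xs[1:]:
        p = [
            max(p[j] + math.log(A[j][i]) for j in range(k))
            + log_gauss(x, mus[i], sigmas[i])
            for i in range(k)
        ]
    return max(p)  # maximum over the final states

pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
mus, sigmas = [0.0, 5.0], [1.0, 1.0]
ll_fit = viterbi_log_likelihood([0.1, -0.2, 5.1, 4.9], pi, A, mus, sigmas)
ll_off = viterbi_log_likelihood([10.0, 10.0, 10.0, 10.0], pi, A, mus, sigmas)
```

A sequence that the two states explain well (`ll_fit`) scores higher than one far from both state means (`ll_off`), which is the property the cost-based model selection later relies on.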

・Definition 3 (regime transition matrix)
Δ_r×r is called the transition matrix of the r regimes. Its element δ_ij ∈ Δ is the transition probability from the i-th regime to the j-th regime, and therefore satisfies 0 ≤ δ_ij ≤ 1 and Σ_j δ_ij = 1. Using the above model, the multidimensional time-series tensor X is summarized by the latent state sequence Z of the HMM and the modeling error ε, both defined below, and converted into features, which enables highly accurate long-term prediction.

・Definition 4 (latent state tensor)
The latent state sequences of the HMM for each facility, Z = {Z₁, ..., Z_w}, are called the latent state tensor. Here, Z_i = {z_ij(1), ..., z_ij(n)} (j = 1, ..., d), and z_ij(t) consists of the pair {μ, σ} of the mean and variance of the data set x belonging to the same latent state.

・Definition 5 (error tensor)
The error ε = {E₁, ..., E_w} that arises when the multidimensional time-series tensor X is modeled by the latent state tensor Z is called the error tensor. Because this prediction system assumes that the output probability B of the HMM follows a multidimensional Gaussian distribution, the error e_ij(t) ∈ E_i at time t for the j-th sensor of the i-th facility is expressed as in (Equation 2) below.

That is, based on the regime information {m, r, S, Θ, F} obtained in (P1), the time-series tensor X is summarized by a latent state tensor Z and an error tensor ε such that X ≈ IGPDF(Z, ε), and important features are extracted. Here, IGPDF (Inverse Gaussian Probability Density Function) denotes the inverse of the probability density function of the Gaussian distribution.

(P3) Long-term prediction l_s steps ahead
In conclusion, equation (1) above can be rewritten as the following equation (2).

Here, F denotes the prediction model. That is, given a time-series tensor X, the proposed method extracts important features by summarizing X with the latent state tensor Z and the error tensor ε, applies the proposed model F to them, and performs long-term prediction l_s steps ahead with high accuracy.

<Algorithms for processes (P1), (P2), and (P3)>
The proposed model for summarizing the multidimensional time-series tensor X and predicting from it effectively was described above. This section explains the algorithms for solving equation (1). The question is how to determine the number of regimes and segments. Based on the concept of minimum description length (MDL), the prediction system introduces a coding scheme that serves as the criterion for generating an appropriate model.

1. Model selection and data compression
Intuitively, the goodness of a model given the data can be expressed by the following equation (3).

Here, Cost_M(M) is the model cost of representing the model M, and Cost_C(X|M) is the encoding cost of the tensor X given the model M. α is the weight on the encoding cost (α = 1 by default); the larger the value of α, the more closely the generated model fits the actual data (that is, the number of segments m and the number of regimes r become larger).
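The image of equation (3) is not reproduced in this text. Judging only from the description above, in which α weights the encoding cost, it presumably has the following form. The left-hand label Cost(X; M) is an assumption introduced here for readability.

```latex
\mathrm{Cost}(X; M) \;=\; \mathrm{Cost}_M(M) \;+\; \alpha \cdot \mathrm{Cost}_C(X \mid M) \tag{3}
```

Under this reading, a larger α makes a poor fit more expensive relative to model size, so additional segments and regimes are more readily accepted.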

・Model cost
Specifically, the cost of representing the full parameter set of the prediction system consists of the following elements.

Note that log* in note *2 above denotes the universal code length for integers, log*(x) ≈ log₂(x) + log₂log₂(x) + ... (Non-Patent Document 23). If the cost of a floating-point number is c_F, a single regime parameter θ with k states requires a cost of Cost_M(θ) = log*(k) + c_F(k + k² + 2kd), and the regime transition matrix Δ requires a cost of Cost_M(Δ) = c_F r².
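The two cost formulas above can be evaluated directly, as in the following sketch. The series for log* is summed over its positive terms only, and the default of 32 bits for c_F is an assumption chosen for illustration; the patent does not state a value here.

```python
import math

def log_star(x: int) -> float:
    """Universal code length of an integer x >= 1:
    log2(x) + log2(log2(x)) + ..., keeping only the positive terms."""
    total = 0.0
    term = math.log2(x)
    while term > 0:
        total += term
        term = math.log2(term)
    return total

def cost_regime(k: int, d: int, c_f: float = 32.0) -> float:
    """Cost_M(theta) = log*(k) + c_F (k + k^2 + 2kd) for a k-state, d-dimensional regime."""
    return log_star(k) + c_f * (k + k * k + 2 * k * d)

def cost_transition(r: int, c_f: float = 32.0) -> float:
    """Cost_M(Delta) = c_F r^2 for the r x r regime transition matrix."""
    return c_f * r * r
```

For example, `log_star(16)` sums 4 + 2 + 1, and a regime with k = 2 states over d = 3 sensors stores 2 initial probabilities, 4 transition probabilities, and 12 Gaussian parameters.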

・Encoding cost
Given the model parameters, the encoding cost of X can be expressed through information compression using Huffman coding, in terms of the negative log-likelihood, as in (Equation 6) below.

Here, the i-th and (i-1)-th segments are assumed to belong to the u-th and v-th regimes, respectively, and X[s_i] denotes the partial sequence of X covered by segment s_i. P(X[s_i]|θ_u) is the likelihood of X[s_i] given θ_u. In conclusion, the proposed algorithm determines the number r of time-series patterns contained in X and the number m of their change points so as to minimize equation (3).

Next, the specific algorithms that realize long-term label prediction while summarizing the data according to the cost function are described in detail.

2. Algorithm overview
The prediction system consists of the following algorithms.

・REGIMEGENERATION (P1): Detects the types and change points of the time-series patterns contained in the tensor X. The dynamics of each time-series pattern are represented as model parameters Θ, yielding the model parameter set {m, r, S, Θ, F}.

・FEATUREEXTRACTION (P2): Using the summary information {m, r, S, Θ, F} of the time-series patterns, represents the original tensor X by a latent state tensor Z and an error tensor ε.

・SPLITCAST (P3): From the partial sequence {Z(t_s:t_e), ε(t_s:t_e)} of {Z, ε} in a window t_s:t_e, extracts features that are precursors of a failure and predicts the failure label y(t_e+l_s) l_s steps ahead.

FIG. 3 shows an overview of the proposed model. Given a tensor X, the proposed method captures the temporal transitions of the time-series patterns of X and the facility-specific patterns, and summarizes X as {Z, ε} on that basis. Finally, it predicts and outputs the alert label l_s steps ahead from the obtained {Z, ε}.

3. RegimeGeneration (P1)
This section describes the algorithm in detail. A fundamental question in time-series analysis is whether the data has a hidden underlying structure. The multidimensional time-series tensor X handled here has characteristics from more than one viewpoint, namely those of the time domain and those of the facility domain. Specifically, time-series sensor data obtained from a smart factory has a temporal transition pattern for each process as well as facility-specific patterns. The following therefore performs multifaceted pattern discovery and grouping simultaneously, so as to summarize concisely the structure underlying a given time-series tensor.

Two algorithms for the multifaceted analysis of time-series tensors are proposed here: V-Split and H-Split. V-Split estimates regimes from the viewpoint of the time direction, and H-Split represents the characteristics of each facility as regimes. By applying these two algorithms in either direction as needed, important patterns are discovered efficiently and effectively from multiple angles and summarized as regimes. Specifically, the following two algorithms are repeated based on equation (3).

・V-Split: Detects temporally changing patterns and their change points in the tensor X and divides X into two groups (i.e., regimes). The model parameters {θ₁, θ₂, Δ} are estimated for these two regimes.

・H-Split: Extracts the characteristics of each facility from one regime appearing in the tensor X, divides it into two regimes, and then estimates the model parameters of those regimes.

With the above algorithms, the number of regimes changes as r = 1, 2, .... If dividing a regime θ₀ into two regimes {θ₁, θ₂} increases the value of the cost function (equation (3)), θ₀ is regarded as optimal and is not divided further. The cost is computed in the same way for every generated regime, and the splitting algorithms are repeated until the cost no longer decreases. Finally, the segments, regimes, and model parameters {m, r, S, Θ, F} at the point where the cost has converged are output, and RegimeGeneration ends.
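The stopping rule described above can be sketched as a greedy loop that keeps a split only when it lowers the cost. This is a simplified stand-in, not the patent's procedure: the function names are hypothetical, the cost is compared locally per regime rather than over the whole model, and the toy `cost` (a fixed model cost of 1 plus squared deviation from the mean) and `try_split` (cut in the middle) replace the HMM-based cost and V-Split/H-Split.

```python
from typing import Callable, List, Optional, Tuple

Regime = List[float]  # toy regime: just the data points assigned to it

def regime_generation(
    root: Regime,
    try_split: Callable[[Regime], Optional[Tuple[Regime, Regime]]],
    cost: Callable[[Regime], float],
) -> List[Regime]:
    """Split regimes repeatedly; keep a split only if it lowers the cost."""
    accepted: List[Regime] = []
    stack = [root]
    while stack:
        regime = stack.pop()
        parts = try_split(regime)
        if parts is not None and cost(parts[0]) + cost(parts[1]) < cost(regime):
            stack.extend(parts)      # the split pays off: refine both halves
        else:
            accepted.append(regime)  # splitting does not help: keep as is
    return accepted

def cost(regime: Regime) -> float:
    """Toy cost: model cost 1.0 plus squared error around the mean."""
    mean = sum(regime) / len(regime)
    return 1.0 + sum((x - mean) ** 2 for x in regime)

def try_split(regime: Regime) -> Optional[Tuple[Regime, Regime]]:
    """Toy split: cut the regime in the middle."""
    if len(regime) < 2:
        return None
    mid = len(regime) // 2
    return regime[:mid], regime[mid:]

regimes = regime_generation([0.0] * 4 + [10.0] * 4, try_split, cost)
```

On this toy input the first split separates the two levels and is accepted, while any further split only adds model cost and is rejected, so two regimes remain.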

Next, the splitting algorithms V-Split and H-Split are each described.

(3-1) V-Split
Given a multidimensional time-series tensor X, V-Split detects two regimes from the viewpoint of temporal transition and estimates their model parameters {θ₁, θ₂, Δ}. To generate a highly accurate model, the prediction system alternates between detecting segments and regimes and updating the model parameters, as follows.

・(Phase 1) V-Assignment: Given the model parameters of two regimes, extracts two segment sets {S₁, S₂} and the change points of the patterns based on them.

・(Phase 2) ModelEstimation: Given two segment sets, updates the model parameters {θ₁, θ₂, Δ} based on them.

An overview of V-Split is shown in Algorithm 1 (Table 2). Algorithm 1 is based on the expectation-maximization (EM) method, with the two phases corresponding to the E step and the M step, respectively.

First, as the simplest subproblem, consider the case where the tensor X and the model parameters {θ₁, θ₂, Δ} of two regimes are given. V-Assignment can detect the change points of the patterns in X based on the model parameters of the regimes (steps 5 to 7 in Table 2). The transition diagram in FIG. 4 illustrates the basic idea of the proposed algorithm: the transitions of the two regimes {θ₁, θ₂} are connected, and the pattern transitions between the given regimes are estimated while comparing the encoding costs of the two regimes at each time. The algorithm computes the encoding cost Cost_T(X|Θ) = -ln P(X|Θ) based on the Viterbi algorithm (Non-Patent Document 7), a type of dynamic programming. Specifically, the likelihood P(X|Θ) is calculated as in (Equation 7) below.

Here, P(X|Θ)_i is the likelihood when transitioning to the i-th regime θ_i. As an example, P(X|Θ)₁ is calculated as in (Equation 8) below.

Here, p_1;i(t) is the maximum probability of latent state i of regime θ₁ at time t; δ₂₁ is the regime transition probability from θ₂ to θ₁ (following Definition 3, in which δ_ij is the transition from the i-th to the j-th regime); max_u{p_2;u(t-1)} is the probability of the most likely latent state of θ₂ at the previous time t-1; π_1;i is the initial probability of latent state i of θ₁; b_1;i(x(t)) is the output probability of x(t) for latent state i of θ₁; and a_1;ji is the transition probability from latent state i to latent state j of θ₁. At time t = 1, the probability of being in regime θ₁ is given by p_1;i(1) = δ₁₁ π_1;i b_1;i(x(1)). The model parameters are estimated with the Baum-Welch algorithm (Non-Patent Document 1), and the regime transition probabilities Δ = {δ₁₁, δ₁₂, δ₂₁, δ₂₂} are calculated as in (Equation 9) below.

Here, Σ_{s∈S₁} |s| is the total length of the segments belonging to regime θ₁, and N₁₂ is the number of times the regime switches from θ₁ to θ₂. δ₂₁ and δ₂₂ can be calculated in the same way.
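The image of Equation 9 is not reproduced in this text. One natural reading of the quantities described above is δ₁₂ = N₁₂ / Σ|s| and δ₁₁ = 1 - δ₁₂, and the following sketch computes the probabilities under that assumption. The function name and the segment format (time-ordered triples with an inclusive end point) are hypothetical.

```python
from typing import Dict, List, Tuple

def regime_transition_probs(segments: List[Tuple[int, int, int]]) -> Dict[str, float]:
    """Estimate the 2 x 2 regime transition probabilities.

    segments: time-ordered (t_s, t_e, regime) triples, regime in {1, 2},
    with t_e inclusive. Both regimes must own at least one segment.
    """
    length = {1: 0, 2: 0}                # total segment length per regime
    switches = {(1, 2): 0, (2, 1): 0}    # N12 and N21
    for idx, (t_s, t_e, regime) in enumerate(segments):
        length[regime] += t_e - t_s + 1
        if idx > 0 and segments[idx - 1][2] != regime:
            switches[(segments[idx - 1][2], regime)] += 1
    d12 = switches[(1, 2)] / length[1]
    d21 = switches[(2, 1)] / length[2]
    return {"d11": 1.0 - d12, "d12": d12, "d21": d21, "d22": 1.0 - d21}

delta = regime_transition_probs([(0, 9, 1), (10, 14, 2), (15, 24, 1)])
```

In this example regime 1 covers 20 time steps with one switch to regime 2, and regime 2 covers 5 time steps with one switch back, so each row of the resulting matrix sums to 1 as Definition 3 requires.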

(3-2) H-Split
So far, V-Split (Algorithm 1), which captures the time-direction characteristics of the time-series tensor X, has been described. In practice, the time-series tensor X contains not only temporal transitions of patterns but also individual differences between facilities. For example, even when two facilities machine the same part, the behavior of the sensor data differs between them in each process. To capture such facility-specific characteristics and model them effectively, the prediction system proposes the H-Split algorithm. Intuitively, like V-Split, this algorithm estimates appropriate regimes and their model parameters by repeating two phases: (Phase 1) regime splitting and (Phase 2) model estimation. It differs from V-Split in its Phase 1 algorithm, H-Assignment, which captures facility-specific characteristics. An overview of H-Assignment is shown in Algorithm 2 (Table 3). The algorithm in Table 3 corresponds to the "V-Assignment" part in step 5 of Table 2; H-Split is executed by running Table 2 with that part replaced by H-Assignment.

Unlike typical clustering algorithms, H-Assignment effectively extracts facility-specific patterns. Specifically, given the tensor X and the model parameters {θ₁, θ₂}, Algorithm 2 computes the encoding cost of assigning the segments of facility i to a regime θ as in (Equation 10) below, and assigns the segments of facility i to the regime that gives the lower cost.

Here, X[i] = {s₁, s₂, ...} is the set of segments of facility i. That is, the segments of the same facility are constrained to belong to the same regime.
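The assignment rule can be sketched as follows: all data of a facility is scored against each regime, and the facility as a whole goes to the cheaper one. To keep the example short, each regime is reduced to a single Gaussian (μ, σ) and the cost is its negative log-likelihood; this is an assumed simplification of the HMM-based cost in Equation 10, and all names and values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

def gaussian_cost(xs: List[float], mu: float, sigma: float) -> float:
    """Negative log-likelihood of xs under N(mu, sigma^2), a stand-in coding cost."""
    return sum(
        0.5 * math.log(2 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )

def h_assignment(
    data: Dict[str, List[float]], regimes: List[Tuple[float, float]]
) -> Dict[str, int]:
    """Assign each facility, as a whole, to the regime index with the lowest cost."""
    return {
        facility: min(range(len(regimes)), key=lambda r: gaussian_cost(xs, *regimes[r]))
        for facility, xs in data.items()
    }

data = {
    "A": [0.1, -0.2, 0.0],
    "B": [5.1, 4.8, 5.2],
    "C": [0.3, 0.1, -0.1],
}
regimes = [(0.0, 1.0), (5.0, 1.0)]  # (mu, sigma) of two toy regimes
assignment = h_assignment(data, regimes)
```

Because the decision is made per facility rather than per segment, every segment of a facility ends up in the same regime, matching the constraint stated above.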

4. FeatureExtraction (P2)
So far, the algorithms for detecting, from multiple angles, time-series patterns that change at arbitrary times in a multidimensional time-series tensor have been described. The next step toward long-term prediction of failures is to extract, from the time-series data, features that indicate the cause or precursor of a failure. In general, sensor data collected at a high sampling rate contains a great deal of noise, and the more complex the monitored system, the harder it is to model its exact behavior. The prediction system therefore proposes a method that abstracts X using the characteristics of the time-series patterns and effectively extracts the precursors of failures. Specifically, given the time-series tensor X and the model parameter set {m, r, S, Θ, F}, X is decomposed into a latent state tensor Z based on the time-series patterns and an error tensor ε that arises from the modeling.

Now, suppose a set of r regimes Θ = {θ₁, ..., θ_r} is given. The data x_i(t) = {x_ij(t)} (j = 1, ..., d) of facility i at each time t is converted into a state z_i(t) of one of the regimes in Θ. Here, z_i(t) is the pair {μ, σ} of the mean and variance of all data points belonging to the same state, so the latent state tensor has dimensions Z ∈ R^(w×2d×n). Next, given Θ, the encoding error of the measured value x_ij(t) ∈ X of sensor j of facility i at time t is expressed by the posterior probability p(x_ij(t)|θ), so the encoding error of the entire time-series tensor X is ε ∈ R^(w×d×n). Finally, the sequence X' ∈ R^(w×3d×n) that combines the two features is output. With this processing, the latent behavior in the time direction can be taken into account when the learning model is estimated, without losing the information in the input data.
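The construction of the 3d feature channels for one facility can be sketched as follows. The sketch assumes that the state sequence has already been decoded, uses the Gaussian density of each reading under its state as the error term, and stacks the channels in the order μ, σ, density; the names, the channel order, and the toy parameters are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

def gauss_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def extract_features(
    x: List[List[float]],
    states: List[int],
    params: Dict[int, List[Tuple[float, float]]],
) -> List[List[float]]:
    """Build the 3d x n feature rows of one facility.

    x: d x n sensor readings; states: decoded state id per time step;
    params[state][j] = (mu, sigma) of sensor j in that state.
    Returns d rows of mu and d rows of sigma (Z), then d rows of densities (epsilon).
    """
    d, n = len(x), len(states)
    mu_rows = [[params[states[t]][j][0] for t in range(n)] for j in range(d)]
    sigma_rows = [[params[states[t]][j][1] for t in range(n)] for j in range(d)]
    err_rows = [
        [gauss_pdf(x[j][t], *params[states[t]][j]) for t in range(n)]
        for j in range(d)
    ]
    return mu_rows + sigma_rows + err_rows

x = [[0.0, 5.0], [1.0, 1.0]]  # d = 2 sensors, n = 2 time steps
states = [0, 1]
params = {0: [(0.0, 1.0), (1.0, 2.0)], 1: [(5.0, 1.0), (1.0, 2.0)]}
features = extract_features(x, states, params)
```

The result has 3d = 6 rows of length n: the first 2d rows describe which pattern the facility is in, and the last d rows describe how well each reading fits that pattern.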

5. SPLITCAST (P3)
The ultimate goal of the prediction system is to perform long-term prediction l_s steps ahead with high accuracy from a given time-series tensor X. Many deep-learning-based methods have been proposed in recent years as typical approaches to label prediction tasks. Such methods can learn flexibly by stacking more intermediate layers or increasing the number of units per layer, but the more layers and units there are, the more parameters must be learned and the longer the computation takes. There is also the problem of overfitting; although many techniques exist to address it, all of them rest on rules of thumb and require very fine manual tuning. The prediction system therefore combines a feature extraction method based on a probabilistic model with a deep learning method and learns the characteristic time-series patterns extracted from the actual data. This allows training with a smaller network and achieves efficient and effective alert label prediction while mitigating the overfitting problem.

Specifically, to model the time evolution of the tensor X' = {Z, ε}, an LSTM (long short-term memory) network (Non-Patent Document 9) is applied, as shown in FIG. 3. The LSTM is a deep learning model that treats input samples as time-series data and can learn high-dimensional nonlinear dynamics. It is an RNN (recurrent neural network) in which the units of the intermediate layer are replaced with a special structure called a memory unit, and it uses three kinds of gates, an input gate, an output gate, and a forget gate, to control the unit value c_t and the unit output h_t at time t. With the outputs of the gates denoted i_t, o_t, and f_t respectively, the forward propagation of the LSTM is expressed by the following equation (Equation 11).

This prediction system uses the sigmoid function as the activation function. As is well known, an LSTM can learn long-term dependencies in a given input sequence by means of its memory units. It is therefore expected to extract a feature vector that summarizes the latest operating status of the equipment while retaining the features that matter most for equipment failure over the course of regime transitions and state transitions within regimes.
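Equation 11 itself is not reproduced in this text. For orientation only, the following sketch shows one forward step of a standard LSTM cell with input, forget, and output gates. The stacked weight layout, the function names `sigmoid` and `lstm_step`, and the use of tanh for the candidate value and for the cell output are assumptions taken from the common textbook formulation; the patent's own equation may differ in detail.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, used here for the three gates."""
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of a standard LSTM cell (textbook form, an assumption).

    x_t    : (m,) input at time t (e.g. one time slice of X')
    h_prev : (u,) previous unit output h_{t-1}
    c_prev : (u,) previous unit value c_{t-1}
    W      : (4u, m), U : (4u, u), b : (4u,)
             parameters stacked in the order: input gate, forget gate,
             output gate, candidate value
    """
    u = h_prev.shape[0]
    a = W @ x_t + U @ h_prev + b
    i_t = sigmoid(a[0:u])            # input gate
    f_t = sigmoid(a[u:2 * u])        # forget gate
    o_t = sigmoid(a[2 * u:3 * u])    # output gate
    g_t = np.tanh(a[3 * u:4 * u])    # candidate value
    c_t = f_t * c_prev + i_t * g_t   # unit value
    h_t = o_t * np.tanh(c_t)         # unit output
    return h_t, c_t
```

With u = 10 units and a 9-dimensional input, the sizes used in the experiments for d = 3 sensors, such a cell has 4u(m + u + 1) = 800 parameters, which illustrates how small the network is.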

Finally, h_t is used to predict the label l_s steps ahead. In this embodiment, failure prediction l_s steps ahead from the latest partial sequence at time t is treated as a two-class classification task, and the output is the probability that a failure occurs at time t + l_s. The final output of the prediction system is therefore as shown in (Equation 12).

The objective function that the model in this prediction system minimizes is binary cross entropy (BCE). With N the batch size during model training and ŷ_i the output of the prediction system for input sample i, it is expressed as shown in (Equation 13).
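Equation 13 is not reproduced in this text. As a point of reference, binary cross entropy is conventionally defined as the batch mean of -[y_i log ŷ_i + (1 - y_i) log(1 - ŷ_i)]. The sketch below implements that conventional definition; the function name `binary_cross_entropy` and the clipping constant are assumptions added for numerical safety, not details from the patent.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross entropy over a batch of N samples.

    y_true : true labels (1 = failure at t + l_s, 0 = no failure)
    y_prob : predicted failure probabilities in [0, 1]
    eps    : clipping constant that avoids log(0) (an added safeguard)
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))
```

A model that always outputs 0.5 scores log 2 (about 0.693) under this loss, which is a useful baseline when reading training curves.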

Importantly, as the evaluation experiments below show, this prediction system achieves very high performance while using a model with a relatively small number of units (= 10) and a simple structure.

(5-1) Theoretical analysis
The computational complexity of this prediction system is linear in the data size, O(wdn). This lemma is explained below.

In each iteration, V-Assignment, H-Assignment, and ModelEstimation require O(wdnk²) computation to evaluate the encoding cost and estimate the model parameters. Here, w is the number of pieces of equipment, d is the number of dimensions, n is the length of the time series, and k is the number of hidden states in the regimes {θ_i}, i = 1, ..., r. The computational complexity of RegimeGeneration (P1) is therefore O(#iter · wdnk²). Because the number of iterations #iter and the number of hidden states k are very small constants, they can be ignored, and the complexity of RegimeGeneration becomes O(wdn). FeatureExtraction (P2) outputs the latent state and the modeling error for each piece of equipment, each sensor, and each time, so its complexity is O(wdn). Finally, training an LSTM with u units on the resulting model takes O(u² wdn). This prediction system does not assume a complex neural network, and the number of units u is a very small constant that can be ignored. The computational complexity of the prediction system is therefore O(wdn).

<Evaluation experiment>
To verify the effectiveness of this prediction system, experiments were conducted on real data using the specific example in FIG. 2. The experiments examined the following items.

(1) Accuracy of the proposed method for long-term prediction of equipment failures
(2) Verification of computation time for real-time monitoring of equipment
The experiments were run on a Linux (registered trademark) (Ubuntu 18.04 LTS) machine with 128 GB of memory and an NVIDIA TITAN V GPU (12 GB). The data set was z-normalized using its mean and variance before use.
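The z-normalization step can be sketched as below. The source does not state over which axis the statistics were computed; normalizing each sensor of each piece of equipment over time, and the function name `z_normalize`, are assumptions made for this illustration.

```python
import numpy as np

def z_normalize(X, axis=-1, eps=1e-8):
    """Subtract the mean and divide by the standard deviation along `axis`.

    X    : array such as a (w, d, n) time-series tensor
    axis : axis over which statistics are computed (ASSUMPTION: time axis)
    eps  : small constant that avoids division by zero for constant series
    """
    mean = X.mean(axis=axis, keepdims=True)
    std = X.std(axis=axis, keepdims=True)
    return (X - mean) / (std + eps)
```

After this step each series has approximately zero mean and unit variance, so sensors with very different physical units (rotational speed, voltage, temperature) contribute on a comparable scale.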

1. Prediction accuracy of this prediction system
The failure prediction accuracy for a given time-series tensor was verified. As comparative examples, logistic regression (LR) (Non-Patent Document 1), a common binary prediction model, and the recurrent neural network models RNN (recurrent neural network), GRU (gated recurrent unit) (Non-Patent Document 4), and LSTM were adopted. For LR, the mean, variance, maximum, and minimum were computed from the partial sequences that are given as mini-batches when estimating the recurrent models, and label prediction was performed on the resulting four-dimensional feature vector. For RNN, GRU, and LSTM, label prediction was performed with the raw data as input.

For this prediction system, the experiments used 200 prediction steps, a window size of 400, and an encoding cost weight of α = 1.0 as defaults. For all recurrent models, including this prediction system (Proposed, FIG. 5), the number of units in the intermediate layer was 10 and the number of units in the output layer was 5, and Adam (Non-Patent Document 12) was used as the optimization algorithm. Accuracy was used as the evaluation metric, and the mean values over 5-fold cross-validation were compared.
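The evaluation protocol, mean Accuracy over 5-fold cross-validation, can be sketched as follows. The source does not describe how the folds were formed; the random shuffled split, the fixed seed, and the helper names `k_fold_indices` and `accuracy` are assumptions for illustration.

```python
import numpy as np

def k_fold_indices(num_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    ASSUMPTION: samples are shuffled once and split into k nearly equal folds.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(num_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

In use, a model would be trained on `train_idx` and scored with `accuracy` on `test_idx` for each of the five splits, and the five scores averaged.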

The data set was acquired at 5-second intervals by three sensors, rotational speed (Speed), operating voltage (Load), and equipment temperature (Temp), attached to 55 pieces of factory equipment that were in actual operation for three months from October 2017 at Mitsubishi Heavy Industries Engine & Turbocharger, Ltd., machining bearing housings. Training samples were generated with a sliding window, and samples from periods when the equipment itself was not operating were omitted. There were 62,983 samples from normal operation and 1,069 samples preceding an emergency stop. Because this imbalance would bias learning, the number of normal-operation samples was reduced to match the number of emergency-stop samples, so that 1,069 × 2 samples were used in the experiments.
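The sample preparation described above, sliding-window extraction followed by balancing the two classes, can be sketched as follows. The source does not say how the normal-operation samples were selected; random undersampling, the fixed seed, and the function names `sliding_windows` and `balance_by_undersampling` are assumptions.

```python
import numpy as np

def sliding_windows(series, window, step=1):
    """Cut a (d, n) multivariate series into windows of shape (d, window).

    Returns an array of shape (num_windows, d, window); requires n >= window.
    """
    d, n = series.shape
    starts = range(0, n - window + 1, step)
    return np.stack([series[:, s:s + window] for s in starts])

def balance_by_undersampling(samples, labels, seed=0):
    """Keep equally many samples of each class (1 = before emergency stop).

    ASSUMPTION: the majority class is reduced by random selection.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    k = min(len(pos), len(neg))
    keep = np.concatenate([rng.choice(pos, k, replace=False),
                           rng.choice(neg, k, replace=False)])
    rng.shuffle(keep)
    return samples[keep], labels[keep]
```

Applied to the counts reported above, 62,983 normal samples and 1,069 pre-stop samples would be reduced to 1,069 of each.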

(1) Prediction accuracy when the number of prediction steps is varied
FIG. 5 compares accuracy as the number of steps ahead, l_s, is varied. In the figure, the order of the labels for the comparative examples corresponds to the order (left to right) in which the data are displayed. In this experiment, samples were generated for each value of l_s, and training and prediction were performed on them. The comparative examples show accuracy comparable to random prediction (Accuracy = 0.5), whereas this prediction system performs well under every condition. This result suggests that the cause of an emergency stop is not something simple, such as a rise in temperature or a drop in operating voltage, but a complex, nonlinear event. Because this prediction system captures the dynamics at each time while taking into account the time-series patterns contained in the real data, it extracted the causes of emergency stops more effectively than the other recurrent models.

(2) Prediction accuracy when the window size is varied
FIG. 6 compares prediction accuracy as the window width of the mini-batches used for network training is varied. This prediction system shows consistently high performance on data with different window widths.

(3) Precision and recall of the prediction results
FIG. 7 shows the precision and recall of the prediction results. Precision is the ratio of the number of correctly predicted events to the total number of predicted events. Recall is the ratio of the number of correctly predicted events to the total number of events that actually occurred. Both approach 1 when accuracy is high. This prediction system performs well on both metrics.
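The two metrics can be computed from predicted and true labels as in the following sketch. The function name `precision_recall` and the convention of returning 0.0 when a denominator is zero are assumptions made here.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted events
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted but did not occur
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # occurred but not predicted
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For example, with three actual events of which two are predicted, plus one false alarm, both precision and recall equal 2/3.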

(4) Prediction accuracy versus the number of discovered segments
FIG. 8 shows the prediction accuracy of this prediction system as a function of the number of detected segments m. The number of detected segments was increased or decreased by varying the encoding cost weight α from 0.1 to 10. As FIG. 8 shows, prediction accuracy varies greatly with the number of segments into which the prediction system divides the data. When m is small, sufficient summary information cannot be obtained from the time-series data, and prediction accuracy falls. When m is large, prediction accuracy likewise falls, because the summary information approaches the raw data. This result also suggests that pattern detection from time-series tensors is effective in improving failure prediction accuracy. In this experiment, the best result (Accuracy = 0.88) was obtained at m = 1000. In conclusion, this prediction system improved accuracy over the comparative examples by about 62% on average.

(5) Relationship between the number of training samples and prediction accuracy
In actual operation, sufficient accuracy may not be obtained when few training samples are available. FIG. 9 shows the relationship between the number of training samples and prediction accuracy. This prediction system outperforms the comparative examples even with few samples, and it predicts failure events with increasing accuracy as the number of training samples grows.

2. Computation speed of the proposed method
FIG. 10 shows the computational cost of this prediction system as the number of pieces of equipment w, the number of sensors d, and the sequence length n are each varied. More specifically, it shows the computation time needed to divide the input data into time-series patterns and complete 10 epochs of model training. Because the prediction system detects time-series patterns efficiently from a given time-series tensor, its computational cost was linear in the data size (that is, O(wdn)) in all experiments, which shows that the method is well suited to analyzing large-scale sensor data.

As described above, experiments using real data obtained from, for example, factory equipment confirmed that this prediction system models complex time-series patterns appropriately and performs long-term failure prediction with high accuracy, and that it achieves substantially better accuracy and performance than the existing comparative examples.

The present invention is not limited to predicting alert events for factory equipment. It is also applicable to, for example, alert label prediction for failures and the like according to the driving state of each vehicle using various on-board sensors, and alert label prediction based on various kinds of biological information. In addition to defects, failures, and quality degradation, various alert contents can be set for the alert label depending on the application. Furthermore, the prediction processing is not limited to artificial intelligence (AI), and other techniques may be employed.

As explained above, the event prediction system according to the present invention preferably comprises: a first feature extraction means that continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data continuously collected from multiple types of sensors placed on each of multiple observation targets; a second feature extraction means that uses the model parameters to sequentially convert the time-series sensor data into features in the form of summary information containing modeling information and its error information; and a prediction means that takes the summary information as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

In the event prediction method according to the present invention, it is preferable that: a first feature extraction unit of a computer continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data that is continuously collected from multiple types of sensors placed on each of multiple observation targets and stored in a storage unit, and stores the model parameters in the storage unit; a second feature extraction unit of the computer reads the model parameters and the time-series sensor data from the storage unit, sequentially converts the time-series sensor data into features in the form of summary information containing modeling information and its error information, and stores the summary information in the storage unit; and a prediction unit of the computer reads the summary information from the storage unit as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

The program according to the present invention preferably causes a computer to function as: a first feature extraction means that continuously extracts model parameters of multifaceted dynamic patterns from time-series sensor data continuously collected from multiple types of sensors placed on each of multiple observation targets; a second feature extraction means that uses the model parameters to sequentially convert the time-series sensor data into features in the form of summary information containing modeling information and its error information; and a prediction means that takes the summary information as input and outputs the probability that a predetermined event occurs a predetermined time ahead.

It is also preferable that the first feature extraction means detects the dynamic patterns by segmenting the data in the time direction and across the observation targets and patterning the resulting segments. With this configuration, the dynamic patterns are extracted from multiple angles, so the amount of data required for processing can be reduced while a loss of accuracy is suppressed.

It is also preferable that the first feature extraction means sets the number of segments using a cost function. With this configuration, when the time-series sensor data is segmented, the cost function sets the number of segments to an optimal value that takes the amount of data and the processing time into account.

It is also preferable that the prediction means obtains the probability of occurrence of the predetermined event based on parameters set in a neural network model. With this configuration, highly accurate prediction is possible with a small model of simple structure.

It is also preferable that the prediction means applies LSTM (long short-term memory) to the neural network model. With this configuration, the LSTM allows the method to be applied as a deep learning model and can learn long-term dependencies in the input sequence, so highly accurate prediction far into the future becomes possible.

It is also preferable that the present invention comprises a machine learning device that takes in the summary information obtained by the second feature extraction means for a predetermined period, performs machine learning with a learning prediction means having the same configuration as the prediction means, and updates the prediction means with the parameters obtained as the learning result. With this configuration, prediction accuracy can be improved progressively.

1 Event prediction system
11 Data acquisition processing unit
12 Feature extraction unit (first and second feature extraction means)
13 Prediction unit
14 Parameter update unit
100 Storage unit
20 Observation target
21 Sensor group
30 Machine learning device

Claims

A first feature extraction means that continuously extracts a model parameter set, including model parameters of multifaceted dynamic patterns, from time-series sensor data that is continuously collected from multiple types of sensors placed on each of multiple observation targets and that appears in the window ts:te (a predetermined period extending from the current time te back to a past time ts);
a second feature extraction means that, given the time-series sensor data and the model parameter set, uses the model parameter set to sequentially convert the time-series sensor data into features consisting of modeling information Z(ts:te) based on time-series patterns and error information ε(ts:te) between the time-series sensor data and the modeling information Z(ts:te) arising in the modeling;
a prediction means for inputting the modeling information Z(ts:te) and the error information ε(ts:te) and outputting the probability of occurrence of a predetermined event in a predetermined time ahead;
The first feature extracting means detects the dynamic pattern by segmenting and patterning the dynamic pattern in the time direction and between the observation targets, and sets the number of the segments using a cost function,
The model parameter set is {m, r, S, Θ, F},
The second feature extracting means uses a hidden Markov model to summarize the modeling information Z(ts:te) and error information ε(ts:te) during modeling,
The event prediction system is characterized in that the prediction means obtains the probability of occurrence of the predetermined event based on parameters set in a neural network model applying LSTM (Long-short term memory).
Here, m is the number of segments in the time-series sensor data, r is the number of regimes among them, S is the segment set representing the start point, end point, and observation target number of each segment, Θ is the model parameters of each regime, and F is the number of the regime to which each segment belongs.

2. The event prediction system according to claim 1, further comprising a machine learning device that takes in, for a predetermined period, the modeling information Z(ts:te) and the error information ε(ts:te) obtained by the second feature extraction means, performs machine learning with a learning prediction means having the same configuration as the prediction means, and updates the prediction means with the parameters obtained as the learning result.

A first feature extraction means of a computer continuously extracts a model parameter set, including model parameters of multifaceted dynamic patterns, from time-series sensor data that is continuously collected from multiple types of sensors placed on each of multiple observation targets and that appears in the window ts:te (a predetermined period extending from the current time te back to a past time ts),
a second feature extraction means of the computer, given the time-series sensor data and the model parameter set, uses the model parameter set to sequentially convert the time-series sensor data into features consisting of modeling information Z(ts:te) based on time-series patterns and error information ε(ts:te) between the time-series sensor data and the modeling information Z(ts:te) arising in the modeling,
a prediction means of the computer takes the modeling information Z(ts:te) and the error information ε(ts:te) as input and outputs the probability that a predetermined event occurs a predetermined time ahead, and further,
The first feature extracting means detects the dynamic pattern by segmenting and patterning the dynamic pattern in the time direction and between the observation targets, and sets the number of the segments using a cost function,
The model parameter set is {m, r, S, Θ, F},
The second feature extracting means uses a hidden Markov model to summarize the modeling information Z(ts:te) and error information ε(ts:te) during modeling,
The event prediction method is characterized in that the prediction means obtains the probability of occurrence of the predetermined event based on parameters set in a neural network model applying LSTM (Long-short term memory).
Here, m is the number of segments in the time-series sensor data, r is the number of regimes among them, S is the segment set representing the start point, end point, and observation target number of each segment, Θ is the model parameters of each regime, and F is the number of the regime to which each segment belongs.

A first feature extraction means that continuously extracts a model parameter set, including model parameters of multifaceted dynamic patterns, from time-series sensor data that is continuously collected from multiple types of sensors placed on each of multiple observation targets and that appears in the window ts:te (a predetermined period extending from the current time te back to a past time ts);
a second feature extraction means that, given the time-series sensor data and the model parameter set, uses the model parameter set to sequentially convert the time-series sensor data into features consisting of modeling information Z(ts:te) based on time-series patterns and error information ε(ts:te) between the time-series sensor data and the modeling information Z(ts:te) arising in the modeling; and
a prediction means that takes the modeling information Z(ts:te) and the error information ε(ts:te) as input and outputs the probability that a predetermined event occurs a predetermined time ahead, a computer being caused to function as each of these means, and further,
The first feature extracting means detects the dynamic pattern by segmenting and patterning the dynamic pattern in the time direction and between the observation targets, and sets the number of the segments using a cost function,
The model parameter set is {m, r, S, Θ, F},
The second feature extracting means uses a hidden Markov model to summarize the modeling information Z(ts:te) and error information ε(ts:te) during modeling,
The program is characterized in that the prediction means obtains the probability of occurrence of the predetermined event based on parameters set in a neural network model applying LSTM (Long-short term memory).
Here, m is the number of segments in the time-series sensor data, r is the number of regimes among them, S is the segment set representing the start point, end point, and observation target number of each segment, Θ is the model parameters of each regime, and F is the number of the regime to which each segment belongs.