JP7484653B2

JP7484653B2 - Information processing device, information processing method, operation management device, and operation management method

Info

Publication number: JP7484653B2
Application number: JP2020175752A
Authority: JP
Inventors: 翔長田
Original assignee: JFE Engineering Corp
Current assignee: JFE Engineering Corp
Priority date: 2020-10-20
Filing date: 2020-10-20
Publication date: 2024-05-16
Anticipated expiration: 2040-10-20
Also published as: JP2022067185A

Description

The present invention relates to an information processing device, an information processing method, an operation management device, and an operation management method.

Inference by a neural network, such as classification, regression, or the output of a policy function, is unstable to some degree because the network's high degree of freedom leads to overfitting. Various regularization methods have therefore been proposed to reduce overfitting. However, the parameters of such regularization are difficult to tune, and depending on the settings the training time may become enormous or generalization performance may not be obtained. Furthermore, neural networks are generally known to be vulnerable to noise. In practical use, a neural network may output erroneous inference results under the influence of measurement errors and the like. Non-Patent Documents 1 and 2 disclose techniques that add noise at training time in order to augment the training data and improve generalization performance.

JP 2019-032659 A

“Noisy Networks for Exploration” Meire Fortunato et al. (2018)
“Creating artificial neural networks that generalize” Jocelyn Sietsma, Robert J. F. Dow (1991)

However, the conventional techniques described above are limited in how far they can improve accuracy. There has therefore been a demand for a technique that further improves the accuracy of inference results obtained by inputting input data into a neural network.

The present invention has been made in view of the above, and its object is to provide an information processing device, an information processing method, an operation management device, and an operation management method that can improve the accuracy of inference results obtained by inputting input data into a neural network.

To solve the problems described above and achieve the object, an information processing device according to one aspect of the present invention includes a control unit that acquires at least one piece of input data and outputs inference result data based on the input data using a neural network. The control unit stores the acquired input data in a storage unit; generates a plurality of noise-added data by adding a plurality of mutually different noises to one piece of the input data read from the storage unit; inputs the plurality of noise-added data to the neural network to output a plurality of noise-added inference result data corresponding to the plurality of noise-added data; and outputs the inference result data based on the plurality of noise-added inference result data.

In the information processing device according to one aspect of the present invention, in the above invention, the plurality of mutually different noises are noises selected based on a normal distribution.

In the information processing device according to one aspect of the present invention, in the above invention, the control unit runs a simulation on each of the plurality of noise-added inference result data, outputs a plurality of evaluation values corresponding to the respective noise-added inference result data, and outputs, as the inference result data, at least one noise-added inference result data selected from the plurality of noise-added inference result data based on the plurality of evaluation values.

In the information processing device according to one aspect of the present invention, in the above invention, the control unit performs ensemble learning on the plurality of noise-added inference result data and outputs the output obtained by the ensemble learning as the inference result data.
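The text does not say which ensemble method is used. As a rough illustration only, the sketch below combines several noise-added inference results in two common ways: an element-wise mean for numeric outputs and a majority vote for discrete outputs. The function names `ensemble_mean` and `ensemble_vote` and the sample values are hypothetical and are not taken from the patent.

```python
from collections import Counter
from statistics import fmean


def ensemble_mean(results):
    """Element-wise mean of several numeric inference results (assumed aggregation rule)."""
    return [fmean(column) for column in zip(*results)]


def ensemble_vote(labels):
    """Majority vote over several discrete inference results (assumed aggregation rule)."""
    return Counter(labels).most_common(1)[0][0]


# Three hypothetical noise-added inference results, each with two output values.
noise_added_results = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
averaged = ensemble_mean(noise_added_results)

# Three hypothetical discrete inference results.
voted = ensemble_vote(["on", "off", "on"])
```

Averaging suits regression-like outputs such as switching levels, while voting suits class labels; other combiners (median, weighted mean) would fit the same description equally well.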

In the information processing device according to one aspect of the present invention, in the above invention, the input data is data including discrete values.

An operation management device according to one aspect of the present invention is provided for a storage facility that includes at least one storage unit for storing a liquid and at least one liquid delivery unit capable of supplying the liquid to the storage unit, and includes an operation control unit that controls the liquid delivery unit to manage the liquid level in the storage unit. The operation control unit acquires, from the storage facility, liquid level information including the liquid level in the storage unit; inputs information including the acquired liquid level information, as storage input data, to the information processing device of the above invention; acquires, as storage inference result data, the inference result data output from the information processing device in response to the storage input data; and controls the liquid delivery unit based on the storage inference result data.

In the operation management device according to one aspect of the present invention, in the above invention, the storage inference result data includes information on the liquid level at which the liquid delivery unit is switched between start and stop.
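One way to read "the liquid level at which the liquid delivery unit is switched between start and stop" is as a pair of thresholds applied with hysteresis. The sketch below is a minimal illustration under that assumption; the function `next_pump_state`, the separate `start_level` and `stop_level` parameters, and the numeric values are hypothetical and not taken from the patent.

```python
def next_pump_state(pump_on, level, start_level, stop_level):
    """Return the pump state for the next step (hypothetical hysteresis rule).

    The pump supplies liquid to the storage unit, so it starts when the
    level is low and stops when the level is high.
    """
    if level <= start_level:
        return True       # level fell to the start threshold: run the pump
    if level >= stop_level:
        return False      # level rose to the stop threshold: stop the pump
    return pump_on        # between the thresholds: keep the current state


# Hypothetical switching levels, e.g. taken from the storage inference result data.
start_level, stop_level = 1.5, 3.0

state = False
history = []
for level in [2.0, 1.4, 2.2, 3.1, 2.5]:
    state = next_pump_state(state, level, start_level, stop_level)
    history.append(state)
```

Keeping the current state between the two thresholds avoids rapid on/off cycling when the level hovers near a single switching point.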

In the operation management device according to one aspect of the present invention, in the above invention, the storage input data includes information on predicted values, over time, of the amount of the liquid discharged from the storage unit during a predetermined period.

In the operation management device according to one aspect of the present invention, in the above invention, the liquid delivery unit comprises a pump whose start/stop switching is controlled, and the storage input data includes information on the start/stop state of the pump at a predetermined time.

In the operation management device according to one aspect of the present invention, in the above invention, the liquid delivery unit comprises an electrically driven pump, and the storage input data includes information on the unit price of the electricity and the power consumption during a predetermined period.
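To illustrate why the unit price and power consumption are useful inputs, the following sketch computes the electricity cost of a pump schedule. It is an illustrative assumption, not a formula stated in the patent: the function `pumping_cost`, the one-hour time step, and all numeric values are hypothetical.

```python
def pumping_cost(unit_prices, pump_on, power_kw, step_hours=1):
    """Electricity cost of a schedule: sum of price * power * duration over running steps.

    unit_prices: price per kWh for each time step (hypothetical values)
    pump_on:     True where the pump runs in that step
    power_kw:    power drawn by the pump while running
    """
    return sum(
        price * power_kw * step_hours
        for price, running in zip(unit_prices, pump_on)
        if running
    )


# Hypothetical four-hour window with a cheaper tariff in the first two hours.
unit_prices = [12, 12, 25, 25]
schedule = [True, True, False, True]
cost = pumping_cost(unit_prices, schedule, power_kw=30)
```

Under a time-varying tariff, shifting pump operation into the cheaper steps lowers this total, which is one plausible reason to give the price to the network.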

An information processing method according to one aspect of the present invention is executed by an information processing device that acquires at least one piece of input data and outputs inference result data based on the input data using a neural network. The method includes: storing the acquired input data in a storage unit; generating a plurality of noise-added data by adding a plurality of mutually different noises to one piece of the input data read from the storage unit; inputting the plurality of noise-added data to the neural network to output a plurality of noise-added inference result data corresponding to the plurality of noise-added data; and outputting the inference result data based on the plurality of noise-added inference result data.

An operation management method according to one aspect of the present invention is executed by an operation management device that, for a storage facility including at least one storage unit for storing a liquid and at least one liquid delivery unit capable of supplying the liquid to the storage unit, controls the liquid delivery unit to manage the liquid level in the storage unit. The method includes: acquiring, from the storage facility, liquid level information including the liquid level in the storage unit; using information including the acquired liquid level information as storage input data and acquiring storage inference result data obtained for the storage input data by the information processing method according to the above invention; and controlling the liquid delivery unit based on the storage inference result data.

The information processing device, information processing method, operation management device, and operation management method according to the present invention make it possible to improve the accuracy of inference results obtained by inputting input data into a neural network.

FIG. 1 is a block diagram showing the configuration of an operation management system to which an information processing device according to an embodiment of the present invention is applied.
FIG. 2 is a diagram for explaining a method for deriving an inference result from input data according to an embodiment of the present invention.
FIG. 3 is a flowchart for explaining an operation management method for a distribution reservoir system according to an embodiment of the present invention.
FIG. 4 is a graph showing the control duration over the operation time for the operation management method for a distribution reservoir system according to an embodiment of the present invention.
FIG. 5 is a graph showing the control duration over the operation time for an operation management method for a distribution reservoir system according to the prior art.
FIG. 6 is a graph showing the control score over the operation time for the operation management method for a distribution reservoir system according to an embodiment of the present invention.
FIG. 7 is a graph showing the control score over the operation time for an operation management method for a distribution reservoir system according to the prior art.
FIG. 8 is a diagram for explaining a method for deriving an inference result from input data according to a modification of an embodiment of the present invention.
FIG. 9 is a diagram schematically showing the overall configuration of a neural network according to the prior art.

An embodiment of the present invention is described below with reference to the drawings. In all the drawings of the embodiment, identical or corresponding parts are given the same reference numerals. The present invention is not limited to the embodiment described below.

FIG. 1 is a block diagram showing the configuration of an operation management system to which an information processing device according to an embodiment of the present invention is applied. As shown in FIG. 1, the operation management system 1 according to the embodiment has a learning device 10 and an operation management device 20 that can communicate with each other via a network 2. The operation management device 20 may include the learning device 10. The operation management device 20 is configured to be able to collect various information from a distribution reservoir system 30 via the network 2.

The network 2 is, for example, a public communication network such as the Internet, and consists of one or a combination of, for example, a LAN (Local Area Network), a WAN (Wide Area Network), a telephone communication network such as a mobile phone network, a public line, a VPN (Virtual Private Network), and a dedicated line. The network 2 combines wired and wireless communication as appropriate.

(Distribution reservoir system)
The distribution reservoir system 30, serving as the storage facility, includes a plurality of distribution reservoirs 31, 32 and a plurality of distribution pumps 33, 34. One or more distribution pumps 33, 34 are provided for each of the distribution reservoirs 31, 32. As a specific example, when there are about 30 distribution reservoirs 31, 32 serving as storage units, about 40 distribution pumps 33, 34 serving as liquid delivery units are provided. Water is supplied from the distribution reservoirs 31, 32 to consumers 35, 36 that require water.

In this embodiment, the learning device 10, serving as a machine learning device, performs reinforcement learning in which the input parameters are predetermined data such as environmental conditions (for example, transitions of the distribution reservoir system 30), the start/stop state and power consumption of the distribution pumps 33, 34, the electricity charge per unit time of the consumers 35, 36 during a predetermined period (hereinafter, the electricity unit price), and the time, and the output parameters are the water levels at which the plurality of distribution pumps 33, 34 in the distribution reservoir system 30 are switched between start and stop.

(Learning device)
The learning device 10 executes a data collection process that collects, via the network 2, various information transmitted from a plurality of distribution reservoir systems 30 each having a communication unit (not shown). The learning device 10 can perform machine learning using the collected information. The learning device 10 includes a control unit 11, a storage unit 12, a communication unit 13, an output unit 14, and an input unit 15.

Specifically, the control unit 11 includes a processor such as a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or an FPGA (Field-Programmable Gate Array), and a main memory unit such as a RAM (Random Access Memory) or a ROM (Read Only Memory) (none of which are shown).

Physically, the storage unit 12 is composed of a storage medium selected from volatile memory such as RAM, non-volatile memory such as ROM, EPROM (Erasable Programmable ROM), a hard disk drive (HDD), a solid state drive (SSD), removable media, and the like. Removable media are, for example, USB (Universal Serial Bus) memory or disc recording media such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a BD (Blu-ray (registered trademark) Disc). The storage unit 12 may also be configured using a computer-readable recording medium such as an externally attachable memory card. The storage unit 12 can store an operating system (OS), various programs, various tables, various databases, and the like for executing the operations of the learning device 10. The various programs include the learning model and the neural network according to this embodiment. These programs can also be recorded on computer-readable recording media such as hard disks, flash memory, CD-ROMs, DVD-ROMs, and flexible disks and distributed widely.

In this embodiment, the control unit 11 loads a program stored in the storage unit 12 into the work area of the main memory unit and executes it, and controls each component through the execution of the program, thereby realizing functions that meet a predetermined purpose. Specifically, by executing the program, the control unit 11 can realize the functions of a learning unit 111, a determination unit 112, an action selection unit 113, a state input unit 114, an environment unit 115, a calculation unit 116, and a simulator unit 117.

The function of the learning unit 111 is carried out when the control unit 11 executes the program. The learning unit 111 can perform machine learning based on the input/output data sets received by the learning device 10. The learning unit 111 writes the learned result to the storage unit 12 and stores it as a learning model 121. Separately from the neural network that is being trained, the learning unit 111 may store in the storage unit 12, at a predetermined timing, the latest learning model at that timing. The storing may be an update, in which the old learning model 121 is deleted and the latest learning model 121 is stored, or an accumulation, in which the latest learning model 121 is stored while part or all of the old learning model 121 is retained.

The communication unit 13 is, for example, a LAN (Local Area Network) interface board or a wireless communication circuit for wireless communication. The LAN interface board or wireless communication circuit is connected to the network 2. The communication unit 13 connects to the network 2 and communicates with the operation management device 20 and the distribution reservoir system 30.

The output unit 14, serving as output means, is configured to notify the outside of predetermined information under the control of the control unit 11, by displaying characters, figures, and the like on the screen of a display such as a liquid crystal display or a plasma display, or by outputting sound from a speaker. The output unit 14 further includes a printer that outputs predetermined information by printing it on printing paper or the like. The various information stored in the storage unit 12 can be checked, for example, on a monitor of the output unit 14 installed in a predetermined office or the like.

The input unit 15, serving as input means, is composed of, for example, a keyboard, a touch-panel keyboard that is built into the output unit 14 and detects touch operations on the display panel, or a voice input device that enables calls with the outside. The output unit 14 and the input unit 15 may be integrated into an input/output unit such as a touch panel display or a speaker microphone.

(Operation management device)
The operation management device 20 is configured to be able to communicate via the network 2. The operation management device 20 includes a control unit 21, a storage unit 22, a communication unit 23, an output unit 24, and an input unit 25. Physically and functionally, the control unit 21, the storage unit 22, the communication unit 23, the output unit 24, and the input unit 25 have the same configurations as the control unit 11, the storage unit 12, the communication unit 13, the output unit 14, and the input unit 15 described above, respectively.

In this embodiment, the control unit 21 can realize, specifically, the functions of an operation planning unit 211 and an operation control unit 212 by loading a program stored in the storage unit 22 into the work area of the main memory unit and executing it.

The storage unit 22 stores an operation plan database 221, in which the operation plan information generated by the operation planning unit 211 is searchable, and an operation information database 222, in which the operation information of the distribution reservoir system 30 acquired by the operation control unit 212 is searchable. The operation plan information stored in the operation plan database 221 includes, for example, plan information on the timing of starting and stopping the distribution pumps 33, 34. The operation control unit 212 controls the starting and stopping of the distribution pumps 33, 34 in the distribution reservoir system 30 based on the operation plan information. The operation information in the operation information database 222 includes, for example, on/off information of the distribution pumps 33, 34 at a predetermined time (hereinafter, start/stop information) and information on the level of the water stored in the distribution reservoirs 31, 32 (hereinafter, water level information). The information on the distribution reservoir system 30 is not limited to these items.

The communication unit 23 connects to the network 2 and can communicate with the learning device 10 and the distribution reservoir system 30. In response to command signals output under the control of the control unit 21, the communication unit 23 collects various information such as start/stop information and water level information, which serves as the liquid level information of the distribution reservoirs 31, 32. The information transmitted and received by the communication unit 23 is not limited to these items. In this embodiment, the operation management device 20 can also function as a cloud server capable of communicating via the network 2.

For the learning device 10 serving as an information processing device, the present inventor examined information processing methods that use a neural network according to the prior art. FIG. 9 is a diagram schematically showing the overall configuration of a neural network according to the prior art. The neural network 300 shown in FIG. 9 is, for example, a feedforward neural network and has an input layer 301, an intermediate layer 302, and an output layer 303. The input layer 301 consists of a plurality of nodes, and a different input parameter is input to each node. The intermediate layer 302 receives the output of the input layer 301. The intermediate layer 302 has a multilayer structure that includes a layer consisting of a plurality of nodes that receive input from the input layer 301. The output layer 303 receives the output of the intermediate layer 302 and outputs the output parameters. Machine learning using a neural network whose intermediate layer 302 has a multilayer structure, for example three to five layers, is called deep learning. In the example shown in FIG. 9, the input parameter is predetermined input data 310, and the output parameter is the target inference result data 320.
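The layer structure described for FIG. 9 can be sketched as a plain forward pass. The example below is a minimal illustration using only the Python standard library; the layer sizes, weights, activation functions, and the names `dense` and `forward` are hypothetical and are not specified in the patent.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one output per weight row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(input_data, layers):
    """Propagate the input through the layers in order (input -> intermediate -> output)."""
    values = input_data
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values


def identity(v):
    return v


# Hypothetical network: 2 input nodes, 2 intermediate nodes, 1 output node.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),  # intermediate layer
    ([[1.0, 1.0]], [0.0], identity),                      # output layer
]

inference_result = forward([0.0, 0.0], layers)
```

A deeper network in the sense described above would simply add more `(weights, biases, activation)` entries to `layers`.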

When a predetermined learning model is generated in the neural network 300 by supervised or semi-supervised learning, input/output data sets of training input parameters and training output parameters are used. The learning unit 111 in the control unit 11, which outputs predetermined inference result data 120 for input data 110, can generate the learning model 121 by machine learning such as deep learning using the neural network shown in FIG. 9, with the training input parameters and training output parameters as teacher data. The control unit 11 of the learning device 10 derives the inference result data 320 from the input data 310 based on the learning model generated by the learning unit 111.

When the neural network 300 is used for reinforcement learning, input data 310 such as a state s and an action a is input to the neural network 300, and inference result data 320 is output. The inference result data 320 that is output is, for example, a state s, an action a, or a state-action value (Q value).

As a result of various studies on the conventional neural network 300, the present inventor devised a method of adding a plurality of mutually different noises to the input data at inference time, that is, when the data is input to the neural network 300, in order to improve the accuracy of the inference results that are output. Specifically, the inventor found that a highly accurate output with a high evaluation value can be obtained by adding noise to the input data at inference time to generate input data in a plurality of patterns, inputting each of the generated input data to the neural network 300, and selecting or generating at least one inference result based on the plurality of inference results thus obtained. The present invention and the embodiment described below were devised through these intensive studies by the inventor.

図２は、本発明の一実施形態による情報処理装置としての学習装置１０が実行する情報処理方法を説明するための図である。図２に示すように、一実施形態による情報処理方法においては、演算部１１６が、１つの入力データ３１０に対して、第１ノイズを付加することにより、第１ノイズ付加データ３１１－１を生成する。同様に、演算部１１６が、１つの入力データ３１０に対して第１ノイズとは異なる第２ノイズを付加することにより、第２ノイズ付加データ３１１－２を生成する。さらに、演算部１１６は、入力データに対して第１ノイズとも第２ノイズとも異なる第３ノイズを付加することにより、第３ノイズ付加データ３１１－３を生成する。これらの互いに異なるノイズの付加は、１つの入力データ３１０に対してｎ個（ｎは２以上の自然数）のノイズを付加するまで実行される。入力データ３１０に対して付加される第１ノイズから第ｎノイズまでのｎ個のノイズは、互いに相関がない、互いに異なるノイズである。これらの互いに異なる第１ノイズ～第ｎノイズまでの複数のノイズは、例えば正規分布、いわゆるガウス雑音に基づいて選択されるが、他の分布に基づいたノイズを選択することも可能である。これにより、同じ入力データ３１０に基づいて、互いに異なる第１ノイズ付加データ３１１－１～第ｎノイズ付加データ３１１－ｎが、演算部１１６により生成される。生成された第１ノイズ付加データ３１１－１～第ｎノイズ付加データ３１１－ｎは、記憶部１２に格納される。 2 is a diagram for explaining an information processing method executed by the learning device 10 as an information processing device according to one embodiment of the present invention. As shown in FIG. 2, in the information processing method according to one embodiment, the calculation unit 116 generates first noise-added data 311-1 by adding a first noise to one input data 310. Similarly, the calculation unit 116 generates second noise-added data 311-2 by adding a second noise different from the first noise to one input data 310. Furthermore, the calculation unit 116 generates third noise-added data 311-3 by adding a third noise different from both the first noise and the second noise to the input data. The addition of these mutually different noises is performed until n noises (n is a natural number of 2 or more) are added to one input data 310. The n noises from the first noise to the nth noise added to the input data 310 are mutually different noises that are not correlated with each other. These multiple noises from the first noise to the nth noise that are mutually different are selected based on, for example, a normal distribution, so-called Gaussian noise, but it is also possible to select noises based on other distributions. 
As a result, the calculation unit 116 generates the first noise-added data 311-1 to the nth noise-added data 311-n, which are different from each other, based on the same input data 310. The generated first noise-added data 311-1 to the nth noise-added data 311-n are stored in the storage unit 12.
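The noise-addition step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `make_noise_added_data`, the standard deviation `sigma`, and the sample values are assumptions introduced here, and the input data is assumed to be a flat list of floats.

```python
import random


def make_noise_added_data(input_data, n, sigma=0.05, seed=None):
    """Return n noise-added copies of input_data (a list of floats).

    Each copy receives its own independent Gaussian draw per element,
    so the n noises are mutually different and uncorrelated.
    sigma is a hypothetical noise scale, not a value from the source.
    """
    if n < 2:
        raise ValueError("n must be a natural number of 2 or more")
    rng = random.Random(seed)
    copies = []
    for _ in range(n):
        copies.append([x + rng.gauss(0.0, sigma) for x in input_data])
    return copies


# Hypothetical input data 310 (for example, three measured values).
input_310 = [2.4, 3.1, 1.0]
noise_added = make_noise_added_data(input_310, n=5, seed=0)
```

Replacing `rng.gauss` with `rng.uniform` would give the "other distribution" variant that the text mentions.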

学習部１１１は、記憶部１２からニューラルネットワーク１２２のプログラムを読み込んでニューラルネットワーク３００の処理を実行する。すなわち、学習部１１１は、第１ノイズ付加データ３１１－１～第ｎノイズ付加データ３１１－ｎのそれぞれを、ニューラルネットワーク３００に入力する。ニューラルネットワーク３００は、それぞれの第１ノイズ付加データ３１１－１～第ｎノイズ付加データ３１１－ｎに対応して、それぞれの第１推論結果データ３２１－１、第２推論結果データ３２１－２、第３推論結果データ３２１－３、…、第ｎ推論結果データ３２１－ｎを出力する。出力された複数のノイズ付加推論結果データとしての、第１推論結果データ３２１－１～第ｎ推論結果データ３２１－ｎは、記憶部１２に格納される。 The learning unit 111 reads the program of the neural network 122 from the storage unit 12 and executes the processing of the neural network 300. That is, the learning unit 111 inputs each of the first noise-added data 311-1 to the nth noise-added data 311-n to the neural network 300. The neural network 300 outputs the first inference result data 321-1, the second inference result data 321-2, the third inference result data 321-3, ..., the nth inference result data 321-n corresponding to each of the first noise-added data 311-1 to the nth noise-added data 311-n. The first inference result data 321-1 to the nth inference result data 321-n output as the multiple noise-added inference result data are stored in the storage unit 12.

続いて、制御部１１のシミュレータ部１１７によって実行されるシミュレータ３４０は、ニューラルネットワーク３００から出力された第１推論結果データ３２１－１～第ｎ推論結果データ３２１－ｎのそれぞれに対して、個別にシミュレートを実行する。シミュレータ３４０は、第１推論結果データ３２１－１に基づいてシミュレートを実行した結果として、第１評価値３３０－１を出力する。同様にシミュレータ部１１７は、第２推論結果データ３２１－２、第３推論結果データ３２１－３、…、および第ｎ推論結果データ３２１－ｎに基づいてシミュレートを実行した結果としてそれぞれ、第２評価値３３０－２、第３評価値３３０－３、…、および第ｎ評価値３３０－ｎを出力する。出力された第１評価値３３０－１～第ｎ評価値３３０－ｎは記憶部１２に格納される。 Then, the simulator 340 executed by the simulator unit 117 of the control unit 11 individually executes a simulation for each of the first inference result data 321-1 to the n-th inference result data 321-n output from the neural network 300. The simulator 340 outputs a first evaluation value 330-1 as a result of executing a simulation based on the first inference result data 321-1. Similarly, the simulator unit 117 outputs a second evaluation value 330-2, a third evaluation value 330-3, ..., and an n-th evaluation value 330-n as a result of executing a simulation based on the second inference result data 321-2, the third inference result data 321-3, ..., and the n-th inference result data 321-n, respectively. The output first evaluation value 330-1 to the n-th evaluation value 330-n are stored in the memory unit 12.

シミュレータ部１１７が出力した第１評価値３３０－１～第ｎ評価値３３０－ｎは、評価値を選択するセレクタ３５０の機能を有する制御部１１の判定部１１２に入力される。判定部１１２は、セレクタ３５０によって、最適な評価値を選択する。制御部１１の判定部１１２は、第１評価値３３０－１～第ｎ評価値３３０－ｎから選択した評価値に対応する推論結果データを選択して、出力する。以上により、入力データ３１０に対する、推論結果データ３２０が出力される。 The first evaluation value 330-1 through the nth evaluation value 330-n output by the simulator unit 117 are input to the judgment unit 112 of the control unit 11, which has the function of a selector 350 that selects an evaluation value. The judgment unit 112 selects the optimal evaluation value using the selector 350. The judgment unit 112 of the control unit 11 selects and outputs inference result data corresponding to the evaluation value selected from the first evaluation value 330-1 through the nth evaluation value 330-n. As a result of the above, inference result data 320 for the input data 310 is output.
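The overall flow of FIG. 2 (noise addition, inference, simulation, selection) might be sketched as below. The callables `toy_model` and `toy_simulator` are stand-ins invented for this illustration; in the embodiment they correspond to the neural network 300 and the simulator 340, whose actual interfaces are not given in the source. "Optimal" is assumed here to mean the highest evaluation value.

```python
import random


def infer_with_noise(input_data, model, simulate, n=8, sigma=0.05, seed=None):
    """Run the model on n noise-added copies and keep the best-scoring result."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n):
        noisy = [x + rng.gauss(0.0, sigma) for x in input_data]  # 311-i
        result = model(noisy)                                    # 321-i
        score = simulate(result)                                 # 330-i
        candidates.append((score, result))
    # Selector 350: pick the inference result with the highest evaluation value.
    best_score, best_result = max(candidates, key=lambda c: c[0])
    return best_result, best_score


def toy_model(x):
    """Hypothetical stand-in for the neural network 300."""
    return sum(x) / len(x)


def toy_simulator(result):
    """Hypothetical stand-in for the simulator 340: closer to 2.0 is better."""
    return -abs(result - 2.0)


best, score = infer_with_noise([1.9, 2.2, 2.05], toy_model, toy_simulator, n=8, seed=1)
```

Because every candidate is scored before selection, the selected result is never worse (under the simulator) than the first candidate alone.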

以上のように、強化学習などの、いわゆる制御方法や評価値を出力するニューラルネットワーク３００であって、シミュレータ部１１７によって事前にシミュレート可能である場合には、複数の推論結果データ（第１～第ｎ推論結果データ）から複数の評価値（第１～第ｎ評価値）を取得して、最適な評価値となった推論結果データを選択することにより、推論結果データの精度を向上させることができる。また、入力データに複数の異なるノイズを付加することによって、ニューラルネットワーク３００に入力するデータをノイズの付加数分だけ増加させることができるので、ノイズ付加による入力の水増しにより、方策を出力する強化学習などにおいては、シミュレータと組み合わせることによって、より評価値の高い制御の方策を得ることができる。 As described above, in the case of the neural network 300 that outputs so-called control methods and evaluation values such as reinforcement learning, and that can be simulated in advance by the simulator unit 117, the accuracy of the inference result data can be improved by obtaining multiple evaluation values (1st to nth evaluation values) from multiple inference result data (1st to nth inference result data) and selecting the inference result data with the optimal evaluation value. In addition, by adding multiple different noises to the input data, the data input to the neural network 300 can be increased by the amount of added noise, so that in reinforcement learning and the like that outputs a policy by padding the input by adding noise, a control policy with a higher evaluation value can be obtained by combining it with a simulator.

なお、評価値に対して所定の閾値を設定し、全ての第１評価値３３０－１～第ｎ評価値３３０－ｎが所定の閾値未満であった場合に、第１評価値３３０－１～第ｎ評価値３３０－ｎのうちの少なくとも１つ、または複数の評価値が所定の閾値以上になるまで、入力データ３１０に対してさらにノイズを付加するようにしても良い。以上説明した一実施形態による情報処理方法は、ニューラルネットワークを用いて情報処理を実行可能なあらゆる処理に適用可能である。 A predetermined threshold value may be set for the evaluation values, and if all of the first evaluation value 330-1 through the nth evaluation value 330-n are below the predetermined threshold value, further noise may be added to the input data 310 until at least one or more of the first evaluation value 330-1 through the nth evaluation value 330-n become equal to or greater than the predetermined threshold value. The information processing method according to the embodiment described above is applicable to any process in which information processing can be performed using a neural network.
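The threshold-based retry described above could look like the following sketch. The batch size and the cap `max_rounds` are assumptions added here so that the loop always terminates; the source does not specify either, and the toy model and simulator are hypothetical stand-ins.

```python
import random


def infer_until_threshold(input_data, model, simulate, threshold,
                          batch=4, max_rounds=10, sigma=0.05, seed=None):
    """Keep adding noise in batches until some evaluation value reaches threshold.

    Returns (best_result, best_score, number_of_candidates_tried).
    max_rounds is a hypothetical safety cap, not part of the source.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    rng = random.Random(seed)
    scored = []
    for _round in range(max_rounds):
        for _ in range(batch):
            noisy = [x + rng.gauss(0.0, sigma) for x in input_data]
            result = model(noisy)
            scored.append((simulate(result), result))
        best_score, best_result = max(scored, key=lambda c: c[0])
        if best_score >= threshold:
            break  # at least one evaluation value is at or above the threshold
    return best_result, best_score, len(scored)


def toy_model(x):
    """Hypothetical stand-in for the neural network 300."""
    return sum(x) / len(x)


def toy_simulator(result):
    """Hypothetical stand-in for the simulator 340 (always <= 0)."""
    return -abs(result - 2.0)


easy = infer_until_threshold([2.0, 2.1], toy_model, toy_simulator,
                             threshold=float("-inf"), seed=0)
hard = infer_until_threshold([2.0, 2.1], toy_model, toy_simulator,
                             threshold=1.0, seed=0)
```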

（配水池系の運転管理方法） (Operation management method for water reservoir system)
次に、以上の一実施形態による情報処理方法に基づいた、配水池の運転管理方法について説明する。 Next, a method for managing operation of a water reservoir based on the information processing method according to the above embodiment will be described.

まず、一実施形態による運転管理方法の理解を容易にするために、配水池系の運転管理方法における従来技術の問題点について説明する。すなわち、従来、浄水場、配水池３１，３２と、需要家３５，３６との間で液体である水道水などを輸送するための配水ポンプ３３，３４は、オンオフ制御しかできない場合が多い。この場合、配水ポンプ３３，３４に使用する電力量や、配水ポンプ３３，３４の発停回数を低減するための運転計画を生成することが困難であった。これは、連続値の解を許容する線形計画問題であれば、最適値を現実的な時間で計算できるのに対し、離散値（整数）の解のみを許容する整数計画問題や、離散値と連続値とが混在する混合整数計画問題では、膨大なパターンを計算する必要があるため、現実的な処理時間で演算が終了しないためである。換言すると、連続値ではなく一部または全ての変数の解として離散値のみ許容する最適化問題は、膨大なパターンを試行することによって最適解を導出する必要があるため、一般的には最適解にたどり着くことが困難であり、代替として最適解に近い解を許容される時間内で導出している。 First, in order to facilitate understanding of the operation management method according to one embodiment, the problems of the conventional technology in the operation management method of a water reservoir system will be described. That is, conventionally, the water distribution pumps 33, 34 for transporting liquid tap water between the water purification plant, the water distribution reservoirs 31, 32, and the consumers 35, 36 can often only be controlled to be on and off. In this case, it was difficult to generate an operation plan for reducing the amount of power used by the water distribution pumps 33, 34 and the number of times the water distribution pumps 33, 34 are started and stopped. This is because, while a linear programming problem that allows for a solution of continuous values can calculate the optimal value in a realistic time, an integer programming problem that allows for only a solution of discrete values (integers) and a mixed integer programming problem that mixes discrete and continuous values require the calculation of a huge number of patterns, and the calculation cannot be completed in a realistic processing time. 
In other words, optimization problems that allow only discrete values as solutions for some or all variables, rather than continuous values, require trying a huge number of patterns to derive the optimal solution, so it is generally difficult to arrive at the optimal solution, and as an alternative, a solution close to the optimal solution is derived within an acceptable time.

しかしながら、本発明者の知見によれば、離散値のみを許容する問題においては、膨大な場合分けが必要になり、対象ごとに特殊な事情が存在する場合には、対象ごとに場合分けをする枝刈り探索法を変更するのはかなり煩雑な作業となる。そのため、配水池系などにおいて、繁雑な処理を低減でき、最適な運転管理を行う技術が求められていた。 However, according to the inventor's knowledge, problems that allow only discrete values require a huge number of case distinctions, and when special circumstances exist for each target, changing the pruning search method that distinguishes between cases for each target becomes a very cumbersome task. For this reason, there has been a demand for technology that can reduce the amount of complicated processing and perform optimal operation management in water reservoir systems, etc.

そこで、本発明者は、運転計画の生成において、機械学習における例えばＤＱＮ（Deep Q-Network）などの強化学習を採用する方法について検討を行った。すなわち、操作そのものの評価ではなく、操作の結果の評価を用いた制御方法として、機械学習における強化学習を採用することを想到した。これにより、膨大な場合分けが不要になり、シミュレータを作成する際に、対象ごとの特殊事情を考慮すれば、制御に反映させることが可能になる。 The inventors therefore investigated a method of employing reinforcement learning in machine learning, such as DQN (Deep Q-Network), in generating operation plans. In other words, they came up with the idea of employing reinforcement learning in machine learning as a control method that uses evaluation of the results of operations, rather than evaluation of the operations themselves. This eliminates the need for a huge number of case distinctions, and when creating a simulator, if special circumstances for each target are taken into account, they can be reflected in the control.

さらに本発明者は、強化学習における報酬として、配水池３１，３２における制約条件である、水位の上限および下限、すなわち水位の許容範囲からの逸脱具合である、水位の許容範囲の上限または下限からの乖離の大きさに応じた報酬を想到した。具体的には、水位が、許容範囲から逸脱した乖離の大きさが大きくなるほど、報酬を小さくし、さらには報酬として負の絶対値が大きくなるようにした。これにより、配水池３１，３２の水位が、許容範囲内に収まれば報酬を大きくして、学習を進みやすくできる。これは、上述した制御方法において、制約条件から逸脱した時に一律の罰則を与えて、強化学習における学習の繰り返しの単位であるエピソードを停止した場合、逸脱した度合いが大きい場合も小さい場合も同じ罰則を与えることになる。そのため、制御方法を更新する際に制御方法をいずれの方向に更新すればよいかが明確にならない。そこで、制約条件からの逸脱の程度に応じて、例えば単調増加な関数を用いた罰則を与えることによって、制約条件から逸脱した場合に、逸脱した度合いが小さい方の報酬が逸脱した度合いが大きい方の報酬よりも大きくなるような勾配を生成できるため、勾配に従って強化学習を行うことによって、制約条件を満たす方策を学習できることになる。 Furthermore, the inventor has come up with the idea of a reward in reinforcement learning that corresponds to the magnitude of deviation from the upper or lower limit of the water level tolerance range, which is the constraint condition in the water reservoirs 31 and 32, that is, the deviation of the water level from the tolerance range. Specifically, the greater the deviation of the water level from the tolerance range, the smaller the reward is, and the larger the negative absolute value of the reward is. This makes it easier to progress in learning by increasing the reward if the water level of the water reservoirs 31 and 32 falls within the tolerance range. This is because, in the above-mentioned control method, if a uniform penalty is given when a constraint condition is deviated and an episode, which is a unit of repeated learning in reinforcement learning, is stopped, the same penalty is given whether the degree of deviation is large or small. Therefore, when updating the control method, it is not clear in which direction the control method should be updated. Therefore, by applying a penalty using, for example, a monotonically increasing function depending on the degree of deviation from the constraints, a gradient can be generated in which, when the constraints are deviated from, the reward for a smaller degree of deviation is greater than the reward for a larger degree of deviation. 
By performing reinforcement learning according to the gradient, it is possible to learn a policy that satisfies the constraints.
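One way to write such a graded reward is given here as an illustration. The symbols are assumptions introduced for this sketch and do not appear in the source: $h$ is the water level, $[h_{\min}, h_{\max}]$ the allowable range, $r_0 > 0$ the reward inside the range, and $k > 0$ a penalty slope.

```latex
d(h) = \max\bigl(0,\; h - h_{\max}\bigr) + \max\bigl(0,\; h_{\min} - h\bigr),
\qquad
r(h) =
\begin{cases}
  r_{0},      & d(h) = 0, \\
  -k\, d(h),  & d(h) > 0 .
\end{cases}
```

For two violations with $0 < d_1 < d_2$, this gives $r_1 = -k d_1 > -k d_2 = r_2$, so the smaller deviation is rewarded more than the larger one. A uniform penalty $r = -c$ assigns the same value to both and therefore provides no such ordering to guide the policy update.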

また、配水ポンプ３３，３４の発停ではなく、配水ポンプ３３，３４をオンまたはオフにするための水位の閾値を最適化することによって、離散値の最適化の問題から、連続値の最適化の問題に変更することで、現実的な処理時間で演算を終了可能とした。すなわち、単純に、設定時刻において配水ポンプ３３，３４の発停を計画する場合、配水ポンプ３３，３４のオンとオフとの２通りに対して、時間の刻み数と、配水ポンプ３３，３４の数との累乗で算出される数の操作から、いずれの操作を選ぶかが選択される。しかしながら、所定の発停状態にある配水ポンプ３３，３４に対して、次に配水ポンプ３３，３４の発停操作を行う配水池３１，３２の水位（以下、発停水位）を決定するように制御する。これにより、オンオフの離散値の最適化問題を、配水池３１，３２の水位の連続値の最適化問題に変更できる。そのため、ニューラルネットワーク３００による推論ごとに水位を決定すれば良いことになるので、従来に比して短時間で最適解に近い解を導出できる。さらに、配水池３１，３２のそれぞれにおける水位を所定の許容範囲内に設定していることにより、制約条件からの逸脱を低減できる。さらに、連続値から所定の制御を選択する場合、離散値から所定の制御を選択する場合に比して、推論ごとのばらつきが少なくなるため、安定した運転計画をたてることができる。 In addition, rather than optimizing the starting and stopping of the distribution pumps 33, 34 directly, the water level threshold at which the distribution pumps 33, 34 are turned on or off is optimized. This changes the problem from one of optimizing discrete values to one of optimizing continuous values, so that the calculation can be completed in a realistic processing time. That is, when the starting and stopping of the distribution pumps 33, 34 is simply planned for each set time, an operation must be chosen from a number of candidate operations obtained by raising 2 (the on and off states of a pump) to a power determined by the number of time increments and the number of distribution pumps 33, 34. In the present embodiment, by contrast, for the distribution pumps 33, 34 in a given start/stop state, control is performed so as to determine the water level of the distribution reservoirs 31, 32 at which the next start/stop operation of the distribution pumps 33, 34 is performed (hereinafter, the start/stop water level). This allows the optimization problem over the discrete on/off values to be changed into an optimization problem over the continuous water levels of the distribution reservoirs 31, 32. Therefore, since it is only necessary to determine the water level for each inference by the neural network 300, a solution close to the optimal solution can be derived in a shorter time than in the past.
Furthermore, by setting the water levels in each of the reservoirs 31 and 32 within a predetermined tolerance range, deviation from the constraints can be reduced. Furthermore, when a predetermined control is selected from continuous values, there is less variation in each inference compared to when a predetermined control is selected from discrete values, making it possible to create a stable operation plan.
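The start/stop water level idea can be illustrated with a small switching rule. This sketch assumes that a running pump raises the reservoir level and a stopped pump lets it fall with demand; the source does not state the flow direction. The fixed `stop_level` and `start_level` stand in for values that the neural network 300 would infer, and all numbers are hypothetical.

```python
def next_pump_state(pump_on, level, switch_level):
    """Toggle the pump when the level crosses the start/stop water level.

    Assumption: a running pump fills the reservoir, so an ON pump stops
    once the level rises to switch_level, and an OFF pump starts once
    the level falls to switch_level.
    """
    if pump_on and level >= switch_level:
        return False
    if not pump_on and level <= switch_level:
        return True
    return pump_on


def simulate_levels(level, steps, stop_level=4.0, start_level=2.0,
                    inflow=0.5, demand=0.3):
    """Step a single reservoir forward; returns [(level, pump_on), ...]."""
    pump_on = False
    history = []
    for _ in range(steps):
        # The continuous decision variable: the level at which to toggle next.
        switch_level = stop_level if pump_on else start_level
        pump_on = next_pump_state(pump_on, level, switch_level)
        level += (inflow if pump_on else 0.0) - demand
        history.append((round(level, 2), pump_on))
    return history


history = simulate_levels(3.0, steps=60)
```

For comparison, if the exponent in the naive on/off formulation is taken to be the product of time steps and pumps (an assumption), 24 hourly steps and 2 pumps already give 2 to the power 48, roughly 2.8 × 10^14 candidate schedules, whereas the rule above needs only one continuous value per inference.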

以下に説明する本実施形態による配水池の運転管理方法は、本発明者による以上の鋭意検討によって案出されたものである。図３は、本実施形態による配水池の運転管理方法を説明するためのフローチャートである。なお、運転管理装置２０は、ネットワーク２を通じて、常時または適時、配水池３１，３２から水位情報を取得しているとともに、配水ポンプ３３，３４から発停情報を取得し、取得した水位情報および発停情報を記憶部２２に格納している。また、学習装置１０と運転管理装置２０と配水池系３０との間における情報の送受信、供給、または取得は、ネットワーク２を介して行われるが、都度の説明は省略する。 The method for managing operation of a water reservoir according to this embodiment, which is described below, was devised by the inventor through the above-mentioned intensive study. FIG. 3 is a flowchart for explaining the method for managing operation of a water reservoir according to this embodiment. The operation management device 20 constantly or at appropriate times acquires water level information from the water reservoirs 31, 32 via the network 2, and acquires start/stop information from the water distribution pumps 33, 34, and stores the acquired water level information and start/stop information in the memory unit 22. The transmission, supply, or acquisition of information between the learning device 10, the operation management device 20, and the water reservoir system 30 is also performed via the network 2, but a detailed explanation of each is omitted.

図３に示すように、ステップＳＴ１において学習装置１０の制御部１１における状態入力部１１４は、配水池系３０に関する情報を貯留入力データとして取得する。入力データの一例としての貯留入力データは、例えば配水ポンプ３３，３４のオンオフなどの離散値を含む。具体的に、状態入力部１１４は、運転管理装置２０から、水位情報および発停情報を取得する。状態入力部１１４は、運転管理装置２０の制御部２１における運転計画部２１１によって、運転計画データベース２２１から、所定時間、例えば４８時間におけるそれぞれの需要家３５，３６が使用する水道水の量の予測値の情報（以下、需要予測情報）を取得する。需要予測情報は、換言すると、時刻に沿った、配水池３１，３２から排出される水道水の排出量の予測値である。状態入力部１１４は、適時、所定の電力供給事業者などの事業者サーバから、具体的に配水ポンプ３３，３４に電力を供給している電力供給事業者の事業者サーバ（図示せず）から、所定時刻における電力料金単価の情報（以下、料金情報）を取得する。状態入力部１１４は、常時または適時、所定の時刻サーバ（図示せず）から現在時刻の情報（以下、時刻）を取得する。なお、状態入力部１１４は、水位情報、発停情報、需要予測情報、料金情報、および時刻のうちの、全ての情報を取得しても、水位情報を含む一部の情報のみを取得しても良い。さらに、その他の情報を取得することも可能である。 As shown in FIG. 3, in step ST1, the state input unit 114 in the control unit 11 of the learning device 10 acquires information about the water reservoir system 30 as storage input data. The storage input data, which is an example of input data, includes discrete values such as the on/off status of the water distribution pumps 33, 34. Specifically, the state input unit 114 acquires water level information and start/stop information from the operation management device 20. The state input unit 114 acquires information on the predicted value of the amount of tap water used by each consumer 35, 36 in a predetermined time, for example, 48 hours (hereinafter, demand forecast information) from the operation plan database 221 by the operation plan unit 211 in the control unit 21 of the operation management device 20. The demand forecast information is, in other words, a predicted value of the amount of tap water discharged from the water reservoirs 31, 32 according to time. The status input unit 114 acquires information on the electricity rate unit price at a specified time (hereinafter, rate information) from a business server of a specified power supply business, specifically from the business server (not shown) of the power supply business supplying power to the water distribution pumps 33, 34, as appropriate. 
The state input unit 114 acquires current time information (hereinafter, time) from a specified time server (not shown) constantly or as appropriate. The state input unit 114 may acquire all of the water level information, start/stop information, demand forecast information, rate information, and time, or may acquire only a portion of this information that includes the water level information. It is also possible to acquire other information.

次に、ステップＳＴ２に移行して制御部１１の演算部１１６は、取得したデータに対して、複数の互いに異なるノイズを付加する。具体的には、図２に示すように、演算部１１６は、例えば取得した複数の配水池３１，３２における複数の水位情報のうちの、少なくとも１つの水位情報に対して、互いに異なる複数のノイズを付加する。本実施形態においては、これらのノイズは、正規分布、いわゆるガウス雑音に基づいて選択されるが、例えば一様分布などの他の分布に基づいたノイズ、他の分布に基づいて選択することも可能である。水位情報に対して互いに異なる複数のノイズを付加することにより、１つの水位情報に対して複数のノイズ付加データが、ノイズ付加水位情報として生成される。同様に、取得した複数の配水ポンプ３３，３４における複数の発停情報のうちの、少なくとも１つの発停情報に対して、互いに異なる複数のノイズを付加する。これにより、１つの発停情報に対して複数のノイズ付加データが、ノイズ付加発停情報として生成される。さらに、需要予測情報に対しても同様に、少なくとも１つの需要予測情報に対して、複数のノイズ付加需要予測情報が生成される。なお、演算部１１６による情報に対するノイズの付加は、水位情報、発停情報、および需要予測情報の全ての情報に対して行っても良く、これらの情報から選択した情報に対してのみ行っても良い。また、演算部１１６は、必要に応じて、料金情報や時刻に対しても、複数の互いに異なるノイズを付加して、複数のノイズ付加料金情報や、複数のノイズ付加時刻を生成しても良い。すなわち、状態入力部１１４が取得した種々の情報のうちの、少なくとも１つの情報に対してノイズを付加すれば良い。 Next, the process proceeds to step ST2, and the calculation unit 116 of the control unit 11 adds multiple different noises to the acquired data. Specifically, as shown in FIG. 2, the calculation unit 116 adds multiple different noises to at least one of the multiple water level information pieces in the multiple water level information pieces in the multiple water distribution reservoirs 31 and 32 acquired, for example. In this embodiment, these noises are selected based on normal distribution, or so-called Gaussian noise, but it is also possible to select noises based on other distributions, such as uniform distribution, or other distributions. By adding multiple different noises to the water level information, multiple noise-added data are generated for one piece of water level information as noise-added water level information. Similarly, multiple different noises are added to at least one piece of on/off information pieces in the multiple on/off information pieces in the multiple distribution pumps 33 and 34 acquired. As a result, multiple noise-added data are generated for one piece of on/off information as noise-added on/off information. 
Furthermore, multiple pieces of noise-added demand forecast information are similarly generated for at least one piece of demand forecast information. The calculation unit 116 may add noise to all of the water level information, start/stop information, and demand forecast information, or only to information selected from among these. The calculation unit 116 may also add multiple different noises to the rate information and the time as necessary to generate multiple pieces of noise-added rate information and multiple noise-added times. In other words, it is sufficient to add noise to at least one of the various pieces of information acquired by the state input unit 114.
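Adding noise only to selected fields of the acquired state might be sketched as follows. The field names, values, and per-field standard deviations are hypothetical; the source only says that at least one of the acquired pieces of information receives noise.

```python
import random


def add_noise_to_state(state, sigmas, n, seed=None):
    """Return n noisy variants of state, perturbing only the fields in sigmas.

    state:  dict mapping field name -> float
    sigmas: dict mapping field name -> Gaussian standard deviation
            (hypothetical per-field noise scales)
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        noisy = dict(state)  # fields not listed in sigmas are copied unchanged
        for field, sigma in sigmas.items():
            noisy[field] = state[field] + rng.gauss(0.0, sigma)
        variants.append(noisy)
    return variants


# Hypothetical storage input data for the reservoir system 30.
state = {
    "level_31": 3.2,           # water level information
    "level_32": 2.7,
    "pump_33_on": 1.0,         # start/stop information (1.0 = on)
    "demand_forecast": 120.0,  # demand forecast information
    "price": 14.5,             # rate information
}
variants = add_noise_to_state(
    state, {"level_31": 0.05, "demand_forecast": 5.0}, n=4, seed=0
)
```

Using a separate scale per field keeps the perturbation proportionate when fields have very different magnitudes, such as a water level and a demand forecast.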

次に、図３のステップＳＴ３に移行して、学習装置１０の制御部１１における学習部１１１は、記憶部１２からニューラルネットワーク１２２を読み出し、複数のノイズ付加データをニューラルネットワーク１２２に入力する。なお、学習部１１１は、強化学習による機械学習によって生成された学習モデル１２１を読み出して、複数のノイズ付加データを学習モデル１２１に入力しても良い。学習モデル１２１またはニューラルネットワーク１２２によって、制御部１１の学習部１１１は、入力した複数のノイズ付加データにそれぞれ対応した、複数の推論結果データ３２１を出力する（図２参照）。 Next, moving to step ST3 in FIG. 3, the learning unit 111 in the control unit 11 of the learning device 10 reads out the neural network 122 from the memory unit 12 and inputs the multiple noise-added data to the neural network 122. The learning unit 111 may also read out the learning model 121 generated by machine learning using reinforcement learning and input the multiple noise-added data to the learning model 121. Using the learning model 121 or the neural network 122, the learning unit 111 of the control unit 11 outputs multiple inference result data 321 that respectively correspond to the multiple noise-added data that have been input (see FIG. 2).

その後、ステップＳＴ４に移行して、複数の推論結果データ３２１がシミュレータ部１１７によって実行されるシミュレータ３４０（図２参照）に入力される。シミュレータ３４０は、複数の推論結果データ３２１に対応した複数の評価値３３０を出力する。複数の評価値３３０は、制御部１１の判定部１１２によって実行されるセレクタ３５０に入力される。セレクタ３５０は、複数の評価値３３０から、例えば、最も評価値が高い評価値３３０、または所定の評価値に最も近い評価値３３０などの所定条件に基づいて、少なくとも１つの評価値３３０を選択する。なお、複数の評価値３３０が所定条件を満たしていない場合には、ステップＳＴ２に復帰してノイズの付加を改めて行っても良い。続いて、制御部１１の行動選択部１１３は、複数の推論結果データ３２１から、セレクタ３５０が選択した評価値に対応した少なくとも１つの推論結果データ３２１を最終の推論結果データ３２０として選択する。ここで、最終の推論結果データ３２０は、それぞれの配水ポンプ３３，３４の発停の切り換えを行うタイミングであったり、発停の切り換えを行う配水池３１，３２の水位であったり、それぞれの配水ポンプ３３，３４の発停のパターンであったりする。 Then, the process proceeds to step ST4, where the multiple inference result data 321 are input to the simulator 340 (see FIG. 2) executed by the simulator unit 117. The simulator 340 outputs multiple evaluation values 330 corresponding to the multiple inference result data 321. The multiple evaluation values 330 are input to the selector 350 executed by the judgment unit 112 of the control unit 11. The selector 350 selects at least one evaluation value 330 from the multiple evaluation values 330 based on a predetermined condition, such as the evaluation value 330 with the highest evaluation value or the evaluation value 330 closest to a predetermined evaluation value. If the multiple evaluation values 330 do not satisfy the predetermined condition, the process may return to step ST2 and add noise again. Next, the action selection unit 113 of the control unit 11 selects at least one inference result data 321 corresponding to the evaluation value selected by the selector 350 from the multiple inference result data 321 as the final inference result data 320. Here, the final inference result data 320 may be the timing for switching on and off the respective distribution pumps 33, 34, the water levels of the distribution reservoirs 31, 32 at which the switching on and off is performed, or the on and off pattern of the respective distribution pumps 33, 34.
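The two selection conditions named above (highest evaluation value, or evaluation value closest to a predetermined value) can be sketched as a small selector. The function name, the `mode` strings, and the sample values are assumptions for illustration.

```python
def select_index(evaluation_values, mode="highest", target=None):
    """Return the index of the evaluation value chosen by the selector.

    mode="highest": the largest evaluation value.
    mode="closest": the evaluation value nearest to target.
    """
    if not evaluation_values:
        raise ValueError("no evaluation values to select from")
    indices = range(len(evaluation_values))
    if mode == "highest":
        return max(indices, key=lambda i: evaluation_values[i])
    if mode == "closest":
        if target is None:
            raise ValueError("mode='closest' requires a target value")
        return min(indices, key=lambda i: abs(evaluation_values[i] - target))
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical evaluation values 330 and their inference result data 321.
evaluation_values = [18.5, 22.0, 20.5]
inference_results = ["plan A", "plan B", "plan C"]

final = inference_results[select_index(evaluation_values)]
nearest = inference_results[select_index(evaluation_values, "closest", target=20.0)]
```

Returning an index rather than the value itself makes it straightforward for the action selection step to look up the corresponding inference result data.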

続いて、ステップＳＴ５に移行して学習装置１０の通信部１３は、制御部１１によって導出された最終の推論結果データ３２０を、貯留推論結果データとして運転管理装置２０に送信する。運転管理装置２０の運転制御部２１２は、取得した貯留推論結果データ（最終の推論結果データ３２０）に基づいて、配水池系３０の配水ポンプ３３，３４の運転、すなわち発停を制御する。配水池系３０における配水ポンプ３３，３４の発停によって得られるそれぞれの配水池３１，３２の水位は、水位情報として常時または適時、運転管理装置２０に送信される。同様に、配水ポンプ３３，３４の動作のオンオフも、発停情報として常時または適時、運転管理装置２０に送信される。運転管理装置２０が取得したそれぞれの配水池３１，３２の水位情報や、それぞれの配水ポンプ３３，３４の発停情報は、運転管理装置２０の記憶部２２における運転情報データベース２２２に格納される。 Next, proceeding to step ST5, the communication unit 13 of the learning device 10 transmits the final inference result data 320 derived by the control unit 11 to the operation management device 20 as storage inference result data. The operation control unit 212 of the operation management device 20 controls the operation, i.e., start/stop, of the distribution pumps 33, 34 of the distribution reservoir system 30 based on the acquired storage inference result data (final inference result data 320). The water levels of the distribution reservoirs 31, 32 obtained by starting and stopping the distribution pumps 33, 34 in the distribution reservoir system 30 are transmitted to the operation management device 20 as water level information at all times or as needed. Similarly, the on/off operation of the distribution pumps 33, 34 is also transmitted to the operation management device 20 as start/stop information at all times or as needed. The water level information of the distribution reservoirs 31, 32 and the start/stop information of the distribution pumps 33, 34 acquired by the operation management device 20 are stored in the operation information database 222 in the memory unit 22 of the operation management device 20.

次に、ステップＳＴ６に移行して、運転制御部２１２は、取得したそれぞれの配水池３１，３２における水位情報に基づいて、それぞれの配水池３１，３２の水位が許容範囲内であるか否かを判定する。なお、それぞれの配水池３１，３２の水位の許容範囲は、配水池３１，３２ごとにそれぞれ設定されている。運転制御部２１２は、配水ポンプ３３，３４の発停に応じたそれぞれの配水池３１，３２における水位が、許容範囲から逸脱したか否かを配水池３１，３２ごとに確認する。また、運転制御部２１２は、許容範囲から逸脱した場合には、許容範囲からどの程度逸脱したか、すなわち、それぞれの配水池３１，３２の水位が、許容範囲の上限よりどの程度高いか、許容範囲の下限よりどの程度低いかを、逸脱レベルとして導出する。同様に、運転制御部２１２は、配水ポンプ３３，３４の発停回数や使用電力料金が、許容範囲からどの程度逸脱したかに応じて、逸脱レベルを導出する。運転制御部２１２によって導出された逸脱レベルは、運転管理装置２０から、学習装置１０に送信される。 Next, proceeding to step ST6, the operation control unit 212 determines whether the water levels of the respective reservoirs 31, 32 are within the allowable range based on the acquired water level information of the respective reservoirs 31, 32. The allowable range of the water levels of the respective reservoirs 31, 32 is set for each reservoir 31, 32. The operation control unit 212 checks for each reservoir 31, 32 whether the water levels in the respective reservoirs 31, 32 according to the start and stop of the distribution pumps 33, 34 deviate from the allowable range. If the water levels deviate from the allowable range, the operation control unit 212 derives the deviation level, i.e., how much the water levels of the respective reservoirs 31, 32 are higher than the upper limit of the allowable range or how much lower the water levels are than the lower limit of the allowable range. Similarly, the operation control unit 212 derives the deviation level depending on how much the number of starts and stops of the distribution pumps 33, 34 and the electricity usage fee deviate from the allowable range. The deviation level derived by the operation control unit 212 is transmitted from the operation management device 20 to the learning device 10.
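The deviation level derived in step ST6 can be sketched as the distance by which a value lies outside its allowable range. The reservoir names, ranges, and levels below are hypothetical.

```python
def deviation_level(value, lower, upper):
    """Return how far value lies outside [lower, upper]; 0.0 if inside."""
    if value > upper:
        return value - upper   # above the upper limit
    if value < lower:
        return lower - value   # below the lower limit
    return 0.0


# Hypothetical allowable ranges, set separately for each reservoir.
allowable = {"reservoir_31": (2.0, 5.0), "reservoir_32": (1.5, 4.0)}
levels = {"reservoir_31": 5.4, "reservoir_32": 3.0}

deviations = {
    name: deviation_level(levels[name], *allowable[name]) for name in levels
}
```

The same function could be reused for the number of pump starts and stops or the electricity charge by passing 0 as the lower limit and the permitted maximum as the upper limit.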

学習装置１０の環境部１１５は、取得した逸脱レベルに基づいて、強化学習の報酬に関する罰則を導出する。具体的に環境部１１５は、取得した逸脱レベルが大きいほど、線形的に減点が大きくなるように、換言すると報酬が少なくなるように、報酬を設定する。また、環境部１１５は、配水池３１，３２の制約条件である水位の許容範囲を維持しつつ、運転を所定時間、例えば１時間継続できた場合に、正の報酬を設定する。さらに、環境部１１５は、配水池系３０の配水ポンプ３３，３４の発停回数や使用電力料金に応じて、逸脱レベルが大きくなるに従って罰則が大きくなるように、報酬を設定する。 The environment unit 115 of the learning device 10 derives a penalty related to the reward for reinforcement learning based on the acquired deviation level. Specifically, the environment unit 115 sets the reward so that the greater the acquired deviation level, the greater the linear deduction, in other words, the smaller the reward. The environment unit 115 also sets a positive reward when operation can be continued for a predetermined time, for example, one hour, while maintaining the allowable range of the water level, which is a constraint condition of the distributing reservoirs 31, 32. Furthermore, the environment unit 115 sets the reward so that the penalty increases as the deviation level increases, depending on the number of starts and stops of the distributing pumps 33, 34 of the distributing reservoir system 30 and the electricity usage fee.

次に、ステップＳＴ７に移行して、学習部１１１は、環境部１１５が設定した報酬を取得して、強化学習における報酬とする。これにより、学習部１１１の強化学習における機械学習が進み、配水池系３０の運転制御に関して、より適切な推論結果、すなわち配水ポンプ３３，３４の発停のパターンを最適化させることができる。 Next, the process proceeds to step ST7, where the learning unit 111 obtains the reward set by the environment unit 115 and sets it as the reward in the reinforcement learning. This advances the machine learning in the reinforcement learning of the learning unit 111, and more appropriate inference results can be obtained for the operational control of the water reservoir system 30, i.e., the start/stop pattern of the water distribution pumps 33, 34 can be optimized.

図４は、上述した入力データにノイズを付加した一実施形態による配水池系の運転管理方法による、運転時間に沿った制御継続時間を示すグラフである。図５は、比較例としての入力データにノイズを付加することなく、強化学習を用いた従来技術による配水池系の運転管理方法による、運転時間に沿った制御継続時間を示すグラフである。なお、制御継続時間とは、配水池３１，３２の水位を、水位の許容範囲内に維持した状態で運転を継続できた時間であり、図４および図５に示す例では、運転継続時間の上限を２４時間（１日）とした。 Figure 4 is a graph showing the control duration over operation time according to an embodiment of the operation management method for a water reservoir system in which noise is added to the input data described above. Figure 5 is a graph showing the control duration over operation time according to a conventional operation management method for a water reservoir system using reinforcement learning without adding noise to the input data as a comparative example. The control duration is the time during which operation can be continued while maintaining the water levels of the water reservoirs 31 and 32 within the allowable water level range, and in the examples shown in Figures 4 and 5, the upper limit of the operation duration is set to 24 hours (one day).

図４から、一実施形態による運転管理方法においては、配水池３１，３２の水位を所定の水位の許容範囲内に維持した状態で、１９００時間程度まで配水池系３０の運転を継続することができたことが分かる。図４に示す例では、約１９００時間の運転において、配水池３１，３２の制約条件、すなわち水位の許容範囲を維持した運転ができたことが確認された。これに対し、図５から、従来技術による運転管理方法においては、配水池３１，３２の水位が、所定の水位の許容範囲から逸脱していることが分かる。図５に示す例では、約１９００時間の運転において、１１０時間程度、水位の許容範囲を逸脱して、配水池３１，３２の制約条件を逸脱した運転になったことが確認された。 From FIG. 4, it can be seen that in the operation management method according to one embodiment, the operation of the reservoir system 30 could be continued for approximately 1,900 hours while maintaining the water levels of the reservoirs 31, 32 within a predetermined allowable range. In the example shown in FIG. 4, it was confirmed that operation was possible while maintaining the constraints of the reservoirs 31, 32, i.e., the allowable range of the water levels, during approximately 1,900 hours of operation. In contrast, from FIG. 5, it can be seen that in the operation management method according to the conventional technology, the water levels of the reservoirs 31, 32 deviate from the allowable range of the water levels. In the example shown in FIG. 5, it was confirmed that operation deviated from the allowable range of the water levels for approximately 110 hours during approximately 1,900 hours of operation, resulting in operation deviating from the constraints of the reservoirs 31, 32.

すなわち、従来技術においては、安定して運転できたのは、（（１９００－１１０）／１９００＝）９４．２％程度である。このように、強化学習によって生成された学習モデルや、教師あり学習または半教師あり学習などによって生成された学習モデルを、より精度良く構築したとしても、安定した運転を１００％継続することは極めて困難であるのに対し、入力データにノイズを付加する本実施形態による運転管理方法によれば、配水池系３０の運転を安定して行うことができる。なお、入力データにノイズを付加することによって、出力される推論結果の精度を向上させる対象としては、配水池系３０などに限定されず、貯留ピットおよび焼却炉などを備えた廃棄物処理施設や、橋梁などの建築構造物など、種々の対象に適用可能である。 In other words, with the conventional technology, stable operation was achieved for only about 94.2% of the time ((1900 - 110)/1900 ≈ 0.942). Thus, even if a learning model generated by reinforcement learning, or by supervised or semi-supervised learning, is constructed with higher accuracy, it is extremely difficult to maintain stable operation 100% of the time. In contrast, the operation management method according to the present embodiment, which adds noise to the input data, allows the reservoir system 30 to be operated stably. Note that the technique of improving the accuracy of the output inference results by adding noise to the input data is not limited to the reservoir system 30, and can be applied to various targets such as waste treatment facilities equipped with storage pits and incinerators, and structures such as bridges.

FIG. 6 is a graph showing the control score over the operation time for the reservoir system operation management method according to one embodiment. In this method, the selector 350 excludes inference results whose evaluation value is, for example, less than 20. FIG. 7 is a graph showing the control score over the operation time for a conventional reservoir system operation management method, as a comparative example. The control scores shown in FIGS. 6 and 7 are calculated from the operation continuity (whether operation could be continued within the allowable water level range, which is the constraint condition described above), the electricity cost, and the number of starts and stops of the water distribution pumps 33 and 34. For example, for operation continuity, 1 point is awarded when operation within the allowable water level range continues for one hour; when operation cannot be continued, points are deducted according to the magnitude of the deviation from the allowable water level range. Points are likewise deducted when the number of starts and stops exceeds a predetermined count or the electricity cost exceeds a predetermined amount, according to the excess count or amount. The deduction is preferably linear in the magnitude of the deviation of the water level from the allowable range, but it may instead be exponential, logarithmic, or polynomial. The control score is also referred to as a reward or an evaluation value.
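The scoring rule described above can be sketched as follows. This is a minimal illustration and not the scoring actually used for FIGS. 6 and 7: the function name `control_score`, the hour-by-hour accumulation, the weights, and the limits are all assumptions introduced here, and only the linear deduction variant is shown.

```python
def control_score(level_deviations, start_stops, power_cost, *,
                  max_start_stops, max_power_cost,
                  deviation_weight=1.0, start_stop_weight=1.0, cost_weight=1.0):
    """Hypothetical control score; every weight and limit is an assumed value.

    level_deviations holds one entry per hour: 0.0 when the water level
    stayed inside the allowable range, otherwise how far it left the range.
    """
    score = 0.0
    for deviation in level_deviations:
        if deviation == 0.0:
            score += 1.0  # one point for an hour inside the allowable range
        else:
            score -= deviation_weight * deviation  # linear deduction
    # Deduct only for the part that exceeds the predetermined limits.
    score -= start_stop_weight * max(0, start_stops - max_start_stops)
    score -= cost_weight * max(0.0, power_cost - max_power_cost)
    return score


if __name__ == "__main__":
    # 22 hours in range, two hours with deviations, limits exceeded slightly.
    deviations = [0.0] * 22 + [0.5, 2.0]
    print(control_score(deviations, start_stops=6, power_cost=130.0,
                        max_start_stops=4, max_power_cost=120.0,
                        cost_weight=0.1))
```

An exponential, logarithmic, or polynomial deduction would replace only the line marked as the linear deduction.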

FIG. 6 shows that the operation management method for the reservoir system 30 according to one embodiment can keep the control score within the range of 20 to 23 points. In contrast, FIG. 7 shows that, with the conventional management method for the reservoir system 30, the control score falls below 20, and even to around -10. In other words, the operation management method according to the embodiment described above can optimize the operation of the reservoir system 30 further than the conventional method.

(Modification)
Next, a modification of the information processing method according to the embodiment described above will be described. FIG. 8 is a diagram for explaining a method of deriving an inference result from input data according to the modification. As shown in FIG. 8, in the operation management method according to the modification, unlike the embodiment described above, after the neural network outputs the first inference result data 321-1 to the n-th inference result data 321-n, these data are input to the ensemble 360. The ensemble 360 is executed by the learning unit 111 based on an algorithm such as bagging, boosting, or stacking, and performs inference by combining the multiple inference result data 321 in a majority-vote manner. In estimation problems such as classification and regression, ensemble learning over multiple outputs can improve accuracy. In the modification, multiple inference result data 321-1 to 321-n are obtained from the corresponding noise-added input data 311-1 to 311-n, and ensemble learning then yields a more accurate inference result as the final inference result data 320. The rest of the information processing method is the same as in the embodiment described above. Note that a learning-based object recognition algorithm other than ensemble learning can also be used.
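As a rough illustration of combining several inference results into one, the sketch below uses a plain majority vote for class labels and a plain mean for regression outputs. These are simplifications chosen here for brevity; they are not bagging, boosting, or stacking, and the function names `majority_vote` and `mean_output` are hypothetical.

```python
from collections import Counter
from statistics import fmean


def majority_vote(labels):
    """Return the most frequent label among the inference results.

    Ties are resolved in favor of the label that appeared first.
    """
    return Counter(labels).most_common(1)[0][0]


def mean_output(values):
    """Return the arithmetic mean of numeric inference results."""
    return fmean(values)


if __name__ == "__main__":
    # Stand-ins for inference result data 321-1 to 321-n.
    print(majority_vote(["on", "off", "on"]))
    print(mean_output([1.0, 2.0, 3.0]))
```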

According to the embodiment described above, a plurality of mutually different noises are added to input data to generate a plurality of noise-added data, the noise-added data are input to a neural network to obtain a plurality of inference result data, one for each noise-added data, and final inference result data is output based on the obtained inference result data. This makes it possible to further improve the accuracy of the inference result data obtained by inputting input data into the neural network.
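The flow summarized above (add different noises to one input, infer on each noisy copy, evaluate, and select) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `toy_model` stands in for the neural network, `toy_evaluate` stands in for the simulator that produces evaluation values, and the function names, noise scale `sigma`, and threshold `min_score` are hypothetical. The noise is drawn from a normal distribution, as in claim 2.

```python
import random


def add_noise(input_data, sigma, rng):
    """Return a copy of the input with independent Gaussian noise per element."""
    return [x + rng.gauss(0.0, sigma) for x in input_data]


def select_inference(input_data, model, evaluate, n=8, sigma=0.05,
                     min_score=None, seed=0):
    """Run the model on n noisy copies and return the best (score, result).

    Results scoring below min_score are discarded, mirroring a selector
    threshold; None is returned when no result survives.
    """
    rng = random.Random(seed)
    candidates = []
    for _ in range(n):
        noisy = add_noise(input_data, sigma, rng)
        result = model(noisy)
        candidates.append((evaluate(result), result))
    kept = [c for c in candidates if min_score is None or c[0] >= min_score]
    if not kept:
        return None
    return max(kept, key=lambda c: c[0])


def toy_model(xs):
    """Assumed stand-in for the neural network: the mean of the inputs."""
    return sum(xs) / len(xs)


def toy_evaluate(result):
    """Assumed stand-in for the simulator: higher is better, best at 1.0."""
    return -abs(result - 1.0)


if __name__ == "__main__":
    print(select_inference([1.0, 1.0], toy_model, toy_evaluate, n=5))
```

Fixing the seed makes the sketch reproducible; a real system would more likely draw fresh noise on every call.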

(Recording medium)
In the embodiment described above, a program that causes execution of the processing methods performed by the learning device 10 and the operation management device 20 can be recorded on a recording medium readable by a computer or by another machine or device such as a wearable device (hereinafter, "a computer or the like"). By having a computer or the like read and execute the program on the recording medium, the computer or the like functions as a mobile body control device. Here, a recording medium readable by a computer or the like is a non-transitory recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and from which a computer or the like can read that information. Examples of such recording media that are removable from a computer or the like include a flexible disk, a magneto-optical disk, a CD-ROM, a CD-R/W, a DVD, a BD, a DAT, a magnetic tape, and a memory card such as a flash memory. Examples of recording media fixed to a computer or the like include a hard disk and a ROM. An SSD can be used either as a recording medium removable from a computer or the like or as a recording medium fixed to it.

In addition, the programs to be executed by the learning device 10 and the operation management device 20 according to one embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network.

(Other embodiments)
In one embodiment, the term "unit" used above can be read as "circuit" or the like. For example, the control unit can be read as a control circuit.

Note that, in the explanation of the flowcharts in this specification, the order of processing between steps is indicated using expressions such as "first," "next," "then," and "subsequently." However, the order of processing required to implement the present embodiment is not uniquely determined by these expressions. In other words, the order of processing in the flowcharts described in this specification can be changed as long as no contradiction arises.

Further advantages and modifications can be readily derived by those skilled in the art. The broader aspects of the present disclosure are not limited to the specific details and representative embodiments shown and described above. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.

1 Operation management system
2 Network
10 Learning device
11, 21 Control unit
12, 22 Memory unit
13, 23 Communication unit
14, 24 Output unit
15, 25 Input unit
20 Operation management device
30 Reservoir system
31, 32 Reservoir
33, 34 Water distribution pump
35, 36 Consumer
111 Learning unit
112 Judgment unit
113 Action selection unit
114 State input unit
115 Environment unit
116 Calculation unit
117 Simulator unit
121 Learning model
122, 300 Neural network
211 Operation plan unit
212 Operation control unit
221 Operation plan database
222 Operation information database
301 Input layer
302 Intermediate layer
303 Output layer
310 Input data
311-1 First noise-added data
311-2 Second noise-added data
311-3 Third noise-added data
311-n n-th noise-added data
320, 321 Inference result data
321-1 First inference result data
321-2 Second inference result data
321-3 Third inference result data
321-n n-th inference result data
330 Evaluation value
330-1 First evaluation value
330-2 Second evaluation value
330-3 Third evaluation value
330-n n-th evaluation value
340 Simulator
350 Selector
360 Ensemble

Claims

1. An information processing device comprising a control unit that acquires at least one input data and outputs inference result data using a neural network based on the input data,
wherein the control unit:
stores the acquired input data in a storage unit;
generates a plurality of noise-added data by adding a plurality of mutually different noises to a single input data read from the storage unit;
inputs the plurality of noise-added data into the neural network and outputs a plurality of noise-added inference result data corresponding to the plurality of noise-added data; and
outputs the inference result data based on the plurality of noise-added inference result data.

2. The information processing device according to claim 1, wherein the plurality of mutually different noises are noises selected based on a normal distribution.

3. The information processing device according to claim 1, wherein the control unit:
performs a simulation for each of the plurality of noise-added inference result data and outputs a plurality of evaluation values corresponding to the respective noise-added inference result data; and
outputs, as the inference result data, at least one noise-added inference result data selected from the plurality of noise-added inference result data based on the plurality of evaluation values.

4. The information processing device according to claim 1, wherein the control unit:
performs ensemble learning on the plurality of noise-added inference result data; and
outputs, as the inference result data, an output obtained by the ensemble learning.

5. The information processing device according to any one of claims 1 to 4, wherein the input data is data including discrete values.

6. An operation management device for a storage facility that includes at least one storage tank for storing a liquid and at least one liquid delivery unit capable of supplying the liquid to the storage tank, the operation management device comprising an operation control unit that controls the liquid delivery unit to manage a liquid level in the storage tank,
wherein the operation control unit:
acquires, from the storage facility, liquid level information including the liquid level in the storage tank;
inputs information including the acquired liquid level information, as storage input data, into the information processing device according to any one of claims 1 to 5, and acquires, as storage inference result data, the inference result data output from the information processing device in response to the storage input data; and
controls the liquid delivery unit based on the storage inference result data.

7. The operation management device according to claim 6, wherein the storage inference result data includes information on a liquid level at which the liquid delivery unit is switched on or off.

8. The operation management device according to claim 6 or 7, wherein the storage input data includes information on a predicted value, over time, of the amount of the liquid to be discharged from the storage tank at a predetermined time.

9. The operation management device according to any one of claims 6 to 8, wherein the liquid delivery unit comprises a pump whose on/off switching is controlled, and
the storage input data includes information on starting and stopping of the pump at a predetermined time.

10. The operation management device according to any one of claims 6 to 9, wherein the liquid delivery unit comprises a pump driven by electricity, and
the storage input data includes information on the unit price of electricity and the power consumption at a predetermined time.

11. An information processing method executed by an information processing device that acquires at least one input data and outputs inference result data using a neural network based on the input data, the method comprising:
storing the acquired input data in a storage unit;
generating a plurality of noise-added data by adding a plurality of mutually different noises to a single input data read from the storage unit;
inputting the plurality of noise-added data into the neural network and outputting a plurality of noise-added inference result data corresponding to the plurality of noise-added data; and
outputting the inference result data based on the plurality of noise-added inference result data.

12. An operation management method for a storage facility that includes at least one storage tank for storing a liquid and at least one liquid delivery unit capable of supplying the liquid to the storage tank, the operation management method being performed by an operation management device that controls the liquid delivery unit to manage a liquid level in the storage tank, the method comprising:
acquiring, from the storage facility, liquid level information including the liquid level in the storage tank;
using information including the acquired liquid level information as storage input data, and acquiring storage inference result data obtained for the storage input data by the information processing method according to claim 11; and
controlling the liquid delivery unit based on the storage inference result data.