JP6952018B2

JP6952018B2 - Control device and control method

Info

Publication number: JP6952018B2
Application number: JP2018187912A
Authority: JP
Inventors: 服部 哲; 高田 敬規; 田内 佑樹
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2018-10-03
Filing date: 2018-10-03
Publication date: 2021-10-20
Anticipated expiration: 2038-10-03
Also published as: DE102019214640A1; CN110976523B; CN110976523A; JP2020057238A

Description

The present invention relates to a technique for performing real-time feedback control using artificial intelligence such as a neural network.

Conventionally, plants of various kinds have been controlled on the basis of various control theories in order to obtain desired control results.

In rolling mill control, as one example of a plant, fuzzy control and neuro-fuzzy control have been applied as control theories for shape control, which controls the waviness of the plate. Fuzzy control has been applied to shape control using coolant, and neuro-fuzzy control has been applied to shape control of Sendzimir rolling mills. As shown in Patent Document 1, shape control using neuro-fuzzy control first obtains the degree of similarity between the difference between the actual shape pattern detected by a shape detector and the target shape pattern, on the one hand, and preset reference shape patterns, on the other. It then obtains the control output amount for each control operation end from that degree of similarity, using control rules, also preset, that are expressed as control operation end operation amounts for each reference shape pattern. In the following, shape control of a Sendzimir rolling mill using neuro-fuzzy control is taken as the conventional technique.

FIG. 1 shows the shape control of the Sendzimir rolling mill described in FIG. 1 of Patent Document 1. Neuro-fuzzy control is used for the shape control of the Sendzimir rolling mill. In this example, the pattern recognition mechanism 51 performs pattern recognition on the actual shape detected by the shape detector 52 and calculates which of the preset reference shape patterns the actual shape is closest to. The control calculation mechanism 53 performs control using control rules, such as those shown in FIG. 2, each consisting of control operation end operation amounts for a preset shape pattern. More specifically, with reference to FIG. 2, the pattern recognition mechanism 51 calculates which of shape patterns 1 to 8 (ε) is closest to the difference (Δε) between the actual shape detected by the shape detector 52 and the target shape (εref), and the control calculation mechanism 53 selects and executes one of control methods 1 to 8.
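The pattern-matching step of this conventional scheme can be sketched as follows. This is only an illustration under stated assumptions, not the method of Patent Document 1: the pattern values, the squared-distance similarity measure, and the names `REFERENCE_PATTERNS`, `shape_deviation`, and `closest_pattern` are all hypothetical, and only three of the eight patterns are shown.

```python
# Hypothetical reference shape patterns sampled at five points across the
# plate width. Values are illustrative only; the real scheme uses eight patterns.
REFERENCE_PATTERNS = {
    1: [1.0, 0.0, -1.0, 0.0, 1.0],
    2: [-1.0, 0.0, 1.0, 0.0, -1.0],
    3: [1.0, 0.5, 0.0, -0.5, -1.0],
}


def shape_deviation(actual, target):
    """Return the deviation (actual shape - target shape), point by point."""
    return [a - t for a, t in zip(actual, target)]


def closest_pattern(deviation, patterns=REFERENCE_PATTERNS):
    """Return the number of the reference pattern nearest to the deviation.

    Squared Euclidean distance is an assumed similarity measure.
    """
    def distance(pattern):
        return sum((d - p) ** 2 for d, p in zip(deviation, pattern))

    return min(patterns, key=lambda number: distance(patterns[number]))


actual = [0.9, 0.1, -0.8, 0.0, 1.1]
target = [0.0, 0.0, 0.0, 0.0, 0.0]
selected = closest_pattern(shape_deviation(actual, target))
print(selected)  # number of the control method to execute
```

Because the pattern set is fixed, a deviation that resembles none of the patterns is still mapped to whichever one happens to be nearest.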

However, in the method of Patent Document 1, an operator is sometimes asked to operate the mill manually during rolling in order to verify the control rules, and the shape sometimes changes contrary to expectation. In other words, the control rules determined as described above sometimes do not match reality. The causes are insufficient study of the mechanical characteristics and changes in the operating state or mechanical condition of the rolling mill, but verifying one by one whether each preset control rule is the best rule is difficult because there are many conditions to consider. As a result, once control rules have been set, they are often left unchanged as long as no problem arises.

When the control rules no longer match reality because of changes in operating conditions or the like, it becomes difficult to achieve more than a certain level of control accuracy because the control rules are fixed. In addition, once shape control is running, the operator stops operating manually (manual operation would act as a disturbance to the control), so it is also difficult to find new control rules through manual intervention by the operator. Furthermore, when rolling material of a new specification, it is difficult to set control rules suited to that material.

As described above, conventional shape control uses preset control rules, so it has the problem that the control rules are difficult to modify.

To solve this problem, Patent Document 2 randomly changes the control rules while performing shape control and learns the rules that improve the shape, thereby achieving the following:
1) New control rules are discovered while shape control is being carried out during rolling.
2) Because new control rules cannot be foreseen, and a control rule that could not have been predicted at all may turn out to be optimal, the control operation ends are operated at random and new rules are found by observing the resulting control results.

Patent Document 1: Japanese Patent No. 2804161
Patent Document 2: Japanese Patent No. 4003733

In the above prior art, representative shapes are set in advance as reference shape patterns, and control is performed on the basis of control rules that relate each reference shape pattern to control operation end operation amounts. The learning of control rules likewise concerns only the control operation end operation amounts for the reference shape patterns; the predetermined representative reference shape patterns themselves are used unchanged. This causes the problem that the shape control responds only to specific shape patterns.

The reference shape patterns are determined in advance by people on the basis of their knowledge of the target rolling mill and of experience accumulated from actual shapes and manual intervention operations, but it is difficult to cover every shape that can occur in the target rolling mill and rolled material. Consequently, when a shape that differs from the reference shape patterns occurs, shape control may not act and the shape deviation may remain unsuppressed, or the shape may be misrecognized as a similar reference shape pattern and an erroneous control operation may make the shape worse.

Therefore, conventional shape control learns control rules and performs control using preset reference shape patterns and the control rules for them, so there is a limit to how far control accuracy can be improved.

To solve this, one conceivable approach is to use a plant control device that recognizes patterns in combinations of the actual data of a controlled plant and controls that plant, comprising a control method learning device that learns combinations of actual data and control operations of the controlled plant, and a control execution device that controls the controlled plant according to the learned combinations. The control execution device comprises: a control rule execution unit that gives a control output according to a defined combination of actual data and control operation of the controlled plant; a control output determination unit that determines whether the control output from the control rule execution unit may be output and notifies the control method learning device that the actual data and control operation in question are erroneous; and a control output suppression unit that blocks the control output from being sent to the controlled plant when the control output determination unit judges that sending it would worsen the actual data of the controlled plant. The control method learning device comprises: a control result quality determination unit that, when the control execution device has actually sent the control output to the controlled plant, determines, after the time delay until the control effect appears in the actual data, whether the actual data became better or worse than before the control; a learning data creation unit that obtains teacher data using the quality of the control result from the control result quality determination unit and the control output; and a control rule learning unit that learns the actual data and the teacher data as learning data. Through learning by the control method learning device, separate combinations of actual data and control operation are obtained for a plurality of control targets according to the state of the controlled plant, and the obtained combinations are used as the defined combinations of actual data and control operation in the control rule execution unit.

In that case, it is very important that the evaluation function used to judge the quality of the control result is appropriate. However, the designer of the control device determines the evaluation function subjectively, by interviewing the operation engineers and operators of the controlled plant and by observing the actual operation of the plant, so it is often unclear whether the function is truly set appropriately.

Consider shape control of a rolling mill as an example. Ideally, the actual shape matches the target shape over the entire plate width. In reality, however, this is often not achieved. In actual operation, therefore, it is common to give priority to a specific region of the plate and to control the actual shape so that it matches the target shape in that region. The evaluation function used to evaluate the plate shape applies a weight for each position in the plate width direction to the shape deviation (= actual shape - target shape) at that position.
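A width-weighted evaluation function of this kind might look like the sketch below. The quadratic form, the five sampling positions, and the two weight sets are assumptions made for illustration; the text states only that the deviation at each position in the plate width direction is weighted.

```python
def weighted_shape_cost(actual, target, weights):
    """Weighted sum of squared shape deviations across the plate width.

    Lower is better. The quadratic form is an assumption; the source only
    says that the deviation at each position is weighted.
    """
    if not (len(actual) == len(target) == len(weights)):
        raise ValueError("actual, target and weights must have equal length")
    return sum(w * (a - t) ** 2 for a, t, w in zip(actual, target, weights))


# Hypothetical weightings over five positions (edge, quarter, centre, quarter, edge).
EDGE_PRIORITY = [2.0, 1.0, 0.5, 1.0, 2.0]
CENTRE_PRIORITY = [0.5, 1.0, 2.0, 1.0, 0.5]

target = [0.0] * 5
edge_wavy = [1.0, 0.0, 0.0, 0.0, 1.0]  # deviation only at the plate edges

print(weighted_shape_cost(edge_wavy, target, EDGE_PRIORITY))    # 4.0
print(weighted_shape_cost(edge_wavy, target, CENTRE_PRIORITY))  # 1.0
```

The same measured shape scores four times worse under the edge-priority weights than under the centre-priority weights, which is why the choice of weights determines what the controller treats as a good result.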

In a rolling mill, the control operation ends that act on the shape of the ends in the plate width direction (plate edges) are separate from those that act on the remaining portion (central portion). However, they often influence each other. In addition, because the plate edges are not constrained from both sides as the central portion is, the shape deviation there often becomes large. When control is applied to the plate edges, its effect may extend to the central portion and worsen the shape there, and the reverse may also occur. It is therefore difficult to control the shapes of the plate edges and the central portion so that both match the target values at the same time. In many cases the operator performs manual control giving priority to either the plate edges or the central portion.

If the evaluation function applied in judging the quality of the control result evaluates differently from the operator's thinking, the operator will cancel the operations made by the control device's shape control and operate manually according to his or her own thinking. The shape control by the control device and the manual operation by the operator then conflict. As a result, the operator may turn OFF the shape control of the control device because it interferes with manual operation. If this happens repeatedly, there is a concern that the operator will stop turning ON the shape control at all.

If the evaluation function applied in judging the quality of the control result is made to evaluate in agreement with the operator's thinking, the conflict between control by the control device and manual operation by the operator is reduced. Beyond that, the operator can be expected to operate manually less often, which reduces the operator's workload and improves the accuracy of shape control.

An object of the present invention is to provide a technique that enables control based on appropriate judgment of the quality of control results.

A control device according to the present disclosure is a control device that controls a controlled object, and includes: a control execution device that gives a control output to the controlled object according to a given control rule; a control method learning device that evaluates the control output given to the controlled object using a designated evaluation function, creates learning data using the evaluation result, constructs the control rule by learning the learning data, and gives the control rule to the control execution device; and an evaluation function setting unit that holds a plurality of evaluation functions in advance, selects one of the plurality of evaluation functions based on the control state of the controlled object, and designates the selected evaluation function to the control method learning device.

According to the present disclosure, control based on appropriate judgment of the quality of control results can be expected to become possible.

A diagram showing the shape control of the Sendzimir rolling mill described in FIG. 1 of Patent Document 1.
A diagram showing control rules consisting of control operation end operation amounts for shape patterns.
A diagram showing an outline of the plant control device according to the embodiment.
A diagram showing a specific example of the control rule execution unit 10 according to the embodiment.
A diagram showing a specific example of the control rule learning unit 11 according to the embodiment.
A block diagram showing the internal configuration of the evaluation function setting unit 17.
A diagram showing the configuration of the neural network when used for shape control of a Sendzimir rolling mill.
A diagram for explaining shape deviation and control methods.
A diagram showing an outline of the control input data creation unit 2.
A diagram showing an outline of the control output calculation unit 3.
A diagram showing an example of the transition of the rolling speed of the rolling mill.
A diagram showing an example of the evaluation function database DB5.
A diagram for explaining the operation outline of the evaluation function selection method learning unit 173.
A diagram for explaining the operation outline of the evaluation function learning unit 174.
A diagram showing the schematic configuration of the evaluation function learning unit 174.
A diagram for explaining the outline of the control output determination unit 5.
A diagram for explaining the operation outline of the control quality determination unit 6.
A diagram for explaining the operation outline of the learning data creation unit 7.
A diagram showing the processing stages and processing contents in the learning data creation unit 7.
A diagram showing an example of data stored in the learning data database DB2.
A diagram showing an example of the neural network management table TB.
A diagram showing an example of the learning data database DB2.

First, the findings underlying the present invention and the background leading to it are described, taking shape control of a rolling mill as an example.

To solve the above problems, the following three things are required.

(1) Instead of setting the reference shape patterns and the control operations for them separately in advance and then learning the control operation method, learn combinations of shape pattern and control operation and perform control operations using them.

(2) Because new control rules cannot be foreseen, and a control rule that could not have been predicted at all may turn out to be optimal, operate the control operation ends at random and find new rules by observing the resulting control results.

(3) With regard to judging the quality of the control result, select an evaluation function according to the state of the rolling mill so that suitable control rules can be selected.

To realize these three points, it is preferable to change the control operations so that the control result improves while varying the combinations of shape pattern and control operation used for shape control. To that end, combinations of a shape pattern and a control operation suited to that shape pattern are learned by artificial intelligence such as a neural network, and the artificial intelligence is used to change the control operation output for shape patterns that occur in the rolling mill.

If control operations are changed while shape control is being performed on a rolling mill in operation, an erroneous control output may be issued, the plate shape may deteriorate, and an operational abnormality such as plate breakage may occur. Plate breakage causes serious damage: replacing the rolls used in the rolling mill takes time, and the material being rolled is wasted. It is therefore necessary to avoid sending erroneous control outputs to the rolling mill as far as possible. For this reason, it is preferable to change the evaluation function used to judge the quality of the shape according to the rolling state.

The rolling state is the state, with respect to rolling, in which the rolling mill to be controlled is placed. If the controlled object is not limited to a rolling mill, the rolling state can be generalized and called the control state. The rolling state can be determined from various parameters, such as the control operations applied to the rolling mill, the state of the rolling mill, and the state of rolling by the rolling mill. In the present embodiment, as an example, the rolling state is determined from the rolling speed.
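Selecting an evaluation function from the rolling speed could be sketched as below. The speed bands, their boundaries, the unit, and the weight values are all hypothetical; the text says only that the rolling state is judged from the rolling speed and that the evaluation function is changed according to that state.

```python
from bisect import bisect_right

# Hypothetical evaluation-function table keyed by rolling-speed band.
# Each entry is a set of width-direction weights; values are illustrative only.
SPEED_LIMITS = [100.0, 400.0]      # band boundaries in m/min (assumed)
WEIGHT_SETS = [
    [2.0, 1.0, 0.5, 1.0, 2.0],     # band 0: speed < 100
    [1.0, 1.0, 1.0, 1.0, 1.0],     # band 1: 100 <= speed < 400
    [0.5, 1.0, 2.0, 1.0, 0.5],     # band 2: speed >= 400
]


def select_weights(rolling_speed):
    """Pick the weight set for the band that contains rolling_speed."""
    return WEIGHT_SETS[bisect_right(SPEED_LIMITS, rolling_speed)]


def evaluate(deviation, rolling_speed):
    """Evaluate a shape deviation with the function chosen for the current speed."""
    weights = select_weights(rolling_speed)
    return sum(w * d ** 2 for w, d in zip(weights, deviation))
```

A lookup table keeps the selection logic separate from the evaluation itself, so further bands or other state parameters could be added without touching `evaluate`.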

Accordingly, in the present embodiment, the quality of each control operation output by the neural network is verified using, for example, a simplified model of the rolling mill, and any output that is expected to clearly worsen the shape is not sent to the control operation ends of the rolling mill, which prevents shape deterioration. In that case, the neural network is trained on the basis that the control operation for that shape pattern was wrong.

Because the method of verifying the quality of control operations may itself be wrong, control operation outputs of the neural network that have been judged to be wrong are also output to the control operation ends of the rolling mill with a certain probability. This makes it possible to learn unexpected combinations of shape pattern and control operation as well.
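The combination of model-based checking and occasional release of rejected outputs can be sketched as follows. The linear stand-in for the simplified mill model, the `gain` value, the default `explore_probability`, and the function names are assumptions for illustration only; the source does not specify the model or the probability.

```python
import random


def predicted_cost(shape, control_output, gain=0.5):
    """Crude stand-in for a simplified mill model (assumption).

    Assumes each control output element shifts the shape deviation at the
    same position by gain * output, and returns the sum of squared deviations.
    """
    return sum((s + gain * u) ** 2 for s, u in zip(shape, control_output))


def decide_output(shape, control_output, explore_probability=0.05, rng=random):
    """Return (send, judged_bad).

    judged_bad is True when the model predicts a worse shape than doing nothing.
    A judged-bad output is still sent with probability explore_probability,
    because the check itself may be wrong.
    """
    do_nothing = [0.0] * len(control_output)
    judged_bad = predicted_cost(shape, control_output) > predicted_cost(shape, do_nothing)
    send = (not judged_bad) or (rng.random() < explore_probability)
    return send, judged_bad


print(decide_output([1.0, -1.0], [-2.0, 2.0]))  # output predicted to help
```

The `judged_bad` flag is returned separately from `send` so that a learning stage can still treat a blocked output as an erroneous combination, as the text describes.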

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 3 shows an outline of the plant control device according to the embodiment. The plant control device of FIG. 3 comprises: the controlled plant 1; a control execution device 20, which receives actual data Si from the controlled plant 1 and controls the plant by giving it a control operation amount output SO determined according to control rules (a neural network) such as those illustrated in FIG. 2; a control method learning device 21, which receives the actual data Si and other inputs from the controlled plant 1, performs learning, and reflects the learned control rules in the control rules of the control execution device 20; a plurality of databases DB (DB1 to DB3); and a management table TB for the databases DB.

The control execution device 20 comprises, as its main elements, a control input data creation unit 2, a control rule execution unit 10, a control output calculation unit 3, a control output suppression unit 4, a control output determination unit 5, and a control operation disturbance generation unit 16.

In the control execution device 20, the control input data creation unit 2 first creates the input data S1 for the control rule execution unit 10 from the actual data Si of the rolling mill, which is the controlled plant 1. The control rule execution unit 10 creates the control operation end operation command S2 from the actual data Si of the controlled object, using a neural network (control rule) that expresses the relationship between the actual data Si and the control operation end operation command S2. The control output calculation unit 3 calculates the control operation amount S3 for the control operation ends on the basis of the control operation end operation command S2. In this way, the control operation amount S3 is created using the neural network according to the actual data Si of the controlled plant 1.

In addition, the control output determination unit 5 in the control execution device 20 determines the control operation amount output availability data S4, which indicates whether the control operation amount may be output to the control operation ends, using the actual data Si from the controlled plant 1 and the control operation amount S3 from the control output calculation unit 3. The control output suppression unit 4 decides, according to the control operation amount output availability data S4, whether the control operation amount S3 may be output to the control operation ends, and outputs a permitted control operation amount S3 as the control operation amount output SO given to the controlled plant 1. As a result, a control operation amount S3 judged to be abnormal is not output to the controlled plant 1. The control operation disturbance generation unit 16 generates a disturbance and applies it to the controlled plant 1 for the purpose of verifying the plant control device.

To execute its processing, the control execution device 20 configured as described above refers to the control rule database DB1 and the output determination database DB3, as described later. The control rule database DB1 is connected so that it can be accessed by both the control rule execution unit 10 in the control execution device 20 and the control rule learning unit 11 in the control method learning device 21 described later. The control rules (neural networks) obtained as learning results in the control rule learning unit 11 are stored in the control rule database DB1, and the control rule execution unit 10 refers to the control rules stored there. The output determination database DB3 is connected so that it can be accessed by the control output determination unit 5 in the control execution device 20.

FIG. 4 shows a specific example of the control rule execution unit 10 according to the present embodiment. The control rule execution unit 10 receives the input data S1 created by the control input data creation unit 2 and gives the control operation end operation command S2 to the control output calculation unit 3. The control rule execution unit 10 includes a neural network 101, which basically determines the control operation end operation command S2 by the method of Patent Document 1 as illustrated in FIG. 2. In the present invention, the control rule execution unit 10 further includes a neural network selection unit 102, which refers to the control rules stored in the control rule database DB1, selects the most suitable control rule as the control rule for the neural network 101, and has it executed. In this way, the control rule execution unit 10 of FIG. 4 selects and uses the required neural network from a plurality of neural networks divided by operator crew and control purpose. The control rule database DB1 preferably also includes, as data from the controlled plant 1, actual data Si (such as operating crew data) that allow a neural network and quality judgment criteria to be selected. Because executing a neural network yields a control rule, this specification does not distinguish between the terms neural network and control rule and uses them synonymously.
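The selection of a control rule by operator crew and control purpose can be illustrated with a small lookup structure. Everything here is a hypothetical stand-in: `RuleKey`, `ControlRuleDatabase`, the crew and purpose labels, and the plain functions used in place of trained neural networks are assumptions, not the structure of the control rule database DB1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A control rule maps input data S1 to an operation command S2.
ControlRule = Callable[[List[float]], List[float]]


@dataclass(frozen=True)
class RuleKey:
    crew: str      # operator crew label (assumed), e.g. "A"
    purpose: str   # control purpose label (assumed), e.g. "edge"


class ControlRuleDatabase:
    """Hypothetical in-memory stand-in for the control rule database DB1."""

    def __init__(self) -> None:
        self._rules: Dict[RuleKey, ControlRule] = {}

    def store(self, key: RuleKey, rule: ControlRule) -> None:
        self._rules[key] = rule

    def select(self, key: RuleKey) -> ControlRule:
        return self._rules[key]  # raises KeyError if no rule was stored


db = ControlRuleDatabase()
# Simple functions stand in for trained neural networks.
db.store(RuleKey("A", "edge"), lambda s1: [-0.5 * x for x in s1])
db.store(RuleKey("B", "edge"), lambda s1: [-0.25 * x for x in s1])

rule = db.select(RuleKey("A", "edge"))
s2 = rule([1.0, -2.0])
print(s2)  # [-0.5, 1.0]
```

Keying the rules by crew and purpose means the learning side can update one rule without affecting the others.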

Returning to FIG. 3, the control method learning device 21 trains the neural network 101 used by the control execution device 20. When the control execution device 20 outputs the control operation amount output SO to the controlled plant 1, it takes time for the control effect to actually appear as a change in the actual data Si. Learning is therefore performed using data delayed by that amount of time. In FIG. 3, Z^-1 denotes an appropriate time delay function applied to each data item.
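A delay element of this kind can be sketched with a fixed-length buffer. The class name `Delay` and the use of a whole number of control cycles as the delay are assumptions; the text says only that an appropriate delay is applied to each data item.

```python
from collections import deque


class Delay:
    """Return each pushed sample `steps` calls later (None until the buffer fills)."""

    def __init__(self, steps):
        if steps < 1:
            raise ValueError("steps must be at least 1")
        self._buffer = deque(maxlen=steps)

    def push(self, sample):
        # The oldest stored sample is the one pushed `steps` calls ago.
        full = len(self._buffer) == self._buffer.maxlen
        delayed = self._buffer[0] if full else None
        self._buffer.append(sample)  # drops the oldest sample when full
        return delayed


delay = Delay(steps=2)
outputs = [delay.push(command) for command in ["u0", "u1", "u2", "u3"]]
print(outputs)  # [None, None, 'u0', 'u1']
```

Passing every logged signal through a delay of the same length keeps a command aligned with the plant data in which its effect is expected to appear.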

The control method learning device 21 mainly consists of a control result quality determination unit 6, a learning data creation unit 7, a control rule learning unit 11, and an evaluation function setting unit 17.

Of these, the control result quality determination unit 6 uses the actual data Si from the controlled plant 1, the previous value Si0 of the actual data, and the evaluation function set by the evaluation function setting unit 17 to determine whether the actual data Si has changed for the better or for the worse, and outputs control result quality data S6.
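The judgment above can be sketched as a comparison of evaluation values before and after the control action. The sketch relies on the convention, stated later in the description, that a smaller evaluation value means a better result. The function name and the choice to treat an unchanged value as "bad" are assumptions made here for illustration.

```python
def judge_control_result(evaluate, si, si_prev):
    """Return "good" if the evaluation value decreased, otherwise "bad".

    `evaluate` is any evaluation function where smaller means better.
    Treating "no change" as "bad" is an assumption of this sketch.
    """
    return "good" if evaluate(si) < evaluate(si_prev) else "bad"
```

For example, with a peak-to-peak evaluation `lambda d: max(d) - min(d)`, a shape deviation that shrinks from `[4, -4, 0]` to `[2, -1, 0]` is judged "good".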

The learning data creation unit 7 in the control method learning device 21 creates new teacher data S7a for training the neural network and supplies it to the control rule learning unit 11. To do so, it uses the control operation end operation command S2, the control operation amount S3, the control operation amount output availability data S4, and other input data created by the control execution device 20, each delayed by the same amount of time, together with the control result quality data S6 from the control result quality determination unit 6. The teacher data S7a corresponds to the control operation end operation command S2 output by the control rule execution unit 10. In other words, the learning data creation unit 7 uses the control result quality data S6 to estimate the control operation end operation command S2 output by the control rule execution unit 10, and provides the estimate as the new teacher data S7a.

FIG. 5 shows a specific example of the control rule learning unit 11 according to this embodiment. The control rule learning unit 11 mainly consists of an input data creation unit 114, a teacher data creation unit 115, a neural network processing unit 110, and a neural network selection unit 113. As external inputs, it receives data S8a, which is the input data S1 from the control input data creation unit 2 after a time delay, and the new teacher data S7a from the learning data creation unit 7. It also refers to the data stored in the control rule database DB1 and the learning data database DB3.

In the control rule learning unit 11, the input data S1 is taken into the neural network processing unit 110 via the input data creation unit 114 after appropriate time-delay compensation.

In the control rule learning unit 11, the teacher data creation unit 115 combines the new teacher data S7a from the learning data creation unit 7 with the past teacher data S7b stored in the learning data database DB3, and supplies the combined teacher data S7c to the neural network processing unit 110. The teacher data S7a and S7b are stored in the learning data database DB3 and used as needed.

Similarly, the input data creation unit 114 combines the input data S8a from the control input data creation unit 2 with the past input data S8b stored in the learning data database DB3, and supplies the combined input data S8c to the neural network processing unit 110. The input data S8a and S8b are stored in the learning data database DB3 and used as needed.

The neural network processing unit 110 consists of a neural network 111 and a neural network learning control unit 112. The neural network 111 takes in the input data S8c from the input data creation unit 114, the teacher data S7c from the teacher data creation unit 115, and the control rule (neural network) selected by the neural network selection unit 113, and stores the finally determined neural network in the control rule database DB1.

The neural network learning control unit 112 controls the input data creation unit 114, the teacher data creation unit 115, and the neural network selection unit 113 at appropriate timings so that the neural network 111 receives its inputs and the processing result is stored in the control rule database DB1.

The neural network 101 in the control execution device 20 (FIG. 4) and the neural network 111 in the control method learning device 21 (FIG. 5) are conceptually the same neural network. The following explains how their basic usage differs.

The neural network 101 in the control execution device 20 has predetermined contents: given the input data S1, it produces the corresponding output, the control operation end operation command S2. It is, so to speak, used for one-way processing. In contrast, the neural network 111 in the control method learning device 21 is used to obtain, by learning, a neural network that satisfies the input/output relationship given when the input data S8c and the teacher data S7c, which correspond to the input data S1 and the control operation end operation command S2 respectively, are set as learning data.

The basic processing concept of the control method learning device 21 configured as above is as follows. First, suppose the control operation amount output availability data S4 is "permitted", the control operation amount output SO is output to the controlled plant 1, and the control result quality data S6 is "good" (the actual data Si changed for the better). In this case the control operation end operation command S2 output by the control rule execution unit 10 is judged to be correct, and learning data is created so that the neural network outputs that command S2.

On the other hand, suppose either the control operation amount output availability data S4 is "not permitted", or the control operation amount output SO is output to the controlled plant 1 and the control result quality data S6 is "bad" (the actual data Si changed for the worse). In this case the control operation end operation command S2 output by the control rule execution unit 10 is judged to be wrong, and learning data is created so that the neural network does not produce that output. For this purpose, the neural network output is configured so that two kinds of output, a + direction and a - direction, are provided for each control operation end, and the learning data is created so that the command S2 on the side that was output is no longer output.
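The two cases above can be sketched as a small function that builds teacher data from the command S2. This is only an illustration under stated assumptions: S2 is represented as a list of (+, -) degree pairs, the "side that was output" is taken to be the larger of the two, and suppression is modelled by setting that side to 0.0. None of these representational choices is specified in the description.

```python
def make_teacher(s2, output_permitted, result_good):
    """Build teacher data from the command S2.

    s2: list of (plus, minus) degree pairs, one pair per control operation end
        (hypothetical representation).
    output_permitted: content of S4 ("permitted" -> True).
    result_good: content of S6 ("good" -> True).
    """
    if output_permitted and result_good:
        # The command was judged correct: teach the network to reproduce it.
        return [tuple(pair) for pair in s2]
    teacher = []
    for plus, minus in s2:
        # Suppress whichever direction was actually output
        # (assumed here to be the larger of the two degrees).
        if plus > minus:
            teacher.append((0.0, minus))
        elif minus > plus:
            teacher.append((plus, 0.0))
        else:
            teacher.append((plus, minus))
    return teacher
```

With `s2 = [(0.8, 0.1), (0.0, 0.5)]`, a "good" result returns the pairs unchanged, while a "bad" result or a withheld output yields `[(0.0, 0.1), (0.0, 0.0)]`.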

In the control rule learning unit 11 illustrated in FIG. 5, processing proceeds as follows under the data handling of the neural network learning control unit 112. First, the neural network 101 used in the control rule execution unit 10 is trained using learning data that combines the input data S8c, which is the input data S1 to the control execution device 20 after a time delay, with the teacher data S7c created by the teacher data creation unit 115. In practice, a neural network 111 identical to the neural network 101 of the control rule execution unit 10 is provided in the control rule learning unit 11; it is operated on a trial basis under various conditions, its responses are learned, and control rules confirmed to give better results are obtained. Because learning must use multiple sets of learning data, multiple sets of past learning data are retrieved from the learning data database DB2, which accumulates learning data created in the past, and used for training, and the current learning data is stored in the learning data database DB2. The trained neural network is stored in the control rule database DB1 for use by the control rule execution unit 10.

The neural network may be trained together with the past learning data each time new learning data is created, or training may be deferred until a certain amount of new learning data (for example, 100 sets) has accumulated and then performed together with the past learning data.

The control result quality determination unit 6 makes its judgment using the evaluation function set by the evaluation function setting unit 17. Whether a control result is judged good or bad depends on the evaluation function used. A separate neural network is therefore created for each of a plurality of evaluation functions. For the same input data, teacher data is created with each evaluation function and learned. In this way, multiple sets of teacher data are created for a single set of input data, and each is used to train the neural network corresponding to it, so that neural networks for multiple evaluation functions can be trained at the same time. Here, the plurality of evaluation functions are the evaluation functions used for the respective policies: in shape control, for example, which part in the plate width direction (plate edge, center, asymmetric portion, and so on) should be controlled preferentially, or which of several controlled items (for example, plate thickness, tension, and rolling load) should be controlled preferentially.

When this embodiment is applied, once the neural network 101 used in the control rule execution unit 10 has been trained, it is conceivable that no new control operations will be tried. The control operation disturbance generation unit 16 therefore randomly generates new operation methods from time to time and adds them to the control operation amount S3 when the control operation is executed, so that new control methods are learned.
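The random exploration described above might look like the following sketch. The function name, the uniform distribution, and the `amplitude` and `probability` parameters are assumptions introduced for illustration; the description states only that new operations are generated randomly from time to time and added to S3.

```python
import random


def add_exploration(s3, amplitude, probability, rng=random):
    """Occasionally add a random offset to each control operation amount.

    s3: list of control operation amounts.
    amplitude: maximum absolute offset (hypothetical parameter).
    probability: chance per call that a disturbance is applied
                 (hypothetical parameter).
    """
    if rng.random() >= probability:
        # No disturbance this cycle: pass S3 through unchanged.
        return list(s3)
    return [u + rng.uniform(-amplitude, amplitude) for u in s3]
```

Passing a seeded `random.Random` instance as `rng` makes the disturbance sequence reproducible, which is convenient when checking how the learning side reacts to it.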

As an example, the details of this plant control method are described below for shape control in a Sendzimir rolling mill such as that shown in Patent Document 1. The description assumes that specifications A and B below are adopted for shape control.

Specification A concerns the evaluation function, which carries priority information in the plate width direction. In shape control, for example, it is often difficult, because of the mechanical characteristics, to control the plate thickness and the like to the target value over the entire plate width. Therefore, evaluation functions A1 to AN (N is the maximum number of evaluation functions that can be set) corresponding to the following policies in the plate width direction are provided.

The evaluation function is defined so that its value becomes smaller as the evaluation improves. Examples are the mean square of the control deviation and the maximum value minus the minimum value.

Here, as an example, the following six policies and evaluation functions A1 to A6 are used.
<A1: Prioritize the plate edges, using an evaluation function that weights the plate edges more heavily.>

<A2: Prioritize the central part, using an evaluation function that weights the central part more heavily.>

<A3: Deviation of the plate edges in the elongation direction is tolerated.>

<A4: Deviation of the plate edges in the tension direction is tolerated.>

<A5: Plate-edge deviation within the dead band is tolerated.>

<A6: Maximum value minus minimum value.>
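Policies A1, A2, and A6 can be sketched with a weighted mean square and a peak-to-peak measure. The description gives no formulas, so the function names, the weight values, and the number of zones counted as "edge" are all hypothetical. Only the convention that a smaller value is better comes from the text.

```python
def weighted_mean_square(deviation, weights):
    """Weighted mean square of the per-zone control deviation (smaller is better)."""
    total = sum(weights)
    return sum(w * d * d for w, d in zip(weights, deviation)) / total


def edge_weights(n_zones, n_edge, edge_weight):
    """Policy A1 sketch: weight the outermost n_edge zones on each side more heavily."""
    return [edge_weight if i < n_edge or i >= n_zones - n_edge else 1.0
            for i in range(n_zones)]


def center_weights(n_zones, n_edge, center_weight):
    """Policy A2 sketch: weight every zone except the outermost n_edge more heavily."""
    return [1.0 if i < n_edge or i >= n_zones - n_edge else center_weight
            for i in range(n_zones)]


def max_minus_min(deviation):
    """Policy A6: maximum value minus minimum value."""
    return max(deviation) - min(deviation)
```

Because each function returns a single number that decreases as the result improves, any of them can be handed to the same good/bad comparison, which is what allows one set of input data to produce different teacher data per policy.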

FIG. 6 is a block diagram showing the internal configuration of the evaluation function setting unit 17. The evaluation function setting unit 17 includes an evaluation function manual setting unit 171, an evaluation function selection unit 172, an evaluation function selection method learning unit 173, and an evaluation function learning unit 174. Working together with the evaluation function database DB5, the evaluation function setting unit 17 executes the following processes related to the evaluation function.

<Process 17-1: Setting the evaluation function>
The evaluation function manual setting unit 171 sets the evaluation function. In this process, the way operating engineers and operators think about shape is formulated mathematically and set in advance.
<Process 17-2: Selecting the evaluation function>
The evaluation function selection unit 172 selects the evaluation function to be used by the control execution device 20 according to the rolling state.
<Process 17-3: Learning how to select the evaluation function>
The evaluation function selection method learning unit 173 learns from the rolling state and the operator's manual operation history so that the evaluation function suited to the rolling state is selected.
<Process 17-4: Learning the evaluation function itself>
Because a manually preset evaluation function is not necessarily correct, the evaluation function automatic learning unit 174 learns the evaluation function itself. The evaluation function learned here is called the learned evaluation function. Once learning has progressed to some extent, evaluation using the learned evaluation function becomes possible, and it may then be used as the evaluation function.

Specification B concerns the handling of conditions that are known in advance. For example, the relationship between the shape pattern and the control method changes with various conditions, so it may be necessary to divide the cases into categories, for example with specification B1 for plate width and specification B2 for steel grade. A change in any of these changes the degree to which the shape control operation ends affect the shape.

In this example, the controlled plant 1 is a Sendzimir rolling mill, and the actual data is the measured shape. A Sendzimir rolling mill is a cluster mill for cold rolling hard materials such as stainless steel. It uses small-diameter work rolls in order to apply heavy reduction to hard material, which makes it difficult to obtain a flat steel plate. As a countermeasure, the cluster roll structure and various shape control means are adopted. In general, the upper and lower first intermediate rolls of a Sendzimir rolling mill have a taper on one side and can be shifted, and the mill also has six split rolls and two rolls called AS-U at the top and bottom. In the example described below, the detection data of the shape detector is used as the actual shape data Si, and the shape deviation, which is the difference from the target shape, is used as the input data S1. The control operation amount S3 consists of the operation amounts of AS-U #1 to #n and the roll shift amounts of the upper and lower first intermediate rolls.

FIG. 7 shows the configuration of the neural network used for shape control of the Sendzimir rolling mill. The term "neural network" may be abbreviated as "neural net". Here it refers to the neural network 101 in the case of the control rule execution unit 10 and to the neural network 111 in the case of the control rule learning unit 11; both have the same structure.

In the Sendzimir rolling mill shape control example of this embodiment, the actual data Si from the controlled plant 1 is the actual data of the mill, including the shape detector data (here, the detector is assumed to output the shape deviation, that is, the difference between the actual shape and the target shape). From it, the control input data creation unit 2 obtains the normalized shape deviation 201 and the shape deviation stage 202 as the input data S1. The input layer of the neural networks 101 and 111 therefore consists of the normalized shape deviation 201 and the shape deviation stage 202. Although FIG. 7 feeds the shape deviation stage 202 into the input layer, the neural network may instead be switched according to the stage.

The output layer consists of the AS-U operation degree 301 and the first intermediate operation degree 302, matching the AS-U and the first intermediate rolls, which are the shape control operation ends of the Sendzimir rolling mill. For each AS-U, the operation degree has an AS-U opening direction (the direction in which the roll gap, that is, the gap between the upper and lower work rolls of the mill, opens) and an AS-U closing direction (the direction in which the roll gap closes). For each of the upper and lower first intermediate rolls, it has a first intermediate roll opening direction (the roll moves outward from the mill center) and a first intermediate roll closing direction (the roll moves toward the mill center). For example, if the shape detector has 20 zones and the shape deviation stage 202 has three stages (large, medium, small), the input layer has 23 inputs. If the AS-U has seven saddles and the upper and lower first intermediate rolls can be shifted in the plate width direction, the output layer has 14 AS-U operation degrees 301 and 4 first intermediate operation degrees 302, 18 outputs in total. The number of intermediate layers and the number of neurons in each layer are set as appropriate. As described later with reference to FIG. 10, the neural network output is configured so that, for each shape control operation end of the Sendzimir rolling mill represented in the output layer, two kinds of output, a + direction and a - direction, are provided.
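The layer sizes quoted above follow from simple counting: one input per detector zone plus one per stage, and two outputs (one per direction) per operation end. A minimal sketch, with a hypothetical function name:

```python
def layer_sizes(n_zones, n_stages, n_asu_saddles, n_intermediate_rolls):
    """Return (number of inputs, number of outputs) for the network of FIG. 7."""
    # One normalized deviation per detector zone plus one flag per stage.
    n_inputs = n_zones + n_stages
    # Each operation end has two outputs: opening and closing direction.
    n_outputs = 2 * n_asu_saddles + 2 * n_intermediate_rolls
    return n_inputs, n_outputs
```

With 20 zones, 3 stages, 7 AS-U saddles, and 2 first intermediate rolls (upper and lower), this gives 23 inputs and 14 + 4 = 18 outputs, matching the figures in the text.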

FIG. 8 shows the relationship between shape deviation and control method. The upper part of FIG. 8 shows the control method when the shape deviation is large, and the lower part shows the control method when it is small. The vertical direction represents the magnitude of the shape deviation and the horizontal direction the plate width direction; the two sides correspond to the plate edges and the middle to the plate center. As shown in the upper part of FIG. 8, when the shape deviation is large, correcting the overall shape takes priority over local shape deviations in the plate width direction. As shown in the lower part, when the shape deviation is small, reducing the local shape deviation takes priority.

Because the control method must change with the magnitude of the shape deviation, the shape deviation stage 202 is provided, as shown in FIG. 7, and supplied to the neural networks 101 and 111 so that the magnitude of the shape deviation can be judged. Regardless of its magnitude, the shape deviation itself is preferably normalized, for example to the range 0 to 1. This is only one example: the shape deviation may be fed to the input layer without normalization, or the neural network itself may be changed according to the magnitude (for example, two networks may be prepared, one for large and one for small shape deviations).

The neural networks 101 and 111 configured as in FIG. 7 are trained on how to operate in response to shape patterns, and shape control is performed using the trained neural network. Even neural networks with the same configuration acquire different characteristics depending on the training conditions, so they can produce different control outputs for the same shape pattern.

For this reason, optimum control can be configured for a variety of conditions by using different neural networks depending on conditions other than the measured shape. This corresponds to specification B. The configuration of FIG. 4 described earlier is a specific example of implementing this specification. In the configuration of FIG. 4, separate neural networks 101 for use in the control rule execution unit 10 are prepared according to rolling results, rolling mill operator name, steel grade of the material being rolled, plate width, and so on, and are registered in the control rule database DB1. The neural network selection unit 102 selects the neural network that matches the conditions at that time and sets it as the neural network 101 of the control rule execution unit 10. As the condition at that time, the neural network selection unit 102 preferably takes the plate width from the actual data Si of the controlled plant 1 and selects the neural network accordingly. The neural networks used here may differ in the number of intermediate layers and the number of units per layer, provided they have the input and output layers shown in FIG. 7.
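The selection step can be sketched as a lookup over registered conditions. The record layout (`width_min`, `width_max`, `steel_grade`, `network`), the half-open width ranges, and the `None` fallback are all hypothetical; the description says only that networks are registered per condition and that the matching one is selected.

```python
def select_network(db, width, steel_grade):
    """Return the network registered for the given plate width and steel grade.

    db: list of dicts with hypothetical keys "width_min", "width_max",
        "steel_grade", and "network".
    Returns None when no registered condition matches (an assumption).
    """
    for entry in db:
        in_range = entry["width_min"] <= width < entry["width_max"]
        if in_range and entry["steel_grade"] == steel_grade:
            return entry["network"]
    return None
```

Further keys such as operator crew could be added to each record in the same way without changing the lookup structure.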

FIG. 9 outlines the control input data creation unit 2, which creates the data S1 (normalized shape deviation 201 and shape deviation stage 202) fed to the input layer of the neural networks 101 and 111. The actual data Si here is the data from the shape detector, which detects the plate shape during rolling in the Sendzimir rolling mill that is the controlled plant 1. First, the shape deviation PP value calculation device 210 obtains the shape deviation PP value (peak-to-peak value) S_PP, the difference between the maximum and minimum of the detection results over the shape detector zones. The shape deviation stage calculation device 211 then classifies the shape deviation into three stages, large, medium, and small, according to S_PP. Shape is the distribution of the elongation of the rolled material across the plate width, and is expressed in I-UNIT, a unit that represents elongation in steps of 10^-5. For example, the shape deviation is classified as in the following equations.

Here, the shape deviation stage is set to (large = 1, medium = 0, small = 0) when equation (1) holds, to (large = 0, medium = 1, small = 0) when equation (2) holds, and to (large = 0, medium = 0, small = 1) when equation (3) holds. The shape deviation of each zone is normalized using S_PM, where S_PM = S_PP.
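The input data creation can be sketched as follows. Several points are assumptions, because equations (1) to (3) are not reproduced in this text: the thresholds are passed in as hypothetical parameters, the boundaries are treated as inclusive on the upper stage, and the normalization is taken to be (deviation - minimum) / S_PM so that values fall in the range 0 to 1. The text states only that S_PM = S_PP is used for normalization.

```python
def classify_stage(s_pp, large_threshold, medium_threshold):
    """Return the one-hot stage (large, medium, small) for a peak-to-peak value.

    Both thresholds are hypothetical; the actual values belong to
    equations (1) to (3), which are not available here.
    """
    if s_pp >= large_threshold:
        return (1, 0, 0)
    if s_pp >= medium_threshold:
        return (0, 1, 0)
    return (0, 0, 1)


def make_input(deviation, large_threshold, medium_threshold):
    """Build S1: normalized per-zone deviation followed by the stage flags."""
    lowest = min(deviation)
    s_pp = max(deviation) - lowest   # shape deviation PP value
    s_pm = s_pp                      # normalization constant, S_PM = S_PP
    if s_pm == 0:
        normalized = [0.0] * len(deviation)
    else:
        # Assumed normalization to the range 0..1.
        normalized = [(d - lowest) / s_pm for d in deviation]
    return normalized + list(classify_stage(s_pp, large_threshold, medium_threshold))
```

For a three-zone deviation `[0.0, 10.0, 5.0]` with hypothetical thresholds 50.0 and 8.0, the result is `[0.0, 1.0, 0.5, 0, 1, 0]`, that is, three normalized values followed by the "medium" flag.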

In this way, the normalized shape deviation 201 and the shape deviation stage 202, which are the input data to the neural network 101, are created. Together they form the input data S1 of the control rule execution unit 10.

FIG. 10 outlines the control output calculation unit 3. From the control operation end operation command S2 output by the neural network 101 in the control rule execution unit 10 (in the Sendzimir rolling mill shape control example, the AS-U operation degree 301 and the first intermediate operation degree 302), the control output calculation unit 3 creates the control operation amount S3, which is the operation command to each shape control operation end. There are multiple AS-U operation degrees 301 and first intermediate operation degrees 302; the figure shows one data example of each, and each data item consists of a pair of values, an opening-direction degree and a closing-direction degree.

In the control output calculation unit 3, the input AS-U operation degree 301 has an opening-direction output and a closing-direction output for each AS-U, so the operation command to each AS-U is obtained by multiplying their difference by the conversion gain G_ASU. Because the control output to each AS-U is an AS-U position change (in units of length), G_ASU is the gain that converts a degree into a position change.

Similarly, each input first intermediate operation degree 302 has a first intermediate outer-side output and an inner-side output, so the operation command to each first intermediate roll shift is obtained by multiplying their difference by the conversion gain G_1ST. Because the control output to each first intermediate roll is a first intermediate roll shift position change amount (in units of length), G_1ST is a conversion gain from degree to position change amount.

The control operation amount S3 can be calculated as described above. The control operation amount S3 consists of the #1 to #n AS-U position change amounts (n depends on the number of saddles of the AS-U roll), the upper first intermediate shift position change amount, and the lower first intermediate shift position change amount. FIG. 10 also shows the path that adds disturbance data from the control operation disturbance generation unit 16 to the control operation end operation command S2.
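As a minimal sketch of the conversion described above, the following Python code multiplies the difference of each degree pair by a conversion gain. The names Degree, control_operation_amounts, g_asu and g_1st are hypothetical, the numeric values are purely illustrative, and the sign convention (opening minus closing, outer minus inner) is an assumption that the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Degree:
    """One operation degree from the neural network: a pair of opposing values."""
    positive: float  # opening-direction (AS-U) or outer-side (first intermediate) degree
    negative: float  # closing-direction (AS-U) or inner-side (first intermediate) degree


def control_operation_amounts(asu_degrees, first_intermediate_degrees, g_asu, g_1st):
    """Convert degrees (S2) into position change amounts (S3): difference times gain.

    The sign convention (positive minus negative) is an assumption.
    """
    asu = [g_asu * (d.positive - d.negative) for d in asu_degrees]
    shift = [g_1st * (d.positive - d.negative) for d in first_intermediate_degrees]
    return asu, shift


# Illustrative values only: two AS-U saddles and the upper/lower first intermediate rolls.
asu, shift = control_operation_amounts(
    [Degree(0.8, 0.2), Degree(0.1, 0.4)],
    [Degree(0.5, 0.5), Degree(0.0, 1.0)],
    g_asu=2.0,
    g_1st=10.0,
)
```

The gains carry the unit of length per degree, so the returned values can be read directly as position change amounts.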

The operation of the evaluation function setting unit 17 is outlined with reference to FIG. 6. The evaluation function reflects the operator's intention regarding shape control in the rolling mill. The operator's intention changes with the rolling state, which is assumed here to be distinguished by the rolling speed. As shown in FIG. 11, the rolling speed changes as the mill accelerates from a stopped state, rolls at a constant speed, and then decelerates to a stop. The rolling state changes with the rolling speed as 17-1, 17-2, 17-3, and so on, and the operator's intention changes accordingly as intention 1, intention 2, intention 3, and so on. Examples of the operator's intention are as follows.

<Intention 1> At the start of rolling at low speed, the central part of the plate is prioritized to ensure stable threading of the plate.
<Intention 2> When rolling is accelerated, the plate edges are emphasized to prevent meandering of the plate and similar problems.
<Intention 3> When the rolling speed is constant, the quality of the rolled material is taken into account and, so that plate breakage does not occur, shape deviation in the elongation direction at the plate edges is tolerated and the shape of the central part is prioritized.

The evaluation functions A1 to AN are associated with the above intentions as follows.
The evaluation function A2 corresponds to intention 1.
The evaluation function A1 corresponds to intention 2.
The evaluation function A3 corresponds to intention 3.

The evaluation function database DB5 stores the correspondence between the operator's intentions and the evaluation functions described above. FIG. 12 shows an example of the evaluation function database DB5. It defines which of the evaluation functions A1 to A6 (evaluation function No.) is used for each operator intention corresponding to the rolling state.

Because the rolling states to which intentions 1, 2, and 3 apply can be distinguished by the rolling speed, which of the evaluation functions A1 to AN to use can be selected according to the rolling speed. An operator, an operation engineer, or the like uses the evaluation function manual setting unit 171 to manually set the association between the rolling speed and the evaluation functions A1 to AN in the evaluation function database DB5. According to that setting, the evaluation function selection unit 172 selects the evaluation function corresponding to the operator's intention for the rolling state determined from the rolling results Si (which include the actual rolling speed), and sets it in the control output determination unit 5 and the control result quality determination unit 6.
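The selection by rolling state can be sketched as a table lookup. In the following Python code the table contents follow the correspondence given above (intention 1 to A2, intention 2 to A1, intention 3 to A3), while the state names, the speed thresholds, and the rule for classifying the rolling state from two consecutive speed samples are assumptions made only for illustration.

```python
# Hypothetical table in the role of the evaluation function database DB5:
# rolling state -> evaluation function number.
EVALUATION_FUNCTION_DB = {
    "low_speed_start": "A2",  # intention 1: prioritize the plate center
    "accelerating": "A1",     # intention 2: emphasize the plate edges
    "constant_speed": "A3",   # intention 3: tolerate edge elongation, prioritize the center
}


def classify_rolling_state(speed, previous_speed, low_speed_limit=50.0, tolerance=1.0):
    """Classify the rolling state from two speed samples (thresholds are assumed)."""
    if speed <= low_speed_limit:
        return "low_speed_start"
    if speed - previous_speed > tolerance:
        return "accelerating"
    if abs(speed - previous_speed) <= tolerance:
        return "constant_speed"
    return "decelerating"  # the text lists no intention for this state


def select_evaluation_function(speed, previous_speed, db=EVALUATION_FUNCTION_DB):
    """Return the evaluation function number for the current state, or None if unset."""
    return db.get(classify_rolling_state(speed, previous_speed))
```

Keeping the correspondence in a table rather than in code mirrors the text, where the same table is later rewritten by the evaluation function selection method learning unit 173.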

A manual setting of the evaluation function selection made by an operator or an operation engineer may differ from actual practice, for example because the operator's actual judgment was not set properly or because the operator has found and begun to use a new judgment criterion. To evaluate this manual setting, the evaluation function selection method learning unit 173 judges whether the selection method is good or bad on the basis of actual data obtained in real rolling operation. When it judges the selection method to be poor, the evaluation function selection method learning unit 173 changes the setting of the selection method in the evaluation function database DB5.

FIG. 13 is a diagram for explaining an outline of the operation of the evaluation function selection method learning unit 173. When the operator judges during rolling operation that the plate shape is poor, the operator starts a manual operation and continues it until judging that the shape has improved. The operator's intention is therefore reflected at the time the manual operation starts and at the time it ends. By calculating the shape evaluation result of each evaluation function A1 to AN from the data at those times and comparing the results with one another, the evaluation function selection method learning unit 173 can determine the relative quality of the evaluation functions, that is, which evaluation function is closest to the operator's intention.

Assuming that a smaller shape evaluation value indicates a better shape, an evaluation function whose shape evaluation value is large when the manual operation starts and small when it ends can be judged suitable for that rolling state (rolling speed).

In this embodiment, the calculation method differs for each evaluation function (for example, one uses the mean square and another uses the maximum or minimum value), so the evaluation functions must be compared using a common index for judging their quality (the evaluation function quality index). As one example, the evaluation function selection method learning unit 173 compares the evaluation functions using the ratio Xi given by the following equation.
Ratio Xi = (a - b) / b
Here, a is the shape evaluation value at the time the manual operation starts, and b is the shape evaluation value at the time the manual operation ends. Among the evaluation functions A1 to AN, the evaluation function selection method learning unit 173 judges the evaluation function Ai with the largest ratio Xi to be the one whose evaluation best matches the operator's intention in the rolling state at that time, and selects it as the best evaluation function.

The rolling state when a manual operation starts or ends, and the operator's intention at that time, can be determined from the rolling results. If the evaluation function associated with that intention in the evaluation function database DB5 differs from the best evaluation function selected here, the evaluation function selection method learning unit 173 updates the evaluation function corresponding to that intention to the evaluation function Ai. From the next time on, the evaluation function selection method learning unit 173 sets the evaluation function Ai in the control output determination unit 5 and the control result quality determination unit 6 according to the changed setting.

The graph in FIG. 13 shows the change over time of the shape evaluation values of two evaluation functions, A1 and A2. For the evaluation function A1, the shape evaluation value is A1S at the time the manual operation starts and A1E at the time it ends. For the evaluation function A2, the shape evaluation value is A2S at the time the manual operation starts and A2E at the time it ends.

As is clear from FIG. 13, the ratio X2 = (A2S - A2E) / A2E of the evaluation function A2 is larger than the ratio X1 = (A1S - A1E) / A1E of the evaluation function A1, so under the rule above the evaluation function A2 is judged to match the operator's intention better.
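The comparison by the ratio Xi can be sketched as follows. The function names are hypothetical and the numeric values for A1S, A1E, A2S, and A2E are invented only to illustrate the calculation; they are not taken from FIG. 13.

```python
def improvement_ratio(start_value, end_value):
    """Ratio Xi = (a - b) / b, with a at the start and b at the end of manual operation."""
    if end_value == 0:
        raise ValueError("end-of-operation evaluation value must be non-zero")
    return (start_value - end_value) / end_value


def best_evaluation_function(evaluations):
    """Return the name with the largest ratio Xi, together with all ratios.

    `evaluations` maps an evaluation function name to its pair (a, b).
    """
    ratios = {name: improvement_ratio(a, b) for name, (a, b) in evaluations.items()}
    return max(ratios, key=ratios.get), ratios


# Illustrative values only: (A1S, A1E) and (A2S, A2E).
best, ratios = best_evaluation_function({"A1": (4.0, 3.0), "A2": (6.0, 2.0)})
```

Dividing by b makes the index dimensionless, which is what allows evaluation functions computed in different ways to be compared on one scale.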

Furthermore, in consideration of the possibility that a manually set evaluation function is itself inappropriate, the evaluation function learning unit 174 learns the evaluation function.

FIG. 14 is a diagram for explaining an outline of the operation of the evaluation function learning unit 174. The evaluation function learning unit 174 provides a neural network for the evaluation function (evaluation function neural network) that takes as input the shape results, which are the actual values of the shape of the plate obtained by rolling, and the rolling results, which are the parameter values of the control operation in rolling, and outputs a shape evaluation value. The unit trains this evaluation function neural network using actual data. For the rolling results used as input, it is advisable to select those likely to affect the evaluation function (for example, the rolling speed). The trained neural network can be used as an evaluation function.

As described above, with respect to the operator's intended evaluation of shape, the plate shape can be interpreted as poor at the time the operator starts a manual operation and good at the time the manual operation ends. Therefore, while plates are being produced by the rolling mill, the evaluation function learning unit 174 sets the shape evaluation value to 1 (indicating a poor shape) at the time the operator starts a manual operation and to 0 (indicating a good shape) at the time the manual operation ends, and accumulates these values as teacher data together with the shape results and rolling results at those times. The evaluation function learning unit 174 then performs supervised learning of the neural network using the accumulated teacher data. The trained neural network outputs a shape evaluation value when the rolling results and shape results are input, so it can be used as an evaluation function.

FIG. 15 shows an outline of the configuration of the evaluation function learning unit 174. The control output determination unit 5 and the control result quality determination unit 6 initially use an evaluation function set manually by the operator. The evaluation function learning unit 174 builds an evaluation function neural network that provides an evaluation function to replace the initial one, by learning from learning data formed by adding the teacher data described later to the rolling result data Si, which include the shape results and the rolling results.

The evaluation function learning unit 174 has an evaluation execution unit and a learning execution unit.

The evaluation execution unit includes the evaluation function neural network 1740 used by the control output determination unit 5 and the control result quality determination unit 6, and performs evaluation using the evaluation function neural network 1740.

The learning execution unit includes an evaluation function neural network 1741 equivalent to the evaluation function neural network 1740, and performs learning using the evaluation function neural network 1741. As shown in FIG. 14, the evaluation function neural network 1741 takes the shape results and rolling results as input and outputs a shape evaluation value. When the evaluation function neural network 1741 is trained, the rolling result data Si, which include the shape results and rolling results, are used as input data, the shape evaluation value described later is used as teacher data, and the combination of the two is used as learning data. Accordingly, the combinations of shape results, rolling results, and teacher data are stored as learning data in the evaluation function learning data database 1743, and the learning execution unit preferably trains the neural network once a certain amount of learning data has accumulated.

In addition to the evaluation function neural network 1741 described above, the learning execution unit has an evaluation function neural network learning control unit 1744, an input data creation unit 1745, and a teacher data creation unit 1746.

Using the signal of the operator's manual operation on the shape, the teacher data creation unit 1746 creates teacher data with a shape evaluation value of 1 at the timing when the manual operation starts. The teacher data creation unit 1746 also notifies the input data creation unit 1745 of that timing. The input data creation unit 1745 acquires the shape results and rolling results at the timing when the manual operation starts and uses them as input data. The input data created by the input data creation unit 1745 and the teacher data created by the teacher data creation unit 1746 are stored in the evaluation function learning data database 1743 as one set of learning data.

Similarly, using the signal of the operator's manual operation on the shape, the teacher data creation unit 1746 creates teacher data with a shape evaluation value of 0 at the timing when the manual operation ends. The teacher data creation unit 1746 also notifies the input data creation unit 1745 of that timing. The input data creation unit 1745 acquires the shape results and rolling results at the timing when the manual operation ends and uses them as input data. The input data created by the input data creation unit 1745 and the teacher data created by the teacher data creation unit 1746 are stored in the evaluation function learning data database 1743 as one set of learning data.

When a certain amount of learning data (for example, 1000 sets) has accumulated in the evaluation function learning data database 1743, the evaluation function neural network learning control unit 1744 reads the learning data from the database, obtains the input data and teacher data from them, supplies these to the evaluation function neural network 1741, and trains the neural network.

When the learning execution unit has finished training the evaluation function neural network 1741, that network is copied to the evaluation function neural network 1740 of the evaluation execution unit, which is thereby updated to the new network. As a result, the control output determination unit 5 and the control result quality determination unit 6 can perform evaluation with the new evaluation function neural network.
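The flow from teacher data creation to the update of the evaluating network can be sketched as below. This is only an illustration under stated assumptions: a single logistic unit trained by stochastic gradient descent stands in for the evaluation function neural network, the class and method names are hypothetical, and retraining every min_samples new sets is one possible reading of "a certain amount".

```python
import copy
import math


class TinyEvaluator:
    """Stand-in for an evaluation function neural network: one logistic unit."""

    def __init__(self, n_inputs):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0

    def evaluate(self, features):
        """Return a shape evaluation value between 0 and 1 (near 1 means a poor shape)."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-60.0, min(60.0, z))  # keep math.exp within range
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, samples, epochs=500, rate=0.5):
        """Plain stochastic gradient descent on the cross-entropy loss."""
        for _ in range(epochs):
            for features, target in samples:
                error = self.evaluate(features) - target
                self.weights = [w - rate * error * x
                                for w, x in zip(self.weights, features)]
                self.bias -= rate * error


class EvaluationFunctionLearner:
    """Accumulates learning data and refreshes the evaluating network after training."""

    def __init__(self, n_inputs, min_samples=1000):
        self.evaluator = TinyEvaluator(n_inputs)  # role of network 1740 (evaluation)
        self.learner = TinyEvaluator(n_inputs)    # role of network 1741 (learning)
        self.samples = []                         # role of learning data database 1743
        self.min_samples = min_samples

    def on_manual_operation(self, event, features):
        """Record one learning data set; event is 'start' (label 1) or 'end' (label 0).

        Returns True when training ran and the evaluating network was replaced.
        """
        label = {"start": 1.0, "end": 0.0}[event]
        self.samples.append((list(features), label))
        if len(self.samples) % self.min_samples != 0:
            return False
        self.learner.train(self.samples)
        self.evaluator = copy.deepcopy(self.learner)  # copy 1741 into 1740
        return True
```

Training a separate copy and swapping it in afterwards keeps the network used for evaluation unchanged while learning is in progress, which matches the separation of the evaluation execution unit and the learning execution unit.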

In this embodiment, the suitable evaluation function is expected to differ when conditions such as the target plate width, the plate thickness, and the steel grade of the material differ. Trained neural networks learned separately for each condition may therefore be stored as evaluation functions in the evaluation function database DB5 and used selectively according to the conditions. Alternatively, a single neural network can cover these cases if the plate width, plate thickness, steel grade, and the like are included in the rolling results.

Until learning has progressed to some extent, the values obtained from the evaluation function neural network may be inaccurate. The evaluation function selection method learning unit 173 may therefore select the evaluation function to use by considering not only the values of the preset evaluation functions A1 to AN but also the rolling state.

As described above, the evaluation function setting unit 17 sets the optimum evaluation function for the rolling state in the control output determination unit 5 and the control result quality determination unit 6.

FIG. 16 is a diagram for explaining an outline of the control output determination unit 5. The control output determination unit 5 consists of a rolling phenomenon model 501 and a shape correction quality determination unit 502. It receives the actual data Si from the controlled plant 1, the control operation amount S3 from the control output calculation unit 3, and the information in the output determination database DB3, and provides the control operation amount output availability data S4 for the control operation ends. With this configuration, the control output determination unit 5 predicts the change in shape that would result if the control operation amount S3 calculated by the control output calculation unit 3 were output to the rolling mill, which is the controlled plant 1, by inputting S3 to a known model of the controlled plant 1 (the rolling phenomenon model 501 in the embodiment of FIG. 16). If the shape is expected to deteriorate, the control operation amount output SO is suppressed to prevent the shape from deteriorating significantly.

More specifically, the control operation amount S3 is input to the rolling phenomenon model 501, the change in shape caused by S3 is predicted, and the shape deviation correction amount prediction data 503 are calculated. The shape deviation correction amount prediction data 503 are then added to the shape detector data Si from the controlled plant 1 (the current actual shape deviation data 504) to obtain the shape deviation prediction data 505. Evaluating the shape deviation prediction data 505 makes it possible to predict how the shape will change when the control operation amount S3 is output to the controlled plant 1. From the current actual shape deviation data 504 and the shape deviation prediction data 505, the shape correction quality determination unit 502 determines whether the shape will change for the better or for the worse, and obtains the control operation amount output availability data S4.
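The prediction step can be sketched as an element-wise sum over the shape detector zones. The text does not describe the internals of the rolling phenomenon model 501, so the linear influence-coefficient matrix used here is purely an assumption for illustration, as are the function name and the numeric values.

```python
def predict_shape_deviation(actual_deviation, control_amounts, influence):
    """Return prediction data 505 = actual data 504 + correction prediction 503.

    `influence[i][k]` is an assumed coefficient giving the shape change in
    detector zone i per unit of control operation amount k; it stands in for
    the rolling phenomenon model 501, whose real form is not given in the text.
    """
    correction = [
        sum(row[k] * control_amounts[k] for k in range(len(control_amounts)))
        for row in influence
    ]
    return [a + c for a, c in zip(actual_deviation, correction)]


# Illustrative values only: three detector zones and two control operation amounts.
predicted = predict_shape_deviation(
    actual_deviation=[1.0, -0.5, 0.5],
    control_amounts=[0.5, 1.0],
    influence=[[-1.0, 0.0], [0.0, 0.5], [1.0, -0.5]],
)
```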

Specifically, the shape correction quality determination unit 502 determines the quality of the shape correction as follows. First, to take the control priority in the plate width direction into account, the quality of the shape change is determined using the evaluation function for the rolling state set by the evaluation function setting unit 17. For example, the quality of the shape change is determined using the evaluation function J shown in the following equation. In the equation, εfb(i) is the actual shape deviation 504, εest(i) is the predicted shape deviation 505, i is the shape detector zone, rand is a random number term, and J_Ai is the evaluation function set by the evaluation function setting unit 17.

With the evaluation function J of the above equation, J is positive when the shape improves and negative when it deteriorates. The term rand is a random number term that randomly perturbs the result of the evaluation function J. Because J can therefore become positive even when the shape is predicted to deteriorate, the relationship between shape pattern and control method can be learned even when the rolling phenomenon model 501 is incorrect. The maximum value of rand is changed as appropriate: it is made large when the model of the controlled plant 1 is uncertain, as at the start of trial operation, and rand is set to 0 when the control method has been learned to some extent and stable control is desired.

The shape correction quality determination unit 502 calculates the evaluation function J and outputs the control operation amount output availability data S4 as S4 = 1 (output permitted) when J ≥ 0 and S4 = 0 (output not permitted) when J < 0.

The control output suppression unit 4 decides whether the control operation amount output SO is output to the controlled plant 1 according to the control operation amount output availability data S4, which is the determination result of the control output determination unit 5. The control operation amount output SO consists of the #1 to #n AS-U position change amount outputs, the upper first intermediate shift position change amount output, and the lower first intermediate shift position change amount output, and is determined by:
IF (control operation amount output availability data S4 = 0) THEN
  #1 to #n AS-U position change amount output = 0
  Upper first intermediate shift position change amount output = 0
  Lower first intermediate shift position change amount output = 0
ELSE
  #1 to #n AS-U position change amount output = #1 to #n AS-U position change amount
  Upper first intermediate shift position change amount output = upper first intermediate shift position change amount
  Lower first intermediate shift position change amount output = lower first intermediate shift position change amount
ENDIF
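The permission decision and the suppression rule above can be sketched together in Python. The function names are hypothetical; the value of J is taken as given, since its equation is not reproduced in this text.

```python
def output_permission(j_value):
    """Return S4: 1 (output permitted) when J >= 0, otherwise 0 (not permitted)."""
    return 1 if j_value >= 0 else 0


def suppress_control_output(s4, asu_changes, upper_shift_change, lower_shift_change):
    """Return the outputs SO: all zeros when S4 == 0, otherwise S3 passed through."""
    if s4 == 0:
        return [0.0] * len(asu_changes), 0.0, 0.0
    return list(asu_changes), upper_shift_change, lower_shift_change
```

For example, suppress_control_output(output_permission(j), asu, upper, lower) yields zero outputs for every operation end whenever J is negative.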

The control execution device 20 performs shape control by executing the above calculations on the actual data Si from the controlled plant 1 (rolling mill) and outputting the control operation amount output SO to the controlled plant 1 (rolling mill).

Next, an outline of the operation of the control method learning device 21 is described. The control method learning device 21 uses time-delayed versions of the data used in the control execution device 20. The time delay Z^-1 denotes e^-TS, that is, a delay by a preset time T. Because the controlled plant 1 has a time response, there is a time delay before the actual data change in response to the control operation amount output SO. Learning is therefore performed using the actual data at the time when the delay time T has elapsed after the control operation was executed. In shape control, several seconds are required after an operation command is output to the AS-U or the first intermediate roll before the shape meter detects the change in shape, so T is preferably set to about 2 to 3 seconds. (Because the delay also varies with the type of shape detector and the rolling speed, T should be set to the optimum time for a change at the control operation end to appear as a change in shape.)
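In a sampled implementation, the delay by T could be realized with a fixed-length buffer, as in the following sketch. The class name DelayLine and the control cycle length cycle_seconds are assumptions; the text specifies only the delay time T.

```python
from collections import deque


class DelayLine:
    """Delays samples by a fixed number of control cycles (discrete stand-in for e^-TS)."""

    def __init__(self, delay_seconds, cycle_seconds):
        # Number of cycles corresponding to the delay time T (at least one).
        self.steps = max(1, round(delay_seconds / cycle_seconds))
        self.buffer = deque(maxlen=self.steps + 1)

    def push(self, sample):
        """Store the newest sample and return the one from `steps` cycles ago.

        Returns None until enough samples have been stored.
        """
        self.buffer.append(sample)
        if len(self.buffer) <= self.steps:
            return None
        return self.buffer[0]


# With T = 2 s and an assumed 0.5 s cycle, each sample reappears four cycles later.
line = DelayLine(delay_seconds=2.0, cycle_seconds=0.5)
delayed = [line.push(k) for k in range(6)]
```

Pairing each delayed control record with the actual data of the current cycle gives the learning side the data as they stood T seconds before the measured result.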

FIG. 17 is a diagram for explaining an outline of the operation of the control result quality determination unit 6. The shape change quality determination unit 602 uses a quality determination evaluation function Jc such as that in the following equation.

In the above equation, εfb(i) is the actual shape deviation data contained in the actual data Si, εlast(i) is the previous value of the actual shape deviation data, and J_Ai is the evaluation function set by the evaluation function setting unit. Here, J_Ai is either an evaluation function manually preset in the evaluation function setting unit 17 or an evaluation function learned by the evaluation function learning unit 174 (a learned evaluation function). The quality of the control result is determined by the quality determination evaluation function Jc. In addition, when the control operation amount output availability data S4, the determination result of the control output determination unit 5, is 0 (output not permitted), the control operation amount actually output to the controlled plant 1 is 0, but the shape is nevertheless judged to have deteriorated.

Here, when the control operation amount output availability data S4 = 0, the control result quality data S6 is set to -1. An upper threshold LCU and a lower threshold LCL are set in advance under the condition LCU ≥ 0 ≥ LCL. The quality determination evaluation function Jc is then compared with these thresholds as follows.
If Jc > LCU, the control result quality data S6 = -1 (the shape has deteriorated).
If LCU ≥ Jc ≥ 0, the control result quality data S6 = 0 (the shape changed in the direction of deterioration).
If 0 > Jc ≥ LCL, the control result quality data S6 = 1 (the shape changed in the direction of improvement).
If Jc < LCL, the control result quality data S6 = 0 (the shape has improved).

Here, control result quality data S6 = -1 means the shape has become worse, so the control output that was issued should be suppressed. S6 = 0 means there was no shape change, or the shape has improved, so the control output that was issued should be held. S6 = 1 means the shape has changed in the improving direction but may improve further, so the control amount that was issued should be increased.
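The threshold comparison can be sketched in Python as follows. This is a minimal illustration of the four cases listed above, not the implementation used in the embodiment; the function name `control_result_quality` and its argument names are hypothetical.

```python
def control_result_quality(jc: float, lcu: float, lcl: float, s4: int = 1) -> int:
    """Return the control result quality data S6 (-1, 0 or 1).

    jc  : value of the quality determination evaluation function Jc
    lcu : upper threshold LCU
    lcl : lower threshold LCL
    s4  : control operation amount output availability data S4 (0 = not permitted)
    """
    if not (lcu >= 0 >= lcl):
        raise ValueError("thresholds must satisfy LCU >= 0 >= LCL")
    if s4 == 0:
        return -1   # output was not permitted: treated as "shape became worse"
    if jc > lcu:
        return -1   # shape became worse: suppress the issued output
    if jc >= 0:
        return 0    # changed in the worsening direction: hold
    if jc >= lcl:
        return 1    # changed in the improving direction: increase
    return 0        # shape has improved: hold
```

For example, with LCU = 0.3 and LCL = -0.3, a value Jc = -0.1 falls in the range 0 > Jc ≥ LCL and yields S6 = 1.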

The quality determination evaluation function Jc differs when the evaluation function J_Ai differs, so the resulting control result quality data S6 may also differ. The control method learning device 21 therefore determines the control result quality data S6 for each of the preset evaluation functions.

Next, an outline of the learning data creation unit 7 is given. As shown in FIG. 3, based on the determination result from the control result quality determination unit 6 (control result quality data S6), the learning data creation unit 7 creates the teacher data S7a for the neural network 111 used by the control rule learning unit 11 from the control operation end operation command S2, the control operation amount S3, and the determination result of the control output suppression unit (control operation amount output availability data S4).

The teacher data S7a in this case corresponds to the outputs of the output layer of the neural network 111 shown in FIG. 7, namely the AS-U operation degree 301 and the first intermediate operation degree 302. The learning data creation unit 7 creates the teacher data S7a for the neural network 111 used by the control rule learning unit 11 from the control operation end operation command S2 output by the neural network 101 (AS-U operation degree 301 and first intermediate operation degree 302) and from the control operation amount output SO, which consists of the #1 to #n AS-U position change amount outputs, the upper first intermediate shift position change amount output, and the lower first intermediate shift position change amount output.

To explain the operation of the learning data creation unit 7, FIG. 18 summarizes the relationships among the data and symbols of each part of the control output calculation unit 3 in FIG. 10. Here the AS-U operation degree 301 is taken as representative of the control operation end operation command S2 output by the neural network 101. The positive-side operation degree is denoted OPref, the negative-side operation degree OMref, the randomly generated operation degree from the control operation disturbance generation unit 16 the operation degree random number Oref, the conversion gain G, and the control operation amount output SO Cref. Thus, for simplicity, the outputs of the output layer of the neural network 101 in the control rule execution unit 10 are treated as a positive-side and a negative-side operation degree, the randomly generated operation degree from the control operation disturbance generation unit 16 as the operation degree random number, and the control operation amount output SO to the control operation end as the operation command value.

FIG. 19 shows the processing stages of the learning data creation unit 7 and their contents. Using the notation of FIG. 18, the first processing stage 71 obtains the operation command value Cref by equation (6).

In the next processing stage 72, the operation command value Cref is corrected to C'ref according to the control result quality data S6. Specifically, the corrected value C'ref is obtained by equation (7) when S6 = -1, by equation (8) when S6 = 0, and by equation (9) when S6 = 1.

In processing stage 73, the operation degree correction amount ΔOref is obtained from the corrected operation command value C'ref by equations (10) and (11).

In processing stage 74, the teacher data OP'ref and OM'ref for the neural network 111 are obtained by equation (12).

In this way, as shown in FIG. 18, the learning data creation unit 7 computes the corrected operation command value C'ref from the operation command value Cref actually output to the controlled plant 1, according to the control result quality data S6 determined by the control result quality determination unit 6. Specifically, S6 = 1 means the control direction is correct but the control output is judged insufficient, so the operation command value is increased by ΔCref in the same direction. Conversely, S6 = -1 means the control direction is judged wrong, so the operation command value is reduced by ΔCref in the opposite direction. Because the conversion gain G is preset and therefore known, the correction amount ΔOref can be obtained once the positive-side and negative-side operation degrees are known. ΔCref is set in advance to a suitable value found by simulation or the like. With this procedure, the teacher data OP'ref and OM'ref used by the control rule learning unit 11 can be obtained by equation (12).
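The four processing stages can be sketched in Python as below. Equations (6) to (12) are not reproduced in this text, so every formula in the code is an assumption chosen only to be consistent with the verbal description: the command value is taken as the gain times the net operation degree, the correction is ±ΔCref along the direction of the command, and the correction is credited to whichever side (positive or negative) keeps both operation degrees non-negative. The function names are hypothetical.

```python
def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def make_teacher_data(op_ref: float, om_ref: float, o_rand: float,
                      gain: float, s6: int, delta_c: float):
    """Sketch of stages 71 to 74; all formulas are assumed, not quoted.

    Returns the teacher data (OP'ref, OM'ref).
    """
    # Stage 71 (assumed form of eq. (6)): operation command value Cref.
    c_ref = gain * (op_ref - om_ref + o_rand)
    # Stage 72 (assumed form of eqs. (7)-(9)): S6 = 1 pushes further in the
    # same direction, S6 = -1 pulls back, S6 = 0 leaves Cref unchanged.
    c_corr = c_ref + s6 * sign(c_ref) * delta_c
    # Stage 73 (assumed): convert the command correction to an operation degree.
    delta_o = (c_corr - c_ref) / gain
    # Stage 74 (assumed split): credit the correction to one side only.
    if delta_o >= 0:
        return op_ref + delta_o, om_ref
    return op_ref, om_ref - delta_o
```

With G = 2, OPref = 0.6, OMref = 0.1, no random component, and ΔCref = 0.2, this sketch raises the positive-side teacher value to about 0.7 when S6 = 1 and raises the negative-side teacher value to about 0.2 when S6 = -1.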

Although FIG. 19 uses a simplified example, in practice the procedure is carried out for all of the AS-U operation degrees 301 for the #1 to #n AS-U and the first intermediate operation degrees 302 for the upper and lower first intermediate roll shifts, and the results are used as the teacher data of the neural network 111 in the control rule learning unit 11 (AS-U operation degree teacher data and first intermediate operation degree teacher data).

FIG. 20 shows an example of the data stored in the learning data database DB2. Training the neural network 111 requires many pairs of input data S8a and teacher data S7a. The teacher data S7a created by the learning data creation unit 7 (AS-U operation degree teacher data and first intermediate operation degree teacher data) is therefore combined with the time-delayed data S8a of the input data S1 (normalized shape deviation 201 and shape deviation stage) that was input to the control rule execution unit 10 in the control execution device 20, and the pair is stored in the learning data database DB2 as one set of learning data S11.

The plant control device of FIG. 3 uses the databases DB1, DB2, DB3, DB4, and DB5. FIG. 21 shows the structure of the neural network management table TB used to manage the databases DB1, DB2, DB3, and DB4 in a coordinated manner. The management table TB is organized by specification: (B1) plate width, (B2) steel grade, and the evaluation functions A1 to AN that express the control priority. For (B1) plate width, four classes are used, for example 3-foot width, meter width, 4-foot width, and 5-foot width; for steel grade, about ten classes, steel grade (1) to steel grade (10), are used. There are N kinds of control evaluation function, where N is the number of evaluation functions that have been set (N = 6 in this embodiment). In this case there are 80 sections, and 240 neural networks are used selectively according to the rolling conditions.
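A management table of this kind can be sketched as a lookup from (plate width, steel grade, evaluation function) to a neural network No. The class labels and the numbering order below are assumptions for illustration only; the text specifies the class counts but not how the numbers are assigned.

```python
from itertools import product

# Class labels are hypothetical placeholders for the classes named in the text.
PLATE_WIDTHS = ["3ft", "meter", "4ft", "5ft"]          # (B1) 4 classes
STEEL_GRADES = [f"grade{k}" for k in range(1, 11)]     # (B2) 10 classes
EVAL_FUNCTIONS = [f"A{k}" for k in range(1, 7)]        # A1..AN with N = 6


def build_management_table() -> dict:
    """Map each (width, grade, evaluation function) triple to a network No.

    Numbering starts at 1 and follows the iteration order of the three
    lists; this ordering is an assumption, not taken from FIG. 21.
    """
    keys = product(PLATE_WIDTHS, STEEL_GRADES, EVAL_FUNCTIONS)
    return {key: no for no, key in enumerate(keys, start=1)}


TABLE = build_management_table()
```

The product 4 × 10 × 6 gives 240 entries, matching the stated number of neural networks; `TABLE[("meter", "grade3", "A2")]` would then return the No. used to select both the network in DB1 and its learning data in DB2.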

The neural network learning control unit 112 associates the learning data, that is, the combination of input data and teacher data shown in FIG. 20, with the corresponding neural network No. according to the neural network management table TB of FIG. 21, and stores it in the learning data database DB2 as shown in FIG. 22.

Each time the control execution device 20 executes shape control on the controlled plant 1, N sets of learning data are created, one per evaluation function. This is because, for the same input data and control output, the control result quality is judged with each of the N evaluation functions for control priority, so N kinds of teacher data are created. When a certain amount of teacher data (for example, 200 sets) has accumulated, or when new data has been added to the learning data database DB2, the neural network learning control unit 112 instructs the neural network 111 to be trained.

The control rule database DB1 stores a plurality of neural networks according to the management table TB shown in FIG. 21. The neural network learning control unit 112 specifies the No. of the neural network that needs training, and the neural network selection unit 113 retrieves that neural network from the control rule database DB1 and sets it as the neural network 111. The neural network learning control unit 112 then instructs the input data creation unit 114 and the teacher data creation unit 115 to retrieve the input data and teacher data corresponding to that neural network from the learning data database DB2, and trains the neural network 111 with them. Various neural network training methods have been proposed, and any of them may be used.

When training of the neural network 111 is complete, the neural network learning control unit 112 writes the trained neural network 111 back to the position of that neural network No. in the control rule database DB1, which completes the learning.

Training may be performed for all the neural networks defined in FIG. 21 at once at fixed intervals (for example, once a day), or only the neural network whose No. has accumulated a certain amount of new learning data (for example, 100 sets) may be trained at that point.
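The second scheduling option, training only the networks that have accumulated enough new learning data, can be sketched as follows. The class `LearningDataStore` and its methods are hypothetical stand-ins for the learning data database DB2 and the bookkeeping done by the neural network learning control unit 112; the actual training step is left out because the text allows any training method.

```python
from collections import defaultdict


class LearningDataStore:
    """Hypothetical per-network buffer of new (input, teacher) pairs."""

    def __init__(self, threshold: int = 100):
        self.threshold = threshold              # e.g. 100 new sets per network
        self.new_samples = defaultdict(list)    # network No. -> new pairs

    def add(self, net_no: int, input_data, teacher_data) -> None:
        """Store one set of learning data under its neural network No."""
        self.new_samples[net_no].append((input_data, teacher_data))

    def networks_due(self) -> list:
        """Return the Nos. whose new data has reached the threshold."""
        return sorted(no for no, samples in self.new_samples.items()
                      if len(samples) >= self.threshold)

    def take(self, net_no: int) -> list:
        """Hand the accumulated pairs to the trainer and reset the count."""
        return self.new_samples.pop(net_no, [])
```

A caller would loop over `networks_due()`, load the corresponding network from DB1, train it on `take(no)`, and write it back to the same No.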

As described above, the following can be achieved without greatly disturbing the shape at the rolling mill that is the controlled plant 1:
1) Rather than setting reference shape patterns and the corresponding control operations separately in advance and then learning the control operation method, combinations of shape pattern and control operation are learned and used to perform the control operation.
2) Because new control rules cannot be foreseen, and a control rule that could not have been predicted at all may turn out to be optimal, new rules are found by operating the control operation ends at random and observing the control results.
3) The evaluation function that determines the control priority for the controlled object is set according to the state of the controlled object so that it matches the operator's judgment and manual operation method.

The control rule database DB1 stores the neural networks used by the control execution device 20. If the stored neural networks have merely been initialized with random numbers, it takes time for training to progress to the point where reasonable control is possible. Therefore, when the control unit is built for the controlled plant 1, the control rules are trained in advance by simulation based on the control model of the controlled plant 1 known at that time, and the neural networks trained on the simulator are stored in the database. This makes it possible to control with a reasonable level of performance from the initial start-up of the controlled plant.

In addition, because the neural networks are trained with an evaluation function that matches the operator's manual operation method, the operator no longer needs to intervene manually in response to changes in the controlled object caused by the control output, which reduces the operator's load and improves control accuracy and operating efficiency.

The embodiment described above includes the following items. However, the items included in the embodiment are not limited to those listed below.

The control device of the present disclosure is a control device that controls a controlled object, comprising: a control execution device that gives a control output to the controlled object according to a given control rule; a control method learning device that evaluates the control output given to the controlled object using a designated evaluation function, creates learning data using the evaluation result, constructs the control rule by learning the learning data, and gives the control rule to the control execution device; and an evaluation function setting unit that holds a plurality of evaluation functions in advance, selects one of them based on the control state of the controlled object, and designates the selected evaluation function to the control method learning device.

With this configuration, the control output is given to the controlled object according to a control rule constructed by learning data that uses the result of evaluating the control output with an evaluation function selected based on the control state, so control based on an appropriate quality determination of the control result can be expected.

Further, according to the present disclosure, the evaluation function setting unit calculates an evaluation function quality determination index for each of the plurality of evaluation functions based on the control state of the controlled object and the manual operation by the operator, and selects the evaluation function to be designated to the control method learning device based on that index. With this configuration, using the relationship between the operator's manual operation and the control state of the controlled object makes it more likely that an evaluation function that rates the operator's intended control highly is selected.

Further, according to the present disclosure, the evaluation function setting unit calculates the evaluation values of the evaluation function at the timing when the operator starts a manual operation and at the timing when the operator ends it, and uses those evaluation values to calculate the evaluation function quality determination index. The operator starts a manual operation on judging during rolling that the plate shape is poor and continues until judging that the shape has improved, so with this configuration the operator's intention can be obtained from the evaluation values at those timings.

Further, according to the present disclosure, the evaluation function setting unit calculates the evaluation value a of the evaluation function at the timing when the operator starts the manual operation and the evaluation value b of the evaluation function at the timing when the operator ends it, and calculates the evaluation function quality determination index as (a - b) / b. With this configuration, the indexes can be compared with one another even when the evaluation functions are calculated in different ways.
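The index (a - b) / b and a selection based on it can be sketched as follows. The text gives the index formula but not the selection rule, so choosing the evaluation function with the largest index is an assumption made here (it presumes a larger evaluation value means a worse shape, so a large relative drop during the manual operation agrees with the operator's judgment). The function names are hypothetical.

```python
def quality_index(a: float, b: float) -> float:
    """Evaluation function quality determination index (a - b) / b.

    a : evaluation value when the operator starts the manual operation
    b : evaluation value when the operator ends the manual operation
    """
    if b == 0:
        raise ValueError("evaluation value b must be non-zero")
    return (a - b) / b


def select_evaluation_function(values: dict):
    """Pick the evaluation function with the largest index (assumed rule).

    values maps an evaluation function name to its (a, b) pair.
    Returns the selected name and the index of every candidate.
    """
    indexes = {name: quality_index(a, b) for name, (a, b) in values.items()}
    best = max(indexes, key=indexes.get)
    return best, indexes
```

Because the index is a relative change, two evaluation functions with different scales, for example (a, b) = (10.0, 8.0) and (3.0, 1.0), give directly comparable values of 0.25 and 2.0.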

Further, according to the present disclosure, the evaluation function takes as input the control output to the controlled object and the actual data of the controlled object reflecting the result of that control output, and outputs the evaluation result; and the evaluation function setting unit constructs the evaluation function by learning data based on the operator's manual operation, the control output to the controlled object, and the actual data of the controlled object. Because this configuration uses the operator's manual operation, an evaluation function that reflects the operator's intention can be constructed.

Further, according to the present disclosure, the evaluation function setting unit constructs the evaluation function by learning data based on the control output to the controlled object and the actual data of the controlled object at the timing when the operator starts a manual operation and at the timing when the operator ends it. The operator starts a manual operation on judging during rolling that the plate shape is poor and continues until judging that the shape has improved, so with this configuration evaluation values reflecting the operator's judgment serve as learning data, and an evaluation function that evaluates much as the operator does can be constructed.

Further, according to the present disclosure, the evaluation function setting unit generates learning data with the evaluation value at the timing when the operator starts a manual operation set to a predetermined value c, generates learning data with the evaluation value at the timing when the operator ends the manual operation set to a predetermined value d, and constructs the evaluation function by learning that data. The operator starts a manual operation on judging during rolling that the plate shape is poor and continues until judging that the shape has improved, so with this configuration the operator's intention can be obtained from the evaluation values at those timings.
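Generating such learning data can be sketched as labelling the state at the start of each manual operation with c and the state at its end with d. The record layout (`start_features`, `end_features`) and the default values c = 1.0 and d = 0.0 are assumptions for illustration; the text says only that c and d are predetermined.

```python
def make_evaluation_training_data(events: list, c: float = 1.0, d: float = 0.0) -> list:
    """Build (features, target evaluation value) pairs from manual operations.

    Each event is a dict with two hypothetical keys:
      "start_features": control output and actual data at the start timing
      "end_features":   control output and actual data at the end timing
    """
    samples = []
    for event in events:
        samples.append((event["start_features"], c))   # operator judged shape poor
        samples.append((event["end_features"], d))     # operator judged shape improved
    return samples
```

The resulting pairs could then be fed to any regression model that plays the role of the learned evaluation function.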

Further, according to the present disclosure, the control execution device comprises: a control rule execution unit that gives a control output to the controlled object according to a combination of actual data of the controlled object and control operation; a control output determination unit that uses the evaluation function to determine whether the control output of the control rule execution unit may be applied and, on determining that it may not, notifies the control method learning device that the combination of actual data and control operation is inappropriate; and a control output suppression unit that prevents the control output from being output to the controlled object when the control output determination unit determines that it may not be applied. The control method learning device comprises: a control result quality determination unit that, when the control execution device has actually output the control output to the controlled object, uses the evaluation function set by the evaluation function setting unit to determine, after the time delay until the control output is reflected in the actual data of the controlled object, whether the actual data was improved or worsened by the control output; a learning data creation unit that obtains teacher data using the control result quality determined by the control result quality determination unit and the control output; and a control rule learning unit that learns the actual data and the teacher data as learning data. Through learning by the control method learning device, separate combinations of actual data and control operation are obtained for a plurality of control objectives according to the state of the controlled plant, and the obtained combinations are used as the defined combinations of actual data and control operation of the controlled plant in the control rule execution unit.
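The execution-side flow (rule execution, output determination, suppression, and notification) can be sketched as a single step function. The callables `rule` and `output_is_acceptable` and the list `notifications` are hypothetical placeholders for the control rule execution unit, the control output determination unit, and the message sent to the control method learning device.

```python
def run_control_step(actual_data, rule, output_is_acceptable, notifications: list):
    """One control cycle of the control execution device (illustrative only).

    Returns (output sent to the plant, S4), where S4 = 1 means the output was
    permitted and S4 = 0 means it was suppressed.
    """
    control_output = rule(actual_data)                   # control rule execution
    if output_is_acceptable(actual_data, control_output):
        return control_output, 1                         # output goes to the plant
    # Rejected: report the inappropriate (actual data, operation) combination
    # to the learning side and suppress the output.
    notifications.append((actual_data, control_output))
    return 0.0, 0
```

The suppressed case corresponds to S4 = 0, which the learning side treats as a worsened result (S6 = -1).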

The plant control device of the present disclosure is in practice realized as a computer system, in which case a plurality of program groups are formed in the computer system.

These program groups are, for example, the following.
To implement the processing of the control execution device: a control rule execution program that gives a control output according to a defined combination of actual data of the controlled plant and control operation; a control output determination program that determines whether the control output of the control rule execution program may be output and notifies the control method learning device that the combination of actual data and control operation is erroneous; and a control output suppression program that prevents the control output from being output to the controlled plant when the control output determination program judges that outputting it would worsen the actual data of the controlled plant.
To implement the processing of the control method learning device: a control result quality determination program that, when the control execution device has actually output the control output to the controlled plant, determines after the time delay until the control effect appears in the actual data whether the actual data has become better or worse than before the control; a learning data creation program that obtains teacher data using the quality of the control result from the control result quality determination program and the control output; and a control rule learning program that learns the actual data and the teacher data as learning data.
Through learning by the control method learning device, separate combinations of actual data and control operation are obtained for a plurality of control objectives according to the state of the controlled plant, and the obtained combinations are used as the defined combinations of actual data and control operation of the controlled plant in the control rule execution program.

When applying the device of the present invention to an actual plant, the initial values of the neural networks must be determined. For this purpose it is advisable to create the combinations of actual data and control operation by simulation using a control model of the controlled plant before control is performed on the controlled plant, thereby shortening the period needed to learn those combinations on the controlled plant.

The present invention relates to, for example, a control method and a control unit for a rolling mill, which is one type of rolling equipment, and there are no particular obstacles to practical application.

1: controlled plant, 2: control input data creation unit, 3: control output calculation unit, 4: control output suppression unit, 5: control output determination unit, 6: control result quality determination unit, 7: learning data creation unit, 10: control rule execution unit, 11: control rule learning unit, 20: control execution device, 21: control method learning device, DB1: control rule database, DB2: output determination database, DB3: learning data database, Si: actual data, SO: control operation amount output, S1: input data, S2: control operation end operation command, S3: control operation amount, S4: control operation amount output availability data, S5: quality determination data, S6: control result quality data, S7a, S7b, S7c: teacher data, S8a, S8b, S8c: input data (for the control rule learning unit)

Claims

A control device that controls a controlled object, comprising:
a control execution device that gives a control output to the controlled object according to a given control rule;
a control method learning device that evaluates the control output given to the controlled object using a designated evaluation function, creates learning data using the evaluation result, constructs the control rule by learning the learning data, and gives the control rule to the control execution device; and
an evaluation function setting unit that holds a plurality of evaluation functions in advance, selects one of the plurality of evaluation functions based on the control state of the controlled object, and designates the selected evaluation function to the control method learning device.

The control device according to claim 1, wherein the evaluation function setting unit calculates an evaluation function quality judgment index for each of the plurality of evaluation functions based on the control state of the controlled object and a manual operation by an operator, and selects, based on the evaluation function quality judgment index, the evaluation function to be designated to the control method learning device.

The control device according to claim 2, wherein the evaluation function setting unit calculates the evaluation value of the evaluation function at the timing when the operator starts the manual operation and at the timing when the operator finishes the manual operation, and calculates the evaluation function quality judgment index using those evaluation values.

The control device according to claim 3, wherein the evaluation function setting unit calculates an evaluation value a of the evaluation function at the timing when the operator starts the manual operation and an evaluation value b of the evaluation function at the timing when the operator finishes the manual operation, and calculates the evaluation function quality judgment index as (a - b) / b.
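A minimal sketch of this index calculation follows, taking the index to be (a - b) / b. The two evaluation functions, the sample actual data, and the rule of selecting the function with the largest index are assumptions for illustration only; the sketch presumes that a smaller evaluation value means a better state, so a function whose value drops sharply during the manual operation is treated as agreeing with the operator.

```python
def quality_index(a, b):
    """Index (a - b) / b from value a at operation start and b at operation end."""
    if b == 0:
        raise ValueError("evaluation value b at the end of the operation must be non-zero")
    return (a - b) / b


def rank_evaluation_functions(functions, actual_at_start, actual_at_end):
    """Compute the index for every held evaluation function and pick the largest."""
    indices = {
        name: quality_index(f(actual_at_start), f(actual_at_end))
        for name, f in functions.items()
    }
    selected = max(indices, key=indices.get)
    return selected, indices


# Hypothetical evaluation functions held in advance (smaller value = better state).
functions = {
    "sum_of_squares": lambda actual: sum(x * x for x in actual),
    "max_deviation": lambda actual: max(abs(x) for x in actual),
}

selected, indices = rank_evaluation_functions(
    functions,
    actual_at_start=[1.0, -3.0, 1.0],  # actual data when the operator starts
    actual_at_end=[1.0, -1.0, 1.0],    # actual data when the operator finishes
)
```

With these sample values the sum-of-squares function goes from 11 to 3 and the maximum-deviation function from 3 to 1, so the former has the larger index and is selected.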

The control device according to claim 1, wherein
the evaluation function takes as input the control output to the controlled object and the actual data of the controlled object reflecting the control result of that control output, and outputs an evaluation result, and
the evaluation function setting unit constructs the evaluation function by learning learning data based on a manual operation by an operator, the control output to the controlled object, and the actual data of the controlled object.

The control device according to claim 5, wherein the evaluation function setting unit constructs the evaluation function by learning learning data based on the control output to the controlled object and the actual data of the controlled object at the timing when the operator starts the manual operation and at the timing when the operator finishes the manual operation.

The control device according to claim 6, wherein the evaluation function setting unit generates learning data in which the evaluation value at the timing when the operator starts the manual operation is a predetermined value c, generates learning data in which the evaluation value at the timing when the operator finishes the manual operation is a predetermined value d, and constructs the evaluation function by learning the generated learning data.
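The labelling scheme can be sketched as follows. The values c = 1.0 and d = 0.0, the record layout, and the sample numbers are assumptions for illustration. A nearest-neighbor lookup stands in for the learner here purely to keep the example short; it is not the learning method of the invention.

```python
C_START = 1.0  # assumed predetermined value c (state the operator chose to correct)
D_END = 0.0    # assumed predetermined value d (state the operator accepted)


def make_learning_data(records):
    """Turn each manual operation record into two labelled samples."""
    data = []
    for r in records:
        data.append((r["output_start"] + r["actual_start"], C_START))
        data.append((r["output_end"] + r["actual_end"], D_END))
    return data


def build_evaluation_function(data):
    """Stand-in learner: return the label of the nearest stored sample."""
    def evaluate(control_output, actual):
        x = control_output + actual

        def sq_dist(sample):
            return sum((p - q) ** 2 for p, q in zip(sample[0], x))

        return min(data, key=sq_dist)[1]
    return evaluate


# Hypothetical records of two manual operations.
records = [
    {"output_start": [0.0], "actual_start": [0.8, -0.6],
     "output_end": [0.5], "actual_end": [0.1, -0.1]},
    {"output_start": [0.0], "actual_start": [-0.7, 0.9],
     "output_end": [-0.4], "actual_end": [0.0, 0.1]},
]
data = make_learning_data(records)
evaluate = build_evaluation_function(data)
```

The resulting function takes a control output and actual data as input and returns an evaluation value, which matches the input and output of the evaluation function described in claim 5.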

The control device according to claim 1, wherein
the control execution device includes: a control rule execution unit that gives the control output to the controlled object according to a defined combination of actual data of the controlled object and a control operation; a control output determination unit that uses the evaluation function to determine whether the control output from the control rule execution unit is applicable and, when it determines that the control output is not applicable, notifies the control method learning device that the combination of the actual data and the control operation is inappropriate; and a control output suppression unit that prevents the control output from being output to the controlled object when the control output determination unit determines that the control output is not applicable,
the control method learning device includes: a control result quality determination unit that, when the control execution device has actually output the control output to the controlled object, determines, after the time delay until the control output is reflected in the actual data of the controlled object and using the evaluation function set by the evaluation function setting unit, whether the actual data has been improved or deteriorated by the control output; a learning data creation unit that obtains teacher data using the control result determined by the control result quality determination unit and the control output; and a control rule learning unit that learns the actual data and the teacher data as learning data, and
the control method learning device obtains, by learning, separate combinations of actual data and control operations for a plurality of control targets according to the state of the controlled plant, and the obtained combinations of actual data and control operations are used in the control rule execution unit as the defined combination of actual data of the controlled plant and control operation.

A control method for controlling a controlled object, in which a computer:
gives a control output to the controlled object according to a given control rule;
evaluates the control output given to the controlled object using a specified evaluation function;
creates learning data using the evaluation result;
constructs the control rule by learning the learning data; and
selects and specifies one of a plurality of evaluation functions held in advance, based on the control state of the controlled object.
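The steps of the method can be sketched as one loop. Everything concrete below is hypothetical: the scalar toy plant, the table-based control rule, the two evaluation functions keyed by a state name, and the choice to use the reversed operation as teacher data when the result deteriorates.

```python
# Evaluation functions held in advance, keyed by an assumed control state name.
EVALUATION_FUNCTIONS = {
    "steady": lambda x: abs(x),
    "transient": lambda x: x * x,
}


def plant(x, u):
    """Toy controlled object: operation u shifts the deviation x."""
    return x - 0.2 * u


def sign(x):
    return (x > 0) - (x < 0)


def run(x, rule, control_state, steps):
    # Select and specify one evaluation function based on the control state.
    evaluate = EVALUATION_FUNCTIONS[control_state]
    for _ in range(steps):
        key = sign(x)
        u = rule.get(key, 0)                  # control output per the current rule
        x_next = plant(x, u)
        improved = evaluate(x_next) < evaluate(x)  # evaluate the control output
        teacher = u if improved else -u       # learning data from the evaluation result
        rule[key] = teacher                   # "learning" here is a table update
        x = x_next
    return x, rule


# Start from a deliberately wrong rule: it pushes the deviation away from zero.
x_final, learned_rule = run(1.0, {1: -1, -1: 1, 0: 0}, "steady", 4)
```

In this run the first output makes the deviation worse, so the rule entry for positive deviations is reversed, and the remaining three steps reduce the deviation from 1.2 to about 0.6.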