JPH0350304B2

Info

Publication number: JPH0350304B2
Application number: JP60023014A
Authority: JP
Inventors: Akio Komya
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1985-02-08
Filing date: 1985-02-08
Publication date: 1991-08-01
Also published as: JPS61183769A

Description

[Detailed Description of the Invention]

[Industrial Field of Application]

The present invention relates to a system recovery control method for a duplex information processing system, applied when a system that has gone down due to a failure is restored to the operating state by performing an initial program load.

In one type of duplex information processing system, when both systems are operating normally, the system runs in the so-called dual operation state: one system is the master and the other the slave, both operate in parallel, and the output of the master system is treated as the official output.

In this state, if for some reason one system can no longer operate normally, that is, goes down, the other system becomes the master (or remains the master) and operation switches to the so-called single operation state, in which business processing continues on the master alone.

If, in the single operation state, the running system also goes down, or both systems are otherwise down, the result is the so-called system-down state, and business processing naturally stops.

In particular, when recovering from such a system-down state and resuming operation, the business processing state must be restored to the state immediately before the system went down, using the most recent data, such as the so-called checkpoint file, that should be held by the system that was operating last, so that the interrupted business processing can continue.

[Problems to be solved by conventional technology and invention]

FIG. 2 is a block diagram showing an example configuration of a duplex information processing system.

System A (1a) and system B (1b) have the same configuration, each consisting of a processing unit 2a, 2b, a main storage device 3a, 3b, and peripheral devices typified by an auxiliary storage device 4a, 4b.

External input is fed to both system A and system B, as shown by input line 5, and if both systems are operable they execute the same processing in parallel. Output to the outside, however, is produced only by the system designated as master (system A in the figure), as shown by output line 6.

A signal line 7, for example, is provided between systems 1a and 1b so that each can, among other things, monitor the state of the other.

At the start of system operation, when an initial program load (hereinafter, IPL) is specified from system console 8a or 8b, IPL processing is performed, for example according to the processing flow of FIG. 3.

First, in step 10, a diagnostic program runs and diagnoses the state of the own system. If the result is found normal in step 11, processing continues; if not, the system stops in step 12, for example, after displaying the fault on system console 8a or 8b where possible.

If the result is normal, the required programs are loaded in step 13.

In the next step 14, the system checks whether the partner system is in an operable state (hereinafter, the ready state) by exchanging the required signals, for example over signal line 7.

If the partner system is not in the ready state, then in step 15 the system makes itself the master, determines whether it is recovering from a down state by reading the so-called checkpoint file stored, for example, in auxiliary storage device 4a or 4b, performs the required recovery processing if necessary, and then starts normal business processing.

If the partner system is already ready, the system proceeds to step 16, becomes the slave, performs the processing needed to enter the dual operation state with the partner master system, and then starts business processing in synchronization with the partner system.
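The conventional flow of FIG. 3 can be sketched as follows. This is a minimal illustration in Python, not code from the patent: the names `Role`, `conventional_ipl`, `self_diagnosis_ok`, and `partner_ready` are hypothetical, and program loading and recovery processing are reduced to comments.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"
    HALTED = "halted"


def conventional_ipl(self_diagnosis_ok: bool, partner_ready: bool) -> Role:
    """Return the role chosen by the conventional IPL flow (steps 10 to 16)."""
    # Steps 10-12: diagnose the own system; report and stop if abnormal.
    if not self_diagnosis_ok:
        return Role.HALTED
    # Step 13: load the required programs (omitted in this sketch).
    # Step 14: check over the signal line whether the partner is ready.
    if not partner_ready:
        # Step 15: become master, recover from the checkpoint file if needed.
        return Role.MASTER
    # Step 16: become slave and synchronize with the partner master.
    return Role.SLAVE
```

In this flow the role depends only on whether the partner is already ready, so whichever system is IPLed first becomes the master.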

As noted above, to recover correctly from the system-down state, the system that was the master when the system went down must enter the operating state as the master.

For this reason, when recovering from a system-down state, the operator has conventionally decided which system should be the master and first performed an IPL on that one system only. That system can then start up as the master, as described in the processing flow above.

Then, if required, starting IPL processing on the other system makes that system the slave, and the dual operation state is established.

This approach has a problem: if the operator chooses the wrong system to IPL first, incorrect recovery processing is performed, which can, for example, throw business processing into confusion after the system-down event.

[Means for solving problems]

The above problem is solved by a system recovery control method for a duplex information processing system that consists of two systems connected by a signal line over which each exchanges signals with the other. In the dual operation state, in which both systems operate, one system operates in the master state and the other in the slave state; in the single operation state, in which only one system operates, that system operates in the master state; and each system holds a predetermined state record indicating its own operating state. When a predetermined initial program load is performed, each system identifies the state of the partner system via the signal line. If it identifies the partner system as not being in an operable state, it operates its own system in the master state. If it identifies the partner system as operating in the master state, it operates its own system in the slave state. If it identifies the partner system as being in an operable state but not in the master state, it inspects the state record held in its own system: if the record does not show that the most recent operating state was the master state, it operates its own system in the slave state; if the record shows that the most recent operating state was the master state, it operates its own system in the master state and, if the record also shows that the operating state was the single operation state, it sends a signal over the signal line to interrupt the processing of the partner system.
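The decision rule described above can be summarized as a small pure function. The sketch below is an illustration under assumptions, not the patented implementation: the types `PartnerState`, `StateRecord`, and `Decision` and the function `decide_role` are hypothetical names, and a missing state record is represented as `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PartnerState(Enum):
    NOT_READY = "not ready"            # partner is not operable
    MASTER = "master"                  # partner already runs as master
    READY_NOT_MASTER = "ready"         # partner operable but not yet master


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"


@dataclass(frozen=True)
class StateRecord:
    was_master: bool   # most recent operating state was the master state
    was_single: bool   # most recent operating state was single operation


@dataclass(frozen=True)
class Decision:
    role: Role
    interrupt_partner: bool  # send the interrupt signal over the signal line


def decide_role(partner: PartnerState, record: Optional[StateRecord]) -> Decision:
    """Choose the own role at IPL from the partner state and the own record."""
    if partner is PartnerState.NOT_READY:
        return Decision(Role.MASTER, False)
    if partner is PartnerState.MASTER:
        return Decision(Role.SLAVE, False)
    # Partner is operable but not master: consult the own state record.
    if record is None or not record.was_master:
        return Decision(Role.SLAVE, False)
    # The own system was master; interrupt the partner only if it ran alone.
    return Decision(Role.MASTER, interrupt_partner=record.was_single)
```

For example, `decide_role(PartnerState.READY_NOT_MASTER, StateRecord(True, True))` yields the master role together with a request to interrupt the partner.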

[Effect]

That is, recovery control is configured so that a system on which IPL processing has been started automatically determines whether to make itself the master by referring to the ready state of the partner system and to the indication of its previous master/slave state stored in the checkpoint file or the like.

As a result, the operator need only start IPL processing on both systems at the same time; the master and slave are then determined correctly and automatically, and the system is recovered.

[Example]

FIG. 1 is a flowchart of the processing in one embodiment of the present invention.

As in the conventional flow, a diagnostic program runs in step 10 and diagnoses the state of the own system; if the result is normal, processing proceeds through step 13.

In step 20, the state of the partner system is checked by exchanging signals with it.

If the partner system is found not to be in the ready state, the system proceeds to step 15, makes itself the master, performs the required recovery processing if necessary, and then starts normal business processing.

If step 20 determines that the partner system is already operating as the master, the system proceeds to step 16, makes itself the slave, performs the processing needed to enter the dual operation state with the partner master system, and then starts business processing in synchronization with the partner system.

If step 20 determines that the partner system is ready and in the middle of IPL processing, the system proceeds to step 22 and reads the record of its previous system state (when recovering from a down state, the state immediately before the most recent down) from the checkpoint file stored, for example, in auxiliary storage device 4a or 4b. This state record is written, so as to indicate the master state or the slave state, as part of the processing for entering an operating state, both when the operating state of the own system is first determined and when it changes because of a failure or the like.
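One way to picture the state record is as a small file that is rewritten whenever the system enters a new operating state and read back at IPL. The sketch below is only an assumption for illustration: the patent does not specify a record format, and the JSON layout, the field names `role` and `mode`, and the functions `write_state_record` and `read_state_record` are all hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def write_state_record(path: Path, role: str, mode: str) -> None:
    """Record the operating state being entered (hypothetical format)."""
    if role not in ("master", "slave"):
        raise ValueError(f"unknown role: {role}")
    if mode not in ("dual", "single"):
        raise ValueError(f"unknown mode: {mode}")
    # Write to a temporary file first, then rename over the old record,
    # so that a crash mid-write does not leave a half-written record.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"role": role, "mode": mode}))
    tmp.replace(path)


def read_state_record(path: Path) -> Optional[dict]:
    """Return the last recorded state, or None if nothing was recorded."""
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return None
```

Returning `None` for a missing record corresponds to the case in which neither the master nor the slave state is indicated.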

In step 23, the system checks from the state record it has read whether it was the master. If it was not the master (which covers both the case where it was the slave and the case where neither state is recorded), it proceeds to step 16, makes itself the slave, and enters the dual operation state with the partner master system.

If step 23 identifies that the own system was the master, step 24 further checks, from the record of the previous system state, whether it was in the single operation state.

If it was in the single operation state, the partner system may have gone down earlier while it was the master, and the partner system may still hold the record of that state as its most recent record. In that case the partner system would also start up as the master.
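The situation described here can be traced with a short simulation of how the two state records evolve. This is a sketch under assumptions, not part of the patent: the class `System`, the functions `enter_state`, `go_down`, `run_scenario`, and `was_single_master`, and the `(role, mode)` tuple used as the record are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class System:
    name: str
    up: bool = True
    record: Optional[Tuple[str, str]] = None  # last (role, mode) written


def enter_state(system: System, role: str, mode: str) -> None:
    """Write the state record when a system enters an operating state."""
    system.record = (role, mode)


def go_down(failed: System, partner: System) -> None:
    """The failed system stops; its record stays as last written."""
    failed.up = False
    if partner.up:
        # The survivor becomes (or stays) master in single operation.
        enter_state(partner, "master", "single")


def run_scenario() -> Tuple[System, System]:
    a, b = System("A"), System("B")
    enter_state(b, "master", "dual")   # dual operation with B as master
    enter_state(a, "slave", "dual")
    go_down(b, a)                      # B fails first; A takes over alone
    go_down(a, b)                      # A fails too: system-down state
    return a, b


def was_single_master(system: System) -> bool:
    """True if the record shows master state in single operation."""
    return system.record == ("master", "single")
```

After `run_scenario()`, both records say "master", but only system A's record says "single". That difference is what lets system A, the last system to operate, recognize that it must take precedence over system B.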

To prevent both systems from becoming the master in this way, step 25 sends a cut-off signal to the partner system to abort the IPL processing it is executing, and the system then proceeds to the recovery processing of step 15. The partner system becomes the slave of the dual operation state when IPL processing is subsequently started on it again.

If step 24 identifies that the previous state was not the single operation state, the above processing is unnecessary, and the system proceeds directly to step 15.

[Effect of the Invention]

As is clear from the above description, the present invention eliminates incorrect recovery processing caused by operator error when a duplex information processing system recovers from a system-down state, and thus has the significant industrial effect of improving the reliability and availability of the information processing system.

[Brief explanation of drawings]

FIG. 1 is a flowchart of the processing in one embodiment of the present invention, FIG. 2 is a block diagram of an example system configuration, and FIG. 3 is a flowchart of the processing in an example conventional configuration. In the figures, 1a and 1b are information processing systems, 2a and 2b are processing units, 3a and 3b are main storage devices, 4a and 4b are auxiliary storage devices, 5 is an input line, 6 is an output line, 7 is a signal line, 8a and 8b are system consoles, and 10 to 16 and 20 to 25 are processing steps.

Claims

[Claims]

1. A system recovery control method for a duplex information processing system consisting of two systems and having a signal line over which each system exchanges signals with the partner system, wherein in a dual operation state in which both systems operate, one system operates in the master state and the other in the slave state, in a single operation state in which only one system operates, the operating system operates in the master state, and each system holds a predetermined state record indicating its own operating state, the method being characterized in that each system is configured so that, when a predetermined initial program load is performed, it identifies the state of the partner system via the signal line; if it identifies the partner system as not being in an operable state, it operates its own system in the master state; if it identifies the partner system as operating in the master state, it operates its own system in the slave state; and if it identifies the partner system as being in an operable state but not in the master state, it inspects the state record held in its own system, operates its own system in the slave state if the state record does not show that the most recent operating state was the master state, operates its own system in the master state if the state record shows that the most recent operating state was the master state, and, if the state record further shows that the operating state was the single operation state, sends a signal over the signal line to interrupt the processing of the partner system.