JPS622334B2 - - Google Patents

Info

Publication number
JPS622334B2
Authority
JP
Japan
Prior art keywords
cpu
information
logical
group
logical device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP55161516A
Other languages
Japanese (ja)
Other versions
JPS5785151A (en)
Inventor
Kenji Takahashi
Mamoru Ishibashi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
Nippon Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Electric Co Ltd
Priority to JP55161516A
Publication of JPS5785151A
Publication of JPS622334B2
Granted

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Retry When Errors Occur (AREA)
  • Hardware Redundancy (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to an error recovery system for logical devices in an information processing system.

Conventionally, in a multiprocessor system, when one logical device fails and cannot recover, continuing the processing that the failed device was executing on another logical device has a drawback: the other, normal logical device needs a rescue circuit, as in the data processing apparatus described in Japanese Examined Patent Publication No. Sho 47-36181.

An object of the present invention is to provide an error recovery system for logical devices in which processing that was being executed on a failed logical device can be continued on another, normal logical device without providing a rescue circuit in that normal device.

The system of the present invention is an error recovery system for logical devices in a configuration comprising a first logical device, a second logical device, and a main storage device used in common by the two logical devices. The first logical device comprises storage means for storing an information group used for instruction fetch and execution; stopping means for stopping instruction execution when a retryable error has occurred and recovery within the device itself is impossible; and saving means by which the device itself saves the information group held in the storage means to the main storage device when the error occurs. The second logical device comprises fetching means for taking in the information group that the first logical device saved to the main storage device.

Next, the present invention is described in detail with reference to the drawing. Referring to the figure, one embodiment of the present invention comprises:

  • a central processing unit (hereinafter "CPU") 10, which detects an error;
  • CPU 20, which takes over the interrupted process;
  • program-operable register group 130 of CPU 10;
  • save register group 140, which saves the information held in register group 130;
  • data path 135, used to transfer data between register group 130 and save register group 140;
  • control lines 111 and 112;
  • arithmetic circuit 110, which executes instructions using register group 130;
  • error detection circuit 150 of CPU 10;
  • control lines 152 and 153, which carry signals from circuit 150;
  • retry control circuit 160, which determines whether recovery from an error is possible;
  • control lines 161 and 162, which carry signals from circuit 160;
  • main memory 30, which is shared by CPU 10 and CPU 20 and stores interruption information;
  • data path 145, used to transfer data between main memory 30 and save register group 140;
  • interruption control circuit 120, which controls interruption of processing on CPU 10;
  • control line 121, which carries signals from circuit 120; and
  • data path 245, which loads interruption information from main memory 30 into save register group 240 of CPU 20.
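
As a reading aid, the components listed above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the patented circuitry: the names `MainMemory`, `Cpu`, `checkpoint`, `restore`, `save_to_memory` and `load_from_memory` are invented for illustration, register contents are represented as plain dictionaries, and the moment at which register group 130 is copied into save register group 140 is an assumption, since the text only states that data path 135 connects the two.

```python
from dataclasses import dataclass, field


@dataclass
class MainMemory:
    """Main memory 30: shared by both CPUs, holds interruption information."""
    interruption_info: dict = field(default_factory=dict)  # process name -> saved state


@dataclass
class Cpu:
    """One of the symmetric CPUs (10 or 20). All names here are hypothetical."""
    number: int
    memory: MainMemory
    registers: dict = field(default_factory=dict)       # register group 130 / 230
    save_registers: dict = field(default_factory=dict)  # save register group 140 / 240

    def checkpoint(self) -> None:
        """Data path 135: copy the working registers into the save registers.
        When this happens is an assumption; the text does not say."""
        self.save_registers = dict(self.registers)

    def restore(self) -> None:
        """Data path 135, opposite direction: save registers -> working registers."""
        self.registers = dict(self.save_registers)

    def save_to_memory(self, process: str) -> None:
        """Data path 145: save registers -> main memory."""
        self.memory.interruption_info[process] = dict(self.save_registers)

    def load_from_memory(self, process: str) -> None:
        """Data path 145 / 245: main memory -> save registers."""
        self.save_registers = self.memory.interruption_info.pop(process)
```

The point of the model is that the only link between the two CPUs is the shared `MainMemory` object: nothing in `Cpu` refers to the other CPU, which mirrors the claim that the normal device needs no dedicated rescue circuit.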

Next, the operation of the present invention is described in detail, step by step.

First, operation in the normal state is described. When process A, running on CPU 10, needs to wait for a peripheral device (not shown) to complete an input/output operation, process A temporarily relinquishes the logical device. In response, CPU 10 saves the information held in save register group 140 to main memory 30 and becomes idle. If another process B is waiting to be executed, the idle CPU 10 loads the interruption information of process B from main memory 30 into save register group 140 and starts executing process B. When the input/output operation of process A completes, process A enters the execution-wait state and, because CPU 10 and CPU 20 are symmetric and share main memory 30, waits for an opportunity to be executed by either CPU 10 or CPU 20. If CPU 20 then becomes idle first, CPU 20 loads the interruption information of process A from main memory 30 into save register group 240 and resumes execution of process A.
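
The normal-state sequence above can be traced with a small simulation. This is a minimal sketch with several assumptions that are not in the patent: the execution-wait state is modelled as a FIFO queue kept in shared memory, register contents are dictionaries, and a hypothetical process C is introduced only to keep CPU 20 busy until it becomes idle. The names `SharedMemory`, `Processor`, `yield_for_io`, `dispatch` and `io_complete` are illustrative.

```python
from collections import deque


class SharedMemory:
    """Stand-in for main memory 30."""
    def __init__(self):
        self.saved = {}       # process name -> interruption information
        self.ready = deque()  # processes in the execution-wait state (assumed FIFO)


class Processor:
    """Stand-in for CPU 10 or CPU 20."""
    def __init__(self, name, memory):
        self.name = name
        self.memory = memory
        self.save_registers = None  # save register group 140 / 240
        self.current = None         # None means the CPU is idle

    def yield_for_io(self):
        """The running process gives up the CPU: save its state, go idle."""
        process = self.current
        self.memory.saved[process] = self.save_registers
        self.current, self.save_registers = None, None
        return process

    def dispatch(self):
        """If idle and a process is waiting, load its saved state and run it."""
        if self.current is None and self.memory.ready:
            process = self.memory.ready.popleft()
            self.save_registers = self.memory.saved.pop(process)
            self.current = process
        return self.current


def io_complete(memory, process):
    """The I/O a process was waiting on has finished: it becomes ready to run."""
    memory.ready.append(process)


def normal_operation_demo():
    memory = SharedMemory()
    cpu10, cpu20 = Processor("CPU10", memory), Processor("CPU20", memory)
    memory.saved["B"] = {"pc": 40}           # process B is waiting to run
    memory.ready.append("B")
    cpu10.current, cpu10.save_registers = "A", {"pc": 12}
    cpu20.current, cpu20.save_registers = "C", {"pc": 99}  # hypothetical process C

    cpu10.yield_for_io()       # A waits for I/O; CPU 10 goes idle
    cpu10.dispatch()           # CPU 10 picks up B
    io_complete(memory, "A")   # A is ready again, on either CPU
    cpu20.yield_for_io()       # CPU 20 becomes idle first
    cpu20.dispatch()           # CPU 20 resumes A from the state CPU 10 saved
    return cpu10.current, cpu20.current, cpu20.save_registers
```

In this trace process A starts on CPU 10 and finishes on CPU 20 using nothing but the state saved in shared memory, which is the same path the error case reuses.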

Next, operation when an error occurs is described. Suppose an error occurs in CPU 10 and is detected by error detection circuit 150. Error detection circuit 150 reports the detection to retry control circuit 160 and instructs arithmetic circuit 110 to stop execution. Retry control circuit 160 checks whether the reported error is retryable; if it is, the circuit stores the information in save register group 140 into program-operable register group 130 and then triggers arithmetic circuit 110 to resume execution. If the error persists after the prescribed number of retries, or if it is not retryable, retry control circuit 160 requests interruption control circuit 120 to interrupt the process. On receiving the request, interruption control circuit 120 saves the information in save register group 140 to main memory 30 and places the process that CPU 10 was executing in the execution-wait state. At this point arithmetic circuit 110 has not been triggered to resume, so CPU 10 cannot accept new processing. Consequently, unlike in the normal state, the process that was running on CPU 10 waits to be resumed by CPU 20 only. Since CPU 20 and main memory 30 are normal, when CPU 20 becomes idle it loads, just as in normal operation, the interruption information of that process from main memory 30 into save register group 240 and takes over the processing.
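
The error-time sequence above amounts to a bounded retry loop followed by a hand-over through main memory. The following is a rough sketch, not the circuit itself. Its assumptions: CPUs and memory are plain dictionaries, the outcome of each re-execution is supplied by a caller-provided function `rerun_ok`, and the default of three retries stands in for the "prescribed number", which the patent does not specify. The function names `new_cpu`, `handle_error` and `take_over` are hypothetical.

```python
def new_cpu(save_registers=None):
    """Build a dictionary standing in for one CPU."""
    return {
        "registers": {},                              # register group 130 / 230
        "save_registers": dict(save_registers or {}), # save register group 140 / 240
        "running": True,                              # arithmetic circuit is executing
        "accepts_work": True,                         # False once the CPU has given up
    }


def handle_error(cpu, memory, process, retryable, rerun_ok, max_retries=3):
    """Roles of circuits 150, 160 and 120 on the CPU that detected the error."""
    cpu["running"] = False  # error detection circuit stops the arithmetic circuit
    if retryable:
        for _ in range(max_retries):
            # Retry control: restore the saved state, then restart execution.
            cpu["registers"] = dict(cpu["save_registers"])
            if rerun_ok():
                cpu["running"] = True
                return "recovered"
    # Not retryable, or retries exhausted: the failing CPU itself saves the
    # process state to main memory and marks the process as waiting to run.
    memory["saved"][process] = dict(cpu["save_registers"])
    memory["ready"].append(process)
    cpu["accepts_work"] = False  # never restarted, so it takes no new work
    return "handed_over"


def take_over(cpu, memory):
    """An idle CPU picks up a waiting process exactly as in normal operation."""
    if not cpu["accepts_work"] or not memory["ready"]:
        return None
    process = memory["ready"].pop(0)
    cpu["save_registers"] = memory["saved"].pop(process)
    return process
```

Note that `take_over` contains no error-specific logic: the healthy CPU cannot tell whether the process was parked because of I/O or because another CPU failed, which matches the stated effect of the invention.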

Because CPU 10 and CPU 20 are symmetric, it is clear that if CPU 20 fails, CPU 10 takes over its processing. It is likewise clear that the system of the invention applies when there are three or more CPUs. This embodiment focuses on rescuing the process; the system of the invention remains effective even when means for reporting the occurrence of a fault is added for resource management within the system.

The present invention has the effect that, when an unrecoverable error occurs, recovery processing for the logical device can be carried out while preserving the continuity of processing, without specially providing a rescue circuit in the normal logical device.

BRIEF DESCRIPTION OF THE DRAWING

The figure shows one embodiment of the present invention. In the figure:

  • 10, 20: CPU
  • 30: main memory
  • 110, 210: arithmetic circuit
  • 120, 220: interruption control circuit
  • 130, 230: program-controllable register group
  • 140, 240: save register group
  • 150, 250: error detection circuit
  • 160, 260: retry control circuit
  • 115, 135, 145, 215, 235, 245: data path
  • 111, 112, 113, 114, 121, 151, 152, 153, 161, 162, 163, 211, 212, 213, 214, 221, 251, 252, 253, 261, 262, 263: control line

Claims (1)

CLAIMS

1. An error recovery system for logical devices, in a configuration comprising a first logical device, a second logical device, and a main storage device used in common by the two logical devices, characterized in that:

the first logical device comprises storage means for storing an information group used for instruction fetch and execution, stopping means for stopping instruction execution when a retryable error has occurred and recovery within the device itself is impossible, and saving means by which the device itself saves the information group held in the storage means to the main storage device when the error occurs; and

the second logical device comprises fetching means for taking in the information group saved from the first logical device to the main storage device.
JP55161516A 1980-11-17 1980-11-17 Error recovery system of logical device Granted JPS5785151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP55161516A JPS5785151A (en) 1980-11-17 1980-11-17 Error recovery system of logical device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP55161516A JPS5785151A (en) 1980-11-17 1980-11-17 Error recovery system of logical device

Publications (2)

Publication Number Publication Date
JPS5785151A JPS5785151A (en) 1982-05-27
JPS622334B2 true JPS622334B2 (en) 1987-01-19

Family

ID=15736550

Family Applications (1)

Application Number Title Priority Date Filing Date
JP55161516A Granted JPS5785151A (en) 1980-11-17 1980-11-17 Error recovery system of logical device

Country Status (1)

Country Link
JP (1) JPS5785151A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6341943A (en) * 1986-08-08 1988-02-23 Nec Corp Error restoring system for logic unit
JPH0792763B2 (en) * 1988-11-16 1995-10-09 日本電気株式会社 Fault handling method
JPH08329026A (en) * 1995-06-05 1996-12-13 Nec Corp Dual processor system

Also Published As

Publication number Publication date
JPS5785151A (en) 1982-05-27

Similar Documents

Publication Publication Date Title
JPS6053339B2 (en) Logical unit error recovery method
JPS622334B2 (en)
JPS6048773B2 (en) Mutual monitoring method between multiple computers
JPS6128141B2 (en)
JPH0375836A (en) Succeeding processing method for resource information
JPS6143739B2 (en)
JP2922981B2 (en) Task execution continuation method
JPS6156537B2 (en)
JPS63197258A (en) Input/output processor
JPS6130296B2 (en)
JPH0149975B2 (en)
JPS585856A (en) Error recovery system for logical device
JP2814988B2 (en) Failure handling method
JPS635779B2 (en)
JPH04332047A (en) Switching system for step-out of synchronism of redundant computer
JPH01140357A (en) Memory access controller
JPH0469744A (en) Runaway detector for microcomputer
JPH07111684B2 (en) Logical unit error recovery method
JPH0254358A (en) Error processing system for input/output adaptor
JPH07120296B2 (en) Error control method in hot standby system
JPS62212865A (en) Multiprocessor control system
JPS6130297B2 (en)
JPS5911927B2 (en) Address failure handling method
JPH03119432A (en) Error restoring system for logic device
JPS6134654A (en) Bus master control device