JPH0320774B2 - - Google Patents

Info

Publication number
JPH0320774B2
Authority
JP
Japan
Prior art keywords
failure
total
value
abnormality
devices
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP58056370A
Other languages
Japanese (ja)
Other versions
JPS59194253A (en)
Inventor
Masaaki Nagao
Yasutaka Oochi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Dai Ichi Communications Software Ltd
Fujitsu Ltd
Original Assignee
Fujitsu Dai Ichi Communications Software Ltd
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Dai Ichi Communications Software Ltd, Fujitsu Ltd
Priority to JP58056370A
Publication of JPS59194253A
Publication of JPH0320774B2
Granted

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy
    • G06F11/0754Error or fault detection not based on redundancy by exceeding limits
    • G06F11/076Error or fault detection not based on redundancy by exceeding limits by exceeding a count or rate limit, e.g. word- or bit count limit
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/004Error avoidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Description

[Detailed Description of the Invention] (1) Technical Field of the Invention The present invention relates to a failure determination method for the devices constituting a system, and in particular to a failure determination method that, given the way the devices are connected, prevents other devices from being regarded as faulty because of an abnormality in one device.

(2) Prior Art and Problems There are two conventional failure determination methods: one regards a device as faulty as soon as an abnormality is detected in it, and the other regards a device as faulty when the number of abnormalities counted for that device exceeds a fixed value. The former has the drawback that a device is regarded as faulty even when the abnormality is temporary and the device is in fact still usable. The latter avoids that drawback, but, depending on how the devices are interconnected, an abnormality in one device can cause other devices to be regarded as faulty.

(3) Object of the Invention The object of the present invention is to solve the above problems. It provides a failure determination method in which a device in the system is judged faulty not only when its own abnormality count exceeds a fixed value, but also when the sum of the abnormality counts of all devices connected beneath it exceeds a fixed value, thereby preventing the devices connected to a device from being regarded as abnormal and judged faulty because of an abnormality in that device.

(4) Structure of the Invention To achieve the above object, the present invention applies to a system in which a plurality of lower-level devices are connected to a higher-level device and operated hierarchically, which has means for counting the number of abnormalities for each device, and which regards a device counted to a predetermined value or more by that means as faulty, disconnects it from the system, and continues operation by switching to an alternative device. The invention is characterized in that the system is provided with totaling means for summing the abnormality counts of the individual devices and with a reference value for that total above which a failure is recognized; when a value counted by the counting means for a device exceeds the predetermined value for that device, that device is judged faulty and disconnected from the system; and when the total computed by the totaling means exceeds the reference value for the total, the higher-level device of those devices is judged faulty and is switched to an alternative device.
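
The determination rule described above can be sketched in Python as follows. This is only an illustrative sketch and not part of the patent: the class name `UpperDevice`, its fields, and the limit values are hypothetical, and checking the total before the per-device count is an assumption, since the text does not state an order.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UpperDevice:
    """A higher-level device and the abnormality counters of its lower-level devices."""
    name: str
    limits: Dict[str, int]          # per-device reference values (hypothetical numbers)
    total_limit: int                # reference value for the sum of all counters
    counts: Dict[str, int] = field(default_factory=dict)

    def report_abnormality(self, lower: str) -> Optional[str]:
        """Count one abnormality on `lower`; return the device judged faulty, or None."""
        self.counts[lower] = self.counts.get(lower, 0) + 1
        # Assumption: the total is checked first; the text does not fix an order.
        if sum(self.counts.values()) > self.total_limit:
            return self.name        # the higher-level device is switched to its alternate
        if self.counts[lower] > self.limits[lower]:
            return lower            # only this lower-level device is disconnected
        return None


a = UpperDevice("A", {"B": 3, "C": 3, "D": 3}, total_limit=4)
verdicts = [a.report_abnormality(d) for d in ["B", "C", "D", "B", "C"]]
print(verdicts)  # [None, None, None, None, 'A']
```

With abnormalities spread across several lower-level devices, the total crosses its limit before any single counter does, so the higher-level device is the one judged faulty. If all abnormalities fall on one lower-level device, its own counter crosses its limit first and only that device is disconnected.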

(5) Embodiments of the Invention The present invention is described in detail below by way of embodiments. FIG. 1 shows an example of a system configuration according to the present invention. In FIG. 1 there are a control device X; a device A, connected to X and currently in use, and its alternative device A′; and devices B, C, and D, which are connected to A and A′ and are controlled by control device X through A or A′. In addition, abnormality counters FCNT (Ca, Ca′, Cb, Cc, Cd) corresponding to these devices and reference values FB (Fa, Fa′, Fb, Fc, Fd, and Fbcd) for failure determination are provided.

The conventional method of identifying a faulty device is first explained using the control flow of FIG. 2. When an abnormality occurs in device B and is detected, counter Cb is incremented, and when the abnormality count Cb exceeds the reference value Fb, device B is regarded as faulty and disconnected from the system. However, if an abnormality occurs in device A, currently in use, and is detected as abnormalities of the devices B, C, and D connected to A, the counters Cb, Cc, and Cd are incremented each time an abnormality is detected, and in the conventional method devices B, C, and D are each regarded as faulty.
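
The problem with per-device counting can be illustrated with a short Python sketch. The function name, the limit of 3 assumed for each of Fb, Fc, and Fd, and the round-robin order of abnormalities are assumptions made for illustration; the text gives no concrete values for this configuration.

```python
def conventional_judgement(events, limits):
    """Per-device counting only: a device is cut off once its count exceeds its limit."""
    counts = {name: 0 for name in limits}
    disconnected = []
    for device in events:
        if device in disconnected:
            continue                      # a cut-off device is no longer accessed
        counts[device] += 1
        if counts[device] > limits[device]:
            disconnected.append(device)
    return disconnected


# A fault in A shows up as abnormalities on B, C, and D in turn.
limits = {"B": 3, "C": 3, "D": 3}         # hypothetical values of Fb, Fc, Fd
events = ["B", "C", "D"] * 4
print(conventional_judgement(events, limits))  # ['B', 'C', 'D']
```

All three healthy devices end up disconnected, while the device that actually failed, A, is never identified.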

In the present invention, therefore, the failure reference values FB are set appropriately so that, for example, when the sum of the counters Cb, Cc, and Cd exceeds the reference value Fbcd, device A, to which devices B, C, and D are connected, is disconnected from the system as faulty and operation continues after switching to the alternative device A′. This control flow is shown in FIG. 3. Thus, in the method of the present invention, by setting the values of Fa, Fa′, Fb, Fc, Fd, and Fbcd appropriately, even if devices B, C, and D connected to device A are regarded as abnormal because of an abnormality in device A, device A can be judged faulty and operation switched to the alternative device A′ before all of devices B, C, and D are disconnected as faulty, which raises the utilization efficiency of the system.

FIGS. 4 and 5 show other examples of system configurations to which the present invention is applied.

FIG. 4 is an example in which the control device itself is duplicated, instead of providing an alternative device for the controlled device A as in FIG. 1. FIG. 5 is an example in which the lower-level devices are not permanently connected to a higher-level device; instead, the connections between them can be changed freely by a switch. The method of the present invention has the same effect in these examples. As for how the various control counters are held, the same effect is obtained regardless of their organization and the type of physical medium, provided that a separate reference value is set for judging a failure of the higher-level device from the sum of the abnormality counts of the lower-level devices.

Next, the faulty device determination method of the present invention is explained using an example applied to a specific system. FIG. 6 shows the overall system configuration: CC is a central processing unit, MM is a main memory, CH is a channel device, and io is an input/output device; each io is controlled from CC through CH. MM and CC are each duplicated, and when the unit in use is recognized as faulty, operation continues after switching to the alternative unit. Each MM and each CH can be connected to either CC.

Counters C_CH0 and C_CH1, which count the abnormalities for CH0 and CH1 respectively, and C_TOTAL, which holds their sum, are provided. When these values exceed the reference values F_CH0, F_CH1, and F_TOTAL respectively, a failure of CH0, CH1, or CC, respectively, is recognized.

FIG. 7 shows the progression over time when a failure occurs in CH0 in such a system. In the figure, as time T passes, accesses to CH0 occur, and because CH0 is faulty they are detected as abnormal. Each time such an abnormality is detected, counter C_CH0 is incremented, and when its value exceeds the fixed value F_CH0, CH0 is recognized as faulty and disconnected from the system.

That is, for this kind of failure, F_TOTAL is not reached, and the channel device system is switched at the level of the channel device CH.

Next, FIG. 8 shows the case where the failure lies not in a channel but in the central processing unit itself or in the connection between CC0 and CH. As time T passes, accesses to CH0 and CH1 occur, but because the CC itself or the CC-CH connection is faulty, they are detected as abnormal even though CH0 and CH1 are normal. As a result, counters C_CH0, C_CH1, and C_TOTAL are incremented.

Assume now that the reference values F_CH0, F_CH1, and F_TOTAL are 3, 3, and 4.

In the conventional method, because there is no C_TOTAL, CH0 is regarded as faulty at time t0 and CH1 at time t1, both are disconnected from the system, and the io devices become unusable. In the method of the present invention, on the other hand, because C_TOTAL and F_TOTAL are provided, C_TOTAL > F_TOTAL holds at time t_TOTAL, so CC0 is recognized as faulty and replaced by the alternative unit CC1, after which normal operation can continue.
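
The comparison above can be traced step by step with the reference values 3, 3, 4 from the text. The sketch below is hypothetical in its details: the function name `trace`, the strictly alternating CH0/CH1 access pattern, and the zero-based step indices (which are not the t0, t1, and t_TOTAL labels of FIG. 8) are assumptions.

```python
def trace(accesses, limits, total_limit=None):
    """Return (step, device judged faulty) pairs for a stream of abnormal accesses.

    total_limit=None models the conventional scheme, which has no C_TOTAL.
    """
    counts = {ch: 0 for ch in limits}
    verdicts = []
    for step, ch in enumerate(accesses):
        if any(dev == ch for _, dev in verdicts):
            continue                              # channel already cut off
        counts[ch] += 1
        if total_limit is not None and sum(counts.values()) > total_limit:
            verdicts.append((step, "CC0"))        # switch to the alternative CC1
            break                                 # assumed: abnormalities stop after the switch
        if counts[ch] > limits[ch]:
            verdicts.append((step, ch))
    return verdicts


limits = {"CH0": 3, "CH1": 3}                     # F_CH0, F_CH1 from the text
accesses = ["CH0", "CH1"] * 4                     # assumed alternating access pattern

print(trace(accesses, limits))                    # [(6, 'CH0'), (7, 'CH1')]
print(trace(accesses, limits, total_limit=4))     # [(4, 'CC0')]
```

Under these assumptions the conventional scheme cuts off both healthy channels, whereas with F_TOTAL = 4 the fifth abnormality pushes C_TOTAL to 5 and CC0 is judged faulty while both channel counters are still below their limits. Calling `trace(["CH0"] * 5, limits, total_limit=4)` reproduces the FIG. 7 case, in which only CH0 is disconnected.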

As described above, according to the method of the present invention, when a device fails, the lower-level devices connected to it are prevented from being regarded as abnormal and disconnected, and processing can continue normally by switching the higher-level device.

(6) Effects of the Invention According to the present invention, even when an abnormality in one device causes the devices connected beneath it to be regarded as abnormal, the higher-level device can be disconnected from the system as faulty on the basis of the sum of the abnormality counts of the lower-level devices, and operation can continue with an alternative device, so the utilization efficiency of the system is increased.

[Brief Description of the Drawings]

FIG. 1 is a system configuration diagram according to the present invention; FIG. 2 is the conventional control flow; FIG. 3 is the control flow of the present invention; FIGS. 4 and 5 are other system configuration diagrams to which the present invention can be applied; FIG. 6 shows an embodiment of a specific system configuration to which the present invention is applied; FIG. 7 shows a time chart and counter progression as one example of a failure; and FIG. 8 shows a time chart and counter progression as another example of a failure. X: control device; A, A′, B, C, D: devices; FCNT: abnormality counters; FB: reference values.

Claims (1)

[Claims] 1. A faulty device determination method for a system in which a plurality of lower-level devices are connected to a higher-level device and operated hierarchically, which has means for counting the number of abnormalities for each device, and which regards a device counted to a predetermined value or more by that means as faulty, disconnects it from the system, and continues operation by switching to an alternative device, characterized in that: totaling means for summing the abnormality counts of the individual devices and a reference value for that total, above which a failure is recognized, are provided; when a value counted by the counting means for a device exceeds the predetermined value for that device, that device is judged faulty and disconnected from the system; and when the total computed by the totaling means exceeds the reference value for the total, the higher-level device of those devices is judged faulty and is switched to an alternative device.
JP58056370A 1983-03-31 1983-03-31 Decision system of faulty device Granted JPS59194253A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58056370A JPS59194253A (en) 1983-03-31 1983-03-31 Decision system of faulty device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58056370A JPS59194253A (en) 1983-03-31 1983-03-31 Decision system of faulty device

Publications (2)

Publication Number Publication Date
JPS59194253A JPS59194253A (en) 1984-11-05
JPH0320774B2 JPH0320774B2 (en) 1991-03-20

Family

ID=13025366

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58056370A Granted JPS59194253A (en) 1983-03-31 1983-03-31 Decision system of faulty device

Country Status (1)

Country Link
JP (1) JPS59194253A (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0638243B2 (en) * 1986-03-19 1994-05-18 日本電気株式会社 Redundant terminal control unit configuration change method
JPH0293738A (en) * 1988-09-29 1990-04-04 Pfu Ltd Interrupt processing method
JPH02128233A (en) * 1988-11-09 1990-05-16 Nec Corp Fault processor
JP2758742B2 (en) * 1991-07-19 1998-05-28 日本電気株式会社 Malfunction detection method
JP2007304687A (en) * 2006-05-09 2007-11-22 Hitachi Ltd Cluster configuration and control method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS598461B2 (en) * 1976-09-10 1984-02-24 昭和アルミニウム株式会社 How to connect aluminum tubes with fins for heat exchangers

Also Published As

Publication number Publication date
JPS59194253A (en) 1984-11-05

Similar Documents

Publication Publication Date Title
JP2006072717A (en) Disk subsystem
JPH0320774B2 (en)
US20100162269A1 (en) Controllable interaction between multiple event monitoring subsystems for computing environments
JPS6027041B2 (en) How to switch lower control devices in Hiaraki control system
JP2001060160A (en) Cpu duplex system for controller
JPH11296311A (en) Fault-tolerant control method for storage devices
CN118034016A (en) Dual-redundancy input signal twice monitoring voting method
JP3147049B2 (en) Fabric failure detection method
JP2861595B2 (en) Switching control device for redundant CPU unit
JPS6112580B2 (en)
JPS5911455A (en) Redundancy system of central operation processing unit
JPS61134846A (en) Electronic computer system
KR0176085B1 (en) Error Detection Method of Processor Node and Node Connection Network in Parallel Processing Computer System
JP2946541B2 (en) Redundant control system
JPS5941345B2 (en) Electronic exchange emergency control circuit
SU928335A1 (en) Device for switching off external devices from the communication line that connects external devices to computer
JPS61170133A (en) Counter circuit
JPS62166401A (en) Multiplexing system for electronic computer
JPS6213700B2 (en)
JPS6282453A (en) Input/output control system
JPS617901A (en) Digital control device
JPH05216703A (en) Information processor
JPS60205706A (en) Process control device
JPS58107965A (en) System constituting system
JPS6339065A (en) Data transfer device