JPS6132702B2

Info

Publication number: JPS6132702B2
Application number: JP56047242A
Authority: JP
Inventors: Yoshuki Horikoshi
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1981-04-01
Filing date: 1981-04-01
Publication date: 1986-07-29
Also published as: JPS57164345A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a method for detecting failed microcomputers in a shared-memory composite microcomputer system.

Conventionally, whether a microcomputer is operating has been detected with a watchdog timer provided outside the microcomputer. The microcomputer periodically restarts the timer by issuing a command to it. When the microcomputer fails, it stops issuing these commands, the watchdog timer times out, and the timeout reveals that the microcomputer is inoperative.
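The conventional arrangement can be illustrated with a small Python sketch. The `WatchdogTimer` class, its method names, and the timeout value are hypothetical and chosen only for illustration; the patent describes an external hardware timer and gives no concrete interface.

```python
class WatchdogTimer:
    """Toy model of an external watchdog timer (hypothetical interface)."""

    def __init__(self, timeout: int):
        self.timeout = timeout   # assumed timeout, in arbitrary time units
        self.last_kick = 0       # time of the most recent restart command

    def kick(self, now: int) -> None:
        """Restart command issued periodically by a healthy microcomputer."""
        self.last_kick = now

    def timed_out(self, now: int) -> bool:
        """True once no restart command has arrived within the timeout."""
        return now - self.last_kick > self.timeout


wd = WatchdogTimer(timeout=5)
for now in range(0, 20, 3):      # healthy: restart commands at t = 0, 3, ..., 18
    wd.kick(now)
healthy = wd.timed_out(20)       # last command at t = 18, so no timeout yet
failed = wd.timed_out(30)        # commands stopped after t = 18, so timeout
```

The sketch monitors only one microcomputer, which is the limitation the following paragraph points out.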

This method is effective for detecting the failure of a single microcomputer. In a composite microcomputer in which several microcomputers are coupled together, however, the microcomputers must detect one another's failures, and for that purpose the method was not sufficient.

The object of the present invention is to provide a failure detection method for a composite microcomputer that reliably detects an inoperative microcomputer, which was not possible with such standalone watchdog timers. The method requires no special hardware: operation monitoring counters are provided in the shared memory common to all the microcomputers, and detection is performed simply by periodically sampling the data that shows how far each counter has been incremented and comparing the sampled values.

The present invention is described below with reference to the embodiment shown in the drawings.

FIG. 1 shows a composite microcomputer in which a plurality of microcomputers are interconnected through a shared memory that every microcomputer can access. Reference numerals 1₁, 1₂, 1₃, ..., 1n denote the microcomputers, which are provided with main memories 2₁, 2₂, 2₃, ..., 2n, respectively. Reference numeral 3 denotes the shared memory that can be accessed in common from the microcomputers 1₁, 1₂, 1₃, ..., 1n. It holds operation monitoring counters C₁, C₂, C₃, ..., Cn, one for each of the microcomputers 1₁, 1₂, 1₃, ..., 1n, and each counter is incremented by a command from its corresponding microcomputer. An operation monitoring counter is incremented by reading the data at a specific address in the shared memory 3 into a register (not shown) of the microcomputer, adding 1 to the value, and storing the result back at the same address in the shared memory 3.
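The read, add-one, store-back increment described above can be sketched in Python as follows. This is only an illustration: the `shared_memory` list, the `COUNTER_BASE` address, and the 16-bit wraparound are assumptions introduced here, since the patent does not specify addresses or counter width.

```python
# Hypothetical model of shared memory 3 as a flat list of words.
COUNTER_BASE = 0x10   # assumed address of counter C1 (not from the patent)
WORD_MASK = 0xFFFF    # assumed 16-bit counter width (not from the patent)


def counter_address(index: int) -> int:
    """Address of operation monitoring counter C<index> (1-based)."""
    return COUNTER_BASE + (index - 1)


def increment_counter(shared_memory: list[int], index: int) -> int:
    """Read the counter into a local 'register', add 1, store it back."""
    addr = counter_address(index)
    register = shared_memory[addr]           # read from the shared memory
    register = (register + 1) & WORD_MASK    # add 1, wrapping at the assumed width
    shared_memory[addr] = register           # store at the same address
    return register


shared_memory = [0] * 64
increment_counter(shared_memory, 2)   # microcomputer 1_2 increments C2
increment_counter(shared_memory, 2)
```

In the scheme as described, each counter is written only by its own microcomputer and merely read by the others, so the sketch does not model any locking between writers.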

Meanwhile, each of the main memories 2₁, 2₂, 2₃, ..., 2n of the microcomputers 1₁, 1₂, 1₃, ..., 1n contains a new data area ND and an old data area OD for every microcomputer other than its owner. Each new data area ND stores the periodically sampled value of the corresponding operation monitoring counter C in the shared memory 3, and each old data area OD stores the previous sampled value, transferred from the new data area. Each microcomputer compares the value of each new data area with the value of the corresponding old data area to detect whether each of the other microcomputers is operating. For example, the microcomputer 1₁ has, in its main memory 2₁, new data areas ND₂, ND₃, ..., NDn and old data areas OD₂, OD₃, ..., ODn corresponding to the microcomputers 1₂, 1₃, ..., 1n, and it detects whether the microcomputers 1₂, 1₃, ..., 1n are operating by comparing the values in these data areas.
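The new/old data areas and their comparison can be sketched in Python as below. The `PeerMonitor` class and its method names are hypothetical, and priming the old data areas from the counter values at start-up is an assumption made here so that the first comparison is meaningful; the patent does not describe initialization.

```python
class PeerMonitor:
    """ND/OD bookkeeping kept in one microcomputer's main memory (hypothetical)."""

    def __init__(self, own_index: int, counters: dict[int, int]):
        # One ND/OD pair for every microcomputer except this one.
        self.peers = [i for i in sorted(counters) if i != own_index]
        self.new_data = {i: counters[i] for i in self.peers}   # ND_i
        self.old_data = {i: counters[i] for i in self.peers}   # OD_i (assumed priming)

    def sample(self, counters: dict[int, int]) -> dict[int, bool]:
        """Sample each peer counter; True means the counter has advanced."""
        operating = {}
        for i in self.peers:
            self.new_data[i] = counters[i]                        # store sample in ND_i
            operating[i] = self.new_data[i] != self.old_data[i]   # compare ND_i with OD_i
            self.old_data[i] = self.new_data[i]                   # move ND_i to OD_i
        return operating


# Microcomputer 1_1 watching counters C2 and C3 (values are illustrative).
monitor = PeerMonitor(own_index=1, counters={1: 0, 2: 0, 3: 0})
status = monitor.sample({1: 5, 2: 2, 3: 0})
```

Here `status` marks microcomputer 2 as operating and microcomputer 3 as inoperative, because C3 did not change between the two samples.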

The period at which the operation monitoring counters C in the shared memory 3 are incremented and the period at which the microcomputers sample those counters are related as follows: the sampling period is set longer than the increment period, so that a counter is never sampled two or more times within the span of a single increment.
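The reason for this constraint can be checked with a small simulation. The function below is a hypothetical sketch using integer time ticks and a healthy counter that increments exactly once per increment period; it counts how often two consecutive samples see the same value, which the method would misread as a failure.

```python
def false_alarms(increment_period: int, sampling_period: int, duration: int) -> int:
    """Count samples at which a healthy counter appears unchanged."""
    alarms = 0
    previous = 0                                  # counter value at time 0
    for t in range(sampling_period, duration + 1, sampling_period):
        current = t // increment_period           # healthy counter value at time t
        if current == previous:                   # no change seen since last sample
            alarms += 1
        previous = current
    return alarms


slow_sampling = false_alarms(increment_period=10, sampling_period=15, duration=300)
fast_sampling = false_alarms(increment_period=10, sampling_period=4, duration=300)
```

With the sampling period longer than the increment period, at least one increment falls between any two samples, so `slow_sampling` is 0. With the shorter sampling period, `fast_sampling` is greater than 0 even though the microcomputer never failed.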

Next, the means for detecting an inoperative microcomputer according to the present invention is described concretely with reference to FIG. 2. FIG. 2 shows the microcomputers 1₁, 1₂, 1₃, ..., 1n incrementing their operation monitoring counters C₁, C₂, C₃, ..., Cn in the shared memory 3 at predetermined periods. It also shows the microcomputer 1₁ sampling the values of the counters C₂, C₃, ..., Cn, that is, every counter other than its own counter C₁, at a period longer than the longest of the increment periods, and comparing, at each sampling, the newly sampled value in the new data area with the previously sampled value in the old data area. As explained below, FIG. 2 illustrates the case in which the microcomputer 1n has failed and become inoperative.

Suppose that, within one sampling period, the microcomputer 1₂ increments its counter C₂ by 2, the microcomputer 1₃ increments C₃ by 4, and the microcomputer 1n increments Cn by 2. At sampling timings t₂ and t₃, the differences between the values of the new data areas ND₂, ND₃, ..., NDn and the old data areas OD₂, OD₃, ..., ODn, that is, the counts accumulated by the counters C₂, C₃, ..., Cn since the previous sample, are 2, 4, and 2, respectively. This shows that the microcomputers 1₂, 1₃, ..., 1n are operating normally. At sampling timing t₄, however, the counts for C₂ and C₃ are again 2 and 4, but the count for Cn is zero: Cn has not been incremented even once since the previous sampling at t₃. This shows that the microcomputers 1₂ and 1₃ are operating normally but the microcomputer 1n is not. The microcomputer 1₁ thereby detects that the microcomputer 1n is inoperative. Although not described here, the microcomputers 1₂, 1₃, ..., 1ₙ₋₁ detect the failure of the microcomputer 1n by the same means.
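The worked example can be reproduced with a few lines of Python. The per-period increments (2, 4, 2) and the failure of microcomputer 1n after t₃ come from the description; the absolute counter values at each sampling time, starting from zero, are assumed here for illustration.

```python
# Counter values seen by microcomputer 1_1 at each sampling time (assumed
# to start from zero; only the per-period increments are given in the text).
samples = {
    "t1": {"C2": 2, "C3": 4, "Cn": 2},
    "t2": {"C2": 4, "C3": 8, "Cn": 4},
    "t3": {"C2": 6, "C3": 12, "Cn": 6},
    "t4": {"C2": 8, "C3": 16, "Cn": 6},   # Cn stopped advancing after t3
}


def differences(samples: dict[str, dict[str, int]]) -> dict[str, dict[str, int]]:
    """New-minus-old value of every counter at each sampling time after the first."""
    times = list(samples)
    return {
        cur: {name: samples[cur][name] - samples[prev][name] for name in samples[cur]}
        for prev, cur in zip(times, times[1:])
    }


diffs = differences(samples)
# Counters whose difference is zero belong to inoperative microcomputers.
inoperative = {t: [name for name, d in row.items() if d == 0] for t, row in diffs.items()}
```

The differences at t₂ and t₃ are 2, 4, and 2, and at t₄ they are 2, 4, and 0, so only Cn is flagged, and only at t₄.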

Note that the transfer of data from the new data area ND to the old data area OD is performed at the same time as the values in the two areas are compared, so no count error arises from the timing of the transfer.

Furthermore, when a count is smaller than its usual value, as indicated by the chain line in FIG. 2, the microcomputer is treated as operating normally, although depending on the circumstances it may instead be treated as inoperative.
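The two possible treatments of a lower-than-usual count can be expressed as a single decision function. This is a hypothetical sketch: the `expected` count and the `treat_low_as_inoperative` switch are names introduced here, and the patent does not say how the usual value would be determined.

```python
def classify(delta: int, expected: int, treat_low_as_inoperative: bool = False) -> str:
    """Classify a microcomputer from the change in its counter since the last sample."""
    if delta == 0:
        return "inoperative"                  # counter did not advance at all
    if delta < expected and treat_low_as_inoperative:
        return "inoperative"                  # optional stricter policy
    return "operating"                        # default: any advance counts as operating


default_policy = classify(delta=1, expected=4)
strict_policy = classify(delta=1, expected=4, treat_low_as_inoperative=True)
```

Under the default policy a count of 1 against a usual value of 4 is still classified as operating; under the stricter policy the same count is classified as inoperative.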

As described above, according to the present invention, in a composite microcomputer system in which a plurality of microcomputers are interconnected through a shared memory that every microcomputer can access, the microcomputers can monitor one another's operating states without special hardware. There is therefore no need to provide a separate watchdog timer for each microcomputer, and an inoperative microcomputer can be detected simply and reliably. This failure detection method also provides the basic means by which, when one microcomputer in the composite system fails, the other microcomputers can take over for it.

[Brief explanation of the drawings]

FIG. 1 is a conceptual diagram of the present method in a composite microcomputer in which a plurality of computers are coupled through a shared memory that every computer can access. FIG. 2 is an explanatory diagram of the means for detecting an inoperative computer by the present method.

1₁, 1₂, 1₃, ..., 1n: microcomputers; 2₁, 2₂, 2₃, ..., 2n: main memories; ND₁, ND₂, ND₃, ..., NDn: new data areas; OD₁, OD₂, OD₃, ..., ODn: old data areas; 3: shared memory; C₁, C₂, C₃, ..., Cn: operation monitoring counters.

Claims

[Claims]

1. A failure detection method for a composite microcomputer, for use in a shared-memory composite microcomputer system in which a plurality of microcomputers are interconnected via a shared memory common to the microcomputers, wherein: each of the microcomputers is provided with a main memory including new data areas and old data areas, each old data area storing the immediately preceding data transferred from the corresponding new data area; operation monitoring counters, each of which can be incremented by a command from its microcomputer, are provided on the shared memory in correspondence with the respective microcomputers; each of the microcomputers samples the contents of every operation monitoring counter other than the one corresponding to itself, at a period longer than the increment period of the operation monitoring counters, stores the sampled contents in the respective new data areas of its own main memory, and at the same time compares the value of each new data area with the value of the corresponding old data area; and a microcomputer for which the comparison yields a predetermined difference is judged to be operating, while a microcomputer for which the comparison yields no difference is judged to be inoperative, whereby an inoperative microcomputer in the composite microcomputer is detected.