JPH0661065B2

JPH0661065B2 - Cache memory control method

Info

Publication number: JPH0661065B2
Application number: JP63159699A
Authority: JP
Inventors: 克己中村
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1988-06-28
Filing date: 1988-06-28
Publication date: 1994-08-10
Anticipated expiration: 2009-08-10
Also published as: JPH028946A

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a cache memory control method for a multiprocessor system in which the processors share a main storage device.

[Prior Art] FIG. 4 is a block diagram showing the configuration of a cache memory control method described in, for example, "Computer Design" by Glen G. Langdon, Jr. (Computeach Press Inc., 1982). In the figure, 1 is a main storage device that stores the data required for data processing; 2 is a central processing unit (CPU) that shares the main storage device 1 and performs the arithmetic and control operations for data processing; 3 is a cache memory that speeds up data transfer between the central processing unit 2 and the main storage device 1; 4 is a read data path; 5 is a main storage read path; 6 is a write data path; 7 is a load-through path; and 8 is a store-through path.

Next, the operation will be described. When the central processing unit 2 accesses data in the main storage device 1, the cache memory 3 stores a copy of that data. Data that has been used once is generally considered likely to be used again, so subsequent accesses can often be served from the cache memory 3. The cache memory 3 can normally be accessed much faster than the main storage device 1. Therefore, when the data to be accessed is in the cache memory 3, the access completes quickly; when it is not, the main storage device 1 must be accessed, and the access is slow.

When the central processing unit 2 accesses data, it first issues a request for the required data, and a test is made as to whether the requested data exists in the cache memory 3. If it does, the data is read from the cache memory 3 and delivered directly over the read data path 4. If the required data is not in the cache memory 3, the data is first read from the main storage device 1 into the cache memory 3 over the main storage read path 5, and the cache memory 3 is then read. Alternatively, the data is read into the cache memory 3 over the main storage read path 5 and, at the same time, delivered to the central processing unit 2 over the load-through path 7. The same test is made for a write operation, and the data is written either to the cache memory 3 over the write data path 6 or to the main storage device 1 over the store-through path 8.
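The read sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the patent: the `Cache` class, its word-granular dictionary storage, and the hit/miss counters are assumptions introduced here, and the load-through variant is modeled simply by returning the fetched value immediately.

```python
class Cache:
    """Toy model of cache memory 3 in front of main storage device 1."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict: address -> value (hypothetical layout)
        self.lines = {}                 # cached copies: address -> value
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Test whether the requested data is already in the cache.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]  # corresponds to read data path 4
        # Miss: fetch the data over the main storage read path 5 ...
        self.misses += 1
        value = self.main_memory[address]
        self.lines[address] = value
        # ... and hand it to the CPU at once (load-through path 7).
        return value


main_memory = {0x10: 7, 0x14: 9}
cache = Cache(main_memory)
first = cache.read(0x10)   # miss, filled from main storage
second = cache.read(0x10)  # hit, served from the cache
```

The second read of the same address is served from the cache, which is the behavior the description relies on for its speed-up.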

It is therefore desirable to keep the most frequently used data in the cache memory 3, whose capacity is much smaller than that of the main storage device 1. Normally, data in the cache memory 3 is replaced by returning the least recently used data to the main storage device 1, so that the most recently used data stays in the cache memory 3 for a long time.
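The replacement policy described here is what is now commonly called least recently used (LRU). A minimal sketch follows, assuming a hypothetical `LRUCache` class with a fixed capacity counted in entries; the patent itself does not specify a data structure.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that evicts the least recently used entry when full."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity
        self.main_memory = main_memory  # dict: address -> value
        self.lines = OrderedDict()      # oldest use first, newest use last

    def access(self, address):
        if address in self.lines:
            # Mark the entry as the most recently used one.
            self.lines.move_to_end(address)
        else:
            if len(self.lines) >= self.capacity:
                # Return the least recently used entry to main storage.
                victim, value = self.lines.popitem(last=False)
                self.main_memory[victim] = value
            self.lines[address] = self.main_memory[address]
        return self.lines[address]


main_memory = {a: a * 10 for a in range(4)}
cache = LRUCache(capacity=2, main_memory=main_memory)
cache.access(0)
cache.access(1)
cache.access(0)  # address 0 becomes the most recently used entry
cache.access(2)  # cache is full, so address 1 is evicted
resident = list(cache.lines)
```

After these four accesses the cache holds addresses 0 and 2, because address 1 was the entry used longest ago.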

Further, as shown in FIG. 5, a more complicated problem arises in a multiprocessor system in which a plurality of central processing units, each having such a cache memory, are coupled together and share the main storage device 1. Because the central processing units 2a to 2d share the same main storage device 1, the cache memories 3a to 3d, which hold copies of its contents, may each contain a copy of the same data. If any central processing unit rewrites the contents of its cache memory, the pre-update copies of that data in the cache memories 3a to 3d of all the central processing units 2a to 2d are no longer correct and must be made invalid so that they are not used by mistake. This is called cache memory invalidation. It normally occurs every time a central processing unit performs a write.

FIG. 6 shows a data store operation (write operation) in one central processing unit 2 of the multiprocessor system. The flowchart assumes a store-through method, in which a store writes to the main storage device 1 even on a cache hit. In FIG. 6, step S1 determines whether the data is in the cache memory. If it is, the process moves to step S3 and stores the data in the cache memory 3; if it is not, the process moves to step S2 and stores the data in the main storage device 1. Step S4 determines whether the target data is in the cache memory of another central processing unit. If it is, the process moves to step S6 and invalidates the contents of that cache memory; if it is not, the process moves to step S5 and nothing is done.
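The store flow of FIG. 6 can be sketched as follows. The `Cpu` class and `store` function are hypothetical names introduced for illustration, caches are modeled per address rather than per block, and the step labels in the comments follow the description above. Because store-through is assumed, the sketch writes main storage on both hit and miss.

```python
class Cpu:
    """Toy central processing unit with a private cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # address -> value


def store(writer, others, main_memory, address, value):
    """Store-through write followed by invalidation of other copies."""
    # S1 / S3: on a hit, update the writer's own cache.
    if address in writer.cache:
        writer.cache[address] = value
    # Store-through: main storage is written on a hit as well as a miss (S2).
    main_memory[address] = value
    # S4 to S6: invalidate the copy in every other cache that holds it.
    invalidated = []
    for cpu in others:
        if address in cpu.cache:
            del cpu.cache[address]
            invalidated.append(cpu.name)
    return invalidated


main_memory = {0x20: 1}
cpu_a, cpu_b, cpu_c = Cpu("2a"), Cpu("2b"), Cpu("2c")
cpu_a.cache[0x20] = 1
cpu_b.cache[0x20] = 1
invalidated = store(cpu_a, [cpu_b, cpu_c], main_memory, 0x20, 5)
```

Only the unit that actually held a copy (2b) is invalidated; 2c had no copy, so nothing is done for it, matching step S5.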

As shown here, for a data store, whether it hits or misses in the cache, the contents of the cache memories of the other central processing units must be tested, and if the target data is also held by another central processing unit, that cache memory must be invalidated.

Conventional parallel processing assigns a plurality of jobs to a plurality of central processing units in order to improve the throughput of the system. In recent multiprocessor systems, by contrast, parallel processing is sometimes performed in which a plurality of central processing units execute a single job in order to improve the response of that job. In this case, the data of one block in the main storage device may be divided among a plurality of central processing units, and the data of that same block is frequently accessed by other central processing units.

For example, as shown in FIG. 2, when a loop in a program is to be processed in parallel by a plurality of central processing units, it is common to exploit so-called spatial parallelism and divide the processing by loop iteration. In this example, the DOALL statement indicates parallel processing; four central processing units are assumed, and each of the four expressions in the loop is assigned to one central processing unit. That is, the loop iterations are divided into four: the central processing unit 2a handles the 1st, 5th, 9th, ... iterations (the first expression), the central processing unit 2b handles the 2nd, 6th, 10th, ... iterations (the second expression), the central processing unit 2c handles the 3rd, 7th, 11th, ... iterations (the third expression), and the central processing unit 2d handles the 4th, 8th, 12th, ... iterations (the fourth expression). This deliberately divides spatially concentrated data, for example the elements of array A, among a plurality of central processing units, so the same block of array A elements ends up in the cache memories 3a to 3d of the central processing units 2a to 2d. To keep the data in the cache memories 3a to 3d valid, every time a central processing unit writes to that block, the address of the written block must be sent to the other central processing units, and any central processing unit holding a block with that address in its cache memory must invalidate it. A central processing unit whose block has been invalidated must, when it wants to use the block, fetch it again from the main storage device 1, even though only part of it has changed. In the example of FIG. 2, this situation arises for every single operation in each of the central processing units 2a to 2d. Each time, data in the cache memories 3a to 3d is invalidated and data is transferred between the cache memories 3a to 3d and the main storage device 1, which has a great impact on the performance of the multiprocessor system.
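A rough way to see the cost is to count block fetches and invalidations for one shared block under the conventional scheme. The sketch below makes several simplifying assumptions that are not in the patent: iterations are executed one after another in index order, each iteration reads the block before writing one element, and the function name `run_write_invalidate` is hypothetical.

```python
def run_write_invalidate(num_cpus, num_elements):
    """Count block fetches and invalidations for one shared block."""
    has_block = [False] * num_cpus  # does each CPU's cache hold the block?
    fetches = 0
    invalidations = 0
    for i in range(num_elements):
        cpu = i % num_cpus          # cyclic split of the loop iterations
        if not has_block[cpu]:
            # Read miss: fetch the whole block from main storage.
            has_block[cpu] = True
            fetches += 1
        # The write makes every other copy stale, so it is invalidated.
        for other in range(num_cpus):
            if other != cpu and has_block[other]:
                has_block[other] = False
                invalidations += 1
    return fetches, invalidations


fetches, invalidations = run_write_invalidate(num_cpus=4, num_elements=8)
```

Under these assumptions every one of the eight iterations has to fetch the block again, because the previous writer invalidated it in the meantime.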

[Problems to be Solved by the Invention] Because the conventional cache memory control method operates as described above, in a multiprocessor system that executes parallel processing, an invalidation request for cache memory contents is issued on every write by every central processing unit, and data is transferred between the cache memory and the main storage device each time. This significantly degrades processing performance.

The present invention has been made to solve the above problem. Its object is to obtain a cache memory control method that improves processing performance without requiring the central processing unit that performed a write operation to invalidate the cache memories of all the other central processing units.

[Means for Solving the Problems] In the cache memory control method according to the present invention, among a plurality of central processing units sharing a main storage device, the central processing unit that performs a write operation transfers the write data to all the other central processing units, and the write data is written into the cache memory of each central processing unit that the recognition means indicates is executing parallel data processing.

[Operation] In the cache memory control method of the present invention, the central processing unit that performs a write operation does not send the written address to all the other central processing units sharing the main storage device in order to invalidate their cache memories. Instead, it transfers the write data, together with the address of the written data, to all the other central processing units, and the transferred write data is written into the cache memory of each central processing unit indicated as executing parallel data processing.

[Embodiment of the Invention] FIG. 1 is a block diagram showing the configuration of a multiprocessor system that adopts a cache memory control method according to an embodiment of the present invention. In the figure, 1 is a main storage device that stores the data required for data processing; 2a to 2d are a plurality of central processing units (CPUs) that share the main storage device 1 and perform the arithmetic and control operations for data processing; 3a to 3d are a plurality of cache memories, provided one for each of the central processing units 2a to 2d, that speed up data transfer between the central processing units 2a to 2d and the main storage device 1; 10 is a global memory path through which the central processing units 2a to 2d access the main storage device 1; 11 is a data path that serves as transfer means for transferring data directly between the cache memories 3a to 3d and that broadcasts write data to all the central processing units 2a to 2d during a write operation; and 9a to 9d are parallel processing mode flags serving as recognition means which, when parallel data processing is being executed by two or more designated central processing units, identify a central processing unit as one of the designated group among the central processing units 2a to 2d.

Next, the operation will be described.

FIG. 1 shows a multiprocessor system having four central processing units 2a to 2d. When parallel processing is performed on this multiprocessor system, the simplest and most common technique is to divide the loop portions of the program. Assuming that parallelization is done automatically by a compiler or the like, the division is often this simple because parallelization is difficult. For example, as shown in FIG. 2, a loop with many iterations is divided into four and transformed as indicated by the DOALL statement. The DOALL statement indicates parallel processing, and each of the four expressions in the loop is assigned to one central processing unit. That is, the loop iterations are divided into four: the central processing unit 2a handles the 1st, 5th, 9th, ... iterations (the first expression), the central processing unit 2b handles the 2nd, 6th, 10th, ... iterations (the second expression), the central processing unit 2c handles the 3rd, 7th, 11th, ... iterations (the third expression), and the central processing unit 2d handles the 4th, 8th, 12th, ... iterations (the fourth expression).
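The iteration assignment described above is a cyclic distribution. Since FIG. 2 is not reproduced in this text, the following is only a sketch of that assignment rule; the helper `iterations_for` and the choice of twelve iterations are assumptions made for illustration.

```python
def iterations_for(cpu_index, num_cpus, total_iterations):
    """Return the 1-based iteration numbers handled by one CPU.

    cpu_index is 0-based: 0 for unit 2a, 1 for unit 2b, and so on.
    """
    return list(range(cpu_index + 1, total_iterations + 1, num_cpus))


names = ["2a", "2b", "2c", "2d"]
schedule = {name: iterations_for(k, len(names), 12) for k, name in enumerate(names)}
```

For twelve iterations this reproduces the split in the text: unit 2a gets iterations 1, 5, 9 and unit 2d gets iterations 4, 8, 12.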

FIG. 3 shows the data in the cache memories 3a to 3d of the central processing units 2a to 2d at this time. Suppose that array elements A1 to A8 are in the same block in the main storage device 1. This block is read into all four central processing units 2a to 2d, and all four write to it. In this case, the central processing unit 2a writes element A1, the central processing unit 2b writes the adjacent element A2, the central processing unit 2c writes the next element A3, and the central processing unit 2d writes element A4, all within the same block. This pattern continues, so that each central processing unit always writes to the element adjacent to the one just written by its neighbor. At this time, the parallel processing mode flags 9a to 9d in the central processing units 2a to 2d are all on, indicating that all of them are in parallel processing mode. Accordingly, during a write operation, instead of sending the written address to the other central processing units and invalidating their cache memories, the writing unit broadcasts the write data, together with its address, over the data path 11 to all the central processing units. Each central processing unit 2a to 2d whose parallel processing mode flag 9a to 9d is on and which holds the block containing the written data takes in the broadcast write data and actually writes it into its cache memory 3a to 3d. The contents of the cache memories 3a to 3d thus remain valid, and invalidation is no longer necessary.
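The broadcast-and-update behavior can be sketched as follows. The `Cpu` class, `snoop` method, and `broadcast_store` function are hypothetical names, and store-through to main storage is assumed as in the prior-art flow. The text does not say what a unit with its flag off does with a broadcast; the sketch assumes it falls back to conventional invalidation, which is labeled as an assumption in the code.

```python
class Cpu:
    """Toy central processing unit with a cache and a mode flag."""

    def __init__(self, name, parallel_mode=False):
        self.name = name
        self.parallel_mode = parallel_mode  # parallel processing mode flag 9a-9d
        self.cache = {}                     # block_id -> list of words

    def snoop(self, block_id, offset, value):
        """Handle a write broadcast arriving on data path 11."""
        if block_id not in self.cache:
            return False
        if self.parallel_mode:
            # Take in the write data; the cached block stays valid.
            self.cache[block_id][offset] = value
            return True
        # Assumption (not stated in the text): flag off means invalidate.
        del self.cache[block_id]
        return False


def broadcast_store(writer, cpus, main_memory, block_id, offset, value):
    """Write one word and broadcast address plus data to all other CPUs."""
    if block_id in writer.cache:
        writer.cache[block_id][offset] = value
    main_memory[block_id][offset] = value  # store-through assumed
    return [cpu.name for cpu in cpus
            if cpu is not writer and cpu.snoop(block_id, offset, value)]


main_memory = {0: [0] * 8}  # one block holding elements A1 to A8
cpus = [Cpu(n, parallel_mode=True) for n in ("2a", "2b", "2c")] + [Cpu("2d")]
for cpu in cpus:
    cpu.cache[0] = list(main_memory[0])
updated = broadcast_store(cpus[0], cpus, main_memory, 0, 0, 11)
```

Units 2b and 2c, whose flags are on, receive the new value in place and keep their blocks, so their next access to the block does not need to go back to main storage.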

The above embodiment was described for a multiprocessor system operating in parallel processing mode, but some benefit can also be expected when the system runs in normal mode. In the embodiment, only during parallel processing, store data (write data) is broadcast to all the central processing units instead of invalidating cache memories on a store, and is written into the cache memories of the central processing units whose parallel processing mode flag is on. This is based on the observation that, during parallel processing, loops are usually divided as described above, so it is very likely that a plurality of central processing units share the data of the same block. In actual jobs, however, the user sometimes declares explicitly in the program that a plurality of central processing units are to be used. Data may be shared among central processing units in this case as well. The intended effect can therefore also be obtained if the user, knowing that the same block of data will be accessed by a plurality of central processing units, designates in software the central processing units to be used, and the designated central processing units broadcast store data without invalidating caches.

[Effects of the Invention] As described above, according to the present invention, among a plurality of central processing units sharing a main storage device, the central processing unit that performs a write operation transfers the write data to all the other central processing units, and the write data is written into the cache memory of each central processing unit that the recognition means indicates is executing parallel data processing. Instead of invalidating cache memories on every write operation, the write data is written into the cache memories of the recognized central processing units. The data in the cache memories can therefore be kept valid, data is exchanged less often between the cache memories and the main storage device, and processing performance improves.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a multiprocessor system adopting a cache memory control method according to an embodiment of the present invention. FIG. 2 shows an example of a typical program processed in parallel in this embodiment. FIG. 3 shows how data is divided among the cache memories of the central processing units when that program is executed in this embodiment. FIG. 4 is a block diagram showing the configuration of a conventional cache memory control method. FIG. 5 is a block diagram showing the configuration of a multiprocessor system adopting the conventional cache memory control method. FIG. 6 is a flowchart showing the operation of the cache memory in this conventional example. 1: main storage device; 2a to 2d: central processing units; 3a to 3d: cache memories; 11: data path (transfer means); 9a to 9d: parallel processing mode flags (recognition means).

Claims

[Claims]

1. A cache memory control method for a system comprising a main storage device for storing data required for data processing, a plurality of central processing units that share the main storage device and perform arithmetic operations and control relating to data processing, and a plurality of cache memories provided corresponding to each of the plurality of central processing units, characterized in that, among the plurality of central processing units, the central processing unit executing a write operation transfers write data to all the other central processing units, and the write data is written into the cache memory corresponding to each central processing unit indicated by the recognition means.