JP5477206B2

JP5477206B2 - Compiling device and compiling program

Info

Publication number: JP5477206B2
Application number: JP2010153528A
Authority: JP
Inventors: 雅和上野
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2010-07-06
Filing date: 2010-07-06
Publication date: 2014-04-23
Anticipated expiration: 2030-07-06
Also published as: JP2012018434A

Description

The present application relates to a compiling device and a compiling program.

A technique called a "profiler" is used to quickly locate the parts of a program that are bottlenecks in its execution performance and to find ways to improve them. A profiler acquires, while the program runs, PA (Profile Analysis) information (performance analysis information, hardware information) that represents the program's run-time behavior, such as the number of function calls, the number of cycles, and the number of cache misses. Profilers fall broadly into the following two types.

- At run time, rough PA information for the program as a whole is acquired through interrupts at regular intervals.

- Function calls for acquiring PA information are added to the source program by hand, and detailed PA information is acquired for one part of that source program.

The latter method is chosen when detailed PA information is needed. However, optimization at compile time may generate loops or processing blocks that do not appear in the source program. In that case, even if PA information acquisition functions have been written into the source program, detailed PA information may not be obtainable for the new loops or processing blocks generated during compilation.
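As a rough illustration of the second (manual) method, the following Python sketch brackets one loop with hand-written acquisition calls. The names `pa_start`, `pa_stop`, and `pa_records` are hypothetical and do not come from the patent, and elapsed wall-clock time stands in for the cycle counts and cache-miss counts a real profiler would read from hardware.

```python
import time

# Hypothetical store: PA range name -> elapsed seconds.
pa_records = {}

def pa_start(name):
    """Begin measuring the named range (hypothetical helper)."""
    pa_records[name] = -time.perf_counter()

def pa_stop(name):
    """Finish measuring the named range (hypothetical helper)."""
    pa_records[name] += time.perf_counter()

def fill(a):
    pa_start("fill_loop")      # call added by hand before the loop
    for i in range(len(a)):
        a[i] = 3
    pa_stop("fill_loop")       # call added by hand after the loop
    return a
```

Because the calls sit in the source text, they can only bracket constructs that exist in the source; any loop the compiler creates later has no place to attach them.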

JP H09-62544 A
JP 2000-194566 A
JP 2004-110824 A

In one aspect, an object of the present invention is to generate an object program that can acquire PA information even for loops and processing blocks newly generated during compilation.

The compiling device disclosed herein has an acquisition means, a first conversion means, a first setting means, an optimization means, a second setting means, a function embedding means, and a second conversion means.

The acquisition means acquires a PA position file and a source program. The PA position file is a file describing a PA range, which is the range of the program over which PA information representing run-time behavior is to be acquired. The source program is a program written in a programming language.

The first conversion means converts the source program acquired by the acquisition means into a first program written in an intermediate language.

The first setting means reads the PA range from the PA position file acquired by the acquisition means and sets that PA range on the first program.

The optimization means applies optimization to the first program on which the first setting means has set the PA range, and generates a second program that inherits the PA range set on the first program.

The second setting means checks whether the number of loops or processing blocks within the PA range on the second program has increased compared with the number of loops or processing blocks within the PA range on the first program. If the number has increased, the second setting means sets a new PA range on each loop or processing block within the PA range on the second program.

The function embedding means embeds, in the second program, a PA information acquisition function for acquiring PA information, once for each PA range set on the second program.
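The embedding step can be pictured with a small Python sketch. It assumes, purely for illustration, that the second program is a list of statement strings and that each PA range is an inclusive pair of indices into that list; the emitted `call pa_start(k)` and `call pa_stop(k)` statements are hypothetical names, not the functions used by the patent's compiler.

```python
def embed_pa_calls(program, ranges):
    """Return a copy of program with start/stop calls around every PA range.

    program: list of statement strings (stand-in for the intermediate language)
    ranges:  list of (start, end) inclusive index pairs; an enclosing range
             must be listed before the ranges nested inside it
    """
    before = {}
    after = {}
    for rid, (start, end) in enumerate(ranges):
        # Outer ranges open first ...
        before.setdefault(start, []).append(f"call pa_start({rid})")
        # ... and close last, so stop calls are stacked in reverse order.
        after.setdefault(end, []).insert(0, f"call pa_stop({rid})")

    out = []
    for idx, stmt in enumerate(program):
        out.extend(before.get(idx, []))
        out.append(stmt)
        out.extend(after.get(idx, []))
    return out
```

Keying the calls by statement index avoids the index shifting that would occur if calls were inserted into the list one range at a time.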

Further, the second conversion means converts the second program, with the PA information acquisition functions embedded in it, into an object program executable on a computer.

The compiling program disclosed herein is executed in an arithmetic processing device that compiles programs, and builds the compiling device described above within that arithmetic processing device.

According to the compiling device and compiling program disclosed herein, detailed PA information can be acquired even for new loops and processing blocks generated during compilation. A detailed analysis of program execution performance therefore becomes possible.

A diagram showing an example in which SIMD conversion is performed during optimization and multiple loops are generated as a result.
A diagram showing an example in which loop distribution is performed by optimization.
An external view of a computer that operates as an embodiment of the compiling device.
A hardware configuration diagram of the computer whose appearance is shown in FIG. 3.
A functional explanatory diagram of the compiling device as the embodiment.
A diagram showing the contents of the PA position file.
A diagram showing the optimization information.
A diagram illustrating the process of acquiring a source program and converting it into a first program in an intermediate language.
A flowchart showing the processing in the PA position file analysis unit 13 of FIG. 5.
A diagram illustrating the processing in the PA position file analysis unit 13.
A diagram illustrating the second program generated by optimization.
A diagram showing the first half of a flowchart of the processing of the PA acquisition function insertion unit 15 of FIG. 5.
A diagram showing the second half of the flowchart of the processing of the PA acquisition function insertion unit 15 of FIG. 5.
A diagram illustrating the second program at the stage where an increase in the number of loops has been recognized.
A diagram illustrating the state in which new PA ranges have been set on the second program.
A diagram illustrating the state in which PA information acquisition functions have been embedded in the second program.
A diagram illustrating the source program with the same content as the PA acquisition functions on the intermediate language image of FIG. 6 written out in it.
A diagram showing the source code image when "PA range setting location designation information" other than "all" is designated.
A diagram showing an example of the display screen during compilation.
A diagram showing a display image of the PA information obtained by executing the object program, stored in an executable file, in which the PA information acquisition functions are embedded.

Below, cases in which compile-time optimization increases the number of loops or processing blocks are illustrated first, and the embodiment is described afterwards.

FIG. 1 shows an example in which SIMD conversion is performed during optimization and, as a result, multiple loops are generated. Optimization is performed on an intermediate language suited to handling within the compiling device, but for readability FIG. 1 shows every program in source-program form, whether it is actually a source program or a program in the intermediate language.

The statement "do i = 1, 100" on line (1) of the source program in FIG. 1(A) means that the following processing is performed.

- First, i is set to 1 and the processing up to "enddo" on line (3) is executed.

- Then the processing up to "enddo" is repeated, incrementing i by 1 each time, until i = 100.

The statement "a(i) = 3" on line (2) assigns 3 to element a(i) of array a.

The "enddo" on line (3) marks the end of the loop that begins at "do i = 1, 100" on line (1).

The description here uses the simple loop shown in FIG. 1(A) as an example.

FIG. 1(B) shows the program produced from the program of FIG. 1(A) by optimization during compilation.

The "if" statement on line (1) of FIG. 1(B) means that when array a lies on an 8-byte boundary, the processing before "else" on line (5) (that is, through line (4)) is executed, and otherwise the processing after "else" on line (5) is executed.

It is assumed here that memory is accessed in units of 8 bytes and that the access boundaries are fixed by the hardware. The condition itself is not important to this description, so its details are omitted.

The statement "do i = 1, 99, 2" on line (2) means that i is first set to 1 and the processing up to "enddo" on line (4) is executed, and that this processing is then repeated, incrementing i by 2 each time, up to i = 99.

The statement "a(i:i+1) = 3" on line (3) assigns 3 to both a(i) and a(i+1).

Processing multiple data items with a single instruction, as on line (3) (here, the assignments a(i) = 3 and a(i+1) = 3), is called SIMD (Single Instruction Multiple Data). SIMD can be expected to speed up processing.

The "enddo" on line (4) marks the end of the loop that begins at "do i = 1, 99, 2" on line (2).

The "else" on line (5) means that the processing that follows it is performed when the condition of the "if" statement on line (1) (here, "array a lies on an 8-byte boundary") is not satisfied.

The statement "do i = 1, 100, 1" on line (6) means that i is first set to 1 and the processing up to "enddo" on line (8) is executed, and that this processing is then repeated, incrementing i by 1 each time, up to i = 100.

The statement "a(i) = 3" on line (7) assigns 3 to element a(i) of array a.

The "enddo" on line (8) marks the end of the loop that begins at "do i = 1, 100, 1" on line (6).

The "endif" on line (9) marks the end of the processing that begins at the "if" statement on line (1).

In short, FIG. 1(B) represents a program that speeds up processing with the SIMD operation on line (3) when array a lies on an 8-byte boundary, and otherwise performs simple sequential processing.
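The relationship between the two programs of FIG. 1 can be mimicked in Python as a sketch. The `aligned` flag stands in for the "array a lies on an 8-byte boundary" test, the two-element slice assignment stands in for the SIMD store, and the 1-based indices of the figure are shifted to 0-based; all function names are illustrative assumptions.

```python
N = 100  # loop trip count used in FIG. 1

def original_loop():
    """FIG. 1(A): one simple loop, do i = 1, 100."""
    a = [0] * N
    for i in range(N):
        a[i] = 3
    return a

def optimized(aligned):
    """FIG. 1(B): an if-else with one loop under each branch."""
    a = [0] * N
    if aligned:
        # Loop (a): do i = 1, 99, 2 with a(i:i+1) = 3
        for i in range(0, N - 1, 2):
            a[i:i + 2] = [3, 3]   # two elements per step, standing in for SIMD
    else:
        # Loop (b): do i = 1, 100, 1 with a(i) = 3
        for i in range(N):
            a[i] = 3
    return a
```

Both branches produce the same array as the original loop, yet only one of the two generated loops runs on any given execution, which is why a single measurement around the whole if-else cannot say which loop was responsible for the cost.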

In the program of FIG. 1(B), the block (c) enclosed by "if" and "endif" corresponds to the loop of FIG. 1(A). FIG. 1(A) consists of a single simple loop, whereas in FIG. 1(B) a loop has been generated under each of the "if" and "else" branches of the if-else statement. That is, in FIG. 1(B), block (c) contains two loops, (a) and (b).

If a function for acquiring PA information on the loop of FIG. 1(A) is embedded in the source program of FIG. 1(A), what is obtained is simply PA information for the whole of block (c) in FIG. 1(B); PA information for loop (a) alone or loop (b) alone cannot be obtained. Even with PA information for block (c), it may be unclear whether loop (a) or loop (b) was executed, and the analysis of the program's execution performance may be inadequate.

In the embodiment described later, separate PA information can be acquired for each of loop (a) and loop (b), which makes a detailed analysis of the program's execution performance possible.

The example above showed optimization generating a loop under each of the "if" and "else" branches of an if-else statement, but depending on how the program is written, a processing block may be generated instead. Real programs are usually more complex, but a processing block can be explained with the example of FIG. 1 as follows. In place of loop (b) shown in FIG. 1(B), a program may be generated in which the three operations "a(1) = 3", "a(2) = 3", and "a(3) = 3" are written out to run sequentially. In such a case, the sequence of the three operations "a(1) = 3", "a(2) = 3", and "a(3) = 3" is treated as one processing block, and in the embodiment described later, independent PA information can be acquired for such a processing block just as for a loop.

FIG. 2 shows an example in which loop distribution is performed by optimization.

FIG. 2(A) is a source program, and "do i = 1, n" on its line (1) means that the following processing is performed.

- First, i is set to 1 and the processing up to "enddo" on line (4) is executed.

- Then the processing up to "enddo" on line (4) is repeated, incrementing i by 1 each time, until i reaches n.

The statement "a(i) = i" on line (2) of FIG. 2(A) assigns i to element a(i) of array a.

The statement "b(i) = b(i-1) + a(i)" on line (3) adds the value of b(i-1) in array b to the value of a(i) and assigns the sum to b(i).

The "enddo" on line (4) marks the end of the loop that begins at "do i = 1, n" on line (1).

FIG. 2(B) shows the program produced from the source program of FIG. 2(A) by optimization.

The statement "do i = 1, n" on line (1) of FIG. 2(B) is the same as line (1) of FIG. 2(A).

The statement "a(i) = i" on line (2) of FIG. 2(B) is the same as line (2) of FIG. 2(A).

The statement "do i = 1, n" on line (4) of FIG. 2(B) is the same as "do i = 1, n" on line (1), except that the loop beginning at line (4) ends at "enddo" on line (6).

The statement "b(i) = b(i-1) + a(i)" on line (5) of FIG. 2(B) is the same as line (3) of FIG. 2(A). In other words, in FIG. 2 the loop spanning lines (1) to (4) of FIG. 2(A) corresponds to the whole of the range marked (c) in FIG. 2(B). FIG. 2(A) consists of a single simple loop, whereas in FIG. 2(B) loop distribution has been performed and the range marked (c) contains two loops, (a) and (b).
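Loop distribution as in FIG. 2 can be checked with a short Python sketch. Index 0 of each list holds the initial value b(0), which the figure leaves unspecified, so the `b0` parameter and the function names are assumptions made for illustration.

```python
def fused(n, b0=0):
    """FIG. 2(A): one loop computing a(i) and b(i) together."""
    a = [0] * (n + 1)
    b = [0] * (n + 1)
    b[0] = b0
    for i in range(1, n + 1):
        a[i] = i
        b[i] = b[i - 1] + a[i]
    return a, b

def distributed(n, b0=0):
    """FIG. 2(B): the same work split into loops (a) and (b)."""
    a = [0] * (n + 1)
    b = [0] * (n + 1)
    b[0] = b0
    for i in range(1, n + 1):      # loop (a)
        a[i] = i
    for i in range(1, n + 1):      # loop (b)
        b[i] = b[i - 1] + a[i]
    return a, b
```

The split is valid because loop (b) only reads values of `a` that loop (a) has already finished writing; the results are identical, but there are now two loops whose costs can differ.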

A simple source program is used here for illustration, but for a loop that performs complex computation, for example one that runs short of registers, splitting the loop into several loops as in FIG. 2(B) can sometimes make the computation more efficient. When the optimization judges that distributing a loop is more efficient, loop distribution like that of FIG. 2(B) may be performed.

The case of FIG. 2 is the same as that of FIG. 1: if a function for acquiring PA information is embedded in the source program of FIG. 2(A), PA information is simply acquired for the whole of range (c) in FIG. 2(B). PA information for loop (a) alone or loop (b) alone cannot be obtained. Consequently, when execution performance within range (c) is poor, it may not be possible to determine in detail whether the problem lies in loop (a) or in loop (b).

The embodiment described below therefore makes it possible to acquire independent PA information for loops (a) and (b) of FIG. 1(B) and FIG. 2(B) as well.

FIG. 3 is an external view of a computer that operates as an embodiment of the compiling device, and FIG. 4 is a hardware configuration diagram of the computer whose appearance is shown in FIG. 3.

The computer 100 shown in FIG. 3 has a main body 110 that houses a CPU (Central Processing Unit), a memory, a hard disk device, and other components described later. The computer 100 also has an image display unit 120 that displays images on a display screen 121 in response to instructions from the main body 110. The computer 100 further has a keyboard 130 for entering instructions, character information, and the like into the computer 100 through key operations by the user. In addition, the computer 100 has a mouse 140 with which the user moves a cursor displayed on the display screen 121 and enters instructions by selecting icons and the like displayed on the display screen 121.

The main body 110 has a CD/DVD loading slot 111 into which a CD (Compact Disc) or DVD (Digital Versatile Disc) is loaded (below, CDs and DVDs are not distinguished and are referred to as CD/DVD). The main body 110 houses a CD/DVD drive that drives the CD/DVD loaded through the CD/DVD loading slot 111.

As shown in FIG. 4, the main body 110 further has a CPU (Central Processing Unit) 112, a memory 113, a hard disk device 114, and a network interface 115. FIG. 4 also shows the CD/DVD drive 116, mentioned above, that drives the CD/DVD 117.

These elements are connected to one another, and to the image display unit 120, keyboard 130, and mouse 140 whose appearance is shown in FIG. 3, by a bus 150.

The hard disk device 114 stores various programs, data, and the like. A program stored in the hard disk device 114 is read out and loaded into the memory 113, and the program loaded into the memory 113 is executed by the CPU 112. The network interface 115 is an element that communicates with the outside via a LAN (Local Area Network), the Internet, or the like.

The hard disk device 114 stores a compiling program that creates an object program from a source program (see, for example, FIG. 1(A) and FIG. 2(A)).

The source program is stored on the CD/DVD 117 and read by the CD/DVD drive 116; alternatively, it may be received via the network interface 115. When the compiling program stored in the hard disk device 114 is loaded into the memory 113 and executed by the CPU 112, the computer 100 operates as an example of the compiling device. When the computer 100 operates as the compiling device, an object program is created from the source program acquired as described above. The created object program is first saved in the hard disk device 114 and is then either written to a CD/DVD by the CD/DVD drive 116 or transmitted via the network interface 115. In this way the object program is passed to the computer on which it is to run (not shown) and is executed there.

FIG. 5 is a functional explanatory diagram of the compiling device as the embodiment.

The compiling device 10 of FIG. 5 represents the functions realized within the computer shown in FIGS. 3 and 4 when the compiling program is executed on that computer.

The compiling device 10 of FIG. 5 takes in a source program 1 written in a programming language and compiles it, converting the source program 1 into an object program 2 that can be executed on the computer on which it is intended to run. The computer on which the object program 2 is executed may be the same computer as the one used to realize the functions of the compiling device 10, or it may be a different computer.

In this embodiment, the compiling device 10 takes in a PA position file 3 in addition to the source program 1, and generates an object program 2 in which functions for acquiring PA information are embedded according to the description in the PA position file 3. In this embodiment, the embedded functions acquire, as PA information, items such as the "number of cycles", "number of instructions", and "cache misses" within the section designated by a PA range.

The compiling device 10 has a source program/PA position file acquisition unit 11, a syntax analysis unit 12, a PA position file analysis unit 13, an optimization unit 14, a PA acquisition function insertion unit 15, and a code generation unit 16. Of these, the PA acquisition function insertion unit 15 has a PA range setting unit 151 and a function embedding unit 152.

The source program/PA position file acquisition unit 11 is an example of the acquisition means, and takes the source program 1 and the PA position file 3 into the compiling device 10. The source program 1 is a program written in a programming language such as Fortran, C, or C++. The PA position file 3 is a file describing a PA range, which is the range of the program over which PA information representing run-time behavior is to be acquired. Along with the PA range, the PA position file 3 also describes PA range setting location designation information, which designates where new PA ranges are to be set when the PA range setting unit 151 described later sets new PA ranges.

FIG. 6 shows the contents of the PA position file in this embodiment. The first line contains the mark "*" followed by one item of "PA range setting location designation information". This designation information takes one of the following four values:
"all"
"entirety-loop-only"
"main-loop-only"
"mod-loop-only"

These are explained with reference to FIG. 1(B) and FIG. 2(B).

"all" means that PA information is acquired for all of the loops (a), (b), and (c) in FIG. 1(B) and FIG. 2(B).

"entirety-loop-only" means that PA information is acquired only for loop (c), and that independent PA information for the individual loops (a) and (b) is not needed.

"main-loop-only" means that PA information is acquired only for the first of the child loops, that is, loop (a), and that PA information for loops (b) and (c) is not needed.

"mod-loop-only" means that PA information is acquired for each child loop other than the first (loop (b) in FIG. 1(B) and FIG. 2(B)), and that PA information for loops (a) and (c) is not needed.
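The four designations can be summarized as a small selection function. This is a sketch only: the function name and the representation of loops as labels are assumptions, with `entirety` standing for range (c) and `children` for the child loops in program order.

```python
def select_pa_targets(designation, entirety, children):
    """Return the loops for which PA information should be acquired.

    designation: one of the four values described for the PA position file
    entirety:    label of the enclosing range, e.g. "c"
    children:    labels of the child loops in program order, e.g. ["a", "b"]
    """
    if designation == "all":
        return [entirety] + list(children)
    if designation == "entirety-loop-only":
        return [entirety]
    if designation == "main-loop-only":
        return list(children[:1])      # only the first child loop
    if designation == "mod-loop-only":
        return list(children[1:])      # every child loop except the first
    raise ValueError(f"unknown designation: {designation!r}")
```

For the programs of FIG. 1(B) and FIG. 2(B), `select_pa_targets("mod-loop-only", "c", ["a", "b"])` yields only loop (b).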

The second line of the PA position file has the following meaning.

"Source file name" is the name of the source file containing the source program for which PA information is to be acquired, and "module name" is the name of the module in that source file for which PA information is acquired. "Start line number, end line number" are the line numbers giving the start position and end position, respectively, of the PA acquisition range within that module. The range between the start position and the end position is called the "PA range" here.

In FIG. 6 the PA position file is written in two lines, but further lines of the same form as the second line can be added from the third line onward, so that a single PA position file designates multiple PA ranges.
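A parser for such a file might look like the following Python sketch. The text states which fields exist but not the exact syntax, so the comma-separated layout of the range lines, the class name `PARange`, and the sample file names in the usage note are all assumptions.

```python
from dataclasses import dataclass

DESIGNATIONS = {"all", "entirety-loop-only", "main-loop-only", "mod-loop-only"}

@dataclass
class PARange:
    source_file: str
    module: str
    start_line: int
    end_line: int

def parse_pa_position_file(text):
    """Parse an assumed layout: '*<designation>' followed by range lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("*"):
        raise ValueError("first line must start with '*'")
    designation = lines[0][1:].strip()
    if designation not in DESIGNATIONS:
        raise ValueError(f"unknown designation: {designation!r}")

    ranges = []
    for ln in lines[1:]:
        # Assumed field order: source file, module, start line, end line.
        src, mod, start, end = [field.strip() for field in ln.split(",")]
        ranges.append(PARange(src, mod, int(start), int(end)))
    return designation, ranges
```

Given the hypothetical input `"*all\nsample.f90, sub1, 10, 12\n"`, the function returns the designation `"all"` and one `PARange` covering lines 10 to 12 of module `sub1`.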

Returning to FIG. 5, the description continues.

The syntax analysis unit 12 of the compiling device 10 shown in FIG. 5 is an example of a first conversion means. It converts the source program acquired by the source program / PA position file acquisition unit 11 into a first program written in an intermediate language suited to handling within the compiling device 10.

The PA position file analysis unit 13 is an example of a first setting means. It reads from the PA position file the PA range defined by the start position and the end position of the PA acquisition range, and sets that PA range (start position and end position) on the first program.

The optimization unit 14 is an example of an optimization means. It applies optimization processing to the first program on which the PA range has been set, and generates a second program that inherits the PA range set on the first program. Optimization processing here means processing that modifies the first program so that it executes more efficiently, such as the SIMD conversion and loop distribution shown in FIGS. 1 and 2.

The details of the optimization processing performed by the optimization unit 14 are recorded as optimization information 141.

FIG. 7 is a diagram showing the optimization information.

Here, the "source file name", "module name", and "target line number" identify the position at which the optimization processing changed the program. The "optimization type" identifies the kind of change that the optimization processing made. The "optimization type" includes, for example, "SIMD conversion" (see FIG. 1) and "loop distribution" (see FIG. 2).

The PA range setting unit 151, which is part of the PA acquisition function insertion unit 15 shown in FIG. 5, is an example of a second setting means. It checks whether the number of loops or processing blocks within the PA range on the second program has increased compared with the first program. If it has, a new PA range is set for each loop or processing block within the PA range on the second program. The new PA ranges are set based on the "PA range setting location designation information" described in the PA position file 3.

The function embedding unit 152, which is also part of the PA acquisition function insertion unit 15, is an example of a function embedding means. It embeds a PA information acquisition function, used to acquire PA information, in the second program for each PA range set on the second program.

Further, the code generation unit 16 is an example of a second conversion means. It converts the second program in which the PA information acquisition functions have been embedded into an object program 2 that can be executed on a computer. The source program 1 may be divided into a plurality of programs, such as a main routine and subroutines. In that case, the code generation unit 16 creates an object program for each of those programs, and the linker 17 then combines the object programs into the final executable object program.

Below, the operation of each part of the compiling device 10 shown in FIG. 5 is described further, using examples of a source program and a PA position file.

FIG. 8 is a diagram illustrating the process of acquiring a source program and converting it into a first program in the intermediate language.

FIG. 8(A) shows an example of a source program, FIG. 8(B) an example of a PA position file, and FIG. 8(C) an example of the first program in the intermediate language.

The label "source program (a.f90)" in FIG. 8(A) means that the source file containing this source program is named "a.f90".

Line (1) of this source program, "subroutine sub1(a,b,n)", indicates that line (1) onward is a subroutine named "sub1" that takes a, b, and n as arguments.

Line (2), "integer::n", indicates that n is an integer.

Line (3), "real,dimension(1000)::a,b", indicates that a and b are each arrays of 1000 real elements.

Lines (4) to (7) are identical to lines (1) to (4) of FIG. 2(A), respectively, and are not described again here.

Line (8), "end subroutine sub1", marks the end of the subroutine named "sub1".

The label "PA position file (parange.txt)" in FIG. 8(B) means that this PA position file is named "parange.txt".

"*all" on the first line of the PA position file in FIG. 8(B) indicates that the "PA range setting location designation information" described with reference to FIG. 6 is "all".

"a.f90", "sub1", "4", and "7" on the second line of the PA position file in FIG. 8(B) indicate that the source file name is "a.f90" (see FIG. 8(A)), that the module name is "sub1", and that the PA range extends from line 4 to line 7.

It is assumed here that the source program / PA position file acquisition unit 11 of the compiling device 10 shown in FIG. 5 has acquired the source program of FIG. 8(A) and the PA position file of FIG. 8(B).

In this case, the syntax analysis unit 12 of FIG. 5 creates, from the source program shown in FIG. 8(A), the first program written in the intermediate language and illustrated in FIG. 8(C).

In FIG. 8(C), "basic processing block (Line1)", "basic processing block (Line2)", "basic processing block (Line3)", and "basic processing block (Line8)" are the parts into which the statements on lines (1), (2), (3), and (8) of the source program of FIG. 8(A), respectively, have been converted into the intermediate language. "Loop (line4-7)" in FIG. 8(C) is the part in which the loop from line (4) to line (7) of FIG. 8(A) is written in the intermediate language. This "loop (line4-7)" consists of a Line4 block, a Line5,6 block, a counter block for repeating Line5,6 the specified number of times, and a Line7 block. The Line4 block corresponds to line (4) of FIG. 8(A), and the Line5,6 block corresponds to lines (5) and (6) of FIG. 8(A). The counter block causes the processing of Line5,6 to be repeated while i is counted from 1 to n. The Line7 block corresponds to line (7) of FIG. 8(A). The arrow from the counter block to the Line5,6 block represents the repetition of the processing. The arrow from the Line4 block to the Line7 block represents a skip from line (4) to line (7) when the condition for the loop between lines (4) and (7) of FIG. 8(A) is not satisfied, for example when n = 0 or n < 0. "Loop (line4-7)" in FIG. 8(C) is one "processing block", that is, a group of basic processing blocks that together carry out one unit of processing.

FIG. 9 is a flowchart showing the processing in the PA position file analysis unit 13 of FIG. 5.

FIG. 10 is a diagram illustrating the processing in the PA position file analysis unit 13.

As shown in FIG. 9, the PA position file analysis unit 13 first reads the "source file name", "module name", and "PA range (start position and end position)" described in the PA position file (see FIG. 8(B)) (step S11). It then traverses the first program written in the intermediate language (see FIG. 8(C)) and searches for the line numbers designated as the start position and end position of the PA range (step S12). Attribute values representing the start position and the end position are set in association with the line numbers found by this search (step S13). At this point, the number of processing blocks and loops that exist between the start position and the end position is also counted and recorded.

Continuing the example of FIG. 8, as shown in FIG. 10 the attribute value "PA(1)start/all/blockNum:1" is set on the Line4 block, and the attribute value "PA(1)end/all" is set on the Line7 block.

In the attribute value set on the Line4 block, "PA(1)start" indicates the start position of the PA range with serial number (1). "all" indicates that the "PA range setting location designation information" described with reference to FIG. 6 is "all". "blockNum:1" indicates that the number of processing blocks or loops within this PA range is 1.

In the attribute value set on the Line7 block, "PA(1)end" indicates the end position of the PA range with serial number (1). "all" repeats the "all" in the attribute value of the Line4 block.

In step S14 of FIG. 9, it is determined whether all PA ranges in the PA position file have been set. If an unset PA range remains, processing returns to step S11 and is repeated for that PA range.

The PA position file shown in FIG. 8(B) consists of two lines, and the only PA range is the one described on its second line. As noted above, however, entries of the same form as the second line may be written on the third and subsequent lines, so that one PA position file sets a plurality of PA ranges. Step S14 of FIG. 9 allows for the case in which a plurality of PA ranges are set.

When attribute values have been set for all PA ranges (step S14 of FIG. 9), the first program on which those attribute values are set (see FIG. 10) is passed to the optimization unit 14 shown in FIG. 5 (step S15).
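Steps S12 and S13 can be sketched as follows, assuming a highly simplified model of the intermediate language in which the program is a list of basic blocks and loops. The classes Block and Loop and the function set_pa_range are hypothetical names introduced only for this illustration, and only whole loops inside the range are counted, which is a simplification of the counting described above.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    line: int                                   # source line the block came from
    attrs: list = field(default_factory=list)   # PA attribute values


@dataclass
class Loop:
    blocks: list                                # entry block, body blocks, exit block


def all_blocks(program):
    """Yield every basic block, descending into loops."""
    for item in program:
        if isinstance(item, Loop):
            yield from item.blocks
        else:
            yield item


def set_pa_range(program, serial, start_line, end_line, designation):
    """Tag the start and end blocks of one PA range (steps S12 and S13)."""
    # Count the loops that lie entirely inside the range.
    block_num = sum(
        1
        for item in program
        if isinstance(item, Loop)
        and start_line <= item.blocks[0].line
        and item.blocks[-1].line <= end_line
    )
    for block in all_blocks(program):
        if block.line == start_line:
            block.attrs.append(
                f"PA({serial})start/{designation}/blockNum:{block_num}")
            break
    for block in reversed(list(all_blocks(program))):
        if block.line == end_line:
            block.attrs.append(f"PA({serial})end/{designation}")
            break
    return program


# First program of FIG. 8(C): Line1-3, loop (line4-7), Line8.
loop = Loop([Block(4), Block(5), Block(7)])   # Block(5) stands for Line5,6
program = [Block(1), Block(2), Block(3), loop, Block(8)]
set_pa_range(program, 1, 4, 7, "all")
```

With the example of FIG. 8, this leaves the start attribute on the Line4 block and the end attribute on the Line7 block, mirroring FIG. 10.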

The optimization unit 14 of FIG. 5 applies optimization processing to the first program received from the PA position file analysis unit 13. It is assumed here that loop distribution (see FIG. 2) is performed as a result of the optimization processing.

In this case, the optimization unit 14 generates optimization information 141 consisting of the source file name "a.f90", the module name "sub1", the target line numbers "4, 7", and the optimization type "loop distribution" (see FIGS. 5 and 7).

FIG. 11 is a diagram illustrating the second program generated by the optimization processing.

Two instances of "loop (line4-7)" are present here. The upper loop handles the processing of line (5) of FIG. 8(A), and the lower loop handles the processing of line (6) of FIG. 8(A). Both the upper loop and the lower loop are marked "L-dst", which denotes the optimization information 141 (see FIG. 5) relating to that loop. The second program also inherits the attribute values relating to the PA range of the first program before optimization (see FIG. 10). The attribute value for the start position of the PA range is associated with the Line4 block of the upper of the two loops, and the attribute value for the end position of the PA range is associated with the Line7 block of the lower loop.

At this stage, a single PA range (referred to here as "PA range <1>") is set, extending from the beginning of the upper loop to the end of the lower loop.
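The effect of loop distribution itself can be pictured with the following sketch. The two statements in the loop body are hypothetical stand-ins for lines (5) and (6) of FIG. 8(A), whose actual content is given only in the figures; the point is only that one loop becomes two loops that together produce the same result.

```python
def single_loop(a, b, n):
    """One loop carrying both statements, as in the first program."""
    for i in range(n):
        a[i] = a[i] + 1.0   # hypothetical statement for line (5)
        b[i] = b[i] * 2.0   # hypothetical statement for line (6)
    return a, b


def distributed_loops(a, b, n):
    """The same work after loop distribution: two loops, one per statement."""
    for i in range(n):      # upper loop, handles line (5)
        a[i] = a[i] + 1.0
    for i in range(n):      # lower loop, handles line (6)
        b[i] = b[i] * 2.0
    return a, b


before = single_loop([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 3)
after = distributed_loops([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 3)
```

Because the second loop does not exist in the source text, a PA information acquisition call written by hand around the original loop cannot isolate it, which is the situation the PA range inheritance above is meant to handle.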

FIGS. 12 and 13 show the first half and the second half, respectively, of a flowchart of the processing of the PA acquisition function insertion unit 15 of FIG. 5.

The PA acquisition function insertion unit 15 performs the processing of FIGS. 12 and 13 on the second program that has undergone the optimization processing.

First, the optimization information 141 (see FIG. 5) is examined to determine whether any optimization processing relating to a loop has been performed (step S21). If it has, it is determined whether the loop that was optimized is a loop contained in a PA range (step S22). If no loop-related optimization processing has been performed (step S21), or if loop-related optimization processing has been performed but not on a loop within a PA range (step S22), processing proceeds to step S33 of FIG. 13. Step S33 is described later.

If optimization processing has been performed on a loop within a PA range (step S22), processing proceeds to step S23, where the processing flow within that PA range is traversed and the number of processing blocks and loops is counted. If that number has not increased (step S24), processing proceeds to step S33.

FIG. 14 is a diagram illustrating the second program at the stage where the increase in the number of loops has been recognized.

Here, the loop, of which there was originally only one, has increased to two. Taking PA range <1> as the parent PA range, it is recognized that two child PA ranges <1.1> and <1.2> exist under the parent PA range <1>.

When an increase in the number of loops or processing blocks is recognized (FIG. 12, step S24), processing proceeds to step S25, and new PA ranges are set on the second program.

FIG. 15 is a diagram illustrating the state in which the new PA ranges have been set on the second program.

Here, the attribute value "PA(1,1.1)start/all" is set as the PA range attribute on the Line4 block of the upper of the two loops. The attribute value "PA(1.1)end/all" is set on the Line7 block of the upper loop, "PA(1.2)start/all" on the Line4 block of the lower loop, and "PA(1,1.2)end/all" on the Line7 block of the lower loop.

In the attribute value set on the Line4 block of the upper loop, "PA(1,1.1)start" indicates the start position of both the parent PA range <1> and the child PA range <1.1>. "all" has already been described and is not explained again here.

In the attribute value set on the Line7 block of the upper loop, "PA(1.1)end" indicates the end position of the child PA range <1.1>.

In the attribute value set on the Line4 block of the lower loop, "PA(1.2)start" indicates the start position of the child PA range <1.2>.

In the attribute value set on the Line7 block of the lower loop, "PA(1,1.2)end" indicates the end position of both the parent PA range <1> and the child PA range <1.2>.

In this way, the parent PA range <1> and the two child PA ranges <1.1> and <1.2> are set such that their parent-child relationship can be identified.
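The numbering of child PA ranges in steps S24 and S25 can be sketched as below. The dictionary layout, with one list of range identifiers for the start position and one for the end position of each loop, and the names assign_child_ranges and render are assumptions made for illustration only.

```python
def assign_child_ranges(loops, parent, original_count):
    """Give each loop inside a parent PA range its own child range id.

    `loops` lists the loops found in the parent range of the second
    program, in order; `original_count` is the blockNum recorded on the
    first program.
    """
    if len(loops) <= original_count:     # step S24: no increase
        return loops
    for index, loop in enumerate(loops, start=1):   # step S25
        child = f"{parent}.{index}"
        loop["start"].append(child)
        loop["end"].append(child)
    return loops


def render(kind, ids, designation="all"):
    """Format an attribute value such as PA(1,1.1)start/all."""
    return f"PA({','.join(ids)}){kind}/{designation}"


# After loop distribution the upper loop carries the inherited start of
# range 1 and the lower loop carries the inherited end of range 1.
loops = [
    {"start": ["1"], "end": []},
    {"start": [], "end": ["1"]},
]
assign_child_ranges(loops, "1", original_count=1)
labels = [
    render("start", loops[0]["start"]),
    render("end", loops[0]["end"]),
    render("start", loops[1]["start"]),
    render("end", loops[1]["end"]),
]
```

The four resulting labels correspond to the attribute values described for FIG. 15, with the shared prefix "1" recording the parent-child relationship.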

In step S26 of FIG. 12, the "PA range setting location designation information" is read from the PA position file. As described above, there are four kinds of "PA range setting location designation information" in total: "all", which is used in this example, "entirety-only", "main-loop-only", and "mod-loop-only".

In step S27, it is determined whether the "PA range setting location designation information" that was read is "entirety-only". If it is, the attribute values relating to the child PA ranges are deleted (step S28).

This is because "entirety-only" indicates that individual PA information for the child PA ranges is not needed and that PA information is to be acquired collectively for the parent PA range.

If the "PA range setting location designation information" is "main-loop-only" (step S29), the attribute values relating to the second and subsequent child PA ranges are deleted (step S30). The wording "second and subsequent" is used because three or more loops or processing blocks may exist within the parent PA range.

Further, if the "PA range setting location designation information" is "mod-loop-only" (step S31), the attribute value relating to the first child PA range is deleted (step S32).

In the specific example shown here (see FIG. 15), "all" is designated, so the result of each of steps S27, S29, and S31 is "no", and all of the child PA range attribute values remain without being deleted.
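The selection made in steps S27 to S32 can be summarized by the following sketch, which returns the child PA ranges that survive for each designation. The function name select_child_ranges and the representation of child ranges as a list of identifier strings are assumptions made for illustration.

```python
def select_child_ranges(children, designation):
    """Return the child PA ranges kept for a given designation.

    `children` lists child range ids in program order, for example
    ["1.1", "1.2"].
    """
    if designation == "all":
        return list(children)        # S27, S29, S31 all "no": keep everything
    if designation == "entirety-only":
        return []                    # S28: drop every child, keep only the parent
    if designation == "main-loop-only":
        return list(children[:1])    # S30: drop the second and later children
    if designation == "mod-loop-only":
        return list(children[1:])    # S32: drop the first child
    raise ValueError(f"unknown designation: {designation}")


kept = select_child_ranges(["1.1", "1.2", "1.3"], "mod-loop-only")
```

With three child ranges, "mod-loop-only" keeps the second and third, consistent with the statement that three or more loops may exist within the parent PA range.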

Of the processing shown in FIGS. 12 and 13, the processing up to step S32 corresponds to the processing of the PA range setting unit 151 of FIG. 5, and the processing from step S33 onward corresponds to the processing of the function embedding unit 152 of FIG. 5.

In step S33, the PA information acquisition function insertion processing is performed. Here, the attribute values relating to the PA ranges are referred to, and a function for acquiring PA information is embedded in the second program for each PA range.

FIG. 16 is a diagram illustrating the state in which the PA information acquisition functions have been embedded in the second program.

"Call PA_start(1.1)" is embedded in the Line4 block of the upper of the two loops, and "Call PA_end(1.1)" is embedded in the Line7 block of the upper loop.

"Call PA_start(1.2)" is embedded in the Line4 block of the lower loop, and "Call PA_end(1.2)" is embedded in the Line7 block of the lower loop.

"Call PA_start(1.1)", embedded in the Line4 block of the upper loop, is an instruction that calls the function that starts the acquisition of PA information. "Call PA_end(1.1)", embedded in the Line7 block of the upper loop, is an instruction that calls the function that stops the acquisition of PA information started by "Call PA_start(1.1)".

Likewise, "Call PA_start(1.2)", embedded in the Line4 block of the lower loop, and "Call PA_end(1.2)", embedded in the Line7 block, are instructions that call the functions that start and stop the acquisition of PA information, respectively. Numbering the two PA ranges <1.1> and <1.2> associates them so that the PA range formed by combining the two can be identified as the parent PA range <1>.
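The embedding of step S33 can be sketched as follows, treating the second program as a flat list of text lines. This representation and the name embed_pa_calls are assumptions for illustration; in the description the calls are embedded in the Line4 and Line7 blocks of the intermediate-language loops rather than in text.

```python
def embed_pa_calls(loops):
    """Wrap each loop that still has a PA range in start and end calls.

    `loops` is a list of (range_id, body_lines) pairs in program order;
    range_id is None when no PA range remains on that loop.
    """
    out = []
    for range_id, body in loops:
        if range_id is not None:
            out.append(f"Call PA_start({range_id})")
        out.extend(body)
        if range_id is not None:
            out.append(f"Call PA_end({range_id})")
    return out


embedded = embed_pa_calls([
    ("1.1", ["loop for line (5)"]),
    ("1.2", ["loop for line (6)"]),
])
```

Passing None for a loop's range id leaves that loop uninstrumented, which is how a designation such as "main-loop-only" would show up at this stage.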

When the PA information acquisition functions have been embedded in step S33 of FIG. 13, for example as shown in FIG. 16, the second program in which the PA acquisition functions are embedded is passed to the code generation unit 16 (see FIG. 5).

The code generation unit 16 receives the second program in which the PA acquisition functions are embedded and generates an object program that can be executed by a computer. When a plurality of programs exist, such as a main routine and subroutines, the code generation unit 16 and the linker 17 also combine those programs, and the final executable object program is generated.

FIG. 17 is a diagram illustrating what results when the same content as the PA acquisition functions in the intermediate language image of FIG. 16 is written out in a source program.

Here, the loop has been distributed into two loops: the loop on lines (3) to (5) and the loop on lines (8) to (10). Line (1) contains an instruction that calls the function that starts PA information acquisition for the parent PA range (1). Line (2) contains an instruction that calls the function that starts PA information acquisition for the child PA range (1.1). Line (6) contains an instruction that calls the function that stops PA information acquisition for the child PA range (1.1). Line (7) contains an instruction that calls the function that starts PA information acquisition for the child PA range (1.2). Line (11) contains an instruction that calls the function that stops PA information acquisition for the child PA range (1.2), and line (12) contains an instruction that calls the function that stops PA information acquisition for the parent PA range (1).

In FIG. 16, no PA information acquisition function is explicitly embedded for the parent PA range. As described above, the parent-child relationship is known, and the PA information of the parent is obtained automatically as the sum of the PA information of the two children.

In the case of FIG. 17, by contrast, the program is assumed to be written by hand: the programmer performs the loop distribution and also embeds the PA acquisition functions personally.

Thus, conventionally, the parent-child relationship between PA ranges is not recognized automatically, and PA range <1>, PA range <1.1>, and PA range <1.2> are each treated as separate PA ranges. The programmer therefore has to write PA information acquisition functions for every PA range.

FIG. 18 is a diagram showing source code images for the cases in which "PA range setting location designation information" other than "all" is designated.

FIG. 18(A) shows the case in which "entirety-only" is designated; the PA information acquisition functions are embedded so as to enclose all of the loops together.

FIG. 18(B) shows the case in which "main-loop-only" is designated; the PA information acquisition functions are embedded so as to enclose only the first loop.

FIG. 18(C) shows the case in which "mod-loop-only" is designated; the PA information acquisition functions are embedded so as to enclose only the second loop. Although only two loops are shown here, when three or more loops exist, "mod-loop-only" embeds PA information acquisition functions so that PA information is obtained independently for each of the second and subsequent loops.

図１９は、コンパイル実行時の表示画面例を示す図である。 FIG. 19 is a diagram showing an example of a display screen at the time of compiling.

１行目の「ｆｒｔａ．ｆ９０ −ｐａ＿ｒａｎｇｅ“ｐａｒａｎｇｅ．ｔｘｔ” −ｏａ．ｅｘｅ」は、オペレータによるキーボード１３０（図３参照）等の操作により入力される。ここで、「ｆｒｔ」は、フォートラン（Ｆｏｒｔｒａｎ）言語で記述されたソースプログラムをコンパイルするためのコマンドである。また、「ａ．ｆ９０」はソースプログラムが記述されたソースファイルの名前である（図８（Ａ）参照）。また、−ｐａ＿ｒａｎｇｅ“ｐａｒａｎｇｅ．ｔｘｔ”は、そのソースファイル「ａ．ｆ９０」と結びつけるべきＰＡ位置ファイルのファイル名が「ｐａｒａｎｇｅ．ｔｘｔ」であることを表わしている（図８（Ｂ）参照）。さらに、「−ｏａ．ｅｘｅ」は、最終的なオブジェクトプログラムが格納される実行ファイルの名称が「ａ．ｅｘｅ」であることを表わしている。 In the first line, “frt a.f90 -pa_range“ parrange. “txt” -o a.exe ”is input by an operation of the keyboard 130 (see FIG. 3) by the operator. Here, “frt” is a command for compiling a source program described in the Fortran language. “A.f90” is the name of the source file in which the source program is described (see FIG. 8A). Further, -pa_range “place.txt” indicates that the file name of the PA location file to be associated with the source file “a.f90” is “place.txt” (see FIG. 8B). Further, “-o a.exe” indicates that the name of the execution file in which the final object program is stored is “a.exe”.

すなわち、この１行目は、フォートラン言語で記述された、「ａ．ｆ９０」のファイルに格納されているソースプログラムに、ＰＡ位置ファイル「ｐａｒａｎｇｅ．ｔｘｔ」を参照しながらＰＡ情報取得関数を埋め込んでオブジェクトプログラムを生成することを指示するオペレータの命令を表わしている。また、この１行目はさらに、その生成したオブジェクトプログラムを「ａ．ｅｘｅ」のファイルに格納することを指示している。 In other words, the first line embeds a PA information acquisition function in the source program stored in the file “a.f90” described in the Fortran language while referring to the PA location file “parrange.txt”. It represents an operator command instructing generation of an object program. The first line further instructs to store the generated object program in the file “a.exe”.

２行目以降は、コンパイル装置でのコンパイルの実行の進度に応じて順次表示される情報である。最終行の「実行ファイル完成（ａ．ｅｘｅ）」まで表示されることで、コンパイルが完了したことが分かる。 The second and subsequent lines are information that is sequentially displayed according to the progress of execution of compilation by the compiling device. By displaying up to “execution file completion (a.exe)” on the last line, it is understood that the compilation is completed.

図２０は、実行ファイルに格納された、ＰＡ情報取得関数が埋め込まれたオブジェクトプログラムの実行により得られるＰＡ情報の表示イメージを示した図である。 FIG. 20 is a diagram showing a display image of PA information obtained by executing an object program embedded with a PA information acquisition function, stored in an execution file.

１行目は、親のＰＡレンジ＜１＞に関するＰＡ情報である。この親のＰＡレンジ＜１＞のＰＡ情報は２行目の子のＰＡレンジ＜１．１＞のＰＡ情報と３行目の子のＰＡレンジ＜１．２＞のＰＡ情報との和となっている。 The first line is PA information related to the parent PA range <1>. The PA information of the parent PA range <1> is the sum of the PA information of the child PA range <1.1> of the second row and the PA information of the child PA range <1.2> of the third row.

“interval” indicates the PA range. In the first row, “4-7/block1(a.f90-sub1)” means that the PA information relates to the processing block spanning lines 4 through 7 of sub1 in a.f90. The second and third rows are read in the same way. “cycle”, “instructions”, and “cachemiss” give the number of cycles, the number of instructions, and the number of cache misses, respectively, during execution of each processing block.
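The relationship between the parent and child rows can be sketched in Python as follows. This is a minimal illustration, assuming each row carries the four fields named above; the class name `PARow`, the function `sum_rows`, the child interval labels, and all numeric values are hypothetical and are not taken from FIG. 20.

```python
from dataclasses import dataclass


@dataclass
class PARow:
    """One displayed row of PA information (field names follow the display)."""
    interval: str
    cycle: int
    instructions: int
    cachemiss: int


def sum_rows(interval, rows):
    """Build a parent row whose counters are the sums of the child rows."""
    return PARow(
        interval=interval,
        cycle=sum(r.cycle for r in rows),
        instructions=sum(r.instructions for r in rows),
        cachemiss=sum(r.cachemiss for r in rows),
    )


# Hypothetical counter values for the two child PA ranges
child_1_1 = PARow("child <1.1>", cycle=1200, instructions=900, cachemiss=14)
child_1_2 = PARow("child <1.2>", cycle=800, instructions=600, cachemiss=6)

# The parent PA range <1> is the element-wise sum of its children
parent = sum_rows("4-7/block1(a.f90-sub1)", [child_1_1, child_1_2])
print(parent)
```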

As described above, in this embodiment, detailed PA information can be acquired for each loop or processing block even when optimization increases the number of loops or processing blocks. This enables detailed analysis of program execution performance.
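The idea of assigning a new PA range to each loop or processing block that optimization produces can be sketched in Python. This is a simplified illustration, not the embodiment's implementation: programs are modeled as plain lists of block names, and the function names `assign_pa_ranges` and `embed_acquisition_calls`, the markers `pa_start` and `pa_stop`, and the block names are all hypothetical.

```python
def assign_pa_ranges(range_id, blocks_before, blocks_after):
    """Map PA range ids to the loops or blocks they cover after optimization.

    If optimization increased the number of loops or blocks in the range,
    each resulting block gets its own child range (<id>.1, <id>.2, ...).
    Otherwise the original range is kept unchanged.
    """
    if len(blocks_after) > len(blocks_before):
        return {
            f"{range_id}.{n}": [block]
            for n, block in enumerate(blocks_after, start=1)
        }
    return {range_id: list(blocks_after)}


def embed_acquisition_calls(ranges):
    """Wrap each PA range in hypothetical start/stop acquisition markers."""
    program = []
    for rid, blocks in ranges.items():
        program.append(f"pa_start({rid})")
        program.extend(blocks)
        program.append(f"pa_stop({rid})")
    return program


# One block before optimization becomes two blocks afterwards
ranges = assign_pa_ranges("1", ["block1"], ["block1_a", "block1_b"])
instrumented = embed_acquisition_calls(ranges)
print(instrumented)
```

In this sketch only the child ranges are instrumented when the count increases; whether the parent range is measured directly or obtained by summing its children is left open here.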

Description of Symbols

1 Source program
2 Object program
3 PA position file
10 Compiling device
11 Source program / PA position file acquisition unit
12 Syntax analysis unit
13 PA position file analysis unit
14 Optimization unit
15 PA acquisition function insertion unit
16 Code generation unit
17 Linker
100 Computer
110 Main body
111 CD/DVD loading slot
112 CPU
113 Memory
114 Hard disk device
115 Network interface
116 CD/DVD drive
117 CD/DVD
120 Image display unit
130 Keyboard
121 Display screen
140 Mouse
141 Optimization information
150 Bus
151 PA range setting unit
152 Function embedding unit

Claims

1. A compiling device comprising:
acquisition means for acquiring a PA position file describing a PA range, which represents a range of a program over which PA information representing behavior during program execution is to be obtained, and a source program written in a programming language;
first conversion means for converting the source program acquired by the acquisition means into a first program written in an intermediate language;
first setting means for reading the PA range from the PA position file and setting the PA range on the first program;
optimization means for optimizing the first program on which the PA range has been set by the first setting means, thereby generating a second program that inherits the PA range set on the first program;
second setting means for checking whether the number of loops or processing blocks in the PA range on the second program has increased compared with the number of loops or processing blocks in the PA range on the first program and, when that number has increased, setting a new PA range for each loop or processing block in the PA range on the second program;
function embedding means for embedding, in the second program, a PA information acquisition function for acquiring the PA information for each PA range set on the second program; and
second conversion means for converting the second program in which the PA information acquisition function is embedded into an object program executable on a computer.

2. The compiling device according to claim 1, wherein
the PA position file describes, together with the PA range, PA range setting location designation information that designates where the new PA range is to be set when the second setting means sets the new PA range, and
the second setting means sets the new PA range on the second program based on the PA range setting location designation information.

3. A compiling program that is executed in an arithmetic processing device that executes programs, and that causes the arithmetic processing device to operate as a compiling device comprising:
acquisition means for acquiring a PA position file describing a PA range, which represents a range of a program over which PA information representing behavior during program execution is to be obtained, and a source program written in a programming language;
first conversion means for converting the source program acquired by the acquisition means into a first program written in an intermediate language;
first setting means for reading the PA range from the PA position file and setting the PA range on the first program;
optimization means for optimizing the first program on which the PA range has been set by the first setting means, thereby generating a second program that inherits the PA range set on the first program;
second setting means for checking whether the number of loops or processing blocks in the PA range on the second program has increased compared with the number of loops or processing blocks in the PA range on the first program and, when that number has increased, setting a new PA range for each loop or processing block in the PA range on the second program;
function embedding means for embedding, in the second program, a PA information acquisition function for acquiring the PA information for each PA range set on the second program; and
second conversion means for converting the second program in which the PA information acquisition function is embedded into an object program executable on a computer.

4. The compiling program according to claim 3, wherein
the PA position file describes, together with the PA range, PA range setting location designation information that designates where the new PA range is to be set when the second setting means sets the new PA range, and
the second setting means sets the new PA range on the second program based on the PA range setting location designation information.