JP7812966B2

JP7812966B2 - Image encoding device, image decoding device, and program

Info

Publication number: JP7812966B2
Application number: JP2025108304A
Authority: JP
Inventors: 俊輔岩村; 敦郎市ヶ谷; 慎平根本
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2018-08-15
Filing date: 2025-06-26
Publication date: 2026-02-10
Anticipated expiration: 2038-08-15
Also published as: JP7483964B2; JP2023053272A; JP2024087082A; JP7704929B2; JP7249111B2; JP2020028067A; JP2025123577A

Description

The present invention relates to an image encoding device, an image decoding device, and a program.

Conventionally, an image encoding device that encodes a block-unit target image obtained by dividing a frame-unit current image is known to use the following method: it predicts the target image from multiple reference images to generate a predicted image, applies orthogonal transform processing to the prediction residual (the difference between the target image and the predicted image) to calculate transform coefficients, and then quantizes and entropy-codes the transform coefficients to output encoded data.

Like the image encoding device, an image decoding device predicts the target image using multiple reference images to generate a predicted image. The image decoding device decodes the encoded data to obtain transform coefficients, dequantizes them, applies inverse orthogonal transform processing to the dequantized transform coefficients to calculate the prediction residual, and decodes the target image by combining the predicted image with the prediction residual.

HEVC specifies two types of orthogonal transform, DCT-2 and DST-7, that can be applied in transform processing (orthogonal transform processing and inverse orthogonal transform processing) (see Non-Patent Document 1). Specifically, HEVC decides which of the two to apply based on the block size of the target image and the intra prediction mode applied to the target image.
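As a rough illustration of the two transform families named above, the following Python sketch builds floating-point DCT-2 and DST-7 basis matrices from their textbook definitions and checks that each is orthonormal. HEVC itself uses scaled integer approximations of these matrices, and the function names here are illustrative rather than taken from the standard.

```python
import math

def dct2_matrix(n):
    """Row k holds the k-th orthonormal DCT-2 basis vector of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def dst7_matrix(n):
    """Row k holds the k-th orthonormal DST-7 basis vector of length n."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * k + 1) * (i + 1) / (2 * n + 1))
             for i in range(n)] for k in range(n)]

def max_orthonormality_error(m):
    """Largest deviation of M * M^T from the identity matrix."""
    n = len(m)
    worst = 0.0
    for r in range(n):
        for c in range(n):
            dot = sum(a * b for a, b in zip(m[r], m[c]))
            worst = max(worst, abs(dot - (1.0 if r == c else 0.0)))
    return worst

if __name__ == "__main__":
    for name, build in (("DCT-2", dct2_matrix), ("DST-7", dst7_matrix)):
        print(name, max_orthonormality_error(build(4)))
```

The first DCT-2 basis vector is flat, whereas the first DST-7 basis vector rises from a small value at the first sample, which is why the two transforms suit residuals with different energy distributions.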

Non-Patent Document 1: Recommendation ITU-T H.265 (12/2016), "High efficiency video coding", International Telecommunication Union

However, determining the type of orthogonal transform solely from the block size of the target image and the intra prediction mode applied to it does not allow the optimal type of orthogonal transform to be chosen according to the energy distribution of the prediction residual. For example, even when DCT-2 would concentrate the energy of the prediction residual more efficiently, DST-7 may be selected as the orthogonal transform to apply, which lowers encoding efficiency.

The KLT is an orthogonal transform that can concentrate the energy of the prediction residual more efficiently. However, the image decoding device needs information for inverting the KLT performed by the image encoding device, so the amount of information to be transmitted increases and encoding efficiency falls.

Therefore, an object of the present invention is to provide an image encoding device, an image decoding device, and a program that can improve encoding efficiency.

An image encoding device according to a first feature encodes a block-unit target image obtained by dividing a frame-unit current image, and includes: a prediction unit that predicts the target image using multiple reference images to generate a predicted image; an evaluation unit that generates map information indicating the distribution of errors in the predicted image by evaluating the similarity between the multiple reference images on a pixel-by-pixel basis; a subtraction unit that calculates, on a pixel-by-pixel basis, a prediction residual indicating the difference between the target block and the predicted image; a determination unit that determines the orthogonal transform to apply to the prediction residual based on the map information; and a transform unit that performs orthogonal transform processing on the prediction residual using the determined orthogonal transform.

An image encoding device according to another feature encodes a block-unit target image obtained by dividing a frame-unit current image, and includes: a prediction unit that predicts the target image using multiple reference images to generate a predicted image; and an evaluation unit that calculates a sum of absolute differences, indicating the similarity between the multiple reference images, for each region unit that is smaller than the block and consists of multiple pixels. The device controls encoding processing based on the sum of absolute differences that the evaluation unit calculates for each region unit.

An image decoding device according to a second feature decodes a block-unit target image obtained by dividing a frame-unit current image, and includes: a decoding unit that obtains transform coefficients by decoding encoded data; a prediction unit that predicts the target image using multiple reference images to generate a predicted image; an evaluation unit that generates map information indicating the distribution of errors in the predicted image by evaluating the similarity between the multiple reference images on a pixel-by-pixel basis; a determination unit that determines the inverse orthogonal transform to apply to the transform coefficients based on the map information; and an inverse transform unit that performs inverse orthogonal transform processing on the transform coefficients using the determined inverse orthogonal transform.

An image decoding device according to another feature decodes a block-unit target image obtained by dividing a frame-unit current image, and includes: a prediction unit that predicts the target image using multiple reference images to generate a predicted image; and an evaluation unit that calculates a sum of absolute differences, indicating the similarity between the multiple reference images, for each region unit that is smaller than the block and consists of multiple pixels. The device controls decoding processing based on the sum of absolute differences that the evaluation unit calculates for each region unit.
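The region-wise sum of absolute differences (SAD) named in the "other feature" of both devices can be sketched in Python as follows. The function name, the list-of-lists image layout, and the 2×2 region size are assumptions made for illustration; the text only requires regions that are smaller than the block and contain multiple pixels.

```python
def region_sad(ref1, ref2, region_h, region_w):
    """Return a grid of SAD values, one per region of the block.

    ref1 and ref2 are equally sized 2-D lists of pixel values.
    """
    height, width = len(ref1), len(ref1[0])
    sads = []
    for top in range(0, height, region_h):
        row = []
        for left in range(0, width, region_w):
            total = 0
            for y in range(top, min(top + region_h, height)):
                for x in range(left, min(left + region_w, width)):
                    total += abs(ref1[y][x] - ref2[y][x])
            row.append(total)
        sads.append(row)
    return sads

if __name__ == "__main__":
    ref1 = [[10, 10, 10, 10],
            [10, 10, 10, 10],
            [10, 10, 50, 50],
            [10, 10, 50, 50]]
    ref2 = [[10, 10, 10, 10],
            [10, 12, 10, 10],
            [10, 10, 10, 10],
            [10, 10, 10, 10]]
    # A 4x4 block split into four 2x2 regions.
    print(region_sad(ref1, ref2, 2, 2))
```

A large SAD in one region marks the part of the block where the two references disagree, which is the information the devices use to control encoding or decoding.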

A program according to a third feature causes a computer to function as the image encoding device according to the first feature.

A program according to a fourth feature causes a computer to function as the image decoding device according to the second feature.

According to the present invention, it is possible to provide an image encoding device, an image decoding device, and a program that can improve encoding efficiency.

Fig. 1 is a diagram showing the configuration of the image encoding device according to the first embodiment.
Fig. 2 is a diagram showing an example of inter prediction according to the first to third embodiments.
Fig. 3 is a diagram showing an example of the configuration of the evaluation unit according to the first to third embodiments.
Fig. 4 is a diagram showing the operation of the adaptive transform generation unit according to the first embodiment.
Fig. 5 is a diagram showing the configuration of the image decoding device according to the first embodiment.
Fig. 6 is a diagram showing the configuration of the image encoding device according to the second embodiment.
Fig. 7 is a diagram showing the configuration of the image decoding device according to the second embodiment.
Fig. 8 is a diagram showing the configuration of the image encoding device according to the third embodiment.
Fig. 9 is a diagram showing the operation of the feature amount evaluation unit according to the third embodiment.
Fig. 10 is a diagram showing the configuration of the image decoding device according to the third embodiment.

An image encoding device and an image decoding device according to the embodiments will be described with reference to the drawings. In the following description of the drawings, identical or similar parts are denoted by identical or similar reference numerals.

<First Embodiment>
An image encoding device and an image decoding device according to the first embodiment will be described. These devices encode and decode moving images of the kind typified by MPEG.

(Image encoding device)
Fig. 1 is a diagram showing the configuration of the image encoding device 1 according to the first embodiment. As shown in Fig. 1, the image encoding device 1 includes a block division unit 100, a subtraction unit 110, a transform/quantization unit 120, an entropy coding unit 130, an inverse quantization/inverse transform unit 140, a synthesis unit 150, a memory 160, a prediction unit 170, an evaluation unit 180, and a determination unit 190.

The block division unit 100 divides the frame-unit (or picture-unit) input image that constitutes a moving image into small block-shaped regions and outputs the resulting blocks to the subtraction unit 110. The block size is, for example, 32×32, 16×16, 8×8, or 4×4 pixels. Blocks need not be square and may be rectangular. A block is the unit in which the image encoding device 1 encodes and the image decoding device 2 decodes.

The subtraction unit 110 calculates a prediction residual indicating the pixel-by-pixel difference between the block input from the block division unit 100 and the predicted image (predicted block) that the prediction unit 170 obtained by predicting that block. Specifically, the subtraction unit 110 calculates the prediction residual by subtracting each pixel value of the predicted image from the corresponding pixel value of the block, and outputs the calculated prediction residual to the transform/quantization unit 120.

The transform/quantization unit 120 performs orthogonal transform processing and quantization processing on a block-by-block basis. The transform/quantization unit 120 includes a transform unit 121 and a quantization unit 122.

The transform unit 121 performs orthogonal transform processing on the prediction residual input from the subtraction unit 110 to calculate transform coefficients, and outputs the calculated transform coefficients to the quantization unit 122. Examples of orthogonal transforms include the discrete cosine transform (DCT), the discrete sine transform (DST), and the Karhunen-Loève transform (KLT). In the first embodiment, the transform unit 121 performs orthogonal transform processing using the KLT.

The quantization unit 122 quantizes the transform coefficients input from the transform unit 121 using a quantization parameter (Qp) and a quantization matrix, and outputs the quantized transform coefficients to the entropy coding unit 130 and the inverse quantization/inverse transform unit 140. The quantization parameter (Qp) is a parameter that is applied in common to every transform coefficient in a block and determines the coarseness of quantization. The quantization matrix is a matrix whose elements are the quantization values used when quantizing each transform coefficient.

The entropy coding unit 130 performs entropy coding on the transform coefficients input from the quantization unit 122, compresses the data to generate encoded data (a bitstream), and outputs the encoded data to the outside of the image encoding device 1. Entropy coding can use, for example, Huffman coding or CABAC (Context-based Adaptive Binary Arithmetic Coding). The entropy coding unit 130 also receives control information related to prediction from the prediction unit 170 and entropy-codes that control information as well.

The inverse quantization/inverse transform unit 140 performs inverse quantization processing and inverse orthogonal transform processing on a block-by-block basis. The inverse quantization/inverse transform unit 140 includes an inverse quantization unit 141 and an inverse transform unit 142.

The inverse quantization unit 141 performs inverse quantization processing corresponding to the quantization processing performed by the quantization unit 122. Specifically, the inverse quantization unit 141 restores the transform coefficients by dequantizing the transform coefficients input from the quantization unit 122 using the quantization parameter (Qp) and the quantization matrix, and outputs the restored transform coefficients to the inverse transform unit 142.
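A minimal sketch of quantization by the quantization unit 122 and the matching inverse quantization by the inverse quantization unit 141. The text does not give the formulas, so this sketch assumes an HEVC-style step size of 2^((Qp-4)/6) and a quantization matrix whose neutral entry is 16. Both are assumptions, as are the function names.

```python
def step_size(qp, matrix_entry):
    """Assumed HEVC-style step: doubles every 6 Qp; matrix entry 16 is neutral."""
    return 2.0 ** ((qp - 4) / 6.0) * (matrix_entry / 16.0)

def quantize(coeffs, qp, qmatrix):
    """Map each transform coefficient to an integer level.

    Python's round() resolves exact ties to the nearest even integer.
    """
    return [[round(c / step_size(qp, m)) for c, m in zip(crow, mrow)]
            for crow, mrow in zip(coeffs, qmatrix)]

def dequantize(levels, qp, qmatrix):
    """Restore approximate coefficients from the integer levels."""
    return [[lv * step_size(qp, m) for lv, m in zip(lrow, mrow)]
            for lrow, mrow in zip(levels, qmatrix)]

if __name__ == "__main__":
    coeffs = [[100.0, -37.0], [12.0, 3.0]]
    qmatrix = [[16, 16], [16, 32]]   # coarser step for the last coefficient
    levels = quantize(coeffs, 28, qmatrix)
    print(levels)
    print(dequantize(levels, 28, qmatrix))
```

The restored coefficients differ from the originals by at most half a step, and that loss is what the later discussion of degradation in the restored prediction residual refers to.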

The inverse transform unit 142 performs inverse orthogonal transform processing corresponding to the orthogonal transform processing performed by the transform unit 121. For example, if the transform unit 121 performs a discrete cosine transform, the inverse transform unit 142 performs an inverse discrete cosine transform. The inverse transform unit 142 performs inverse orthogonal transform processing on the transform coefficients input from the inverse quantization unit 141 to restore the prediction residual, and outputs the result, referred to as the restored prediction residual, to the synthesis unit 150.

The synthesis unit 150 combines, on a pixel-by-pixel basis, the restored prediction residual input from the inverse transform unit 142 with the predicted image input from the prediction unit 170. The synthesis unit 150 adds each pixel value of the restored prediction residual to the corresponding pixel value of the predicted image to reconstruct (decode) the block, and outputs the decoded block-unit image to the memory 160. Such a decoded image is sometimes called a reconstructed image.
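A minimal sketch of the pixel-wise subtraction in the subtraction unit 110 and the pixel-wise addition in the synthesis unit 150. Clipping the reconstructed value to an 8-bit range is an assumption that the text does not state, and the function names are illustrative.

```python
def subtract(block, predicted):
    """Prediction residual: block pixel minus predicted pixel."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, predicted)]

def synthesize(residual, predicted, bit_depth=8):
    """Reconstructed block: residual plus prediction, clipped (assumed)."""
    high = (1 << bit_depth) - 1
    return [[min(max(r + p, 0), high) for r, p in zip(rrow, prow)]
            for rrow, prow in zip(residual, predicted)]

if __name__ == "__main__":
    block = [[120, 130], [140, 255]]
    predicted = [[118, 133], [140, 250]]
    residual = subtract(block, predicted)
    print(residual)
    # With an unquantized residual the block is recovered exactly.
    print(synthesize(residual, predicted) == block)
```

In the actual loop the residual passed to the synthesis step has been through quantization, so the reconstructed block only approximates the input block.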

The memory 160 stores the decoded image input from the synthesis unit 150. The memory 160 stores decoded images on a frame-by-frame basis and outputs the stored decoded images to the prediction unit 170. A loop filter may be provided between the synthesis unit 150 and the memory 160.

The prediction unit 170 performs prediction on a block-by-block basis. The prediction unit 170 includes an intra prediction unit 171, an inter prediction unit 172, and a switching unit 173.

The intra prediction unit 171 generates an intra-predicted image by referring to decoded pixel values around the block to be predicted in the decoded image stored in the memory 160, and outputs the generated intra-predicted image to the switching unit 173. The intra prediction unit 171 also selects, from among multiple intra prediction modes, the optimal mode to apply to the target block and performs intra prediction using the selected mode. The intra prediction unit 171 outputs control information about the selected intra prediction mode to the entropy coding unit 130. The intra prediction modes include planar prediction, DC prediction, and directional prediction.

The inter prediction unit 172 uses decoded images stored in the memory 160 as reference images, calculates a motion vector by a technique such as block matching, predicts the block to be predicted to generate an inter-predicted image, and outputs the generated inter-predicted image to the switching unit 173. The inter prediction unit 172 selects the optimal inter prediction method from inter prediction using multiple reference images (typically bi-prediction) and inter prediction using a single reference image (unidirectional prediction), and performs inter prediction using the selected method. The inter prediction unit 172 outputs control information about the inter prediction (such as the inter prediction method and motion vector information) to the entropy coding unit 130. When it uses multiple reference images to generate the inter-predicted image, the inter prediction unit 172 outputs those reference images to the evaluation unit 180.

Bi-prediction in inter prediction is the typical example of prediction using multiple reference images, but such prediction is not limited to it. The prediction unit 170 may also perform intra block copy prediction using multiple reference images. In intra block copy, a reference image within the same frame as the current frame is used to predict a block in the current frame. When performing intra block copy prediction with multiple reference images, the prediction unit 170 outputs those reference images to the evaluation unit 180.

The switching unit 173 switches between the intra-predicted image input from the intra prediction unit 171 and the inter-predicted image input from the inter prediction unit 172, and outputs one of them to the subtraction unit 110 and the synthesis unit 150.

The evaluation unit 180 evaluates, on a pixel-by-pixel basis, the similarity between the multiple reference images input from the prediction unit 170, thereby generating map information that indicates the distribution of errors in the predicted image generated from those reference images, and outputs the generated map information to the determination unit 190. Details of the evaluation unit 180 will be described later.

Based on the map information input from the evaluation unit 180, the determination unit 190 determines the orthogonal transform to apply to the prediction residual corresponding to the predicted image whose prediction accuracy the evaluation unit 180 has evaluated, and outputs the determined orthogonal transform to the transform unit 121 and the inverse transform unit 142. In the first embodiment, the determination unit 190 includes an adaptive transform generation unit 191 that generates a vertical orthogonal transform and a horizontal orthogonal transform based on the map information. The transform unit 121 performs orthogonal transform processing in accordance with the orthogonal transform input from the determination unit 190. The inverse transform unit 142 performs the corresponding inverse orthogonal transform processing in accordance with the orthogonal transform input from the determination unit 190. Details of the adaptive transform generation unit 191 will be described later.
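How a pair of vertical and horizontal orthogonal transforms is applied to a block, and how the inverse transform undoes it, can be sketched as follows. This passage does not give the construction of the adaptive transforms, so the sketch substitutes a DCT-2 matrix for both directions purely as a placeholder. The helper names are hypothetical.

```python
import math

def dct2_matrix(n):
    """Placeholder orthonormal transform; row k is the k-th basis vector."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def forward(residual, vertical, horizontal):
    """coefficients = V * residual * H^T"""
    return matmul(matmul(vertical, residual), transpose(horizontal))

def inverse(coeffs, vertical, horizontal):
    """residual = V^T * coefficients * H, valid because V and H are orthogonal."""
    return matmul(matmul(transpose(vertical), coeffs), horizontal)

if __name__ == "__main__":
    v = h = dct2_matrix(4)
    residual = [[3, 0, 0, -1], [0, 5, 2, 0], [0, 0, 0, 0], [7, 0, 0, 1]]
    restored = inverse(forward(residual, v, h), v, h)
    print([[round(x, 6) for x in row] for row in restored])
```

Because both units receive the same transform from the determination unit 190, no extra side information is needed inside the encoder loop to invert it.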

(Example of inter prediction)
Fig. 2 is a diagram showing an example of inter prediction. Fig. 2(a) shows bi-prediction as an example of inter prediction, and Fig. 2(b) shows an example of a predicted image generated by bi-prediction.

As shown in Fig. 2(a), bi-prediction refers to frames that are temporally before and after the target frame (current frame). In the example of Fig. 2(a), a block in the image of frame t is predicted by referring to frames t-1 and t+1. Motion estimation searches the reference frames t-1 and t+1, within a search range set by the system, for locations (blocks) similar to the target image block.
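The motion search described above can be sketched as an exhaustive block-matching search over one reference frame. The SAD cost, the rule of keeping the first best candidate, and all function names are assumptions for illustration. The text only says that a similar block is detected within a search range set by the system.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))

def extract(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a 2-D list."""
    return [row[left:left + size] for row in frame[top:top + size]]

def full_search(target, ref_frame, top, left, search_range):
    """Return (cost, dy, dx) of the best match around position (top, left)."""
    size = len(target)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if (y < 0 or x < 0 or y + size > len(ref_frame)
                    or x + size > len(ref_frame[0])):
                continue  # candidate block falls outside the frame
            cost = sad(target, extract(ref_frame, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

if __name__ == "__main__":
    ref_frame = [[0] * 6 for _ in range(6)]
    ref_frame[3][4], ref_frame[3][5] = 9, 8
    ref_frame[4][4], ref_frame[4][5] = 7, 6
    target = [[9, 8], [7, 6]]          # target block located at (2, 2)
    print(full_search(target, ref_frame, 2, 2, 2))
```

For bi-prediction the same search would be run once per reference frame, giving one motion vector and one reference block for each.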

The detected location is the reference image. The information indicating the position of the reference image relative to the target image block is shown by the arrows in the figure and is called a motion vector. In the image encoding device 1, the motion vector information is entropy-coded together with the frame information of the reference image. The image decoding device, in turn, detects the reference image based on the motion vector information generated by the image encoding device 1.

As shown in Fig. 2(a) and Fig. 2(b), reference images 1 and 2 detected by motion estimation are similar partial images aligned with the target image block within their respective reference frames, and are therefore similar to the target image block (the image to be encoded). In the example of Fig. 2(b), the target image block contains a star pattern and a partial circle pattern. Reference image 1 contains the star pattern and a complete circle pattern. Reference image 2 contains the star pattern but no circle pattern.

A predicted image is generated from reference images 1 and 2. Prediction processing generally averages reference images 1 and 2, which differ in their features but are partially similar, to produce a predicted image that carries the features of both. More advanced processing, such as signal enhancement using a low-pass or high-pass filter, may also be used in generating the predicted image. Here, reference image 1 contains the circle pattern and reference image 2 does not, so when the two are averaged the signal of the circle pattern in the predicted image is half that in reference image 1.
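The halving effect of averaging can be checked with a very small Python sketch. The pixel values and the round-half-up offset in the average are assumptions chosen for illustration; the text only states that the two reference images are averaged.

```python
def bi_predict(ref1, ref2):
    """Average two reference blocks pixel by pixel, rounding half up."""
    return [[(a + b + 1) >> 1 for a, b in zip(r1, r2)]
            for r1, r2 in zip(ref1, ref2)]

if __name__ == "__main__":
    # 200 = "star" pixel present in both references,
    # 100 = "circle" pixel present only in reference image 1.
    ref1 = [[200, 100], [0, 100]]
    ref2 = [[200, 0], [0, 0]]
    print(bi_predict(ref1, ref2))
```

Pixels on which the two references agree are predicted unchanged, while the circle pixels drop from 100 to 50 in the predicted image.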

The difference between the predicted image obtained from reference images 1 and 2 and the target image block (the image to be encoded) is the prediction residual. In the prediction residual shown in Fig. 2(b), large differences occur only at the misaligned edges of the star pattern and at the misaligned part of the circle pattern (hatched area). Elsewhere the prediction is accurate and the differences are small (in the example of Fig. 2(b), there are none).

The parts with no difference (the non-edge parts of the star pattern and the background) are parts where the similarity between reference image 1 and reference image 2 is high and where prediction was highly accurate. The parts with large differences, by contrast, are parts specific to one reference image, that is, parts where the similarity between reference image 1 and reference image 2 is markedly low. It follows that parts where the similarity between the two reference images is markedly low have low prediction accuracy and produce large differences (residuals).

When a prediction residual that mixes parts with large differences and parts with no difference is orthogonally transformed and the transform coefficients are degraded by quantization, that degradation spreads across the whole image (block) through inverse quantization and the inverse orthogonal transform. When the prediction residual restored by inverse quantization and the inverse orthogonal transform (the restored prediction residual) is then combined with the predicted image to reconstruct the target image block, the loss of image quality also spreads to parts that were predicted with high accuracy, such as the non-edge parts of the star pattern and the background shown in Fig. 2(b).
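The spreading of quantization error can be demonstrated with a one-dimensional Python sketch. It uses a floating-point DCT-2 and a uniform quantizer with a single step size, both of which are simplifying assumptions. The residual is zero everywhere except for one large value, yet the restored residual is non-zero at the positions that were originally zero.

```python
import math

def dct2_matrix(n):
    """Orthonormal DCT-2 basis; row k is the k-th basis vector."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def forward(signal, basis):
    return [sum(b * s for b, s in zip(row, signal)) for row in basis]

def inverse(coeffs, basis):
    n = len(coeffs)
    return [sum(coeffs[k] * basis[k][i] for k in range(n)) for i in range(n)]

def reconstruct(residual, step):
    """Transform, quantize with a uniform step, dequantize, inverse transform."""
    basis = dct2_matrix(len(residual))
    levels = [round(c / step) for c in forward(residual, basis)]
    return inverse([lv * step for lv in levels], basis)

if __name__ == "__main__":
    residual = [0, 0, 0, 0, 0, 0, 0, 64]   # one poorly predicted pixel
    restored = reconstruct(residual, step=40.0)
    for original, value in zip(residual, restored):
        print(original, round(value, 2))
```

With a very small step the restored values match the input, so the error at the well-predicted positions comes entirely from quantizing coefficients whose basis functions span the whole block.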

(Evaluation unit)
Fig. 3 is a diagram showing an example of the configuration of the evaluation unit 180. As shown in Fig. 3, the evaluation unit 180 includes a difference calculation unit (subtraction unit) 180a, a normalization unit 180b, and an adjustment unit 180c.

The difference calculation unit 180a calculates the difference value (the absolute value of the difference) between reference image 1 and reference image 2 on a pixel-by-pixel basis, and outputs the calculated difference values to the normalization unit 180b. The difference value is one example of a value indicating similarity: the smaller the difference value, the higher the similarity, and the larger the difference value, the lower the similarity. The difference calculation unit 180a may apply filter processing to each reference image before calculating the difference values. It may also calculate a statistic such as the squared error and use that statistic as the similarity.

The normalization unit 180b normalizes the difference values input from the difference calculation unit 180a by the largest difference value within the block (that is, the maximum of the difference values in the block) and outputs the result. The smaller the difference value, the higher the similarity and the higher the prediction accuracy. Conversely, the larger the difference value, the lower the similarity and the lower the prediction accuracy (the larger the prediction error).

The normalization unit 180b normalizes the difference value of each pixel input from the difference calculation unit 180a by the difference value of the pixel whose difference value is largest in the block (that is, the maximum of the difference values in the block), and outputs the resulting normalized difference values. A normalized difference value can be used as an estimate of the magnitude of the prediction error.
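The two steps performed by units 180a and 180b can be sketched in Python as follows. The function names are illustrative, and returning an all-zero map when the two reference blocks are identical (maximum difference of zero) is an assumption, since the text does not address that case.

```python
def abs_difference(ref1, ref2):
    """Per-pixel absolute difference between two reference blocks (unit 180a)."""
    return [[abs(x - y) for x, y in zip(r1, r2)]
            for r1, r2 in zip(ref1, ref2)]

def normalize(diff):
    """Divide every difference by the block maximum (unit 180b)."""
    max_d = max(max(row) for row in diff)
    if max_d == 0:
        # Assumed behavior: identical references give an all-zero map.
        return [[0.0 for _ in row] for row in diff]
    return [[d / max_d for d in row] for row in diff]

if __name__ == "__main__":
    ref1 = [[200, 100], [0, 100]]
    ref2 = [[200, 0], [0, 20]]
    diff = abs_difference(ref1, ref2)
    print(diff)
    print(normalize(diff))
```

The normalized map lies between 0 and 1, with 1 at the pixel where the references disagree most.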

調整部１８０ｃは、量子化の粗さを定める量子化パラメータ（Ｑｐ）に基づいて、正規化部１８０ｂから入力された正規化差分値を調整し、調整した正規化差分値を出力する。量子化の粗さが大きいほど復元予測残差の劣化度が高いため、調整部１８０ｃは、量子化パラメータ（Ｑｐ）に基づいて正規化差分値（重み）を調整する。 The adjustment unit 180c adjusts the normalized difference value input from the normalization unit 180b based on a quantization parameter (Qp) that determines the coarseness of quantization, and outputs the adjusted normalized difference value. Since the greater the coarseness of quantization, the greater the degree of degradation of the restored prediction residual, the adjustment unit 180c adjusts the normalized difference value (weight) based on the quantization parameter (Qp).

評価部１８０が出力する各画素位置（ｉｊ）における予測誤差の推定値Ｒｉｊは、例えば下記の式（１）のように表現することができる。 The estimated value Rij of the prediction error at each pixel position (ij) output by the evaluation unit 180 can be expressed, for example, as in the following equation (1):

Rij = abs(Xij - Yij) / maxD × Scale(Qp) ・・・(1)

式（１）において、Ｘｉｊは参照画像１の画素ｉｊの画素値であり、Ｙｉｊは参照画像２の画素ｉｊの画素値であり、ａｂｓは絶対値を得る関数である。差分算出部１８０ａでは、ａｂｓ（Ｘｉｊ－Ｙｉｊ）を出力する。 In equation (1), Xij is the pixel value of pixel ij in reference image 1, Yij is the pixel value of pixel ij in reference image 2, and abs is a function that obtains the absolute value. The difference calculation unit 180a outputs abs(Xij - Yij).

また、式（１）において、ｍａｘＤは、ブロック内の差分値ａｂｓ（Ｘｉｊ－Ｙｉｊ）の最大値である。ｍａｘＤを求めるために、ブロック内のすべての画素について差分値を求める必要があるが、この処理を省略するためにすでに符号化処理済みの隣接するブロックの最大値などで代用してもよい。或いは、量子化パラメータ（Ｑｐ）とｍａｘＤとの対応関係を定めるテーブルを用いて、量子化パラメータ（Ｑｐ）からｍａｘＤを求めてもよい。或いは、予め仕様で規定された固定値をｍａｘＤとして用いてもよい。正規化部１８０ｂは、ａｂｓ（Ｘｉｊ－Ｙｉｊ）／ｍａｘＤを出力する。 In addition, in equation (1), maxD is the maximum value of the difference value abs(Xij - Yij) within the block. To calculate maxD, it is necessary to calculate the difference value for all pixels within the block, but to omit this process, the maximum value of an adjacent block that has already been encoded may be used instead. Alternatively, maxD may be calculated from the quantization parameter (Qp) using a table that defines the correspondence between the quantization parameter (Qp) and maxD. Alternatively, a fixed value defined in advance in the specifications may be used as maxD. The normalization unit 180b outputs abs(Xij - Yij)/maxD.

また、式（１）において、Ｓｃａｌｅ（Ｑｐ）は、量子化パラメータ（Ｑｐ）に応じて乗じられる係数である。Ｓｃａｌｅ（Ｑｐ）は、Ｑｐが大きい場合に１．０に近づき、小さい場合に０に近づくように設計され、その度合いはシステムによって調整するものとする。或いは、予め仕様で規定された固定値をＳｃａｌｅ（Ｑｐ）として用いてもよい。さらに、処理を簡略化するため、Ｓｃａｌｅ（Ｑｐ）を１．０などシステムに応じて設計された固定値としてもよい。 In addition, in equation (1), Scale (Qp) is a coefficient multiplied according to the quantization parameter (Qp). Scale (Qp) is designed to approach 1.0 when Qp is large and to approach 0 when Qp is small, with the degree of this being adjusted by the system. Alternatively, a fixed value defined in advance in the specifications may be used as Scale (Qp). Furthermore, to simplify processing, Scale (Qp) may be a fixed value designed according to the system, such as 1.0.

調整部１８０ｃは、ａｂｓ（Ｘｉｊ－Ｙｉｊ）／ｍａｘＤ×Ｓｃａｌｅ（Ｑｐ）を誤差推定値Ｒｉｊとして出力する。また、このＲｉｊは、システムに応じて設計される感度関数によって調整された重み付けを出力してもよい。例えば、ａｂｓ（Ｘｉｊ－Ｙｉｊ）／ｍａｘＤ×Ｓｃａｌｅ（Ｑｐ）＝Ｒｉｊとし、Ｒｉｊ＝Ｃｌｉｐ（Ｒｉｊ，１．０，０．０）とする、又はＲｉｊ＝Ｃｌｉｐ（Ｒｉｊ＋ｏｆｆｓｅｔ，１．０，０．０）とオフセットをつけて感度を調整してもよい。なお、Ｃｌｉｐ（ｘ，ｍａｘ，ｍｉｎ）は、ｘがｍａｘを超える場合はｍａｘで、ｘがｍｉｎを下回る場合はｍｉｎでクリップする処理を示す。 The adjustment unit 180c outputs abs(Xij-Yij)/maxD×Scale(Qp) as the error estimate Rij. This Rij may also be weighted and adjusted using a sensitivity function designed for the system. For example, the sensitivity may be adjusted by setting abs(Xij-Yij)/maxD×Scale(Qp)=Rij and Rij=Clip(Rij,1.0,0.0), or by adding an offset such as Rij=Clip(Rij+offset,1.0,0.0). Note that Clip(x,max,min) indicates clipping at max if x exceeds max, and at min if x is below min.

このようにして算出された画素位置ごとの誤差推定値Ｒｉｊは、０から１．０までの範囲内の値となる。基本的には、誤差推定値Ｒｉｊは、参照画像間の画素位置ｉｊの差分値が大きい（すなわち、予測精度が低い）場合に１．０に近づき、参照画像間の画素位置ｉｊの差分値が小さい（すなわち、予測精度が高い）場合に０に近づく。評価部１８０は、ブロック内の各画素位置ｉｊの誤差推定値Ｒｉｊからなる２次元のマップ情報（以下、「誤差マップ」と称する）を出力する。 The error estimate Rij for each pixel position calculated in this way is a value in the range from 0 to 1.0. Basically, the error estimate Rij approaches 1.0 when the difference value of pixel position ij between the reference images is large (i.e., the prediction accuracy is low), and approaches 0 when the difference value of pixel position ij between the reference images is small (i.e., the prediction accuracy is high). The evaluation unit 180 outputs two-dimensional map information (hereinafter referred to as the "error map") consisting of the error estimate Rij for each pixel position ij within the block.
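As an illustration only, the calculation of equation (1) followed by the Clip processing could be sketched in Python as shown below. The function names, the linear form of `scale_from_qp`, and the upper limit `qp_max=51` are assumptions made for this sketch and are not specified in this description, which leaves the design of Scale(Qp) to the system.

```python
def scale_from_qp(qp, qp_max=51):
    """Hypothetical Scale(Qp): approaches 1.0 for large Qp and 0 for small Qp."""
    return min(max(qp / qp_max, 0.0), 1.0)


def clip(x, hi, lo):
    """Clip(x, max, min): limit x to the range [min, max]."""
    return max(lo, min(hi, x))


def error_map(ref1, ref2, qp, offset=0.0):
    """Return the two-dimensional map of error estimates Rij for one block.

    ref1 and ref2 are lists of rows holding the pixel values of
    reference image 1 and reference image 2 for the same block.
    """
    # Difference calculation: abs(Xij - Yij) for every pixel position.
    diffs = [[abs(x - y) for x, y in zip(r1, r2)] for r1, r2 in zip(ref1, ref2)]
    # Normalization: maxD is the largest difference value within the block.
    max_d = max(max(row) for row in diffs)
    scale = scale_from_qp(qp)
    result = []
    for row in diffs:
        # When the two reference blocks are identical, maxD is 0 and the
        # normalized difference is taken as 0 (an assumption of this sketch).
        result.append([
            clip((d / max_d if max_d else 0.0) * scale + offset, 1.0, 0.0)
            for d in row
        ])
    return result
```

For example, with reference blocks `[[10, 20], [30, 40]]` and `[[10, 24], [28, 40]]` and `qp=51`, the difference values are 0, 4, 2, and 0, so maxD is 4 and the resulting error map is `[[0.0, 1.0], [0.5, 0.0]]`.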

（適応変換生成部） (Adaptive transform generation unit)
図４は、第１実施形態に係る適応変換生成部１９１の動作を示す図である。適応変換生成部１９１は、評価部１８０から入力された誤差マップを用いて、予測残差に対して垂直方向に適用する垂直適応直交変換及び水平方向に適用する水平適応直交変換を主成分分析により生成する。 Fig. 4 is a diagram showing the operation of the adaptive transform generation unit 191 according to the first embodiment. The adaptive transform generation unit 191 uses the error map input from the evaluation unit 180 to generate, by principal component analysis, a vertical adaptive orthogonal transform to be applied to the prediction residual in the vertical direction and a horizontal adaptive orthogonal transform to be applied to the prediction residual in the horizontal direction.

具体的には、適応変換生成部１９１は、誤差マップを列ベクトルの集合とみなして共分散行列を生成し、生成した共分散行列の固有ベクトルを算出する。適応変換生成部１９１は、得られた固有ベクトルを垂直適応直交変換として出力する。また、適応変換生成部１９１は、生成した垂直適応直交変換を垂直方向に適用して得られた行列を行ベクトルの集合とみなして共分散行列を生成し、その固有ベクトルを算出する。適応変換生成部１９１は、得られた固有ベクトルを水平適応直交変換として出力する。 Specifically, the adaptive transform generation unit 191 generates a covariance matrix by treating the error map as a set of column vectors, and calculates the eigenvectors of the generated covariance matrix. The adaptive transform generation unit 191 outputs the obtained eigenvectors as a vertical adaptive orthogonal transform. Furthermore, the adaptive transform generation unit 191 generates a covariance matrix by treating the matrix obtained by applying the generated vertical adaptive orthogonal transform in the vertical direction as a set of row vectors, and calculates its eigenvectors. The adaptive transform generation unit 191 outputs the obtained eigenvectors as a horizontal adaptive orthogonal transform.

図４（ａ）に示すように、適応変換生成部１９１は、誤差マップが幅ｗ高さｈであるとき、誤差マップをｗ個のｈ×１の列ベクトルとみなして、共分散行列Λ_hを算出する。適応変換生成部１９１は、得られた共分散行列を対角化することで固有ベクトルを算出する。ここで、共分散行列の対角化には、例えばＪａｃｏｂｉ法などを用いて反復演算により算出する。適応変換生成部１９１は、得られた固有ベクトルｅ₀からｅ_hの結合によりｈ×ｈの行列を垂直適応直交変換として出力する。 As shown in Fig. 4(a), when the error map has width w and height h, the adaptive transform generation unit 191 regards the error map as w column vectors of size h × 1 and calculates a covariance matrix Λ_h. The adaptive transform generation unit 191 calculates eigenvectors by diagonalizing the obtained covariance matrix. Here, the covariance matrix is diagonalized by iterative calculation using, for example, the Jacobi method. The adaptive transform generation unit 191 outputs an h × h matrix, formed by combining the obtained eigenvectors e₀ to e_h, as the vertical adaptive orthogonal transform.

さらに、図４（ｂ）に示すように、適応変換生成部１９１は、誤差マップをｈ個の１×ｗの行ベクトルとみなして、共分散行列Λ_hを算出する。適応変換生成部１９１は、得られた共分散行列を対角化することで固有ベクトルを算出する。適応変換生成部１９１は、得られた固有ベクトルの結合によりｗ×ｗの行列を水平適応直交変換として出力する。 Furthermore, as shown in Fig. 4(b), the adaptive transform generation unit 191 regards the error map as h row vectors of size 1 × w and calculates a covariance matrix Λ_h. The adaptive transform generation unit 191 calculates eigenvectors by diagonalizing the obtained covariance matrix. The adaptive transform generation unit 191 outputs a w × w matrix, formed by combining the obtained eigenvectors, as the horizontal adaptive orthogonal transform.
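The generation of the two adaptive orthogonal transforms described with reference to Fig. 4 could be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the covariance is taken about the mean vector, the eigenvectors are ordered by descending eigenvalue, and the horizontal transform is derived directly from the rows of the error map as in the Fig. 4(b) description. None of these details is fixed by this description.

```python
import math


def covariance(vectors):
    """Covariance matrix of a list of equal-length vectors (about their mean)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
             for j in range(dim)] for i in range(dim)]


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Diagonalize a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors), eigenvectors given as rows and
    sorted by descending eigenvalue.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                d = a[q][q] - a[p][p]
                theta = math.pi / 4 if d == 0 else 0.5 * math.atan(2 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # a <- a * J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # a <- J^T * a (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: v <- v * J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda i: -a[i][i])
    return ([a[i][i] for i in order],
            [[v[k][i] for k in range(n)] for i in order])


def vertical_transform(err_map):
    """h x h transform from the w column vectors (each h x 1) of the error map."""
    h, w = len(err_map), len(err_map[0])
    columns = [[err_map[y][x] for y in range(h)] for x in range(w)]
    return jacobi_eigen(covariance(columns))[1]


def horizontal_transform(err_map):
    """w x w transform from the h row vectors (each 1 x w) of the error map."""
    return jacobi_eigen(covariance([row[:] for row in err_map]))[1]
```

Because each Jacobi rotation is itself orthogonal, the rows of the returned matrices remain orthonormal, which is the property required of an orthogonal transform.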

ＨＥＶＣ（非特許文献１参照）や、国際標準化団体で検討中の最新の映像符号化技術（ＪＥＭ）などの映像符号化手法では、変換処理を高速かつ軽量に行う目的で、整数精度の変換係数及びビットシフトにより実現している。本実施形態に係る適応変換生成部１９１においても、得られた固有ベクトルを整数係数に近似し、変換係数のダイナミックレンジが拡大しないようなビットシフト量を予め画像符号化装置及び画像復号装置で規定してもよい。 Video coding methods such as HEVC (see Non-Patent Document 1) and the latest video coding technology (JEM) currently being considered by international standardization organizations use integer-precision transform coefficients and bit shifting to perform transform processing quickly and efficiently. The adaptive transform generation unit 191 according to this embodiment may also approximate the obtained eigenvectors to integer coefficients, and the amount of bit shifting may be specified in advance in the image coding device and image decoding device so as not to expand the dynamic range of the transform coefficients.
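One possible way to approximate the eigenvectors with integer coefficients and to undo the scaling by a bit shift is sketched below in Python. The shift amount of 6 bits and the function names are assumptions chosen for illustration; this description only requires that the shift amount be specified in advance in both devices.

```python
def to_integer_transform(basis, shift=6):
    """Scale each coefficient by 2**shift and round to the nearest integer."""
    scale = 1 << shift
    return [[int(round(c * scale)) for c in row] for row in basis]


def apply_integer_transform(int_basis, samples, shift=6):
    """One-dimensional transform using integer arithmetic only.

    The right shift with a rounding offset removes the 2**shift scaling,
    so the dynamic range of the output matches that of a
    floating-point orthonormal transform.
    """
    rounding = 1 << (shift - 1)
    return [(sum(c * s for c, s in zip(row, samples)) + rounding) >> shift
            for row in int_basis]
```

For instance, the 2-point orthonormal basis with coefficients ±0.70710678 becomes `[[45, 45], [45, -45]]` with a 6-bit shift, and applying it to the samples `[10, 6]` gives `[11, 3]`, which matches the rounded floating-point result.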

なお、図１に示すように、変換部１２１は、適応変換生成部１９１から入力された垂直適応直交変換及び水平適応直交変換を用いて、減算部１１０により生成された予測残差に対し垂直及び水平方向に直交変換処理を行うことにより変換係数を算出し、算出した変換係数を量子化部１２２に出力する。 As shown in FIG. 1, the transform unit 121 calculates transform coefficients by performing orthogonal transform processing in the vertical and horizontal directions on the prediction residuals generated by the subtraction unit 110 using the vertical adaptive orthogonal transform and horizontal adaptive orthogonal transform input from the adaptive transform generation unit 191, and outputs the calculated transform coefficients to the quantization unit 122.

また、逆変換部１４２は、適応変換生成部１９１から入力された垂直適応直交変換及び水平適応直交変換を用いて、逆量子化部１４１から入力された変換係数に対して、変換部１２１が行う直交変換処理に対応する逆直交変換処理を行う。 In addition, the inverse transform unit 142 uses the vertical adaptive orthogonal transform and horizontal adaptive orthogonal transform input from the adaptive transform generation unit 191 to perform inverse orthogonal transform processing on the transform coefficients input from the inverse quantization unit 141, corresponding to the orthogonal transform processing performed by the transform unit 121.
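The relationship between the orthogonal transform processing of the transform unit 121 and the inverse orthogonal transform processing of the inverse transform unit 142 could be sketched in Python as follows. The sketch assumes that the basis vectors are stored as the rows of each transform matrix and that the matrices are orthonormal; the function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_transform(residual, vertical, horizontal):
    """Coefficients = V * residual * H^T (vertical pass, then horizontal pass)."""
    return matmul(matmul(vertical, residual), transpose(horizontal))


def inverse_transform(coeff, vertical, horizontal):
    """Residual = V^T * coefficients * H, valid when V and H are orthonormal."""
    return matmul(matmul(transpose(vertical), coeff), horizontal)
```

With orthonormal matrices, applying `inverse_transform` to the output of `forward_transform` restores the original residual up to floating-point rounding. In an actual codec the coefficients are quantized between the two steps, so the restored residual is only an approximation.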

（画像復号装置） (Image decoding device)
図５は、第１実施形態に係る画像復号装置２の構成を示す図である。図５に示すように、画像復号装置２は、エントロピー符号復号部２００と、逆量子化・逆変換部２１０と、合成部２２０と、メモリ２３０と、予測部２４０と、評価部２５０と、決定部２６０とを備える。 Fig. 5 is a diagram showing the configuration of an image decoding device 2 according to the first embodiment. As shown in Fig. 5, the image decoding device 2 includes an entropy code decoding unit 200, an inverse quantization and inverse transform unit 210, a synthesis unit 220, a memory 230, a prediction unit 240, an evaluation unit 250, and a determination unit 260.

エントロピー符号復号部２００は、画像符号化装置１により生成された符号化データを復号し、量子化された変換係数を逆量子化・逆変換部２１０に出力する。また、エントロピー符号復号部２００は、予測（イントラ予測及びインター予測）に関する制御情報を取得し、取得した制御情報を予測部２４０に出力する。 The entropy code decoding unit 200 decodes the coded data generated by the image coding device 1 and outputs the quantized transform coefficients to the inverse quantization and inverse transform unit 210. The entropy code decoding unit 200 also acquires control information related to prediction (intra prediction and inter prediction) and outputs the acquired control information to the prediction unit 240.

逆量子化・逆変換部２１０は、ブロック単位で逆量子化処理及び逆直交変換処理を行う。逆量子化・逆変換部２１０は、逆量子化部２１１と、逆変換部２１２とを備える。 The inverse quantization and inverse transform unit 210 performs inverse quantization and inverse orthogonal transform processing on a block-by-block basis. The inverse quantization and inverse transform unit 210 includes an inverse quantization unit 211 and an inverse transform unit 212.

逆量子化部２１１は、画像符号化装置１の量子化部１２２が行う量子化処理に対応する逆量子化処理を行う。逆量子化部２１１は、エントロピー符号復号部２００から入力された量子化変換係数を、量子化パラメータ（Ｑｐ）及び量子化行列を用いて逆量子化することにより、変換係数を復元し、復元した変換係数を逆変換部２１２に出力する。 The inverse quantization unit 211 performs inverse quantization processing corresponding to the quantization processing performed by the quantization unit 122 of the image encoding device 1. The inverse quantization unit 211 restores the transform coefficients by inverse quantizing the quantized transform coefficients input from the entropy code decoding unit 200 using a quantization parameter (Qp) and a quantization matrix, and outputs the restored transform coefficients to the inverse transform unit 212.

逆変換部２１２は、画像符号化装置１の変換部１２１が行う直交変換処理に対応する逆直交変換処理を行う。逆変換部２１２は、逆量子化部２１１から入力された変換係数に対して逆直交変換処理を行って予測残差を復元し、復元した予測残差（復元予測残差）を合成部２２０に出力する。 The inverse transform unit 212 performs inverse orthogonal transform processing corresponding to the orthogonal transform processing performed by the transform unit 121 of the image encoding device 1. The inverse transform unit 212 performs inverse orthogonal transform processing on the transform coefficients input from the inverse quantization unit 211 to restore the prediction residual, and outputs the restored prediction residual (restored prediction residual) to the synthesis unit 220.

合成部２２０は、逆変換部２１２から入力された予測残差と、予測部２４０から入力された予測画像とを画素単位で合成することにより、元のブロックを再構成（復号）し、ブロック単位の復号画像をメモリ２３０に出力する。 The synthesis unit 220 reconstructs (decodes) the original block by synthesizing the prediction residual input from the inverse transform unit 212 and the predicted image input from the prediction unit 240 on a pixel-by-pixel basis, and outputs the decoded image on a block-by-block basis to the memory 230.

メモリ２３０は、合成部２２０から入力された復号画像を記憶する。メモリ２３０は、復号画像をフレーム単位で記憶する。メモリ２３０は、フレーム単位の復号画像を画像復号装置２の外部に出力する。なお、合成部２２０とメモリ２３０との間にループフィルタが設けられてもよい。 The memory 230 stores the decoded image input from the synthesis unit 220. The memory 230 stores the decoded image in frame units. The memory 230 outputs the decoded image in frame units to the outside of the image decoding device 2. Note that a loop filter may be provided between the synthesis unit 220 and the memory 230.

予測部２４０は、ブロック単位で予測を行う。予測部２４０は、イントラ予測部２４１と、インター予測部２４２と、切替部２４３とを備える。 The prediction unit 240 performs prediction on a block-by-block basis. The prediction unit 240 includes an intra prediction unit 241, an inter prediction unit 242, and a switching unit 243.

イントラ予測部２４１は、メモリ２３０に記憶された復号画像を参照し、エントロピー符号復号部２００から入力された制御情報に従ってイントラ予測を行うことによりイントラ予測画像を生成し、生成したイントラ予測画像を切替部２４３に出力する。 The intra prediction unit 241 references the decoded image stored in the memory 230 and performs intra prediction in accordance with the control information input from the entropy code decoding unit 200 to generate an intra prediction image, and outputs the generated intra prediction image to the switching unit 243.

インター予測部２４２は、メモリ２３０に記憶された復号画像を参照画像として用いて予測対象のブロックを予測するインター予測を行う。インター予測部２４２は、エントロピー符号復号部２００から入力された制御情報（インター予測方法や動きベクトル情報等）に従ってインター予測を行うことによりインター予測画像を生成し、生成したインター予測画像を切替部２４３に出力する。インター予測部２４２は、インター予測画像を生成するために複数の参照画像を用いる場合に、当該複数の参照画像を評価部２５０に出力する。 The inter prediction unit 242 performs inter prediction, which predicts the block to be predicted, using the decoded image stored in the memory 230 as a reference image. The inter prediction unit 242 generates an inter prediction image by performing inter prediction in accordance with the control information (inter prediction method, motion vector information, etc.) input from the entropy code decoding unit 200, and outputs the generated inter prediction image to the switching unit 243. When multiple reference images are used to generate the inter prediction image, the inter prediction unit 242 outputs the multiple reference images to the evaluation unit 250.

なお、複数の参照画像を用いて行う予測は、インター予測における双予測が代表的なものであるが、これに限定されない。予測部２４０は、複数の参照画像を用いてイントラブロックコピーによる予測を行ってもよい。複数の参照画像を用いてイントラブロックコピーによる予測を行う場合、予測部２４０は、当該複数の参照画像を評価部２５０に出力する。 Note that prediction using multiple reference images is typically performed using bi-prediction in inter prediction, but is not limited to this. The prediction unit 240 may also perform prediction using intra block copying using multiple reference images. When performing prediction using intra block copying using multiple reference images, the prediction unit 240 outputs the multiple reference images to the evaluation unit 250.

切替部２４３は、イントラ予測部２４１から入力されるイントラ予測画像とインター予測部２４２から入力されるインター予測画像とを切り替えて、いずれかの予測画像を合成部２２０に出力する。 The switching unit 243 switches between the intra-predicted image input from the intra-prediction unit 241 and the inter-predicted image input from the inter-prediction unit 242, and outputs one of the predicted images to the synthesis unit 220.

評価部２５０は、画像符号化装置１の評価部１８０（図３参照）と同様な動作を行う。評価部２５０は、予測部２４０から入力された複数の参照画像間の類似度を画素単位で評価することにより、当該複数の参照画像を用いて生成された予測画像における誤差の分布を示す誤差マップを生成し、生成した誤差マップを決定部２６０に出力する。 The evaluation unit 250 operates in the same manner as the evaluation unit 180 (see Figure 3) of the image encoding device 1. The evaluation unit 250 evaluates the similarity between the multiple reference images input from the prediction unit 240 on a pixel-by-pixel basis, thereby generating an error map that indicates the distribution of errors in the predicted image generated using the multiple reference images, and outputs the generated error map to the determination unit 260.

決定部２６０は、評価部２５０から入力された誤差マップに基づいて、評価部２５０により予測精度が評価された予測画像に対応する予測残差に適用する逆直交変換を決定し、決定した逆直交変換を逆変換部２１２に出力する。決定部２６０は、誤差マップに基づいて垂直方向の直交変換及び水平方向の直交変換を生成する適応変換生成部２６１を備える。適応変換生成部２６１は、画像符号化装置１の適応変換生成部１９１と同様な動作（図４参照）を行う。逆変換部２１２は、決定部２６０から入力された逆直交変換に従って逆直交変換処理を行う。 The determination unit 260 determines the inverse orthogonal transform to apply to the prediction residual corresponding to the predicted image whose prediction accuracy has been evaluated by the evaluation unit 250, based on the error map input from the evaluation unit 250, and outputs the determined inverse orthogonal transform to the inverse transform unit 212. The determination unit 260 includes an adaptive transform generation unit 261 that generates a vertical orthogonal transform and a horizontal orthogonal transform based on the error map. The adaptive transform generation unit 261 performs the same operation as the adaptive transform generation unit 191 of the image encoding device 1 (see Figure 4). The inverse transform unit 212 performs inverse orthogonal transform processing in accordance with the inverse orthogonal transform input from the determination unit 260.

（第１実施形態のまとめ） (Summary of the first embodiment)
第１実施形態に係る画像符号化装置１は、フレーム単位の現画像を分割して得られたブロック単位の対象画像を符号化する。画像符号化装置１は、複数の参照画像を用いて対象画像を予測して予測画像を生成する予測部１７０と、当該複数の参照画像間の類似度を画素単位で評価することにより、予測画像における誤差の分布を示す誤差マップを生成する評価部１８０と、対象ブロックと予測画像との差分を示す予測残差を画素単位で算出する減算部１１０と、予測残差に適用する直交変換を誤差マップに基づいて決定する決定部１９０と、決定された直交変換によって予測残差に対する直交変換処理を行う変換部１２１とを備える。決定部１９０は、マップ情報に基づいて直交変換を生成する適応変換生成部１９１を備える。 The image encoding device 1 according to the first embodiment encodes a target image in units of blocks obtained by dividing a current image in units of frames. The image encoding device 1 includes a prediction unit 170 that predicts the target image using multiple reference images to generate a predicted image, an evaluation unit 180 that generates an error map indicating the distribution of errors in the predicted image by evaluating the similarity between the multiple reference images on a pixel-by-pixel basis, a subtraction unit 110 that calculates a prediction residual indicating the difference between the target block and the predicted image on a pixel-by-pixel basis, a determination unit 190 that determines an orthogonal transform to apply to the prediction residual based on the error map, and a transform unit 121 that performs orthogonal transform processing on the prediction residual using the determined orthogonal transform. The determination unit 190 includes an adaptive transform generation unit 191 that generates an orthogonal transform based on the map information.

また、第１実施形態に係る画像復号装置２は、フレーム単位の現画像を分割して得られたブロック単位の対象画像を復号する。画像復号装置２は、符号化データを復号することにより変換係数を取得するエントロピー符号復号部２００と、複数の参照画像を用いて対象画像を予測して予測画像を生成する予測部２４０と、当該複数の参照画像間の類似度を画素単位で評価することにより、予測画像における誤差の分布を示す誤差マップを生成する評価部２５０と、変換係数に適用する逆直交変換を誤差マップに基づいて決定する決定部２６０と、決定された逆直交変換によって変換係数に対する逆直交変換処理を行う逆変換部２１２とを備える。決定部２６０は、マップ情報に基づいて逆直交変換を生成する適応変換生成部２６１を備える。 The image decoding device 2 according to the first embodiment also decodes target images in units of blocks obtained by dividing a current image in units of frames. The image decoding device 2 includes an entropy code decoding unit 200 that obtains transform coefficients by decoding encoded data, a prediction unit 240 that predicts the target image using multiple reference images to generate a predicted image, an evaluation unit 250 that generates an error map indicating the distribution of errors in the predicted image by evaluating the similarity between the multiple reference images on a pixel-by-pixel basis, a determination unit 260 that determines the inverse orthogonal transform to apply to the transform coefficients based on the error map, and an inverse transform unit 212 that performs inverse orthogonal transform processing on the transform coefficients using the determined inverse orthogonal transform. The determination unit 260 includes an adaptive transform generation unit 261 that generates an inverse orthogonal transform based on the map information.

このように、第１実施形態によれば、予測画像における誤差の分布を示す誤差マップに基づいて直交変換を生成することにより、予測残差におけるエネルギー分布に応じた最適な直交変換を適用することができる。 In this way, according to the first embodiment, by generating an orthogonal transform based on an error map that indicates the distribution of errors in a predicted image, it is possible to apply an optimal orthogonal transform that corresponds to the energy distribution in the prediction residual.

また、画像符号化装置１及び画像復号装置２のそれぞれが誤差マップに基づいて直交変換を生成可能であるため、画像符号化装置１で行った直交変換（ＫＬＴ）の逆処理のための情報を画像復号装置２側で必要としない。よって、伝送すべき情報量の増大を抑制できる。 Furthermore, because the image encoding device 1 and the image decoding device 2 can each generate an orthogonal transform based on an error map, the image decoding device 2 does not need information for the inverse process of the orthogonal transform (KLT) performed by the image encoding device 1. This prevents an increase in the amount of information to be transmitted.

したがって、第１実施形態に係る画像符号化装置１及び画像復号装置２によれば、予測残差のエネルギーを効率的に集中させる適応直交変換を適用可能にし、符号化効率を改善できる。 Therefore, the image encoding device 1 and image decoding device 2 according to the first embodiment make it possible to apply adaptive orthogonal transform that efficiently concentrates the energy of prediction residuals, thereby improving encoding efficiency.

＜第２実施形態＞ <Second Embodiment>
第２実施形態に係る画像符号化装置１及び画像復号装置２について、第１実施形態との相違点を主として説明する。 The image encoding device 1 and image decoding device 2 according to the second embodiment will be described, focusing on the differences from the first embodiment.

（画像符号化装置） (Image encoding device)
図６は、第２実施形態に係る画像符号化装置１の構成を示す図である。図６に示すように、第２実施形態に係る画像符号化装置１は、決定部１９０の構成が第１実施形態とは異なる。決定部１９０は、第１実施形態と同様に、誤差マップの主成分分析（図４参照）により直交変換を生成する適応変換生成部１９１を備える。第２実施形態において、決定部１９０は、候補選択部（第１選択部）１９２と、直交変換選択部（第２選択部）１９３とをさらに備える。 Fig. 6 is a diagram showing the configuration of an image encoding device 1 according to the second embodiment. As shown in Fig. 6, the image encoding device 1 according to the second embodiment differs from the first embodiment in the configuration of the determination unit 190. As in the first embodiment, the determination unit 190 includes an adaptive transform generation unit 191 that generates an orthogonal transform by principal component analysis of an error map (see Fig. 4). In the second embodiment, the determination unit 190 further includes a candidate selection unit (first selection unit) 192 and an orthogonal transform selection unit (second selection unit) 193.

適応変換生成部１９１は、誤差マップに基づいて生成した適応直交変換（垂直適応直交変換及び水平適応直交変換）を候補選択部１９２に出力する。 The adaptive transform generation unit 191 outputs the adaptive orthogonal transforms (vertical adaptive orthogonal transform and horizontal adaptive orthogonal transform) generated based on the error map to the candidate selection unit 192.

候補選択部１９２は、予め規定された複数種類の直交変換の中から、適応変換生成部１９１から入力された適応直交変換との相関が高い順に１つ以上の直交変換候補を選択し、選択した１つ以上の直交変換候補を直交変換選択部１９３に出力する。予め規定された複数種類の直交変換は、画像符号化装置１及び画像復号装置２で共有されている。第２実施形態において、複数種類の直交変換として、ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、及びＤＣＴ－５が予め規定されているものとする。 The candidate selection unit 192 selects one or more orthogonal transform candidates from among multiple predefined types of orthogonal transforms in descending order of correlation with the adaptive orthogonal transform input from the adaptive transform generation unit 191, and outputs the selected one or more orthogonal transform candidates to the orthogonal transform selection unit 193. The multiple predefined types of orthogonal transforms are shared by the image encoding device 1 and the image decoding device 2. In the second embodiment, it is assumed that DCT-2, DST-7, DCT-8, DST-1, and DCT-5 are predefined as the multiple types of orthogonal transforms.

具体的には、候補選択部１９２は、適応変換生成部１９１から入力された水平適応直交変換と各直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）との相関を評価する。そして、候補選択部１９２は、相関評価の結果に基づいて、複数種類の直交変換のうち、相関の高い順に１つ以上の直交変換を適応直交変換候補として直交変換選択部１９３に出力する。 Specifically, the candidate selection unit 192 evaluates the correlation between the horizontal adaptive orthogonal transform input from the adaptive transform generation unit 191 and each orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5). Based on the results of the correlation evaluation, the candidate selection unit 192 then outputs one or more orthogonal transforms from among multiple types of orthogonal transforms in descending order of correlation as adaptive orthogonal transform candidates to the orthogonal transform selection unit 193.

なお、候補選択部１９２が出力する適応直交変換候補の数は、画像符号化装置１及び画像復号装置２で同じ数とするように予め規定される。また、候補選択部１９２が出力する適応直交変換候補の数は、符号化対象の画像ブロックのブロックサイズや色成分（輝度成分、色差成分）などに応じて可変としてもよい。 The number of adaptive orthogonal transform candidates output by the candidate selection unit 192 is specified in advance to be the same for the image encoding device 1 and the image decoding device 2. The number of adaptive orthogonal transform candidates output by the candidate selection unit 192 may also be variable depending on the block size and color components (luminance component, chrominance component) of the image block to be encoded, etc.

さらに、候補選択部１９２は、水平方向と垂直方向とで適応直交変換候補を別々に選択してもよい。かかる場合、候補選択部１９２は、適応変換生成部１９１から入力された垂直適応直交変換と各直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）との相関をさらに評価し、適応水平直交変換候補に加えて適応垂直直交変換候補を直交変換選択部１９３に出力する。 Furthermore, the candidate selection unit 192 may select adaptive orthogonal transform candidates separately for the horizontal and vertical directions. In such a case, the candidate selection unit 192 further evaluates the correlation between the vertical adaptive orthogonal transform input from the adaptive transform generation unit 191 and each orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5), and outputs adaptive vertical orthogonal transform candidates in addition to adaptive horizontal orthogonal transform candidates to the orthogonal transform selection unit 193.
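The candidate selection described above could be sketched in Python as follows. This description does not define how the correlation is measured, so the sketch assumes one possible measure: the sum, over the basis index, of the absolute inner product between each basis vector of the adaptive transform and the basis vector of the same index in the predefined transform. The function names are hypothetical, the basis definitions are the commonly used orthonormal forms, and DCT-5 is omitted from the sketch for brevity.

```python
import math


def dct2(n):
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
             for j in range(n)] for k in range(n)]


def dst7(n):
    return [[math.sqrt(4 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]


def dct8(n):
    return [[math.sqrt(4 / (2 * n + 1))
             * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]


def dst1(n):
    return [[math.sqrt(2 / (n + 1))
             * math.sin(math.pi * (k + 1) * (j + 1) / (n + 1))
             for j in range(n)] for k in range(n)]


# Predefined transforms shared by the encoder and the decoder (DCT-5 omitted here).
PREDEFINED = {"DCT-2": dct2, "DST-7": dst7, "DCT-8": dct8, "DST-1": dst1}


def correlation(adaptive, fixed):
    """Assumed measure: sum of |inner product| between basis vectors of equal index."""
    return sum(abs(sum(x * y for x, y in zip(a, f)))
               for a, f in zip(adaptive, fixed))


def select_candidates(adaptive, num_candidates):
    """Names of the predefined transforms in descending order of correlation."""
    n = len(adaptive)
    scores = {name: correlation(adaptive, make(n))
              for name, make in PREDEFINED.items()}
    return sorted(scores, key=scores.get, reverse=True)[:num_candidates]
```

Because the encoder and the decoder derive the same adaptive transform from the same error map, running the same selection on both sides yields the same candidate list without any additional signaling. When the horizontal and vertical candidates are selected separately, `select_candidates` would simply be called once per direction.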

直交変換選択部１９３は、候補選択部１９２から入力された１つ以上の直交変換候補の中から、予測残差に適用する直交変換を選択し、選択した直交変換を変換部１２１及び逆変換部１４２に出力する。また、直交変換選択部１９３は、選択した直交変換の種類を示すインデックスをエントロピー符号化部１３０に出力する。 The orthogonal transform selection unit 193 selects an orthogonal transform to apply to the prediction residual from one or more orthogonal transform candidates input from the candidate selection unit 192, and outputs the selected orthogonal transform to the transform unit 121 and the inverse transform unit 142. The orthogonal transform selection unit 193 also outputs an index indicating the type of selected orthogonal transform to the entropy coding unit 130.

例えば、直交変換選択部１９３は、各直交変換候補を適用した場合の符号化効率をシミュレーションにより算出し、かかるシミュレーションの結果に応じて最適な直交変換を選択する。なお、直交変換選択部１９３は、水平方向及び垂直方向で同一種類の直交変換を選択するよう構成してもよいし、水平方向及び垂直方向で別々の種類の直交変換を選択するよう構成してもよい。 For example, the orthogonal transform selection unit 193 calculates the coding efficiency when each orthogonal transform candidate is applied by simulation, and selects the optimal orthogonal transform based on the results of the simulation. Note that the orthogonal transform selection unit 193 may be configured to select the same type of orthogonal transform in the horizontal and vertical directions, or may be configured to select different types of orthogonal transform in the horizontal and vertical directions.

エントロピー符号化部１３０は、直交変換選択部１９３から入力された適応直交変換インデックスをエントロピー符号化する。水平方向及び垂直方向で別々の種類の直交変換を選択する場合には、エントロピー符号化部１３０は、水平方向及び垂直方向で別々の適応直交変換インデックスをエントロピー符号化する。但し、適応直交変換候補が１種類の直交変換により構成されている場合には、画像復号装置２において一意に直交変換を特定できるため、かかるインデックスをエントロピー符号化しなくてもよい。 The entropy coding unit 130 entropy codes the adaptive orthogonal transform index input from the orthogonal transform selection unit 193. When selecting different types of orthogonal transform in the horizontal and vertical directions, the entropy coding unit 130 entropy codes different adaptive orthogonal transform indices in the horizontal and vertical directions. However, when the adaptive orthogonal transform candidates are composed of a single type of orthogonal transform, the orthogonal transform can be uniquely identified in the image decoding device 2, and therefore such indices do not need to be entropy coded.
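The encoder-side selection and the rule for omitting the index could be sketched in Python as follows. The cost callback `cost_of` stands in for the coding-efficiency simulation and is a hypothetical placeholder, as are the function names.

```python
def choose_transform(candidates, cost_of):
    """Return (index, candidate) with the lowest cost among the candidates.

    cost_of is a caller-supplied function that estimates the coding cost
    obtained when a given candidate is applied to the prediction residual.
    """
    index = min(range(len(candidates)), key=lambda i: cost_of(candidates[i]))
    return index, candidates[index]


def index_to_signal(candidates, index):
    """Index to be entropy coded, or None when it can be omitted.

    With a single candidate the decoder can identify the transform
    uniquely, so no index needs to be transmitted.
    """
    return index if len(candidates) > 1 else None
```

For example, with the candidates `["DCT-2", "DST-7"]` and estimated costs of 15.5 and 12.0 respectively, `choose_transform` returns `(1, "DST-7")` and the index 1 is signaled. With the single candidate `["DCT-2"]`, `index_to_signal` returns `None`.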

（画像復号装置） (Image decoding device)
図７は、第２実施形態に係る画像復号装置２の構成を示す図である。図７に示すように、第２実施形態に係る画像復号装置２は、決定部２６０の構成が第１実施形態とは異なる。決定部２６０は、第１実施形態と同様に、誤差マップの主成分分析（図４参照）により直交変換を生成する適応変換生成部２６１を備える。第２実施形態において、決定部２６０は、候補選択部（第１選択部）２６２と、直交変換選択部（第２選択部）２６３とをさらに備える。 Fig. 7 is a diagram showing the configuration of an image decoding device 2 according to the second embodiment. As shown in Fig. 7, the image decoding device 2 according to the second embodiment differs from the first embodiment in the configuration of the determination unit 260. As in the first embodiment, the determination unit 260 includes an adaptive transform generation unit 261 that generates an orthogonal transform by principal component analysis of an error map (see Fig. 4). In the second embodiment, the determination unit 260 further includes a candidate selection unit (first selection unit) 262 and an orthogonal transform selection unit (second selection unit) 263.

適応変換生成部２６１は、誤差マップに基づいて生成した適応直交変換（垂直適応直交変換及び水平適応直交変換）を候補選択部２６２に出力する。 The adaptive transform generation unit 261 outputs the adaptive orthogonal transforms (vertical adaptive orthogonal transform and horizontal adaptive orthogonal transform) generated based on the error map to the candidate selection unit 262.

候補選択部２６２は、予め規定された複数種類の直交変換の中から、適応変換生成部２６１から入力された適応直交変換との相関が高い順に１つ以上の直交変換候補を選択し、選択した１つ以上の直交変換候補を直交変換選択部２６３に出力する。上述したように、複数種類の直交変換として、ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、及びＤＣＴ－５が予め規定されているものとする。 The candidate selection unit 262 selects one or more orthogonal transform candidates from among multiple predefined types of orthogonal transforms in descending order of correlation with the adaptive orthogonal transform input from the adaptive transform generation unit 261, and outputs the selected one or more orthogonal transform candidates to the orthogonal transform selection unit 263. As described above, DCT-2, DST-7, DCT-8, DST-1, and DCT-5 are predefined as multiple types of orthogonal transforms.

具体的には、候補選択部２６２は、適応変換生成部２６１から入力された水平適応直交変換と各直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）との相関を評価する。そして、候補選択部２６２は、相関評価の結果に基づいて、複数種類の直交変換のうち、相関の高い順に１つ以上の直交変換を適応直交変換候補として直交変換選択部２６３に出力する。 Specifically, the candidate selection unit 262 evaluates the correlation between the horizontal adaptive orthogonal transform input from the adaptive transform generation unit 261 and each orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5). Based on the results of the correlation evaluation, the candidate selection unit 262 then outputs one or more orthogonal transforms from among the multiple types of orthogonal transforms in descending order of correlation as adaptive orthogonal transform candidates to the orthogonal transform selection unit 263.

上述したように、候補選択部２６２が出力する適応直交変換候補の数は、画像符号化装置１及び画像復号装置２で同じ数とするように予め規定される。また、候補選択部２６２が出力する適応直交変換候補の数は、復号対象の画像ブロックのブロックサイズや色成分（輝度成分、色差成分）などに応じて可変としてもよい。 As described above, the number of adaptive orthogonal transform candidates output by the candidate selection unit 262 is predetermined to be the same for the image encoding device 1 and the image decoding device 2. In addition, the number of adaptive orthogonal transform candidates output by the candidate selection unit 262 may be variable depending on the block size and color components (luminance component, chrominance component) of the image block to be decoded, etc.

さらに、候補選択部２６２は、水平方向と垂直方向とで適応直交変換候補を別々に選択してもよい。かかる場合、候補選択部２６２は、適応変換生成部２６１から入力された垂直適応直交変換と各直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）との相関をさらに評価し、適応水平直交変換候補に加えて適応垂直直交変換候補を直交変換選択部２６３に出力する。 Furthermore, the candidate selection unit 262 may select adaptive orthogonal transform candidates separately for the horizontal and vertical directions. In such a case, the candidate selection unit 262 further evaluates the correlation between the vertical adaptive orthogonal transform input from the adaptive transform generation unit 261 and each orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5), and outputs adaptive vertical orthogonal transform candidates in addition to the adaptive horizontal orthogonal transform candidates to the orthogonal transform selection unit 263.

一方、エントロピー符号復号部２００は、１つ以上の直交変換候補の中から画像符号化装置１が選択した直交変換を示すインデックスを復号し、当該インデックスを直交変換選択部２６３に出力する。水平方向及び垂直方向で別々の種類の直交変換を選択する場合には、エントロピー符号復号部２００は、水平方向及び垂直方向で別々の適応直交変換インデックスを復号する。 On the other hand, the entropy code decoding unit 200 decodes an index indicating the orthogonal transform selected by the image encoding device 1 from one or more orthogonal transform candidates, and outputs the index to the orthogonal transform selection unit 263. When different types of orthogonal transform are selected in the horizontal and vertical directions, the entropy code decoding unit 200 decodes different adaptive orthogonal transform indices for the horizontal and vertical directions.

直交変換選択部２６３は、エントロピー符号復号部２００から入力されたインデックスに基づいて、候補選択部２６２から入力された１つ以上の直交変換候補の中から変換係数に適用する逆直交変換を選択し、選択した逆直交変換を逆変換部２１２に出力する。なお、直交変換選択部２６３は、水平方向及び垂直方向で同一種類の直交変換を選択するよう構成してもよいし、水平方向及び垂直方向で別々の種類の直交変換を選択するよう構成してもよい。 Based on the index input from the entropy code decoding unit 200, the orthogonal transform selection unit 263 selects the inverse orthogonal transform to apply to the transform coefficients from among the one or more orthogonal transform candidates input from the candidate selection unit 262, and outputs the selected inverse orthogonal transform to the inverse transform unit 212. Note that the orthogonal transform selection unit 263 may be configured to select the same type of orthogonal transform for the horizontal and vertical directions, or to select different types for the two directions.
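The index signalling can be illustrated with a small sketch. It assumes the candidate list is an ordered list of transform names that the encoder and decoder derive identically; the function names `choose_index` and `select_inverse_transform` are hypothetical and do not come from the patent text.

```python
from typing import Sequence


def choose_index(candidates: Sequence[str], chosen: str) -> int:
    """Encoder side: position of the chosen transform in the candidate list."""
    return candidates.index(chosen)


def select_inverse_transform(candidates: Sequence[str], index: int) -> str:
    """Decoder side: map the decoded index back to a transform type."""
    if not 0 <= index < len(candidates):
        raise ValueError("decoded index is outside the candidate list")
    return candidates[index]


# Separate candidate lists and indices for the two directions (assumed values).
horizontal = ["DCT-2", "DST-7"]
vertical = ["DCT-8", "DST-7"]
h_index = choose_index(horizontal, "DST-7")
v_index = choose_index(vertical, "DCT-8")
decoded = (select_inverse_transform(horizontal, h_index),
           select_inverse_transform(vertical, v_index))
```

The index is only meaningful if both sides build the same list in the same order, which is why the number of candidates is fixed in advance for both devices.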

（第２実施形態のまとめ） (Summary of the second embodiment)
第２実施形態に係る画像符号化装置１において、決定部１９０は、誤差マップの主成分分析により直交変換を生成する適応変換生成部１９１と、予め規定された複数種類の直交変換の中から、生成された直交変換との相関が高い順に１つ以上の直交変換候補を選択する候補選択部１９２と、１つ以上の直交変換候補の中から予測残差に適用する直交変換を選択する直交変換選択部１９３とを備える。エントロピー符号化部１３０は、１つ以上の直交変換候補の中から直交変換選択部１９３が選択した直交変換を示すインデックスを符号化する。 In the image encoding device 1 according to the second embodiment, the determination unit 190 includes an adaptive transform generation unit 191 that generates an orthogonal transform by principal component analysis of an error map, a candidate selection unit 192 that selects one or more orthogonal transform candidates from among a plurality of predefined types of orthogonal transforms in descending order of correlation with the generated orthogonal transform, and an orthogonal transform selection unit 193 that selects an orthogonal transform to apply to a prediction residual from among the one or more orthogonal transform candidates. The entropy coding unit 130 codes an index indicating the orthogonal transform selected by the orthogonal transform selection unit 193 from among the one or more orthogonal transform candidates.

また、第２実施形態に係る画像復号装置２において、決定部２６０は、誤差マップの主成分分析により直交変換を生成する適応変換生成部２６１と、予め規定された複数種類の直交変換の中から、生成された直交変換との相関が高い順に１つ以上の直交変換候補を選択する候補選択部２６２と、１つ以上の直交変換候補の中から変換係数に適用する逆直交変換を選択する直交変換選択部２６３とを備える。エントロピー符号復号部２００は、１つ以上の直交変換候補の中から画像符号化装置１が選択した直交変換を示すインデックスを復号する。直交変換選択部２６３は、当該インデックスに基づいて、１つ以上の直交変換候補の中から変換係数に適用する逆直交変換を選択する。 In the image decoding device 2 according to the second embodiment, the determination unit 260 includes an adaptive transform generation unit 261 that generates an orthogonal transform by principal component analysis of an error map, a candidate selection unit 262 that selects one or more orthogonal transform candidates from among multiple predefined types of orthogonal transforms in descending order of correlation with the generated orthogonal transform, and an orthogonal transform selection unit 263 that selects an inverse orthogonal transform to apply to the transform coefficients from among the one or more orthogonal transform candidates. The entropy code decoding unit 200 decodes an index indicating the orthogonal transform selected by the image encoding device 1 from among the one or more orthogonal transform candidates. The orthogonal transform selection unit 263 selects an inverse orthogonal transform to apply to the transform coefficients from among the one or more orthogonal transform candidates based on the index.

このように、第２実施形態によれば、予め規定された複数種類の直交変換の中から、誤差マップの主成分分析により生成された直交変換（ＫＬＴ）と相関の高い直交変換候補を選択し、当該直交変換候補の中から変換処理に適用する直交変換を選択する。予め規定された複数種類の直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）は、ＫＬＴに比べて演算処理量が少ない。 In this way, according to the second embodiment, from among multiple predefined types of orthogonal transforms, orthogonal transform candidates highly correlated with the orthogonal transform (KLT) generated by principal component analysis of the error map are selected, and from these orthogonal transform candidates, the orthogonal transform to be applied to the transform processing is selected. The multiple predefined types of orthogonal transforms (DCT-2, DST-7, DCT-8, DST-1, DCT-5) require less computational processing than KLT.

したがって、第２実施形態によれば、予測残差のエネルギーを効率的に集中させる直交変換を適用可能にして符号化効率を改善しつつ、第１実施形態に比べて変換処理の演算処理量を削減できる。 Therefore, according to the second embodiment, it is possible to apply an orthogonal transform that efficiently concentrates the energy of the prediction residual, improving coding efficiency, while reducing the amount of computation required for the transform process compared to the first embodiment.

＜第３実施形態＞ <Third embodiment>
第３実施形態に係る画像符号化装置１及び画像復号装置２について、第１実施形態及び第２実施形態との相違点を主として説明する。 The image encoding device 1 and image decoding device 2 according to the third embodiment will be described, focusing mainly on the differences from the first and second embodiments.

上述した第２実施形態では、予め規定された複数種類の直交変換の中から、誤差マップの主成分分析により生成された直交変換（ＫＬＴ）と相関の高い直交変換候補を選択していた。これに対し、第３実施形態では、予め規定された複数種類の直交変換の中から、誤差マップの特徴量評価によって直交変換候補を選択する。 In the second embodiment described above, an orthogonal transform candidate highly correlated with the orthogonal transform (KLT) generated by principal component analysis of the error map was selected from among multiple predefined types of orthogonal transform. In contrast, in the third embodiment, an orthogonal transform candidate is selected from among multiple predefined types of orthogonal transform by evaluating the feature quantities of the error map.

（画像符号化装置） (Image encoding device)
図８は、第３実施形態に係る画像符号化装置１の構成を示す図である。図８に示すように、第３実施形態に係る画像符号化装置１は、決定部１９０が特徴量評価部１９１ａを備える点で第２実施形態とは異なる。特徴量評価部１９１ａは、評価部１８０から入力された誤差マップの特徴量を評価し、評価結果を候補選択部１９２に出力する。 Fig. 8 is a diagram showing the configuration of an image encoding device 1 according to the third embodiment. As shown in Fig. 8, the image encoding device 1 according to the third embodiment differs from the second embodiment in that the determination unit 190 includes a feature amount evaluation unit 191a. The feature amount evaluation unit 191a evaluates the feature amounts of the error map input from the evaluation unit 180 and outputs the evaluation result to the candidate selection unit 192.

図９は、特徴量評価部１９１ａの動作を示す図である。図９に示すように、特徴量評価部１９１ａは、誤差マップのエネルギー分布を評価するために、誤差マップを水平方向に４分割するとともに垂直方向に４分割し、分割された各領域について誤差推定値の合計値Ｅｘｙ（Ｅ₀₀乃至Ｅ₃₃）を算出する。特徴量評価部１９１ａは、評価したエネルギー分布Ｅ₀₀乃至Ｅ₃₃を候補選択部１９２に出力する。 Fig. 9 is a diagram showing the operation of the feature amount evaluation unit 191a. As shown in Fig. 9, in order to evaluate the energy distribution of the error map, the feature amount evaluation unit 191a divides the error map into four parts in the horizontal direction and four parts in the vertical direction, and calculates the sum of the error estimates Exy (E₀₀ to E₃₃) for each of the resulting 16 regions. The feature amount evaluation unit 191a outputs the evaluated energy distribution E₀₀ to E₃₃ to the candidate selection unit 192.
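The region sums can be computed with a few additions per sample, as the following minimal sketch shows. It assumes the error map is a rectangular list of rows whose height and width are both divisible by four, and it stores the result as `energy[y][x]` (grid row first); the index order of Exy is not spelled out in the text, so that layout and the name `region_energy` are assumptions.

```python
from typing import List


def region_energy(error_map: List[List[float]], splits: int = 4) -> List[List[float]]:
    """Sum the error estimates inside each cell of a splits x splits grid."""
    height = len(error_map)
    width = len(error_map[0])
    if height % splits or width % splits:
        raise ValueError("block dimensions must be divisible by the split count")
    region_h, region_w = height // splits, width // splits
    energy = [[0.0] * splits for _ in range(splits)]
    for row in range(height):
        for col in range(width):
            # Integer division maps a sample position to its grid cell.
            energy[row // region_h][col // region_w] += error_map[row][col]
    return energy
```

For an 8x8 error map each region covers 2x2 samples, so the 16 outputs are simply sums of four error estimates each.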

候補選択部１９２は、下記の条件に基づいて、予め規定された複数種類の直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）の中から適応直交変換候補を選択し、選択した適応直交変換候補を直交変換選択部１９３に出力する。 The candidate selection unit 192 selects an adaptive orthogonal transform candidate from among several predefined types of orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5) based on the following conditions, and outputs the selected adaptive orthogonal transform candidate to the orthogonal transform selection unit 193.

（ａ）候補選択部１９２は、水平方向について： (a) For the horizontal direction, the candidate selection unit 192 operates as follows:

のとき、ＤＣＴ－２及びＤＳＴ－１を水平適応直交変換候補として選択し、 When , DCT-2 and DST-1 are selected as horizontal adaptive orthogonal transform candidates,

のとき、ＤＣＴ－２及びＤＳＴ－７を水平適応直交変換候補として選択し、 When , DCT-2 and DST-7 are selected as horizontal adaptive orthogonal transform candidates,

のとき、ＤＣＴ－２及びＤＣＴ－５を水平適応直交変換候補として選択し、 When , DCT-2 and DCT-5 are selected as horizontal adaptive orthogonal transform candidates,
いずれも当てはまらないとき、ＤＣＴ－８及びＤＳＴ－７を水平適応直交変換候補として選択する。 If none of these applies, DCT-8 and DST-7 are selected as horizontal adaptive orthogonal transform candidates.

（ｂ）候補選択部１９２は、垂直方向について： (b) For the vertical direction, the candidate selection unit 192 operates as follows:

のとき、ＤＣＴ－２及びＤＳＴ－１を垂直適応直交変換候補として選択し、 When , DCT-2 and DST-1 are selected as vertical adaptive orthogonal transform candidates,

のとき、ＤＣＴ－２及びＤＳＴ－７を垂直適応直交変換候補として選択し、 When , DCT-2 and DST-7 are selected as vertical adaptive orthogonal transform candidates,

のとき、ＤＣＴ－２及びＤＣＴ－５を垂直適応直交変換候補として選択し、 When , DCT-2 and DCT-5 are selected as vertical adaptive orthogonal transform candidates,
いずれも当てはまらないとき、ＤＣＴ－８及びＤＳＴ－７を垂直適応直交変換候補として選択する。 If none of these applies, DCT-8 and DST-7 are selected as vertical adaptive orthogonal transform candidates.

ここでは、候補選択部１９２が出力する適応直交変換候補の数は、水平方向及び垂直方向のそれぞれで２つであるが、候補選択部１９２が出力する適応直交変換候補の数は、符号化対象の画像ブロックのブロックサイズや色成分（輝度成分、色差成分）などに応じて可変としてもよい。 Here, the number of adaptive orthogonal transform candidates output by the candidate selection unit 192 is two in each of the horizontal and vertical directions, but the number of adaptive orthogonal transform candidates output by the candidate selection unit 192 may be variable depending on the block size and color components (luminance component, chrominance component) of the image block to be encoded, etc.
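The four-way rule in (a) and (b) above can be sketched as a simple decision chain. The inequalities on the region sums are not reproduced in this text, so the sketch takes the three conditions as precomputed booleans rather than guessing at them; the first-match priority and the name `select_pair` are likewise assumptions. The same function serves either direction.

```python
from typing import Tuple


def select_pair(cond1: bool, cond2: bool, cond3: bool) -> Tuple[str, str]:
    """Return the two candidate transform types for one direction.

    cond1..cond3 stand for the three conditions on the energy
    distribution, evaluated in the order they are listed (assumed).
    """
    if cond1:
        return ("DCT-2", "DST-1")
    if cond2:
        return ("DCT-2", "DST-7")
    if cond3:
        return ("DCT-2", "DCT-5")
    # None of the conditions holds.
    return ("DCT-8", "DST-7")
```

With two candidates per direction, the index signalled for each direction only has to distinguish two values.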

直交変換選択部１９３は、第２実施形態と同様に、候補選択部１９２から入力された直交変換候補の中から、予測残差に適用する直交変換を選択し、選択した直交変換を変換部１２１及び逆変換部１４２に出力する。また、直交変換選択部１９３は、選択した直交変換の種類を示すインデックスをエントロピー符号化部１３０に出力する。エントロピー符号化部１３０は、直交変換選択部１９３から入力された適応直交変換インデックスをエントロピー符号化する。 As in the second embodiment, the orthogonal transform selection unit 193 selects an orthogonal transform to apply to the prediction residual from the orthogonal transform candidates input from the candidate selection unit 192, and outputs the selected orthogonal transform to the transform unit 121 and the inverse transform unit 142. The orthogonal transform selection unit 193 also outputs an index indicating the type of selected orthogonal transform to the entropy coding unit 130. The entropy coding unit 130 entropy codes the adaptive orthogonal transform index input from the orthogonal transform selection unit 193.

（画像復号装置） (Image decoding device)
図１０は、第３実施形態に係る画像復号装置２の構成を示す図である。図１０に示すように、第３実施形態に係る画像復号装置２は、決定部２６０が特徴量評価部２６１ａを備える点で第２実施形態とは異なる。特徴量評価部２６１ａは、画像符号化装置１の特徴量評価部１９１ａと同様な動作を行う（図９参照）。 Fig. 10 is a diagram showing the configuration of an image decoding device 2 according to the third embodiment. As shown in Fig. 10, the image decoding device 2 according to the third embodiment differs from the second embodiment in that the determination unit 260 includes a feature amount evaluation unit 261a. The feature amount evaluation unit 261a performs the same operation as the feature amount evaluation unit 191a of the image encoding device 1 (see Fig. 9).

特徴量評価部２６１ａは、誤差マップのエネルギー分布を評価するために、誤差マップを水平方向に４分割するとともに垂直方向に４分割し、分割された各領域について誤差推定値の合計値Ｅｘｙ（Ｅ₀₀乃至Ｅ₃₃）を算出する。特徴量評価部２６１ａは、評価したエネルギー分布Ｅ₀₀乃至Ｅ₃₃を候補選択部２６２に出力する。 In order to evaluate the energy distribution of the error map, the feature amount evaluation unit 261a divides the error map into four parts in the horizontal direction and four parts in the vertical direction, and calculates the sum of the error estimates Exy (E₀₀ to E₃₃) for each of the resulting 16 regions. The feature amount evaluation unit 261a outputs the evaluated energy distribution E₀₀ to E₃₃ to the candidate selection unit 262.

候補選択部２６２は、画像符号化装置１の候補選択部１９２と同様な条件に基づいて、予め規定された複数種類の直交変換（ＤＣＴ－２、ＤＳＴ－７、ＤＣＴ－８、ＤＳＴ－１、ＤＣＴ－５）の中から適応直交変換候補を選択し、選択した適応直交変換候補を直交変換選択部２６３に出力する。 The candidate selection unit 262 selects an adaptive orthogonal transform candidate from among multiple predefined types of orthogonal transform (DCT-2, DST-7, DCT-8, DST-1, DCT-5) based on the same conditions as the candidate selection unit 192 of the image encoding device 1, and outputs the selected adaptive orthogonal transform candidate to the orthogonal transform selection unit 263.

一方、エントロピー符号復号部２００は、直交変換候補の中から画像符号化装置１が選択した直交変換を示すインデックスを復号し、当該インデックスを直交変換選択部２６３に出力する。直交変換選択部２６３は、エントロピー符号復号部２００から入力されたインデックスに基づいて、候補選択部２６２から入力された直交変換候補の中から変換係数に適用する逆直交変換を選択し、選択した逆直交変換を逆変換部２１２に出力する。 Meanwhile, the entropy code decoding unit 200 decodes an index indicating the orthogonal transform selected by the image encoding device 1 from among the orthogonal transform candidates, and outputs the index to the orthogonal transform selection unit 263. The orthogonal transform selection unit 263 selects an inverse orthogonal transform to apply to the transform coefficients from among the orthogonal transform candidates input from the candidate selection unit 262, based on the index input from the entropy code decoding unit 200, and outputs the selected inverse orthogonal transform to the inverse transform unit 212.

（第３実施形態のまとめ） (Summary of the third embodiment)
第３実施形態に係る画像符号化装置１において、決定部１９０は、誤差マップの特徴量を評価する特徴量評価部１９１ａと、評価された特徴量に基づいて、予め規定された複数種類の直交変換の中から１つ以上の直交変換候補を選択する候補選択部１９２と、１つ以上の直交変換候補の中から予測残差に適用する直交変換を選択する直交変換選択部１９３とを備える。エントロピー符号化部１３０は、１つ以上の直交変換候補の中から直交変換選択部１９３が選択した直交変換を示すインデックスを符号化する。 In the image encoding device 1 according to the third embodiment, the determination unit 190 includes a feature amount evaluation unit 191a that evaluates feature amounts of the error map, a candidate selection unit 192 that selects one or more orthogonal transform candidates from among a plurality of predefined types of orthogonal transform based on the evaluated feature amounts, and an orthogonal transform selection unit 193 that selects an orthogonal transform to be applied to the prediction residual from among the one or more orthogonal transform candidates. The entropy coding unit 130 codes an index indicating the orthogonal transform selected by the orthogonal transform selection unit 193 from among the one or more orthogonal transform candidates.

また、第３実施形態に係る画像復号装置２において、決定部２６０は、誤差マップの特徴量を評価する特徴量評価部２６１ａと、評価された特徴量に基づいて、予め規定された複数種類の直交変換の中から１つ以上の直交変換候補を選択する候補選択部２６２と、１つ以上の直交変換候補の中から変換係数に適用する逆直交変換を選択する直交変換選択部２６３とを備える。エントロピー符号復号部２００は、１つ以上の直交変換候補の中から画像符号化装置１が選択した直交変換を示すインデックスを復号する。直交変換選択部２６３は、当該インデックスに基づいて、１つ以上の直交変換候補の中から変換係数に適用する逆直交変換を選択する。 In the image decoding device 2 according to the third embodiment, the determination unit 260 includes a feature amount evaluation unit 261a that evaluates feature amounts of the error map, a candidate selection unit 262 that selects one or more orthogonal transform candidates from among a plurality of predefined types of orthogonal transform based on the evaluated feature amounts, and an orthogonal transform selection unit 263 that selects an inverse orthogonal transform to apply to the transform coefficients from among the one or more orthogonal transform candidates. The entropy code decoding unit 200 decodes an index indicating the orthogonal transform selected by the image encoding device 1 from among the one or more orthogonal transform candidates. Based on that index, the orthogonal transform selection unit 263 selects the inverse orthogonal transform to apply to the transform coefficients from among the one or more orthogonal transform candidates.

このように、第３実施形態によれば、誤差マップの特徴量をシンプルな演算処理により評価できるため、誤差マップの主成分分析により直交変換（ＫＬＴ）を生成する第２実施形態に比べて演算処理量を削減できる。 In this way, according to the third embodiment, the feature quantities of the error map can be evaluated through simple calculations, thereby reducing the amount of calculations required compared to the second embodiment, in which an orthogonal transform (KLT) is generated through principal component analysis of the error map.

したがって、第３実施形態によれば、予測残差のエネルギーを効率的に集中させる直交変換を適用可能にして符号化効率を改善しつつ、第２実施形態に比べて誤差マップの分析のための演算処理量を削減できる。 Therefore, according to the third embodiment, it is possible to apply an orthogonal transform that efficiently concentrates the energy of the prediction residual, improving coding efficiency, while reducing the amount of computation required for analyzing the error map compared to the second embodiment.

＜その他の実施形態＞ <Other embodiments>
上述した第１乃至第３実施形態において、一次元の直交変換を用いて水平方向及び垂直方向で別々に変換処理を行う一例について説明した。しかしながら、一次元の直交変換に代えて二次元の直交変換を用いて水平方向及び垂直方向の変換処理をまとめて行ってもよい。 In the first to third embodiments described above, an example has been described in which a one-dimensional orthogonal transform is used to perform transform processing separately in the horizontal direction and the vertical direction. However, instead of the one-dimensional orthogonal transform, a two-dimensional orthogonal transform may be used to perform the horizontal and vertical transform processing together.
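The relationship between two one-dimensional passes and a single two-dimensional transform can be checked numerically. The sketch below is an illustration only: it assumes DCT-2 for the horizontal pass and DST-7 for the vertical pass, builds the two-dimensional transform as the Kronecker product of the two one-dimensional matrices, and uses hypothetical function names.

```python
import math
from typing import List

Matrix = List[List[float]]  # each row is one basis vector


def dct2(n: int) -> Matrix:
    return [[(math.sqrt(0.5) if k == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
             for j in range(n)] for k in range(n)]


def dst7(n: int) -> Matrix:
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]


def separable_transform(block: Matrix, horiz: Matrix, vert: Matrix) -> Matrix:
    """Apply `horiz` along each row, then `vert` along each column."""
    rows, cols = len(block), len(block[0])
    # Horizontal pass: tmp[r][kh] = sum over c of horiz[kh][c] * block[r][c]
    tmp = [[sum(horiz[kh][c] * block[r][c] for c in range(cols))
            for kh in range(cols)] for r in range(rows)]
    # Vertical pass: out[kv][kh] = sum over r of vert[kv][r] * tmp[r][kh]
    return [[sum(vert[kv][r] * tmp[r][kh] for r in range(rows))
             for kh in range(cols)] for kv in range(rows)]


def kron_transform(block: Matrix, horiz: Matrix, vert: Matrix) -> Matrix:
    """Apply the same transform as one (rows*cols) x (rows*cols) matrix."""
    rows, cols = len(block), len(block[0])
    flat = [block[r][c] for r in range(rows) for c in range(cols)]
    big = [[vert[kv][r] * horiz[kh][c]
            for r in range(rows) for c in range(cols)]
           for kv in range(rows) for kh in range(cols)]
    out = [sum(w * x for w, x in zip(row, flat)) for row in big]
    return [out[kv * cols:(kv + 1) * cols] for kv in range(rows)]
```

Both functions produce the same coefficients up to rounding; the separable form needs far fewer multiplications per block, whereas a two-dimensional transform can also be non-separable, which the Kronecker construction here does not cover.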

また、画像符号化装置１が行う各処理をコンピュータに実行させるプログラム及び画像復号装置２が行う各処理をコンピュータに実行させるプログラムにより提供されてもよい。また、プログラムは、コンピュータ読取り可能媒体に記録されていてもよい。コンピュータ読取り可能媒体を用いれば、コンピュータにプログラムをインストールすることが可能である。ここで、プログラムが記録されたコンピュータ読取り可能媒体は、非一過性の記録媒体であってもよい。非一過性の記録媒体は、特に限定されるものではないが、例えば、ＣＤ－ＲＯＭやＤＶＤ－ＲＯＭ等の記録媒体であってもよい。 A program that causes a computer to execute each process performed by the image encoding device 1 and a program that causes a computer to execute each process performed by the image decoding device 2 may also be provided. The program may be recorded on a computer-readable medium. Using a computer-readable medium makes it possible to install the program on a computer. Here, the computer-readable medium on which the program is recorded may be a non-transitory recording medium. The non-transitory recording medium is not particularly limited and may be, for example, a recording medium such as a CD-ROM or DVD-ROM.

また、画像符号化装置１が行う各処理を実行する回路を集積化し、画像符号化装置１を半導体集積回路（チップセット、ＳｏＣ）として構成してもよい。同様に、画像復号装置２が行う各処理を実行する回路を集積化し、画像復号装置２を半導体集積回路（チップセット、ＳｏＣ）として構成してもよい。 Furthermore, the circuits that perform each process performed by the image encoding device 1 may be integrated, and the image encoding device 1 may be configured as a semiconductor integrated circuit (chip set, SoC). Similarly, the circuits that perform each process performed by the image decoding device 2 may be integrated, and the image decoding device 2 may be configured as a semiconductor integrated circuit (chip set, SoC).

以上、図面を参照して実施形態について詳しく説明したが、具体的な構成は上述のものに限られることはなく、要旨を逸脱しない範囲内において様々な設計変更等をすることが可能である。 The above describes the embodiments in detail with reference to the drawings, but the specific configuration is not limited to that described above, and various design changes can be made without departing from the spirit of the invention.

１：画像符号化装置 1: Image encoding device
２：画像復号装置 2: Image decoding device
１００：ブロック分割部 100: Block division unit
１１０：減算部 110: Subtraction unit
１２０：変換・量子化部 120: Transform/quantization unit
１２１：変換部 121: Transform unit
１２２：量子化部 122: Quantization unit
１３０：エントロピー符号化部 130: Entropy coding unit
１４０：逆量子化・逆変換部 140: Inverse quantization/inverse transform unit
１４１：逆量子化部 141: Inverse quantization unit
１４２：逆変換部 142: Inverse transform unit
１５０：合成部 150: Combining unit
１６０：メモリ 160: Memory
１７０：予測部 170: Prediction unit
１７１：イントラ予測部 171: Intra prediction unit
１７２：インター予測部 172: Inter prediction unit
１７３：切替部 173: Switching unit
１８０：評価部 180: Evaluation unit
１８０ａ：差分算出部 180a: Difference calculation unit
１８０ｂ：正規化部 180b: Normalization unit
１８０ｃ：調整部 180c: Adjustment unit
１９０：決定部 190: Determination unit
１９１：適応変換生成部 191: Adaptive transform generation unit
１９１ａ：特徴量評価部 191a: Feature amount evaluation unit
１９２：候補選択部 192: Candidate selection unit
１９３：直交変換選択部 193: Orthogonal transform selection unit
２００：エントロピー符号復号部 200: Entropy code decoding unit
２１０：逆量子化・逆変換部 210: Inverse quantization/inverse transform unit
２１１：逆量子化部 211: Inverse quantization unit
２１２：逆変換部 212: Inverse transform unit
２２０：合成部 220: Combining unit
２３０：メモリ 230: Memory
２４０：予測部 240: Prediction unit
２４１：イントラ予測部 241: Intra prediction unit
２４２：インター予測部 242: Inter prediction unit
２４３：切替部 243: Switching unit
２５０：評価部 250: Evaluation unit
２６０：決定部 260: Determination unit
２６１：適応変換生成部 261: Adaptive transform generation unit
２６１ａ：特徴量評価部 261a: Feature amount evaluation unit
２６２：候補選択部 262: Candidate selection unit
２６３：直交変換選択部 263: Orthogonal transform selection unit

Claims

1. An image encoding device that encodes a target image in units of blocks obtained by dividing a current image in units of frames, comprising:
a block dividing unit that divides the current image into blocks;
a prediction unit that predicts the target image by inter prediction including bi-prediction using a plurality of reference images to generate a predicted image; and
an evaluation unit that calculates a value indicating a similarity between the plurality of reference images in units of regions, each smaller than the block and consisting of a plurality of pixels, only when the prediction unit performs the bi-prediction using a reference image that is temporally earlier and a reference image that is temporally later than the target image,
wherein the evaluation unit controls encoding processing based on the value calculated for each region.

2. An image decoding device that decodes a target image in units of blocks obtained by dividing a current image in units of frames, comprising:
a prediction unit that predicts the target image by inter prediction including bi-prediction using a plurality of reference images to generate a predicted image; and
an evaluation unit that calculates a value indicating a similarity between the plurality of reference images in units of regions, each smaller than the block and consisting of a plurality of pixels, only when the prediction unit performs the bi-prediction using a reference image that is temporally earlier and a reference image that is temporally later than the target image,
wherein the image decoding device controls a decoding process based on the value calculated by the evaluation unit for each region.

3. A program that causes a computer to function as the image encoding device described in claim 1.

4. A program that causes a computer to function as the image decoding device described in claim 2.